Information Theory in Molecular Evolution: From Models to Structures and Dynamics

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory, Probability and Statistics".

Deadline for manuscript submissions: closed (1 December 2020) | Viewed by 27553

Special Issue Editor


Guest Editor
1. Evolutionary Information Laboratory, Department of Biological Sciences, University of Texas at Dallas, Richardson, TX, USA
2. Center for Systems Biology, University of Texas at Dallas, Richardson, TX, USA
3. Department of Bioengineering, University of Texas at Dallas, Richardson, TX, USA
Interests: statistical inference; information theory; molecular evolution; structural bioinformatics; protein folding and function; biological networks; systems biology

Special Issue Information

Dear Colleagues,

The modern biological sciences are driven by information. Large amounts of experimental data are collected and synthesized to create models that explain the complexity of biological systems. In addition, inter- and intra-cellular information processing is key to understanding cellular physiology and disease. The study of evolution, and in particular molecular evolution, has benefited from information theoretical insights since the foundational work of Ronald A. Fisher. In recent years, there has been a growing interest in using tools from information theory and statistical physics to quantify and model evolutionary processes. The integration of quantitative evolutionary models with structural aspects of biomolecules has energized scientific contributions and discovery. Applications include protein structure prediction, protein folding, conformational plasticity in molecules, chromosome architecture, and epistasis. Modern approaches also study the dynamics and interactions within complexes that facilitate molecular recognition and catalytic specificity.

This Special Issue aims to collect novel contributions in this interdisciplinary field. We are especially interested in submissions that use information theoretical concepts as a core but are tightly integrated with the study of molecular processes. Applications may include novel evolutionary models, the application of phylogenetic signals to elucidate biomolecular structure and function, and biomolecule engineering inspired by evolutionary cues. Also of interest for this issue are applications of entropy to the study of de novo gene birth, including the emergence of essential, taxonomically restricted genes, as well as the dynamics of biomolecules, including molecular dynamics and biophysical modeling. Finally, biomedical applications related to mutational change and the use of statistical techniques to study viral evolution and disease are encouraged.

Appropriate submissions are encouraged from scientists with diverse and interdisciplinary backgrounds. Submissions should include examples of current or potential applications of entropy in biology or medicine with a special focus on molecular evolution. We encourage authors to make their contribution accessible to a wide range of science graduates, without compromising scientific content or flow. We also encourage the addition of a short supplementary video (e.g., three minutes) that explains in plain language the general significance of the major finding(s).

Dr. Faruck Morcos
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Evolutionary models
  • Information metrics in evolution
  • Amino acid coevolution
  • Information content in molecular sequences
  • Evolutionary landscapes
  • Structure and dynamics of biomolecules
  • Protein engineering
  • Information processing in biological networks

Published Papers (8 papers)

Editorial

3 pages, 184 KiB  
Editorial
Information Theory in Molecular Evolution: From Models to Structures and Dynamics
by Faruck Morcos
Entropy 2021, 23(4), 482; https://doi.org/10.3390/e23040482 - 19 Apr 2021
Cited by 1 | Viewed by 1847
Abstract
Historically, information theory has been closely interconnected with evolutionary theory [...]

Research

19 pages, 15942 KiB  
Article
ELIHKSIR Web Server: Evolutionary Links Inferred for Histidine Kinase Sensors Interacting with Response Regulators
by Claude Sinner, Cheyenne Ziegler, Yun Ho Jung, Xianli Jiang and Faruck Morcos
Entropy 2021, 23(2), 170; https://doi.org/10.3390/e23020170 - 30 Jan 2021
Cited by 3 | Viewed by 2409
Abstract
Two-component systems (TCS) are signaling machinery consisting of a histidine kinase (HK) and a response regulator (RR). When an environmental change is detected, the HK phosphorylates its cognate RR. While cognate interactions were considered orthogonal, experimental evidence shows the prevalence of crosstalk interactions between non-cognate HK–RR pairs. Currently, crosstalk interactions have been demonstrated for TCS proteins in a limited number of organisms. By providing specificity predictions across entire TCS networks for a large variety of organisms, the ELIHKSIR web server assists users in identifying interactions for TCS proteins and their mutants. To generate specificity scores, a global probabilistic model was used to identify interfacial couplings and local fields from sequence information. These couplings and local fields were then used to construct Hamiltonian scores for positions with encoded specificity, resulting in the specificity score. These methods were applied to 6676 organisms available on the ELIHKSIR web server. Due to the ability to mutate proteins and display the resulting network changes, there are nearly endless combinations of TCS networks to analyze using ELIHKSIR. The functionality of ELIHKSIR allows users to perform a variety of TCS network analyses and visualizations to support TCS research efforts.

17 pages, 3541 KiB  
Article
Allostery and Epistasis: Emergent Properties of Anisotropic Networks
by Paul Campitelli and S. Banu Ozkan
Entropy 2020, 22(6), 667; https://doi.org/10.3390/e22060667 - 16 Jun 2020
Cited by 11 | Viewed by 3483
Abstract
Understanding the underlying mechanisms behind protein allostery and non-additivity of substitution outcomes (i.e., epistasis) is critical when attempting to predict the functional impact of mutations, particularly at non-conserved sites. In an effort to model these two biological properties, we extend the framework of our metric for calculating dynamic coupling between residues, the Dynamic Coupling Index (DCI), to two new metrics: (i) EpiScore, which quantifies the difference between the residue fluctuation response of a functional site when two other positions are perturbed with random Brownian kicks simultaneously versus individually, capturing the degree of cooperativity of these two positions in modulating the dynamics of the functional site; and (ii) DCIasym, which measures the degree of asymmetry between the residue fluctuation response of two sites when one or the other is perturbed with a random force. Applied to four independent systems, we successfully show that EpiScore and DCIasym can capture important biophysical properties in dual mutant substitution outcomes. We propose that allosteric regulation and the mechanisms underlying non-additive amino acid substitution outcomes (i.e., epistasis) can be understood as emergent properties of an anisotropic network of interactions where the inclusion of the full network of interactions is critical for accurate modeling. Consequently, mutations that drive towards a new function may require a fine balance between functional site asymmetry and strength of dynamic coupling with the functional sites. These two tools will provide mechanistic insight into both understanding and predicting the outcome of dual mutations.

26 pages, 9455 KiB  
Article
Dynamical Behavior of β-Lactamases and Penicillin-Binding Proteins in Different Functional States and Its Potential Role in Evolution
by Feng Wang, Hongyu Zhou, Xinlei Wang and Peng Tao
Entropy 2019, 21(11), 1130; https://doi.org/10.3390/e21111130 - 19 Nov 2019
Cited by 6 | Viewed by 3432
Abstract
β-Lactamases are enzymes produced by bacteria to hydrolyze β-lactam-based antibiotics, and pose a serious threat to public health through related antibiotic resistance. Class A β-lactamases are structurally and functionally related to penicillin-binding proteins (PBPs). Despite extensive studies of the structures, catalytic mechanisms, and dynamics of both β-lactamases and PBPs, the potentially different dynamical behaviors of these proteins in different functional states remain elusive in general. In this study, four evolutionarily related proteins, including TEM-1 and TOHO-1 as class A β-lactamases and PBP-A and DD-transpeptidase as two PBPs, are subjected to molecular dynamics simulations and various analyses to characterize their dynamical behaviors in different functional states. Penicillin G and its ring-opening product serve as common ligands for these four proteins of interest. The dynamic analyses of overall structures, the active sites with penicillin G, and three catalytically important residues commonly shared by all four proteins reveal unexpected cross similarities between class A β-lactamases and PBPs. These findings shed light on both the hidden relations among dynamical behaviors of these proteins and the functional and evolutionary relations among class A β-lactamases and PBPs.

14 pages, 1914 KiB  
Article
Coevolutionary Analysis of Protein Subfamilies by Sequence Reweighting
by Duccio Malinverni and Alessandro Barducci
Entropy 2019, 21(11), 1127; https://doi.org/10.3390/e21111127 - 16 Nov 2019
Cited by 10 | Viewed by 3927
Abstract
Extracting structural information from sequence co-variation has become a common computational biology practice in recent years, mainly due to the availability of large sequence alignments of protein families. However, identifying features that are specific to sub-classes and not shared by all members of the family using sequence-based approaches has remained an elusive problem. We here present a coevolutionary-based method to differentially analyze subfamily-specific structural features by a continuous sequence reweighting (SR) approach. We introduce the underlying principles and test its predictive capabilities on the Response Regulator family, whose subfamilies have been previously shown to display distinct, specific homo-dimerization patterns. Our results show that this reweighting scheme is effective in assigning structural features known a priori to subfamilies, even when sequence data is relatively scarce. Furthermore, sequence reweighting allows assessing whether individual structural contacts pertain to specific subfamilies, and it thus paves the way for the identification of specificity-determining contacts from sequence variation data.

20 pages, 1740 KiB  
Article
Toward Inferring Potts Models for Phylogenetically Correlated Sequence Data
by Edwin Rodriguez Horta, Pierre Barrat-Charlaix and Martin Weigt
Entropy 2019, 21(11), 1090; https://doi.org/10.3390/e21111090 - 07 Nov 2019
Cited by 13 | Viewed by 3651
Abstract
Global coevolutionary models of protein families have become increasingly popular due to their capacity to predict residue–residue contacts from sequence information, but also to predict fitness effects of amino acid substitutions or to infer protein–protein interactions. The central idea in these models is to construct a probability distribution, a Potts model, that reproduces single and pairwise frequencies of amino acids found in natural sequences of the protein family. This approach treats sequences from the family as independent samples, completely ignoring phylogenetic relations between them. This simplification is known to lead to potentially biased estimates of the parameters of the model, decreasing their biological relevance. Current workarounds for this problem, such as reweighting sequences, are poorly understood and not principled. Here, we propose an inference scheme that takes the phylogeny of a protein family into account in order to correct biases in estimating the frequencies of amino acids. Using artificial data, we show that a Potts model inferred using these corrected frequencies performs better in predicting contacts and fitness effects of mutations. First tests on real protein data, which are only partially successful, are also presented.
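
For readers unfamiliar with the term, the Potts model mentioned in this abstract is commonly written in the following form. This is the standard textbook expression, not one quoted from the paper, and the symbols (sequence length L, couplings J, local fields h, empirical frequencies f) are notation assumed here for illustration.

```latex
% Potts model over an aligned sequence (a_1, ..., a_L); notation is assumed, not from the abstract
P(a_1,\dots,a_L) = \frac{1}{Z}\,
  \exp\!\Big( \sum_{i=1}^{L} h_i(a_i) + \sum_{1 \le i < j \le L} J_{ij}(a_i,a_j) \Big)

% Parameters are chosen so that the model marginals match the observed frequencies
P_i(a) = f_i(a), \qquad P_{ij}(a,b) = f_{ij}(a,b)
```

Here Z is the normalization constant, and the two matching conditions express the requirement that the model reproduce the single-site and pairwise amino acid frequencies of the alignment.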

16 pages, 372 KiB  
Article
Phylogenetic Weighting Does Little to Improve the Accuracy of Evolutionary Coupling Analyses
by Adam J. Hockenberry and Claus O. Wilke
Entropy 2019, 21(10), 1000; https://doi.org/10.3390/e21101000 - 12 Oct 2019
Cited by 10 | Viewed by 3851
Abstract
Homologous sequence alignments contain important information about the constraints that shape protein family evolution. Correlated changes between different residues, for instance, can be highly predictive of physical contacts within three-dimensional structures. Detecting such co-evolutionary signals via direct coupling analysis is particularly challenging given the shared phylogenetic history and uneven sampling of different lineages from which protein sequences are derived. Current best practices for mitigating such effects include sequence-identity-based weighting of input sequences and post-hoc re-scaling of evolutionary coupling scores. However, numerous weighting schemes have been previously developed for other applications, and it is unknown whether any of these schemes may better account for phylogenetic artifacts in evolutionary coupling analyses. Here, we show across a dataset of 150 diverse protein families that the current best practices out-perform several alternative sequence- and tree-based weighting methods. Nevertheless, we find that sequence weighting in general provides only a minor benefit relative to post-hoc transformations that re-scale the derived evolutionary couplings. While our findings do not rule out the possibility that an as-yet-untested weighting method may show improved results, the similar predictive accuracies that we observe across conceptually distinct weighting methods suggest that there may be little room for further improvement on top of existing strategies.
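
As a rough illustration of the sequence-identity-based weighting that this abstract refers to, the sketch below down-weights each sequence by the number of sequences in the alignment that are similar to it. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `sequence_identity_weights`, the 0.8 identity threshold, and the toy alignment are all hypothetical choices made for illustration.

```python
import numpy as np


def sequence_identity_weights(alignment, threshold=0.8):
    """Weight each aligned sequence by 1 / (number of similar sequences).

    Two sequences count as similar when their fractional identity is at
    least `threshold` (an assumed value; the abstract does not state one).
    Every sequence is similar to itself, so weights lie in (0, 1].
    """
    # Convert equal-length strings into a 2D character array (n_seqs x length).
    seqs = np.array([list(s) for s in alignment])
    # Pairwise fractional identity via broadcasting: shape (n_seqs, n_seqs).
    identity = (seqs[:, None, :] == seqs[None, :, :]).mean(axis=2)
    # Count neighbours at or above the identity threshold, self included.
    neighbours = (identity >= threshold).sum(axis=1)
    return 1.0 / neighbours


# Hypothetical toy alignment: three near-identical sequences and one outlier.
toy_alignment = ["ACDEF", "ACDEF", "ACDEY", "WWWWW"]
weights = sequence_identity_weights(toy_alignment)
effective_count = weights.sum()
```

The sum of the weights is often used as an effective number of sequences, so a cluster of near-duplicates contributes roughly as much as a single sequence would.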

20 pages, 4300 KiB  
Article
Non-Linear Dynamics Analysis of Protein Sequences. Application to CYP450
by Xavier F. Cadet, Reda Dehak, Sang Peter Chin and Miloud Bessafi
Entropy 2019, 21(9), 852; https://doi.org/10.3390/e21090852 - 31 Aug 2019
Cited by 4 | Viewed by 4206
Abstract
Characterizing the nature of changes at the crossed-sequence and inner-sequence scales is very challenging in protein biology. This study is a new attempt to assess, with a phenomenological approach, the non-stationary and nonlinear fluctuation of changes encountered in protein sequences. We have computed fluctuations from an encoded amino acid index dataset using the cumulative sum technique and extracted the departure from the linear trend found in each protein sequence. For inner-sequence analysis, we found that the fluctuations of changes statistically follow a −5/3 Kolmogorov power law and behave like an incremental Brownian process. The pattern of the changes in the inner sequence seems to be monofractal in essence, with a Hurst exponent bounded within the range [1/3, 1/2], whose endpoints correspond respectively to the Kolmogorov and Brownian monofractal processes. In addition, the changes in the inner sequence exhibit moderate complexity and chaos, which seems to be coherent with the monofractal and stochastic process highlighted previously in the study. The crossed-sequence changes analysis was achieved using an external parameter, which is the activity available for each protein sequence, and some results obtained for the inner sequence, specifically the drift and Kolmogorov complexity spectrum. We found a significant linear relationship between activity changes and drift changes, and also between activity and Kolmogorov complexity. An analysis of the mean square displacement of trajectories in the bivariate spaces (drift, activity) and (Kolmogorov complexity spectrum, activity) seems to present a superdiffusive law with a power-law exponent of 1.6.