# Optimal Relabeling of Water Molecules and Single-Molecule Entropy Estimation


## Abstract


## 1. Introduction

## 2. Methods

#### 2.1. The K-th Nearest Neighbour Method for Entropy Estimation

#### 2.2. Metric in Translational-Rotational Space

#### 2.3. Distance in the Configurational Space

#### 2.4. Optimal Relabeling of Molecules

#### 2.5. The Hungarian Algorithm

#### 2.6. Entropy Calculation after Relabeling

#### 2.7. Single-Molecule Entropy Calculation after Relabeling

#### 2.8. Molecular Dynamics Simulations

## 3. Results and Discussion

#### 3.1. Optimal Labeling of Water Molecules

#### 3.2. Single-Molecule Entropy from Single-Molecule Distributions

#### 3.3. Accuracy

#### 3.4. Computational Time

#### 3.5. Application Example: Waters around a Fixed Water Molecule

## 4. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## Abbreviations

| Abbreviation | Meaning |
|---|---|
| kNN | k-th nearest neighbour |
| MIE | Mutual information expansion |
| MIST | Maximum information spanning tree |

## References

1. Gilson, M.K.; Given, J.A.; Bush, B.L.; McCammon, J.A. The statistical-thermodynamic basis for computation of binding affinities: A critical review. Biophys. J. **1997**, 72, 1047–1069.
2. Roux, B.; Simonson, T. Implicit solvent models. Biophys. Chem. **1999**, 78, 1–20.
3. Kollman, P.A.; Massova, I.; Reyes, C.; Kuhn, B.; Huo, S.; Chong, L.; Lee, M.; Lee, T.; Duan, Y.; Wang, W.; et al. Calculating structures and free energies of complex molecules: Combining molecular mechanics and continuum models. Acc. Chem. Res. **2000**, 33, 889–897.
4. Wereszczynski, J.; McCammon, J.A. Statistical mechanics and molecular dynamics in evaluating thermodynamic properties of biomolecular recognition. Q. Rev. Biophys. **2012**, 45, 1–25.
5. Polyansky, A.A.; Zubac, R.; Zagrovic, B. Estimation of conformational entropy in protein-ligand interactions: A computational perspective. Methods Mol. Biol. **2012**, 819, 327–353.
6. Suarez, D.; Diaz, N. Direct methods for computing single-molecule entropies from molecular simulations. WIREs Comput. Mol. Sci. **2015**, 5, 1–26.
7. Kassem, S.; Ahmed, M.; El-Sheikh, S.; Barakat, K.H. Entropy in bimolecular simulations: A comprehensive review of atomic fluctuations-based methods. J. Mol. Graph. Model. **2015**, 62, 105–117.
8. Fogolari, F.; Corazza, A.; Esposito, G. Free energy, enthalpy and entropy from implicit solvent end-point simulations. Front. Mol. Biosci. **2018**, 5, 11.
9. Beveridge, D.; diCapua, L. Free energy via molecular simulation: Applications to chemical and biomolecular systems. Annu. Rev. Biophys. Chem. **1989**, 18, 431–492.
10. Zwanzig, R.W. High temperature equation of state by a perturbation method. I. Nonpolar gases. J. Chem. Phys. **1954**, 22, 1420–1426.
11. Straatsma, T.P.; McCammon, J.A. Multiconfiguration thermodynamic integration. J. Chem. Phys. **1991**, 95, 1175–1188.
12. Wan, S.; Stote, R.H.; Karplus, M. Calculation of the aqueous solvation energy and entropy, as well as free energy, of simple polar solutes. J. Chem. Phys. **2004**, 121, 9539–9548.
13. Lazaridis, T. Inhomogeneous fluid approach to solvation thermodynamics. 1. Theory. J. Phys. Chem. B **1998**, 102, 3531–3541.
14. Lazaridis, T. Inhomogeneous fluid approach to solvation thermodynamics. 2. Applications to simple fluids. J. Phys. Chem. B **1998**, 102, 3542–3550.
15. Young, T.; Abel, R.; Kim, B.; Berne, B.J.; Friesner, R.A. Motifs for molecular recognition exploiting hydrophobic enclosure in protein-ligand binding. Proc. Natl. Acad. Sci. USA **2007**, 104, 808–813.
16. Abel, R.; Young, T.; Farid, R.; Berne, B.J.; Friesner, R.A. Role of the active-site solvent in the thermodynamics of factor Xa ligand binding. J. Am. Chem. Soc. **2008**, 130, 2817–2831.
17. Nguyen, C.N.; Young, T.K.; Gilson, M.K. Grid inhomogeneous solvation theory: Hydration structure and thermodynamics of the miniature receptor cucurbit[7]uril. J. Chem. Phys. **2012**, 137, 044101.
18. Ramsey, S.; Nguyen, C.; Salomon-Ferrer, R.; Walker, R.C.; Gilson, M.K.; Kurtzman, T. Solvation thermodynamic mapping of molecular surfaces in AmberTools: GIST. J. Comput. Chem. **2016**, 37, 2029–2037.
19. Haider, K.; Cruz, A.; Ramsey, S.; Gilson, M.K.; Kurtzman, T. Solvation structure and thermodynamic mapping (SSTMap): An open-source, flexible package for the analysis of water in molecular dynamics trajectories. J. Chem. Theory Comput. **2018**, 14, 418–425.
20. Ross, G.A.; Bodnarchuk, M.S.; Essex, J.W. Water sites, networks, and free energies with Grand Canonical Monte Carlo. J. Am. Chem. Soc. **2015**, 137, 14930–14943.
21. Kovalenko, A. Three-dimensional RISM theory for molecular liquids and solid-liquid interfaces. In Molecular Theory of Solvation; Hirata, F., Ed.; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2003; Chapter 4; pp. 169–275.
22. Bodnarchuk, M.S. Water, water, everywhere... It’s time to stop and think. Drug Discov. Today **2016**, 21, 1139–1146.
23. Singh, H.; Misra, N.; Hnizdo, V.; Fedorowicz, A.; Demchuk, E. Nearest neighbor estimate of entropy. Am. J. Math. Manag. Sci. **2003**, 23, 301–321.
24. Hnizdo, V.; Darian, E.; Fedorowicz, A.; Demchuk, E.; Li, S.; Singh, H. Nearest-neighbor nonparametric method for estimating the configurational entropy of complex molecules. J. Comput. Chem. **2007**, 28, 655–668.
25. Numata, J.; Wan, M.; Knapp, E.W. Conformational entropy of biomolecules: Beyond the quasi-harmonic approximation. Genome Inform. **2007**, 18, 192–205.
26. Hnizdo, V.; Tan, J.; Killian, B.J.; Gilson, M.K. Efficient calculation of configurational entropy from molecular simulations by combining the mutual-information expansion and nearest-neighbor methods. J. Comput. Chem. **2008**, 29, 1605–1614.
27. Wang, L.; Abel, R.; Friesner, R.A.; Berne, B.J. Thermodynamic properties of liquid water: An application of a nonparametric approach to computing the entropy of a neat fluid. J. Chem. Theory Comput. **2009**, 5, 1462–1473.
28. Misra, N.; Singh, H.; Hnizdo, V. Nearest neighbor estimates of entropy for multivariate circular distributions. Entropy **2010**, 12, 578–590.
29. Mukherjee, A. Entropy Balance in the Intercalation Process of an Anti-Cancer Drug Daunomycin. J. Phys. Chem. Lett. **2011**, 2, 3021–3026.
30. Fenley, A.T.; Killian, B.J.; Hnizdo, V.; Fedorowicz, A.; Sharp, D.S.; Gilson, M.K. Correlation as a determinant of configurational entropy in supramolecular and protein systems. J. Phys. Chem. B **2014**, 118, 6447–6455.
31. Nguyen, C.N.; Cruz, A.; Gilson, M.K.; Kurtzman, T. Thermodynamics of water in an enzyme active site: Grid-based hydration analysis of coagulation factor Xa. J. Chem. Theory Comput. **2014**, 10, 2769–2780.
32. Fogolari, F.; Corazza, A.; Fortuna, S.; Soler, M.A.; VanSchouwen, B.; Brancolini, G.; Corni, S.; Melacini, G.; Esposito, G. Distance-based configurational entropy of proteins from molecular dynamics simulations. PLoS ONE **2015**, 10, e0132356.
33. Huggins, D.J. Comparing distance metrics for rotation using the k-nearest neighbors algorithm for entropy estimation. J. Comput. Chem. **2014**, 35, 377–385.
34. Huggins, D.J. Estimating translational and orientational entropies using the k-nearest neighbors algorithm. J. Chem. Theory Comput. **2014**, 10, 3617–3625.
35. Huggins, D.J. Quantifying the entropy of binding for water molecules in protein cavities by computing correlations. Biophys. J. **2015**, 108, 928–936.
36. Sasikala, W.D.; Mukherjee, A. Single water entropy: Hydrophobic crossover and application to drug binding. J. Phys. Chem. B **2014**, 118, 10553–10564.
37. Fogolari, F.; Dongmo Foumthuim, C.J.; Fortuna, S.; Soler, M.A.; Corazza, A.; Esposito, G. Accurate Estimation of the Entropy of Rotation-Translation Probability Distributions. J. Chem. Theory Comput. **2016**, 12, 1–8.
38. Fogolari, F.; Esposito, G.; Tidor, B. Entropy of two-molecule correlated translational-rotational motions using the kth nearest neighbor method. J. Chem. Theory Comput. **2021**, 17, 3039–3051.
39. Huggins, D.J. Studying the role of cooperative hydration in stabilizing folded protein states. J. Struct. Biol. **2016**, 196, 394–406.
40. Irwin, B.W.J.; Huggins, D.J. On the accuracy of one- and two-particle solvation entropies. J. Chem. Phys. **2017**, 146, 194111.
41. Heinz, L.P.; Grubmüller, H. Computing spatially resolved rotational hydration entropies from atomistic simulations. J. Chem. Theory Comput. **2020**, 16, 108–118.
42. Heinz, L.P.; Grubmüller, H. Per|Mut: Spatially resolved hydration entropies from atomistic simulations. J. Chem. Theory Comput. **2021**, 17, 2090–2091.
43. Killian, B.J.; Yundenfreund Kravitz, J.; Gilson, M.K. Extraction of configurational entropy from molecular simulations via an expansion approximation. J. Chem. Phys. **2007**, 127, 024107.
44. King, B.M.; Tidor, B. MIST: Maximum Information Spanning Trees for dimension reduction of biological data sets. Bioinformatics **2009**, 25, 1165–1172.
45. King, B.M.; Silver, N.W.; Tidor, B. Efficient calculation of molecular configurational entropies using an information theoretic approximation. J. Phys. Chem. B **2012**, 116, 2891–2904.
46. Silver, N.W.; King, B.M.; Nalam, M.N.L.; Cao, H.; Ali, A.; Kiran Kumar Reddy, G.S.; Rana, T.M.; Schiffer, C.A.; Tidor, B. Efficient Computation of Small-Molecule Configurational Binding Entropy and Free Energy Changes by Ensemble Enumeration. J. Chem. Theory Comput. **2013**, 9, 5098–5115.
47. Fenley, A.T.; Muddana, H.S.; Gilson, M.K. Entropy–enthalpy transduction caused by conformational shifts can obscure the forces driving protein–ligand binding. Proc. Natl. Acad. Sci. USA **2012**, 109, 20006–20011.
48. Fleck, M.; Polyansky, A.A.; Zagrovic, B. PARENT: A parallel software suite for the calculation of configurational entropy in biomolecular systems. J. Chem. Theory Comput. **2016**, 12, 2055–2065.
49. Fogolari, F.; Maloku, O.; Dongmo Foumthuim, C.J.; Corazza, A.; Esposito, G. PDB2ENTROPY and PDB2TRENT: Conformational and translational-rotational entropy from molecular ensembles. J. Chem. Inf. Model. **2018**, 58, 1319–1324.
50. Dongmo Foumthuim, C.J.; Corazza, A.; Berni, R.; Esposito, G.; Fogolari, F. Dynamics and Thermodynamics of Transthyretin Association from Molecular Dynamics Simulations. BioMed Res. Int. **2018**, 2018, 7480749.
51. Lazaridis, T.; Karplus, M. Orientational correlations and entropy in liquid water. J. Chem. Phys. **1996**, 105, 4294–4316.
52. Singer, A. Maximum entropy formulation of the Kirkwood superposition approximation. J. Chem. Phys. **2004**, 121, 3657–3666.
53. Wallace, D.C. On the role of density fluctuations in the entropy of a fluid. J. Chem. Phys. **1987**, 87, 2281–2284.
54. Tan, Z.; Gallicchio, E.; Lapelosa, M.; Levy, R.M. Theory of binless multi-state free energy estimation with applications to protein-ligand binding. J. Chem. Phys. **2012**, 136, 144102.
55. Zhang, B.W.; Cui, D.; Matubayasi, N.; Levy, R.M. The Excess Chemical Potential of Water at the Interface with a Protein from End Point Simulations. J. Phys. Chem. B **2018**, 122, 4700–4707.
56. Reinhard, F.; Grubmüller, H. Estimation of absolute solvent and solvation shell entropies via permutation reduction. J. Chem. Phys. **2007**, 126, 014102.
57. Kozachenko, L.F.; Leonenko, N.N. Sample estimates of entropy of a random vector. Probl. Inf. Transm. **1987**, 23, 95–101.
58. Huynh, D.Q. Metrics for 3D rotations: Comparison and analysis. J. Math. Imaging Vis. **2009**, 35, 155–164.
59. Ó Searcóid, M. Metric Spaces; Springer: London, UK, 2007.
60. Miles, R.E. On random rotations in R³. Biometrika **1965**, 52, 636–639.
61. Kuhn, H.W. The Hungarian method for the assignment problem. Naval Res. Log. Quart. **1955**, 2, 83–97.
62. Knuth, D.E. Stanford GraphBase; ACM Press: New York, NY, USA, 1986.
63. Jorgensen, W.L.; Chandrasekhar, J.; Madura, J.D.; Impey, R.W.; Klein, M.L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. **1983**, 79, 926–935.
64. Martyna, G.; Tobias, D.; Klein, M. Constant pressure molecular dynamics algorithms. J. Chem. Phys. **1994**, 101, 4177–4189.
65. Feller, S.; Zhang, Y.; Pastor, R.; Brooks, B. Constant pressure molecular dynamics simulation: The Langevin piston method. J. Chem. Phys. **1995**, 103, 4613–4621.
66. Miyamoto, S.; Kollman, P.A. Settle: An analytical version of the SHAKE and RATTLE algorithm for rigid water models. J. Comput. Chem. **1992**, 13, 952–962.
67. Kale, L.; Skeel, R.; Bhandarkar, M.; Brunner, R.; Gursoy, A.; Krawetz, N.; Phillips, J.; Shinozaki, A.; Varadarajan, K.; Schulten, K. NAMD2: Greater scalability for parallel molecular dynamics. J. Comput. Phys. **1999**, 151, 283–312.
68. Humphrey, W.; Dalke, A.; Schulten, K. VMD Visual Molecular Dynamics. J. Mol. Graph. **1996**, 14, 33–38.

**Figure 1.** A sketch of some steps of the Hungarian algorithm. (**A**) The initial cost matrix (e.g., distances among five water molecules in the sample and reference configurations). The minimum in each row is underlined and subtracted from that row to yield a new cost matrix. (**B**) The nonzero column minima are underlined and subtracted from their columns to yield a new cost matrix. (**C**) In the new cost matrix, some “matching” zeroes (shown in bold) match one column and one row uniquely. Non-matching zeroes are underlined. Paths that start and end on an underlined zero and alternate between matching and non-matching zeroes, such as the one highlighted by the thick dashed line, are sought and new matching zeroes are chosen. If no such path exists, the algorithm performs subtractions and additions on the columns and rows that do not alter the solution but create new paths [62]. This step is iterated until completion, yielding a matrix with a unique chosen zero in each row and each column. (**D**) The set of chosen zeroes in the matrix (one per row and column) is the sought assignment; in this example, W1 in the sample is assigned the label W5, W2 is assigned the label W3, etc.
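The row and column subtractions described for panels A and B of Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 5×5 cost matrix is invented (it is not the matrix of the figure), the helper names `reduce_rows`, `reduce_columns` and `best_assignment` are hypothetical, and the optimal assignment is found by exhaustive search rather than by the alternating-path step of panel C, which is feasible only for very small matrices.

```python
from itertools import permutations


def reduce_rows(cost):
    """Subtract each row's minimum from that row (panel A)."""
    return [[c - min(row) for c in row] for row in cost]


def reduce_columns(cost):
    """Subtract each column's minimum from that column (panel B)."""
    col_min = [min(col) for col in zip(*cost)]
    return [[c - m for c, m in zip(row, col_min)] for row in cost]


def best_assignment(cost):
    """Exhaustive search over all permutations; only viable for tiny matrices."""
    n = len(cost)
    return min(
        permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )


# Hypothetical cost matrix: entry [i][j] stands for the distance between
# sample molecule W(i+1) and reference label W(j+1). Values are invented.
cost = [
    [4, 2, 8, 7, 3],
    [6, 4, 3, 7, 5],
    [5, 3, 9, 6, 5],
    [1, 6, 7, 3, 5],
    [8, 5, 4, 2, 9],
]

reduced = reduce_columns(reduce_rows(cost))
original_best = best_assignment(cost)
reduced_best = best_assignment(reduced)

# The reductions shift the total of every complete assignment by the same
# constant, so the optimal assignment is the same before and after.
print(original_best == reduced_best)  # True
for i, j in enumerate(original_best):
    print(f"W{i + 1} in the sample is assigned the label W{j + 1}")
```

Every complete assignment uses exactly one entry from each row and each column, so subtracting a constant from a row or a column lowers the total of all assignments by the same amount and cannot change which one is optimal. In this invented matrix the two reductions already expose one zero per row and column; in general the path search of panel C is still needed.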

**Figure 2.** 10,000 snapshots for a single molecule from the dynamics are shown in gray and 1000 snapshots in black after relabeling.

**Figure 3.** 10,000 snapshots for two neighbouring relabeled molecules from the dynamics are shown in gray and black, respectively.

**Figure 4.** The average estimated entropy (and its extrapolation to zero distance) is plotted against the average distance to the first 20 nearest neighbours with error bars on both distance and entropy estimate. The vertical error bars are equal to the standard deviations of the entropy estimate drawn for each individual molecule in a single snapshot from the first 20 neighbours in all other snapshots.
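The kind of curve described in the Figure 4 caption, an entropy estimate for each of the first 20 neighbours plotted against the mean neighbour distance and extrapolated to zero distance, can be illustrated with a toy example. The sketch below rests on several assumptions that are not taken from the article: it works in one Euclidean dimension rather than in translational-rotational space, it uses the standard k-th nearest neighbour estimator in the form H_k = ⟨ln r_k⟩ + ln(2n) − L_{k−1} + γ (in nats, with L_{k−1} the (k−1)-th harmonic number and γ the Euler–Mascheroni constant), it extrapolates with an ordinary least-squares line, and the sample is synthetic (uniform on [0, 1), true entropy 0). All function names are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def knn_distances(sorted_x, k_max):
    """Distances from each point of a sorted 1D sample to its k_max nearest neighbours."""
    n = len(sorted_x)
    out = []
    for i, x in enumerate(sorted_x):
        # In sorted order the k_max nearest neighbours lie within k_max indices.
        lo, hi = max(0, i - k_max), min(n, i + k_max + 1)
        d = sorted(abs(x - sorted_x[j]) for j in range(lo, hi) if j != i)
        out.append(d[:k_max])
    return out


def knn_entropy_curve(sample, k_max=20):
    """Return (mean k-th neighbour distance, entropy estimate in nats) for k = 1..k_max."""
    xs = sorted(sample)
    n = len(xs)
    dists = knn_distances(xs, k_max)
    curve = []
    harmonic = 0.0  # L_{k-1}, with L_0 = 0
    for k in range(1, k_max + 1):
        r_k = [d[k - 1] for d in dists]
        mean_log = sum(math.log(r) for r in r_k) / n
        # One dimension: a ball of radius r has volume 2r.
        h_k = mean_log + math.log(2 * n) - harmonic + EULER_GAMMA
        curve.append((sum(r_k) / n, h_k))
        harmonic += 1.0 / k
    return curve


def extrapolate_to_zero(curve):
    """Intercept of a least-squares line through the (distance, entropy) points."""
    m = len(curve)
    mx = sum(x for x, _ in curve) / m
    my = sum(y for _, y in curve) / m
    sxx = sum((x - mx) ** 2 for x, _ in curve)
    sxy = sum((x - mx) * (y - my) for x, y in curve)
    return my - (sxy / sxx) * mx


random.seed(0)
sample = [random.random() for _ in range(2000)]  # synthetic data, true entropy 0
curve = knn_entropy_curve(sample)
h_zero = extrapolate_to_zero(curve)
print(f"extrapolated entropy: {h_zero:.3f} nats")
```

Larger k reduces the statistical noise of the estimate but averages the density over a wider neighbourhood, which is why an extrapolation toward zero distance is of interest. The linear form used here is only one possible choice.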

**Figure 5.** 100 snapshots along a 10 ns trajectory of the molecules matched to the four molecules closest to the fixed water in the reference snapshot. Oxygen atoms are shown in red and hydrogen atoms in white.

**Figure 6.** The entropy of the relabeled molecules (continuous line), indexed by distance from the fixed molecule. The entropy of non-interacting symmetric molecules is shown by the broken line.


© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Fogolari, F.; Esposito, G.
Optimal Relabeling of Water Molecules and Single-Molecule Entropy Estimation. *Biophysica* **2021**, *1*, 279-296.
https://doi.org/10.3390/biophysica1030021
