# Principal Component Analysis and Related Methods for Investigating the Dynamics of Biological Macromolecules

## Abstract


## 1. Historical Overview

^{−1}. Since the vibrational frequencies of bond-stretching modes are higher than this by three orders of magnitude, the amplitudes of the lowest- and highest-frequency modes also differ by about three orders of magnitude, indicating the highly anisotropic nature of proteins even within the range of vibrational motions. This high anisotropy may be partly attributed to the tightly packed structures of folded native proteins, whose packing densities are comparable to that of a face-centered cubic lattice [7]. In tightly packed structures, local motions uncorrelated with the surroundings are limited to small amplitudes because of possible collisions, whereas groups of atoms such as protein domains or loops can move concertedly in certain directions largely without altering atomic packing. In 1981, Karplus and Kushick proposed a method to estimate the configurational entropy of macromolecules from NMA, MD and Monte Carlo (MC) simulations [8]. That publication also showed that simulations with the harmonic approximation (NMA) and without it (MD and MC) can be connected by PCA. The first reported protein MD simulation was 8.8 ps long [3], which roughly corresponds to one period of the lowest-frequency normal mode of typical small globular proteins and was thus too short to sample large-amplitude motions of the protein. However, increasing simulation lengths allowed investigation of the quasi-harmonic features of butane and BPTI, mainly focusing on quasi-harmonic frequencies deduced from PCA [9,10]. Later, projecting simulation trajectories onto collective coordinates was shown to be very useful for characterizing dominant protein dynamics, although the earliest work used low-frequency normal modes as the collective coordinates [11]. Since normal modes are determined from a single energy minimum, they are not necessarily the best choice for investigating the anharmonic nature of protein dynamics.
In contrast, PCA determines principal coordinates as the collective coordinates, which incorporate the anharmonic features included in the MD or MC trajectory. Longer and more realistic MD simulations in solution were performed from the 1980s to the early 1990s, allowing the PCA of MD trajectories. In the early 1990s, the anisotropic and anharmonic nature of native protein dynamics was elucidated by PCA, focusing on principal components (PCs), defined as the projections onto the principal coordinates [12,13,14,15]. PCA was also shown to be useful for analyzing simulation trajectories of protein folding/non-folding dynamics [16,17]. The past three decades have seen the frequent use of PCA to investigate the dynamic behavior of biopolymers, as well as many important methodological improvements and the elucidation of simulated dynamic features [18,19,20,21,22,23]. Since PCA employs a variance–covariance matrix for dimensionality reduction, it is useful for characterizing large-amplitude conformational changes in molecules, such as protein domain motion and folding. However, PCA may not be sensitive to localized, small-amplitude but functionally important motions, such as backrub motion [24], peptide-plane flip [25], side-chain flip and path-preserving motions [26].

## 2. Basic Concept behind PCA

## 3. Error in PCA

## 4. Relation with NMA

## 5. Solvent and Other Environmental Effects on Macromolecular Dynamics

^{−1}[15].

## 6. Choice of Variables and Spaces for Better Representation of Macromolecular Dynamics in PCA

C$_{\alpha}$ atoms is useful for selecting a small number of large-amplitude motions, namely, “essential dynamics” [14]. Raw atomic coordinates from the original dataset typically reflect internal movements of the selected atoms, as well as overall translation and rotation. The translational and rotational components can be eliminated by best-fitting each dataset (typically a snapshot of a simulation trajectory) to a reference dataset so that the Eckart condition [57] is satisfied, for example, using the Kabsch method [58] or another method. To set the average $\langle q \rangle$ as the origin of the coordinates, the $\langle q \rangle$ obtained after one round of best fit should be used as the reference for the next round [12]. With this procedure, $\langle q \rangle$ converges quickly, within about five cycles. Once the translational and rotational components are completely eliminated, Cartesian PCA yields (3N − 6) positive eigenvalues for internal motions and six zero eigenvalues corresponding to the PCs of translation and rotation.
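The iterative best-fit procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the cited works: the function names (`kabsch_fit`, `iterative_superpose`, `cartesian_pca`), the synthetic ten-atom trajectory and the use of equal atomic masses are all hypothetical choices made for the example.

```python
import numpy as np

def kabsch_fit(mobile, reference):
    """Least-squares superposition of `mobile` (N x 3) onto `reference` (N x 3).

    Equal masses are assumed; both structures are centered at the origin.
    """
    mob_c = mobile - mobile.mean(axis=0)        # remove translation
    ref_c = reference - reference.mean(axis=0)
    h = mob_c.T @ ref_c                         # 3 x 3 cross-covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(u @ vt))          # avoid improper rotations (reflections)
    rot = u @ np.diag([1.0, 1.0, d]) @ vt       # rotation for row-vector coordinates
    return mob_c @ rot

def iterative_superpose(traj, n_cycles=5):
    """Fit every frame to a reference, then replace the reference by the average."""
    ref = traj[0]
    for _ in range(n_cycles):
        fitted = np.array([kabsch_fit(frame, ref) for frame in traj])
        ref = fitted.mean(axis=0)               # average structure <q> for the next round
    return fitted, ref

def cartesian_pca(fitted):
    """Diagonalize the variance-covariance matrix of the fitted coordinates."""
    x = fitted.reshape(len(fitted), -1)         # frames x 3N
    dx = x - x.mean(axis=0)
    cov = dx.T @ dx / len(x)
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]             # largest variance first
    return evals[order], evecs[:, order]

def random_rotation(rng):
    """Random proper rotation matrix (toy helper for the synthetic data)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

# Synthetic trajectory: small internal noise plus arbitrary rigid-body motion.
rng = np.random.default_rng(0)
n_atoms, n_frames = 10, 500
base = rng.normal(scale=3.0, size=(n_atoms, 3))
traj = np.empty((n_frames, n_atoms, 3))
for t in range(n_frames):
    internal = base + rng.normal(scale=0.1, size=(n_atoms, 3))
    traj[t] = internal @ random_rotation(rng).T + rng.normal(scale=5.0, size=3)

fitted, mean_structure = iterative_superpose(traj)
evals, evecs = cartesian_pca(fitted)
n_zero = int(np.sum(evals < 1e-10 * evals[0]))  # numerically zero eigenvalues
print(n_zero, "near-zero eigenvalues out of", evals.size)
```

For this toy system with N = 10, the count of numerically zero eigenvalues should be six, leaving 3N − 6 = 24 positive eigenvalues for internal motions. A mass-weighted fit would require weighting the centroid and the cross-covariance matrix by atomic masses, which is omitted here for brevity.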

C$_{\alpha}$ and C) and dihedral angles (ϕ, ψ, and ω) of small globular proteins, Omori et al. also showed that ${\tilde{C}}_{\theta}$ precisely recovers the information of ${C}_{r}$ and contains higher-order dihedral correlations, but ${C}_{\theta}$ does not [60]. Additionally, the mean-square atomic displacements tended to be minimized upon rotation of the dihedral angles, indicating the compensatory nature of dihedral dynamics. However, such latent dynamic behavior was not seen in the dihedral PCs of deca-alanine, a short peptide.

C$_{\alpha}$ distances (C$_{\alpha}$ PCA) and compared each result to those obtained using dPCA [73]. For conPCA, a residue pair is counted as a contact if the distance between its closest heavy atoms is less than 4.5 Å and the two residues are separated by more than three residues along the sequence [73,74]. Thus, conPCA can consider side chains in contact with each other but excludes information regarding local fluctuation along the sequence. Using 300 μs HP35 and 1 ms BPTI MD trajectories and examining the resolution of the free energy landscape and the decay of autocorrelation functions, Ernst et al. showed that distance-based PCAs, particularly C$_{\alpha}$ PCA, tend to be versatile, but they exhibit fewer landscape details than dPCA does [73]. Recently, Ogata proposed grid-based PCA (GBPCA), which considers a grid system consisting of cubes with 5 Å edges [75]. This method uses a unit vector of mass-weighted averages of atoms in each cube to calculate the correlations to be diagonalized and was applied to bulk water, bulk methane and hydrated proteins.
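The contact criterion used for conPCA can be illustrated for a single snapshot as follows. This is a hedged sketch rather than the procedure of Ernst et al.: the function names, the data layout (one array of heavy-atom coordinates per residue) and the toy hairpin geometry are assumptions made for the example, and "separated by more than three residues" is read here as |i − j| > 3.

```python
import numpy as np

def min_heavy_atom_distance(res_a, res_b):
    """Smallest distance between any heavy atom of res_a and any of res_b (in Å)."""
    diff = res_a[:, None, :] - res_b[None, :, :]
    return float(np.sqrt((diff ** 2).sum(axis=-1)).min())

def contact_pairs(residues, cutoff=4.5, min_separation=3):
    """Return residue pairs (i, j) with |i - j| > min_separation that are in contact.

    residues: list of (n_atoms_i, 3) arrays holding heavy-atom coordinates.
    """
    pairs = []
    n = len(residues)
    for i in range(n):
        for j in range(i + min_separation + 1, n):   # enforces |i - j| > min_separation
            if min_heavy_atom_distance(residues[i], residues[j]) < cutoff:
                pairs.append((i, j))
    return pairs

# Toy hairpin: eight residues, two heavy atoms each, strands 4.0 Å apart.
xs = [0.0, 3.8, 7.6, 11.4, 11.4, 7.6, 3.8, 0.0]
ys = [0.0, 0.0, 0.0, 0.0, 4.0, 4.0, 4.0, 4.0]
residues = [np.array([[x, y, 0.0], [x, y, 1.0]]) for x, y in zip(xs, ys)]

contacts = contact_pairs(residues)
print(contacts)
```

In this toy geometry only the cross-strand pairs (0, 7) and (1, 6) pass both tests; pairs (2, 5) and (3, 4) are within 4.5 Å but are too close along the sequence. In a contact-based PCA, the distances of the selected pairs in each snapshot would then serve as the input variables for the covariance matrix.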

## 7. The Fluctuation–Dissipation Theorem, Linear Response Theory and PCA

## 8. Non-Gaussianity and Non-Linearity in PCA

## 9. Detecting Data Differences by PCA and Related Methods

## 10. Time Evolution of Collective Variables

## 11. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References

1. Pearson, K. LIII. On lines and planes of closest fit to systems of points in space. Lond. Edinb. Dublin Philos. Mag. J. Sci. **1901**, 2, 559–572.
2. Hotelling, H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. **1933**, 24, 417–441.
3. McCammon, J.A.; Gelin, B.R.; Karplus, M. Dynamics of Folded Proteins. Nature **1977**, 267, 585–590.
4. Go, N.; Noguti, T.; Nishikawa, T. Dynamics of a Small Globular Protein in Terms of Low-Frequency Vibrational-Modes. Proc. Natl. Acad. Sci. USA **1983**, 80, 3696–3700.
5. Levitt, M.; Sander, C.; Stern, P.S. The normal modes of a protein: Native bovine pancreatic trypsin inhibitor. Int. J. Quant. Chem. **1983**, 24, 181–199.
6. Brooks, B.; Karplus, M. Harmonic dynamics of proteins: Normal modes and fluctuations in bovine pancreatic trypsin inhibitor. Proc. Natl. Acad. Sci. USA **1983**, 80, 6571–6575.
7. Richards, F.M. The interpretation of protein structures: Total volume, group volume distributions and packing density. J. Mol. Biol. **1974**, 82, 1–14.
8. Karplus, M.; Kushick, J.N. Method for estimating the configurational entropy of macromolecules. Macromolecules **1981**, 14, 325–332.
9. Levy, R.M.; Srinivasan, A.R.; Olson, W.K.; McCammon, J.A. Quasi-harmonic method for studying very low frequency modes in proteins. Biopolymers **1984**, 23, 1099–1112.
10. Levy, R.M.; Rojas, O.D.; Friesner, R.A. Quasi-Harmonic Method for Calculating Vibrational-Spectra from Classical Simulations on Multidimensional Anharmonic Potential Surfaces. J. Phys. Chem. **1984**, 88, 4233–4238.
11. Horiuchi, T.; Go, N. Projection of Monte Carlo and molecular dynamics trajectories onto the normal mode axes: Human lysozyme. Proteins Struct. Funct. Genet. **1991**, 10, 106–116.
12. Kitao, A.; Hirata, F.; Gō, N. The effects of solvent on the conformation and the collective motions of protein: Normal mode analysis and molecular dynamics simulations of melittin in water and in vacuum. Chem. Phys. **1991**, 158, 447–472.
13. García, A.E. Large-Amplitude Nonlinear Motions in Proteins. Phys. Rev. Lett. **1992**, 68, 2696–2699.
14. Amadei, A.; Linssen, A.B.M.; Berendsen, H.J.C. Essential Dynamics of Proteins. Proteins Struct. Funct. Genet. **1993**, 17, 412–425.
15. Hayward, S.; Kitao, A.; Hirata, F.; Go, N. Effect of solvent on collective motions in globular protein. J. Mol. Biol. **1993**, 234, 1207–1217.
16. Maisuradze, G.G.; Liwo, A.; Scheraga, H.A. Principal component analysis for protein folding dynamics. J. Mol. Biol. **2009**, 385, 312–329.
17. Maisuradze, G.G.; Liwo, A.; Senet, P.; Scheraga, H.A. Local vs global motions in protein folding. J. Chem. Theory Comput. **2013**, 9, 2907–2921.
18. Hayward, S.; Go, N. Collective Variable Description of Native Protein Dynamics. Annu. Rev. Phys. Chem. **1995**, 46, 223–250.
19. Kitao, A.; Go, N. Investigating protein dynamics in collective coordinate space. Curr. Opin. Struct. Biol. **1999**, 9, 164–169.
20. Berendsen, H.J.C.; Hayward, S. Collective protein dynamics in relation to function. Curr. Opin. Struct. Biol. **2000**, 10, 165–169.
21. David, C.C.; Jacobs, D.J. Principal component analysis: A method for determining the essential dynamics of proteins. Methods Mol. Biol. **2014**, 1084, 193–226.
22. Kitao, A.; Takemura, K. High anisotropy and frustration: The keys to regulating protein function efficiently in crowded environments. Curr. Opin. Struct. Biol. **2017**, 42, 50–58.
23. Sittel, F.; Stock, G. Perspective: Identification of collective variables and metastable states of protein dynamics. J. Chem. Phys. **2018**, 149, 150901.
24. Davis, I.W.; Arendall, W.B., 3rd; Richardson, D.C.; Richardson, J.S. The backrub motion: How protein backbone shrugs when a sidechain dances. Structure **2006**, 14, 265–274.
25. Hayward, S. Peptide-plane flipping in proteins. Protein Sci. **2001**, 10, 2219–2227.
26. Nishima, W.; Qi, G.; Hayward, S.; Kitao, A. DTA: Dihedral transition analysis for characterization of the effects of large main-chain dihedral changes in proteins. Bioinformatics **2009**, 25, 628–635.
27. Kitao, A.; Hayward, S.; Go, N. Energy landscape of a native protein: Jumping-among-minima model. Proteins Struct. Funct. Genet. **1998**, 33, 496–517.
28. Joti, Y.; Kitao, A.; Go, N. Protein boson peak originated from hydration-related multiple minima energy landscape. J. Am. Chem. Soc. **2005**, 127, 8705–8709.
29. Kitao, A.; Wagner, G. A space-time structure determination of human CD2 reveals the CD58-binding mode. Proc. Natl. Acad. Sci. USA **2000**, 97, 2064–2068.
30. Kitao, A.; Wagner, G. Amplitudes and directions of internal protein motions from a JAM analysis of 15N relaxation data. Magn. Reson. Chem. **2006**, 44, S130–S142.
31. Hess, B. Similarities between principal components of protein dynamics and random diffusion. Phys. Rev. E **2000**, 62, 8438–8448.
32. Edelman, A.; Rao, N.R. Random matrix theory. Acta Numer. **2005**, 14, 233–297.
33. Kwapień, J.; Drożdż, S. Physical approach to complex systems. Phys. Rep. **2012**, 515, 115–226.
34. Palese, L.L. Random Matrix Theory in molecular dynamics analysis. Biophys. Chem. **2015**, 196, 1–9.
35. Palese, L.L. A random version of principal component analysis in data clustering. Comput. Biol. Chem. **2018**, 73, 57–64.
36. Cossio-Perez, R.; Palma, J.; Pierdominici-Sottile, G. Consistent Principal Component Modes from Molecular Dynamics Simulations of Proteins. J. Chem. Inf. Model. **2017**, 57, 826–834.
37. Hayward, S.; Kitao, A.; Go, N. Harmonicity and anharmonicity in protein dynamics: A normal mode analysis and principal component analysis. Proteins Struct. Funct. Genet. **1995**, 23, 177–186.
38. Bahar, I.; Rader, A.J. Coarse-grained normal mode analysis in structural biology. Curr. Opin. Struct. Biol. **2005**, 15, 586–592.
39. Ma, J. Usefulness and limitations of normal mode analysis in modeling dynamics of biomolecular complexes. Structure **2005**, 13, 373–380.
40. Normal Mode Analysis: Theory and Applications to Biological and Chemical Systems; Cui, Q., Bahar, I., Eds.; Chapman & Hall/CRC: Boca Raton, FL, USA, 2006.
41. Dykeman, E.C.; Sankey, O.F. Normal mode analysis and applications in biological physics. J. Phys. Condens. Matter **2010**, 22, 423202.
42. Yamato, T.; Laprevote, O. Normal mode analysis and beyond. Biophys. Physicobiol. **2019**, 16, 322–327.
43. Jacob, A.B.; Vladena, B.-H. Normal Mode Analysis: A Tool for Better Understanding Protein Flexibility and Dynamics with Application to Homology Models. In Homology Molecular Modeling; Rafael Trindade, M., de Moraes, F.R.M., Magnólia, C., Eds.; IntechOpen: Rijeka, Croatia, 2021.
44. Moritsugu, K.; Smith, J.C. Langevin model of the temperature and hydration dependence of protein vibrational dynamics. J. Phys. Chem. B **2005**, 109, 12182–12194.
45. Moritsugu, K.; Smith, J.C. Temperature-dependent protein dynamics: A simulation-based probabilistic diffusion-vibration Langevin description. J. Phys. Chem. B **2006**, 110, 5807–5816.
46. Lamm, G.; Szabo, A. Langevin Modes of Macromolecules. J. Chem. Phys. **1986**, 85, 7334–7348.
47. Kottalam, J.; Case, D.A. Langevin Modes of Macromolecules—Applications to Crambin and DNA Hexamers. Biopolymers **1990**, 29, 1409–1421.
48. Kirkwood, J.G.; Riseman, J. The Intrinsic Viscosities and Diffusion Constants of Flexible Macromolecules in Solution. J. Chem. Phys. **1948**, 16, 565–573.
49. Kirkwood, J.G. The statistical mechanical theory of irreversible processes in solutions of flexible macromolecules. Visco-elastic behavior. Recl. Trav. Chim. Pays-Bas **1949**, 68, 649–660.
50. Rotne, J.; Prager, S. Variational Treatment of Hydrodynamic Interaction in Polymers. J. Chem. Phys. **1969**, 50, 4831–4837.
51. Kim, B.; Hirata, F. Structural fluctuation of protein in water around its native state: A new statistical mechanics formulation. J. Chem. Phys. **2013**, 138, 054108.
52. Hirata, F.; Kim, B. Multi-scale dynamics simulation of protein based on the generalized Langevin equation combined with 3D-RISM theory. J. Mol. Liq. **2016**, 217, 23–28.
53. Chong, S.-H.; Hirata, F. Dynamics of solvated ion in polar liquids: An interaction-site-model description. J. Chem. Phys. **1998**, 108, 7339–7349.
54. Chong, S.-H.; Hirata, F. Dynamics of ions in liquid water: An interaction-site-model description. J. Chem. Phys. **1999**, 111, 3654–3667.
55. Hirata, F. On the interpretation of the temperature dependence of the mean square displacement (MSD) of protein, obtained from the incoherent neutron scattering. J. Mol. Liq. **2018**, 270, 218–226.
56. Hayward, S.; Kitao, A.; Go, N. Harmonic and anharmonic aspects in the dynamics of BPTI: A normal mode analysis and principal component analysis. Protein Sci. **1994**, 3, 936–943.
57. Eckart, C. Some studies concerning rotating axes and polyatomic molecules. Phys. Rev. **1935**, 47, 552–558.
58. Kabsch, W. Solution for Best Rotation to Relate 2 Sets of Vectors. Acta Crystallogr. A **1976**, 32, 922–923.
59. Omori, S.; Fuchigami, S.; Ikeguchi, M.; Kidera, A. Linear response theory in dihedral angle space for protein structural change upon ligand binding. J. Comput. Chem. **2009**, 30, 2602–2608.
60. Omori, S.; Fuchigami, S.; Ikeguchi, M.; Kidera, A. Latent dynamics of a protein molecule observed in dihedral angle space. J. Chem. Phys. **2010**, 132, 115103.
61. Mu, Y.G.; Nguyen, P.H.; Stock, G. Energy landscape of a small peptide revealed by dihedral angle principal component analysis. Proteins **2005**, 58, 45–52.
62. Altis, A.; Nguyen, P.H.; Hegger, R.; Stock, G. Dihedral angle principal component analysis of molecular dynamics simulations. J. Chem. Phys. **2007**, 126, 244111.
63. Altis, A.; Otten, M.; Nguyen, P.H.; Hegger, R.; Stock, G. Construction of the free energy landscape of biomolecules via dihedral angle principal component analysis. J. Chem. Phys. **2008**, 128, 245102.
64. Sittel, F.; Jain, A.; Stock, G. Principal component analysis of molecular dynamics: On the use of Cartesian vs. internal coordinates. J. Chem. Phys. **2014**, 141, 014111.
65. Huckemann, S.; Ziezold, H. Principal component analysis for Riemannian manifolds, with an application to triangular shape spaces. Adv. Appl. Probab. **2006**, 38, 299–319.
66. Sargsyan, K.; Wright, J.; Lim, C. GeoPCA: A new tool for multivariate analysis of dihedral angles based on principal component geodesics. Nucleic Acids Res. **2012**, 40, e25; Erratum in Nucleic Acids Res. **2015**, 43, 10571–10572.
67. Nodehi, A.; Golalizadeh, M.; Heydari, A. Dihedral angles principal geodesic analysis using nonlinear statistics. J. Appl. Stat. **2015**, 42, 1962–1972.
68. Eltzner, B.; Huckemann, S.; Mardia, K.V. Torus principal component analysis with applications to RNA structure. Ann. Appl. Stat. **2018**, 12, 1332–1359.
69. Sittel, F.; Filk, T.; Stock, G. Principal component analysis on a torus: Theory and application to protein dynamics. J. Chem. Phys. **2017**, 147, 244101.
70. Post, M.; Wolf, S.; Stock, G. Principal component analysis of nonequilibrium molecular dynamics simulations. J. Chem. Phys. **2019**, 150, 204110.
71. Abagyan, R.; Argos, P. Optimal protocol and trajectory visualization for conformational searches of peptides and proteins. J. Mol. Biol. **1992**, 225, 519–532.
72. David, C.C.; Singam, E.R.A.; Jacobs, D.J. JED: A Java Essential Dynamics Program for comparative analysis of protein trajectories. BMC Bioinform. **2017**, 18, 271.
73. Ernst, M.; Sittel, F.; Stock, G. Contact- and distance-based principal component analysis of protein dynamics. J. Chem. Phys. **2015**, 143, 244114.
74. Heringa, J.; Argos, P. Side-chain clusters in protein structures and their role in protein folding. J. Mol. Biol. **1991**, 220, 151–171.
75. Ogata, K. Investigation of Cooperative Modes for Collective Molecules Using Grid-Based Principal Component Analysis. J. Phys. Chem. B **2021**, 125, 1072–1084.
76. Beattie, J.R.; Esmonde-White, F.W.L. Exploration of Principal Component Analysis: Deriving Principal Component Analysis Visually Using Spectra. Appl. Spectrosc. **2021**, 75, 361–375.
77. Cochran, R.N.; Horne, F.H. Strategy for resolving rapid scanning wavelength experiments by principal component analysis. J. Phys. Chem. **1980**, 84, 2561–2567.
78. Cochran, R.N.; Horne, F.H.; Dye, J.L.; Ceraso, J.; Suelter, C.H. Principal component analysis of rapid scanning wavelength stopped-flow kinetics experiments on the liver alcohol dehydrogenase catalyzed reduction of p-nitroso-N,N-dimethylaniline by 1,4-dihydronicotinamide adenine dinucleotide. J. Phys. Chem. **1980**, 84, 2567–2575.
79. Yuan, B.; Murayama, K.; Wu, Y.; Tsenkova, R.; Dou, X.; Era, S.; Ozaki, Y. Temperature-dependent near-infrared spectra of bovine serum albumin in aqueous solutions: Spectral analysis by principal component analysis and evolving factor analysis. Appl. Spectrosc. **2003**, 57, 1223–1229.
80. Sakurai, K.; Goto, Y. Principal component analysis of the pH-dependent conformational transitions of bovine beta-lactoglobulin monitored by heteronuclear NMR. Proc. Natl. Acad. Sci. USA **2007**, 104, 15346–15351.
81. Henry, E.R. The Use of Matrix Methods in the Modeling of Spectroscopic Data Sets. Biophys. J. **1997**, 72, 652–673.
82. Shrager, R.I.; Hendler, R.W. Titration of individual components in a mixture with resolution of difference spectra, pKs, and redox transitions. Anal. Chem. **1982**, 54, 1147–1152.
83. Hofrichter, J.; Sommer, J.H.; Henry, E.R.; Eaton, W.A. Nanosecond absorption spectroscopy of hemoglobin: Elementary processes in kinetic cooperativity. Proc. Natl. Acad. Sci. USA **1983**, 80, 2235–2239.
84. Schmidt, M.; Rajagopal, S.; Ren, Z.; Moffat, K. Application of Singular Value Decomposition to the Analysis of Time-Resolved Macromolecular X-Ray Data. Biophys. J. **2003**, 84, 2112–2129.
85. Rajagopal, S.; Schmidt, M.; Anderson, S.; Ihee, H.; Moffat, K. Analysis of experimental time-resolved crystallographic data by singular value decomposition. Acta Crystallogr. D **2004**, 60, 860–871.
86. Kostov, K.S.; Moffat, K. Cluster analysis of time-dependent crystallographic data: Direct identification of time-independent structural intermediates. Biophys. J. **2011**, 100, 440–449.
87. Kubo, R. The fluctuation-dissipation theorem. Rep. Prog. Phys. **1966**, 29, 255–284.
88. Des Cloizeaux, D. Linear Response, Generalized Susceptibility and Dispersion Theory. In Theory of Condensed Matter; Bassani, F., Caglioti, G., Ziman, J., Eds.; International Center for Theoretical Physics: Trieste, Italy, 1968; pp. 325–354.
89. Ikeguchi, M.; Ueno, J.; Sato, M.; Kidera, A. Protein structural change upon ligand binding: Linear response theory. Phys. Rev. Lett. **2005**, 94, 078102.
90. Yang, L.W.; Kitao, A.; Huang, B.C.; Go, N. Ligand-Induced Protein Responses and Mechanical Signal Propagation Described by Linear Response Theories. Biophys. J. **2014**, 107, 1415–1425.
91. Hirata, F. A molecular theory of the structural dynamics of protein induced by a perturbation. J. Chem. Phys. **2016**, 145, 234106.
92. Kitao, A. Transform and relax sampling for highly anisotropic systems: Application to protein domain motion and folding. J. Chem. Phys. **2011**, 135, 045101; Erratum in J. Chem. Phys. **2011**, 135, 119903.
93. Tamura, K.; Hayashi, S. Linear Response Path Following: A Molecular Dynamics Method To Simulate Global Conformational Changes of Protein upon Ligand Binding. J. Chem. Theory Comput. **2015**, 11, 2900–2917.
94. Tamura, K.; Hayashi, S. Atomistic modeling of alternating access of a mitochondrial ADP/ATP membrane transporter with molecular simulations. PLoS ONE **2017**, 12, e0181489.
95. Jutten, C.; Herault, J. Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture. Signal Process. **1991**, 24, 1–10.
96. Comon, P. Independent component analysis, A new concept? Signal Process. **1994**, 36, 287–314.
97. Hyvärinen, A.; Oja, E. Independent component analysis: Algorithms and applications. Neural Netw. **2000**, 13, 411–430.
98. Lange, O.F.; Grubmüller, H. Full correlation analysis of conformational protein dynamics. Proteins **2008**, 70, 1294–1312.
99. Nguyen, P.H. Conformational states and folding pathways of peptides revealed by principal-independent component analyses. Proteins **2007**, 67, 579–592.
100. Sakuraba, S.; Joti, Y.; Kitao, A. Detecting coupled collective motions in protein by independent subspace analysis. J. Chem. Phys. **2010**, 133, 185102.
101. Theis, F.J. Towards a general independent subspace analysis. In Advances in Neural Information Processing Systems; Schölkopf, B., Platt, J., Hoffman, T.E., Eds.; MIT Press: Cambridge, MA, USA, 2007; Volume 19, pp. 1361–1368.
102. Nguyen, P.H. Complexity of free energy landscapes of peptides revealed by nonlinear principal component analysis. Proteins **2006**, 65, 898–913.
103. Kramer, M.A. Nonlinear principal component analysis using autoassociative neural networks. AIChE J. **1991**, 37, 233–243.
104. Schölkopf, B.; Smola, A.; Müller, K.-R. Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Comput. **1998**, 10, 1299–1319.
105. Coifman, R.R.; Lafon, S.; Lee, A.B.; Maggioni, M.; Nadler, B.; Warner, F.; Zucker, S.W. Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps. Proc. Natl. Acad. Sci. USA **2005**, 102, 7426–7431.
106. Coifman, R.R.; Lafon, S. Diffusion maps. Appl. Comput. Harmon. Anal. **2006**, 21, 5–30.
107. de la Porte, J.; Herbst, B.M.; Hereman, W.; van der Walt, S.J. An Introduction to Diffusion Maps. In Proceedings of the 19th Symposium of the Pattern Recognition Association of South Africa (PRASA 2008), Cape Town, South Africa, 27–28 November 2008.
108. Ferguson, A.L.; Zhang, S.; Dikiy, I.; Panagiotopoulos, A.Z.; Debenedetti, P.G.; James Link, A. An experimental and computational investigation of spontaneous lasso formation in microcin J25. Biophys. J. **2010**, 99, 3056–3065.
109. Kim, S.B.; Dsilva, C.J.; Kevrekidis, I.G.; Debenedetti, P.G. Systematic characterization of protein folding pathways using diffusion maps: Application to Trp-cage miniprotein. J. Chem. Phys. **2015**, 142, 085101.
110. Trstanova, Z.; Leimkuhler, B.; Lelièvre, T. Local and global perspectives on diffusion maps in the analysis of molecular systems. Proc. R. Soc. A Math. Phys. Eng. Sci. **2020**, 476, 20190036.
111. Hus, J.C.; Bruschweiler, R. Principal component method for assessing structural heterogeneity across multiple alignment media. J. Biomol. NMR **2002**, 24, 123–132.
112. Howe, P.W. Principal components analysis of protein structure ensembles calculated using NMR data. J. Biomol. NMR **2001**, 20, 61–70.
113. Yang, L.W.; Eyal, E.; Bahar, I.; Kitao, A. Principal component analysis of native ensembles of biomolecular structures (PCA_NEST): Insights into functional dynamics. Bioinformatics **2009**, 25, 606–614; Erratum in Bioinformatics **2009**, 25, 2147–2147.
114. Sakuraba, S.; Kono, H. Spotting the difference in molecular dynamics simulations of biomolecules. J. Chem. Phys. **2016**, 145, 074116.
115. Wang, H.; Yan, S.; Xu, D.; Tang, X.; Huang, T. Trace Ratio vs. Ratio Trace for Dimensionality Reduction. In Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition, 17–22 June 2007; pp. 1–8.
116. Ngo, T.T.; Bellalij, M.; Saad, Y. The Trace Ratio Optimization Problem. SIAM Rev. **2012**, 54, 545–569.
117. Peters, J.H.; de Groot, B.L. Ubiquitin dynamics in complexes reveal molecular recognition mechanisms beyond induced fit and conformational selection. PLoS Comput. Biol. **2012**, 8, e1002704.
118. Ahmad, M.; Helms, V.; Kalinina, O.V.; Lengauer, T. Relative Principal Components Analysis: Application to Analyzing Biomolecular Conformational Changes. J. Chem. Theory Comput. **2019**, 15, 2166–2178.
119. Molgedey, L.; Schuster, H.G. Separation of a Mixture of Independent Signals Using Time-Delayed Correlations. Phys. Rev. Lett. **1994**, 72, 3634–3637.
120. Naritomi, Y.; Fuchigami, S. Slow dynamics in protein fluctuations revealed by time-structure based independent component analysis: The case of domain motions. J. Chem. Phys. **2011**, 134, 065101.
121. Naritomi, Y.; Fuchigami, S. Slow dynamics of a protein backbone in molecular dynamics simulation revealed by time-structure based independent component analysis. J. Chem. Phys. **2013**, 139, 215102.
122. Mori, T.; Saito, S. Dynamic heterogeneity in the folding/unfolding transitions of FiP35. J. Chem. Phys. **2015**, 142, 135101.
123. Takano, H.; Miyashita, S. Relaxation Modes in Random Spin Systems. J. Phys. Soc. Jpn. **1995**, 64, 3688–3698.
124. Hirao, H.; Koseki, S.; Takano, H. Molecular Dynamics Study of Relaxation Modes of a Single Polymer Chain. J. Phys. Soc. Jpn. **1997**, 66, 3399–3405.
125. Koseki, S.; Hirao, H.; Takano, H. Monte Carlo Study of Relaxation Modes of a Single Polymer Chain. J. Phys. Soc. Jpn. **1997**, 66, 1631–1637.
126. Mitsutake, A.; Iijima, H.; Takano, H. Relaxation mode analysis of a peptide system: Comparison with principal component analysis. J. Chem. Phys. **2011**, 135, 164102.
127. Mitsutake, A.; Takano, H. Relaxation mode analysis and Markov state relaxation mode analysis for chignolin in aqueous solution near a transition temperature. J. Chem. Phys. **2015**, 143, 124111.
128. Karasawa, N.; Mitsutake, A.; Takano, H. Two-step relaxation mode analysis with multiple evolution times applied to all-atom molecular dynamics protein simulation. Phys. Rev. E **2017**, 96, 062408.
129. Schultze, S.; Grubmüller, H. Time-Lagged Independent Component Analysis of Random Walks and Protein Dynamics. J. Chem. Theory Comput. **2021**, 17, 5766–5776.
130. Morishita, T. Time-dependent principal component analysis: A unified approach to high-dimensional data reduction using adiabatic dynamics. J. Chem. Phys. **2021**, 155, 134114.
131. Perez-Hernandez, G.; Paul, F.; Giorgino, T.; De Fabritiis, G.; Noe, F. Identification of slow molecular order parameters for Markov model construction. J. Chem. Phys. **2013**, 139, 015102.
132. Scherer, M.K.; Trendelkamp-Schroer, B.; Paul, F.; Perez-Hernandez, G.; Hoffmann, M.; Plattner, N.; Wehmeyer, C.; Prinz, J.H.; Noe, F. PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models. J. Chem. Theory Comput. **2015**, 11, 5525–5542.
133. Harrigan, M.P.; Sultan, M.M.; Hernandez, C.X.; Husic, B.E.; Eastman, P.; Schwantes, C.R.; Beauchamp, K.A.; McGibbon, R.T.; Pande, V.S. MSMBuilder: Statistical Models for Biomolecular Dynamics. Biophys. J. **2017**, 112, 10–15.
134. Schwantes, C.R.; Pande, V.S. Modeling molecular kinetics with tICA and the kernel trick. J. Chem. Theory Comput. **2015**, 11, 600–608.
135. Husic, B.E.; Pande, V.S. Markov State Models: From an Art to a Science. J. Am. Chem. Soc. **2018**, 140, 2386–2396.
136. Wang, X.; Unarta, I.C.; Cheung, P.P.; Huang, X. Elucidating molecular mechanisms of functional conformational changes of proteins via Markov state models. Curr. Opin. Struct. Biol. **2021**, 67, 69–77.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kitao, A.
Principal Component Analysis and Related Methods for Investigating the Dynamics of Biological Macromolecules. *J* **2022**, *5*, 298-317.
https://doi.org/10.3390/j5020021
