# The Automatic Solution of Macromolecular Crystal Structures via Molecular Replacement Techniques: REMO22 and Its Pipeline


## Abstract


## 1. Introduction

- (i) Finding a good enough model. If the sequence identity between the known structure and the target is low, the solution of the phase problem may be hindered.
- (ii) An efficient MR program to orient and translate the model molecules correctly in the target asymmetric unit. This program must be able to handle cases where the search model is far from optimal, since in such cases even well-defined rotation and translation parameters can lead to a large mean phase error.
- (iii) The phase extension and refinement step. The MR module often produces a large phase error on a limited number of reflections. This step is usually accomplished through electron density modification (EDM) techniques included in large crystallographic packages such as CNS [34], CCP4 [35], SHARP [36], PHENIX [37] and the SHELX series [38]. Burla et al. [39] described a procedure, named SYNERGY, which combines DM by Cowtan [40] with out-of-mainstream techniques such as free lunch [41,42], low-density Fourier transform [43], vive la difference [44,45], phantom derivative [46,47], and phase driven model refinement [48].
- (iv) An automated model building (AMB) program to generate a model that fits the experimental data. Popular AMB programs include BUCCANEER [49] for proteins, NAUTILUS [50] for nucleic acids, ARP/wARP [51] for proteins and nucleic acids, and the PHENIX AUTOBUILD wizard [52] for proteins and nucleic acids. Recently, a new cyclic AMB procedure called CAB [53], which uses BUCCANEER for protein model building and NAUTILUS for nucleic acid building, has been developed and shown to be highly efficient in experimental applications (see Papers I–III [54,55,56]).

## 2. Results

#### 2.1. About REMO22

<R_{R}>, <R_{P}> and <R_{M}>, respectively, were calculated for each subset of test structures.

NR30_{R}, NR30_{P} and NR30_{M} entries, which indicate the number of cases in which each program produced an R value smaller than 0.30. Such cases represent high-quality MR models that do not require further refinement before being submitted to an AMB program. REMO22 produced models meeting this criterion in 61 cases, while PHASER and MOLREP did so in 22 and 36 cases, respectively.

<|Δϕ|>_{R}, <|Δϕ|>_{P} and <|Δϕ|>_{M}, which represent the average deviation of the calculated phases from the published phases at the end of the MR step, for REMO22, PHASER and MOLREP, respectively. As with the R values, each <|Δϕ|> value does not refer to the reflection subset actively used in the MR step, due to the different MR resolution limits employed by the three programs. Instead, it relates to all the measured reflections and can therefore be regarded as an absolute, meaningful a posteriori figure of merit. In Figure 1, we present <|Δϕ|>_{R}, <|Δϕ|>_{P} and <|Δϕ|>_{M}, structure by structure, for each subset of test cases (i.e., PH, PD, PG, DNA and RNA). The structures are arranged in ascending order of <|Δϕ|>_{R} to facilitate readability. For numerical reference, we report <|Δϕ|>_{R}, <|Δϕ|>_{P} and <|Δϕ|>_{M} for each structure in Table S2 of the Supplementary Materials.
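As an illustration of the figure of merit discussed above, the following minimal sketch computes an unweighted mean absolute phase difference in degrees. It assumes that the two phase sets refer to the same reflections, origin and enantiomorph; the function name `mean_abs_phase_error` is hypothetical, and any weighting used in the actual calculations is not reproduced here.

```python
def mean_abs_phase_error(calc_deg, ref_deg):
    """Unweighted mean |delta phi| in degrees, each difference wrapped into [0, 180]."""
    if len(calc_deg) != len(ref_deg) or len(calc_deg) == 0:
        raise ValueError("phase lists must be non-empty and of equal length")
    total = 0.0
    for calc, ref in zip(calc_deg, ref_deg):
        diff = abs(calc - ref) % 360.0       # difference modulo a full turn
        total += min(diff, 360.0 - diff)     # take the shorter way round the circle
    return total / len(calc_deg)
```

The wrap-around step matters because phases of 350° and 10° differ by 20°, not by 340°.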

<|Δϕ|>_{R} = 45°, followed by MOLREP with <|Δϕ|>_{M} = 56° and PHASER with <|Δϕ|>_{P} = 58°. These values correlate well with the corresponding <R> values presented in Table 2.

N70_{R}, N70_{P} and N70_{M}, respectively). It is noteworthy that N70_{R} is significantly smaller than N70_{P} and N70_{M} for each subset of the test structures (22 against 55 and 50, respectively). For proteins, MOLREP seems to be more effective than PHASER, while PHASER appears to be more effective than MOLREP for nucleic acids (18 cases with <|Δϕ|> ≥ 70° against 24). The correlation of <|Δϕ|>_{P} and <|Δϕ|>_{M} with the <R_{P}> and <R_{M}> values presented in Table 2 suggests that the overall qualities of the structural models provided by MOLREP and PHASER are quite similar. Furthermore, the REMO22 structural models are of superior quality compared to those provided by PHASER and MOLREP.

_{M} corresponding to approximately 43% of the nucleic acid test structures.

#### 2.2. About the SIR22 Pipeline

<|Δϕ|>_{MR}, calculated over all the test structures at the end of the MR step, with the corresponding <|Δϕ|>_{REF}, calculated after the SYNERGY phase refinement (see Table 3). The second criterion focuses on the number of cases where SYNERGY improves the MR average phase error by at least 10° (N_{P}10 for proteins and N_{NA}10 for nucleic acids), or by at least 20° (N_{P}20 for proteins and N_{NA}20 for nucleic acids).

- i. <|Δϕ|>_{REF} is consistently smaller than <|Δϕ|>_{MR}, irrespective of whether SYNERGY is applied to the REMO22, PHASER, or MOLREP phases.
- ii. REMO22 + SYNERGY provides the phases with the smallest average error (41°), while PHASER + SYNERGY and MOLREP + SYNERGY have average errors of 53° and 46°, respectively.
- iii. The effectiveness of SYNERGY varies depending on the MR program. When applied to PHASER phases, SYNERGY provides an average phase improvement of 5°, whereas for MOLREP phases, it provides an improvement of 10°. However, for REMO22 phases, the improvement is only 4°. This is not surprising, as REMO22 phases are already refined phases (with an average phase error of 45°, compared to 58° and 56° for PHASER and MOLREP, respectively), making further refinement more challenging.
- iv. The number of test structures with a phase error reduction of more than 10° (N_{P}10, N_{NA}10) or 20° (N_{P}20, N_{NA}20) is much higher for the PHASER and MOLREP phases when SYNERGY is applied. Specifically, a reduction of more than 10° is observed for 10% of the test structures for the PHASER phases and 28% of the test structures for the MOLREP phases.
- v. We note that the larger effectiveness of SYNERGY for MOLREP phases compared to PHASER phases is not completely understood at this point.

- i. RESOLVE leads to a 2° improvement in the PHASER and MOLREP phases, as compared to the 5° and 10° improvement obtained by SYNERGY, respectively.
- ii. The values of N_{P}10, N_{P}20, N_{NA}10, N_{NA}20 corresponding to RESOLVE phases are almost always close to zero. This means that RESOLVE is not able to improve the average phase errors by at least 10°, regardless of whether the phases were originally obtained by MOLREP or PHASER.
- iii. The phases obtained by PHASER + RESOLVE are similar to those obtained by MOLREP + RESOLVE, making them an almost equivalent starting point for the application of the AMB programs.

- i. If 65% or more of non-H atoms are within 0.6 Å of the published coordinates at the end of the CAB procedure, then the automatic crystal structure solution is considered successful. While some readers may find this percentage too lenient, and others too strict, we believe it to be practical, since refinement and completion of the model structure may be easily performed once this percentage is exceeded.
- ii. If 40% or fewer of non-H atoms are within 0.6 Å of the published coordinates at the end of the CAB procedure, then the automatic crystal structure solution is considered a failure.
- iii. Partial success occurs when the percentage is smaller than 65% and larger than 40%.
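The three criteria above amount to a simple threshold rule. A minimal sketch follows, assuming the input is the fraction (between 0 and 1) of non-H atoms lying within 0.6 Å of the published coordinates at the end of CAB; the function name `classify_solution` is hypothetical.

```python
def classify_solution(fraction_within_0_6A):
    """Map the fraction of correctly placed non-H atoms to the outcome categories of the text."""
    if not 0.0 <= fraction_within_0_6A <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if fraction_within_0_6A >= 0.65:      # criterion i: 65% or more
        return "success"
    if fraction_within_0_6A <= 0.40:      # criterion ii: 40% or fewer
        return "failure"
    return "partial success"              # criterion iii: strictly between 40% and 65%
```

Both boundary values are assigned to the outer categories, so that only values strictly between 0.40 and 0.65 count as partial success.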

_{MA}) for each pipeline.

- i. The numbers of test structures for which the automatic crystal structure solution procedure succeeds, according to the criteria specified earlier, are 122 for REMO22 + SYNERGY + CAB (N_{RSC}), 93 for PHASER + SYNERGY + CAB (N_{PSC}) and 108 for MOLREP + SYNERGY + CAB (N_{MSC}). The failure cases, according to the same criteria, are 23 for REMO22 + SYNERGY + CAB, 52 for PHASER + SYNERGY + CAB, and 35 for MOLREP + SYNERGY + CAB.
- ii. MOLREP phases resulted in a smaller number of CAB failures and a larger number of successes compared to PHASER. It is important to note that part of this bias is due to five cases where PHASER stops prematurely while trying to estimate the number of chains in the target asu. User intervention can solve this problem, leading to a statistical improvement in the PHASER results.

N_{PRC}) and MOLREP + RESOLVE + CAB (N_{MRC}) pipelines was also analyzed. Using RESOLVE instead of SYNERGY as the phase refinement program implies that:

- 13 structures are no longer automatically solved with PHASER data, while the number of failures increases by 17 (compare the columns N_{PSC} and N_{PRC}).
- 14 structures are no longer automatically solved with MOLREP data, while the number of failures increases by 17 (compare the columns N_{MSC} and N_{MRC}).

N_{BN} in Table 4), which is 25 fewer than the number solved by the REMO22 + SYNERGY + CAB pipeline. At the same time, the number of failures increases from 22 to 46. This demonstrates the significant contribution of CAB to the success of the pipeline.

## 3. Discussion

- i. The SI = 0.3 threshold presents a significant challenge, as evidenced by the fact that 4 out of 10 attempts failed (3nng, 3npg, 3nr6, 3tx8);
- ii. The inadequacy of the model used for protein complexes containing hetero-oligomers can lead to failure. For instance, the 1lat structure comprises two polypeptide chains of 71 and 74 residues, respectively, as well as two identical nucleic acid chains, each with 19 nucleotides. However, the model only corresponds to the polypeptide chains of the 1glu structure. Similarly, in the case of the 2iff structure, which is a complex of a monoclonal antibody (two chains of 212 and 214 residues) and a lysozyme (one chain of 129 residues), the model only contains the lysozyme chain of the 1hem structure. Even if the models are correctly positioned, recovering the full structure for these cases is a challenge;
- iii. Inaccurate or incomplete prior information on the crystal-chemical nature of the target can also contribute to failure. For instance, DNA molecules are flexible and can adopt various structures, including G-quadruplex structures formed by nucleic acids rich in guanine. These structures are helical in shape but may be challenging to locate if the model is not a four-stranded DNA structure. Examples of G-quadruplex structures include 1s45, 1s47, 4wo3, and 5ua3;
- iv. Disorder can also pose a challenge in determining protein structures. For example, in the cases of 3tok and 4gsg, each chain exhibits two distinct configurations, with most of the phosphorus atoms being common to both configurations. The relatively small MA values (0.45 and 0.41, respectively) are calculated with respect to the total number of atoms in the asymmetric unit, including the disordered pairs.

## 4. Material and Methods

#### 4.1. Extension to Nucleic Acids

#### 4.2. Estimation of the Number of Chains Per Asu and of the Number of Model Copies for MR

δ_{prot}) is usually around 1.35 g/cm^{3}, which is independent of the protein’s nature and molecular weight [73]. However, these assumptions are not always valid in practice. Fischer et al. [74] conducted tests that suggest that δ_{prot} = 1.41 g/cm^{3} is a suitable estimate for proteins with high molecular weight (i.e., M > 30 kDa). However, the protein density increases with decreasing molecular weight and reaches its maximum value of δ_{prot} = 1.50 g/cm^{3} for the smallest proteins (i.e., M ≈ 7 kDa).

- i. In small molecule crystallography, the expected number of molecules per asu is based on the volume per non-H atom (VOLAT), which is usually assumed to be between 16 and 18 Å^{3}. For macromolecules, the sizes and sequences of the molecular chains present in a target crystal are typically known beforehand. However, the volume of the surrounding solvent remains unknown, making it challenging to estimate the number of chains per target asu. We have modified this rule based on a survey of a wide range of protein and DNA/RNA structures. For proteins, the expected number of chains per target asu (NCHT) is that for which VOLAT is closest to 38 Å^{3}, and not smaller than 22 Å^{3}. For DNA structures, NCHT is that for which VOLAT is closest to 34.5 Å^{3}, and not smaller than 22 Å^{3}. For RNA structures, NCHT is that for which VOLAT is closest to 44 Å^{3}, and not smaller than 22 Å^{3}. The numerical values were established empirically.
- ii. The second step of the algorithm is aimed at estimating the number of model copies to accommodate in the target asu (NMOD). This information is typically sought after by the MR user. While not critical for the success of the MR procedure, a good early estimate of NMOD can simplify the automatic approach. Furthermore, this step can correct any incorrect NCHT estimate made in the first step of the algorithm. In cases where the model includes n identical chains, the NCHT value needs to be searched among multiples of n. However, there are scenarios where the target composition is made up of NCHT copies of two different sequence chains (one large and one small), while the model comprises only a single large chain. In such cases, adopting the NCHT value is clearly incorrect, while NCHT/2 is a more accurate choice. Our algorithm can identify and address such situations, especially when the size of the smaller chain is insignificant compared to the larger chain (for example, less than 50% of the long chain). In such cases, the smaller chains are disregarded. The algorithm is designed to be flexible and can be applied to situations where the model and/or target consist of copies of chains of varying sizes. To assess the effectiveness of the choices mentioned above, we compared the number of incorrect estimates using the PROTFRAC criterion (50% solvent) versus the VOLAT criterion. Out of a total of 157 test cases, the PROTFRAC criterion led to 30 erroneous NCHT estimates, whereas the VOLAT criterion resulted in only 15 incorrect estimates. These findings provide a promising foundation for the complete automation of the MR procedure. In addition, the NMOD value can be rectified in the third step of the algorithm, as described in the main text.
- iii. During the third step, it is possible to correct the number of model chains to be placed in the target asu through post-estimation. Let us assume that the orientation and location of the nth model have already been determined by the MR procedure, and that the figure of merit (FOM_{n}) has been calculated to assess the reliability of the model’s position and orientation. The FOM_{n} value is expected to increase with the accuracy of the model, which corresponds to the number of accurately located model copies. If FOM_{n+1} is found to be less than FOM_{n}, then the (n + 1)th copy of the model is rejected, the MR procedure is stopped, and the phase refinement step is started.
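The first step of the algorithm can be sketched as follows. This is only an illustration under stated assumptions: VOLAT is taken here as the asu volume divided by the total number of non-H atoms for a trial number of identical chains, and the names `estimate_ncht`, `VOLAT_TARGET`, `VOLAT_MIN` and `max_chains` are hypothetical. Only the numerical thresholds (38, 34.5, 44 and 22 Å^{3}) come from the text.

```python
VOLAT_TARGET = {"protein": 38.0, "dna": 34.5, "rna": 44.0}  # A^3, values quoted in the text
VOLAT_MIN = 22.0                                            # A^3, lower bound quoted in the text


def estimate_ncht(asu_volume, atoms_per_chain, kind, max_chains=60):
    """Return the chain count whose VOLAT is closest to the target, or None if none is allowed.

    Assumption: VOLAT(n) = asu_volume / (n * atoms_per_chain) for n identical chains.
    """
    target = VOLAT_TARGET[kind]
    best_n, best_gap = None, None
    for n in range(1, max_chains + 1):
        volat = asu_volume / (n * atoms_per_chain)
        if volat < VOLAT_MIN:
            break                       # VOLAT only decreases as n grows
        gap = abs(volat - target)
        if best_gap is None or gap < best_gap:
            best_n, best_gap = n, gap
    return best_n
```

For example, with an asu volume of 80,000 Å^{3} and 1000 non-H atoms per protein chain, the trial values of VOLAT are 80, 40 and 26.7 Å^{3} for one, two and three chains, so two chains would be selected; four chains would give 20 Å^{3} and are excluded by the lower bound.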

CLASH_{n}, is calculated. For proteins, CLASH_{n} estimates the fraction of Cα atoms that overlap (within 3.0 Å) once the nth model has been located. For nucleic acids, it estimates the overlapping fraction of the phosphate and C atoms in the ribose-phosphate backbone and the N atoms of the bases.

_{n+1}> 35% or

_{n+1}> 0.10

If CLASH_{n+1} is sufficiently large, it may be risky to include the (n + 1)th model copy in the current model. Conversely, if CLASH_{n+1} is very small, and R(n + 1) − R(n) is also sufficiently small to meet Condition (2), accepting the (n + 1)th model copy appears to be a reasonable decision. To avoid numerical divergence in Equation (2), we consider a CLASH value below 0.10 to be insignificant. Therefore, if CLASH < 0.05, we set CLASH to 0.05 in Equation (2). This algorithm is applied identically to both proteins and nucleic acids.
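The overlap fraction for proteins can be illustrated with the short sketch below. It is not the REMO22 implementation: it assumes Cartesian Cα coordinates in Å, ignores symmetry mates and lattice translations, and normalizes by the number of Cα atoms in the newly placed copy, which is an assumption about the denominator. The function name `clash_fraction` is hypothetical.

```python
import math


def clash_fraction(new_copy_ca, placed_ca, cutoff=3.0):
    """Fraction of the new copy's C-alpha atoms within `cutoff` angstroms of any placed C-alpha."""
    if not new_copy_ca:
        raise ValueError("the new copy must contain at least one atom")
    clashing = sum(
        1
        for atom in new_copy_ca
        if any(math.dist(atom, other) <= cutoff for other in placed_ca)
    )
    return clashing / len(new_copy_ca)
```

A practical implementation would also have to test the symmetry-equivalent and translated copies, and would use a spatial grid rather than the all-pairs comparison shown here.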

#### 4.3. Resolution Limits

#### 4.4. Search Algorithm for the Rotation Step

n_{1}dθ, n_{2}dθ, n_{3}dθ being the Euler angles corresponding to the cubic primitive lattice. There is at least one point in the unit cell of such a lattice that is approximately 0.87dθ from the nearest lattice points (i.e., the center of the cubic cell). To reduce sampling errors, the angular step can be halved to dθ/2, but this results in eight times more lattice points. An alternative approach is to explore the orientation space using a body-centered lattice, which doubles the number of lattice points but ensures that no point in the body-centered cubic cell is farther than 0.56dθ from the nearest lattice point. The body-centered cubic lattice is obtained by first exploring the orientation space using a primitive lattice and then exploring the same angular space using the same primitive cubic lattice, but starting from (dθ/2, dθ/2, dθ/2).
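The two distances quoted above (0.87dθ and 0.56dθ) can be checked numerically. The sketch below treats the angular space as Euclidean, as the lattice picture in the text does, works in units of dθ, and scans a fine grid of probe points inside one cubic cell. The function names are hypothetical.

```python
import itertools
import math


def lattice_points(body_centered):
    """Lattice points (in units of d-theta) that can be nearest to a probe inside the cell [0, 1]^3."""
    corners = list(itertools.product((0.0, 1.0), repeat=3))
    if body_centered:
        # second primitive lattice, shifted by (1/2, 1/2, 1/2)
        return corners + [(0.5, 0.5, 0.5)]
    return corners


def covering_radius(body_centered, samples=20):
    """Largest distance from any probe point in the cell to its nearest lattice point."""
    points = lattice_points(body_centered)
    ticks = [i / samples for i in range(samples + 1)]
    worst = 0.0
    for probe in itertools.product(ticks, repeat=3):
        nearest = min(math.dist(probe, p) for p in points)
        worst = max(worst, nearest)
    return worst
```

With the default sampling, `covering_radius(False)` reproduces √3/2 ≈ 0.87 (the cell center) and `covering_radius(True)` reproduces √5/4 ≈ 0.56, attained at points such as (1/2, 1/4, 0).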

#### 4.5. Anisotropy Correction

[**h**] = [hkl]. If these points are chosen at very low resolution (e.g., [100], [010], [001], [110], [101], etc.), they will represent all directions in reciprocal space and will be referred to as polar directions. For each polar direction [**h**], the following steps are executed:

- (1) The reciprocal space is divided into cones, all with the same axis as the polar direction. The cones are arranged so that each one is fully contained in the next. The shells (i.e., the regions of reciprocal space between adjacent cones) have approximately equal volumes and therefore contain approximately the same number of lattice points. For each shell, α is the average angle (**k**, **h**), where **k** is the generic lattice point in the shell.
- (2) For each shell, <|E_{k}|^{2}> is calculated, and the corresponding values are plotted against α.
- (3) The von Mises distribution M = exp(G cos 2α) is fitted to the <|E_{k}|^{2}> distribution. If G is large and positive, then <|E_{k}|^{2}> > 1 along the **h** direction; if G is large and negative, then <|E_{k}|^{2}> < 1 along the **h** direction.
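One possible way to obtain G from the shell averages is sketched below. The fitting procedure is not specified in the text, so this example assumes a log-linear least-squares fit through the origin, i.e., it minimizes the squared residuals of ln<|E|^{2}> = G cos 2α over the shells. The function name `fit_von_mises_g` is hypothetical.

```python
import math


def fit_von_mises_g(alphas_deg, mean_e2):
    """Least-squares estimate of G in <|E|^2> = exp(G cos 2*alpha), fitted on the logarithm."""
    if len(alphas_deg) != len(mean_e2) or len(alphas_deg) == 0:
        raise ValueError("need one positive <|E|^2> value per shell angle")
    num = 0.0
    den = 0.0
    for alpha, e2 in zip(alphas_deg, mean_e2):
        c = math.cos(2.0 * math.radians(alpha))
        num += c * math.log(e2)          # requires e2 > 0
        den += c * c
    if den == 0.0:
        raise ValueError("shell angles give no leverage on G")
    return num / den
```

A positive result indicates shell averages larger than 1 near the polar direction (α close to 0°), and a negative result indicates the opposite, in line with item (3).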

**c***. In contrast, the orthorhombic system requires all three ellipsoid axes to be parallel to **a***, **b*** and **c***. In monoclinic systems, one of the three ellipsoid axes aligns with the unique two-fold axis **b***. In the absence of symmetry constraints, the ellipsoid orientation in triclinic systems is not predetermined.

**a***, **b*** and **c*** are unrelated by symmetry elements, and the ellipsoid must have three axes to account for all possible anisotropy values against the crystal symmetry. To avoid discrepancies, the three ellipsoid axes necessarily align with **a***, **b*** and **c***, because otherwise the [hkl], [-h-kl], [-hk-l], [h-k-l] directions would have different anisotropy values, in conflict with the crystal symmetry.

$$\langle E^{2}\rangle_{[100]}\left(\cos^{2}\vartheta_{1}+\cos^{2}\vartheta_{2}\right)+\langle E^{2}\rangle_{[001]}\cos^{2}\vartheta_{3}$$

where ϑ_{1} is the angle between the direction [hkl] and the direction [100], ϑ_{2} is the angle between [hkl] and [1−20], and ϑ_{3} is the angle between [hkl] and [001]. E denotes the normalized structure factor corresponding to F.

$$\langle E^{2}\rangle_{[100]}\left(\cos^{2}\vartheta_{1}+\cos^{2}\vartheta_{2}\right)+\langle E^{2}\rangle_{[001]}\cos^{2}\vartheta_{3}$$

where ϑ_{1}, ϑ_{2}, ϑ_{3} are the angles between [hkl] and [100], [010], [001], respectively.

$$\langle E^{2}\rangle_{[100]}\cos^{2}\vartheta_{1}+\langle E^{2}\rangle_{[010]}\cos^{2}\vartheta_{2}+\langle E^{2}\rangle_{[001]}\cos^{2}\vartheta_{3}$$

where ϑ_{1}, ϑ_{2}, ϑ_{3} are the angles between [hkl] and [100], [010] and [001], respectively.
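The orthorhombic combination of the three axial <E^{2}> values can be evaluated as in the sketch below. It assumes that the direction [hkl] is that of the reciprocal lattice vector h**a*** + k**b*** + l**c*** and that [100], [010] and [001] denote the mutually orthogonal reciprocal axes, so that the three squared cosines sum to one. The function name `orthorhombic_e2` and its arguments are hypothetical.

```python
def orthorhombic_e2(hkl, cell_abc, e2_100, e2_010, e2_001):
    """Weight the three axial <E^2> values by the squared direction cosines of [hkl]."""
    h, k, l = hkl
    a, b, c = cell_abc
    # Cartesian components of h a* + k b* + l c* for an orthorhombic cell (|a*| = 1/a, etc.)
    comps = (h / a, k / b, l / c)
    norm2 = sum(x * x for x in comps)
    if norm2 == 0.0:
        raise ValueError("[000] has no direction")
    cos2 = [x * x / norm2 for x in comps]   # cos^2 of the angles to a*, b*, c*
    return e2_100 * cos2[0] + e2_010 * cos2[1] + e2_001 * cos2[2]
```

Along an axial direction the expression returns the corresponding axial value, and along intermediate directions it interpolates smoothly between the three.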

**a*** and **c***. As a result, these axes coincide with the directions [h0l]. To correct for anisotropy in the monoclinic system, the polar direction with the largest G value, denoted as [h_{1}0l_{1}], is identified. Next, a direction [h_{2}0l_{2}] that is approximately or exactly perpendicular to [h_{1}0l_{1}] is sought. This direction will be used to correct the anisotropy of the reciprocal space:

where ϑ_{1}, ϑ_{2}, ϑ_{3} are the angles between [hkl] and [h_{1}0l_{1}], [010] and [h_{2}0l_{2}], respectively.

[h_{1}k_{1}l_{1}], is identified. Next, a direction [h_{2}k_{2}l_{2}] with the largest G value is found in the plane that is approximately or exactly perpendicular to [h_{1}k_{1}l_{1}]. Finally, a direction [h_{3}k_{3}l_{3}] is identified that is perpendicular to both [h_{1}k_{1}l_{1}] and [h_{2}k_{2}l_{2}], either exactly or approximately. These directions will be used to correct the anisotropy of the reciprocal space in the triclinic system:

where ϑ_{1}, ϑ_{2}, ϑ_{3} are the angles between [hkl] and [h_{1}k_{1}l_{1}], [h_{2}k_{2}l_{2}] and [h_{3}k_{3}l_{3}], respectively.

^{2}_{obs} = |E|^{2}_{obs}/O^{2}_{obs} in the RFOM calculations.

#### 4.6. Figures of Merit for the Rotation Step

F_{p1}, F_{p2}, …, F_{pn} are the structure factors corresponding to the first, second, …, nth located model copy, m is the number of symmetry operators for the given space group and $\sum_{s=1}^{m}|F_{ps}|^{2}$ refers to the (n + 1)th model copy, for which we are searching the correct orientation. When n = 0, meaning that the first model copy is being rotated, Equation (4) simplifies to:

#### 4.7. Figures of Merit for the Translation Step

F_{pi} represents the structure factor of the model, which is calculated based on the position of the i-th previously located copy of the model. When n = 0, meaning that the first model copy is being translated, Equation (8) simplifies to:

E_{p} is the normalized structure factor of F_{p}.

#### 4.8. Rigid Body Refinement by SIMPLEX

#### 4.9. Selection of the Correct Solutions

#### 4.10. About the Location of the Second and Further Model Copies

F_{p1} values are calculated to start the search for the second model copy’s roto-translation. However, this approach has a potential pitfall because F_{p1} arises from a rigid body model, and inaccuracies in the model orientation and location, as well as structural differences between the model and target, may create a systematic bias that can affect the effectiveness of the FOMs. As a result, it can be challenging to recognize the correct roto-translation parameters for the second model copy.

#### 4.11. Automatic Restart

#### 4.12. Essential Directives

## Supplementary Materials

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

| Symbol | Meaning |
|---|---|
| asu | asymmetric unit |
| m | number of symmetry operators for a given space group |
| t, t_{p} | number of non-H atoms in the asymmetric units of the target and model structure, respectively |
| N = mt, N_{p} = mt_{p} | number of non-H atoms in the unit cells of the target and model structure, respectively. To simplify, all of the atoms are assumed to be in general position. |
| $F=\sum_{s=1}^{m}F_{s}$ | structure factor of the target structure, where $F_{s}=\sum_{j=1}^{t}f_{j}\exp[2\pi i\,\mathbf{h}(\mathbf{R}_{s}\mathbf{r}_{j}+\mathbf{T}_{s})]$ and r_{j} are the atomic positions of the target structure |
| $F_{p}=\sum_{s=1}^{m}F_{ps}$ | structure factor of the model structure, where $F_{ps}=\sum_{j=1}^{t_{p}}f_{j}\exp[2\pi i\,\mathbf{h}(\mathbf{R}_{s}\mathbf{r}_{pj}+\mathbf{T}_{s})]$ |
| $E, E_{p}$ | normalized structure factors of $F, F_{p}$, respectively |
| r_{pj} | atomic positions of the model structure |
| EDM | electron density modification techniques |
| SI | sequence identity between target and model structure |
| AMB | automated model building |
| R | crystallographic R residual |

## References

1. Weeks, C.M.; DeTitta, G.T.; Hauptman, H.A.; Thuman, P.; Miller, R. Structure solution by minimal-function phase refinement and Fourier filtering. II. Implementation and applications. Acta Crystallogr. A **1994**, 50, 210–220.
2. Rappleye, J.; Innus, M.; Weeks, C.M.; Miller, R. SnB version 2.2: An example of crystallographic multiprocessing. J. Appl. Crystallogr. **2002**, 35, 374–376.
3. Sheldrick, G.M. SHELX Applications to Macromolecules. In Direct Methods for Solving Macromolecular Structures; Fortier, S., Ed.; Springer: Dordrecht, The Netherlands, 1998; pp. 401–411. ISBN 978-94-015-9093-8.
4. Foadi, J.; Woolfson, M.M.; Dodson, E.J.; Wilson, K.S.; Jia-xing, Y.; Chao-de, Z. A flexible and efficient procedure for the solution and phase refinement of protein structures. Acta Crystallogr. D Biol. Crystallogr. **2000**, 56, 1137–1147.
5. Palatinus, L. Ab initio determination of incommensurately modulated structures by charge flipping in superspace. Acta Crystallogr. A **2004**, 60, 604–610.
6. Burla, M.C.; Carrozzini, B.; Cascarano, G.L.; Giacovazzo, C.; Polidori, G. More power for direct methods: SIR2002. Z. Krist. **2002**, 217, 629–635.
7. Burla, M.C.; Caliandro, R.; Camalli, M.; Carrozzini, B.; Cascarano, G.L.; De Caro, L.; Giacovazzo, C.; Polidori, G.; Spagna, R. SIR2004: An improved tool for crystal structure determination and refinement. J. Appl. Crystallogr. **2005**, 38, 381–388.
8. Giacovazzo, C. A general approach to phase relationships: The method of representations. Acta Crystallogr. A **1977**, 33, 933–944.
9. Giacovazzo, C. Representations of structure invariants and seminvariants. In Direct Phasing in Crystallography: Fundamentals and Applications; Oxford Science Publications; Oxford University Press: New York, NY, USA, 1998; pp. 243–274. ISBN 978-0-19-850072-8.
10. Frazão, C.; Sieker, L.; Sheldrick, G.; Lamzin, V.; LeGall, J.; Carrondo, M.A. Ab initio structure solution of a dimeric cytochrome c3 from Desulfovibrio gigas containing disulfide bridges. JBIC J. Biol. Inorg. Chem. **1999**, 4, 162–165.
11. Mooers, B.H.M.; Matthews, B.W. Extension to 2268 atoms of direct methods in the ab initio determination of the unknown structure of bacteriophage P22 lysozyme. Acta Crystallogr. D Biol. Crystallogr. **2006**, 62, 165–176.
12. Buerger, M.J. Phase Determination with the Aid of Implication Theory. Phys. Rev. **1948**, 73, 927–928.
13. Buerger, M.J. Vector Space; Wiley: New York, NY, USA, 1959; Chapter 11.
14. Simpson, P.G.; Dobrott, R.D.; Lipscomb, W.N. The symmetry minimum function: High order image seeking functions in X-ray crystallography. Acta Crystallogr. **1965**, 18, 169–179.
15. Richardson, J.W.; Jacobson, R.A. Computer-aided analysis of multi-solution Patterson superpositions. In Patterson and Pattersons: Fifty Years of the Patterson Function: Proceedings of a Symposium Held at the Institute for Cancer Research, the Fox Chase Cancer Center, Philadelphia, PA, USA, November 13–15, 1984; Glusker, J.P., Patterson, B.K., Rossi, M., Eds.; International Union of Crystallography Crystallographic Symposia; Oxford University Press: New York, NY, USA, 1987; ISBN 978-0-19-855230-7.
16. Sheldrick, G.M. Tutorial on automated Patterson interpretation to find heavy atoms. In Crystallographic Computing 5: From Chemistry to Biology; Moras, D., Podjarny, A.D., Thierry, C., Eds.; IUCr Crystallographic Symposia; Oxford University Press: New York, NY, USA, 1991; pp. 145–157. ISBN 978-0-19-855384-7.
17. Pavelčík, F.; Kuchta, L.; Sivý, J. Patterson-oriented automatic structure determination. Utilizing Patterson peaks. Acta Crystallogr. A **1992**, 48, 791–796.
18. Caliandro, R.; Carrozzini, B.; Cascarano, G.L.; De Caro, L.; Giacovazzo, C.; Mazzone, A.; Siliqi, D. Ab initio phasing of proteins with heavy atoms at non-atomic resolution: Pushing the size limit of solvable structures up to 7890 non-H atoms in the asymmetric unit. J. Appl. Crystallogr. **2008**, 41, 548–553.
19. Rossmann, M.G.; Blow, D.M. The Detection of Sub-Units within the Crystallographic Asymmetric Unit. Acta Crystallogr. **1962**, 15, 24–31.
20. Rossmann, M.G. The Molecular Replacement Method; Gordon & Breach: New York, NY, USA, 1972.
21. Rossmann, M.G. The Molecular Replacement Method. Acta Crystallogr. A **1990**, 46, 73–82.
22. Kissinger, C.R.; Gehlhaar, D.K.; Fogel, D.B. Rapid automated molecular replacement by evolutionary search. Acta Crystallogr. D Biol. Crystallogr. **1999**, 55, 484–491.
23. Jamrog, D.C.; Zhang, Y.; Phillips, G.N.J. SOMoRe: A multi-dimensional search and optimization approach to molecular replacement. Acta Crystallogr. D Biol. Crystallogr. **2003**, 59, 304–314.
24. Glykos, N.M.; Kokkinidis, M. A stochastic approach to molecular replacement. Acta Crystallogr. D Biol. Crystallogr. **2000**, 56, 169–174.
25. Fujinaga, M.; Read, R.J. Experiences with a new translation-function program. J. Appl. Crystallogr. **1987**, 20, 517–521.
26. Navaza, J. AMoRe: An automated package for molecular replacement. Acta Crystallogr. A **1994**, 50, 157–163.
27. Read, R.J. Detecting outliers in non-redundant diffraction data. Acta Crystallogr. D Biol. Crystallogr. **1999**, 55, 1759–1764.
28. Vagin, A.; Teplyakov, A. Molecular replacement with MOLREP. Acta Crystallogr. D Biol. Crystallogr. **2010**, 66, 22–25.
29. McCoy, A.J.; Grosse-Kunstleve, R.W.; Adams, P.D.; Winn, M.D.; Storoni, L.C.; Read, R.J. Phaser crystallographic software. J. Appl. Crystallogr. **2007**, 40, 658–674.
30. Caliandro, R.; Carrozzini, B.; Cascarano, G.L.; Giacovazzo, C.; Mazzone, A.; Siliqi, D. Molecular replacement: The probabilistic approach of the program REMO09 and its applications. Acta Crystallogr. A **2009**, 65, 512–527.
31. Rigden, D.J.; Thomas, J.M.H.; Simkovic, F.; Simpkin, A.; Winn, M.D.; Mayans, O.; Keegan, R.M. Ensembles generated from crystal structures of single distant homologues solve challenging molecular-replacement cases in AMPLE. Acta Crystallogr. D Struct. Biol. **2018**, 74, 183–193.
32. Millán, C.; Sammito, M.; Usón, I. Macromolecular ab initio phasing enforcing secondary and tertiary structure. IUCrJ **2015**, 2, 95–105.
33. Millán, C.; Jiménez, E.; Schuster, A.; Diederichs, K.; Usón, I. ALIXE: A phase-combination tool for fragment-based molecular replacement. Acta Crystallogr. D Struct. Biol. **2020**, 76, 209–220.
34. Brünger, A.T.; Adams, P.D.; Clore, G.M.; DeLano, W.L.; Gros, P.; Grosse-Kunstleve, R.W.; Jiang, J.S.; Kuszewski, J.; Nilges, M.; Pannu, N.S.; et al. Crystallography & NMR System: A New Software Suite for Macromolecular Structure Determination. Acta Crystallogr. D Biol. Crystallogr. **1998**, 54, 905–921.
35. Winn, M.D.; Ballard, C.C.; Cowtan, K.D.; Dodson, E.J.; Emsley, P.; Evans, P.R.; Keegan, R.M.; Krissinel, E.B.; Leslie, A.G.W.; McCoy, A.; et al. Overview of the CCP4 suite and current developments. Acta Crystallogr. D Biol. Crystallogr. **2011**, 67, 235–242.
36. Bricogne, G.; Vonrhein, C.; Flensburg, C.; Schiltz, M.; Paciorek, W. Generation, representation and flow of phase information in structure determination: Recent developments in and around SHARP 2.0. Acta Crystallogr. D Biol. Crystallogr. **2003**, 59, 2023–2030.
37. Adams, P.D.; Afonine, P.V.; Bunkóczi, G.; Chen, V.B.; Davis, I.W.; Echols, N.; Headd, J.J.; Hung, L.-W.; Kapral, G.J.; Grosse-Kunstleve, R.W.; et al. Phenix: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. **2010**, 66, 213–221.
38. Sheldrick, G.M. Crystal structure refinement with SHELXL. Acta Crystallogr. C Struct. Chem. **2015**, 71, 3–8.
39. Burla, M.C.; Cascarano, G.L.; Giacovazzo, C.; Polidori, G. Synergy among phase-refinement techniques in macromolecular crystallography. Acta Crystallogr. D Struct. Biol. **2017**, 73, 877–888.
40. Cowtan, K. Fast Fourier feature recognition. Acta Crystallogr. D Biol. Crystallogr. **2001**, 57, 1435–1444.
41. Caliandro, R.; Carrozzini, B.; Cascarano, G.L.; De Caro, L.; Giacovazzo, C.; Siliqi, D. Phasing at resolution higher than the experimental resolution. Acta Crystallogr. D Biol. Crystallogr. **2005**, 61, 556–565.
42. Caliandro, R.; Carrozzini, B.; Cascarano, G.L.; De Caro, L.; Giacovazzo, C.; Siliqi, D. Ab initio phasing at resolution higher than experimental resolution. Acta Crystallogr. D Biol. Crystallogr. **2005**, 61, 1080–1087.
43. Giacovazzo, C.; Siliqi, D. Improving Direct-Methods Phases by Heavy-Atom Information and Solvent Flattening. Acta Crystallogr. A **1997**, 53, 789–798.
44. Burla, M.C.; Caliandro, R.; Giacovazzo, C.; Polidori, G. The difference electron density: A probabilistic reformulation. Acta Crystallogr. A **2010**, 66, 347–361.
45. Burla, M.C.; Giacovazzo, C.; Polidori, G. From a random to the correct structure: The VLD algorithm. J. Appl. Crystallogr. **2010**, 43, 825–836.
46. Giacovazzo, C. Solution of the phase problem at non-atomic resolution by the phantom derivative method. Acta Crystallogr. A Found. Adv. **2015**, 71, 483–512.
47. Carrozzini, B.; Cascarano, G.L.; Giacovazzo, C. Phase improvement via the Phantom Derivative technique: Ancils that are related to the target structure. Acta Crystallogr. D Struct. Biol. **2016**, 72, 551–557.
48. Giacovazzo, C. From direct-space discrepancy functions to crystallographic least squares. Acta Crystallogr. A Found. Adv.
**2016**, 72, 551–557. [Google Scholar] [CrossRef] [PubMed] - Giacovazzo, C. From direct-space discrepancy functions to crystallographic least squares. Acta Crystallogr. A Found. Adv.
**2015**, 71, 36–45. [Google Scholar] [CrossRef] - Cowtan, K. The Buccaneer software for automated model building. 1. Tracing protein chains. Acta Crystallogr. D Biol. Crystallogr.
**2006**, 62, 1002–1011. [Google Scholar] [CrossRef] [Green Version] - Cowtan, K. Automated nucleic acid chain tracing in real time. IUCrJ
**2014**, 1, 387–392. [Google Scholar] [CrossRef] [Green Version] - Langer, G.; Cohen, S.X.; Lamzin, V.S.; Perrakis, A. Automated macromolecular model building for X-ray crystallography using ARP/wARP version 7. Nat. Protoc.
**2008**, 3, 1171–1179. [Google Scholar] [CrossRef] [PubMed] - Terwilliger, T.C.; Grosse-Kunstleve, R.W.; Afonine, P.V.; Moriarty, N.W.; Zwart, P.H.; Hung, L.-W.; Read, R.J.; Adams, P.D. Iterative model building, structure refinement and density modification with the PHENIX AutoBuild wizard. Acta Crystallogr. D Biol. Crystallogr.
**2008**, 64, 61–69. [Google Scholar] [CrossRef] [Green Version] - Burla, M.C.; Carrozzini, B.; Cascarano, G.L.; Polidori, G.; Giacovazzo, C. CAB: A cyclic automatic model-building procedure. Acta Crystallogr. D Struct. Biol.
**2018**, 74, 1096–1104. [Google Scholar] [CrossRef] [PubMed] - Burla, M.C.; Carrozzini, B.; Cascarano, G.L.; Giacovazzo, C.; Polidorib, G. How far are we from automatic crystal structure solution via molecular-replacement techniques? Acta Crystallogr. D Struct. Biol.
**2020**, 76, 9–18. [Google Scholar] [CrossRef] - Burla, M.C.; Carrozzini, B.; Cascarano, G.L.; Giacovazzo, C.; Polidori, G. Cyclic automated model building (CAB) applied to nucleic acids. Crystals
**2020**, 10, 280. [Google Scholar] [CrossRef] [Green Version] - Cascarano, G.L.; Giacovazzo, C. Towards the automatic crystal structure solution of nucleic acids: Automated model building using the new CAB program. Acta Crystallogr. D Struct. Biol.
**2021**, 77, 1602–1613. [Google Scholar] [CrossRef] - Keegan, R.M.; Winn, M.D. MrBUMP: An automated pipeline for molecular replacement. Acta Crystallogr. D Biol. Crystallogr.
**2008**, 64, 119–124. [Google Scholar] [CrossRef] [PubMed] [Green Version] - Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; et al. Highly accurate protein structure prediction with AlphaFold. Nature
**2021**, 596, 583–589. [Google Scholar] [CrossRef] [PubMed] - Terwilliger, T.C.; Poon, B.K.; Afonine, P.V.; Schlicksup, C.J.; Croll, T.I.; Millán, C.; Richardson, J.S.; Read, R.J.; Adams, P.D. Improved AlphaFold modeling with implicit experimental information. Nat. Methods
**2022**, 19, 1376–1382. [Google Scholar] [CrossRef] [PubMed] - Bond, P.S. Next Generation Software for Placing Atoms into Electron Density Maps. Ph.D. Thesis, University of York, York, UK, 2021. [Google Scholar]
- Stein, N. CHAINSAW: A program for mutating pdb files used as templates in molecular replacement. J. Appl. Crystallogr.
**2008**, 41, 641–643. [Google Scholar] [CrossRef] - Burla, M.C.; Caliandro, R.; Carrozzini, B.; Cascarano, G.L.; Cuocci, C.; Giacovazzo, C.; Mallamo, M.; Mazzone, A.; Polidori, G. Crystal structure determination and refinement via SIR2014. J. Appl. Crystallogr.
**2015**, 48, 306–309. [Google Scholar] [CrossRef] - Terwilliger, T.C. Reciprocal-space solvent flattening. Acta Crystallogr. D Biol. Crystallogr.
**1999**, 55, 1863–1871. [Google Scholar] [CrossRef] - Terwilliger, T.C. Maximum-likelihood density modification. Acta Crystallogr. D Biol. Crystallogr.
**2000**, 56, 965–972. [Google Scholar] [CrossRef] - Bricogne, G. Maximum entropy and the foundations of direct methods. Acta Crystallogr. A
**1984**, 40, 410–445. [Google Scholar] [CrossRef] - Bricogne, G. A Bayesian statistical theory of the phase problem. I. A multichannel maximum-entropy formalism for constructing generalized joint probability distributions of structure factors. Acta Crystallogr. A
**1988**, 44, 517–545. [Google Scholar] [CrossRef] - Lunin, V.Y. Electron-denisty histograms and the phase problem. Acta Crystallogr. D Biol. Crystallogr.
**1993**, 49, 90–99. [Google Scholar] [CrossRef] - Cascarano, G.L.; Cuocci, C.; Mallamo, M.; Carrozzini, B.; Moliterni, A. JAV (Just Another Viewer). Istituto di Cristallografia, The National Research Council (CNR), Bari, Italy. Graphic software to display and manipulate atomic models of small structures or macomolecules. Unpublished work. 2021. [Google Scholar]
- DiMaio, F.; Terwilliger, T.C.; Read, R.J.; Wlodawer, A.; Oberdorfer, G.; Wagner, U.; Valkov, E.; Alon, A.; Fass, D.; Axelrod, H.L.; et al. Improved molecular replacement by density- and energy-guided protein structure optimization. Nature
**2011**, 473, 540–543. [Google Scholar] [CrossRef] [Green Version] - Das, R.; Baker, D. Prospects for de novo phasing with de novo protein models. Acta Crystallogr. D Biol. Crystallogr.
**2009**, 65, 169–175. [Google Scholar] [CrossRef] [Green Version] - Matthews, B.W. Solvent content of protein crystals. J. Mol. Biol.
**1968**, 33, 491–497. [Google Scholar] [CrossRef] [PubMed] - Kantardjieff, K.A.; Rupp, B. Matthews coefficient probabilities: Improved estimates for unit cell contents of proteins, DNA, and protein-nucleic acid complex crystals. Protein Sci.
**2003**, 12, 1865–1871. [Google Scholar] [CrossRef] [PubMed] - Quillin, M.L.; Matthews, B.W. Accurate calculation of the density of proteins. Acta Crystallogr. D Biol. Crystallogr.
**2000**, 56, 791–794. [Google Scholar] [CrossRef] [Green Version] - Fischer, H.; Polikarpov, I.; Craievich, A.F. Average protein density is a molecular-weight-dependent function. Protein Sci.
**2009**, 13, 2825–2828. [Google Scholar] [CrossRef] - Hirshfeld, F.L. Symmetry in the generation of trial structures. Acta Crystallogr. A
**1968**, 24, 301–311. [Google Scholar] [CrossRef] - Altomare, A.; Burla, M.C.; Cascarano, G.; Giacovazzo, C.; Guagliardi, A.; Moliterni, A.G.G.; Polidori, G. Early Finding of Preferred Orientation: Applications to Direct Methods. J. Appl. Crystallogr.
**1996**, 29, 341–345. [Google Scholar] [CrossRef] - Giacovazzo, C. Updating direct methods. Acta Crystallogr. A Found. Adv.
**2019**, 75, 142–157. [Google Scholar] [CrossRef] - Vagin, A.; Teplyakov, A. MOLREP: An Automated Program for Molecular Replacement. J. Appl. Crystallogr.
**1997**, 30, 1022–1025. [Google Scholar] [CrossRef] - Rowan, T. Functional Stability Analysis of Numerical Algorithms. Ph.D. Thesis, University of Texas, Austin, TX, USA, 1990. [Google Scholar]
- Murshudov, G.N.; Skubák, P.; Lebedev, A.A.; Pannu, N.S.; Steiner, R.A.; Nicholls, R.A.; Winn, M.D.; Long, F.; Vagin, A.A. Refmac 5 for the refinement of macromolecular crystal structures. Acta Crystallogr. D Biol. Crystallogr.
**2011**, 67, 355–367. [Google Scholar] [CrossRef] [PubMed] [Green Version] - Chothia, C.; Lesk, A.M. The relation between the divergence of sequence and structure in proteins. EMBO J.
**1986**, 5, 823–826. [Google Scholar] [CrossRef] [PubMed]

**Figure 1.** The average phase errors <|Δϕ|> (in degrees) obtained by REMO22 (<|Δϕ|>_{R}; red line), PHASER (<|Δϕ|>_{P}; green line) and MOLREP (<|Δϕ|>_{M}; blue line) at the end of the MR steps: (**a**) SET PH; (**b**) SET PD; (**c**) SET PG; (**d**) SET DNA; (**e**) SET RNA. In cases where an MR program declares a failure before the standard ending, we assume <|Δϕ|> = 90° (in five DNA-RNA cases for PHASER). The structures are ordered by increasing values of <|Δϕ|>_{R} for clarity.
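The quantity plotted in Figure 1 can be illustrated with a minimal sketch. It assumes an unweighted average over reflections, phases in degrees already expressed on a common origin, and differences wrapped to the range 0–180°; the article does not state whether any weighting is applied, so these choices, like the function names `mean_abs_phase_error` and `order_by_error`, are illustrative assumptions. The 90° failure value follows the caption and equals the average error expected for random phases.

```python
FAILURE_PHASE_ERROR = 90.0  # value assumed in Figure 1 when an MR program fails


def mean_abs_phase_error(model_phases, reference_phases):
    """Unweighted mean of |delta phi| in degrees, each difference wrapped to [0, 180]."""
    if model_phases is None:
        # The MR program stopped before producing a model: use the failure value.
        return FAILURE_PHASE_ERROR
    if len(model_phases) != len(reference_phases) or len(reference_phases) == 0:
        raise ValueError("phase lists must be non-empty and of equal length")
    total = 0.0
    for model, reference in zip(model_phases, reference_phases):
        diff = abs(model - reference) % 360.0
        total += min(diff, 360.0 - diff)  # 340 degrees apart is really 20 degrees
    return total / len(reference_phases)


def order_by_error(errors_by_structure):
    """Return structure names sorted by increasing mean phase error."""
    return sorted(errors_by_structure, key=errors_by_structure.get)
```

Sorting the structures with `order_by_error` on the REMO22 values reproduces the ordering convention used on the horizontal axis of each panel.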

**Figure 2.** 3zyt, MA = 0.81, SI = 0.22. CAB chain trace in red, published chain trace in blue. The CAB and published backbones coincide in most of the target asymmetric unit (asu).

**Figure 3.** 2i3p, MA = 0.63. CAB chain trace (red) and published chain trace (blue) of the protein component. In some regions of the target asu, the CAB and published backbones do not coincide.

**Table 1.** PDB codes of test structures for MR applications, organized by set: PH, PD, PG, DNA and RNA.

| SET | PDB | PDB | PDB | PDB | PDB | PDB | PDB | PDB | PDB | PDB |
|---|---|---|---|---|---|---|---|---|---|---|
| PH | 1a6m | 1aki | 1bxo | 1dy5 | 1kf4 | 1kqw | 1lat | 1lys | 1na7 | 1s31 |
|  | 1tgx | 1tp3 | 1xyg | 1ycn | 1yxa | 1zs0 | 2a03 | 2a46 | 2a4k | 2ah8 |
|  | 2ayv | 2b5o | 2f53 | 2f84 | 2fc3 | 2gq3 | 2h8q | 2hyw | 2i3p | 2iff |
|  | 2o3k | 2oka | 2omt | 2otb | 2p0g | 2pby | 2qu5 | 2sar | 6ebx | 6rhn |
| PD | 3nng | 3npg | 3nr6 | 3o8s | 3on5 | 3q6o | 3tx8 | 3zyt | 4e2t | 4fqd |
|  | 1cgn | 1cgo | 1e8a | 2f8m | 5ww0 |  |  |  |  |  |
| PG | 1vkf | 1vki | 1vl2 | 1vl7 | 1vlc | 2wu6 | 2x7h | 3e49 | 3gp0 | 3h9e |
|  | 3h9r | 3khu | 3l23 | 3llx | 3m7a | 3mbj | 3mcq | 3mdo | 3mz2 | 3nyy |
|  | 3obi | 3oz2 | 3p94 | 3ufi | 3us5 | 4e2e | 4ef2 | 4ezg | 4fvs | 4gbs |
|  | 4gcm | 4ler | 4mru | 4ogz | 4ouq | 4q1v | 4q34 | 4q53 | 4q6k | 4q9a |
|  | 4qjr | 4qni | 4r0k | 4rvo | 4rwv | 4yod |  |  |  |  |
| DNA | 1s45 | 1s47 | 2b1d | 2htt | 3ce5 | 3eil | 3gom | 3goo | 3n4o | 3tok |
|  | 4gsg | 4l24 | 4ltl | 4ms5 | 4wo3 | 4xqz | 4zym | 5cv2 | 5i4s | 5ihd |
|  | 5j0e | 5ju4 | 5lj4 | 5mvt | 5nt5 | 5t4w | 5tgp | 5ua3 | 6f3c | 6h5r |
|  | 6tzq |  |  |  |  |  |  |  |  |  |
| RNA | 1iha | 1lc4 | 1mwl | 1q96 | 1z7f | 2a0p | 2fd0 | 2pn4 | 3d2v | 3fs0 |
|  | 3owi | 3oxm | 3s49 | 3td1 | 4enc | 4jab | 5fj0 | 5kvj | 5l4o | 5nz6 |
|  | 5ux3 | 5uz6 | 5zeg | 6az4 | 6cab |  |  |  |  |  |

**Table 2.** Performance comparison of REMO22, PHASER and MOLREP (the subscripts R, P and M denote the three MR programs). The final average R values (in %), denoted by <R_{R}>, <R_{P}> and <R_{M}>, respectively, were calculated for each subset of test structures. NR30_{R}, NR30_{P} and NR30_{M} are the numbers of test structures for which the final R value was ≤0.30. N70_{R}, N70_{P} and N70_{M} are the numbers of test structures for which the final <|Δϕ|> value was ≥70°.

SUBSET | <R_{R}> | <R_{P}> | <R_{M}> | NR30_{R} | NR30_{P} | NR30_{M} | N70_{R} | N70_{P} | N70_{M} |
---|---|---|---|---|---|---|---|---|---|
PH | 30 | 36 | 31 | 24 | 16 | 20 | 2 | 5 | 3 |
PD | 42 | 50 | 50 | 1 | 0 | 0 | 7 | 12 | 12 |
PG | 35 | 43 | 38 | 14 | 3 | 7 | 6 | 20 | 11 |
DNA | 34 | 45 | 46 | 11 | 3 | 2 | 3 | 10 | 16 |
RNA | 34 | 46 | 42 | 11 | 2 | 7 | 4 | 8 | 8 |
OVERALL | 34 | 43 | 40 | 61 | 22 | 36 | 22 | 55 | 50 |

**Table 3.** The results for each pipeline segment are quoted: (i) the global average phase error $\langle\lvert\Delta\varphi\rvert\rangle_{\mathrm{MR}}$ calculated over all the test structures at the end of the MR step via REMO22, PHASER and MOLREP, and the corresponding $\langle\lvert\Delta\varphi\rvert\rangle_{\mathrm{REF}}$ calculated after the phase refinement step using either SYNERGY or RESOLVE. All phase errors are in degrees; (ii) the number of proteins for which SYNERGY or RESOLVE improves the MR average phase error by at least 10° (N_{P}10) and by at least 20° (N_{P}20); (iii) the number of nucleic acids for which SYNERGY or RESOLVE improves the MR average phase error by at least 10° (N_{NA}10) and by at least 20° (N_{NA}20).

Pipeline Segment | $\langle\lvert\Delta\varphi\rvert\rangle_{\mathrm{MR}}$ | $\langle\lvert\Delta\varphi\rvert\rangle_{\mathrm{REF}}$ | N_{P}10 | N_{P}20 | N_{NA}10 | N_{NA}20 |
---|---|---|---|---|---|---|
REMO22 + SYNERGY | 45 | 41 | 12 | 6 | 0 | 0 |
PHASER + SYNERGY | 58 | 53 | 16 | 6 | 17 | 4 |
MOLREP + SYNERGY | 56 | 46 | 44 | 23 | 16 | 10 |
PHASER + RESOLVE | 58 | 56 | 1 | 0 | 0 | 0 |
MOLREP + RESOLVE | 56 | 54 | 1 | 0 | 1 | 0 |
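The counts in Table 3 can be read as simple threshold tallies on the per-structure gain, that is, the MR phase error minus the refined phase error. The sketch below assumes that reading; the function name `count_improvements` and the sample values are hypothetical and not taken from the article.

```python
def count_improvements(mr_errors, refined_errors, thresholds=(10.0, 20.0)):
    """Count structures whose phase error drops by at least each threshold (degrees)."""
    if len(mr_errors) != len(refined_errors):
        raise ValueError("one MR and one refined error is needed per structure")
    counts = {threshold: 0 for threshold in thresholds}
    for before, after in zip(mr_errors, refined_errors):
        gain = before - after  # positive when refinement lowers the error
        for threshold in thresholds:
            if gain >= threshold:
                counts[threshold] += 1
    return counts


# Hypothetical per-structure errors, in degrees, before and after refinement.
example = count_improvements([60.0, 50.0, 80.0, 40.0], [45.0, 48.0, 55.0, 41.0])
```

Under this reading, a structure that improves by 20° or more is also counted in the 10° column, which is consistent with every N20 entry in Table 3 being no larger than the corresponding N10 entry.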

**Table 4.** MA denotes the percentage of non-hydrogen atoms lying within 0.6 Å of the published atomic coordinates. The number of structures (N_{RSC}, N_{PSC}, N_{MSC}, N_{PRC}, N_{MRC}, N_{RSBN}) with MA in each MA interval (INT_{MA}) is shown *.

INT_{MA} | N_{RSC} | N_{PSC} | N_{MSC} | N_{PRC} | N_{MRC} | N_{RSBN} |
---|---|---|---|---|---|---|
MA > 65 | 122 | 93 | 108 | 80 | 94 | 98 |
40 < MA ≤ 65 | 12 | 12 | 14 | 8 | 11 | 13 |
MA ≤ 40 | 23 | 52 | 35 | 69 | 52 | 46 |

* Pipelines: REMO22 + SYNERGY + CAB (N_{RSC}), PHASER + SYNERGY + CAB (N_{PSC}), MOLREP + SYNERGY + CAB (N_{MSC}), PHASER + RESOLVE + CAB (N_{PRC}), MOLREP + RESOLVE + CAB (N_{MRC}) and REMO22 + SYNERGY + (BUCCANEER or NAUTILUS) (N_{RSBN}).
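The MA metric and the intervals of Table 4 can be illustrated with a brute-force sketch. It assumes that both coordinate sets are already in the same frame (same origin and symmetry copy), that hydrogen atoms have been removed, and that the denominator is the number of published atoms; the article does not spell out these details, and the names `model_accuracy` and `ma_interval` are hypothetical.

```python
import math


def model_accuracy(model_atoms, published_atoms, cutoff=0.6):
    """Percentage of published atoms with a model atom within `cutoff` angstroms."""
    if not published_atoms:
        raise ValueError("at least one published atom is required")
    matched = 0
    for published in published_atoms:
        # Brute-force search; acceptable for a sketch, slow for large structures.
        if any(math.dist(published, model) <= cutoff for model in model_atoms):
            matched += 1
    return 100.0 * matched / len(published_atoms)


def ma_interval(ma):
    """Assign an MA value (in %) to one of the three intervals of Table 4."""
    if ma > 65:
        return "MA > 65"
    if ma > 40:
        return "40 < MA <= 65"
    return "MA <= 40"


# Hypothetical coordinates in angstroms.
published = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0), (9.0, 9.0, 9.0)]
model = [(0.1, 0.0, 0.0), (1.5, 0.0, 0.0), (5.0, 5.0, 5.5)]
ma = model_accuracy(model, published)
```

In this toy case three of the four published atoms have a model atom within 0.6 Å, so the structure falls in the highest interval.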

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Carrozzini, B.; Cascarano, G.L.; Giacovazzo, C.
The Automatic Solution of Macromolecular Crystal Structures via Molecular Replacement Techniques: REMO22 and Its Pipeline. *Int. J. Mol. Sci.* **2023**, *24*, 6070.
https://doi.org/10.3390/ijms24076070
