A Polyaddition Model for the Prebiotic Polymerization of RNA and RNA-Like Polymers

Department of Computer Engineering, University of California, Santa Cruz, CA 95064, USA
Center for Studies in Physics and Biology, The Rockefeller University, New York, NY 10065, USA
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Life 2020, 10(2), 12
Received: 16 December 2019 / Revised: 24 January 2020 / Accepted: 26 January 2020 / Published: 2 February 2020
(This article belongs to the Special Issue Themed Issue Commemorating Prof. David Deamer's 80th Birthday)


Implicit in the RNA world hypothesis is that prebiotic RNA synthesis, despite occurring in an environment without biochemical catalysts, produced the long RNA polymers which are essential to the formation of life. In order to investigate the prebiotic formation of long RNA polymers, we consider a general solution of functionally identical monomer units that are capable of bonding to form linear polymers by a step-growth process. Under the assumptions that (1) the solution is well-mixed and (2) bonding/unbonding rates are independent of polymerization state, the concentration of each length of polymer follows the geometric Flory-Schulz distribution. We consider the rate dynamics that produce this equilibrium; connect the rate dynamics, Gibbs free energy of bond formation, and the bonding probability; solve the dynamics in closed form for the representative special case of a Flory-Schulz initial condition; and demonstrate the effects of imposing a maximum polymer length. Afterwards, we derive a lower bound on the error introduced by truncation and compare this lower bound to the actual error found in our simulation. Finally, we suggest methods to connect these theoretical predictions to experimental results.

1. Introduction

The RNA world hypothesis maintains that RNA molecules, being capable of both performing functions and storing information, were the first self-replicating molecules in the origin of life [1,2]. Deamer et al. [3] have advanced a specific theory that outlines the importance of RNA to the origins of life. Of particular interest is the formation of long RNAs called ribozymes which are capable of catalysis [4,5]. A variety of ribozymes have been designed and synthesized [6], including some capable of catalyzing the reactions necessary for RNA replication [7], and some capable of replicating other ribozymes [8,9]. In theory, collections of ribozymes may form autocatalytic sets, leading to self-replication and evolution [10,11,12,13]. For a ribozyme to exist in the first place, non-enzymatic RNA synthesis must have occurred. Extensive experiments have been conducted on non-enzymatic RNA polymerization in various settings, including lipid-assisted synthesis, templating, and chemical activation of the phosphate [14]. Additionally, the effect of wet-dry cycling on RNA polymerization has been studied in simulation [15,16,17] as well as experiment [18,19,20].
RNA polymerization occurs through dehydration synthesis: the ribose unit of one nucleotide bonds with the phosphate unit of another, releasing a single water molecule. This is a classic example of a polycondensation process [21]; however, classical models of polycondensation are not appropriate for RNA polymerization because they were developed for chemical batch reactors where the reaction product is continuously evacuated to increase yield [22] (ch. 2). Instead, since the essential processes of life take place in aqueous solution, the condensate is negligibly small in comparison to the solution as a whole. As a result, the external concentration of water is approximately constant, so RNA polymerization is more accurately modeled as a polyaddition process.
Consider an experiment initially consisting of a solution of monomers (e.g., nucleotides) capable of bonding with each other to form polymers. Each monomer can support two bonds, one on its left and one on its right, so that these monomers can link together to form linear polymers of arbitrary length. A contiguous chain of k monomer units will henceforth be referred to as a k-mer, including the monomer case where k = 1. For the sake of visualization, one can imagine each monomer unit as a puzzle piece with an A terminus and a B terminus. It is important to note that no matter how long a polymer becomes, it always has precisely one unbound A terminus and one unbound B terminus.
We now make the assumption that the system is well-mixed in the sense that all reactants move and interact freely, independent of mass, polymerization status, etc. Under these conditions, the polyaddition interaction between A and B termini is described by Hill-Langmuir protein-ligand reaction kinetics; that is, the two termini bind to each other with a reaction rate constant $k_+$, and bonded A-B pairs separate from each other at a rate $k_-$. These are assumed to be independent of the configuration of the reacting monomer units; that is, the bonding rate $k_+$ does not depend on whether each A and B terminus is the endpoint of a long polymer or of a free monomer, nor is the unbonding rate $k_-$ affected by the position of the A-B bond within a polymer. Under these conditions, the reactions affecting each bonding site take the following simple form:
$$ A + B \underset{k_-}{\overset{k_+}{\rightleftharpoons}} AB \tag{1} $$

2. Flory-Schulz Polymer Length Distribution

We have assumed that all binding sites behave identically; this implies that each site has the same (potentially time-varying) probability p of being occupied at any given time. This fact, independent of bonding and un-bonding rates, leads very directly to a geometric distribution of polymer length [21]. Alternatively, Higgs [15] provides a proof of the Flory-Schulz distribution as an equilibrium state characterized entirely by bonding and un-bonding rates as opposed to bonding probabilities.
To see how our bonding probability assumption leads to a geometric distribution, we can perform the thought experiment of randomly selecting a k-mer of any length from the solution. Moving from left to right along the k-mer, the probability of a bond existing between two consecutive monomer units is p. In this way, we can view each k-mer as a sequence of Bernoulli trials, where the length of the k-mer is the number of trials up to and including the first failure. The result is by definition a geometric distribution with parameter p, so the probability mass function ρ(k) over polymer length is given for positive k by:
$$ \rho(k) = (1-p)\,p^{k-1} \tag{2} $$
From this probability distribution over polymer length, we would like to find the expected concentration of each k-mer as a function of our total concentration [U] of monomer units. If we define $n^*$ to be the total concentration of reactants, including monomers and polymers of all lengths, the expected concentration $n(k)$ of k-mers is given by multiplication with (2) as follows:
$$ n(k) = n^*\rho(k) = n^*(1-p)\,p^{k-1} \tag{3} $$
However, we would like to express this result in terms of [U] rather than $n^*$ because $n^*$ varies with time as bonds break and reform, whereas [U] is fixed in a closed system. We can find the value of $n^*$ using conservation of mass:
$$ [U] = \sum_{k=1}^{\infty} k\,n(k) = n^*(1-p)\sum_{k=1}^{\infty} k\,p^{k-1} = n^*(1-p)^{-1} \;\Longrightarrow\; n^* = (1-p)\,[U] $$
The distribution of polymer lengths for any bonding probability p is given by substituting the value of $n^*$ into (3):
$$ n(k) = (1-p)^2\,p^{k-1}\,[U] \tag{4} $$
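As a quick numerical sanity check of (4), the following minimal sketch evaluates the distribution and confirms that it sums to the expected totals. The function name `flory_schulz` and the values of `p` and `total_units` are illustrative assumptions, not taken from the text.

```python
def flory_schulz(p, total_units, max_len):
    """Return [n(1), ..., n(max_len)] with n(k) = (1 - p)^2 * p^(k - 1) * [U]."""
    return [(1.0 - p) ** 2 * p ** (k - 1) * total_units
            for k in range(1, max_len + 1)]


# Illustrative values (assumptions): bonding probability and total monomer units.
p, total_units = 0.6, 1.0
n = flory_schulz(p, total_units, max_len=200)

# Total molecule concentration n*, expected to equal (1 - p) * [U].
n_star = sum(n)
# Monomer units recovered from the distribution, expected to equal [U].
recovered_units = sum(k * n_k for k, n_k in enumerate(n, start=1))
# Ratio of successive concentrations, expected to equal p.
dimer_to_monomer = n[1] / n[0]

print(n_star, recovered_units, dimer_to_monomer)
```

The cutoff at 200 terms only stands in for the infinite sum; for this value of p the neglected tail is far below floating-point precision.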

2.1. Steady-State Bonding Probability from Reaction Rates

We consider a step-growth polymerization process described by (1), and assume that the total number of reactants is large enough for the law of mass action to apply. Under these conditions, if we introduce the equilibrium constant $\kappa = k_-/k_+$, the steady-state concentration of A-B bonds [AB] is given by the Hill-Langmuir equation:
$$ [AB] = [U]\,\frac{[A]}{[A]+\kappa} $$
Here [A] is always equal to the total reactant concentration $n^*$ because each monomer or polymer has exactly one unbound A and one unbound B terminus. Thus we can calculate the steady-state bonding probability $P_b$:
$$ P_b = \frac{[AB]}{[U]} = \frac{n^*}{n^*+\kappa} = \frac{(1-P_b)\,[U]}{(1-P_b)\,[U]+\kappa} \tag{5} $$
Rearranging to solve for $P_b$ gives a quadratic equation with two real roots for positive values of $\kappa$. One of these roots is greater than 1 and thus cannot correspond to a probability, so the other must be the solution. We introduce the reduced rate constant $\bar\kappa = \kappa/(2[U])$ and solve to find a value of $P_b$ which can be substituted into (4):
$$ P_b = 1 + \bar\kappa - \sqrt{\bar\kappa\,(2+\bar\kappa)} \tag{6} $$
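The closed form for the steady-state bonding probability can be checked against the implicit relation it was derived from. The sketch below does this for one illustrative parameter set; the function name and the values of `kappa` and `total_units` are assumptions chosen for convenience.

```python
import math


def steady_state_bonding_probability(kappa, total_units):
    """Physical root of the quadratic: 1 + kbar - sqrt(kbar * (2 + kbar))."""
    kbar = kappa / (2.0 * total_units)      # reduced rate constant
    return 1.0 + kbar - math.sqrt(kbar * (2.0 + kbar))


# Illustrative values (assumptions), in arbitrary concentration units.
kappa, total_units = 0.05, 1.0
P_b = steady_state_bonding_probability(kappa, total_units)

# Residual of the implicit relation P_b = n* / (n* + kappa), n* = (1 - P_b) [U].
n_star = (1.0 - P_b) * total_units
residual = P_b - n_star / (n_star + kappa)

print(P_b, residual)
```

For very small reduced rate constants, evaluating 1 - P_b as the difference of the square root and the reduced constant avoids subtracting nearly equal numbers.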

2.2. Thermodynamics of Bonding

We have described the bonding sites as a vast number of non-interacting systems which alternate stochastically between discrete states. This means that the steady-state probability of bonding can be described by Boltzmann statistics if we associate a Gibbs free energy $\Delta G_b$ with the bound state:
$$ P_b = \frac{e^{-\Delta G_b/RT}}{1+e^{-\Delta G_b/RT}} \;\iff\; \Delta G_b = -RT\,\ln\frac{P_b}{1-P_b} \tag{7} $$
For this system, the equilibrium constant $\kappa$ must have units of concentration, meaning that the commonly employed expression $\kappa = e^{\Delta G_b/RT}$ is dimensionally inconsistent. Solving (5) for $\kappa$, then substituting (7), we find:
$$ \kappa = [U]\,\frac{e^{\Delta G_b/RT}}{1+e^{-\Delta G_b/RT}} $$
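The relations between free energy, bonding probability, and equilibrium constant are easy to mix up in sign, so a short numerical sketch may help. The function names are hypothetical, the gas constant is expressed in kcal/(mol K), and the free energy and temperature are illustrative inputs.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def bonding_probability(delta_g, temperature):
    """Boltzmann probability of the bound state for free energy delta_g (kcal/mol)."""
    return 1.0 / (1.0 + math.exp(delta_g / (R_KCAL * temperature)))


def free_energy(p_bond, temperature):
    """Inverse relation: delta_g = -RT ln(P_b / (1 - P_b))."""
    return -R_KCAL * temperature * math.log(p_bond / (1.0 - p_bond))


def kappa_from_free_energy(delta_g, temperature, total_units):
    """Equilibrium constant with units of concentration, as derived in the text."""
    x = delta_g / (R_KCAL * temperature)
    return total_units * math.exp(x) / (1.0 + math.exp(-x))


# Illustrative inputs (assumptions): -3.5 kcal/mol at 85 degrees C, [U] = 1.
temperature, delta_g, total_units = 358.15, -3.5, 1.0
P_b = bonding_probability(delta_g, temperature)
kappa = kappa_from_free_energy(delta_g, temperature, total_units)

# Cross-check: solving the steady-state relation for kappa gives [U](1-P_b)^2/P_b.
kappa_check = total_units * (1.0 - P_b) ** 2 / P_b

print(P_b, free_energy(P_b, temperature), kappa, kappa_check)
```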
The relation defining the Gibbs free energy implies a functional dependence between $\Delta G_b$ and temperature: $\Delta G_b = \Delta H_b - T\Delta S_b$, where $\Delta H_b$ and $\Delta S_b$ are the enthalpy and entropy of bonding respectively. This means that, depending on the signs of these two quantities, a polymerization reaction may change favorability with temperature, as shown in Table 1-3 of Voet & Voet [23]; both $P_b$ and $\kappa$ will vary with temperature to reflect this. The four cases are compared in Figure 1 as well as in Table 1.
We expect intuitively that in most polymerization reactions $\Delta S_b$ would be negative due to the increased order, meaning that polymerization would have to be enthalpically favorable in order to be observed at all. This explains the observation that polymerization is favorable at low temperatures but polymers break down as temperature increases, for example in self-assembly of nanowires [24].
The suggestion that polymerization ought only to proceed when bond formation is enthalpically favorable may appear to conflict with the fact that the formation of an ester linkage between a sugar and a phosphate group is endergonic under standard conditions, as shown in Table 13-4 of Nelson & Cox [25]. However, this is a consequence of entropic unfavorability: by confining the reactants to the surface of a microdroplet, Nam et al. were able to nearly eliminate the contribution of the $T\Delta S$ term to the free energy of esterification, revealing a favorable negative value of $\Delta H$ [26]. Other means of reducing the entropic unfavorability of polymerization, such as mineral surface adsorption [27], restriction to small cavities [28], or the excluded volume effect of crowding [29], can also increase the favorability of polymerization.

3. Dynamics

In this section, we look at another way of thinking about our chemical system. In particular, we consider a countably infinite family of reaction equations which describe the way in which i-mers and j-mers bond to form $(i+j)$-mers, represented with the chemical symbols $P_i$, $P_j$, and $P_{i+j}$. The chemical equations in this family are of the form:
$$ P_i + P_j \underset{k_-}{\overset{k_+}{\rightleftharpoons}} P_{i+j} \tag{8} $$
It is perhaps not immediately obvious that (8) describes the same system as (1), but in fact they are two different views of the same chemical process. From the perspective of bond formation, a k-mer is identical to a monomer in that it has precisely one A terminus and one B terminus. In this view, (8) is derived from splitting up the single reaction Equation (1) into separate chemical equations describing the behavior of each possible configuration of A and B termini: the A terminus is the end of an i-mer, and the B terminus is the end of a j-mer.

3.1. Continuous Dynamics

We have found a set of chemical equations which describe the interactions of individual k-mers $P_k$. This is fundamentally a stochastic jump process describing discrete numbers of k-mers, but in the thermodynamic limit as the number of reactants grows very large, we can concern ourselves with the deterministic, continuous evolution of the expected concentration $n(k)$ of k-mers.
Our dynamics can be written as a system of differential equations describing the time derivative of $n(k)$. As is usual for deriving mass-action differential equations from systems of chemical equations, we find the time derivative of $n(k)$ by a summation over each place where $P_k$ occurs in the system of chemical equations: if it is on the left-hand side, a negative contribution is made to $\frac{d}{dt}n(k)$, and if on the right, the contribution is positive.
Any given $P_k$ can appear in all three positions in the chemical Equation (8). For each equation where $P_k$ appears as the first term on the left side (i.e., for each possible synthesis partner $j \in \mathbb{N}$), we lose $P_k$ at a rate $k_+[P_k][P_j]$, but gain it at a rate $k_-[P_{k+j}]$. Each of those contributions should also be doubled to handle the functionally identical case where $P_k$ appears as the second term on the left side. Finally, when $P_k$ appears on the right side, for each possible split point $j \in \{1, \dots, k-1\}$, we gain $P_k$ at a rate $k_+[P_{k-j}][P_j]$ and lose it at a rate $k_-[P_k]$. The facts above can be consolidated into a single differential equation describing the evolution of $n(k) = [P_k]$ as follows:
$$ \frac{dn(k)}{dt} = \sum_{j=1}^{\infty}\Big[\,2k_-\,n(k+j) - 2k_+\,n(k)\,n(j)\Big] + \sum_{j=1}^{k-1}\Big[\,k_+\,n(j)\,n(k-j) - k_-\,n(k)\Big] \tag{9} $$

3.2. Reduction to One Dimension

We consider the special case where the initial condition is a Flory-Schulz distribution (4) with parameter $p(0)$. For example, the $p(0) = 0$ case would be a solution consisting entirely of monomers, and is particularly relevant as it is a popular experimental initial condition [18,19,20,30].
The derivation of the Flory-Schulz distribution holds for all time in a well-mixed step-growth polymerization process, even as the distribution parameter p evolves. This has been predicted theoretically [31] and demonstrated experimentally [24,32,33]. We would like to calculate the rate at which the distribution parameter p changes with time. A generalization of this problem was discussed in the context of self-assembling nanoparticles by Gu et al. [34] (SI 2).
Applying the principle of mass action to (1) to calculate the time derivative of the total concentration of bonds [AB], then dividing through by [U] gives:
$$ \frac{dp}{dt} = \frac{d}{dt}\frac{[AB]}{[U]} = [U]^{-1}k_+\big([U]-[AB]\big)^2 - k_-\,[U]^{-1}[AB] = [U]\,k_+(1-p)^2 - k_-\,p \tag{10} $$
The time evolution of the Flory-Schulz parameter p according to our closed-form solution of this equation, together with the resulting time evolution of the polymer length distribution, is shown in Figure 2. The initial condition is $p = 0$, corresponding to an all-monomer solution.

3.3. Closed-Form Solution

For the special case we just considered where the initial distribution is Flory-Schulz, the system has been reduced to the one-dimensional ODE (10). We can go one step further: this ODE is separable and admits a closed-form solution. In preparation for this, we will perform some simplifications. First, recall that the steady-state bonding probability is $P_b = 1 + \bar\kappa - \Delta$, where $\bar\kappa = \frac{k_-/k_+}{2[U]}$ and $\Delta = \sqrt{\bar\kappa\,(2+\bar\kappa)}$. We nondimensionalize the ODE (10) by setting $\tau = 2k_+[U]\,t$, transforming the equation into:
$$ \frac{dp}{d\tau} = \tfrac{1}{2}(1-p)^2 - \bar\kappa\,p $$
The result is a separable ODE, allowing us to write:
$$ \int \frac{dp}{\tfrac{1}{2}(1-p)^2 - \bar\kappa\,p} = \int d\tau = \tau + c $$
This gives us $\tau$ as a function of p, which can be inverted to give a solution to the ODE:
$$ p(\tau) = 1 + \bar\kappa - \Delta\,\tanh\!\left(\tfrac{1}{2}\Delta\,\tau + c\right) $$
We can fix $p(0)$ to solve for the value of c:
$$ c = \tanh^{-1}\frac{1+\bar\kappa-p(0)}{\Delta} $$
Finally, we can recover the original time parameterization by replacing $\tau$ with $2k_+[U]\,t$, which gives the parameter of the Flory-Schulz distribution as a function of time:
$$ p(t) = 1 + \bar\kappa - \Delta\,\tanh\!\left(\Delta\,k_+[U]\,t + \tanh^{-1}\frac{1+\bar\kappa-p(0)}{\Delta}\right) $$
As $\tau \to \infty$, the tanh function asymptotically approaches a value of 1 regardless of initial condition, which recovers the previously derived steady-state value $P_b$.
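The closed-form solution can be compared against a direct numerical integration of the one-dimensional ODE. The sketch below is one way to do this under stated assumptions: the function names and rate constants are illustrative, and a hand-written fourth-order Runge-Kutta loop stands in for a library solver. One practical detail: when p(0) lies below the steady-state value, the argument of the inverse hyperbolic tangent exceeds 1, so the sketch evaluates the equivalent real-valued hyperbolic cotangent form, using the identity tanh(x + iπ/2) = coth(x).

```python
import math


def p_closed_form(t, p0, k_plus, k_minus, total_units):
    """Closed-form bonding probability p(t), evaluated on the real-valued branch."""
    kbar = (k_minus / k_plus) / (2.0 * total_units)
    delta = math.sqrt(kbar * (2.0 + kbar))
    r = (1.0 + kbar - p0) / delta           # argument of the inverse tanh
    x = delta * k_plus * total_units * t
    if math.isclose(r, 1.0):                # already at the steady state
        return p0
    if r < 1.0:                             # p0 above the steady state: tanh branch
        return 1.0 + kbar - delta * math.tanh(x + math.atanh(r))
    c = 0.5 * math.log((r + 1.0) / (r - 1.0))   # arccoth(r), valid for r > 1
    return 1.0 + kbar - delta / math.tanh(x + c)


def p_numeric(t_end, p0, k_plus, k_minus, total_units, steps=3000):
    """Classical RK4 integration of dp/dt = [U] k+ (1 - p)^2 - k- p."""
    def f(p):
        return total_units * k_plus * (1.0 - p) ** 2 - k_minus * p
    h, p = t_end / steps, p0
    for _ in range(steps):
        k1 = f(p)
        k2 = f(p + 0.5 * h * k1)
        k3 = f(p + 0.5 * h * k2)
        k4 = f(p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


# Illustrative rate constants (assumptions); they give a steady state of 0.8.
k_plus, k_minus, total_units = 1.0, 0.05, 1.0
p_exact = p_closed_form(3.0, 0.0, k_plus, k_minus, total_units)
p_rk4 = p_numeric(3.0, 0.0, k_plus, k_minus, total_units)
p_late = p_closed_form(1e3, 0.0, k_plus, k_minus, total_units)
print(p_exact, p_rk4, p_late)
```

Starting from p(0) = 1 instead exercises the tanh branch, which describes depolymerization toward the same steady state.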

4. Numerical Treatments

We have derived the rate dynamics of interacting k-mers (9) from the family of reaction Equation (8). However, because the state vectors lie in an infinite-dimensional space, physically realizable numerical methods require us to approximate these dynamics in finitely many dimensions. From our work in Section 2, we know that the expected number of extremely long polymers tends to be low due to the geometrically-distributed equilibrium state. Therefore, we can achieve very low error by introducing a constraint d on the maximum length of polymers to be considered. This effectively constrains the system from the infinite-dimensional space $\ell^2$ down to the finite-dimensional $\mathbb{R}^d$.

4.1. Choice of Parameter Values

An experimentalist investigating polymerization in the lab might choose a set of representative conditions in the form of an initial temperature, pH, total concentration of monomer units, and presence of other cofactors such as salts. The dynamics of the system are determined by the rate constants $k_+$ and $k_-$, which are functions of the experimental conditions; although the effects of pH and salt cofactors on hydrolysis have been studied in depth, e.g., by Oivanen et al. [35], their effects on synthesis rates remain obscure.
In our setting, we consider conditions common to experiments investigating the hot-spring origins of life hypothesis [16,18,20], with temperature T = 85 °C and pH of approximately 3. This allows us to take the approximate value of $k_-$ from the experimental results of Oivanen et al. [35] for similar conditions. As noted in Section 2.2, although ester formation is endergonic under standard conditions, it is enthalpically favorable and can be made spontaneous by decreasing its entropic unfavorability. We assume that $\Delta G_b$ has been brought down by some means to an illustrative negative value, and compute the corresponding value of $k_+$.

4.2. Truncation

Although we seek to truncate the system to a finite dimension d, we do this not by throwing away polymers which become too large, but rather by eliminating the formation of longer polymers in the first place. This means that we approximate the family of reaction Equation (8) by prohibiting all reactions which include a reactant of length greater than d:
$$ P_i + P_j \underset{k_-}{\overset{k_+}{\rightleftharpoons}} P_{i+j} \quad \text{for } i + j \le d \tag{12} $$
The dynamics can be derived from the reaction family (12) in exactly the same way that (9) was derived from (8), the only difference being that the first sum becomes finite due to the truncation. The resulting system of ODEs, describing the evolution of a state vector $x \in \mathbb{R}^d$ whose components $x_k$ represent the concentration of k-mers, is given by:
$$ \frac{dx_k}{dt} = \sum_{l=1}^{d-k}\Big[\,2k_-\,x_{k+l} - 2k_+\,x_k\,x_l\Big] + \sum_{l=1}^{k-1}\Big[\,k_+\,x_l\,x_{k-l} - k_-\,x_k\Big] \tag{13} $$
A perhaps more obvious method of truncation would be to keep the exact original form of (9), but ignore lengths above d by taking $n(k) = 0$ for $k > d$. However, this approach leads to unsatisfactory results because it is equivalent to permanently deleting any k-mer which forms with $k > d$. Since the mass associated with these deleted k-mers is never returned to the system, mass is continually being lost, so the system asymptotically approaches a steady state at $x = 0$.
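The difference between the two truncation strategies can be seen directly in the instantaneous rate of change of total mass. The sketch below is a plain-Python illustration, not the simulation code referred to in the text; the function names, the `conserve_mass` flag, and the parameter values are assumptions.

```python
def rhs(x, k_plus, k_minus, conserve_mass=True):
    """Time derivative of k-mer concentrations x[0..d-1] (x[k-1] is the k-mer).

    conserve_mass=True forbids reactions whose product exceeds length d;
    conserve_mass=False keeps every bonding loss term and discards products
    longer than d (the naive truncation).
    """
    d = len(x)
    dx = []
    for k in range(1, d + 1):
        xk = x[k - 1]
        rate = 0.0
        for l in range(1, d + 1):
            if k + l <= d:
                # Exchange with the (k+l)-mer, doubled for the two termini.
                rate += 2.0 * k_minus * x[k + l - 1] - 2.0 * k_plus * xk * x[l - 1]
            elif not conserve_mass:
                # Naive truncation: the k-mer is consumed, the product is deleted.
                rate -= 2.0 * k_plus * xk * x[l - 1]
        for l in range(1, k):
            # Formation from an l-mer and a (k-l)-mer, and splitting at point l.
            rate += k_plus * x[l - 1] * x[k - l - 1] - k_minus * xk
        dx.append(rate)
    return dx


def mass_rate(dx):
    """Rate of change of the total monomer-unit concentration."""
    return sum(k * v for k, v in enumerate(dx, start=1))


# Illustrative state (assumption): a Flory-Schulz profile with p = 0.5, cut at d = 10.
d = 10
x = [(1.0 - 0.5) ** 2 * 0.5 ** (k - 1) for k in range(1, d + 1)]
rate_conserving = mass_rate(rhs(x, 1.0, 0.05))
rate_naive = mass_rate(rhs(x, 1.0, 0.05, conserve_mass=False))
print(rate_conserving, rate_naive)
```

The conserving form gives a mass rate of zero up to rounding, while the naive form gives a strictly negative rate whenever two species whose lengths sum to more than d are both present.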

4.3. Simulations

To demonstrate the dynamics of the system and the effects of truncation, we numerically solve (13) starting from an initial solution of exclusively monomers for the truncation lengths $d = 100$ and $d = 10$, and plot the concentration of k-mers up to length 10 over time in Figure 3. All simulations were run using the DifferentialEquations.jl package [36] with parameters other than d held fixed; these values are given in Table 2. The GitHub repository containing our simulation code is given in the Supplementary Materials below.
The expected equilibrium state is the geometric distribution (4), which would appear uniformly spaced on a logarithmic plot, with the dimer concentration equal to $P_b$ multiplied by the monomer concentration and so on. In the case where $d = 100$, this is exactly what we observe; however, when we truncate to $d = 10$, the distribution goes through an inversion after which d-mers, rather than monomers, dominate. Since truncation depends on the assumption that longer polymers are negligible, this is obviously nonphysical.
Although each of our simulations converges to some steady-state distribution, the degree of agreement with our theoretical prediction varies depending on the truncation length d. To visualize this, Figure 4 plots the steady state distributions for three values of d compared to the theoretical steady-state Flory-Schulz distribution.

4.4. Error Bound

Since the Flory-Schulz distribution of polymer length which is the solution to the system of reaction Equation (9) includes a non-zero expected concentration for polymers longer than any finite d, it is impossible for the truncated probability distribution which is the solution to (13) to be identical to the infinite-dimensional solution. As noted above, the dynamics of (13) are exactly the result of constraining the dynamics of (9) to finite maximum polymer length while preserving conservation of mass. Therefore, the distance between the true solution $n(k)$ and its projection $\hat n(k)$ onto the set of d-dimensional distributions with the correct total mass provides a lower bound to the error of any mass-preserving truncation of the reaction family (8). We can use the “mass operator” $Mn = \sum_k k\,n(k)$, which counts the total concentration of monomer units in the system, to write this projection as:
$$ \underset{\hat n \in \ell^2}{\text{minimize}} \;\; \|\hat n - n\|_2 \quad \text{subject to} \quad M\hat n = [U] \;\text{ and }\; \hat n(k) = 0 \;\; \forall\, k > d $$
We can simplify the objective by separating the portion of $\hat n$ which is allowed to vary from the infinite “tail” which is fixed to zero. Introducing the d-dimensional truncations $x$ and $\hat x$ of $n$ and $\hat n$ respectively, and letting $y = n - x$ to capture the error in the tail, we have:
$$ \|\hat n - n\|_2^2 = \sum_{k=1}^{\infty}\big(\hat n(k) - n(k)\big)^2 = \sum_{k=1}^{d}\big(\hat n(k) - n(k)\big)^2 + \sum_{k=d+1}^{\infty} n(k)^2 = \|\hat x - x\|_2^2 + \|y\|_2^2 \tag{14} $$
Now we can change variables to $\delta x = \hat x - x$ and find the optimal projection using ordinary least squares. In plainer language, the problem being solved is to find the smallest correction $\delta x$ whose total mass is equal to that of the missing “tail” y. The finite-dimensional version of the mass operator, $M_d x = \sum_{k=1}^{d} k\,x_k$, can be constructed as a $1 \times d$ matrix whose entries are ascending integers.
$$ \underset{\delta x \in \mathbb{R}^d}{\text{minimize}} \;\; \|\delta x\|_2^2 \quad \text{subject to} \quad M_d(x + \delta x) = [U] \;\iff\; M_d\,\delta x = [U] - M_d\,x = My \tag{15} $$
The well-known closed-form solution to the ordinary least squares problem (15) is:
$$ \delta x = \big(M_d M_d^T\big)^{-1} M_d^T\, My $$
This can be brought into more elementary terms by calculating:
$$ \begin{aligned} My &= \sum_{k=d+1}^{\infty} k\,n(k) = [U](1-p)^2 \sum_{k=d+1}^{\infty} k\,p^{k-1} = [U]\big(1 + d(1-p)\big)\,p^d \\ \|y\|_2^2 &= \sum_{k=d+1}^{\infty} n(k)^2 = \sum_{k=d+1}^{\infty} (1-p)^4 [U]^2 p^{2(k-1)} = \frac{(1-p)^4 [U]^2 p^{2d}}{1-p^2} \\ M_d M_d^T &= \sum_{k=1}^{d} k^2 = \tfrac{1}{6}\,d(d+1)(2d+1) \end{aligned} $$
$$ \|\delta x\|_2^2 = \delta x^T \delta x = My\,M_d\big(M_d M_d^T\big)^{-1}\big(M_d M_d^T\big)^{-1} M_d^T\,My = \frac{(My)^2}{M_d M_d^T} $$
We can now directly calculate the total distance of the projection using (14).
$$ \|\hat n - n\|_2^2 = \|\delta x\|_2^2 + \|y\|_2^2 = \frac{6\,[U]^2\big(1 + d(1-p)\big)^2 p^{2d}}{d(d+1)(2d+1)} + \frac{(1-p)^4 [U]^2 p^{2d}}{1-p^2} $$
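The closed-form squared distance can be cross-checked by carrying out the projection numerically. In the sketch below, the function names, the finite cutoff `tail_terms` used in place of the infinite tail sums, and the parameter values are all assumptions made for illustration.

```python
def numeric_projection(p, total_units, d, tail_terms=2000):
    """Minimum-norm mass-preserving projection onto lengths 1..d.

    Returns the projected vector x_hat and the squared l2 distance to the
    full Flory-Schulz distribution (tail sums truncated after tail_terms).
    """
    def n(k):
        return (1.0 - p) ** 2 * p ** (k - 1) * total_units

    x = [n(k) for k in range(1, d + 1)]
    tail = range(d + 1, d + 1 + tail_terms)
    tail_mass = sum(k * n(k) for k in tail)            # M y
    gram = sum(k * k for k in range(1, d + 1))         # M_d M_d^T
    # Least-squares correction: proportional to the mass operator's entries.
    delta_x = [k * tail_mass / gram for k in range(1, d + 1)]
    tail_sq = sum(n(k) ** 2 for k in tail)             # ||y||^2
    x_hat = [a + b for a, b in zip(x, delta_x)]
    return x_hat, sum(v * v for v in delta_x) + tail_sq


def closed_form_distance_sq(p, total_units, d):
    """Closed-form squared distance, the sum of the two terms derived above."""
    u2 = total_units ** 2
    correction = 6.0 * u2 * (1.0 + d * (1.0 - p)) ** 2 * p ** (2 * d) / (d * (d + 1) * (2 * d + 1))
    tail = (1.0 - p) ** 4 * u2 * p ** (2 * d) / (1.0 - p * p)
    return correction + tail


# Illustrative values (assumptions).
p, total_units, d = 0.8, 1.0, 20
x_hat, dist_sq_numeric = numeric_projection(p, total_units, d)
dist_sq_closed = closed_form_distance_sq(p, total_units, d)
projected_mass = sum(k * v for k, v in enumerate(x_hat, start=1))
print(dist_sq_numeric, dist_sq_closed, projected_mass)
```

The projected vector carries the full mass [U], which is the constraint that distinguishes this projection from simply zeroing the tail.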
This provides a lower bound on the sum-squared error of any solution with maximum polymer length d and the same total mass as the previously proven solution $n(k)$. In order to make this result more directly comparable between different parameter values, however, we will use the relative error:
$$ E(p) = \sqrt{\frac{\|\hat n - n\|_2^2}{\|n\|_2^2}} = p^d \sqrt{1 + \frac{6\big(1 + d(1-p)\big)^2 (1-p^2)}{d(d+1)(2d+1)(1-p)^4}} \;>\; p^d \tag{16} $$

4.5. Applying the Error Bound

This result $E$, which is strictly greater than but asymptotically equal to $p^d$, provides an absolute lower bound on the $\ell^2$ error between the instantaneous distribution of polymer lengths and any mass-conserving finite approximation to this distribution. In other words, error terms on the order of $p^d$ arise in any simulation of our chemical system, so long as the simulation (a) produces finite-dimensional results and (b) obeys conservation of mass. These errors can be surprisingly large even for quite reasonable-sounding d as p approaches 1. This result is relatively insensitive to the choice of error metric; although we specifically investigate the case of the $\ell^2$ norm, other metrics which we tested in simulation also produced error on the order of $p^d$.
It is important to emphasize that this is a lower bound on error; this does not guarantee that a certain choice of d will produce less than a certain error, which is in fact impossible without being more specific about the method of solution. For example, in Figure 5, above a certain truncation length of about $d = 250$, the finite precision of the solver becomes more of a limiting factor than the truncation error. Likewise, early in the dynamical simulation when the instantaneous value of p is very small, the error bound is practically useless. Instead, this bound guarantees that any simulation which chooses d too small will produce at least a certain specified error.
As an example of applying this error bound in practice, if we consider a system with $\Delta G_b = -3.5$ kcal/mol, corresponding to a steady-state bonding probability $P_b = \left(1 + e^{\Delta G_b/RT}\right)^{-1} \approx 99.3\%$, we can numerically solve (16) for d to find that a simulation with $d < 700$ cannot have final relative error less than 1%. The simpler error bound $E > P_b^{\,d}$ is even easier to apply: any truncation length $d < \log_{P_b} E^*$ must produce final relative error $E > E^*$. In the above case, this laxer bound is only able to rule out truncations up to $d = 632$, but the ease of calculation makes this bound probably more useful than the tighter one.
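The two thresholds quoted above can be reproduced approximately with a few lines of code. The sketch below assumes a temperature of 85 °C, as in the parameter discussion, and takes the relative error bound to be $p^d$ times the square root of the bracketed correction factor, which is the form consistent with the quoted thresholds; the function names are hypothetical, and the results should land near 632 and 700 rather than being relied on to the last digit.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def relative_error_bound(p, d):
    """Lower bound on the relative l2 error for truncation length d."""
    extra = (6.0 * (1.0 + d * (1.0 - p)) ** 2 * (1.0 - p * p)
             / (d * (d + 1) * (2 * d + 1) * (1.0 - p) ** 4))
    return p ** d * math.sqrt(1.0 + extra)


def smallest_d_simple(p, target):
    """Smallest d with p^d <= target, from the laxer bound E > p^d."""
    return math.ceil(math.log(target) / math.log(p))


def smallest_d_tight(p, target):
    """Smallest d whose tighter bound does not exceed target, by direct search."""
    d = 1
    while relative_error_bound(p, d) > target:
        d += 1
    return d


# Assumed conditions: delta G_b = -3.5 kcal/mol at 85 degrees C, 1% target error.
temperature = 85.0 + 273.15
delta_g = -3.5
P_b = 1.0 / (1.0 + math.exp(delta_g / (R_KCAL * temperature)))
d_simple = smallest_d_simple(P_b, 0.01)
d_tight = smallest_d_tight(P_b, 0.01)
print(P_b, d_simple, d_tight)
```

Any truncation length below the returned values is ruled out by the corresponding bound; lengths above them are not thereby guaranteed to be accurate.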

5. Comparison to Experiment

The main value of a theoretical model of any physical process is the predictive and interpretive power it brings to designing and analyzing experimental results. In this section, we present a few commonly measured experimental quantities and the predictions our theory makes about them.

5.1. Critical Concentration

As the total concentration [U] of monomer units increases, the concentration of monomers in the final solution approaches a fixed value $C_c$ called the critical concentration [37] (Section 3.2). This value is a function only of the rate constants, and can be calculated by taking the limit of $n(1)$ as [U] is taken to infinity. The monomer concentration $n(1)$ is given by substituting (6) into the Flory-Schulz distribution (4) with $k = 1$:
$$ n(1) = (1-P_b)^2\,[U] = \Big(\bar\kappa^2 - 2\bar\kappa\sqrt{\bar\kappa\,(2+\bar\kappa)} + \bar\kappa\,(2+\bar\kappa)\Big)[U] $$
As [U] grows to infinity, the reduced equilibrium constant $\bar\kappa = \kappa/(2[U])$ goes to zero, so we can calculate $C_c$ as:
$$ C_c = \lim_{[U]\to\infty} n(1) = \lim_{[U]\to\infty} \Big(2\bar\kappa\,[U] + o(1)\Big) = \kappa $$
The critical concentration provides an alternate route to experimental determination of rate constants; since it is likely easier to measure the monomer concentration than to find the complete length distribution, it is possible to use the critical concentration to find one rate constant given the other.
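The approach to the critical concentration can be illustrated numerically. In the sketch below, the function name and the value of `kappa` are assumptions; the monomer concentration is evaluated for increasing total concentration and compared with the limiting value.

```python
import math


def monomer_concentration(kappa, total_units):
    """Steady-state n(1) = (1 - P_b)^2 [U], with 1 - P_b = sqrt(kbar (2 + kbar)) - kbar."""
    kbar = kappa / (2.0 * total_units)
    one_minus_pb = math.sqrt(kbar * (2.0 + kbar)) - kbar
    return one_minus_pb ** 2 * total_units


# Illustrative equilibrium constant (assumption), arbitrary concentration units.
kappa = 0.05
totals = [1.0, 1e2, 1e4, 1e6]
monomers = [monomer_concentration(kappa, u) for u in totals]

for u, n1 in zip(totals, monomers):
    print(u, n1, kappa - n1)
```

The gap to the limiting value shrinks slowly, roughly as the square root of the ratio of the equilibrium constant to the total concentration, so very high total concentrations are needed before the monomer concentration is a close estimate of it.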

5.2. Polymer Yield

Experimental studies commonly report the polymer yield, the fraction of mass which is converted to polymers at equilibrium. We can compute this mass conversion efficiency $\eta$ as:
$$ \eta = \frac{[U] - n(1)}{[U]} = 1 - (1-P_b)^2 = P_b\,(2 - P_b) \tag{17} $$
The analogous quantity derived from concentrations rather than masses is simply equal to $P_b$, since the monomer concentration ratio is exactly $1 - P_b$.

5.3. Mass Distribution

It is frequently easier to measure the mass rather than the concentration of polymers of each given length. For example, the output of high-performance liquid chromatography (HPLC) is a “spectrum” where the height and location of each peak correspond approximately to the total and per-molecule mass of a reaction product respectively. For comparison with such results, we follow Flory’s treatment of the polymer mass distribution $m(k)$, which can be defined in terms of the concentration distribution $n(k)$ (3) as follows [38]:
$$ m(k) = k\,m_0\,n(k) $$
Here, each term $m(k)$ is the theoretical total mass of k-mers, and $m_0$ is the molar mass of the corresponding monomer. Since $m(k)$ is a single-peak distribution, we can analyze an experimentally determined mass distribution $\hat m(k)$ by matching the position $k^*$ of the peak of $m(k)$ to the position $\hat k^*$ of the peak of $\hat m(k)$. We find the peak of our theoretical distribution by differentiating $m(k)$:
$$ \frac{dm(k)}{dk} = m_0\,(1-p)^2\,[U]\,p^{k-1}\,(1 + k\ln p) $$
The unique zero of this equation is $k^*$, so according to our theory, the mode of the mass distribution is related to the bonding probability by:
$$ k^* = -\frac{1}{\ln p} \;\iff\; p = e^{-1/k^*} \tag{18} $$
In other words, given an experimental Flory-Schulz-like mass distribution, one can derive the value for the bonding probability p. At equilibrium, $p \approx P_b$, so the Gibbs free energy of bond formation can be calculated using (7).
As a simple example of how these results may be applied, Monnard et al. used HPLC to measure nonenzymatic polymerization of activated nucleotides [28]. When multiple bases were mixed, the HPLC results are difficult to interpret, but in the case with pure uridine, the result appears to be a Flory-Schulz mass distribution with a mode of about 2. This corresponds by (18) to $P_b \approx 60\%$. Substituting this value into (17), we expect polymer yield of $\eta \approx 85\%$, consistent with their reported value of 88%.
We can also estimate the free energy of bond formation under their experimental conditions in this way: when $P_b \approx 60\%$, the Boltzmann statistics of (7) give a free energy $\Delta G \approx -0.2$ kcal/mol, corresponding to a process where bond formation is slightly favorable. Additionally, given the free energy at two different temperatures, we can separate the entropic and enthalpic contributions. Since this experiment was carried out at −18 °C, we can make a first approximation by assuming that our calculated value is directly comparable to the value for sugar-phosphate esterification under standard conditions of 3.3 kcal/mol as shown in Table 13-4 of Nelson & Cox [25].
The temperature of this experiment was 43 K colder than the standard temperature of 25 °C, and the concomitant free energy change was 3.5 kcal/mol. Since $\Delta G = \Delta H - T\Delta S$, a bit of algebra gives $\Delta S \approx -0.1$ kcal mol$^{-1}$ K$^{-1}$ and $\Delta H \approx -20$ kcal/mol. These conditions correspond to the case in Figure 1 and Table 1 where polymerization is favorable below a critical temperature $T_c \approx -15$ °C. It seems suspect, however, that $\Delta H$ would be so large [26], suggesting that the role of entropy in bond formation is reduced by the environment of the eutectic ice-water mixture, in agreement with the commentary of Monnard et al. [28].
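The chain of estimates in this worked example can be retraced step by step. The sketch below uses only the inputs quoted in the text (a mass-distribution mode of about 2, an experimental temperature of −18 °C, and a standard free energy of 3.3 kcal/mol at 25 °C); the variable names are illustrative, and the unrounded results differ slightly from the rounded figures given above.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

# Inputs quoted in the text.
mode_length = 2.0                    # mode k* of the mass distribution
t_cold = 273.15 - 18.0               # experimental temperature in K
t_std, dg_std = 298.15, 3.3          # standard temperature (K) and free energy (kcal/mol)

# Bonding probability from the mode, then polymer yield by mass.
p_bond = math.exp(-1.0 / mode_length)
polymer_yield = p_bond * (2.0 - p_bond)

# Free energy of bond formation at the experimental temperature.
dg_cold = -R_KCAL * t_cold * math.log(p_bond / (1.0 - p_bond))

# Two-point separation of delta G = delta H - T delta S.
delta_s = -(dg_std - dg_cold) / (t_std - t_cold)     # kcal/(mol K)
delta_h = dg_std + t_std * delta_s                   # kcal/mol
t_critical_c = delta_h / delta_s - 273.15            # degrees C, where delta G = 0

print(p_bond, polymer_yield, dg_cold, delta_s, delta_h, t_critical_c)
```

Because the estimate rests on only two temperatures and on treating two different experimental settings as comparable, the resulting enthalpy and entropy are best read as order-of-magnitude figures.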

5.4. Degree of Polymerization

Another quantity commonly measured in experiments is the average degree of polymerization in obtained solutions, e.g., [30]. Our model allows the evolution of this parameter to be calculated easily; by Carothers' equation, the number average degree of polymerization is given by $\bar{X}_n = \frac{1}{1-p}$, so we can use (10) and the chain rule to calculate:
$$\frac{d\bar{X}_n}{dt} = \frac{d}{dp}\left[\frac{1}{1-p}\right]\frac{dp}{dt} = 2 k_+ [U] - k_- \bar{X}_n \left(\bar{X}_n - 1\right)$$
This correctly simplifies in the irreversible case $k_- = 0$, or when the degree of polymerization is $\bar{X}_n = 1$, to a linear increase in degree of polymerization, $\left.\frac{d}{dt}\bar{X}_n\right|_{\bar{X}_n = 1} = 2 k_+ [U]$. This linear dependence on initial concentration is in contrast to the quadratic dependence predicted for polycondensation. We are not aware of a study of the time course of $\bar{X}_n$ in RNA polymerization, but these dynamics have been observed in the synthesis of nanowires by Gao et al. [24], in supramolecular polymerization of micelles by Lu et al. [39], and in simulations of interfacial polymerization by Xing et al. [40]. In all three cases, the reversibility of the polymerization process is also evident from the slowdown in the rate of increase in degree of polymerization.
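To illustrate the initial linear growth and the later slowdown, the rate equation for the degree of polymerization can be integrated numerically. The sketch below uses a hand-written fourth-order Runge-Kutta step and deliberately round, hypothetical parameter values (not those of Table 2), chosen so that the steady state $\bar{X}_n^{*} = \frac{1}{2}\left(1 + \sqrt{1 + 8 k_+ [U] / k_-}\right)$ equals 5.

```python
import math

def dXn_dt(x, k_plus, k_minus, conc):
    """Right-hand side: 2 k+ [U] - k- X (X - 1)."""
    return 2.0 * k_plus * conc - k_minus * x * (x - 1.0)

def integrate_Xn(k_plus, k_minus, conc, t_end, dt=1e-3, x0=1.0):
    """Classical RK4 integration starting from an all-monomer solution (X = 1)."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        k1 = dXn_dt(x, k_plus, k_minus, conc)
        k2 = dXn_dt(x + 0.5 * dt * k1, k_plus, k_minus, conc)
        k3 = dXn_dt(x + 0.5 * dt * k2, k_plus, k_minus, conc)
        k4 = dXn_dt(x + dt * k3, k_plus, k_minus, conc)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

def steady_state_Xn(k_plus, k_minus, conc):
    """Positive root of 2 k+ [U] = k- X (X - 1)."""
    return 0.5 * (1.0 + math.sqrt(1.0 + 8.0 * k_plus * conc / k_minus))

# Hypothetical, dimensionless illustration values (assumptions, not Table 2).
K_PLUS, K_MINUS, CONC = 1.0, 0.1, 1.0
x_early = integrate_Xn(K_PLUS, K_MINUS, CONC, t_end=0.01)
x_late = integrate_Xn(K_PLUS, K_MINUS, CONC, t_end=40.0, dt=0.01)
print(x_early, x_late, steady_state_Xn(K_PLUS, K_MINUS, CONC))
```

At early times the solution rises with slope $2 k_+ [U]$, and at late times it levels off at the steady state, which is the slowdown attributed to reversibility in the text.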

6. Discussion

Our model provides an explicit description of the formation of RNA polymers in aqueous prebiotic conditions as is necessary for the RNA-world hypothesis. The mathematical and computational models presented in this paper generalize to all polymers that grow by polyaddition in well-mixed solutions. In these cases, as well as in polycondensation processes where the concentration of the condensate remains approximately constant, our models describe the dynamics of the length distribution as well as the eventual steady state. The time evolution of an initial Flory-Schulz distribution is completely determined by the evolution of the bonding probability, for which we have stated a closed form solution depending only on the forward and back reaction rates and the number of monomer units.
In the simple case of an initial population of RNA monomers in the absence of any cofactors, whenever polymerization is favorable in the sense that the Gibbs energy of bond formation is negative, the steady-state concentration of polymers is expected to exceed the concentration of monomers. However, we do not expect to see a population length inversion, sometimes dubbed a “kinetic trap,” in which polymers of certain lengths achieve a greater concentration than any shorter polymer. Any experimental deviation from these predictions, with or without the presence of cofactors, indicates the presence of significant higher-order effects (e.g., hairpin structures, cyclic polymers, catalysis) and may suggest future directions for mathematical models. Including in the model the shielding of bonding sites from hydrolysis by the secondary structure of RNA may increase the lifetime of long polymers in solution, leading to a recovery of kinetic trap-like behavior. Similarly, should the RNA in question be encapsulated in lipid vesicles small enough to introduce finite-size effects, or assisted in polymerization by association with a surface, then the statistics of our model may no longer apply.
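Both claims in this paragraph follow directly from the geometric form of the distribution, as the following sketch shows. It assumes the number concentration of $k$-mers is $c_k = [U](1-p)^2 p^{k-1}$, with $[U]$ the total concentration of monomer units, so that the ratio of all polymers ($k \geq 2$) to monomers is $p/(1-p)$; the function names are hypothetical.

```python
def number_concentrations(p, total_units, max_len):
    """Assumed Flory-Schulz number concentrations c_k for k = 1..max_len."""
    return [total_units * (1.0 - p) ** 2 * p ** (k - 1) for k in range(1, max_len + 1)]

def polymer_to_monomer_ratio(p):
    """Sum of c_k over k >= 2, divided by c_1; exceeds 1 exactly when p > 1/2."""
    return p / (1.0 - p)

def has_length_inversion(concs):
    """True if any length is more concentrated than the next shorter length."""
    return any(later > earlier for earlier, later in zip(concs, concs[1:]))

for p in (0.4, 0.6, 0.89):
    concs = number_concentrations(p, 1.0, 50)
    print(p, round(polymer_to_monomer_ratio(p), 3), has_length_inversion(concs))
```

For any $p < 1$ the concentrations decrease monotonically with length, so no inversion appears, while polymers outnumber monomers only when $p > 1/2$, that is, when the free energy of bond formation is negative.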
Additionally, when considering the hydrothermal origins of life hypothesis advanced by Deamer et al. [3], it becomes important to consider the effects of wet-dry cycling on RNA polymerization. In the dry phase, polymerization is favorable due to the lack of hydrolysis, but the well-mixed assumption is violated. In the wet phase, polymerization is unfavorable due to the presence of hydrolysis, but the well-mixed assumption is upheld. Intuitively, this means that alternation between a relatively long dry phase and a short wet phase allows for the “well-enough mixing” of the RNA molecules, such that the solution approaches the polymer distribution predicted for a dry phase with mixing [15]. In short, wet-dry cycling leads to an effective increase in the probability of bonding and therefore increases the concentration of long polymers. Besides wet-dry cycling, another important feature of the hydrothermal hypothesis is the suggestion that biogenesis occurs at a certain optimal temperature: it is well known that elevated temperature is required to increase the rate of biological reactions [3], but as temperature increases, polymerization becomes entropically unfavorable, suggesting a Goldilocks effect with regard to the ambient temperature at the origins of life.

Supplementary Materials

The Julia code used to perform these simulations is available at the following GitHub repository:

Author Contributions

Conceptualization, A.S. and M.H.; Formal analysis, A.S. and M.H.; Investigation, A.S.; Methodology, A.S. and M.H.; Project administration, M.H.; Resources, A.S.; Software, A.S.; Validation, A.S. and M.H.; Visualization, A.S.; Writing—original draft, A.S. and M.H. All authors have read and agreed to the published version of the manuscript.


Funding

This research received no external funding.


Acknowledgments

We thank David Deamer and Bruce Damer at UCSC for the insightful conversations which catalyzed the generation of this model. Additionally, we express our great and sincere gratitude to Katerina Tadenev and Alexander Epstein for assistance with revising the manuscript and advice concerning the interdisciplinary interpretability of the paper. Finally, we thank the anonymous reviewers for their input, which greatly improved this manuscript.

Conflicts of Interest

The authors declare no conflict of interest.


References

  1. Gilbert, W. Origin of Life: The RNA World. Nature 1986, 319, 618.
  2. Neveu, M.; Kim, H.J.; Benner, S.A. The “Strong” RNA World Hypothesis: Fifty Years Old. Astrobiology 2013, 13, 391–403.
  3. Deamer, D.; Damer, B.; Kompanichenko, V. Hydrothermal Chemistry and the Origin of Cellular Life. Astrobiology 2019, 19.
  4. Kruger, K.; Grabowski, P.J.; Zaug, A.J.; Sands, J.; Gottschling, D.E.; Cech, T.R. Self-Splicing RNA: Autoexcision and Autocyclization of the Ribosomal RNA Intervening Sequence of Tetrahymena. Cell 1982, 31, 147–157.
  5. Fedor, M.J.; Williamson, J.R. The Catalytic Diversity of RNAs. Nat. Rev. Mol. Cell Biol. 2005, 6, 399–412.
  6. Bartel, D.; Szostak, J. Isolation of New Ribozymes from a Large Pool of Random Sequences. Science 1993, 261, 1411–1418.
  7. Johnston, W.K.; Unrau, P.J.; Lawrence, M.S.; Glasner, M.E.; Bartel, D.P. RNA-Catalyzed RNA Polymerization: Accurate and General RNA-Templated Primer Extension. Science 2001, 292, 1319–1325.
  8. Wochner, A.; Attwater, J.; Coulson, A.; Holliger, P. Ribozyme-Catalyzed Transcription of an Active Ribozyme. Science 2011, 332, 209–212.
  9. Attwater, J.; Wochner, A.; Holliger, P. In-Ice Evolution of RNA Polymerase Ribozyme Activity. Nat. Chem. 2013, 5, 1011–1018.
  10. Kauffman, S.A. The Origins of Order: Self-Organization and Selection in Evolution; Oxford University Press: Oxford, UK, 1993.
  11. Lancet, D.; Kedem, O.; Pilpel, Y. Emergence of Order in Small Autocatalytic Sets Maintained Far from Equilibrium: Application of a Probabilistic Receptor Affinity Distribution (RAD) Model. Berichte der Bunsengesellschaft für Physikalische Chemie 1994, 98, 1166–1169.
  12. Vasas, V.; Fernando, C.; Santos, M.; Kauffman, S.; Szathmáry, E. Evolution before Genes. Biol. Direct 2012, 7, 1.
  13. Hordijk, W.; Steel, M. Conditions for Evolvability of Autocatalytic Sets: A Formal Example and Analysis. Orig. Life Evol. Biosph. 2014, 44, 111–124.
  14. Orgel, L.E. Prebiotic Chemistry and the Origin of the RNA World. Crit. Rev. Biochem. Mol. Biol. 2004, 39, 99–123.
  15. Higgs, P.G. The Effect of Limited Diffusion and Wet–Dry Cycling on Reversible Polymerization Reactions: Implications for Prebiotic Synthesis of Nucleic Acids. Life 2016, 6, 24.
  16. Ross, D.S.; Deamer, D. Dry/Wet Cycling and the Thermodynamics and Kinetics of Prebiotic Polymer Synthesis. Life 2016, 6, 28.
  17. Hargrave, M.; Thompson, S.K.; Deamer, D. Computational Models of Polymer Synthesis Driven by Dehydration/Rehydration Cycles: Repurination in Simulated Hydrothermal Fields. J. Mol. Evol. 2018, 86, 501–510.
  18. Rajamani, S.; Vlassov, A.; Benner, S.; Coombs, A.; Olasagasti, F.; Deamer, D. Lipid-Assisted Synthesis of RNA-like Polymers from Mononucleotides. Orig. Life Evol. Biosph. 2008, 38, 57–74.
  19. Da Silva, L.; Maurel, M.C.; Deamer, D. Salt-Promoted Synthesis of RNA-like Molecules in Simulated Hydrothermal Conditions. J. Mol. Evol. 2015, 80, 86–97.
  20. DeGuzman, V.; Vercoutere, W.; Shenasa, H.; Deamer, D. Generation of Oligonucleotides under Hydrothermal Conditions by Non-Enzymatic Polymerization. J. Mol. Evol. 2014, 78, 251–262.
  21. Flory, P.J. Principles of Polymer Chemistry; Cornell University Press: Ithaca, NY, USA, 1953.
  22. Gupta, S.K.; Kumar, A. Reaction Engineering of Step Growth Polymerization; Plenum Chemical Engineering Series; Plenum Press: New York, NY, USA, 1987.
  23. Voet, D.; Voet, J.G. Fundamentals of Biochemistry, 4th ed.; Wiley: Hoboken, NJ, USA, 2011.
  24. Gao, H.; Ma, X.; Lin, J.; Wang, L.; Cai, C.; Zhang, L.; Tian, X. Synthesis of Nanowires via Temperature-Induced Supramolecular Step-Growth Polymerization. Macromolecules 2019, 52, 7731–7739.
  25. Nelson, D.L.; Cox, M.M. Lehninger Principles of Biochemistry, 6th ed.; W. H. Freeman and Company: New York, NY, USA, 2013.
  26. Nam, I.; Lee, J.K.; Nam, H.G.; Zare, R.N. Abiotic Production of Sugar Phosphates and Uridine Ribonucleoside in Aqueous Microdroplets. Proc. Natl. Acad. Sci. USA 2017, 114, 12396–12400.
  27. Orgel, L.E. Polymerization on the Rocks: Theoretical Introduction. Orig. Life Evol. Biosph. 1998, 28, 227–234.
  28. Monnard, P.A.; Kanavarioti, A.; Deamer, D.W. Eutectic Phase Polymerization of Activated Ribonucleotide Mixtures Yields Quasi-Equimolar Incorporation of Purine and Pyrimidine Nucleobases. J. Am. Chem. Soc. 2003, 125, 13734–13740.
  29. Ellis, R.J. Macromolecular Crowding: Obvious but Underappreciated. Trends Biochem. Sci. 2001, 26, 597–604.
  30. Costanzo, G.; Pino, S.; Ciciriello, F.; Mauro, E.D. Generation of Long RNA Chains in Water. J. Biol. Chem. 2009, 284, 33206–33216.
  31. Flory, P.J. Thermodynamics of Heterogeneous Polymers and Their Solutions. J. Chem. Phys. 1944, 12, 425–438.
  32. Luo, B.; Smith, J.W.; Wu, Z.; Kim, J.; Ou, Z.; Chen, Q. Polymerization-like Co-Assembly of Silver Nanoplates and Patchy Spheres. ACS Nano 2017, 11, 7626–7633.
  33. Yang, C.; Ma, X.; Lin, J.; Wang, L.; Lu, Y.; Zhang, L.; Cai, C.; Gao, L. Supramolecular “Step Polymerization” of Preassembled Micelles: A Study of “Polymerization” Kinetics. Macromol. Rapid Commun. 2018, 39, 1700701.
  34. Gu, M.; Ma, X.; Zhang, L.; Lin, J. Reversible Polymerization-like Kinetics for Programmable Self-Assembly of DNA-Encoded Nanoparticles with Limited Valence. J. Am. Chem. Soc. 2019, 141, 16408–16415.
  35. Oivanen, M.; Kuusela, S.; Lönnberg, H. Kinetics and Mechanisms for the Cleavage and Isomerization of the Phosphodiester Bonds of RNA by Brønsted Acids and Bases. Chem. Rev. 1998, 98, 961–990.
  36. Rackauckas, C.; Nie, Q. DifferentialEquations.jl – A Performant and Feature-Rich Ecosystem for Solving Differential Equations in Julia. J. Open Res. Softw. 2017, 5, 15.
  37. Oosawa, F.; Asakura, S. Thermodynamics of the Polymerization of Protein; Molecular Biology; Academic Press: London, UK, 1975.
  38. Flory, P.J. Molecular Size Distribution in Linear Condensation Polymers. J. Am. Chem. Soc. 1936, 58, 1877–1885.
  39. Lu, Y.; Gao, L.; Lin, J.; Wang, L.; Zhang, L.; Cai, C. Supramolecular Step-Growth Polymerization Kinetics of Pre-Assembled Triblock Copolymer Micelles. Polym. Chem. 2019, 10, 3461–3468.
  40. Xing, J.Y.; Xue, Y.H.; Lu, Z.Y.; Liu, H. In-Depth Analysis of Supramolecular Interfacial Polymerization via a Computer Simulation Strategy. Macromolecules 2019, 52, 6393–6404.

Figure 1. A comparison of the four different cases described in Table 1 for the signs of $\Delta H_b$ and $\Delta S_b$. When $\Delta H_b$ and $\Delta S_b$ have the same sign, there is a critical temperature $T_c = \Delta H_b / \Delta S_b$ at which $\Delta G_b = 0$, so $P_b = 50\%$ and polymerization changes between being favorable and unfavorable. When the signs differ, however, polymerization is either favorable or unfavorable regardless of temperature.
Figure 2. Closed-form solution to the dynamics of the Flory-Schulz rate parameter $p$ starting from an initial condition $p = 0$, corresponding to an all-monomer solution. The parameter itself is shown on the left, and the resulting concentrations of $k$-mers for $k$ from 1 (blue) to 10 (cyan) are shown on the right.
Figure 3. Concentration of $k$-mers for $k$ from 1 (blue) to 10 (cyan), for truncation lengths $d = 100$ (left) and $d = 10$ (right). The $d = 100$ case, visually identical to the results shown in Figure 2, reaches the correct geometric distribution, whereas the $d = 10$ case goes through a nonphysical inversion near time $t = 10^5$.
Figure 4. The steady state concentration distribution for $d = 10$, $d = 25$, and $d = 100$ compared to the closed-form solution. By $d = 100$, the numerical and analytical solutions are indistinguishable.
Figure 5. Comparison between the error bound (16) and the actual error in the results of our simulation. The left figure depicts the steady-state error and the theoretical lower bound $E(P_b)$ as a function of $d$, and on the right is the time evolution of the error in a single simulation for $d = 100$, compared to the bound $E(p)$ computed from the instantaneous analytical value of $p$ as in Figure 2.
Table 1. The temperature dependence of the equilibrium probability of bond formation $P_b$ varies depending on the signs of two key thermodynamic quantities of interest: the enthalpy change $\Delta H_b$ and the corresponding entropy change $\Delta S_b$ associated with bond formation. Compare to Figure 1, which displays $P_b$ as a function of temperature in these four cases.

| $\Delta H_b$ | $\Delta S_b$ | Effect on $P_b$ |
|---|---|---|
| + | + | $P_b > 0.5$ above $T_c = \Delta H_b / \Delta S_b$ |
| + | − | $P_b < 0.5$ at all $T$ |
| − | + | $P_b > 0.5$ at all $T$ |
| − | − | $P_b > 0.5$ below $T_c = \Delta H_b / \Delta S_b$ |

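
The four rows of Table 1 can be reproduced numerically. The sketch below again assumes the two-state Boltzmann form $P_b = 1/(1 + e^{\Delta G_b / RT})$ with $\Delta G_b = \Delta H_b - T \Delta S_b$, and uses hypothetical magnitudes (20 kcal/mol and 0.08 kcal mol⁻¹ K⁻¹) chosen only so that the critical temperature falls at 250 K.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def bonding_probability(delta_h, delta_s, temp_kelvin):
    """Assumed two-state form of P_b with Delta G_b = Delta H_b - T * Delta S_b."""
    delta_g = delta_h - temp_kelvin * delta_s
    return 1.0 / (1.0 + math.exp(delta_g / (R_KCAL * temp_kelvin)))

# Hypothetical magnitudes; only the sign pattern matters for Table 1.
CASES = {
    "++": (20.0, 0.08),
    "+-": (20.0, -0.08),
    "-+": (-20.0, 0.08),
    "--": (-20.0, -0.08),
}
TEMPERATURES = (200.0, 250.0, 300.0)  # below, at, and above T_c = 250 K

for signs, (dh, ds) in CASES.items():
    row = [bonding_probability(dh, ds, t) for t in TEMPERATURES]
    print(signs, " ".join(f"{v:.3f}" for v in row))
```

In the two same-sign cases the probability crosses 0.5 at 250 K, in opposite directions, while in the mixed-sign cases it stays on one side of 0.5 at every temperature.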
Table 2. Parameter values used in all numerical simulations.

| Parameter | Description | Value |
|---|---|---|
| $[U]$ | initial monomer concentration | 1 M |
| $\Delta G$ | Gibbs free energy of bonding | −1.5 kcal/mol |
| $k_-$ | unbonding rate | $10^{-6}$ s$^{-1}$ |
| $k_+$ | bonding rate constant | $7.4 \times 10^{-5}$ s$^{-1}$ mol$^{-1}$ |
| $P_b$ | steady-state bonding probability | 89% |

Share and Cite

MDPI and ACS Style

Spaeth, A.; Hargrave, M. A Polyaddition Model for the Prebiotic Polymerization of RNA and RNA-Like Polymers. Life 2020, 10, 12.

AMA Style

Spaeth A, Hargrave M. A Polyaddition Model for the Prebiotic Polymerization of RNA and RNA-Like Polymers. Life. 2020; 10(2):12.

Chicago/Turabian Style

Spaeth, Alex, and Mason Hargrave. 2020. "A Polyaddition Model for the Prebiotic Polymerization of RNA and RNA-Like Polymers" Life 10, no. 2: 12.
