Concepts of infinitely divisible distributions are reviewed and applied to mutant number distributions derived from the Lea-Coulson and other models which describe the Luria-Delbrück fluctuation test. A key finding is that mutant number distributions arising from a generalised Lea-Coulson model for which normal cell growth is non-decreasing are unimodal. An integral criterion is given which separates the cases of a mode at the origin, or not.
The Luria-Delbrück fluctuation assay is widely used to estimate mutation rates of microorganisms such as bacterial cells. In very broad outline, several test tubes containing a liquid nutrient medium are seeded with the same number of normal-type cells. These cells multiply by binary fission attaining the number by time t. At this time the contents of the test tube are ‘plated’ onto a solid substrate which is (almost) immediately lethal for the normal cells. Some cells may have mutated during the growing phase into a resistant type. Under ideal conditions they will form visible colonies on the lethal substrate. Counting these colonies provides data which are used to determine the rate of mutation intrinsic to the organism of interest.
Exactly how the data are used for this determination depends on the mathematical model chosen to describe the dynamics of the situation. Various choices are available and we refer to [1,2] for reviews and references. The Lea and Coulson model, together with its subsequent refinements, is the most widely used of those available. In its simplest form it assumes the following occurs within each test tube.
Normal cell numbers increase exponentially fast: .
Mutation occurs randomly at a rate proportional to . Specifically, there is a mutation rate r (per unit time per bacterial cell) such that a mutation event occurs in the interval with probability , i.e., a normal cell converts to a resistant type. There is no mutation with probability .
Mutation events create mutant clones which grow independently of each other according to a linear birth process with split rate , i.e., a binary splitting or Yule process. The ratio of the growth rates of normal to mutant cells is denoted by .
These assumptions give rise to probability distributions for the total number of mutants at time t. The model implies that these distributions are infinitely divisible (abbreviated infdiv), i.e., compound Poisson distributions. Our aim in this paper is to investigate the presence of deeper infdiv properties of mutant number distributions. More specifically, are they generalised negative-binomial convolutions (GNBC’s)? The answer is interesting in its own right, but a positive answer gives structural insight, in particular that such a distribution is unimodal, and it provides criteria which determine whether the associated probability mass function is non-increasing or has a positive mode.
Definitions and basic properties of positive infdiv distributions are reviewed in Section 2. Useful subclasses of infdiv distributions are characterised by analytical properties of the density (defined for ) of the Lévy measure (cf. (1)). The subclass of self-decomposable (SD) distributions is defined by requiring that is non-increasing. This class is significant because its members have a unimodal density function. A corresponding discrete version is defined, and its members are unimodal too. In addition, precise criteria exist which separate the cases of a non-increasing mass function (i.e., a mode at the origin), or the smallest mode is positive. These notions are applied to models where mutant clones grow as deterministic integer-valued functions.
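The three assumptions above can be simulated directly, which makes the compound Poisson structure concrete. The following is a minimal sketch rather than the authors' procedure. It assumes normal cells grow as n0·e^{β1 s}, so that the number of mutations in [0, t] is Poisson with mean r·n0·(e^{β1 t} − 1)/β1, and it uses the standard fact that a Yule clone of age τ has a geometric size on {1, 2, ...} with parameter e^{−β2 τ}. All function and parameter names (`simulate_mutants`, `n0`, `beta1`, `beta2`) are illustrative choices, not notation from the paper.

```python
import math
import random


def sample_poisson(mean, rng):
    """Count unit-rate exponential arrivals before `mean` (a Poisson draw)."""
    count, clock = 0, rng.expovariate(1.0)
    while clock < mean:
        count += 1
        clock += rng.expovariate(1.0)
    return count


def sample_yule_size(beta2, age, rng):
    """Size of a Yule clone of the given age: geometric on {1, 2, ...}."""
    p = math.exp(-beta2 * age)
    if p >= 1.0:                      # a clone of age zero is a single cell
        return 1
    u = 1.0 - rng.random()            # uniform on (0, 1]
    return 1 + int(math.log(u) / math.log1p(-p))


def simulate_mutants(n0, beta1, beta2, r, t, rng):
    """One culture: total mutants at time t under the three assumptions."""
    growth = math.expm1(beta1 * t)                # e^{beta1 t} - 1
    mean_mutations = r * n0 * growth / beta1      # r * integral of n(s) ds
    total = 0
    for _ in range(sample_poisson(mean_mutations, rng)):
        # Mutation epoch with density proportional to e^{beta1 s} on [0, t].
        s = math.log1p(rng.random() * growth) / beta1
        total += sample_yule_size(beta2, t - s, rng)
    return total
```

Each culture is a Poisson number of independent clone sizes, which is exactly a compound Poisson random sum. In the balanced case β1 = β2 = β the expected count under these assumptions is r·n0·t·e^{βt}, which gives a simple sanity check on simulated averages.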
The Lea-Coulson model described above can be generalised to allow normal cell growth to be an arbitrary positive-valued function of time. The resulting mutant number distribution is a mixture of Poisson distributions where the mixing distribution is a continuous infdiv distribution with the special property that is completely monotone. Distributions having this property comprise the so-called Bondesson (BO) class. This notion is introduced in Section 3, along with its discrete version. Details are provided for the balanced () generalised Lea-Coulson model in which normal cell lines grow according to the logistic (Pearl-Reed) population model.
The generalised gamma convolution (GGC) class of infdiv distributions comprises the subset of BO distributions for which the product is completely monotone. This implies the inclusion GGC⊂SD, and hence GGC’s are unimodal. Poisson mixtures in which the mixing distribution is a GGC comprise the class of generalised negative-binomial convolutions (GNBC’s), and they too are unimodal. Relevant definitions and properties are introduced in Section 4 where it is shown that a mutant number distribution arising from a generalised Lea-Coulson model in which normal cell growth is non-decreasing is a GNBC. This of course applies to the standard Lea-Coulson model as described above, and details are presented in Section 4 together with precise criteria concerning the modal behaviour of the mutant number distribution; see Theorem 5(a). The section ends with a discussion of shapes of mutant number distributions selected by different estimation methodologies applied to experimental data and also the preservation of the GNBC property when plating efficiency is an issue.
It is often observed that mutations occur during the time of division of a normal cell. This contingency is addressed by branching process descriptions of the Luria-Delbrück set-up. Some details are provided in Section 5 for the two most common models, those due to Haldane and Bartlett. Mutant number distributions for the Haldane model are not infdiv, whereas they are infdiv for the Bartlett model. However, rather less can be ascertained about fine infdiv properties for this model.
Finally, in Section 6 we determine infdiv and modal properties of mutant number distributions arising from alternative models discussed by Kepler and Oprea, Angerer, and Stewart et al.
Some notation may have different definitions in different sections, but no confusion should arise.
2. Infdiv Distributions and Deterministic Mutant Growth
In this section, we shall review necessary basic ideas of infinite divisibility and self-decomposability and explore their (limited) applicability to the Lea-Coulson and Armitage models in which mutant numbers are assumed to increase deterministically.
Let X be a non-negative random variable with distribution function (DF) and Laplace-Stieltjes transform (LST)
If has a probability density function (pdf) , then . Denote the left-extremity of F by and observe that . Thus, if and if . This quantity can be computed from the LST according to .
Each of the quantities X, F and are called infdiv if, for each , the function is the LST of a probability distribution. This implies that, for any positive integer n, X can be expressed as a sum of random variables, , where the summands are independent and they have the distribution determined by . This encapsulates the idea of infinite divisibility. It is the case that the sum of independent infdiv random variables is itself infdiv.
An infdiv LST has a special canonical form
where is called the Laplace exponent and is a measure, called the Lévy measure, which satisfies the conditions
This means that assigns a zero mass to the origin, it may assign infinite mass to any small interval but it integrates x at the origin, and it assigns a finite mass to infinite intervals ; here is an arbitrary positive number. Functions having the form (1) are called Bernstein functions—see , the standard reference. Differentiation of shows that
Many common distributions are infdiv: gamma, Pareto and log-normal, to mention a few. For us the most important is the gamma family. We say that the random variable has the standard gamma distribution with shape parameter if its pdf is if and if . Here denotes the gamma function (due to Euler); see . The gamma pdf is decreasing in if and it has a single positive mode at if . The corresponding LST is , equivalently, . We stress that infdiv laws can be multi-modal.
In many instances but we will need the additional generality for subsequent key definitions.
Suppose that . Then is a distribution function and the Laplace exponent (1) can be written as
with the interpretation that
where the are independent with DF G and is independent of the summands and it has the Poisson distribution with (rate) parameter (and denoted by Poisson). Thus, X is represented as a (Poisson) random sum of independent jumps and it is said to have a compound Poisson distribution. Conversely, any positive infdiv distribution can be realised as the limit of a sequence of compound Poisson distributions.
An important sub-class of infdiv distributions is the class of self-decomposable (SD) distributions. This notion can be given three equivalent definitions but we concern ourselves with the two which fit with our theme. The definition which explains the terminology is that X has a SD distribution if it has the autoregressive representation that, for any constant , there is a random variable independent of X such that
This says that if X is scaled down to , then the distribution of X can be recovered by adding an independent ‘error’ . Thus, the right-hand side represents the ‘self-decomposition’ of X. This definition can be expressed in terms of the LST of X as the assertion that X has a SD distribution if, for each , the quotient is completely monotone, and hence is the LST of a random variable, say.
It can be proved that a SD distribution is absolutely continuous and infdiv. (In addition, the ‘error’ term is infdiv.) The Lévy measure takes a special form which characterises SD distributions and which sometimes is adopted as the definition of this concept. We shall do likewise with the following formal definition refining (1).
An infdiv distribution is SD if its Lévy measure λ has a density,
where is non-increasing in . The regularity properties of λ then require that
It follows that .
The integral representation
(just differentiate each side) implies that the gamma distribution is SD with .
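The integral representation can be checked numerically. The sketch below assumes the representation in question is the classical Frullani-type identity α·log(1 + θ) = α·∫₀^∞ (1 − e^{−θx}) x^{−1} e^{−x} dx, so that the gamma Lévy density is α·e^{−x}/x with the non-increasing factor k(x) = α·e^{−x}. The function name, the truncation point `upper` and the Simpson step count are illustrative choices.

```python
import math


def gamma_laplace_exponent(theta, alpha, upper=60.0, steps=20000):
    """Simpson estimate of alpha * int_0^upper (1 - e^{-theta x}) e^{-x} / x dx."""
    def integrand(x):
        if x == 0.0:
            return theta              # limit of (1 - e^{-theta x}) / x as x -> 0
        return -math.expm1(-theta * x) * math.exp(-x) / x

    h = upper / steps                 # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return alpha * total * h / 3.0
```

Comparing `gamma_laplace_exponent(theta, alpha)` with `alpha * math.log1p(theta)` for a few values of θ confirms the identity to quadrature accuracy.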
The following fact is important.
(a) Sums of independent SD random variables are SD.
(b) If F is the DF of a SD distribution, then it has a pdf f which solves the integral equation
This pdf is unimodal, and if , then it is non-increasing with a mode at zero. If , then f is bounded. In addition, with , there is a mode in the interval .
See  (pp. 408, 409) for the modality assertions, and more.
The integral Equation (6) has a wider applicability than is indicated by Fact 1. Specifically, if is a pdf for which there exists a function such that (6) holds, then f is infdiv iff . See  (p. 95) for an even more general account.
Since members of the class of SD distributions have an absolutely continuous DF, we may wonder about discrete analogues of this concept. Suppose that X is infdiv and it can take only non-negative integer values, i.e., it is discrete infdiv. Then it necessarily has a compound Poisson distribution with positive integer jumps. Denoting the PGF of the jump distribution by , the general form (2) becomes
where the notation on the left-hand side anticipates the application of these concepts to mutant number distributions. Here we understand that , i.e., there are no zero-sized jumps.
Writing the jump PGF as , then setting in (7) and comparing the result with (1) (with ) makes it clear that the Lévy measure inherent in (7) assigns mass to integers . Hence the total mass of the Lévy measure is .
It is often convenient to express the PGF M in the form
noting then that logarithmic differentiation of (7)/(8) yields with
The sequence is called the canonical sequence, or r-sequence, of the infdiv distribution . In fact, for any discrete distribution there is a sequence such that (10) holds. An essential fact here is a theorem of Katti  asserting that is infdiv iff its r-sequence is non-negative. See  (p. 36). This result has subsequently been ‘re-discovered’, e.g.,  and  (p. 174).
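Katti's criterion is easy to explore numerically. The sketch below assumes that relation (10) is the standard recursion (n + 1)·p_{n+1} = Σ_{k=0}^{n} r_k·p_{n−k}, which is what logarithmic differentiation of a PGF yields. It solves this recursion for the r-sequence of a given mass function and tests non-negativity. Because only finitely many terms are examined, a positive outcome is evidence for infinite divisibility and not a proof. The function names are illustrative.

```python
def r_sequence(pmf):
    """Solve (n+1) p_{n+1} = sum_{k=0}^{n} r_k p_{n-k} for r_0, ..., r_{N-1}."""
    if pmf[0] <= 0:
        raise ValueError("the recursion needs p_0 > 0")
    r = []
    for n in range(len(pmf) - 1):
        known = sum(r[k] * pmf[n - k] for k in range(n))
        r.append(((n + 1) * pmf[n + 1] - known) / pmf[0])
    return r


def looks_infdiv(pmf, tol=1e-12):
    """Katti-type check on the computed terms only (a finite truncation)."""
    return all(rk >= -tol for rk in r_sequence(pmf))
```

For instance, the binomial mass function (0.25, 0.5, 0.25) gives r_0 = 2 and r_1 = −2, so it fails the criterion, whereas a geometric mass function (1 − q)·q^n gives r_k = q^{k+1} ≥ 0.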
Many specific discrete distributions discussed in this paper arise as Poisson mixtures where the mixing distribution is infdiv, i.e.,
where X is infdiv with Laplace exponent (1). Hence
Thus the shift term in (1) induces a Poisson component in the discrete mixture. Manipulation of the integral will show that has the compound Poisson form (7) with
A result of Holgate  asserts that if the mixing distribution is unimodal (infdiv, or not), then the Poisson mixture is unimodal.
The next definition is suggested by Definition 1.
The discrete compound distribution is called discrete self-decomposable (DSD) if its r-sequence is non-increasing.
Thus, the Poisson distribution is DSD because and if and the general mixture (12) is DSD if .
The auto-regressive characterisation (4) of (continuous) SD distributions has the following analogue. The characterisation (4) of SD distributions involves multiplying a random variable by the constant c to give a product smaller than X. If X is discrete, then this cannot be done in a way which gives an integer-valued product. Binomial thinning is an analogue which addresses this issue: Define a ‘discrete product’ as follows. Let and
where the summands are independent with the Bernoulli distribution and they are independent of X. Thus, , and the PGF of the product is
This product concept is due to the authors of ; see p. 495 for the original reference.
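The thinning operation can be carried out exactly on a truncated mass function, since the thinned variable is binomial given X. The sketch below computes P(c ⊙ X = j) = Σ_n P(X = n)·C(n, j)·c^j·(1 − c)^{n−j}. The helper names are illustrative, and the truncation length is an assumption that must be large enough for the tail of X to be negligible.

```python
import math


def thin_pmf(pmf, c):
    """pmf of the binomial thinning of X by c, given a truncated pmf of X."""
    out = [0.0] * len(pmf)
    for n, pn in enumerate(pmf):
        for j in range(n + 1):
            out[j] += pn * math.comb(n, j) * c**j * (1.0 - c) ** (n - j)
    return out


def poisson_pmf(mean, size):
    """First `size` terms of the Poisson mass function."""
    return [math.exp(-mean) * mean**n / math.factorial(n) for n in range(size)]
```

A familiar check is that thinning a Poisson variable with mean μ by c gives a Poisson variable with mean cμ, in agreement with the PGF of the product evaluated at 1 − c + cs.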
The discrete random variable X has a DSD distribution if, for each , there is a discrete random variable such that
where the summands on the right-hand side are independent. Equivalently, the quantity is a PGF.
A DSD distribution is unimodal. Its mass function is non-increasing iff .
Fact 2 imparts useful qualitative information about the general shape of the mass function of a DSD distribution. If , then for all ; the mass function is non-increasing. If , then the modal value is positive and it may not be unique. See Discussion 1.
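The qualitative content of Fact 2 can be illustrated with the negative-binomial family. The sketch below assumes the recursion (n + 1)·p_{n+1} = Σ_{k=0}^{n} r_k·p_{n−k} and reads the modal criterion as r_0 ≤ 1, which is what the identity p_1 = r_0·p_0 forces. For the PGF ((1 − q)/(1 − qs))^ν the r-sequence is r_k = ν·q^{k+1}, which is non-increasing, so the distribution is DSD. The function names are illustrative.

```python
def pmf_from_r(r, p0):
    """Rebuild p_0..p_N from r_0..r_{N-1} via (n+1) p_{n+1} = sum r_k p_{n-k}."""
    pmf = [p0]
    for n in range(len(r)):
        pmf.append(sum(r[k] * pmf[n - k] for k in range(n + 1)) / (n + 1))
    return pmf


def modes(pmf, tol=1e-15):
    """Indices where the truncated pmf attains its maximum."""
    top = max(pmf)
    return [n for n, pn in enumerate(pmf) if top - pn <= tol]


def negbin_r(nu, q, size):
    """r_k = nu * q^(k+1): the r-sequence of the PGF ((1-q)/(1-q s))^nu."""
    return [nu * q ** (k + 1) for k in range(size)]
```

With ν = 1.5 and q = 0.5 we have r_0 = 0.75 and the rebuilt mass function is non-increasing. With ν = 6 and q = 0.4 we have r_0 = 2.4 and the mode moves away from the origin.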
We now consider two models in which normal cells and mutation occur as in §1 and in which mutant clones grow deterministically with sizes having integer values. The first such model was introduced by Lea and Coulson  who derived some approximate results for it. Armitage  gave it a more careful consideration. More detail is provided by Crump and Hoel , who identify it as their model. The survey  names it the discretised Luria-Delbrück formulation and the treatment there probably is the most detailed.
Zheng’s term captures the central conception that at time t after its formation, the size of a mutant clone is
where denotes the ‘integer part of’. He shows that the PGF of is given by
The mutant number distribution is DSD, hence unimodal, if
(equivalently, ), in which case its mass function is non-increasing iff ; i.e.,
and , in which case its mass function is non-increasing iff , i.e.,
It follows from the definition of K that , where is the fractional part of . Hence .
Substituting into (13) and with reference to (8), a differentiation yields the evaluations
If , then the sum term in (13) vanishes and has a Poisson distribution with parameter m and Assertion (a) is known.
Suppose that . The general form of the r-sequence is , where
which clearly is decreasing in if .
If , then this representation of is not informative because now the first factor is increasing. Instead, computation of and letting will show that the sign of coincides with that of
Clearly and . Hence in a small interval . Both of are concave-increasing and hence they can achieve equality in for at most one value of u.
Numerical calculation shows that if specified in the assertion, and that if . It follows that is decreasing in iff . Consequently, if . In addition
Hence the r-sequence is non-increasing if , and Assertion (b) follows from Fact 2. □
The case covers the biologically more likely situation in which mutant clones grow no more quickly than normal clones. Theorem 1 fails if γ is sufficiently close to zero. Numerical calculation shows that there is a critical value such that (resp. >) if (resp. >). In other words, the modal value of the r-sequence jumps from zero to unity at . There is a similar jump from 1 to 2 at a critical value . These outcomes suggest the existence of a sequence of critical values as at which the modal value of the r-sequence jumps from i to . In addition, it suggests that Assertion (b) is valid if .
The second model we consider derives its deterministic growth character from assuming that mutant cells have a fixed lifetime of duration L at the end of which they divide. Thus, a clone has size during the interval since its inception. In order that mutant clones achieve splitting rate , we choose L such that , i.e., .
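The clone-size rule of this second model can be written out in a few lines. The sketch below is an interpretation and not a formula copied from the text: it assumes a clone doubles at the end of each lifetime, so its size at age τ is 2 raised to the number of completed lifetimes, and it assumes L is fixed by 2 = e^{β2·L}, i.e., L = (log 2)/β2, so that the size agrees with e^{β2·τ} at division epochs. The names `lifetime_for_rate` and `clone_size` are illustrative.

```python
import math


def lifetime_for_rate(beta2):
    """Lifetime L with 2 = e^{beta2 * L}: doubling matches rate beta2."""
    return math.log(2.0) / beta2


def clone_size(age, beta2):
    """Size of a clone of the given age when cells divide every L time units."""
    generations = math.floor(age / lifetime_for_rate(beta2))
    return 2 ** generations
```

Under these assumptions the deterministic size never exceeds e^{β2·τ} and coincides with it exactly when τ is a multiple of L.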
This model with was introduced in  where it is designated as the model. The expression (11) in this reference for the mutant number PGF is valid for and, with our notation, it is
where m and are the above time-dependent parameters and now . We have the following result.
The mutant number distribution specified by (15) is DSD, and hence unimodal if
, in which case the mass function is non-increasing iff , i.e.,
and , in which case its mass function is non-increasing iff , i.e.,
The mutant number distribution is not DSD otherwise.
If , then and no mutant has reproduced. Thus, equals the number of mutations during and hence it has a Poisson distribution. Assertion (a) follows.
3. Bondesson Classes and the Generalised Lea-Coulson Model
In this section, we introduce the first of two special classes of infdiv distributions. The history of these notions is that the Swedish actuary/mathematician Olof Thorin introduced in 1977/78 distributions now called Generalised Gamma Convolutions (GGC’s) with the specific purpose of proving that Pareto and lognormal distributions are infdiv. Subsequently many other distributions conjectured to be infdiv have been proved to be so by showing they are GGC’s. A net benefit of this is that GGC’s are SD and hence unimodal. It follows then from Holgate’s theorem that Poisson mixtures of GGC’s are unimodal too. Lennart Bondesson introduced in 1981 the larger class of infdiv distributions which we review in this section. Detailed accounts of these topics are  (, Chapter VI) and  (Chapters 6–9).
We begin as follows. Let G be a DF on and define a mixture of exponential distributions by
Clearly f is a pdf and the corresponding LST is
A function F is the DF of a mixture of exponential distributions (written ) if
where and G is a DF on .
If X has the DF , then , where has an exponential distribution and is independent of .
If , then it is infdiv.
The DF iff
where is a (measurable) function on satisfying
It follows from Example 1 that the Lévy density of the gamma distribution is and, in particular, that it is completely monotone. This motivates the following definition of the class BO of distributions named after Lennart Bondesson.
An infdiv DF F belongs to the Bondesson class (written ) if its Lévy measure has a completely monotone density,
where B is a measure (the Bondesson measure) satisfying.
If , then its Laplace exponent has the form
where B is a Bondesson measure.
The class is the smallest set of distributions containing and which is closed under convolution and weak limits.
There is a clear similarity of the cumulant functions (16) and (19) with . This is not mere coincidence. If , then iff and , where b satisfies (17).
The discrete random variable X has a geometric mixture distribution if its PGF has the form
where Π is a random variable satisfying .
If is independent of the random variable which has a unit exponential distribution, then it follows from the mixture representation of the geometric distribution that
The product is infdiv, hence any geometric mixture is compound-Poisson.
We now introduce a discrete version of BO; the class BOP of Poisson mixtures with mixing distribution in BO. We will see that mutant number distributions arising from a generalisation of the Lea-Coulson model (below) and from the Bartlett model (Section 5) live in BOP.
The discrete distribution belongs to BOP if its PGF , where is the Laplace exponent of .
The following fact arises fairly readily from (19) and Definition 7.
The discrete infdiv distribution iff is a Hausdorff moment sequence; specifically,
A distribution in is a mixture of geometric distributions if and .
The substitution will make clear that really is a Hausdorff moment sequence. For example, if B has a density , then
where, in general, denotes the measure which assigns unit mass to the real number ρ and zero mass to any interval not containing ρ. The representation asserted in Fact 6 often is more convenient for our purposes.
In the most general situation, the fact that jump probabilities of a compound Poisson distribution comprise a non-increasing sequence implies little about the modal properties of . For example, if X has the Poisson and the Poisson distributions, respectively, and X and Y are independent, then has at least two modes, one at and the other at , if and . For example, if and .
By definition a generalised Lea-Coulson model admits any (measurable) deterministic growth function of normal type cells. Replacing the exponential form with in the specification of §1 yields a compound Poisson distribution for mutant numbers whose Lévy masses are
These outcomes are well-known and they follow from the order statistics property of Poisson processes. See  for what seems the earliest and most general formulation. A later independent account specifically for the Luria-Delbrück context is in , and the model is reviewed in . This generalised Lea-Coulson model can also be regarded as a branching process with inhomogeneous immigration. The branching component comprises the independently growing mutant clone birth processes and immigrants comprise the inhomogeneous Poisson process of mutations. See  for a review of this topic.
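The Lévy masses of a generalised Lea-Coulson model can be evaluated by quadrature for any growth function. The sketch below assumes the masses take the usual form λ_j = r·∫₀^t n(s)·P(clone of age t − s has size j) ds, with the geometric clone-size law of a Yule process with split rate β2. This is an assumed reading of the displayed formula, not a transcription of it. The names `levy_masses` and `growth` are illustrative.

```python
import math


def levy_masses(growth, r, beta2, t, j_max, steps=4000):
    """lambda_j = r * int_0^t growth(s) * P(clone of age t-s has size j) ds."""
    h = t / steps                     # steps must be even for Simpson's rule
    masses = []
    for j in range(1, j_max + 1):
        def integrand(s, j=j):
            p = math.exp(-beta2 * (t - s))    # geometric parameter of the clone
            return growth(s) * p * (1.0 - p) ** (j - 1)
        total = integrand(0.0) + integrand(t)
        for i in range(1, steps):
            total += (4 if i % 2 else 2) * integrand(i * h)
        masses.append(r * total * h / 3.0)
    return masses
```

For a constant growth function n(s) = n0 the integral has the closed form λ_j = r·n0·(1 − e^{−β2·t})^j/(j·β2), which provides a check on the quadrature. The corresponding r-sequence follows from r_k = (k + 1)·λ_{k+1}.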
We have the following general result.
Let . The mutant number distribution of the generalised Lea-Coulson model is a distribution whose Bondesson measure has the density
where and is the Heaviside unit step function.
Just make the substitution in (21) and refer to Fact 6 to obtain the desired moment representation, . The resulting infinite integral does converge because it equals the integral (21). Alternatively, observe that and , implying that the regularity conditions in Definition 5 always are satisfied. □
For computational purposes it is more convenient to shift the integration variable in Fact 6 to obtain
and the corresponding Lévy density
Substituting, again, in (22) and (24) gives the ‘explicit’ moment representation
Hence the representing measure for any mutation number distribution derived from a generalised Lea-Coulson model has the time-dependent support .
This moment relation yields the fundamental relations
Suppose that normal cells increase in number as a logistic growth model with carrying capacity . Thus,
whose well known solution is
Hence, for the balanced case, , some manipulation yields
where is the exponential integral; see  (# 6.2.1).
Define . The integrand of (26) resolves into partial fractions:
It follows that
We obtain expressions for the Poisson rates as follows.
where, as usual, . It follows that
The power series expansion of the logarithm term yields the form
Letting , recalling that and noting that recovers the balanced Lea-Coulson model which we will consider in more detail in the next section.
4. Thorin Classes and the Lea-Coulson Model
We now introduce the above mentioned GGC class of infdiv distributions which are pertinent to a significant subclass of generalised Lea-Coulson models. We motivate the general definition by observing that, given independent gamma random variables () and constants , it follows from (5) that the Laplace exponent of the sum can be expressed as
where is a measure which assigns mass to the point . It follows from Fact 1(a) that X is SD.
The SD class is closed under limits in distribution so, taking the informal limit in (32) yields a putative limiting Laplace exponent
This does specify a SD distribution for any measure U on satisfying
A distribution whose Laplace exponent has the form (33) where and U is a measure on subject to (34) is called a generalised gamma convolution (GGC). A function of the form (33) is called a Thorin Bernstein function. An equivalent specification is that the class of GGC’s is the smallest which contains scaled gamma distributions and is closed under convolution and weak limits.
The representing measure U in (33) is often called the Thorin measure and we define the Thorin distribution function .
A GGC is a SD distribution for which the function k is completely monotone, .
Any GGC has a unimodal pdf f.
A GGC belongs to BO and its Bondesson measure is absolutely continuous with density .
We motivate a discrete version of by observing that the best known case of a Poisson mixture (12) is where , c a positive scaling constant, giving
where . Hence this gamma-mixed Poisson distribution is the negative binomial distribution with parameters p and , denoted NB. The case of course is a geometric distribution whose mixing distribution is an exponential one. The following definition extends this idea.
A Poisson mixture distribution is a generalised negative-binomial convolution (GNBC) if the distribution of the mixing random variable X is a GGC as defined above.
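The negative-binomial building block behind this definition can be verified numerically. The sketch below assumes the mixing variable is c times a standard gamma variable with shape ν, and that the resulting parameter is p = 1/(1 + c). This parametrisation is an assumption made here for illustration. The mixture integral is estimated by Simpson's rule and compared with the closed-form negative-binomial mass function. The demonstration should use n + ν − 1 ≥ 2, so that the integrand is smooth at the origin and the quadrature is accurate.

```python
import math


def gamma_mixed_poisson(n, nu, c, upper=60.0, steps=20000):
    """Simpson estimate of P(N = n) when N | X ~ Poisson(X), X = c * Gamma(nu)."""
    log_norm = -nu * math.log(c) - math.lgamma(nu) - math.lgamma(n + 1)

    def integrand(x):
        if x == 0.0:
            return 0.0                # valid because n + nu - 1 > 0 is assumed
        log_f = (n + nu - 1.0) * math.log(x) - x * (1.0 + 1.0 / c) + log_norm
        return math.exp(log_f)

    h = upper / steps                 # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def negbin_pmf(n, nu, p):
    """Gamma(n + nu) / (n! Gamma(nu)) * p^nu * (1 - p)^n."""
    log_coef = math.lgamma(n + nu) - math.lgamma(n + 1) - math.lgamma(nu)
    return math.exp(log_coef + nu * math.log(p) + n * math.log1p(-p))
```

With ν = 3 and c = 1.5 the two functions agree to quadrature accuracy when `negbin_pmf` is called with p = 1/(1 + c) = 0.4.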
A calculation using Fact 7 gives
the PGF of a GNBC has the canonical form
where V is a right-continuous function on such that ,
The r-sequence (c.f. (9)) is a Hausdorff moment sequence,
Conversely, if the r-sequence of a DID distribution has this moment representation, then it is a GNBC.
The GNBC class is the smallest class of discrete distributions which contains negative-binomial distributions and is closed under convolution and weak limits.
A GNBC is discrete unimodal and its mass function is non-increasing iff .
Since the shift constant a in (33) induces a Poisson component in (35), the left-extremity of a GNBC always is zero. Assertion (d) follows from Fact 7(b) and Holgate’s theorem , and then Fact 2 observing that .
The following fact gives a canonical representation for a mixture of geometric distributions and a condition that it be a GNBC;  (pp. 381, 390).
A function M defined on is the PGF of a geometric mixture distribution iff it has the form
where w is a (measurable) function on such that
A GNBC PGF (35) is the PGF of a geometric-mixture distribution iff and its representing function V satisfies , in which case .
Referring to (35), we will later need a general relation between the function V and the Thorin measure U of the mixing GGC distribution. The following result achieves this in terms of the Thorin distribution function .
The function is the right-continuous version of .
The integral in (33) can be written as the Stieltjes integral
It follows from the first member of (34) that for any we can choose such that
Next, it follows from the second member of (34) that there exists such that if , then
implying that .
Observing that the integrand in (37) is asymptotically proportional to as , and to as , it follows from an integration by parts that
In a similar manner, it follows from (35) with that the PGF of the corresponding GNBC is
The left-hand side equals and a computation shows that reduces to a Stieltjes integral as above with T as asserted. □
Recall the expression (22) for the Lévy masses pertaining to the generalised Lea-Coulson model. A very natural condition on the growth function of normal cells implies that mutation number distributions are GNBC’s.
Assume that the normal cell growth function is non-decreasing. Fix . Then:
The distribution of is a GNBC and hence unimodal. Its mass function is non-increasing iff
The Lévy density of the mixing GGC is given by
where the Thorin distribution function is
The canonical form of the PGF of is
The mutant number distribution is a geometric mixture iff .
Observe that is non-decreasing in y and that, since , it follows from (22) that
a Hausdorff moment. The GNBC assertion follows from Fact 8(b). The unimodality assertions follow from Fact 8(d).
With defined as above, observe that the representation (23) yields
Observe that in (35) and the form of follows from Theorem 4 expressed as and (39). Assertion (d) follows from Fact 9(b) and noting that .
It follows from (20) and the hypothesis of Theorem 5 that is an increasing function of t. Clearly if t is sufficiently small in which case the mutant number mass function will be non-increasing. It attains a positive maximum value if eventually exceeds unity.
The logistic differential Equation (28) implies that if , then its solution is strictly increasing. It follows that the corresponding mutant number distribution is a GNBC. However, except for the balanced case it does not seem that the integrals (26) and (27) can be evaluated in any insightful way. In the balanced case we now know that the Lévy density (30) is such that is completely monotone. The following direct demonstration of this fact yields its Thorin function .
Integration by parts shows that , where is completely monotone. Substitution into (30) leads to
The substitution exhibits as the sum of two completely monotone functions:
Thus the Thorin measure has a discrete component, a point mass at , and its support is independent of K.
In the remainder of this section we restrict consideration to the Lea-Coulson  model described in §1 and give a self-contained treatment starting from (26). Taking we thus obtain
In the sequel we usually suppress the time dependence, thus regarding the distributions determined by (40) as a parametric family determined by where and .
Expressions equivalent to (40) appear first in . Sometimes  is coupled with this reference because, independently, a system of differential equations for the mass function of is derived, generalising the system in  for the case , and deducing a numerical solution scheme. The integral in (40) has no simple evaluation except perhaps for .
In fact, if , then evaluation gives the familiar outcome
This PGF appears for the first time in  (p. 10) as a result of solving the linear first-order partial differential equation derived in . Zheng  denotes the corresponding distribution by where the letter designation is chosen to honour the pioneering contribution of Salvador Luria and Max Delbrück.
Frequently in laboratory situations the product is so large that and it is argued that the form (42) is approximated by
This is a PGF as can be deduced from the explicit time-dependent distributions by allowing (implying ) and such that ; a kind of Poisson approximation. Zheng  (and others before him) name the distribution corresponding to (43) after Lea and Coulson because they derive (43) by using a clever manipulation to solve their partial differential equation. It is denoted by and thus coincides with . The solution (42) satisfies , reflecting the assumption (and laboratory situation) that . The LC solution does not satisfy this initial condition, but it has an interesting form-invariant character which bears the interpretation that mutant numbers evolve as a non-homogeneous Poisson process.
In view of this historical progression, we will designate the full family of distributions corresponding to (40) by .
It is well known that the distribution is qualitatively very different to distributions when . The moments of the former are infinite, reflecting the very slow decrease of its right-hand tail. If , then all moments are finite and the right-hand tail decays exponentially fast . The following result shows that each distribution is a GNBC and that the just-mentioned differences are reflected in the representing measures of the mixing GGC. Here, and below, recall that denotes the Heaviside unit-step function, i.e., the DF of the degenerate distribution allocating unit mass to the origin. Just below, and later, we will encounter the second confluent hypergeometric function,
where and b is real. Observe that this function is completely monotone;  (Chapter 13).
If and , then the distribution has the following properties.
It is a GNBC, hence unimodal. Its mass function is non-increasing iff
In particular, the distribution is a geometric mixture iff .
The GGC mixing distribution has the Thorin distribution function
The Lévy measure of the GGC mixing distribution has a density which has the following explicit forms:
If , then
If , then
The above-mentioned difference between the cases and is manifested in the fact that the representing functions V and T are continuous with supports coinciding with their domains iff . Indeed, if , then decreases from θ at to at and it jumps to zero at . Note that (48) results by letting in (30).
(a) Comparing (6) with (40), a differentiation gives
This exhibits the desired Hausdorff moment form with the measure
This implies the first assertion, and the second follows by evaluating and appealing to Fact 2. Observe that the measure V has a discrete component which vanishes when .
(b) Integrating (49) and simplifying the result leads to
where C is the constant of integration. The condition implies that , whence (45).
(c) The evaluation (46) comes directly from Theorem 1 and (45).
(d) Recall that the Lévy density exists and, with no parameter restriction,
The right-hand side integral is an ‘incomplete’ confluent hypergeometric type of integral. If , then the first term vanishes and (47) follows.
If , but , then the substitution produces the evaluation
Reverting to the time-dependent form of parameters, it follows from Theorem 4 that being a geometric mixture and the nature of modality are time-dependent properties, whereas, e.g., the SD property of the distributions is a time-independent property. See  for this dichotomy.
The family of distributions is most commonly used to fit empirical mutant number distributions. It follows from the criterion (44) that as increases from small to large values, the mutant number mass function transitions from decreasing to having a positive mode. If equality holds in (44), then zero and unity are modes.
It usually is the case that the estimate of is so close to unity that it is chosen to equal unity. In this case the criterion (44) simplifies to
In the case of equal fitness of normal and mutant cells, (the model), then the transition from a zero to positive mode can be seen in the first three columns of Table 2 in  where, if (denoted by m in this reference), then . The model is fitted to three sets of laboratory data in  where is estimated as 0.3783, 3.84 and 3.03, respectively. Figures 3–5 in  graph the mass functions corresponding to these values.
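The transition from a zero mode to a positive mode in the equal-fitness case can be checked numerically. The following is a minimal sketch, not the method of any cited reference: it assumes the standard Lea-Coulson PGF $\exp\{m(1/s-1)\log(1-s)\}$ and the well-known recursion $p_0=e^{-m}$, $n p_n = m\sum_{j=0}^{n-1} p_j/(n-j+1)$ (of the type given by Ma, Sandri and Sarkar). The names `lc_mass_function` and `mode` are ours, and `m` denotes the single parameter of the model as in the references quoted above.

```python
import math


def lc_mass_function(m, n_max):
    """Return [p_0, ..., p_{n_max}] for the Lea-Coulson distribution.

    Assumes the recursion p_0 = exp(-m),
    n * p_n = m * sum_{j<n} p_j / (n - j + 1).
    """
    p = [math.exp(-m)]
    for n in range(1, n_max + 1):
        s = sum(p[j] / (n - j + 1) for j in range(n))
        p.append(m / n * s)
    return p


def mode(p):
    """Index of the first maximal term of the (truncated) mass function."""
    return max(range(len(p)), key=lambda n: p[n])


if __name__ == "__main__":
    # The three estimates quoted in the text, plus the boundary value m = 2.
    for m in (0.3783, 3.84, 3.03, 2.0):
        p = lc_mass_function(m, 50)
        print(f"m = {m}: mode at {mode(p)}, p_0 = {p[0]:.4f}, p_1 = {p[1]:.4f}")
```

Since $p_1 = (m/2)p_0$ under this recursion, the mass function starts by decreasing exactly when $m \le 2$, and $m = 2$ gives $p_0 = p_1$, in agreement with the remark that zero and unity are both modes when equality holds.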
Cases of differential fitness are illustrated in  (where is denoted by ). Figure 1 therein shows the mass function of the distribution with a modal value roughly 40. These numerical values are computed from those in the caption of Figure 1: and . In addition, the parameter values yield , justifying the choice . Figure 2 in  illustrates what can occur if is held constant and varies. This figure shows two graphs, the upper one for and the lower one for . Comparing these with Figure 1 in  suggests that increasing above unity yields more sharply peaked mass functions. The distribution has a finite mean iff , and a finite variance iff . Hence these example distributions have a finite mean and infinite variance.
Finally, to see that real estimated mutant number distributions can exhibit a zero or a positive mode we recall estimates determined in  from several experimental data sets for the distribution. A main objective in  is to introduce parameter estimation based on the empirical PGF and compare its performance with maximum likelihood estimation (MLE). Table 1 in  presents 95% confidence intervals for (denoted there by ) and (denoted there by ). Assuming that point estimates are the mid-point values of the confidence intervals, Table 1 here exhibits these estimates and it indicates the shapes of the estimated mass functions.
There now are several methods of estimating mutation model parameters, and a question of interest is whether several methods applied to a given set of data will be consistent as to the shape of the mutant number distribution they select. Published studies indicate that different methods can give quite different estimates, but that they are usually, though not always, consistent in regard to the selected distribution shape. We mention two comparative studies for the model.
Five estimation methods are compared using four data sets in  (where m is used for ). Table 2 in  shows broad consistency in shape selection for Experiments A-C, with the first two indicating a zero mode and the third a positive mode. The -method was not applied/applicable to the Experiment D data. Two of the four estimated values resulted in a mode at zero, and the other two a positive mode.
Table 5 in  compares seven estimation methods using seven sets of experimental data. Estimates of (m in ) are quite variable across estimation methods, but selected shapes are broadly consistent. In fact, after adjusting the Luria-Delbrück mean method by eliminating large jackpots, all methods were consistent in five of the seven data sets. In the two other cases all methods except the Drake median method gave estimated , and the Drake estimate was a little over two; 2.07 for Experiment 2 and 2.08 for Experiment 6. In these cases the modal value is unity; , and if , and , and if . Finally, a zero mode was found for five of the seven data sets.
These investigations do provide confidence that, although different methodologies can show rather different parameter estimates, they in fact are broadly consistent with respect to shape selection.
We end this section with some remarks about plating efficiency. This term refers to the possibility that, upon plating, a mutant cell fails to establish a colony. This aspect frequently is modelled by assuming that plated mutants independently establish colonies each with a probability . In other words, successful establishment is modelled by binomial thinning – if is the PGF of the number of plated mutants, then the PGF of the number of established colonies is . A very convenient result asserts that binomial thinning preserves the GNBC property. Specifically, if , then we obtain from (35) that
In particular, if these measures have densities and , respectively, then
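The binomial thinning operation used above to model plating efficiency can be illustrated numerically. The sketch below assumes the standard identity that thinning with retention probability `eps` (our name for the establishment probability) maps a PGF $f(s)$ to $f(1-\varepsilon+\varepsilon s)$. It thins a truncated mass function term by term and compares the result with the composed PGF. The Poisson input is only a convenient test case, not a mutant number distribution from the text.

```python
import math


def thin(p, eps):
    """Binomially thin a mass function p = [p_0, ..., p_N].

    Each of the n plated mutants independently establishes a colony
    with probability eps.
    """
    q = [0.0] * len(p)
    for n, pn in enumerate(p):
        for k in range(n + 1):
            q[k] += pn * math.comb(n, k) * eps**k * (1.0 - eps) ** (n - k)
    return q


def pgf(p, s):
    """Evaluate the (truncated) PGF sum_n p_n s^n."""
    return sum(pn * s**n for n, pn in enumerate(p))


def poisson_pmf(lam, n_max):
    """Poisson mass function truncated at n_max."""
    return [math.exp(-lam) * lam**n / math.factorial(n) for n in range(n_max + 1)]


if __name__ == "__main__":
    p = poisson_pmf(3.0, 60)
    eps = 0.4
    q = thin(p, eps)
    for s in (0.0, 0.3, 0.9):
        print(s, pgf(q, s), pgf(p, 1.0 - eps + eps * s))
```

For a Poisson input the thinned distribution is again Poisson with mean multiplied by `eps`, which gives an independent check of the routine.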
5. Branching Process Models
The normal population is depleted by one cell each time a mutation occurs. The Lea-Coulson model does not directly account for this. One argument is that in real situations so, this contingency can be neglected. Another response is to replace the parameter with , thus adjusting for a diminished average normal population growth rate.
Branching process models do take direct account of the normal population diminution due to mutation. A discrete-time model was propounded (no later than 1946) by J.B.S. Haldane. See Zheng  for an account and references. Haldane’s model counts population sizes generation by generation. Cell numbers increase by binary division and hence the total size (normals plus mutants) of the nth generation cannot exceed . Consequently the distribution of , the size of the nth mutant generation, cannot be infdiv. There is a Poisson type of limit theorem  resulting in a limiting compound Poisson distribution (and hence infdiv) and whose jump distribution has the PGF , a gap series, and hence this limiting distribution is not DSD.
Instead, we shall consider the continuous-time version of Haldane’s model. This model is a two-type linear birth process apparently formulated by M.S. Bartlett around 1951/2. It is mentioned for the first time in  (p. 37) with details appearing in the first edition of  (p. 132) published in 1955. See  for a detailed account and earlier references.
The balanced version of the model assumes that normal and mutant types divide into two cells during the interval with probability , independently of previous history. Mutants breed true, but a dividing normal cell has probability p of producing one mutant and one normal cell and probability of producing two normal cells.
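The verbal description of the balanced model translates directly into a simulation. The following is a minimal sketch under stated assumptions: every cell, normal or mutant, divides at a common per-cell rate (called `rate` here, our name), a dividing normal cell yields one normal and one mutant with probability `p` and two normals otherwise, and mutants breed true. The function name and the seeding scheme are ours and are not part of the original formulation.

```python
import random


def simulate_bartlett(n0, rate, p, t_end, rng):
    """Simulate the balanced two-type linear birth process up to time t_end.

    n0   : initial number of normal cells (must be >= 1)
    rate : per-cell division rate, common to both types (assumed name)
    p    : probability that a dividing normal cell yields one mutant
    Returns (normals, mutants) at time t_end.
    """
    normals, mutants, t = n0, 0, 0.0
    while True:
        total = normals + mutants
        # Waiting time to the next division among all cells.
        t += rng.expovariate(rate * total)
        if t > t_end:
            return normals, mutants
        if rng.random() < normals / total:
            # A normal cell divides.
            if rng.random() < p:
                mutants += 1   # one normal + one mutant
            else:
                normals += 1   # two normals
        else:
            mutants += 1       # mutants breed true


if __name__ == "__main__":
    rng = random.Random(12345)
    counts = [simulate_bartlett(10, 1.0, 0.01, 5.0, rng)[1] for _ in range(200)]
    print("sample mean mutant count:", sum(counts) / len(counts))
```

In this formulation each mutation leaves the normal count unchanged rather than increasing it, which is how the branching model accounts for the depletion of the normal population that the Lea-Coulson model neglects.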
The PGF of is
where (as above), but again we suppress the dependence on time t in our notation.
Zheng  (with more detail in ) notes a Poisson type of limit in which (i.e., ) and such that
resulting in the limiting PGF
The following result gathers infdiv properties of the Bartlett distributions; however, it is deficient in not concluding that they are GNBC’s. Referring to (50), the term in square brackets can be written as , where and
say. We show below that is a PGF.
It follows that in (50), the integer can be replaced with a positive-valued parameter, say. Thus
i.e., is the PGF of a gamma mixture of discrete infdiv distributions. We shall denote members of the resulting Bartlett family of distributions by .
The next result shows that a Bartlett distribution is a gamma mixture of GNBC’s.
Let and be as defined in (53). The distribution whose PGF is is a GNBC whose mixing GGC has the Lévy density
The corresponding Thorin distribution function is
We begin by showing is a PGF. Writing and referring to (53) we see that and
Hence and if .
Next, observe that
where is the beta function. We thus have the explicit representation
where , by virtue of a reflection formula for gamma functions.
It thus follows from the usual integral representation for beta functions that
Hence h is a PGF, as asserted above.
Making the substitution and comparing the outcome with (20) we see that the Poisson intensity sequence is a Hausdorff moment sequence and that the Bondesson measure has support and density
The second equality above follows from the substitution and the final form follows from evaluating the subtracted integral term in the penultimate line using the substitution to obtain
We thus obtain the final outcome
and it follows from its construction that ℓ is completely monotone. Hence is the PGF of a BOP distribution. Furthermore, this exhibits as the difference of completely monotone functions and we need to find a different representation to be able to conclude that is completely monotone.
The Kummer transformation implies the identity . Integration by parts of the right-hand side integral leads to
Substitution into (56) yields (54), as asserted. It is clear now that is completely monotone, and hence that is the PGF of a GNBC. □
An alternative, but not shorter, proof leading directly to (54) involves constructing the Bernstein representation of using the identity listed as Entry 2 in  (p. 304).
Recalling (50) with the integer replaced by and the definition , then choosing yields the representation for the Bartlett PGF,
It follows from Theorem 7 that this involves the composition of two Thorin Bernstein functions. However, the class of such functions is not closed under composition and hence we cannot conclude that a Bartlett distribution is a GNBC. On the other hand, the components of this composition are complete Bernstein functions and this class is closed under composition. See  (pp. 112 and 94), respectively. Hence we can conclude that Bartlett distributions of mutant numbers belong to BOP.
Similarly, the Zheng PGF (51) is that of a gamma mixture of Lea-Coulson distributions. Hence a corresponding analogue of Theorem 7 in essence is Theorem 6(d).
6. Some Other Mutant Number Distributions
The total population size for the above (balanced) Bartlett model comprises a linear birth process with splitting rate . Thus, the embedded jump chain is the deterministic process which jumps by unity at each cell division. Angerer  and Kepler and Oprea  independently and almost simultaneously proposed a discrete-time model for mutant numbers immediately following successive divisions at which takes values . Thus, if , and clearly . Their precise specifications differ in some details but, as in Section 5, a dividing normal cell produces one normal and one mutant with probability p. Angerer mentions back mutation but does not pursue that issue; instead, he allows for mutation rates to depend on n and he provides a very careful and exact treatment of their models. Kepler and Oprea include the possibility of back mutation. Taking account of these differences, their fundamental difference equations relating the distributions of and , Equation (1) in both references, are the same.
In more detail, Kepler and Oprea  assume a dividing mutant produces two mutants with probability and one cell of each type with probability . With no detail given, after they ‘pass to a continuum representation form’, they assert that the PGF of is given by
Let , although the biological context implies that . Note that taking yields the Poisson distribution with parameter .
So, assuming that , the substitution and then comparing the outcome with (40) shows that has the distribution with
and hence Theorem 6 above applies.
Angerer  proves several limit theorems for as and other constraints hold. For example, the limiting PGF displayed as (29) in  shows that the limiting distribution is that of a sum , where has a negative binomial distribution and M a LD distribution and they are independent. Hence the sum is a GNBC. Similarly the limit (32) in  is the PGF of a similar sum with replaced by a Poisson distributed random variable, again a GNBC.
More interesting is the PGF
displayed near the end of the proof of Theorem 5.2 in . The relation to the explicit form there is that and , where , and are certain constants specified in .
(a) If and , then (58) specifies a distribution which belongs to , but is not a GNBC.
(b) This distribution is DSD iff . In this case the mass function is non-increasing iff .
(a) We have
Expanding the logarithm term and collecting coefficients of , we find that and
Thus the sequence is a Hausdorff moment sequence, implying membership of BOP.
Recalling that , we have
Hence we obtain a moment representation , where
This function increases in but it has a negative jump at . Hence it is not monotone, implying the second assertion of (a).
(b) The second equality of (59) can be expanded as
Hence iff , a necessary condition for the SD property. In addition, iff
Hence (60) certainly holds if . The right-hand side is increasing in j and the case requires that . So, this condition is sufficient for the SD property. □
The final model we shall examine is based on the discretised Luria-Delbrück model as reformulated in . There are three model assumptions:
The probability of a mutation during is , where , but otherwise is arbitrary;
A mutation occurring at time t induces a growing clone of size at the time of plating/observation. Define ; and
Mutations are classified as type j if . The number of type j mutations in a single culture is denoted by , a random variable having a Poisson distribution where
and “T is the time after which no observable mutations will occur”. Presumably, this could be the time of plating.
In relation to the second assumption, there is an enigmatic assertion that “depends on when the mutation occurs”. However, this is the absolute time t according to their direct specification. So, perhaps what is meant is that t here means the current lifetime of the clone. We shall adopt this interpretation because it seems best aligned with the third assumption. Thus, is the number of type j mutations existing at time T.
Consequently, the number of mutants at time T is and, assuming that the are independent, which is unstated but implicit in , the PGF of the mutant number distribution is
where, as above, and . Thus, the computation of reduces to a determination of and .
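Once the Poisson rates are known, the mass function follows from the standard compound Poisson recursion. The sketch below is ours and rests on two assumptions: that a type j mutation contributes j mutants, so the mutant number is $\sum_j jN_j$ with independent Poisson counts $N_j$ of rate $\lambda_j$, and that the r-sequence is $r_k=(k+1)\lambda_{k+1}$, so that $(n+1)p_{n+1}=\sum_{k=0}^{n} r_k p_{n-k}$. The function names and the list argument `rates` (holding $\lambda_1,\lambda_2,\dots$) are hypothetical.

```python
import math


def compound_poisson_pmf(rates, n_max):
    """Mass function [p_0, ..., p_{n_max}] of sum_j j * N_j.

    rates[j - 1] is the Poisson rate lambda_j of type j mutations.
    Uses n * p_n = sum_{j=1}^{n} j * lambda_j * p_{n-j}.
    """
    p = [math.exp(-sum(rates))]
    for n in range(1, n_max + 1):
        upper = min(n, len(rates))
        s = sum(j * rates[j - 1] * p[n - j] for j in range(1, upper + 1))
        p.append(s / n)
    return p


def r_sequence(rates):
    """Return [r_0, r_1, ...] with r_k = (k + 1) * lambda_{k+1}."""
    return [j * lam for j, lam in enumerate(rates, start=1)]


def is_non_increasing(seq):
    """True if seq[k] >= seq[k + 1] for every k."""
    return all(a >= b for a, b in zip(seq, seq[1:]))


if __name__ == "__main__":
    m = 1.5
    # Rates m / (j (j + 1)), truncated at j = 200 for illustration.
    rates = [m / (j * (j + 1)) for j in range(1, 201)]
    r = r_sequence(rates)
    print("r-sequence non-increasing:", is_non_increasing(r))
    print("r_0 =", r[0])
    print("first masses:", compound_poisson_pmf(rates, 5))
```

With this convention a non-increasing r-sequence is the criterion we take for the DSD property, and $r_0 \le 1$ is the condition for a non-increasing mass function in that case. For the rates $m/(j(j+1))$ one finds $r_k=m/(k+2)$, so $r_0 \le 1$ reduces to $m \le 2$.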
A Luria-Delbrück model with a time and state-dependent mutation rate is specified in  (p. 181). Normal cell numbers grow according to and mutant numbers as . Hence a mutation at time t results in a clone size equal to
Denote a generic value of the right-hand side by j. Hence
So, if , then
where, in the integral, we regard t as a function of .
Thus, the problem reduces to deciding the form of . A standing assumption is that and and, more specifically, that
where and are related by
and , P and Q are positive constants. Here, represents a constant mutation rate per cell per generation and a rate per cell per time.
These specifications yield
Observe that the integral for diverges for . This is handled by computing the rate required to achieve a specified value of , although this tactic does represent a deviation from the model formulation in . The above log-term equals , and hence partial summation yields
Note that an approximation has been adopted in  whereby the zero-valued are replaced by the algebraic values obtained from the integration.
It follows that a necessary condition for DSD is , i.e.,
The mutant number distribution for the above specification is DSD iff (62) holds, in which case the mass function is non-increasing iff .
If , then
The coefficient of B equals
For any , the integrand decreases as j increases from to and the length of the interval of integration decreases too. Hence if . □
We know that the sequence of Poisson rates whose terms equal corresponds to a GNBC. So a question is whether the sequence of rates together with an admissible value for can similarly be associated with a GNBC. We shall see below that the answer is No!
Referring to (61), if such that , then the result is the differential equation for logistic growth. Hence (61) itself represents a generalised form of logistic growth. More generally, (61) is a particular case of the relation
where is decreasing in n. We choose the following specific form.
Thus gives logistic growth, and if , then has a quadratic profile approximating the linear logistic profile. We compute
Evaluation of the integral follows from the substitution and resolving the integrand into partial fraction form. Note that the cases and yield the sequence in  and that our restriction is required by the context because is increasing if .
Proceeding further, let and define
(a) If , then the sequence is a Hausdorff moment sequence:
(b) If , then the sequence is a Hausdorff moment sequence:
(c) The Poisson rate (63) is and the r-sequence is given by .
(a) Begin with the following easily checked identity
The integrand term in brackets equals
Thus we obtain a double integral and the integral with respect to y is
Hence we have the evaluation
Now replace c with to obtain Assertion (a).
For (b), observe that
It follows from that and in addition, . The first equality in (65) follows, and a log-differentiation yields the second equality. □
It follows from Lemma 1 that if , then the distribution determined by (63) is a GNBC and hence that it is unimodal. The mass function is non-increasing iff , i.e., . If , then and the distribution is degenerate at infinity. Observe that, since , and hence Fact 5 shows that the mixing continuous distribution is not in .
Comparing the first member of (64) and (20) with shows that the Bondesson measure of the mixing GGC has the density
it follows that the integral expression for the Lévy density of the mixing GGC is
Kepler, T.B.; Oprea, M. Improved inference of mutation rates. I. An integral representation for the Luria-Delbrück distribution. Theor. Popul. Biol. 2001, 59, 41–48.
Angerer, W.P. An explicit representation of the Luria-Delbrück distribution. J. Math. Biol. 2001, 42, 145–174.
Stewart, F.M.; Gordon, D.M.; Levin, B.R. Fluctuation analysis: The probability distribution of the number of mutants under different conditions. Genetics 1990, 124, 175–185.
Schilling, R.L.; Song, R.; Vondracek, Z. Bernstein Functions, 2nd ed.; De Gruyter: Berlin, Germany, 2012.
Olver, F.; Lozier, D.; Boisvert, R.; Clark, C. NIST Handbook of Mathematical Functions; C.U.P.: Cambridge, UK, 2010.
Steutel, F.W.; van Harn, K. Infinite Divisibility of Probability Distributions on the Real Line; Marcel Dekker, Inc.: New York, NY, USA, 2004.
Katti, S.K. Infinite divisibility of integer-valued random variables. Ann. Math. Stat. 1967, 38, 1306–1308.
Ma, W.T.; Sandri, G.v.H.; Sarkar, S. Novel representation of exponential functions of power series which arise in statistical mechanics and population genetics. Phys. Lett. A 1991, 155, 103–106.
Sarkar, S.; Ma, W.T.; Sandri, G.v.H. On fluctuation analysis: A new, simple and efficient method for computing the expected number of mutants. Genetica 1992, 85, 173–179.
Holgate, P. The modality of some compound Poisson distributions. Biometrika 1970, 57, 666–667.
Armitage, P. The statistical theory of bacterial populations subject to mutation. J. R. Stat. Soc. B 1952, 14, 1–40.
Crump, K.S.; Hoel, P.G. Mathematical models for estimating mutation rates in cell populations. Biometrika 1974, 61, 237–244.
Bondesson, L. Generalised Gamma Convolutions and Related Classes of Distributions and Densities; Springer: New York, NY, USA, 1992.
Karlin, S.; McGregor, J.M. The number of mutant forms maintained in a population. In Proceedings of the Fifth Berkeley Symposium on Mathematics, Statistics and Probability, University of California, Berkeley (1965/1966) 1966; University of California Press: Berkeley, CA, USA, 1967; Volume 4, pp. 403–414.
Rahimov, I. Homogeneous branching processes with non-homogeneous immigration. Stoch. Qual. Control 2021, 36, 165–183.
Mandelbrot, B. A population birth-and-mutation process, I: Explicit distributions for the number of mutants in an old culture of bacteria. J. Appl. Prob. 1974, 11, 437–444.
Koch, A.L. Mutation and growth rates from Luria-Delbrück fluctuation tests. Mutat. Res. 1982, 95, 129–143.
Hamon, A.; Ycart, B. Statistics for the Luria-Delbrück distribution. Elec. J. Statist. 2012, 6, 1251–1272.
Luria, S.E.; Delbrück, M. Mutations of bacteria from virus sensitivity to virus insensitivity. Genetics 1943, 28, 491–511.
Boe, L.; Tolker-Nielsen, T.; Eegholm, K.; Spliid, H.; Vrang, A. Fluctuation analysis of mutations to nalidixic acid resistance in Escherichia coli. J. Bacteriol. 1994, 176, 2781–2787.