# VOLDIS: A Direct Search for the Size Distribution of Nanoparticles from Small-Angle Scattering Data and Stability of the Solution

## Abstract


## 1. Introduction

- (i)
- (ii) direct search for the particle size distribution as a histogram (the McSAS random search method [4]);
- (iii) postulating the distribution in analytical form (e.g., a normal distribution, the Flory-Schulz distribution [5], etc.) and performing a multiparametric approximation of the data; this is implemented, e.g., in MIXTURE (and its modified version POLYMIX) [6] of the ATSAS package [2], in SASFIT [7], and in some others.

## 2. Materials and Methods

#### 2.1. Theory and Algorithms

#### 2.1.1. General Principles

The model is a set of particles of the same shape with radii r_j from R_min to R_max = D_max/2; the way of estimating D_max is considered below. In discrete form, the theoretical scattering intensity from such a model can be written as

$$I(s) = \sum_{j=1}^{N} D_V(r_j)\, v(r_j)\, i_0(s, r_j)\, dr_j, \qquad (1)$$

where D_V(r) is the desired (volume) distribution function; i_0(s, r_j) is the square of the form factor, i.e., the scattering intensity from a particle of the given shape with radius r_j, unit volume, and density contrast (we will call these partial intensities); s = [4π⋅sin(θ)]/λ is the modulus of the scattering vector (2θ is the scattering angle in radians, λ the radiation wavelength); v(r_j) is the body volume at effective radius r_j; and dr_j is the step on the radius grid. If the square of the volume is used in Formula (1), the distribution becomes a distribution by the number of particles, D_N(r_j) = D_V(r_j)/v(r_j). However, the shape of the D_N(r_j) graph is uninformative due to the rapid decay of the curve with increasing radius and is therefore not used.
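As an illustration, Formula (1) can be evaluated directly for spherical particles, whose unit-volume partial intensity has a well-known analytic form; the grid, distribution, and s-range below are arbitrary example values, not ones prescribed by VOLDIS:

```python
import numpy as np

def sphere_i0(s, r):
    """Scattering intensity of a homogeneous sphere of radius r,
    normalized to unit volume and unit contrast (squared form factor)."""
    x = np.outer(s, r)
    x = np.where(x == 0.0, 1e-12, x)          # avoid 0/0 at s = 0
    return (3.0 * (np.sin(x) - x * np.cos(x)) / x**3) ** 2

def intensity(s, r, d_v):
    """Discrete Formula (1): I(s) = sum_j D_V(r_j) v(r_j) i0(s, r_j) dr_j."""
    v = 4.0 / 3.0 * np.pi * r**3              # sphere volumes
    dr = np.gradient(r)                       # step on the radius grid
    return sphere_i0(s, r) @ (d_v * v * dr)

# Example: a narrow Gaussian volume distribution around r = 10 nm
r = np.linspace(0.1, 25.0, 200)
d_v = np.exp(-0.5 * ((r - 10.0) / 1.5) ** 2)
s = np.linspace(1e-3, 2.0, 500)               # nm^-1
I = intensity(s, r, d_v)
```

The resulting curve decays monotonically from its forward-scattering value, as expected for a dilute polydisperse system.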

Here, **x** is the vector of length N (the vector of parameters) of the current D_V(r) distribution values; **I**(s) is the theoretical intensity according to Formula (1); **w**(s) is the weight function compressing the intensity range (see below); and ξ is an auxiliary least-squares multiplier matching the scattering curves before the difference is calculated.

The components of the vector **x** of the target function (Equation (3)) are the values of the centers of the boxes of the distribution histogram, represented on a grid of N particle radii from 0.1 nm to D_max. The grid spacing in the program is equidistant. A smaller size step in the region of small sizes (for example, a grid with an increasing relative step) is ineffective, because scattering from small particles in the experimental angular region is represented by nearly identical, weakly decreasing monotonic curves. The corresponding matrix of intensities is then close in character to the Hilbert matrix, which is known to be very ill-conditioned. Consequently, the set of form factors of small particles can be degenerate within the accuracy of machine arithmetic, and a relatively small size step is unjustified.
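The ill-conditioning caused by small particles can be checked numerically: a matrix of unit-volume sphere intensities sampled on a sub-nanometre radius grid has nearly collinear columns, so its condition number is far larger than that for a broad radius grid. The grids and s-range below are illustrative assumptions:

```python
import numpy as np

def sphere_i0(s, r):
    """Unit-volume sphere intensities, one column per radius."""
    x = np.outer(s, r)
    return (3.0 * (np.sin(x) - x * np.cos(x)) / x**3) ** 2

s = np.linspace(0.05, 3.0, 100)        # a typical experimental range, nm^-1
A_small = sphere_i0(s, np.linspace(0.1, 1.0, 10))   # sub-nm radii: flat curves
A_broad = sphere_i0(s, np.linspace(0.5, 20.0, 10))  # broad grid: distinct curves

cond_small = np.linalg.cond(A_small)
cond_broad = np.linalg.cond(A_broad)
```

The small-radius matrix is dramatically worse conditioned, which is why refining the grid at small sizes buys nothing.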

D_max can be obtained with acceptable accuracy from the estimate of the maximum radius of gyration R_g of the particles, taken from the Guinier plot in the low-angle region in the spherical approximation of the shape of the scattering particles [9]: D_max should be assigned as

$$D_{\max} = 2\cdot\sqrt{5/3}\cdot R_g \approx 2.58\cdot R_g. \qquad (6)$$

However, the Guinier approximation (5) is accurate only for $s \to 0$. Since it is calculated from scattering data starting from s_min > 0, it can be an underestimate, especially if the system contains a small fraction of large particles. Therefore, the estimate (6) is taken as the minimum large value that does not yet lead to an unreasonable overestimation of the model intensity in the extrapolation region [0 : s_min]. In practice, (6) is a good choice for nonspherical particles as well.
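The estimate (6) is a one-liner; it follows from the fact that, for a sphere, R_g² = (3/5)R². The R_g value used below is a hypothetical example:

```python
import math

def d_max_from_rg(r_g):
    """Estimate (6): for a sphere R_g^2 = (3/5) R^2, hence
    D_max = 2 * sqrt(5/3) * R_g ~= 2.58 * R_g."""
    return 2.0 * math.sqrt(5.0 / 3.0) * r_g

d = d_max_from_rg(10.0)   # a hypothetical R_g of 10 nm gives D_max of ~25.8 nm
```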

- (1)
- (2) The program directly changes the distribution values at each node of the histogram. However, such a distribution consists of a large number of narrow peaks due to the strong correlation between the scattering intensity curves of neighboring (i.e., close in size) particles.
- (3) To suppress the splitting of broad distribution peaks into a large number of closely spaced narrow bands, the model scattering intensity in (3) is calculated from a smoothed distribution contour. This additionally improves the conditioning of the problem.
- (4) The program finds a series of five to ten solutions with different (increasing from solution to solution) degrees of smoothing of the distribution contour. The search for each subsequent solution starts from the smooth contour obtained at the previous iteration.
- (5) From the obtained set of solutions, the user chooses the smoothest distribution satisfying an acceptable quality-of-fit criterion (χ², residual autocorrelation criterion, etc.).
- (6) The obtained distributions are averaged, discarding the solutions for which the quality criterion exceeds twice the value of the best one. During averaging, the absolute deviations between successive distributions are accumulated; these values are taken as the error bars in the **D**_V plot. When averaging, strongly oscillating distributions can also be discarded by specifying the interval of solution indices.
- (7) To avoid artifacts in the region of large sizes, the model intensity in the region of extrapolation to the zero scattering angle is calculated and analyzed. If **D**_V decreases non-monotonically in this region (by the second-derivative criterion), the program issues a message about the need to decrease D_max (by default, by a factor of 1.5).
- (8) A uniform particle number distribution is used as the starting approximation. From the second iteration on, the search starts from the smoothed contour obtained at the previous step. It is also possible to start a new iteration from the initial approximation.
- (9) Distributions are searched for in the range [D_Vmin : 10⁴], where D_Vmin is set to zero or a small number. The upper limit protects against numerical overflow. Dynamic normalization of distributions during the search proved unacceptable because it worsens the conditioning of the problem and sometimes leads to its multimodality. Restrictions to non-zero minimum values D_Vmin in the histogram may be set using information about the nature of the sample under study: sometimes the strict absence of particles of any size is impossible, and a non-zero background in the distribution can then be introduced. Of course, the shape of such a background curve depends on the particular object and is not considered in this manuscript.
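Steps (4)-(6) above can be sketched as an outer loop. Here `fit` and `smooth` are placeholder callables standing in for the program's minimization and contour-smoothing routines, and the window-growth schedule is an illustrative assumption:

```python
import numpy as np

def search_series(fit, smooth, x0, n_solutions=7):
    """Sketch of steps (4)-(6): a series of solutions with increasing degrees
    of contour smoothing, each search starting from the previous smoothed
    contour.  `fit(x_start, k)` and `smooth(x, k)` are placeholders."""
    solutions, x = [], np.asarray(x0, dtype=float)
    for i in range(n_solutions):
        k = 3 + 2 * i                      # growing smoothing-window width, points
        x = fit(smooth(x, k), k)
        solutions.append(x)
    solutions = np.array(solutions)
    mean = solutions.mean(axis=0)
    # step (6): accumulated |deviations| between successive solutions -> error bars
    err = np.abs(np.diff(solutions, axis=0)).mean(axis=0)
    return mean, err
```

With trivial stand-in callables the bookkeeping can be checked directly; in the real program, solutions failing the quality criterion would be discarded before averaging.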

#### 2.1.2. Minimization Method

If **x**_k is the current point at the k-th iteration (the vector of the distribution in our case), then, in the linear least-squares case, the minimum of the function is reached if the step **p**_k is taken from the equation (Newton's method)

$$\left( J_k^T J_k + Q(\mathbf{x}_k) \right) \mathbf{p}_k = -J_k^T \mathbf{f}_k, \qquad (7)$$

where J_k is the Jacobian of the residual vector **f**_k and Q(**x**) collects the second-order terms. Equation (7) can provide a convergence speed close to quadratic for nonlinear problems even if the second-order term Q(**x**) is neglected. This leads to the Levenberg-Marquardt method, in which the step **p**_k is taken according to the equation

$$\left( J_k^T J_k + \lambda_k I \right) \mathbf{p}_k = -J_k^T \mathbf{f}_k.$$
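A minimal numeric sketch of one such damped step (the matrices are toy values; the actual solver used by the program is more elaborate):

```python
import numpy as np

def lm_step(J, f, lam):
    """One Levenberg-Marquardt step: solve (J^T J + lam*I) p = -J^T f,
    i.e. the Gauss-Newton normal equations with diagonal damping lam."""
    n = J.shape[1]
    return np.linalg.solve(J.T @ J + lam * np.eye(n), -J.T @ f)

# For a linear residual f(x) = A x - b, a single (almost) undamped step
# from x = 0 lands at the least-squares minimum:
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x0 = np.zeros(2)
p = lm_step(A, A @ x0 - b, lam=1e-12)
```

Increasing `lam` shortens the step toward steepest descent, which is what stabilizes the method far from the minimum.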

#### 2.1.3. Modification of the Minimization Procedure

In gradient search methods, the finite-difference increments of the parameters are usually assigned as δ_i = √EPS·(1 + |x_i|), where EPS ≈ 10⁻¹⁶ is the relative accuracy of double-precision machine arithmetic and x_i is the current value of the i-th parameter of the model. In practice, instead of EPS, one must use the accuracy of the calculation of the target function values. Accuracy here refers to the magnitude of the fluctuations of the function value obtained with small argument increments; these errors are due to the total of the rounding errors accumulated in the course of all the calculations. Let us call this error the "calculation noise", ε_F. Unfortunately, the necessity of estimating ε_F is not emphasized in well-known software packages. Calculation noise influences the efficiency of minimization programs: at too small (or too large) parameter variations, the calculation of gradients becomes meaningless, and the greater the noise, the farther from the minimum the search stops. It is impossible to predict ε_F analytically; even if such estimates exist, they are usually upper bounds with no practical value. It is possible, however, to estimate it "after the fact" by making M small variations of the model parameters and analyzing the resulting set of target-function values, written into the vector **f**_0. In practice, M = 15–20 variations for each parameter is sufficient. The variations of the variables **x** are chosen, in the absence of additional information, from the approximate condition of the optimal increment for the third derivative of the target function as $\delta_{x,i} = (1 + |x_i|)\cdot\sqrt[3]{3\cdot EPS}$ [13]. It should be noted that the algorithm for estimating the computational noise is only weakly sensitive to the choice of the increment magnitude: it can be varied within two to three orders of magnitude or even more.

The values **f**_0 are entered in the first column of a finite-difference table. Each subsequent column contains the differences of two consecutive elements of the previous column, giving the columns of first, second, third, etc., differences. In [14] (Chapt. 1.4 and 2.8), it is shown that if, in the column of the n-th differences, the signs of consecutive elements alternate (i.e., the correlation coefficient of consecutive elements tends to −1), then their values are due to the random variations in the first column. Knowing the order of differences n, from the standard deviation σ_n of the elements in this column, one can calculate back the relative random deviation ε_F in the target function (first column).

As a result, the reasonable increments turn out to be of the order of 10⁻⁶–10⁻³, depending on the type of the form factor, whereas the increments assigned in the standard way would be 10⁻⁷–10⁻⁸ in double-precision calculations, which would render the search procedure inoperable.
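The difference-table estimate can be sketched as follows. The back-calculation here uses the standard result that the variance of the n-th differences of uncorrelated noise is amplified by the binomial factor C(2n, n); this is a sketch after Hamming [14], and the exact recipe in VOLDIS may differ (e.g., it also checks the sign alternation of the column):

```python
import numpy as np
from math import comb

def calc_noise(f0, n=4):
    """Estimate the 'calculation noise' eps_F from a set of target-function
    values f0 obtained with tiny parameter variations.  The n-th finite
    differences of a smooth function are dominated by round-off noise, whose
    variance is amplified by C(2n, n); dividing the standard deviation of
    that column by sqrt(C(2n, n)) recovers the noise in the first column."""
    d = np.asarray(f0, dtype=float)
    for _ in range(n):
        d = np.diff(d)                       # next difference column
    return d.std(ddof=1) / np.sqrt(comb(2 * n, n))

rng = np.random.default_rng(0)
f0 = 42.0 + 1e-9 * rng.standard_normal(20)   # smooth value + 'round-off' jitter
eps_f = calc_noise(f0)
```

For the synthetic data above, the estimate recovers the injected noise level of about 10⁻⁹ to within a small factor.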

#### 2.1.4. Details of the VOLDIS Algorithm Implementation

#### Maximum Diameter and Scattering Extrapolation to Zero Angle

#### Smoothing of the Distribution Contour and the Number of Histogram Nodes

Smoothing is applied to the contour of the distribution D_V. It is performed with a Hamming window u_i inside a window of width K = 2k + 1 points with indexes i − k, …, i + k, in which a weighted average is calculated as $\langle D_{V,i}\rangle = \sum_{j=i-k}^{i+k} D_{V,j}\cdot u_j$, with the weights normalized so that $\sum_{j=i-k}^{i+k} u_j = 1$. The shape of the Hamming window **u** is one period of a cosine from −π to π, shifted along the ordinate into the positive region. The width K of the window (the degree of smoothing) varies between 1 and 10% of the full distribution range, but is not less than 3 points. The program uses an original approach in which smoothing is carried out five to ten times sequentially, which eliminates the bumps on the smoothed profile that can occur with one-pass smoothing. During the development of the program, spline-smoothing algorithms and the best piecewise polynomial schemes, with different types of weighting and regularization, were also tested; these approaches did not reveal any advantages.
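A minimal sketch of the multi-pass windowed smoothing; the edge handling (repeating the boundary values) is an implementation choice made for this illustration:

```python
import numpy as np

def smooth_contour(d_v, K=5, passes=7):
    """Multi-pass smoothing of the distribution contour with a normalized
    Hamming window of width K = 2k + 1 points."""
    k = K // 2
    u = 0.54 - 0.46 * np.cos(2.0 * np.pi * np.arange(K) / (K - 1))
    u /= u.sum()                              # weights sum to 1
    out = np.asarray(d_v, dtype=float)
    for _ in range(passes):
        # pad with edge values so the output keeps the input length
        out = np.convolve(np.pad(out, k, mode="edge"), u, mode="valid")
    return out
```

Because the weights sum to one, a constant contour passes through unchanged, while repeated passes progressively widen the effective kernel, which is what suppresses the one-pass bumps.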

#### Solution Quality Criteria

An acceptable solution is considered to be the smoothest one with a χ² that does not exceed the value obtained with the minimum degree of smoothing by more than 1.5 times. Further, the corresponding model intensity should describe the experimental data without systematic deviations exceeding the standard deviation of the noise in the data by more than 1.5 times.

The χ² criterion makes statistical sense only in the case of an adequate evaluation of the standard deviations of the experimental data. Sometimes these estimates are absent, or equal to zero in the case of model problems. But even in the latter case, the model intensities are calculated with finite accuracy, i.e., they contain some "calculation noise". In such cases, a criterion of "residual randomness" may be used. The program employs the Durbin-Watson test [16] to check the residuals for autocorrelation between consecutive elements, calculated using the formula

$$DW = \frac{\sum_{i=2}^{N} (d_i - d_{i-1})^2}{\sum_{i=1}^{N} d_i^2}, \qquad (11)$$

where d_i is the tested sequence **I**_exp − **I**_mod. The value of the criterion lies in the region {0–4}, taking values of about 2.0 in the case of no correlation. Thus, the hypothesis of the presence of correlation between successive elements of the array is rejected at a significance level of 0.05 if the criterion value lies in the range {1.8–2.2}. Of course, the statistically valid value depends on the number of elements in the sequence but, for our purposes, we can neglect the statistical rigor of the estimates. In practice, one has to admit the presence of weak residual correlation due to instrumental distortions and other systematic measurement errors; in most cases, the boundary should be slightly extended (for example, to {1.6–2.2}), justifying this extension empirically on the basis of experience with similar problems. Such an extension is not strict but belongs to expert estimates. Thus, DW < 1.6 corresponds to systematic deviations of the residuals; this usually indicates too high a degree of smoothing of the distribution. If DW > 2.2, then the solution is most likely adequate: owing to the high monotonicity of small-angle scattering curves, no model is able to describe frequent systematic intensity fluctuations, which represent the high-frequency spectral component of the noise. In this case, the nature of the experimental noise is not random: the noise fluctuations change sign too often, which leads to negative autocorrelations in them.

Estimate (11) can be used as an integral criterion, calculated over the whole angular range of the data, or as a window criterion, calculated inside a scanning window of 10–20% of the whole range. In the latter case, we obtain a curve of DW estimates in which areas with DW < 1.6 correspond to systematic deviations. This is extremely useful information, since on the basis of the obtained statistics one can construct an additional weight function for **I**_exp, increasing the contribution of the corresponding scattering regions to (3) and repeating the search from scratch.
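The statistic (11) is easy to compute directly; the residual sequences below are synthetic illustrations of the three regimes:

```python
import numpy as np

def durbin_watson(d):
    """Durbin-Watson statistic (11) for a residual sequence d = I_exp - I_mod:
    ~2.0 for uncorrelated residuals, -> 0 for slowly varying (systematic)
    residuals, -> 4 for residuals that alternate in sign."""
    d = np.asarray(d, dtype=float)
    return np.sum(np.diff(d) ** 2) / np.sum(d ** 2)

rng = np.random.default_rng(1)
dw_white = durbin_watson(rng.standard_normal(2000))             # ~2: random
dw_smooth = durbin_watson(np.sin(np.linspace(0.0, 3.0, 2000)))  # ~0: systematic
```

The windowed variant mentioned above is obtained by applying the same formula inside a scanning window of 10–20% of the angular range.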

#### Intensity Weighting

In the classical approach, the residuals are weighted by the reciprocal standard deviations of the experimental data, minimizing the deviation of the χ² value from unity. Such a solution corresponds to the maximum likelihood criterion, which seems attractive because of its name and statistical meaning. However, in the case of SAS data analysis, this formally correct approach often fails. The problem is that the model intensity is fitted within an aligned error corridor; the shape of the weighted scattering curve then becomes dependent on the type of detector (one-dimensional, sectoral, or two-dimensional). As a rule, the experimental data are dominated by Poisson noise, in which the standard deviation equals the square root of the mathematical expectation (~the mean intensity). The intensity obtained by azimuthal averaging of the isotropic pattern of a two-dimensional detector will then have significantly less relative noise at large angles compared to a one-dimensional detector. However, this region contains mostly atomic scattering background and non-informative tails of the particle scattering. In addition, the intensity at large angles can be incorrect due to a slightly incorrect subtraction of the scattering from an empty cuvette or a cuvette with solvent. Consequently, the requirement of high absolute accuracy of the fit at large angles can be relaxed. The practice of solving a large number of model and real problems has shown that the solution of the distribution search problem depends weakly on the starting approximation if the ratio of the maximum intensity (at small angles) to the minimum intensity (at large angles) is within five to ten; i.e., the curve weighting should simply reduce this ratio to the required level, regardless of the distribution of the error amplitude along the data curve. The influence of a significant noise component at large angles will be small if the number of angular samples is more than five to ten times the number of Shannon channels in the data, which is usually fulfilled in practice.

By a Shannon channel, we mean the maximum allowable angular step at which there is still no loss of information. It is defined by the maximum particle size D_max as s_shann = π/D_max [9] (Chapt. 3.2). The number of Shannon channels is then N_shann = (s_max − s_min)/s_shann. In practice, a number of angular samples N > 10·N_shann is a good choice.
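For instance (using the upper angular range quoted in Section 2.2 and a hypothetical D_max of 60 nm, not a value from the paper):

```python
import math

def shannon_channels(s_min, s_max, d_max):
    """N_shann = (s_max - s_min) / s_shann, with s_shann = pi / D_max."""
    return (s_max - s_min) / (math.pi / d_max)

n_shann = shannon_channels(0.1, 6.5, 60.0)   # ~122 channels
n_min_samples = 10 * n_shann                 # the N > 10 * N_shann rule of thumb
```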

A traditional representation of the data is I(s)·s² vs. s (the so-called Kratky plot), or, more generally, I(s)·s^p, where p is a real exponent. In the analysis of monodisperse systems, such weighting of the data additionally attenuates the relative contribution to the discrepancy of the initial part of the scattering curve, where an undesirable contribution of scattering from particle aggregates is possible. In the case of polydisperse systems, however, the initial angular section already contains the basic information about the shape of the distribution. In order not to lose the contribution of the scattering from large particles, one can artificially increase the scattering at the initial values of s by means of additional multipliers, as done, for example, in [17].

First, I_exp is smoothed until there is no autocorrelation in the residuals (see above). The smoothed curve I_smo(s) is raised to the required real power p, obtaining ${I}_{smo}^{p}(s)$. The ratio $w({s}_{i})={I}_{smo}^{p}({s}_{i})/{I}_{smo}({s}_{i})$ is then calculated. After the multiplication I_work(s_i) = I_exp(s_i)·w(s_i), the working intensity curve is close in shape to {I_exp(s_i)}^p, but it is obtained as the result of a linear transformation. We have called such an operation a "quasi-power" transformation. A similar approach can be applied to obtain a "quasi-logarithmic" transformation.
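A sketch of the transformation on toy curves (the curve shapes and the noise level are assumptions for illustration):

```python
import numpy as np

def quasi_power(i_exp, i_smo, p):
    """'Quasi-power' transformation: w(s) = I_smo^p(s) / I_smo(s), so that
    I_work = I_exp * w is close in shape to I_exp^p while remaining a LINEAR
    transformation of the measured data (noise structure preserved)."""
    w = i_smo ** p / i_smo
    return i_exp * w, w

# Toy curves standing in for a smoothed and a noisy measured intensity:
s = np.linspace(0.1, 5.0, 300)
i_smo = 1.0 / (1.0 + s ** 4)
i_exp = i_smo * (1.0 + 0.02 * np.random.default_rng(2).standard_normal(s.size))
i_work, w = quasi_power(i_exp, i_smo, p=0.5)   # compresses the dynamic range
```

With p = 0.5, the max-to-min intensity ratio drops roughly to its square root, which is the compression the weighting scheme above calls for.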

#### On the Choice of Formfactor Type

The form factor can also be supplied as a template intensity curve calculated for the particle of maximum size D_max and with the maximum scattering angle equal to the experimental one. Scattering from the intermediate sizes is calculated by scaling the given template intensity downward using spline interpolation. The number of points in the template form-factor data should not be less than 3000–5000 to ensure acceptable accuracy in handling sharp breaks in the curve.
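The downscaling rests on the fact that a unit-volume partial intensity depends on s and r only through the product s·r, so a dense template computed once for the largest radius can be re-sampled for any smaller one. The sketch below uses linear interpolation in place of splines and the analytic sphere curve as a stand-in template:

```python
import numpy as np

def sphere_i0(x):
    """Unit-volume sphere intensity as a function of x = s * r."""
    x = np.where(x == 0.0, 1e-12, x)
    return (3.0 * (np.sin(x) - x * np.cos(x)) / x**3) ** 2

# Dense template computed once for the largest radius r_tpl
r_tpl = 10.0
s_tpl = np.linspace(1e-4, 2.0, 5000)      # 3000-5000+ points, as advised above
i_tpl = sphere_i0(s_tpl * r_tpl)

def scaled_formfactor(s, r):
    """i0(s, r) = i_tpl(s * r / r_tpl): re-sample the template at scaled abscissae."""
    return np.interp(s * r / r_tpl, s_tpl, i_tpl)

s = np.linspace(0.1, 1.9, 400)
approx = scaled_formfactor(s, 5.0)        # template scaled down to r = 5
exact = sphere_i0(s * 5.0)                # analytic check
```

On a sufficiently dense template, the scaled curve reproduces the analytic one to high accuracy; coarse templates fail exactly where the curve has sharp oscillations.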

#### Assessment of Stability of Distribution Shapes

Only those solutions for which the quality-of-fit criterion (χ² or R-factor) does not exceed the minimum obtained value by more than a factor of 1.5–2 are involved in the averaging process. Of course, the search for effective criteria will continue.

#### Other Details

#### 2.2. Measurement Details

The angular ranges covered were … nm⁻¹ and 0.1 < s < 6.5 nm⁻¹ for each of the instruments, respectively. The samples were placed in quartz capillaries of 1 mm diameter (the sample thickness). The measurement time for one sample was 10 min. The experimental data were corrected for collimation distortions, as described in [9] (Chapt. 9.4). The scattering intensity from the capillary with the solvent was subtracted from the sample scattering data.

#### 2.3. Sample Description

The negative …⁻ charge on the surface of silica nanoparticles attracts Na⁺ cations, which form a charged layer close to the surface of the silica, called the Stern layer. Such charged layers lead to a stable colloidal suspension due to the strong electrostatic repulsion between the particles [19].

#### 2.4. Additional Software

## 3. Results and Discussion

Na⁺ ions and OH⁻ groups form a double electric shell on the surface of the nanoparticles in solution, which prevents their aggregation and increases the contrast of their surface, thus increasing the diameter determined in solution. It can be assumed that the reported diameter of 22 nm was obtained from electron microscopy data on dry nanoparticles. In [22], the data obtained for the TM50 solutions by an acoustic spectrometer (32.1 nm), by laser diffraction (29.9 nm), and by dynamic light scattering (34.1 nm) are given. Such a discrepancy in the published data allows one to assert that the particle size found from small-angle X-ray scattering data is the most adequate.

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Svergun, D.I. Determination of the regularization parameter in indirect-transform methods using perceptual criteria. J. Appl. Cryst. **1992**, 25, 495–503.
2. Manalastas-Cantos, K.; Konarev, P.V.; Hajizadeh, N.R.; Kikhney, A.G.; Petoukhov, M.V.; Molodenskiy, D.S.; Panjkovich, A.; Mertens, H.D.T.; Gruzinov, A.; Borges, C.; et al. ATSAS 3.0: Expanded functionality and new tools for small-angle scattering data analysis. J. Appl. Cryst. **2021**, 54, 343–355.
3. Brunner-Popela, J.; Glatter, O. Small-Angle Scattering of Interacting Particles. I. Basic Principles of a Global Evaluation Technique. J. Appl. Cryst. **1997**, 30, 431–442.
4. Bressler, I.; Pauw, B.R.; Thünemann, A.F. McSAS: Software for the retrieval of model parameter distributions from scattering patterns. J. Appl. Cryst. **2015**, 48, 962–969.
5. Flory, P.J. Molecular Size Distribution in Linear Condensation Polymers. J. Am. Chem. Soc. **1936**, 58, 1877–1885.
6. Svergun, D.I.; Konarev, P.V.; Volkov, V.V.; Koch, M.H.J.; Sager, W.F.C.; Smeets, J.; Blokhuis, E.M. A small angle X-ray scattering study of the droplet–cylinder transition in oil-rich sodium bis(2-ethylhexyl) sulfosuccinate microemulsions. J. Chem. Phys. **2000**, 113, 1651–1665.
7. Bressler, I.; Kohlbrecher, J.; Thünemann, A.F. SASfit: A tool for small-angle scattering data analysis using a library of analytical expressions. J. Appl. Cryst. **2015**, 48, 1587–1598.
8. Kryukova, A.E.; Konarev, P.V.; Volkov, V.V.; Asadchikov, V.E. Restoring silicasol structural parameters using gradient and simulation annealing optimization schemes from small-angle X-ray scattering data. J. Mol. Liq. **2019**, 283, 221–224.
9. Feigin, L.A.; Svergun, D.I. Structure Analysis by Small-Angle X-ray and Neutron Scattering; Plenum Press: New York, NY, USA, 1987; p. 321.
10. User Guide for the SASfit Software Package. Available online: https://raw.githubusercontent.com/SASfit/SASfit/master/doc/manual/sasfit.pdf (accessed on 5 November 2022).
11. Available online: http://www.netlib.no/netlib/port/dn2fb.f (accessed on 3 January 2005).
12. Dennis, J.E., Jr.; Gay, D.M.; Welsh, R.E. Algorithm 573: NL2SOL—An Adaptive Nonlinear Least-Squares Algorithm [E4]. ACM Trans. Math. Softw. **1981**, 7, 369–383.
13. Gill, P.E.; Murray, W.; Wright, M.H. Practical Optimization; Academic Press: London, UK, 1982; p. 462.
14. Hamming, R.W. Numerical Methods for Scientists and Engineers; McGraw-Hill Book Company, Inc.: New York, NY, USA, 1962; 411p.
15. Bowman, A.W.; Azzalini, A. Applied Smoothing Techniques for Data Analysis; Clarendon Press: Oxford, UK, 1997; p. 193.
16. Durbin, J.; Watson, G.S. Testing for Serial Correlation in Least-Squares Regression III. Biometrika **1971**, 58, 1–19.
17. Svergun, D.I. Restoring Low Resolution Structure of Biological Macromolecules from Solution Scattering Using Simulated Annealing. Biophys. J. **1999**, 76, 2879–2886.
18. LUDOX TM-50 Colloidal Silica. Available online: https://www.chempoint.com/products/grace/ludox-monodispersed-colloidal-silica/ludox-colloidal-silica/ludox-tm-50 (accessed on 2 January 2002).
19. Sögaard, C.; Funehag, J.; Abbas, Z. Silica sol as grouting material: A physio-chemical analysis. Nano Converg. **2018**, 5, 6.
20. Data Analysis Software ATSAS 3.1.3. Available online: https://www.embl-hamburg.de/biosaxs/software.html (accessed on 9 September 2022).
21. Konarev, P.V.; Volkov, V.V.; Sokolova, A.V.; Koch, M.H.J.; Svergun, D.I. PRIMUS: A Windows PC-based system for small-angle scattering data analysis. J. Appl. Cryst. **2003**, 36, 1277–1282.
22. LUDOX® Colloidal Silica, Grace—ChemPoint. Colloidal Silica as a Particle Size and Charge Reference Material. Available online: https://www.horiba.com/fileadmin/uploads/Scientific/Documents/PSA/TN158.pdf (accessed on 10 November 2019).

**Figure 1.** Experimental (dots, smoothed by a nonparametric adaptive algorithm) and model (solid lines) small-angle scattering intensities from the silica sample TM50. The final solution (Figure 3) is chosen to be as smooth as possible while χ² remains small (in this case, χ² = 1.11). K denotes the width of the smoothing window in points.

**Figure 2.** The set of size distributions calculated at different degrees of smoothing. The widths of the smoothing windows are indicated in points. The average diameter of the main fraction is 26 ± 1 nm. The distribution of the main fraction is asymmetric and is a superposition of distributions of nanoparticles with average diameters of 22 ± 0.5 and 27 ± 0.5 nm. Contributions of particles with average radii of 47 ± 2 nm and 1.5 ± 0.5 nm are also visible. The shape of the distributions at larger diameters depends weakly on the degree of smoothing, whereas the distribution of the smallest particles at 1.5 nm is less stable.

**Figure 5.**The set of all solutions for the data in Figure 4.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Volkov, V.V.
VOLDIS: A Direct Search for the Size Distribution of Nanoparticles from Small-Angle Scattering Data and Stability of the Solution. *Crystals* **2022**, *12*, 1659.
https://doi.org/10.3390/cryst12111659
