Article

Representative Points from a Mixture of Two Normal Distributions

1 Department of Statistics and Data Science, Beijing Normal University–Hong Kong Baptist University United International College, Zhuhai 519087, China
2 Department of Mathematics, Hong Kong Baptist University, Kowloon, Hong Kong
3 The Key Lab of Random Complex Structures and Data Analysis, The Chinese Academy of Sciences, Beijing 100045, China
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(21), 3952; https://doi.org/10.3390/math10213952
Submission received: 29 August 2022 / Revised: 17 October 2022 / Accepted: 19 October 2022 / Published: 24 October 2022
(This article belongs to the Special Issue Distribution Theory and Application)

Abstract

In recent years, the mixture of two-component normal distributions (MixN) has attracted considerable interest due to its flexibility in capturing a variety of density shapes. In this paper, we investigate the problem of discretizing a MixN by a fixed number of points under the minimum mean squared error criterion (the resulting points are called MSE-RPs). Motivated by the Fang-He algorithm, we provide an effective computational procedure with high precision for generating numerical approximations of MSE-RPs from a MixN. We explore the properties of the nonlinear system used to generate MSE-RPs and demonstrate the convergence of the procedure. In numerical studies, the proposed computational procedure is compared with the k-means algorithm. From an application perspective, MSE-RPs have potential advantages in statistical inference. Our numerical studies show that MSE-RPs can significantly improve kernel density estimation.

1. Introduction

A finite mixture of distributions allows for great flexibility in capturing a variety of density shapes. During the past few years, the Gaussian mixture family has attracted considerable attention due to its ability to approximate any continuous distribution by a finite Gaussian mixture model. Mixtures of normal distributions have numerous applications across a variety of disciplines, including physics, engineering, economics, biology, and finance. Andrew et al. [1] apply a two-component Gaussian mixture model for fast neutron detection with a pulse shape discriminating scintillator. Kong et al. [2] model wireless channels using a mixture of normal distributions in order to analyze physical layer security over fading channels. Shen et al. [3] employ a mixture of two-component normal distributions to model the knock intensity in the gasoline engine combustion control problem. Mazzeo et al. [4] and Ouarda and Charron [5] use a mixture of two truncated normal distributions for modeling wind speed. The mixture of normal distributions is capable of capturing the leptokurtic, skewed, and multimodal characteristics of financial data; Venkataraman [6] applies the mixture of two-component normal distributions to construct Value at Risk (VaR) measures. In public health and biomedical studies, the admixture model is a two-component mixture model with one component indexed by an unknown parameter and the other component's parameter known. Admixture models are widely used to account for population heterogeneity in genetic linkage analysis [7,8].
It is important to keep in mind that although the mixture model is flexible, it also presents certain statistical challenges. The number of parameters increases more than k-fold when k-component normal mixtures are applied. The mixture of normal distributions is not always an identifiable distribution: different values of its parameters may generate the same probability distribution of the observable variables. The loss of identifiability breaks the asymptotic normality and the chi-squared approximation of a likelihood ratio test statistic constructed from independent and identically distributed samples from a mixture model. Hartigan [9] showed that the likelihood ratio test statistic for homogeneity is stochastically unbounded and therefore fails to converge to the classical chi-squared distribution. Due to the non-identifiability problem, the asymptotic normality and chi-squared approximation of the likelihood ratio test statistic do not hold for the admixture models [8]. In addition to the loss of identifiability, the additive density form of a mixture model results in an unbounded likelihood function [10]. This singularity problem breaks down the consistency of the MLE and affects its asymptotic normality due to the occurrence of a degenerate Fisher information matrix. Recently, Panić et al. [11] and Li and Fang [12] developed improved methods for estimating the parameters of mixtures of normal distributions to overcome these difficulties. More discussion on the inconsistency of the MLE under a mixture of two-component normal distributions can be found in Chen [13].
In this paper, we consider the problem of discretization of a mixture of two-component normal distributions (MixN) under the minimum mean squared error measure. Quantization is a process employed in source coding and information theory for converting continuous analog signals into discrete digital signals. Analog quantities must be quantized when they are represented, processed, stored, or transmitted by a digital system [14]. Quantization is also required for data compression. More information about quantization can be found in Graf and Luschgy [15], Gersho and Gray [16]. Max [17] considered the problem of minimizing the distortion of a signal by a quantizer when the number of output levels of the quantizer is fixed. The distortion can be defined as the squared error between the input signal and the output of the quantizer. Assume the input signal follows a continuous distribution. The optimal output levels are the representative points of the underlying distribution with minimum mean squared error. Our paper refers to the representative points as MSE-RPs for short. A more detailed discussion of MSE-RPs and their associated properties is provided in Section 3.
Due to the complexity of the density function of a MixN, selecting MSE-RPs from a MixN presents a challenge. Flury [18] provided analytic solutions for the two principal points (MSE-RPs) of univariate symmetric distributions, such as the uniform distribution and the normal distribution. However, the additive density form and the uncertainty concerning the uniqueness of MSE-RPs make it difficult to obtain an analytic solution for two MSE-RPs from a MixN. A log-concavity condition is commonly used in the literature to guarantee a unique solution for MSE-RPs [19,20]. Later, Yamamoto and Shinozaki [21] derived a sufficient condition for the existence of unique two principal points (MSE-RPs) for the mirrored mixture of normal distributions; mirrored mixtures of two-component normals have a symmetric distribution and a mixture proportion of 0.5. Using Trushkin's result [19], we derive in Section 3 a sufficient condition for location mixtures of two normal distributions with a mixture proportion ranging from 0 to 1. This condition can be used with any number of MSE-RPs.
To obtain numerical approximations to the MSE-RPs from a MixN, the Lloyd I procedure, also known as the k-means algorithm, is most commonly applied [22,23]. Rowe [24] proposed an algorithm for finding solutions to a loss function of arbitrary order; in our case, we consider the order of 2. The algorithm searches for one of the n points at a time and then adjusts the other values based on the success of the corresponding search, which may not be efficient when generating large numbers of points. MSE-RPs of univariate distributions can be obtained by solving a system of non-linear equations, formulated by taking the first-order partial derivatives of the mean squared error function with respect to each point. Recently, Chakraborty et al. [25] applied the iterative Newton's method to solve the nonlinear system and demonstrated that their method calculates MSE-RPs with high precision for many univariate distributions. However, in their method, the initial values for the iterative Newton's method are assigned at random, so convergence of the algorithm is not always guaranteed. Fang and He [26] proposed a computation procedure (the Fang-He algorithm), also for solving the nonlinear system, and proved the convergence of the proposed algorithm for the normal distribution. Later, many authors proved the convergence of the Fang-He algorithm for other univariate distributions, for example, the Student t-distribution [27], the Gamma distribution [28], the S-type distributions [29], the Weibull distribution [30], and the Pearson distributions [31]. Recently, Fang et al. [32] applied the Fang-He algorithm to generate MSE-RPs from a MixN with a bimodal density. However, further investigation is necessary to determine whether the Fang-He algorithm achieves convergence for a MixN.
Taking advantage of the high precision and effectiveness of the Fang-He algorithm, we employ the same algorithm for finding numerical approximations of MSE-RPs from a MixN. The convergence of the algorithm and the properties of the nonlinear system based on the underlying MixN distribution are discussed in Section 4.2. Our numerical studies indicate that the Fang-He algorithm is capable of providing more accurate MSE-RPs from the mixture of two-component normal distributions than the well-known k-means algorithm.
The rest of this paper is organized as follows. Section 2 includes the preliminaries of two-component normal mixtures. Section 3 discusses the MSE-RPs from a MixN. Section 4 gives the proposed computation procedures for MSE-RP selection. Section 5 presents numerical studies examining the accuracy of the algorithms; a simulation study of kernel density estimation for signal transmission is also performed in this section. Finally, Section 6 gives conclusions and remarks.

2. Mixtures of Two-Component Normal Distributions

Consider a random variable $X$ that follows a mixture of two independent normal distributions, denoted as $X \sim \mathrm{MixN}(\alpha, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$. The parameter $\alpha$ ranges from 0 to 1 and represents the contribution of the first normal component $N(\mu_1, \sigma_1^2)$. The probability density function (pdf) of $X$ is defined as
$$f(x) = \alpha\,\phi(x; \mu_1, \sigma_1^2) + (1-\alpha)\,\phi(x; \mu_2, \sigma_2^2),$$
and the cumulative distribution function (cdf) of X is given by
$$F(x) = \alpha\,\Phi(x; \mu_1, \sigma_1^2) + (1-\alpha)\,\Phi(x; \mu_2, \sigma_2^2),$$
where $\alpha \in (0,1)$, and $\phi(\cdot\,; \mu, \sigma^2)$ and $\Phi(\cdot\,; \mu, \sigma^2)$ denote the pdf and cdf of $N(\mu, \sigma^2)$, respectively. For the remainder of this paper, we denote MixN as a mixture of two-component normal distributions.
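For concreteness, the pdf and cdf above can be evaluated with a short Python sketch based on SciPy (the function names below are ours, introduced only for illustration):

```python
import numpy as np
from scipy.stats import norm

def mixn_pdf(x, alpha, mu1, var1, mu2, var2):
    """f(x) = alpha * phi(x; mu1, var1) + (1 - alpha) * phi(x; mu2, var2)."""
    return (alpha * norm.pdf(x, loc=mu1, scale=np.sqrt(var1))
            + (1.0 - alpha) * norm.pdf(x, loc=mu2, scale=np.sqrt(var2)))

def mixn_cdf(x, alpha, mu1, var1, mu2, var2):
    """F(x) = alpha * Phi(x; mu1, var1) + (1 - alpha) * Phi(x; mu2, var2)."""
    return (alpha * norm.cdf(x, loc=mu1, scale=np.sqrt(var1))
            + (1.0 - alpha) * norm.cdf(x, loc=mu2, scale=np.sqrt(var2)))
```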
The normal distribution belongs to the location-scale family. The convex combination of two normal distributions, however, is no longer a location-scale distribution. There are two special sub-classes of MixN, arising when the two normal components have the same location or the same scale. Let $X \sim \mathrm{MixN}(\alpha, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$.
  • If $\mu_1 = \mu_2$ and $\sigma_1^2 \ne \sigma_2^2$, $X$ follows a scale mixture, denoted as $\mathrm{MixN.S}(\alpha, \mu, \sigma_1^2, \sigma_2^2)$;
  • If $\mu_1 \ne \mu_2$ and $\sigma_1^2 = \sigma_2^2$, $X$ follows a location mixture, denoted as $\mathrm{MixN.L}(\alpha, \mu_1, \mu_2, \sigma^2)$.
Mirrored mixtures are a special case of location mixtures with $\alpha = 0.5$ and symmetric component means $\pm\mu$, denoted as $\mathrm{MixN.M}(\mu, \sigma^2)$. The corresponding density function is
$$f(x) = \frac{0.5}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} + \frac{0.5}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x+\mu)^2}{2\sigma^2}}. \tag{1}$$
The density of a MixN is either unimodal or bimodal. Statistical analyses based on a MixN are potentially influenced by the number of modes. For instance, when the densities of the two mixture components overlap heavily, parameter estimation of a MixN is difficult, because the correct component for each observation is hard to identify. In the case of a bimodal MixN, the two normal components can be separated more easily, resulting in more accurate parameter estimation; in contrast, parameter estimation for a unimodal MixN is challenging. In our study, the number of modes also influences the generation of MSE-RPs. It remains unexplored whether a bimodal MixN admits a unique set of MSE-RPs. Hence, all numerical approximations of MSE-RPs from a bimodal MixN are only guaranteed to be stationary points.
It is advantageous to know the conditions on the number of modes prior to setting the five parameters of a MixN in statistical simulations. Next, we review and discuss some sufficient conditions for unimodal and bimodal densities, respectively. Eisenberger [33] derived a sufficient condition for a MixN to be unimodal for all $\alpha$, $0 < \alpha < 1$, namely
$$(\mu_1 - \mu_2)^2 < \frac{27\,\sigma_1^2\sigma_2^2}{4\left(\sigma_1^2 + \sigma_2^2\right)}, \tag{2}$$
and a sufficient condition for a MixN to be bimodal, that is, there exist values of $\alpha$, $0 < \alpha < 1$, such that the density has two modes if
$$(\mu_2 - \mu_1)^2 > \frac{8\,\sigma_1^2\sigma_2^2}{\sigma_1^2 + \sigma_2^2}. \tag{3}$$
According to Behboodian [34], a sufficient condition for a MixN to be unimodal is
$$|\mu_1 - \mu_2| \le 2\min\{\sigma_1, \sigma_2\}.$$
Consider the special sub-classes of MixN. If $\mu_1 = \mu_2$ is assumed, a scale mixture (MixN.S) is unimodal for any $\alpha \in (0,1)$, according to the sufficient conditions (2) and (3). When $\sigma_1^2 = \sigma_2^2 = \sigma^2$ is assumed, Behboodian [34] further obtained a sufficient condition for a location mixture (MixN.L) to be unimodal, that is,
$$|\mu_1 - \mu_2| \le 2\sigma\sqrt{1 + \frac{\left|\log\alpha - \log(1-\alpha)\right|}{2}}, \quad \alpha \in (0,1). \tag{4}$$
Note that (4) gives a dynamic upper bound on $|\mu_1 - \mu_2|$ for a unimodal location mixture, whereas the condition derived from (2) is $|\mu_1 - \mu_2| < 3\sqrt{6}\,\sigma/4$ for all $\alpha \in (0,1)$. Behboodian [34] further proved that a location mixture with $\sigma_1^2 = \sigma_2^2$ and $\alpha = 0.5$ is unimodal if and only if $|\mu_1 - \mu_2| \le 2\sigma$; hence, such a mixture is bimodal if and only if $|\mu_2 - \mu_1| > 2\sigma$.
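As an illustration, the following Python sketch (our own, not from the paper) evaluates conditions (2) and (3) and Behboodian's bound for a given set of component parameters; the function name and return convention are ours:

```python
import numpy as np

def modality_hints(mu1, var1, mu2, var2):
    """Check the sufficient conditions (2)-(3) and Behboodian's bound."""
    d2 = (mu1 - mu2) ** 2
    # (2): unimodal for every mixing proportion alpha in (0, 1)
    unimodal_all_alpha = d2 < 27.0 * var1 * var2 / (4.0 * (var1 + var2))
    # (3): bimodal for some mixing proportion alpha in (0, 1)
    bimodal_some_alpha = d2 > 8.0 * var1 * var2 / (var1 + var2)
    # Behboodian's sufficient condition for unimodality
    behboodian = abs(mu1 - mu2) <= 2.0 * min(np.sqrt(var1), np.sqrt(var2))
    return unimodal_all_alpha, bimodal_some_alpha, behboodian
```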
The MixN models can achieve a great deal of flexibility using only a few components. Several examples of MixN densities are presented in Figure 1 to demonstrate this flexibility. The upper left panel of Figure 1 shows the density of $\mathrm{MixN}(0.1, 0, 100, 0, 1)$, in which one mixture component is a standard normal and the other component has the same mean but a 100-fold larger variance. This type of mixture is often used as a theoretical model for identifying outliers in a single sample or in regression problems [35]. The upper right panel of Figure 1 displays an example of a right-skewed unimodal density. The lower left panel of Figure 1 shows a mirrored mixture with one mode, whose density is nearly flat around the mode. The lower right panel of Figure 1 presents a bimodal example.

3. MSE-RPs from a MixN

Suppose $X \sim \mathrm{MixN}(\alpha, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$ with cdf $F(x)$ and pdf $f(x)$. Consider defining a discrete distribution $\hat{F}$ with $n$ support points, where $n \in \mathbb{N}^+$, to approximate the MixN. To provide the best representation of the MixN, we select representative points that have the least mean squared error from $F(x)$. Assume $E(X^2) < \infty$. Let $Y_n$ follow a discrete distribution $F_{mse,n}$ defined as
$$F_{mse,n}(y) = \sum_{j=1}^{n} p_j^{(n)}\, 1\!\left\{a_j^{(n)} \le y\right\}$$
with the probability mass function
$$P\!\left(Y_n = a_i^{(n)}\right) = p_i^{(n)}, \quad i = 1, \dots, n,$$
where $a_1^{(n)} < \dots < a_n^{(n)}$ are MSE-RPs of $X$ and $p_1^{(n)}, \dots, p_n^{(n)}$ are the corresponding probabilities, with
$$\mathrm{MSE}\!\left(X \mid a_1^{(n)}, \dots, a_n^{(n)}\right) = \int_{-\infty}^{+\infty} \min_{i=1,\dots,n} \left(x - a_i^{(n)}\right)^2 f(x)\, dx, \tag{5}$$
and
$$p_1^{(n)} = \int_{-\infty}^{(a_1^{(n)}+a_2^{(n)})/2} f(x)\, dx, \qquad p_i^{(n)} = \int_{(a_{i-1}^{(n)}+a_i^{(n)})/2}^{(a_i^{(n)}+a_{i+1}^{(n)})/2} f(x)\, dx, \quad i = 2, \dots, n-1, \qquad p_n^{(n)} = \int_{(a_{n-1}^{(n)}+a_n^{(n)})/2}^{+\infty} f(x)\, dx.$$
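In code, these probabilities are simply cdf increments over the mid-point intervals. A minimal sketch (the helper name rp_probabilities is ours), reusing mixn_cdf from the sketch in Section 2:

```python
import numpy as np

def rp_probabilities(a, cdf):
    """p_i = F(m_i) - F(m_{i-1}), with mid-points m_i and m_0 = -inf, m_n = +inf."""
    a = np.asarray(a, dtype=float)
    edges = np.concatenate(([-np.inf], (a[:-1] + a[1:]) / 2.0, [np.inf]))
    return np.diff(cdf(edges))
```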
The approximating distribution $F_{mse,n}$ has many useful properties. Given $n \in \mathbb{N}^+$, $F_{mse,n}$ and $F$ have the same population mean, and $F_{mse,n}$ has a smaller variance. Graf and Luschgy [15] (Remark 4.6 and Lemma 6.1) and Fei [36] proved that
$$E(Y_n) = E(X), \qquad \lim_{n \to \infty} E\!\left[(X - Y_n)^2\right] = \lim_{n \to \infty} \left[\mathrm{var}(X) - \mathrm{var}(Y_n)\right] = 0, \tag{6}$$
which imply that $Y_n$ converges to $X$ in mean square as $n$ approaches infinity. Hence, $Y_n$ converges to $X$ in distribution. Flury [37] proved that all MSE-RPs are self-consistent, that is,
$$a_i^{(n)} = E\!\left(X \mid Y_n = a_i^{(n)}\right), \quad i = 1, \dots, n.$$
This property explains that each representative point can best represent a region from the population based on the mean squared error criterion.
Next, we discuss the uniqueness of a locally optimal solution, that is, a stationary point of the mean squared error function (5) from a MixN. According to Li and Flury [38], the set of $n$ MSE-RPs, $n \in \mathbb{N}^+$, is unique if $|\mu|/\sigma \le 1$ for a mirrored mixture of two normal distributions ($\mathrm{MixN.M}(\mu, \sigma^2)$) defined in (1). Lemma 1 gives a sufficient condition for the uniqueness of MSE-RPs.
Lemma 1.
(Trushkin [19]). Under Conditions (1) and (2) below, if $X$ has a log-concave density $f(x)$, then, for a given $n$, there exists a unique set of MSE-RPs $Y_n = \{y_1, \dots, y_n\}$ that minimizes the mean squared error function of $Y_n$ from the distribution of $X$.
Condition (1): there exists an interval $I = (V, W)$ with $V < W$ such that
$$f(x) > 0 \ \text{for } x \in I, \qquad f(x) = 0 \ \text{for } x \notin I,$$
and $f(x)$ is continuous and positive inside $I$.
Condition (2):
$$\int_V^W (y - x)^2 f(x)\, dx < +\infty \quad \text{for any } y \in Y_n.$$
A MixN with finite variance satisfies Conditions (1) and (2) in Lemma 1. All log-concave densities are unimodal, but not necessarily symmetric. As a result, all bimodal MixN fail to satisfy the sufficient condition given in Lemma 1. Additionally, not all unimodal densities are log-concave; a counter-example is the Student-$t_r$ distribution with $r > 0$ degrees of freedom [39]. A density function $f$ on $\mathbb{R}$ is log-concave if and only if $f$ is strongly unimodal, i.e., a Pólya frequency function of order 2 [20,40]. Although normal distributions are log-concave, a convex combination of two normal distributions cannot easily satisfy this property; further conditions on the five parameters of a MixN are required. In Theorem 1, we establish a sufficient condition for the uniqueness of MSE-RPs from a MixN.
Theorem 1.
(Location mixture of two normal densities). Suppose $X \sim \mathrm{MixN.L}(\alpha, \mu_1, \mu_2, \sigma^2)$. For any $n \in \mathbb{N}^+$, the set of $n$ MSE-RPs of $X$ is unique if, for all $\alpha \in (0,1)$,
$$|\mu_1 - \mu_2| \le \sqrt{2}\,\sigma.$$
The proof of Theorem 1 is presented in Appendix A. The log-concavity condition for uniqueness is restrictive for a mixture of two normal distributions: when the strong unimodality condition is satisfied, the two mixture components overlap severely, which limits the practical application of this particular type of model.
For the purpose of determining the number of principal components in principal component analysis, the proportion of variance explained by each component is considered in practice. Similarly, the information gain (IG) defined in (7) can be used to evaluate the representativeness of a set of MSE-RPs in practice. The IG ranges from 0 to 1 ($0 \le \mathrm{IG} \le 1$), and a set of MSE-RPs is considered valid as long as its IG meets practical expectations:
$$\mathrm{IG} = 1 - \frac{\mathrm{MSE}\!\left(X \mid a_1^{(n)}, \dots, a_n^{(n)}\right)}{\mathrm{var}(X)}. \tag{7}$$
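A sketch of how (5) and (7) can be evaluated numerically is given below; the truncation bounds lo and hi and the function name are our assumptions, and the bounds should be chosen to cover essentially all of the mass of the MixN:

```python
import numpy as np
from scipy.integrate import quad

def information_gain(a, pdf, var_x, lo=-50.0, hi=50.0):
    """IG = 1 - MSE / var(X), with the MSE from (5) evaluated by quadrature."""
    a = np.asarray(a, dtype=float)
    edges = np.concatenate(([lo], (a[:-1] + a[1:]) / 2.0, [hi]))
    mse = sum(quad(lambda x, c=c: (x - c) ** 2 * pdf(x), edges[i], edges[i + 1])[0]
              for i, c in enumerate(a))
    return 1.0 - mse / var_x
```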

4. Numerical Approximations to MSE-RPs from a MixN

Let $X \sim \mathrm{MixN}(\alpha, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$ with density $f(x)$. Given a set of $n$ MSE-RPs of $X$, denoted as $-\infty < a_1^{(n)} < \dots < a_n^{(n)} < +\infty$, the objective function (5) can be expressed as
$$\mathrm{MSE}\!\left(X \mid a_1^{(n)}, \dots, a_n^{(n)}\right) = \int_{-\infty}^{+\infty} \min_{i=1,\dots,n} \left(x - a_i^{(n)}\right)^2 f(x)\, dx = \sum_{i=1}^{n} \int_{\Delta_i} \left(x - a_i^{(n)}\right)^2 f(x)\, dx, \tag{8}$$
where $\Delta_1 = \left(-\infty, \left(a_1^{(n)}+a_2^{(n)}\right)/2\right]$, $\Delta_j = \left(\left(a_{j-1}^{(n)}+a_j^{(n)}\right)/2,\ \left(a_j^{(n)}+a_{j+1}^{(n)}\right)/2\right]$ for $j = 2, \dots, n-1$, and $\Delta_n = \left(\left(a_{n-1}^{(n)}+a_n^{(n)}\right)/2, +\infty\right)$.

4.1. The k-Means Algorithm

Due to the complexity of the MixN distribution, it is challenging to generate MSE-RPs from a MixN. MSE-RPs are the theoretical counterparts of the cluster means obtained by a k-means algorithm, and the k-means algorithm is the most popular method to approximate MSE-RPs. We summarize the computation procedure of the k-means algorithm for approximating MSE-RPs from a MixN.
Step 1. Generate training samples $x_1, x_2, \dots, x_N$ with a large size $N$ from the underlying MixN.
Step 2. Initialize cluster centers $m_1^{(0)}, \dots, m_n^{(0)}$, usually by the Monte Carlo method.
Step 3. Assign each observation x in the training sample to the cluster with the nearest mean measured by the least squared Euclidean distance.
$$S_i^{(t)} = \left\{x : \left\|x - m_i^{(t)}\right\|^2 \le \left\|x - m_j^{(t)}\right\|^2 \ \forall j,\ 1 \le j \le n\right\}.$$
Points with equal distances to different cluster centers are arbitrarily assigned to one of them.
Step 4. Recalculate the mean of the observations assigned to each cluster to form a new set of cluster centers. For $i = 1, \dots, n$, calculate
$$m_i^{(t+1)} = \frac{1}{n_i^{(t)}} \sum_{x \in S_i^{(t)}} x,$$
where $n_i^{(t)}$ is the number of observations falling in $S_i^{(t)}$.
Step 5. Repeat Steps 3 and 4 until the cluster centers no longer change.
According to the k-means algorithm, the estimated $F_{mse,n}$ is formed by the k-means centers $m_i$ and the corresponding probabilities $p_i = n_i / N$ for $i = 1, \dots, n$.
Although we can generate as many training samples from the MixN as we wish, the k-means algorithm, as a non-parametric method, is not always reliable. In particular, its performance is strongly influenced by the initial values.
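For reference, a compact sketch of Steps 1–5 built on scikit-learn's KMeans is given below; the sample size, the MixN parameters, and the seed are illustrative only:

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_rps(sample, n_points, seed=0):
    """Approximate n MSE-RPs by k-means centers, with p_i = n_i / N."""
    km = KMeans(n_clusters=n_points, n_init=10, random_state=seed)
    labels = km.fit_predict(sample.reshape(-1, 1))
    order = np.argsort(km.cluster_centers_.ravel())
    centers = km.cluster_centers_.ravel()[order]
    counts = np.bincount(labels, minlength=n_points)[order]
    return centers, counts / sample.size

# Step 1: a large training sample from MixN(0.3, 1.2, 4, 0.2, 1)
rng = np.random.default_rng(0)
N = 100_000
first = rng.random(N) < 0.3
sample = np.where(first, rng.normal(1.2, np.sqrt(4.0), N), rng.normal(0.2, 1.0, N))
centers, probs = kmeans_rps(sample, n_points=10)
```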

4.2. The Fang-He Algorithm

In order to generate reliable numerical solutions of MSE-RPs from a MixN, the objective function (8) can be minimized by taking its first-order partial derivatives with respect to $a_1^{(n)}, \dots, a_n^{(n)}$. The optimal solutions to (8) can then be obtained by solving a system of $n$ non-linear equations: for $i = 1, \dots, n$,
$$\alpha \int_{\Delta_i} \left(x - a_i^{(n)}\right) \phi(x; \mu_1, \sigma_1^2)\, dx + (1-\alpha) \int_{\Delta_i} \left(x - a_i^{(n)}\right) \phi(x; \mu_2, \sigma_2^2)\, dx = 0. \tag{9}$$
For a normal distribution $\phi(x; \mu, \sigma^2)$, we have $\sigma^2 \phi'(x; \mu, \sigma^2) = (\mu - x)\,\phi(x; \mu, \sigma^2)$, which implies $\int_{\Delta} x\,\phi(x; \mu, \sigma^2)\, dx = \mu \int_{\Delta} \phi(x; \mu, \sigma^2)\, dx - \sigma^2\, \phi(x; \mu, \sigma^2)\big|_{\Delta}$. Then, the system of Equations (9) can be expressed as: for $i = 1, \dots, n$,
$$\alpha\left[\mu_1\, \Phi(x; \mu_1, \sigma_1^2)\big|_{\Delta_i} - \sigma_1^2\, \phi(x; \mu_1, \sigma_1^2)\big|_{\Delta_i}\right] + (1-\alpha)\left[\mu_2\, \Phi(x; \mu_2, \sigma_2^2)\big|_{\Delta_i} - \sigma_2^2\, \phi(x; \mu_2, \sigma_2^2)\big|_{\Delta_i}\right] - a_i^{(n)}\, F(x)\big|_{\Delta_i} = 0, \tag{10}$$
where $F$ is the cdf of $X$ and $g(x)\big|_{\Delta_i}$ denotes the increment of $g$ over the interval $\Delta_i$. It is very difficult to determine MSE-RPs analytically by solving (10) for a MixN. The strategy of Fang and He [26] is summarized below:
  • Step 1. Set an initial value $a_1^{(n)}$.
  • Step 2. Solve the 1st equation in (10) to obtain $a_2^{(n)}$ by the bisection method or the iterative Newton's method.
  • Step 3. Solve the 2nd equation in (10) to calculate $a_3^{(n)}$ based on the values of $a_1^{(n)}$ and $a_2^{(n)}$ obtained in Steps 1 and 2.
  • Step 4. Given the values of $a_{i-1}^{(n)}$ and $a_i^{(n)}$, solve the $i$th equation in (10) to get $a_{i+1}^{(n)}$ for $i = 3, \dots, n-1$.
  • Step 5. Given the solution $a_{n-1}^{(n)}$ of the $(n-2)$th equation, solve the $n$th equation in (10) to obtain another solution $a_n^{(n)*}$.
  • Step 6. Modify the initial value of $a_1^{(n)}$ and repeat the above procedure until
    $$\left|a_n^{(n)} - a_n^{(n)*}\right| < \epsilon,$$
    where $\epsilon$ represents the error tolerance, which is a very small number.
Next, we prove the convergence of the Fang-He algorithm for a MixN. The equations in (10) can be classified into three types. The Type I equation is the first equation of (10), i.e.,
$$\alpha\left[\mu_1\Phi\!\left(\frac{a_1^{(n)}+a_2^{(n)}}{2}; \mu_1, \sigma_1^2\right) - \sigma_1^2\phi\!\left(\frac{a_1^{(n)}+a_2^{(n)}}{2}; \mu_1, \sigma_1^2\right)\right] + (1-\alpha)\left[\mu_2\Phi\!\left(\frac{a_1^{(n)}+a_2^{(n)}}{2}; \mu_2, \sigma_2^2\right) - \sigma_2^2\phi\!\left(\frac{a_1^{(n)}+a_2^{(n)}}{2}; \mu_2, \sigma_2^2\right)\right] - a_1^{(n)}\, F\!\left(\frac{a_1^{(n)}+a_2^{(n)}}{2}\right) = 0. \tag{11}$$
The Type II equation is the last equation of (10), i.e.,
$$\alpha\left[\mu_1\left(1 - \Phi\!\left(\frac{a_{n-1}^{(n)}+a_n^{(n)}}{2}; \mu_1, \sigma_1^2\right)\right) + \sigma_1^2\phi\!\left(\frac{a_{n-1}^{(n)}+a_n^{(n)}}{2}; \mu_1, \sigma_1^2\right)\right] + (1-\alpha)\left[\mu_2\left(1 - \Phi\!\left(\frac{a_{n-1}^{(n)}+a_n^{(n)}}{2}; \mu_2, \sigma_2^2\right)\right) + \sigma_2^2\phi\!\left(\frac{a_{n-1}^{(n)}+a_n^{(n)}}{2}; \mu_2, \sigma_2^2\right)\right] - a_n^{(n)}\left(1 - F\!\left(\frac{a_{n-1}^{(n)}+a_n^{(n)}}{2}\right)\right) = 0. \tag{12}$$
The Type III equations are the 2nd to the $(n-1)$th equations of (10), i.e., for $i = 2, \dots, n-1$,
$$\begin{aligned} &\alpha\mu_1\left[\Phi\!\left(\frac{a_i^{(n)}+a_{i+1}^{(n)}}{2}; \mu_1, \sigma_1^2\right) - \Phi\!\left(\frac{a_{i-1}^{(n)}+a_i^{(n)}}{2}; \mu_1, \sigma_1^2\right)\right] + \alpha\sigma_1^2\left[\phi\!\left(\frac{a_{i-1}^{(n)}+a_i^{(n)}}{2}; \mu_1, \sigma_1^2\right) - \phi\!\left(\frac{a_i^{(n)}+a_{i+1}^{(n)}}{2}; \mu_1, \sigma_1^2\right)\right] \\ &\quad + (1-\alpha)\mu_2\left[\Phi\!\left(\frac{a_i^{(n)}+a_{i+1}^{(n)}}{2}; \mu_2, \sigma_2^2\right) - \Phi\!\left(\frac{a_{i-1}^{(n)}+a_i^{(n)}}{2}; \mu_2, \sigma_2^2\right)\right] + (1-\alpha)\sigma_2^2\left[\phi\!\left(\frac{a_{i-1}^{(n)}+a_i^{(n)}}{2}; \mu_2, \sigma_2^2\right) - \phi\!\left(\frac{a_i^{(n)}+a_{i+1}^{(n)}}{2}; \mu_2, \sigma_2^2\right)\right] \\ &\quad - a_i^{(n)}\left[F\!\left(\frac{a_i^{(n)}+a_{i+1}^{(n)}}{2}\right) - F\!\left(\frac{a_{i-1}^{(n)}+a_i^{(n)}}{2}\right)\right] = 0. \end{aligned} \tag{13}$$
Next, we discuss the properties of the Type I–III equations, respectively. Theorem 2 shows that there is a unique and non-decreasing solution $a_2^{(n)}$ within a specific searching range of $a_1^{(n)}$. Part (i) of Theorem 2 proves that the Type I equation has a unique solution $a_2^{(n)}$ if we set the initial searching range of $a_1^{(n)}$ with an upper bound of $E(X)$. Note that $E(X)$ is the MSE-RP when $n = 1$, so it is feasible to search for the solution $a_2^{(n)}$ given $a_1^{(n)}$ in (14) for $n \ge 2$. Based on Part (ii), the solution $a_2^{(n)}$ is strictly increasing with respect to $a_1^{(n)}$ under condition (15). In practice, condition (15) is easy to meet and convenient to verify.
Theorem 2.
Given $n \ge 2$, denote $E(X)$ as the population mean of $X \sim \mathrm{MixN}(\alpha, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$.
(i) The Type I Equation (11) in (10) has a unique solution $a_2^{(n)} = h_1(a_1^{(n)})$, if and only if,
$$-\infty < a_1^{(n)} < E(X). \tag{14}$$
(ii) $a_2^{(n)} = h_1(a_1^{(n)})$ is a strictly increasing function with respect to $a_1^{(n)}$ if
$$\frac{a_2^{(n)} - a_1^{(n)}}{4} \cdot f\!\left(\frac{a_1^{(n)}+a_2^{(n)}}{2}\right) - F\!\left(\frac{a_1^{(n)}+a_2^{(n)}}{2}\right) < 0 \tag{15}$$
for $a_1^{(n)} \in (-\infty, E(X))$.
The proof of Theorem 2 is presented in Appendix A. Due to the complexity of the density of MixN, it is difficult to prove that the inequality (15) holds over the entire searching range of a 1 ( n ) ( , E ( X ) ) . Remarks 1 and 2 give sufficient conditions for (15) to be true.
Remark 1.
When $\mu_1 = \mu_2$, $a_2^{(n)} = h_1(a_1^{(n)})$ is a strictly increasing function with respect to $a_1^{(n)}$ if
$$E(X) - \sqrt{2}\max\{\sigma_1, \sigma_2\} < a_1^{(n)} < E(X).$$
Remark 2.
When $\mu_2 < \mu_1 < \mu_2 + \sigma_1/(1-\alpha)$, $a_2^{(n)} = h_1(a_1^{(n)})$ is a strictly increasing function with respect to $a_1^{(n)}$ if
$$\mu_2 < a_1^{(n)} < E(X).$$
Theorem 3 proves that the last equation in the non-linear system (10) has a unique and non-decreasing solution $a_n^{(n)}$ given $a_{n-1}^{(n)}$. The proof of Theorem 3 is presented in Appendix A.
Theorem 3.
Given $n \ge 2$ and $a_1^{(n)} \in (-\infty, E(X))$, the Type II Equation (12) in (10) has a unique solution $a_n^{(n)} = h^*(a_{n-1}^{(n)})$, and $h^*(a_{n-1}^{(n)})$ is a strictly increasing function with respect to $a_{n-1}^{(n)}$.
In the case of $n = 2$, the system of Equations (10) reduces to
$$\begin{aligned} &\alpha\left[\mu_1\Phi\!\left(\frac{a_1^{(2)}+a_2^{(2)}}{2}; \mu_1, \sigma_1^2\right) - \sigma_1^2\phi\!\left(\frac{a_1^{(2)}+a_2^{(2)}}{2}; \mu_1, \sigma_1^2\right)\right] + (1-\alpha)\left[\mu_2\Phi\!\left(\frac{a_1^{(2)}+a_2^{(2)}}{2}; \mu_2, \sigma_2^2\right) - \sigma_2^2\phi\!\left(\frac{a_1^{(2)}+a_2^{(2)}}{2}; \mu_2, \sigma_2^2\right)\right] - a_1^{(2)}\, F\!\left(\frac{a_1^{(2)}+a_2^{(2)}}{2}\right) = 0, \\ &\alpha\left[\mu_1\left(1 - \Phi\!\left(\frac{a_1^{(2)}+a_2^{(2)}}{2}; \mu_1, \sigma_1^2\right)\right) + \sigma_1^2\phi\!\left(\frac{a_1^{(2)}+a_2^{(2)}}{2}; \mu_1, \sigma_1^2\right)\right] + (1-\alpha)\left[\mu_2\left(1 - \Phi\!\left(\frac{a_1^{(2)}+a_2^{(2)}}{2}; \mu_2, \sigma_2^2\right)\right) + \sigma_2^2\phi\!\left(\frac{a_1^{(2)}+a_2^{(2)}}{2}; \mu_2, \sigma_2^2\right)\right] - a_2^{(2)}\left(1 - F\!\left(\frac{a_1^{(2)}+a_2^{(2)}}{2}\right)\right) = 0. \end{aligned} \tag{16}$$
For any given $a_1^{(2)}$ that satisfies the conditions in Theorem 2, we can obtain $a_2^{(2)} = h_1(a_1^{(2)})$ from the first equation in (16) and $a_2^{(2)*} = h^*(a_1^{(2)})$ from the second equation in (16), respectively. Following the procedure of the Fang-He algorithm, the iteration stops once $|a_2^{(2)} - a_2^{(2)*}| < \epsilon$. For example, for $X \sim \mathrm{MixN}(\alpha = 0.7, \mu_1 = 4.6, \sigma_1^2 = 2, \mu_2 = 1.5, \sigma_2^2 = 1)$, we obtain a set of 2 MSE-RPs $a_1^{(2)} = 0.552945$ and $a_2^{(2)} = 1.989754$ with probabilities $p_1^{(2)} = 0.696014$ and $p_2^{(2)} = 0.303986$, respectively. The IG of this set of two MSE-RPs is about $59.18\%$.
When $n \ge 3$, we wish to obtain $a_{i+1}^{(n)} = h_i(a_1^{(n)})$ from the Type III equations in (10), based on $a_1^{(n)}$ and $a_2^{(n)} = h_1(a_1^{(n)})$, for $i = 2, \dots, n-1$. Theorem 4 proves that, when $n = 3$, the Type III equation has a unique solution $a_3^{(3)}$ given $a_1^{(3)}$ in the range $(-\infty, a_1^{(2)})$, where $a_1^{(2)}$ is the first point of the set of two MSE-RPs.
Theorem 4.
When $n = 3$, the Type III Equation (13) in (10) has a unique solution $a_3^{(3)} = h_2(a_1^{(3)})$, if and only if,
$$a_1^{(3)} < a_1^{(2)}.$$
The proof of Theorem 4 is presented in Appendix A. According to Theorem 4, for a given $a_1^{(n)}$ with $n > 3$, we obtain $a_2^{(n)} = h_1(a_1^{(n)}),\ a_3^{(n)} = h_2(a_1^{(n)}),\ \dots,\ a_n^{(n)} = h_{n-1}(a_1^{(n)})$ from the 1st, 2nd, …, $(n-1)$th equations of (10), in turn. Then, the Type III Equation (13) has a unique solution $a_j^{(n)} = h_{j-1}(a_1^{(n)})$ if and only if $a_1^{(n)} < a_1^{(j-1)}$ for $j = 3, \dots, n$.
As discussed above, given an initial value of $a_1^{(n)}$, we can calculate the values of $a_{i+1}^{(n)} = h_i(a_1^{(n)})$ for $i = 2, \dots, n-1$, as well as the value of $a_n^{(n)*} = h^*(a_{n-1}^{(n)})$ from the Type II equation. Based on Theorems 2–4, the non-linear system (10) has a unique solution. Given $a_1^{(n)}$, when $a_n^{(n)*}$ is significantly smaller than $a_n^{(n)}$, we can reduce the initial value of $a_1^{(n)}$ to narrow the difference between $a_n^{(n)}$ and $a_n^{(n)*}$; when $a_n^{(n)*}$ is significantly larger than $a_n^{(n)}$, we can increase $a_1^{(n)}$ to shrink the difference between the two approximations of the last point.
In order to improve the efficiency of the algorithm, we need more specific rules for the searching range of the initial value $a_1^{(n)}$ and for the reduction of this range. Set an initial searching region of $a_1^{(n)}$, denoted as $(\mathrm{LB}, \mathrm{UB})$. According to Theorem 2, we can set the upper bound $\mathrm{UB} = E(X)$ to ensure a unique solution of the Type I equation and the monotonicity of $h_1(a_1^{(n)})$. For finding a fixed number of MSE-RPs, we recommend an initial lower bound of $\mathrm{LB} = \min\{\mu_1 - 4\sigma_1, \mu_2 - 4\sigma_2\}$. The initial value given in Step 1 is $a_1^{(n)} = 0.5(\mathrm{LB} + \mathrm{UB})$. Then, in Step 6, we follow the rules below for the iteration:
  • If $\left|a_n^{(n)} - a_n^{(n)*}\right| < \epsilon$, the desired solution set is obtained.
  • If $a_n^{(n)} < a_n^{(n)*} - \epsilon$, the initial value of $a_1^{(n)}$ is too small. Let $\mathrm{LB} = a_1^{(n)}$, set $a_1^{(n)} = 0.5(\mathrm{LB} + \mathrm{UB})$, and go back to Step 1.
  • If $a_n^{(n)} > a_n^{(n)*} + \epsilon$, the initial value of $a_1^{(n)}$ is too large. Let $\mathrm{UB} = a_1^{(n)}$, set $a_1^{(n)} = 0.5(\mathrm{LB} + \mathrm{UB})$, and go back to Step 1.
Therefore, the algorithm converges, as the searching interval for $a_1^{(n)}$ is halved at each iteration. When finding a set of $n = 3$ MSE-RPs, the initial upper bound $E(X)$ is reduced to $a_1^{(2)}$; iteratively, the initial upper bound for $a_1^{(n+1)}$ is adjusted to $a_1^{(n)}$ when computing a set of $n+1$ MSE-RPs. As a result, the Fang-He algorithm effectively finds the unique solution to the nonlinear system (10). A self-contained sketch of the full procedure is given below.
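The following Python sketch implements the procedure under the assumptions above. The class and function names are ours, and the bracket-expansion step and the treatment of a failed chain are our own implementation choices, not prescribed by the paper:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

class MixN:
    """Helper for MixN(alpha, mu1, sigma1^2, mu2, sigma2^2)."""
    def __init__(self, alpha, mu1, var1, mu2, var2):
        self.w = np.array([alpha, 1.0 - alpha])
        self.mu = np.array([mu1, mu2], dtype=float)
        self.sd = np.sqrt([var1, var2])

    def cdf(self, x):
        return sum(w * norm.cdf(x, m, s) for w, m, s in zip(self.w, self.mu, self.sd))

    def partial_mean(self, l, r):
        # integral of x f(x) dx over (l, r), via mu*(Phi(r)-Phi(l)) - sigma^2*(phi(r)-phi(l)); cf. (10)
        return sum(w * (m * (norm.cdf(r, m, s) - norm.cdf(l, m, s))
                        - s ** 2 * (norm.pdf(r, m, s) - norm.pdf(l, m, s)))
                   for w, m, s in zip(self.w, self.mu, self.sd))

def fang_he(dist, n, tol=1e-8, max_iter=500):
    """Numerical MSE-RPs of a MixN for n >= 2 by the Fang-He bisection scheme."""
    ub = float(dist.w @ dist.mu)                       # UB = E(X) (Theorem 2)
    lb = float(np.min(dist.mu - 4.0 * dist.sd))        # recommended initial LB
    for _ in range(max_iter):
        a1 = 0.5 * (lb + ub)
        a, chain_failed = [a1], False
        for i in range(n - 1):
            l = -np.inf if i == 0 else 0.5 * (a[i - 1] + a[i])
            # i-th equation of (10): integral over (l, r) of (x - a_i) f(x) dx = 0
            # in r, with r = (a_i + a_{i+1}) / 2; then a_{i+1} = 2r - a_i
            g = lambda r: dist.partial_mean(l, r) - a[i] * (dist.cdf(r) - dist.cdf(l))
            hi = a[i] + float(np.max(dist.sd))
            while g(hi) < 0.0:                         # expand bracket to the right
                hi = a[i] + 2.0 * (hi - a[i])
                if hi - a[i] > 1e8:                    # no root: a_1 was too large
                    chain_failed = True
                    break
            if chain_failed:
                break
            r = brentq(g, a[i], hi)
            a.append(2.0 * r - a[i])
        if chain_failed:
            ub = a1
            continue
        ln = 0.5 * (a[-2] + a[-1])
        a_star = dist.partial_mean(ln, np.inf) / (1.0 - dist.cdf(ln))  # Type II
        if abs(a[-1] - a_star) < tol:
            return np.array(a)
        if a[-1] < a_star:
            lb = a1                                    # a_1 too small
        else:
            ub = a1                                    # a_1 too large
    raise RuntimeError("Fang-He bisection did not converge")

# illustrative call: five MSE-RPs of a mirrored mixture MixN.M(1, 1)
rps = fang_he(MixN(0.5, -1.0, 1.0, 1.0, 1.0), n=5)
```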
The first 30 MSE-RPs of $\mathrm{MixN}(0.7, 8, 10, 1.5, 1)$ generated by the above computation procedure are presented in Table 1. The first column of Table 1 gives the information gain (IG). The corresponding probabilities of the MSE-RPs are given in Table 2. The density of $\mathrm{MixN}(0.7, 8, 10, 1.5, 1)$ is presented in Figure 2(C5).

5. Numerical Studies

Based on the six underlying distributions listed in Table 3, we compare the Fang-He algorithm with the k-means algorithm and apply MSE-RPs to the kernel density estimation. The corresponding densities of the six distributions are presented in Figure 2.
The distribution C1 is a location mixture that satisfies the log-concavity condition of a unique set of MSE-RPs. The distributions C2 and C3 represent classic mirrored mixtures with unimodal and bimodal distributions, respectively. The distributions C4–C6 are general location–scale mixtures. In particular, C5 and C6 have large population variances.

5.1. Algorithm Comparisons

In this subsection, the Fang-He algorithm is compared with the k-means algorithm. We generate MSE-RPs from the distributions C1–C6 for point sizes of $n = 2, 3, 4, 5, 10, 15$, and 30, respectively. An error tolerance of $\epsilon = 10^{-5}$ is applied to both algorithms. The training sample size for the k-means algorithm is 100,000.
An optimal solution satisfies the non-linear system (10), so the total error of a numerical solution can be calculated by substituting the numerical MSE-RPs into the left-hand side of each equation in (10) and summing the resulting residuals. Table 4 summarizes the total errors of the two algorithms in checking the non-linear system (10). It is evident that the Fang-He algorithm provides more accurate numerical approximations of the MSE-RPs from a MixN than the k-means algorithm.
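A sketch of this check, reusing the MixN helper from Section 4.2, is given below; summing absolute residuals is our reading of the total error, as the paper does not spell out the exact aggregation:

```python
import numpy as np

def total_error(dist, a):
    """Sum of absolute left-hand-side residuals of (10) at the points a."""
    a = np.asarray(a, dtype=float)
    edges = np.concatenate(([-np.inf], (a[:-1] + a[1:]) / 2.0, [np.inf]))
    res = [dist.partial_mean(edges[i], edges[i + 1])
           - a[i] * (dist.cdf(edges[i + 1]) - dist.cdf(edges[i]))
           for i in range(a.size)]
    return float(np.sum(np.abs(res)))
```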
Table 5 summarizes the information gain (7) in percentage (IG%) of the numerical solutions generated by the Fang-He algorithm and the k-means algorithm, respectively. In all distributions and point sizes, the Fang-He algorithm generates MSE-RPs with higher IG.
Our numerical studies indicate that the Fang-He algorithm generates MSE-RPs from MixN with a high level of accuracy and efficiency.

5.2. Kernel Density Estimations

In many data-transmission systems, analog input signals are first converted to digital form at the transmitter, transmitted in digital form, and finally reconstituted at the receiver as analog signals [17]. Suppose the input signal $X \sim \mathrm{MixN}(\alpha, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$ with cdf $F(x)$ and density $f(x)$, and the output signal is a set of $n$ points. Three methods for selecting points from a MixN signal are compared in this section: the Monte Carlo method (MC), the quasi-Monte Carlo method (QMC), and the minimum mean squared error method (MSE).
In the MC setting, the cdf $F(x)$ of $X$ can be estimated by the empirical cdf of $X_1, \dots, X_n$:
$$F_n(x) = \frac{1}{n} \sum_{i=1}^{n} 1\{X_i \le x\}.$$
Every support point in F n has the same probability of 1 / n . The MC method is often criticized for its slow convergence. In particular, given a fixed number of points n, random samples from a MixN are usually not inclusive enough and often do not accurately represent the full mixture population.
The quasi-Monte Carlo (QMC) methods have an asymptotically faster convergence rate than MC, as demonstrated in the field of numerical integration. In the univariate case, the QMC methods sample points that are uniformly scattered on the interval $[0,1]$ and aim to select support points from $F(x)$ with minimum $F$-discrepancy. Define a set of $n$ QMC points, $n \ge 1$, as
$$Q = \left\{\frac{2j-1}{2n},\ j = 1, \dots, n\right\}.$$
In the number-theoretic method, this set of QMC points is proved to have minimal discrepancy among all possible sets of $n$ points lying in the interval $(0,1)$ [41]. Assuming the inverse function of $F$ exists, the point set
$$\left\{F^{-1}\!\left(\frac{2j-1}{2n}\right),\ j = 1, \dots, n\right\}$$
is proved to have the minimal $F$-discrepancy of $1/(2n)$ from $F(x)$ [42]. Hence, the approximating distribution of $F(x)$ in the QMC setting is defined as
$$F_{qmc,n}(x) = \frac{1}{n} \sum_{i=1}^{n} 1\!\left\{F^{-1}(u_i) \le x\right\},$$
where $u_1, \dots, u_n$ are the QMC points. Each support point in $F_{qmc,n}$ is not random but still has the same probability $1/n$.
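A sketch of the QMC construction for a MixN, inverting the cdf numerically with brentq and reusing the MixN helper from Section 4.2 (the bracket width of ten standard deviations is a heuristic of ours):

```python
import numpy as np
from scipy.optimize import brentq

def qmc_points(dist, n):
    """F^{-1}((2j-1)/(2n)) for j = 1..n, for the MixN helper class."""
    u = (2.0 * np.arange(1, n + 1) - 1.0) / (2.0 * n)
    lo = float(np.min(dist.mu - 10.0 * dist.sd))
    hi = float(np.max(dist.mu + 10.0 * dist.sd))
    return np.array([brentq(lambda x, t=t: dist.cdf(x) - t, lo, hi) for t in u])
```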
For the best representation of MixN, the minimized mean squared method generates MSE-RPs from F with minimum mean squared error.
Next, we perform kernel density estimation to reconstitute the output signal at the receiver as an analog signal. The kernel density estimation method was proposed in [43,44]. Given a fixed number of points $x_1, \dots, x_n$ from the original signal, the density estimate of $f(x)$ is given by
$$\hat{f}_h(x) = \frac{1}{n} \sum_{i=1}^{n} k_h(x - x_i) = \frac{1}{nh} \sum_{i=1}^{n} k\!\left(\frac{x - x_i}{h}\right),$$
where $k_h(y) = \frac{1}{h} k(y/h)$. We apply the standard normal kernel $k(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}$ in the simulation studies. In the MSE method, the points $x_1, \dots, x_n$ may have different probabilities, and the density estimate of $f(x)$ can be extended to
$$\hat{f}_h(x) = \sum_{i=1}^{n} k_h(x - x_i)\, p_i = \frac{1}{h} \sum_{i=1}^{n} k\!\left(\frac{x - x_i}{h}\right) p_i.$$
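A minimal sketch of this weighted kernel estimate follows; for MC and QMC points set $p_i = 1/n$, and the bandwidth $h$ must be supplied by the user (the function name is ours):

```python
import numpy as np

def weighted_kde(x, points, probs, h):
    """Weighted kernel density estimate with a standard normal kernel."""
    x = np.atleast_1d(np.asarray(x, dtype=float))[:, None]
    z = (x - np.asarray(points, dtype=float)[None, :]) / h
    k = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return (k * np.asarray(probs, dtype=float)[None, :]).sum(axis=1) / h
```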
We compare the three sampling methods at point sizes of 10, 15, and 30 for kernel density estimation over the C1–C6 distributions. Our simulation results indicate that MSE-RP-based kernel estimation is highly effective, followed by the QMC-based estimation.
For the underlying distribution C6, the density comparisons are given in Figure 3, and the corresponding $L_2$-distances between the estimated kernel densities and the population density $f(x)$ are summarized in Table 6. The density estimate obtained from 30 Monte Carlo sampling points is inaccurate: its $L_2$-distance of 0.2082 shows that it is not sufficient to reconstitute the density of C6. Our simulation results also show that the MC method is affected by randomness; increasing the sample size from 10 to 15 results in a larger $L_2$-distance. In comparison with the MC method, the QMC method significantly improves the estimation results, and the MSE method offers better performance in density fitting than the QMC method. According to Table 6, kernel density estimation based on 15 MSE points is more accurate than that based on 30 QMC points. Consistent results are observed for the estimation of the distribution C3 in Figure 4 and Table 7.
Based on the underlying distributions C1, C2, C4, and C5, Figure 5 and Table 8 present comparisons of the three methods in density fitting at a point size of $n = 30$. The MSE method is capable of reconstituting the four different MixN densities with low $L_2$-distances.

5.3. Real Data Example

The proposed algorithm and the application of MSE-RPs in density estimation are demonstrated through the analysis of the cell nucleus data from the diagnostic database of the University of Wisconsin Clinical Sciences Center [45]. Ref. [12] fitted the perimeter of the cell nucleus using MixN models based on the revised penalized maximum likelihood estimation. The MixN model presented below is taken from [12] and is used for further analysis:
$$\mathrm{MixN}(0.5806, 75.4189, 89.4452, 117.1346, 667.5068). \tag{17}$$
We generate 50 MSE-RPs from the MixN model (17) using both the Fang-He algorithm and the k-means algorithm. Considering the large variance in the model (17), five million training samples are used for the k-means algorithm. Table 9 presents the comparisons of the two algorithms. The MSE-RPs generated by the k-means algorithm have a total error of 0.0273, which does not satisfy the non-linear system (10). By using the Fang-He algorithm, we are able to generate more accurate numerical approximations of the 50 MSE-RPs, with a smaller mean squared error.
We also generate 50 QMC points and 50 MC points from the model (17). Table 10 summarizes the estimates of the mean, variance, skewness, and kurtosis of the distribution (17) based on the MC, QMC, and MSE methods. The first row of Table 10 displays the first four moments of the underlying distribution (17). The MSE-RPs generated by the Fang-He algorithm have the same mean as the population expectation, which is consistent with the properties of MSE-RPs described in (6). The MSE-RPs obtained using the k-means algorithm have a bias of 0.0054 from the population expectation; additionally, they have a higher bias in terms of variance, skewness, and kurtosis than the MSE-RPs obtained by the Fang-He algorithm. Therefore, the Fang-He algorithm is considered the more effective computational procedure for numerical approximations of MSE-RPs from a MixN. When comparing MSE-RPs with QMC points and MC points, the MSE method estimates the moments of the model more accurately.
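The weighted moment estimates in Table 10 can be computed from a point set and its probabilities as follows (a sketch; for MC and QMC points the weights are simply $1/n$):

```python
import numpy as np

def weighted_moments(a, p):
    """Mean, variance, skewness, and kurtosis of a weighted point set."""
    a = np.asarray(a, dtype=float)
    p = np.asarray(p, dtype=float)
    m = float(p @ a)
    v = float(p @ (a - m) ** 2)
    skew = float(p @ (a - m) ** 3) / v ** 1.5
    kurt = float(p @ (a - m) ** 4) / v ** 2
    return m, v, skew, kurt
```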
A comparison of the kernel density estimates based on MC, QMC, and MSE methods are presented in Figure 6. The corresponding L 2 -distances between the kernel estimates and the density of the model (17) are 0.0789 for MC, 0.0382 for QMC and 0.0215 for MSE.

6. Discussion

The Fang-He algorithm is recommended for numerical approximations of MSE-RPs from a mixture of two-component normal distributions. In this paper, we investigated the properties of the non-linear system used for generating MSE-RPs from a MixN. The Fang-He algorithm is effective in finding the numerical solution of the non-linear system (10), as the upper bound of the searching range of $a_1^{(n)}$ is gradually narrowed down from the population expectation $E(X)$ as the number of points $n$ increases. The results of our numerical studies confirm that the Fang-He algorithm is capable of providing accurate and reliable MSE-RPs.
Mixtures of two-component normal distributions have applications across many disciplines. As demonstrated in this paper, MSE-RPs from MixN provide outstanding performance for kernel density estimation. Further applications of MSE-RPs based on a Gaussian mixture model are being investigated for future research.

Author Contributions

Conceptualization, K.-T.F., P.H. and H.P.; methodology, K.-T.F., Y.L. and P.H.; software, Y.L.; validation, P.H.; writing—original draft preparation, Y.L.; writing—review and editing, K.-T.F., P.H. and H.P. All authors have read and agreed to the published version of the manuscript.

Funding

Our work was supported in part by the Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College (2022B1212010006) and in part by the Internal Research Grant (R202010) of BNU-HKBU United International College.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The cell nucleus data from the diagnostic database of the University of Wisconsin Clinical Sciences Center is available at https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29 (accessed on 28 August 2022).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

In this appendix, we provide the proofs of Theorems 1–4. For simplicity, we denote $\phi_1(x)$ and $\Phi_1(x)$ as the pdf and cdf of $N(\mu_1, \sigma_1^2)$, respectively, and $\phi_2(x)$ and $\Phi_2(x)$ as the pdf and cdf of $N(\mu_2, \sigma_2^2)$, respectively.
Proof of Theorem 1.
Let a random variable $X \sim \mathrm{MixN}(\alpha, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$ with the pdf
$$f(x) = \alpha\,\phi_1(x) + (1-\alpha)\,\phi_2(x).$$
According to the second-order concavity test, $f(x)$ is log-concave if
$$f''(x)\,f(x) - \left(f'(x)\right)^2 < 0$$
for all $x \in \mathbb{R}$. The first-order derivative of $f(x)$ is
$$f'(x) = -\alpha\,\frac{x-\mu_1}{\sigma_1^2}\,\phi_1(x) - (1-\alpha)\,\frac{x-\mu_2}{\sigma_2^2}\,\phi_2(x).$$
The second-order derivative of $f(x)$ is
$$f''(x) = \alpha\left[\frac{(x-\mu_1)^2}{\sigma_1^4} - \frac{1}{\sigma_1^2}\right]\phi_1(x) + (1-\alpha)\left[\frac{(x-\mu_2)^2}{\sigma_2^4} - \frac{1}{\sigma_2^2}\right]\phi_2(x).$$
Accordingly, we have
$$f''(x)\,f(x) = \alpha^2\phi_1^2(x)\left[\frac{(x-\mu_1)^2}{\sigma_1^4} - \frac{1}{\sigma_1^2}\right] + (1-\alpha)^2\phi_2^2(x)\left[\frac{(x-\mu_2)^2}{\sigma_2^4} - \frac{1}{\sigma_2^2}\right] + \alpha(1-\alpha)\phi_1(x)\phi_2(x)\left[\frac{(x-\mu_1)^2}{\sigma_1^4} + \frac{(x-\mu_2)^2}{\sigma_2^4} - \frac{1}{\sigma_1^2} - \frac{1}{\sigma_2^2}\right], \tag{A1}$$
and
$$\left(f'(x)\right)^2 = \alpha^2\phi_1^2(x)\,\frac{(x-\mu_1)^2}{\sigma_1^4} + (1-\alpha)^2\phi_2^2(x)\,\frac{(x-\mu_2)^2}{\sigma_2^4} + \alpha(1-\alpha)\phi_1(x)\phi_2(x)\,\frac{2(x-\mu_1)(x-\mu_2)}{\sigma_1^2\sigma_2^2}. \tag{A2}$$
By (A1) and (A2), we would like to have
$$-\frac{\alpha^2\phi_1^2(x)}{\sigma_1^2} - \frac{(1-\alpha)^2\phi_2^2(x)}{\sigma_2^2} - \alpha(1-\alpha)\phi_1(x)\phi_2(x)\left[\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} - \left(\frac{x-\mu_1}{\sigma_1^2} - \frac{x-\mu_2}{\sigma_2^2}\right)^2\right] < 0. \tag{A3}$$
If $\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2} - \left(\frac{x-\mu_1}{\sigma_1^2} - \frac{x-\mu_2}{\sigma_2^2}\right)^2 \ge 0$ holds for all $x \in \mathbb{R}$, then (A3) holds. Assume that $\sigma_1^2 = \sigma_2^2 = \sigma^2$; the expression reduces to $\frac{2}{\sigma^2} - \left(\frac{\mu_2-\mu_1}{\sigma^2}\right)^2$, which is non-negative if $|\mu_1 - \mu_2| \le \sqrt{2}\,\sigma$, given any $\alpha \in (0,1)$. □
Proof of Theorem 2.
Part (i). Denote the left-hand side of (11) as a function of two variables, i.e.,
$$G_{u,v}(u,v) = \alpha\left[\mu_1\Phi_1\!\left(\frac{u+v}{2}\right) - \sigma_1^2\phi_1\!\left(\frac{u+v}{2}\right)\right] + (1-\alpha)\left[\mu_2\Phi_2\!\left(\frac{u+v}{2}\right) - \sigma_2^2\phi_2\!\left(\frac{u+v}{2}\right)\right] - u\,F\!\left(\frac{u+v}{2}\right),$$
where $u \in \mathbb{R}$ and $v \in (u, +\infty)$ represent $a_1^{(n)}$ and $a_2^{(n)}$, respectively, for short. Define an intermediate function of $u$ by setting $v = u$ in $G_{u,v}(u,v)$ as
$$G_u(u) = \alpha\left[\mu_1\Phi_1(u) - \sigma_1^2\phi_1(u)\right] + (1-\alpha)\left[\mu_2\Phi_2(u) - \sigma_2^2\phi_2(u)\right] - u\,F(u). \tag{A4}$$
From (A4), it is easy to verify that $G_u(u) \to 0$ as $u \to -\infty$ and $G_u'(u) = -F(u) < 0$ for $u \in \mathbb{R}$. Hence, $G_u(u) < 0$ for $u \in \mathbb{R}$, which implies
$$\lim_{v \to u^+} G_{u,v}(u,v) < 0. \tag{A5}$$
The first-order partial derivative of $G_{u,v}$ with respect to $v$ is positive, as $-\infty < u < v < \infty$,
$$\frac{\partial G_{u,v}(u,v)}{\partial v} = \frac{v-u}{4}\, f\!\left(\frac{u+v}{2}\right) > 0. \tag{A6}$$
By (A5) and (A6), the function $G_{u,v}(u,v) = 0$ has a unique solution of $v$, if and only if,
$$\lim_{v \to +\infty} G_{u,v}(u,v) = E(X) - u > 0,$$
where $E(X)$ is the population mean of $X \sim \mathrm{MixN}(\alpha, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$. Therefore, when $a_1^{(n)} < E(X)$, the Type I Equation (11) has a unique solution of $a_2^{(n)}$.
Part (ii). Given $v = h_1(u)$, according to the implicit function theorem, we have
$$\frac{\partial G_{u,v}}{\partial u}(u, h_1(u)) + \frac{\partial G_{u,v}}{\partial v}(u, h_1(u))\, h_1'(u) = 0,$$
which implies
$$h_1'(u) = -\frac{\partial G_{u,v}(u,v)/\partial u}{\partial G_{u,v}(u,v)/\partial v}. \tag{A7}$$
Given (A6), if (A8) $< 0$ for all $u \in (-\infty, E(X))$, then (A7) $> 0$ and $v = h_1(u)$ is a strictly increasing function with respect to $u$, where
$$\frac{\partial G_{u,v}(u,v)}{\partial u} = \frac{v-u}{4}\, f\!\left(\frac{u+v}{2}\right) - F\!\left(\frac{u+v}{2}\right). \tag{A8}$$
Next, the conditions under which (A8) is negative are discussed. Denote $g_{u,v}(u,v) = \partial G_{u,v}(u,v)/\partial u$. Given $u \in (-\infty, E(X))$, we have $\lim_{v \to u^+} g_{u,v}(u,v) = -F(u) < 0$, and the partial derivative of $g_{u,v}(u,v)$ with respect to $v$ is
$$\frac{\partial g_{u,v}(u,v)}{\partial v} = \frac{v-u}{8}\, f'\!\left(\frac{u+v}{2}\right) - \frac{1}{4}\, f\!\left(\frac{u+v}{2}\right) = -\frac{\alpha}{4}\,\phi_1\!\left(\frac{u+v}{2}\right)\left[\frac{(v-u)(u+v-2\mu_1)}{4\sigma_1^2} + 1\right] - \frac{1-\alpha}{4}\,\phi_2\!\left(\frac{u+v}{2}\right)\left[\frac{(v-u)(u+v-2\mu_2)}{4\sigma_2^2} + 1\right]. \tag{A9}$$
Hence, $g_{u,v}(u,v)$ is strictly decreasing in $v$, i.e., (A9) $< 0$, if both (A10) and (A11) hold:
$$\frac{(v-u)(u+v-2\mu_1)}{4\sigma_1^2} + 1 > 0, \tag{A10}$$
$$\frac{(v-u)(u+v-2\mu_2)}{4\sigma_2^2} + 1 > 0. \tag{A11}$$
Remark A1.
Assume that $\mu_1 = \mu_2 = E(X)$. If $v \ge E(X)$, (A10) and (A11) hold. Now suppose $c < u < v < E(X)$, where $c > -\infty$ is a lower bound of the range of $u$. Let $\sigma^2 = \max\{\sigma_1^2, \sigma_2^2\}$; if $(v-u)\left(u+v-2E(X)\right) + 4\sigma^2 > 0$, which is equivalent to
$$u > v - \frac{4\sigma^2}{2E(X) - (u+v)}, \tag{A12}$$
then (A10) and (A11) hold. Since $u + v > 2u$ and $v < E(X)$, we have
$$E(X) - \frac{2\sigma^2}{E(X)-u} > v - \frac{2\sigma^2}{E(X)-u} > v - \frac{4\sigma^2}{2E(X)-(u+v)}.$$
Then, if $u > E(X) - \frac{2\sigma^2}{E(X)-u}$, which is equivalent to $u > E(X) - \sqrt{2}\,\sigma$, (A12) holds. Therefore, when $\mu_1 = \mu_2$, $a_2^{(n)} = h_1(a_1^{(n)})$ is a strictly increasing function with respect to $a_1^{(n)}$ for $a_1^{(n)} \in \left(E(X) - \sqrt{2}\,\sigma,\ E(X)\right)$.
Remark A2.
Assume that $\mu_2 < \mu_1$, so that $\mu_2 < E(X) < \mu_1$. Suppose $c < u < E(X)$, where $c > -\infty$ is a lower bound for the range of $u$. If $c \ge \mu_2$, then $u + v - 2\mu_2 > 2(c - \mu_2) \ge 0$, and (A11) holds.
When $u + v - 2\mu_1 \ge 0$, (A10) holds, and we have $v \ge 2\mu_1 - u > \mu_1 > E(X) > u > c \ge \mu_2$ because $v \ge 2\mu_1 - u > 2\mu_1 - E(X)$ and $\mu_1 > E(X)$. Therefore, when $u + v - 2\mu_1 \ge 0$, $v = h_1(u)$ is strictly increasing with respect to $u$ if $u \in (c, E(X)) \subseteq (\mu_2, E(X))$, where $c \ge \mu_2$.
When $u + v - 2\mu_1 < 0$, we have $2\mu_1 - u > v > u$, and (A10) holds if
$$0 < v - u < \frac{4\sigma_1^2}{2\mu_1 - (u+v)}. \tag{A14}$$
Since $u + v > 2u$, we have $\frac{2\sigma_1^2}{\mu_1 - u} < \frac{4\sigma_1^2}{2\mu_1 - (u+v)}$. Then, if $v - u < \frac{2\sigma_1^2}{\mu_1 - u}$, that is,
$$u > v - \frac{2\sigma_1^2}{\mu_1 - u}, \tag{A15}$$
the inequality (A14) holds. Since $2\mu_1 - u > v$, we have $2\mu_1 - u - \frac{2\sigma_1^2}{\mu_1 - u} > v - \frac{2\sigma_1^2}{\mu_1 - u}$. Then, if $u > \mu_1 - \sigma_1$ and $\mu_2 < \mu_1 < \mu_2 + \sigma_1/(1-\alpha)$, we have
$$E(X) > u > \mu_1 - \frac{\sigma_1^2}{\mu_1 - u},$$
which implies that the inequality (A15) holds. When $\mu_2 < \mu_1 < \mu_2 + \sigma_1/(1-\alpha)$, $\mu_1 - \sigma_1 < \mu_2$. Therefore, $v = h_1(u)$ is strictly increasing with respect to $u$ if $\mu_2 < u < E(X)$, given $\mu_2 < \mu_1 < \mu_2 + \sigma_1/(1-\alpha)$.
Therefore, for a general location-scale mixture of two normal distributions with $\mu_2 < \mu_1 < \mu_2 + \sigma_1/(1-\alpha)$, $a_2^{(n)} = h_1(a_1^{(n)})$ is a strictly increasing function with respect to $a_1^{(n)}$ for $a_1^{(n)} \in (\mu_2, E(X))$.
Proof of Theorem 3.
Denote the left-hand side of (12) as a function of two variables, i.e.,
$$D_{u,v}(u,v) = \alpha\left[\mu_1\left(1 - \Phi_1\!\left(\frac{u+v}{2}\right)\right) + \sigma_1^2\phi_1\!\left(\frac{u+v}{2}\right)\right] + (1-\alpha)\left[\mu_2\left(1 - \Phi_2\!\left(\frac{u+v}{2}\right)\right) + \sigma_2^2\phi_2\!\left(\frac{u+v}{2}\right)\right] - v\left[1 - F\!\left(\frac{u+v}{2}\right)\right], \tag{A17}$$
where $-\infty < u < v < +\infty$. In (A17), the variables $u$ and $v$ represent $a_{n-1}^{(n)}$ and $a_n^{(n)}$, respectively. Define an intermediate function of $u$ by letting $v = u$ as
$$D_u(u) = \alpha\left[\mu_1\left(1 - \Phi_1(u)\right) + \sigma_1^2\phi_1(u)\right] + (1-\alpha)\left[\mu_2\left(1 - \Phi_2(u)\right) + \sigma_2^2\phi_2(u)\right] - u\left[1 - F(u)\right]. \tag{A18}$$
From (A18), we have $D_u(u) \to +\infty$ as $u \to -\infty$, $D_u(u) \to 0$ as $u \to +\infty$, and $D_u'(u) = F(u) - 1 < 0$ for $u \in \mathbb{R}$. Hence, we have, for $u \in \mathbb{R}$,
$$\lim_{v \to u^+} D_{u,v}(u,v) > 0, \qquad \lim_{v \to +\infty} D_{u,v}(u,v) = 0. \tag{A19}$$
$D_{u,v}(u,v) = 0$ has only one solution of $v$ if $D_{u,v}(u,v)$ first decreases from a positive value to a negative value and then increases and approaches 0 as $v$ increases.
Next, we prove that the first-order partial derivative of $D_{u,v}(u,v)$ with respect to $v$ is first negative and then positive as $v$ increases, where $v \in (u, +\infty)$ and $u \in \mathbb{R}$. The first-order partial derivative of $D_{u,v}(u,v)$ with respect to $v$ is
$$\frac{\partial D_{u,v}(u,v)}{\partial v} = \frac{v-u}{4}\, f\!\left(\frac{u+v}{2}\right) + F\!\left(\frac{u+v}{2}\right) - 1. \tag{A20}$$
For $u \in \mathbb{R}$, we have
$$\lim_{v \to u^+} \frac{\partial D_{u,v}(u,v)}{\partial v} = F(u) - 1 < 0, \qquad \lim_{v \to +\infty} \frac{\partial D_{u,v}(u,v)}{\partial v} = 0. \tag{A21}$$
The second-order partial derivative of $D_{u,v}(u,v)$ with respect to $v$ is
$$\frac{\partial^2 D_{u,v}(u,v)}{\partial v^2} = \frac{3}{4}\, f\!\left(\frac{u+v}{2}\right) + \frac{v-u}{8}\, f'\!\left(\frac{u+v}{2}\right) = \frac{1}{4}\left[\alpha\,\phi_1\!\left(\frac{u+v}{2}\right) M_1(u,v) + (1-\alpha)\,\phi_2\!\left(\frac{u+v}{2}\right) M_2(u,v)\right], \tag{A22}$$
where
$$M_1(u,v) = 3 - \frac{(v-u)(u+v-2\mu_1)}{4\sigma_1^2}, \qquad M_2(u,v) = 3 - \frac{(v-u)(u+v-2\mu_2)}{4\sigma_2^2}.$$
Both $M_1(u,v)$ and $M_2(u,v)$ are quadratic functions with respect to $v$. For $i = 1, 2$, $\lim_{v \to u^+} M_i(u,v) = 3$ and $\lim_{v \to +\infty} M_i(u,v) = -\infty$; each $M_i(u,v)$ first increases from 3 and then decreases from a positive number to $-\infty$ with respect to $v$. Moreover,
$$\lim_{v \to u^+} \frac{\partial^2 D_{u,v}(u,v)}{\partial v^2} = \frac{3}{4}\, f(u) > 0, \qquad \lim_{v \to +\infty} \frac{\partial^2 D_{u,v}(u,v)}{\partial v^2} = 0. \tag{A23}$$
Based on (A23), the second-order partial derivative (A22) is first positive and then negative as $v$ increases. When $v$ approaches $+\infty$, (A22) approaches 0 from a negative value, as $\phi_1$ and $\phi_2$ converge to 0 faster than $M_1$ and $M_2$ diverge.
From (A21)–(A23), the first-order partial derivative (A20) increases from a negative number to a positive number and then decreases and approaches 0 with respect to $v$. Therefore, $D_{u,v}(u,v) = 0$ has a unique solution of $v$ given $u \in \mathbb{R}$.
When $(u,v)$ is in a neighborhood of $(u, h^*(u))$, we have
$$\frac{\partial D_{u,v}(u,v)}{\partial v} < 0, \qquad \frac{\partial D_{u,v}(u,v)}{\partial u} = \frac{v-u}{4}\, f\!\left(\frac{u+v}{2}\right) > 0.$$
According to the implicit function theorem, we have $h^{*\prime}(u) > 0$. Therefore, $v = h^*(u)$ is a strictly increasing function with respect to $u$. □
Proof of Theorem 4.
Denote the left-hand side of (13) as a function of three variables, i.e.,
$$\begin{aligned} H_{u,v,z}(u,v,z) ={}& \alpha\mu_1\left[\Phi_1\!\left(\frac{v+z}{2}\right) - \Phi_1\!\left(\frac{u+v}{2}\right)\right] + \alpha\sigma_1^2\left[\phi_1\!\left(\frac{u+v}{2}\right) - \phi_1\!\left(\frac{v+z}{2}\right)\right] \\ &+ (1-\alpha)\mu_2\left[\Phi_2\!\left(\frac{v+z}{2}\right) - \Phi_2\!\left(\frac{u+v}{2}\right)\right] + (1-\alpha)\sigma_2^2\left[\phi_2\!\left(\frac{u+v}{2}\right) - \phi_2\!\left(\frac{v+z}{2}\right)\right] \\ &- v\left[F\!\left(\frac{v+z}{2}\right) - F\!\left(\frac{u+v}{2}\right)\right], \end{aligned} \tag{A24}$$
where $-\infty < u < v < z < +\infty$. The variables $u$, $v$, $z$ represent $a_{i-1}^{(n)}$, $a_i^{(n)}$ and $a_{i+1}^{(n)}$ in the $i$th equation ($i > 1$) of (10), respectively.
Similar to the proofs of Theorems 2 and 3, denote an intermediate function $H_{u,v}(u,v)$ by letting $z = v$ in (A24). We have
$$\frac{\partial H_{u,v}(u,v)}{\partial u} = \frac{v-u}{4}\, f\!\left(\frac{u+v}{2}\right) > 0, \tag{A25}$$
$$\lim_{u \to v^-} H_{u,v}(u,v) = 0, \tag{A26}$$
$$\lim_{u \to -\infty} H_{u,v}(u,v) < 0. \tag{A27}$$
To see (A27), note that $\lim_{u \to -\infty} H_{u,v}(u,v) = R(v)$, where
$$R(v) = \alpha\left[\mu_1\Phi_1(v) - \sigma_1^2\phi_1(v)\right] + (1-\alpha)\left[\mu_2\Phi_2(v) - \sigma_2^2\phi_2(v)\right] - v\,F(v).$$
Since $R'(v) = -F(v) < 0$ and $R(v) \to 0$ as $v \to -\infty$, we have $R(v) < 0$ for all $v$. Therefore, (A27) holds.
By (A25)–(A27), $H_{u,v}(u,v)$ strictly increases from a negative number and approaches zero as $u$ increases from $-\infty$ to $v$. Hence,
$$\lim_{z \to v^+} H_{u,v,z}(u,v,z) < 0. \tag{A28}$$
The first-order partial derivative of $H_{u,v,z}(u,v,z)$ with respect to $z$ is positive, i.e.,
$$\frac{\partial H_{u,v,z}}{\partial z} = \frac{z-v}{4}\, f\!\left(\frac{v+z}{2}\right) > 0, \quad \text{as } z > v. \tag{A29}$$
Based on (A28) and (A29), $H_{u,v,z}(u,v,z) = 0$ has a unique solution of $z$ if and only if
$$\lim_{z \to +\infty} H_{u,v,z}(u,v,z) = D_{u,v}(u,v) > 0,$$
where $D_{u,v}(u,v)$ is the Type II function defined in (A17). From the proof of Theorem 3 and the discussion of a set of $n = 2$ MSE-RPs, $D_{u,v}(u,v)$ is positive if and only if $v < h^*(u)$, where $h^*(u)$ is a strictly increasing function with respect to $u$. Given a set of $n = 3$ MSE-RPs, $H(a_1^{(3)}, a_2^{(3)}, a_3^{(3)}) > 0$ as $a_3^{(3)} \to +\infty$, if and only if, $a_1^{(3)} < a_1^{(2)}$. □

References

  1. Andrew, G.; Qi, C.; Alan, D.K.; Ron, W. Pulse pileup rejection methods using a two-component Gaussian Mixture Model for fast neutron detection with pulse shape discriminating scintillator. Nucl. Instrum. Methods Phys. Res. A Accel. Spectrom. Detect. Assoc. Equip. 2021, 988, 164905.
  2. Kong, L.; Chatzinotas, S.; Öttersten, B. Unified framework for secrecy characteristics with mixture of Gaussian (MoG) distribution. IEEE Wirel. Commun. 2020, 10, 1625–1628.
  3. Shen, X.; Zhang, Y.; Sata, K.; Shen, T. Gaussian mixture model clustering-based knock threshold learning in automotive engines. IEEE ASME Trans. Mechatron. 2020, 6, 2981–2991.
  4. Mazzeo, D.; Oliveti, G.; Labonia, E. Estimation of wind speed probability density function using a mixture of two truncated normal distributions. Renew. Energy 2018, 115, 1260–1280.
  5. Ouarda, T.B.M.J.; Charron, C. On the mixture of wind speed distribution in a Nordic region. Energy Convers. Manag. 2018, 174, 33–44.
  6. Venkataraman, S. Value at risk for a mixture of normal distributions: The use of quasi-Bayesian estimation techniques. Econ. Perspect. Fed. Reserve Bank Chic. 1997, 21, 2–13.
  7. Duan, R.; Ning, Y.; Wang, S.; Lindsay, B.G.; Carroll, R.J.; Chen, Y. A fast score test for generalized mixture models. Biometrics 2020, 76, 811–820.
  8. Di, C.Z.; Liang, K.Y. Likelihood ratio testing for admixture models with application to genetic linkage analysis. Biometrics 2011, 67, 1249–1259.
  9. Hartigan, J.A. A failure of likelihood asymptotics for normal mixtures. In Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer; Wadsworth: Belmont, CA, USA, 1985; Volume 2, pp. 807–810.
  10. Day, N.E. Estimating the components of a mixture of normal distributions. Biometrika 1969, 56, 463–474.
  11. Panić, B.; Klemenc, J.; Nagode, M. Improved initialization of the EM algorithm for mixture model parameter estimation. Mathematics 2020, 8, 373.
  12. Li, Y.; Fang, K.T. A new approach to parameter estimation of mixture of two normal distributions. Commun. Stat. Simul. Comput. 2022, 1–27.
  13. Chen, J. Consistency of the MLE under mixture models. Stat. Sci. 2017, 32, 47–63.
  14. Wu, X. Optimal quantization by matrix searching. J. Algorithms 1991, 12, 663–673.
  15. Graf, S.; Luschgy, H. Foundations of Quantization for Probability Distributions; Springer-Verlag: Berlin/Heidelberg, Germany, 2007.
  16. Gersho, A.; Gray, R.M. Vector Quantization and Signal Compression; Kluwer: Boston, MA, USA, 1992.
  17. Max, J. Quantizing for minimum distortion. IEEE Trans. Inf. Theory 1960, 6, 7–12.
  18. Flury, B. Principal points. Biometrika 1990, 77, 33–41.
  19. Trushkin, A. Sufficient conditions for uniqueness of a locally optimal quantizer for a class of convex error weighting functions. IEEE Trans. Inf. Theory 1982, 28, 187–198.
  20. Tarpey, T. Two principal points of symmetric, strongly unimodal distributions. Stat. Probab. Lett. 1994, 20, 253–257.
  21. Yamamoto, W.; Shinozaki, N. On uniqueness of two principal points for univariate location mixtures. Stat. Probab. Lett. 2000, 46, 33–42.
  22. Lloyd, S.P. Least squares quantization in PCM. IEEE Trans. Inf. Theory 1982, 28, 129–137.
  23. Kieffer, J. Exponential rate of convergence for Lloyd's method. IEEE Trans. Inf. Theory 1982, 28, 205–210.
  24. Rowe, S. An algorithm for computing principal points with respect to a loss function in the unidimensional case. Stat. Comput. 1996, 6, 187–190.
  25. Chakraborty, S.; Roychowdhury, M.K.; Sifuentes, J. High precision numerical computation of principal points for univariate distributions. Sankhya B 2021, 83, 558–584.
  26. Fang, K.T.; He, S.D. The Problem of Selecting a Given Number of Representative Points in a Normal Population and a Generalized Mills' Ratio; Technical Report SOL ONR 327; Department of Statistics, Stanford University: Stanford, CA, USA, 1964.
  27. Zhou, M.; Wang, W. Representative points of Student's $t_n$ distribution and their applications in statistical simulation. Acta Math. Appl. Sin. 2016, 39, 620–640.
  28. Fu, H. The problem of selecting a specified number of representative points from a Gamma population. J. China Univ. Min. Technol. 1985, 4, 107–117.
  29. Wang, H. Problems of choosing representative points of given data in S-type distributions. J. Fuzhou Univ. 1995, 23, 7–13.
  30. Fu, H. The problem of selecting a specified number of representative points from a Weibull population. J. Wuxi Inst. Light Ind. 1993, 22, 78–83.
  31. Fei, R. The problem of selecting representative points from Pearson distribution populations. J. Wuxi Inst. Light Ind. 1990, 9, 71–78.
  32. Fang, K.T.; He, P.; Yang, J. Set of representative points of statistical distributions and their applications. Sci. Sin. Math. 2020, 50, 1149–1168.
  33. Eisenberger, I. Genesis of bimodal distributions. Technometrics 1964, 6, 357–363.
  34. Behboodian, J. On the modes of a mixture of two normal distributions. Technometrics 1970, 12, 131–139.
  35. Aitkin, M.; Wilson, G.T. Mixture models, outliers, and the EM algorithm. Technometrics 1980, 22, 325–331.
  36. Fei, R. Statistical relationship between the representative point and the population. J. Wuxi Inst. Light Ind. 1991, 10, 78–81.
  37. Flury, B. Estimation of principal points. J. R. Stat. Soc. C Appl. Stat. 1993, 42, 139–151.
  38. Li, L.; Flury, B. Uniqueness of principal points for univariate distributions. Stat. Probab. Lett. 1995, 25, 323–327.
  39. Bagnoli, M.; Bergstrom, T. Log-concave probability and its applications. Econ. Theory 2005, 26, 445–469.
  40. Saumard, A.; Wellner, J.A. Log-concavity and strong log-concavity: A review. Stat. Surv. 2014, 8, 45–114.
  41. Fang, K.T.; Wang, Y. Number-Theoretic Methods in Statistics, 1st ed.; Chapman and Hall: London, UK, 1994.
  42. Fang, K.T.; Wang, Y.; Bentler, P.M. Some applications of number-theoretic methods in statistics. Stat. Sci. 1994, 9, 416–428.
  43. Parzen, E. On estimation of a probability density function and mode. Ann. Math. Stat. 1962, 33, 1065–1076.
  44. Rosenblatt, M. Remarks on some nonparametric estimates of a density function. Ann. Math. Stat. 1956, 27, 832–837.
  45. Wolberg, W.; Street, W.; Mangasarian, O. Breast Cancer Wisconsin (Diagnostic). UCI Machine Learning Repository, 1995. Available online: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29 (accessed on 28 August 2022).
Figure 1. Examples of MixN densities. Density (A) represents MixN(0.1, 0, 100, 0, 1); density (B) represents MixN(0.3, 1.2, 4, 0.2, 1); density (C) represents MixN(0.5, 1, 1, −1, 1); density (D) represents MixN(0.5, 1, 0.25, −1, 0.25). The solid lines are MixN densities. The dashed lines stand for the standard normal density for comparison.
Figure 2. Densities of underlying distributions for simulation. The parameter settings of distributions (C1)–(C6) are given in Table 3.
Figure 3. Results of kernel density estimation comparisons (based on the distribution C6). The solid lines are fitted densities. The dashed lines stand for the density of C6.
Figure 4. Results of kernel density estimation comparisons (based on the distribution C3). The solid lines are fitted densities. The dashed lines stand for the density of C3.
Figure 5. Results of kernel density estimation comparisons (based on the point size of n = 30). The solid lines are fitted densities. The dashed lines stand for the density of the corresponding underlying distribution.
Figure 6. Results of kernel density estimation comparisons. The solid lines are fitted densities. The dashed lines stand for the density of the distribution (17).
Table 1. The MSE-RPs from MixN(0.7, 8, 10, 1.5, 1) generated by the Fang-He algorithm.
Size | IG (%) | n = 1 | n = 2 | n = 3 | n = 4 | n = 5 | n = 6 | n = 7 | n = 8 | n = 9 | n = 10 | n = 11 | n = 12 | n = 13 | n = 14 | n = 15
n = 106.05              
n = 273.968673462.4236135439.348758993             
n = 388.054911791.7910941546.95843564811.2740348            
n = 492.635999871.5789478315.6862647358.87527512812.41300952           
n = 594.80414421.3593209394.4115470257.2470438179.88732581713.08347976          
n = 696.276014910.8741424842.7777670935.6896052668.0728843410.4703644213.48934091         
n = 797.193556850.7215473182.4329894854.9782706977.0926977439.07736048311.2178926114.02854444        
n = 897.780918150.5905654542.1691710194.2832397156.2551182868.0210712789.78915976611.7699749514.43927428       
n = 998.205001430.3801622321.7906780643.3474243275.3186477167.0049638048.60076179310.2498831712.1362764914.71782742      
n = 1098.524225360.2398195891.5621454862.8806671444.7109475186.305732227.7673005719.21223331910.7477008412.5392668515.02845211     
n = 1198.760432130.1335661731.3994446152.5855630184.1938967555.7193366287.086145388.3977054659.73258633711.1803366712.895311615.30709701    
n = 1298.943038360.0085998291.2180335722.2844404543.6151344735.1100294356.4276499777.6595667628.87428453810.1338338811.5187038613.1767878615.52993185   
n = 1399.08923755−0.1083777811.0569994062.0365263033.1617856344.5875011855.8670691177.0394989118.1692440539.30626528510.5027401311.8331071913.4408044615.74095873  
n = 1499.20635891−0.2030114830.9324425941.8550309212.8556036534.1536207965.3962104516.5199305457.5842111168.63281779.70526210810.8475927612.1302914513.6926735315.94396209 
n = 1599.30172785−0.299800570.8100636731.6842626512.5866582133.7230950494.9371628246.0293827257.048190638.0348229979.022475110.0445898611.1439726212.3879099713.912964816.12264763
n = 1699.38099587−0.3974123090.6916263921.5252711822.3508104893.3386729494.5067949645.57685936.5623132137.5030424418.4291581589.36766401310.3481577911.4105641312.6205274714.11295278
n = 1799.44737405−0.4851924980.5891769961.3921284372.1625513683.0426658754.1298133895.1777728086.135254777.0387123287.916635568.7922742679.68860576110.6327860811.6625455812.84233566
n = 1899.50344049−0.5718288370.4918122171.2691721711.9952955682.7916232213.7738429044.7990515075.7361505846.6114287867.4524414148.280573589.1151396199.9762913710.8894549411.89121853
n = 1999.55136666−0.660085550.3965257871.1521054071.8412990972.570761083.4469661794.4378201575.3598047616.2136720757.0262048857.8176750838.6052414599.40539328110.2366521911.1229878
n = 2099.59264581−0.7445578740.3086850161.0467383291.7064643122.385022813.1736735574.1085611795.0151501855.8514455766.6405281497.4020975758.1522227978.9051096059.675432110.48030964
n = 2199.62688834−0.8073938850.2457053510.9728036541.6140161282.2615066032.9958368193.8772674054.7710007585.5966620686.3714022717.1148851477.8423549928.566908919.30156789910.06021893
n = 2299.65267792−0.8702298960.184673630.9024174611.5274230292.1482517492.8367127773.6614648824.5387542695.3557474626.1190013916.8472988817.5555951058.25640448.9613044629.682558988
n = 2399.68714839−1.002643250.0615783230.7636756191.3605716921.936474532.5495503823.2625415564.0822288124.8821461885.626537076.3296865397.0060235077.6676007768.3242516358.985487835
n = 2499.71006407−1.0654792610.0060704630.7027033281.2889919461.847844572.4334524323.1025729663.8842307224.6740215925.4110913436.1047870356.7691536967.4158359258.0543676288.693504163
n = 2599.72667915−1.128315272−0.0476905070.6445522821.2215357471.7656447912.3276865032.9594531993.7005824684.4771862415.2080306485.8936407026.5476585517.1815136397.8045995688.425070077
n = 2699.75125918−1.253987293−0.150143840.5361267241.0977715711.6175237432.1416908642.7127509353.3747073294.1130545794.831940395.50541936.1434655746.757174937.3555745117.946356234
n = 2799.7644673−1.316823304−0.1988923310.4857164191.041329811.5512528322.0603715332.6077121313.235592883.9487759494.6602670195.3289219485.9607579266.5664866617.1549796247.733845897
n = 2899.78349091−1.442495326−0.2922620940.3908132080.936284431.429467241.9130780332.4209231892.9902434833.6461756024.3376562224.9975983385.6192094466.2117486486.7839252997.343011187
n = 2999.79899701−1.576585624−0.3864349720.2972033470.8343130391.3132037171.7752552312.2507023022.7708843963.3663074024.0243909424.6737325745.2868339445.8682864016.4267979216.969342579
n = 3099.81170752−1.682670291−0.4572777530.2281828590.7601486221.229852851.6780731872.13318012.6224951323.1763835583.8008061474.438734925.0460325865.6208448866.1707575856.702586356
Size | IG (%) | n = 16 | n = 17 | n = 18 | n = 19 | n = 20 | n = 21 | n = 22 | n = 23 | n = 24 | n = 25 | n = 26 | n = 27 | n = 28 | n = 29 | n = 30
n = 16 16.28634947              
n = 17 14.3036748116.44282304             
n = 18 13.0444638814.479294716.58833697            
n = 19 12.1003750813.230063514.6412697916.7219755           
n = 20 11.3428549612.2978072413.4052561714.7937959616.84910295          
n = 21 10.860743811.7278017212.7005859713.8493517515.3370775517.28340665         
n = 22 10.433977111.2343619412.1108310613.1079830814.3113260215.9348266117.34624266        
n = 23 9.66153852110.3636258111.1068890211.9127413512.8131289113.8672527215.2005100317.19050787       
n = 24 9.34211810310.0103795110.7100056611.4568905612.2742755413.1997139314.3040262715.750771717.54149202      
n = 25 9.0510607319.6907799210.354482911.0545825411.8091324512.6433071913.6022894614.7747431116.3804496117.60432804     
n = 26 8.5363786769.1327151089.74263360210.3747203311.0404118111.7544738912.5383696313.4287542614.4923088315.8815644517.55395655    
n = 27 8.3096442388.8888333789.47800935210.0845815210.7181312111.389902412.1171271812.9256417813.8591446815.0036372116.5789371817.79283607   
n = 28 7.8952392648.4463102919.0017922959.56795790510.150889410.7596992911.4049642312.1004083912.8698369513.7503432114.8109456616.2163305317.74246458  
n = 29 7.5016857268.0293559228.5572034929.0901173129.63372451610.1937005910.7772462411.3945177712.0590199412.7882898113.6142199714.5933711115.846808217.73883763 
n = 30 7.2223289327.7352206388.2456569648.7581846019.2772341449.80803692610.3563224210.9289317611.53566512.190343312.9103917113.726717914.6962014215.93885917.81798303
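For readers who wish to reproduce points of the kind listed in Table 1, the following sketch runs a generic Lloyd-type self-consistency iteration (cf. [22,23]) for a MixN. It is offered purely as an illustration: it is not the Fang-He recursion used in the paper, and the function name, initialization, and iteration count are our own choices.

```python
import numpy as np
from scipy.stats import norm

def lloyd_mixn(k, alpha, mu1, var1, mu2, var2, iters=2000):
    """Lloyd-type fixed-point iteration: replace each point by the
    conditional mean of its Voronoi cell until the system stabilizes."""
    def mass_and_partial_mean(edges, mu, s):
        # P(a < X < b) and E[X * 1{a < X < b}] per cell for X ~ N(mu, s^2),
        # using the identity E[X 1{a<X<b}] = mu * mass - s^2 * (pdf(b) - pdf(a)).
        mass = np.diff(norm.cdf(edges, mu, s))
        ex = mu * mass - s ** 2 * np.diff(norm.pdf(edges, mu, s))
        return mass, ex

    s1, s2 = np.sqrt(var1), np.sqrt(var2)
    mean = alpha * mu1 + (1 - alpha) * mu2
    x = np.sort(mean + np.linspace(-2.0, 2.0, k))     # crude initial points
    for _ in range(iters):
        edges = np.concatenate(([-np.inf], (x[:-1] + x[1:]) / 2, [np.inf]))
        m1, e1 = mass_and_partial_mean(edges, mu1, s1)
        m2, e2 = mass_and_partial_mean(edges, mu2, s2)
        p = alpha * m1 + (1 - alpha) * m2             # cell probabilities
        x = (alpha * e1 + (1 - alpha) * e2) / p       # conditional cell means
    return x, p

# MixN(0.7, 8, 10, 1.5, 1) as in Table 1; with k = 2 the iteration should
# approach the two points reported there (about 2.4236 and 9.3488).
points, probs = lloyd_mixn(2, 0.7, 8, 10, 1.5, 1)
print(points, probs)
```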
Table 2. The corresponding probabilities of the MSE-RPs from MixN(0.7, 8, 10, 1.5, 1) generated by the Fang-He algorithm.
Size | n = 1 | n = 2 | n = 3 | n = 4 | n = 5 | n = 6 | n = 7 | n = 8 | n = 9 | n = 10 | n = 11 | n = 12 | n = 13 | n = 14 | n = 15
n = 11              
n = 20.4763452650.523654735             
n = 30.3874643180.3591002390.253435443            
n = 40.3535936470.2334349930.2718961310.141075228           
n = 50.3121393750.1602116050.2274694270.2055448280.094634765          
n = 60.2061422130.1746955980.1723881780.206115810.16778930.0728689         
n = 70.174025890.1829245930.1301051090.1704523090.1685242950.1236571990.050310606        
n = 80.1483649630.1848876440.1024635750.1391012970.1540344620.1383523150.0955275680.037268176       
n = 90.1118392470.1754683560.0981758660.1108824030.1362362590.1391321920.1188480930.0792725550.030145028      
n = 100.0910474490.1620274940.1079629810.0897112810.1154642910.1268651450.1209881090.0986948190.0636490470.023589383     
n = 110.0772499030.1495072140.1151273650.0755807390.0972567320.1125127980.1150787710.1046397720.0824699250.0517866670.018790114    
n = 120.063111530.1334958110.1199143780.0703619040.0812762080.0986518510.1067323650.1044360710.092009110.0708010660.0436253890.015584317   
n = 130.0518192530.1182238320.1198786860.0746591130.0685971170.0854880510.0964870450.0994114650.0939836980.0806944160.060861640.0368963320.012999352  
n = 140.0439561820.1061949260.1169818050.0804194520.0600172080.0738921670.0860642060.0920549590.0913523860.0840042770.0705981140.0523279670.0312606960.010875655 
n = 150.0369918320.0945227940.1119976020.0857132670.0561205930.0637842170.076279410.0842424260.0868159070.0838312150.0755015540.0624059670.0456120560.0269159140.009265246
n = 160.0309707850.083597980.1055616280.0891293930.0574768980.0555478140.0672625330.0763430830.0811162350.0812772460.0768174240.0680088530.0554384450.0400652250.023405491
n = 170.0263297520.0745777240.0990462040.0902655960.0612127070.0498197950.0592704970.0687258310.074834510.0771535650.0755606530.0701504950.0612087780.0492954360.035243746
n = 180.0223888710.0664774080.0923291380.0897446450.0653835290.0469693070.0523270920.0616410250.0685665260.0723872820.0729159430.0701240710.0641529890.055308250.04408941
n = 190.0189541480.059063340.0854941760.0878819860.0689752120.0472490920.0466435910.0551335220.0624931360.0673465770.0694250590.0686459520.0650446220.0587850660.050171849
n = 200.0161482510.052696260.0790997010.0851671180.0713360310.0495407760.0426084530.0493361440.0567601870.0622397450.0654047130.0661321610.0643892530.0602452830.053877025
n = 210.0143319580.0484309580.0745655440.0827675460.0723372210.051794050.0408355270.0455122770.0527523760.0585344450.0623031390.0638932070.0632593370.0604175430.055480215
n = 220.0127214580.0445402510.0702426580.080139620.0727601870.0541046580.0402118320.0422214550.0490590460.0550109510.0592105550.0614774110.0617369970.0599892530.05627992
n = 230.009905610.037376580.0618229340.0742113240.0720975870.0581513820.0418632210.0372258310.0421595730.0480875110.0528458730.0561133220.0577794450.0577801530.056134082
n = 240.0088058460.0344558170.0582323040.071376540.0712294460.0594877070.0434225010.0359407720.0393752560.0450845010.0499790470.0535555990.0556889750.0563108470.055408925
n = 250.0078352270.0318013350.0548791260.0685902540.0701141730.0604616310.0451379180.0353920090.036985070.04231960.0472619380.0510670980.0535777420.0547156070.054452555
n = 260.0062242180.0272053750.0488509980.0632297920.0673808760.0613439960.0483292670.0361259540.0334904640.0374944570.0423373810.0464213680.0494550620.051350210.052054826
n = 270.0055588470.0252195720.0461809420.0607329590.0658938640.0613855240.0496601220.0370786510.0324294430.0354589090.04014110.0442903230.0474962750.0496611250.05073257
n = 280.0044535380.0217512470.0413532490.0560010640.0627143170.0607938760.0516637560.039470530.031638360.032122980.0361554250.0403030730.0437479310.0463193410.047957622
n = 290.003540880.0186637360.0368802820.0513838050.0592436550.0595037410.0528722410.042010320.0323823290.0297609120.0325502750.0364791520.0400324960.0428907160.044945877
n = 300.0029707550.0165927420.0337782070.048056650.0565507820.0581809360.0532604270.0437219640.0336650970.0288359290.0302383790.0338126160.0373612670.0403406410.042626351
Size | n = 16 | n = 17 | n = 18 | n = 19 | n = 20 | n = 21 | n = 22 | n = 23 | n = 24 | n = 25 | n = 26 | n = 27 | n = 28 | n = 29 | n = 30
n = 160.007980967              
n = 170.0204024110.0069023             
n = 180.0312384770.0179359450.006020091            
n = 190.0396437050.0278685530.0158831390.005297276           
n = 200.0455646070.0357095740.0249315370.0141283090.00468487          
n = 210.0486480710.0402095810.0305724190.0203787590.0099690090.003006819         
n = 220.0507500190.0436221320.0351905430.0258846260.0163449240.0063006530.002200855        
n = 230.0528723610.0481021780.0420061490.03477450.0267440560.0183625380.01024060.00334319       
n = 240.0530251960.0492127530.0440741260.0377893910.0305945040.0228252650.0149316380.0070040330.002189011      
n = 250.0527859980.0497735240.045503690.0401111870.0337425950.0266576650.0192115710.0118250.0042363890.001561098     
n = 260.0515575430.0498670760.0470166120.043105010.0382285170.0325052940.026161320.0194581270.0127408190.0060223890.002043046    
n = 270.0506831840.0495148080.0472492490.0439623510.0397127340.0346222850.0288622510.0226190730.0161628280.0098796870.0035251940.001286131   
n = 280.0486273580.0483124460.0470344480.04479280.0416635350.0377281360.0330236650.0277368780.022044550.0161323840.0103322580.0045439420.001581292  
n = 290.0461717770.0465500190.0460604520.0447310830.0425724850.039614380.0359588160.0316925460.0268605410.0216416850.0162512990.0109077420.0059472530.001899505 
n = 300.0441857740.044977090.0449833090.0442066370.0426680360.0403972210.0374195860.0338191880.0296983010.0250943180.0201496480.0150819240.0100925360.0054867280.00174696
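The probabilities in Table 2 are tied to the points in Table 1 through their Voronoi cells: each MSE-RP receives the mass of its cell under the MixN distribution function. A minimal sketch (our own code, not the authors') that recovers a row of Table 2 from the corresponding points:

```python
import numpy as np
from scipy.stats import norm

def mixn_cdf(x, alpha, mu1, var1, mu2, var2):
    """CDF of MixN(alpha, mu1, var1, mu2, var2)."""
    return (alpha * norm.cdf(x, mu1, np.sqrt(var1))
            + (1 - alpha) * norm.cdf(x, mu2, np.sqrt(var2)))

def cell_probabilities(points, *params):
    """p_i = F((x_i + x_{i+1})/2) - F((x_{i-1} + x_i)/2) for sorted points."""
    pts = np.sort(np.asarray(points, dtype=float))
    mids = (pts[:-1] + pts[1:]) / 2                 # Voronoi cell boundaries
    edges = np.concatenate(([-np.inf], mids, [np.inf]))
    return np.diff(mixn_cdf(edges, *params))        # the p_i sum to 1

# n = 2 points of Table 1 for MixN(0.7, 8, 10, 1.5, 1):
print(cell_probabilities([2.423613543, 9.348758993], 0.7, 8, 10, 1.5, 1))
# -> approximately (0.476345265, 0.523654735), the n = 2 row of Table 2
```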
Table 3. Parameters of underlying distributions for simulation.
Distribution | α | μ₁ | σ₁² | μ₂ | σ₂² | E(X) | Var(X)
C1 | 0.80 | 0.0 | 1.00 | 1.0 | 1.00 | 0.20 | 1.16
C2 | 0.50 | 1.0 | 1.00 | −1.0 | 1.00 | 0.00 | 2.00
C3 | 0.50 | 1.0 | 0.25 | −1.0 | 0.25 | 0.00 | 1.25
C4 | 0.70 | 4.6 | 2.00 | 1.5 | 1.00 | 3.67 | 3.72
C5 | 0.70 | 8.0 | 10.00 | 1.5 | 1.00 | 6.05 | 16.17
C6 | 0.65 | −4.6 | 15.00 | 10.0 | 9.00 | 0.51 | 61.39
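The E(X) and Var(X) columns of Table 3 follow directly from the mixture parameters via the standard identities E(X) = αμ₁ + (1−α)μ₂ and Var(X) = ασ₁² + (1−α)σ₂² + α(μ₁−E(X))² + (1−α)(μ₂−E(X))². A one-function sketch (illustrative code, not taken from the paper):

```python
def mixn_mean_var(alpha, mu1, var1, mu2, var2):
    """Mean and variance of MixN(alpha, mu1, var1, mu2, var2)."""
    mean = alpha * mu1 + (1 - alpha) * mu2
    var = (alpha * var1 + (1 - alpha) * var2
           + alpha * (mu1 - mean) ** 2 + (1 - alpha) * (mu2 - mean) ** 2)
    return mean, var

print(mixn_mean_var(0.70, 8.0, 10.00, 1.5, 1.00))    # C5 -> (6.05, 16.1725)
print(mixn_mean_var(0.65, -4.6, 15.00, 10.0, 9.00))  # C6 -> (0.51, 61.3939)
```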
Table 4. Comparisons of the Fang-He algorithm and the k-means algorithm in terms of the total error on the non-linear system.
Size | Fang-He (C1) | k-means (C1) | Fang-He (C2) | k-means (C2)
2 | 0.00000 | 0.00240 | 0.00000 | 0.00334
3 | 0.00000 | 0.00215 | 0.00000 | 0.00217
4 | 0.00000 | 0.00158 | 0.00000 | 0.00198
5 | 0.00000 | 0.00142 | 0.00000 | 0.00180
10 | 0.00000 | 0.00054 | 0.00000 | 0.00064
15 | 0.00000 | 0.00037 | 0.00000 | 0.00052
30 | 0.00000 | 0.00020 | 0.00000 | 0.00023
Size | Fang-He (C3) | k-means (C3) | Fang-He (C4) | k-means (C4)
2 | 0.00000 | 0.00184 | 0.00000 | 0.00223
3 | 0.00000 | 0.00113 | 0.00000 | 0.00255
4 | 0.00000 | 0.00152 | 0.00000 | 0.00292
5 | 0.00000 | 0.00093 | 0.00000 | 0.00259
10 | 0.00000 | 0.00053 | 0.00000 | 0.00111
15 | 0.00000 | 0.00030 | 0.00000 | 0.00072
30 | 0.00000 | 0.00013 | 0.00000 | 0.00036
Size | Fang-He (C5) | k-means (C5) | Fang-He (C6) | k-means (C6)
2 | 0.00000 | 0.00511 | 0.00000 | 0.01290
3 | 0.00000 | 0.00674 | 0.00000 | 0.00797
4 | 0.00000 | 0.00490 | 0.00000 | 0.00742
5 | 0.00000 | 0.00383 | 0.00000 | 0.00764
10 | 0.00000 | 0.00146 | 0.00000 | 0.00309
15 | 0.00000 | 0.00102 | 0.00000 | 0.00206
30 | 0.00000 | 0.00072 | 0.00000 | 0.00096
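Tables 4 and 5 benchmark against the k-means algorithm. A hedged sketch of one natural implementation of that baseline, in which the sample size, seed, and initialization are our illustrative choices rather than the paper's settings: draw a large sample from the MixN and take the sorted one-dimensional cluster centers as approximate MSE-RPs. Sampling noise is why the total error of k-means on the non-linear system stays above zero in Table 4.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def sample_mixn(n, alpha, mu1, var1, mu2, var2):
    """Draw n points from MixN by choosing a component per observation."""
    pick = rng.random(n) < alpha
    return np.where(pick,
                    rng.normal(mu1, np.sqrt(var1), n),
                    rng.normal(mu2, np.sqrt(var2), n))

# C5 = MixN(0.7, 8, 10, 1.5, 1); k = 5 representative points.
sample = sample_mixn(100_000, 0.7, 8.0, 10.0, 1.5, 1.0).reshape(-1, 1)
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(sample)
print(np.sort(km.cluster_centers_.ravel()))  # compare with the n = 5 row of Table 1
```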
Table 5. Comparisons of the Fang-He algorithm and the k-means algorithm in terms of the information gain (IG) in percentage.
Size | Fang-He (C1) | k-means (C1) | Fang-He (C2) | k-means (C2)
2 | 63.6508 | 63.6483 | 68.0514 | 68.0489
3 | 80.9919 | 80.9881 | 83.4876 | 83.4856
4 | 88.2662 | 88.2637 | 89.9059 | 89.9039
5 | 92.0203 | 92.0189 | 93.1672 | 93.1661
10 | 97.7140 | 97.7132 | 98.0585 | 98.0576
15 | 98.9306 | 98.9290 | 99.0936 | 99.0892
30 | 99.7174 | 99.7154 | 99.7608 | 99.7602
Size | Fang-He (C3) | k-means (C3) | Fang-He (C4) | k-means (C4)
2 | 81.3643 | 81.3636 | 70.0302 | 70.0296
3 | 88.4198 | 88.4181 | 84.7832 | 84.7818
4 | 93.5704 | 93.5684 | 90.6243 | 90.6219
5 | 95.3915 | 95.3903 | 93.6827 | 93.6815
10 | 98.7165 | 98.7152 | 98.2091 | 98.2072
15 | 99.3997 | 99.3988 | 99.1646 | 99.1608
30 | 99.8417 | 99.8410 | 99.7797 | 99.7790
Size | Fang-He (C5) | k-means (C5) | Fang-He (C6) | k-means (C6)
2 | 73.9687 | 73.9680 | 80.6102 | 80.6094
3 | 88.0549 | 88.0521 | 89.8819 | 89.8816
4 | 92.6360 | 92.6350 | 93.2878 | 93.2874
5 | 94.8041 | 94.8034 | 95.6431 | 95.6424
10 | 98.5242 | 98.5234 | 98.7264 | 98.7249
15 | 99.3017 | 99.3007 | 99.4036 | 99.4024
30 | 99.8117 | 99.8082 | 99.8426 | 99.8418
Table 6. Under the distribution C6, the L2-distance between the kernel estimate f̂_h(x) and the underlying mixture density f(x).
Size | MC | QMC | MSE
10 | 0.2937 | 0.1352 | 0.1084
15 | 0.3033 | 0.0966 | 0.0478
30 | 0.2082 | 0.0611 | 0.0207
Table 7. Under the distribution C3, the L2-distance between the kernel estimate f̂_h(x) and the underlying mixture density f(x).
Size | MC | QMC | MSE
10 | 1.8123 | 0.6206 | 0.4688
15 | 1.8441 | 0.5819 | 0.3040
30 | 1.5683 | 0.3890 | 0.2157
Table 8. The L2-distance between the kernel estimate f̂_h(x) and the corresponding underlying density f(x), where f̂_h(x) is based on the point size of n = 30.
Distribution | MC | QMC | MSE
C1 | 0.8201 | 0.3625 | 0.0694
C2 | 0.5901 | 0.1639 | 0.0607
C4 | 0.5237 | 0.1271 | 0.0480
C5 | 0.4474 | 0.2605 | 0.0992
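The L2-distances of Tables 6–8 can be computed by fitting a kernel density estimate on the n support points and integrating the squared deviation from the true MixN density over a grid. A sketch under our own assumptions (a Gaussian kernel with scipy's default bandwidth and a fixed grid; the paper's bandwidth and integration settings may differ):

```python
import numpy as np
from scipy.stats import norm, gaussian_kde
from scipy.integrate import trapezoid

def mixn_pdf(x, alpha, mu1, var1, mu2, var2):
    return (alpha * norm.pdf(x, mu1, np.sqrt(var1))
            + (1 - alpha) * norm.pdf(x, mu2, np.sqrt(var2)))

def l2_distance(points, params, lo=-30.0, hi=30.0, m=6001):
    """L2 distance between the kernel estimate on `points` and the MixN pdf."""
    grid = np.linspace(lo, hi, m)
    f_hat = gaussian_kde(points)(grid)   # kernel estimate f^_h
    f = mixn_pdf(grid, *params)          # underlying density
    return np.sqrt(trapezoid((f_hat - f) ** 2, grid))

# Monte Carlo support points versus the C3 density, point size n = 30:
rng = np.random.default_rng(1)
pick = rng.random(30) < 0.5
mc = np.where(pick, rng.normal(1.0, 0.5, 30), rng.normal(-1.0, 0.5, 30))
print(l2_distance(mc, (0.5, 1.0, 0.25, -1.0, 0.25)))
```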
Table 9. Comparisons of the Fang-He algorithm and the k-means algorithm in terms of the total error on the non-linear system and the mean squared error (MSE).
Algorithm | Total Error | MSE
Fang-He | 0.0000 | 0.5943
k-means | 0.0273 | 1.2921
Table 10. Estimates of the first four moments of the fitted MixN model based on 50 sampling points from the model.
Method | Mean | Variance | Skewness | Kurtosis
Model | 92.9145 | 755.6292 | 0.9853 | 0.2471
MSE (Fang-He) | 92.9145 | 755.0325 | 0.9851 | 0.2392
MSE (k-means) | 92.9091 | 754.2046 | 0.9833 | 0.2066
QMC | 92.8866 | 741.1019 | 0.9538 | 0.0196
MC | 96.9411 | 781.8546 | 0.6179 | −0.6027
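With MSE-RPs, the natural moment estimators behind Table 10 are the weighted moments of the discrete distribution formed by the points and their probabilities. A sketch of that computation (our own code; reading the kurtosis column as excess kurtosis is our assumption):

```python
import numpy as np

def weighted_moments(x, p):
    """Mean, variance, skewness and excess kurtosis of the discrete law {(x_i, p_i)}."""
    x, p = np.asarray(x, float), np.asarray(p, float)
    mean = np.sum(p * x)
    var = np.sum(p * (x - mean) ** 2)
    skew = np.sum(p * (x - mean) ** 3) / var ** 1.5
    kurt = np.sum(p * (x - mean) ** 4) / var ** 2 - 3.0
    return mean, var, skew, kurt

# Two-point example from Tables 1 and 2 for MixN(0.7, 8, 10, 1.5, 1): the mean
# is reproduced exactly (6.05), while the variance falls below Var(X) = 16.17
# because discretization removes the within-cell variance.
print(weighted_moments([2.423613543, 9.348758993], [0.476345265, 0.523654735]))
```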
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
