Article

A Study on the X ¯ and S Control Charts with Unequal Sample Sizes

1 Applied Statistics Laboratory, Department of Industrial Engineering, Pusan National University, Busan 46241, Korea
2 Department of Management Science and Statistics, The University of Texas at San Antonio, San Antonio, TX 78249, USA
* Author to whom correspondence should be addressed.
Mathematics 2020, 8(5), 698; https://doi.org/10.3390/math8050698
Submission received: 12 March 2020 / Revised: 24 April 2020 / Accepted: 30 April 2020 / Published: 2 May 2020
(This article belongs to the Special Issue Statistical Simulation and Computation)

Abstract: The control charts based on $\bar{X}$ and $S$ are widely used to monitor the mean and variability of variables and can help quality engineers identify and investigate causes of process variation. The usual requirement behind these control charts is that the sample sizes from the process are all equal, but this requirement is often not met in practice due to missing observations, cost constraints, etc. Several conventional methods have been proposed to deal with this situation. However, some methods based on weighted-average approaches and on an average sample size often degrade the performance of the control charts because the adopted estimators are biased toward underestimating the true population parameters. These observations motivate us to investigate the existing methods with rigorous proofs, and we provide a guideline to practitioners for the best choice when constructing the $\bar{X}$ and $S$ control charts with unequal sample sizes.
MSC:
26A51; 26D20; 33B15; 62F99; 62P30; 65C60

1. Introduction

Control charts, also known as Shewhart control charts [1,2,3], have been used to determine if a manufacturing process is in a state of control. In particular, the  X ¯ and S charts have been widely used to monitor or detect the mean and variability of variables. Here, a variable is a quality characteristic measured on a numerical scale. For example, variables include continuous measurement process data such as length, pressure, width, temperature, and volume, in a time-ordered sequence.
Due to their importance and usefulness in real-life applications, these traditional univariate control charts still receive much attention in the literature. These charts are commonly adopted for continuously monitoring large volumes of data and for process-control problems in the Industry 4.0 framework; see, for example, [4,5,6], among others. Of particular note, Ref. [7] developed the qcr package in R to generate Shewhart-type charts and obtained numerical results of interest for process quality control. More recently, based on the concept of data depth, Ref. [8] proposed a novel alternative way of constructing control charts when the critical-to-quality (CTQ) variables of the process are functional, and also developed Phase I and Phase II control charts for stabilizing and monitoring the processes, respectively.
These traditional Shewhart control charts consist of the upper and lower control limits (UCL and LCL) and the center line (CL). It is noteworthy that the American Standard is based on CL ± 3·SE control limits with an ideal false alarm rate (FAR) of 0.27%, while the British Standard is based on CL ± 3.09·SE with an ideal FAR of 0.20%, where SE denotes the standard error.
The usual requirement behind these control charts is that the sample sizes from the process are all equal. In practice, however, this requirement often cannot be met due to erroneous or missing observations. In such settings, the three conventional approaches below are widely used to deal with unequal sample sizes:
(i) A weighted-average approach in calculating $\bar{\bar{X}}$ and $\bar{S}^2$.
(ii) Control limits based on an average sample size.
(iii) A weighted average in calculating $\bar{S}$.
For more details, see Subsection 6.3.2 of Montgomery [9] and Subsection 3.8.B of ASTM [10]. The first approach uses variable-width control limits determined by sample-specific values such as $n_i$, $A_3$, $B_3$, and $B_4$. To estimate the scale parameter, a weighted average of the sample variances is calculated first and its square root is then taken. The second approach uses fixed-width control limits based on the average of the sample sizes. For more details on these two methods, see Subsection 6.3.2 of Montgomery [9]. The third approach is very similar to calculating $\bar{S}^2$ in the first approach; however, it uses a weighted average of the sample standard deviations directly to estimate the scale parameter. For more details, see Subsection 3.8.B of ASTM [10]. These three approaches may be satisfactory when the sample sizes are not very different. Given that the average of the sample sizes is not necessarily an integer, a practical alternative to the second approach is the use of a modal sample size.
However, when using the ad hoc approaches above, the parameter estimators are biased and actually underestimate the true population parameters, as will be shown in Remark 2 in Section 2 and Remarks 3 and 4 in Section 3. These underestimating approaches can degrade the performance of the control charts. Nonetheless, these biased methods are widely covered in many popular textbooks. These observations motivate us to clarify the conventional methods and investigate other existing methods, especially when the samples are not equal in size. Through rigorous proofs, we provide a guideline for the best choice of method to construct the $\bar{X}$ and $S$ control charts.
This paper is organized as follows. In Section 2, we provide two location estimators and four scale estimators for unequal sample sizes and show that they are all unbiased. In Section 3, we provide the variances of the estimators considered in this paper and establish the inequality relations among them. In Section 4, we provide the relative efficiencies of the methods and present simulation results comparing the performance of the location and scale estimators. In Section 5, we illustrate how to construct various Shewhart-type control charts (i.e., the $S$, $S^2$, and $\bar{X}$ charts) using the provided estimators. In Section 6, we provide empirical estimates of the average run length (ARL) and the standard deviation of the run length (SDRL) obtained through extensive Monte Carlo simulations. Three real-data examples are presented in Section 7 for illustrative purposes. Some concluding remarks are given in Section 8.

2. Estimation of Process Parameters with Unequal Sample Sizes

In this section, we provide two location estimators and four scale estimators for the process parameters under the assumption that the samples may have different sizes. In parametric statistical quality control, the underlying distribution is used to construct the control charts. A quality characteristic is assumed to be normally distributed, which is the most widely used assumption in practice. Under this assumption, we show that the estimators provided in this section are all unbiased.
We assume that we have m samples and that the sample sizes may differ. Let $X_{ij}$ denote the jth observation in the ith sample (subgroup) of size $n_i$ from a stable manufacturing process, where $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n_i$. We also assume that the $X_{ij}$ are independent and identically distributed as normal with mean $\mu$ and variance $\sigma^2$.

2.1. Location Parameter

Using the ith sample above, the sample mean and the sample variance are given by
$$\bar{X}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij} \quad\text{and}\quad S_i^2 = \frac{1}{n_i-1}\sum_{j=1}^{n_i}\left(X_{ij}-\bar{X}_i\right)^2,$$
where $i = 1, 2, \ldots, m$. Montgomery [9] provides two location estimators of the population mean parameter $\mu$ in Equations (6.2) and (6.30) of his book, which are given by
$$\bar{\bar{X}}_A = \frac{\bar{X}_1+\bar{X}_2+\cdots+\bar{X}_m}{m} = \frac{1}{m}\sum_{i=1}^{m}\bar{X}_i \tag{1}$$
and
$$\bar{\bar{X}}_B = \frac{n_1\bar{X}_1+n_2\bar{X}_2+\cdots+n_m\bar{X}_m}{n_1+n_2+\cdots+n_m} = \frac{1}{N}\sum_{i=1}^{m} n_i\bar{X}_i, \tag{2}$$
where $n_i \ge 2$ and $N = \sum_{i=1}^m n_i$.
These grand averages can be used as the CL on the $\bar{X}$ chart. Since $E(\bar{X}_i) = \mu$ for $i = 1, 2, \ldots, m$, it is easily seen that $E(\bar{\bar{X}}_A) = \mu$ and $E(\bar{\bar{X}}_B) = \mu$, showing that these two estimators are unbiased. In addition, the variances of $\bar{\bar{X}}_A$ and $\bar{\bar{X}}_B$ are given by $\operatorname{Var}(\bar{\bar{X}}_A) = \sigma^2\sum_{i=1}^m n_i^{-1}/m^2$ and $\operatorname{Var}(\bar{\bar{X}}_B) = \sigma^2/N$, which results in $\operatorname{Var}(\bar{\bar{X}}_A) \ge \operatorname{Var}(\bar{\bar{X}}_B)$. Thus, $\bar{\bar{X}}_B$ is preferred to $\bar{\bar{X}}_A$. It should be noted that, for the case of an equal sample size, we have $\bar{\bar{X}}_A = \bar{\bar{X}}_B = \bar{\bar{X}}$, where $\bar{\bar{X}} = \sum_{i=1}^m \bar{X}_i/m$.
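The two grand averages and the preference for the weighted version can be spot-checked numerically. Below is a minimal sketch; the function name `grand_means` and the toy subgroup data are our own illustration, not from the paper:

```python
def grand_means(samples):
    """Compute the two grand-average estimators for possibly unequal subgroup sizes."""
    m = len(samples)
    xbars = [sum(s) / len(s) for s in samples]      # per-sample means
    xbarbar_a = sum(xbars) / m                      # Eq. (1): unweighted average
    N = sum(len(s) for s in samples)
    xbarbar_b = sum(len(s) * xb for s, xb in zip(samples, xbars)) / N  # Eq. (2): weighted
    return xbarbar_a, xbarbar_b

# Hypothetical subgroups of unequal sizes
samples = [[9.8, 10.1, 10.0], [10.2, 9.9, 10.1, 10.0, 9.7]]
a, b = grand_means(samples)
```

For equal subgroup sizes the two estimators coincide, as noted above; with unequal sizes they generally differ.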

2.2. Scale Parameter

It is well known that $S_i^2$ is an unbiased estimator of $\sigma^2$. However, $S_i$ is not an unbiased estimator of $\sigma$, as shown below. Since $(n_i-1)S_i^2/\sigma^2$ is distributed as the gamma with shape $(n_i-1)/2$ and scale 2, we have
$$E\!\left[\left\{\frac{(n_i-1)S_i^2}{\sigma^2}\right\}^{1/2}\right] = 2^{1/2}\cdot\frac{\Gamma(n_i/2)}{\Gamma((n_i-1)/2)}.$$
Then, we have
$$E[S_i] = \sqrt{\frac{2}{n_i-1}}\,\frac{\Gamma(n_i/2)}{\Gamma((n_i-1)/2)}\cdot\sigma = c_4(n_i)\cdot\sigma, \tag{3}$$
where
$$c_4(n_i) = \sqrt{\frac{2}{n_i-1}}\,\frac{\Gamma(n_i/2)}{\Gamma((n_i-1)/2)}. \tag{4}$$
This shows that $S_i/c_4(n_i)$ is actually an unbiased estimator of $\sigma$. Note that $c_4(\cdot)$ is the normal-consistent unbiasing factor, which is a function of the sample size; this $c_4$ notation was originally used in ASQC [11], to the best of our knowledge. For more details on $c_4(\cdot)$, the interested reader is referred to Vardeman [12]. In Appendix A, we also provide an approximation of $c_4(\cdot)$ which makes the calculation easier in practice.
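For reference, the constant in Equation (4) is straightforward to compute with log-gamma, which avoids overflow of $\Gamma(\cdot)$ for large $n$. This is a small sketch of our own, not the paper's code; the check values are the standard tabulated ones:

```python
import math

def c4(n):
    """c4(n) = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2), via lgamma for stability."""
    return math.sqrt(2.0 / (n - 1)) * math.exp(math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0))

# c4(2) = sqrt(2/pi) exactly; c4(n) increases toward 1 as n grows
values = {n: c4(n) for n in (2, 5, 10, 25, 100)}
```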
Thus, we can estimate $\sigma$ using $S_i/c_4(n_i)$, which are all unbiased. Analogous to Equation (1), one can use
$$\bar{S}_A = \frac{S_1/c_4(n_1)+S_2/c_4(n_2)+\cdots+S_m/c_4(n_m)}{m} = \frac{1}{m}\sum_{i=1}^m \frac{S_i}{c_4(n_i)}, \tag{5}$$
which is clearly unbiased for $\sigma$. Since $\sum_{i=1}^m E[S_i] = \sum_{i=1}^m c_4(n_i)\cdot\sigma$ from Equation (3), the estimator below
$$\bar{S}_B = \frac{S_1+S_2+\cdots+S_m}{c_4(n_1)+c_4(n_2)+\cdots+c_4(n_m)} = \frac{\sum_{i=1}^m S_i}{\sum_{i=1}^m c_4(n_i)} \tag{6}$$
is also unbiased for $\sigma$. In addition, we consider the following unbiased estimator proposed by Burr [13]:
$$\bar{S}_C = \frac{\displaystyle\sum_{i=1}^m \frac{c_4(n_i)\,S_i}{1-c_4(n_i)^2}}{\displaystyle\sum_{i=1}^m \frac{c_4(n_i)^2}{1-c_4(n_i)^2}}. \tag{7}$$
This estimator is actually the best linear unbiased estimator (BLUE) and we provide the proof below.
Theorem 1.
The estimator S ¯ C in Equation (7) is the BLUE.
Proof. 
First, we consider a linear unbiased estimator of the form $\sum_{i=1}^m w_i S_i$. Its variance and expectation are given by
$$\operatorname{Var}\!\left(\sum_{i=1}^m w_i S_i\right) = \sum_{i=1}^m w_i^2\{1-c_4(n_i)^2\}\,\sigma^2 \quad\text{and}\quad E\!\left[\sum_{i=1}^m w_i S_i\right] = \sum_{i=1}^m w_i c_4(n_i)\,\sigma.$$
To obtain the BLUE, we need to minimize $\operatorname{Var}(\sum_{i=1}^m w_i S_i)$ under the unbiasedness condition $E[\sum_{i=1}^m w_i S_i] = \sigma$. Thus, our objective is to minimize
$$\sum_{i=1}^m w_i^2\{1-c_4(n_i)^2\} \quad\text{subject to}\quad \sum_{i=1}^m w_i c_4(n_i) = 1,$$
which can be easily solved using the method of Lagrange multipliers. The auxiliary function with the Lagrange multiplier $\lambda$ is given by
$$\Psi = \sum_{i=1}^m w_i^2\{1-c_4(n_i)^2\} - \lambda\left\{\sum_{i=1}^m w_i c_4(n_i) - 1\right\}.$$
It is immediate from $\partial\Psi/\partial w_k = 0$ that $2w_k\{1-c_4(n_k)^2\} - \lambda c_4(n_k) = 0$, which results in
$$w_k = \frac{\lambda\, c_4(n_k)}{2\{1-c_4(n_k)^2\}}. \tag{8}$$
Multiplying Equation (8) by $c_4(n_k)$ and summing both sides over k, we have
$$\sum_{i=1}^m w_i c_4(n_i) = \frac{\lambda}{2}\sum_{i=1}^m \frac{c_4(n_i)^2}{1-c_4(n_i)^2}.$$
Since $\sum_{i=1}^m w_i c_4(n_i) = 1$, we first solve the above for $\lambda$ and then substitute it into Equation (8), which provides
$$w_k = \frac{\dfrac{c_4(n_k)}{1-c_4(n_k)^2}}{\displaystyle\sum_{i=1}^m \frac{c_4(n_i)^2}{1-c_4(n_i)^2}},$$
which results in $\bar{S}_C = \sum_{i=1}^m w_i S_i$. This completes the proof.  □
Remark 1.
It should be noted that, for the case of an equal sample size ($n_1 = n_2 = \cdots = n_m = n$), we can easily show that $\bar{S}_A = \bar{S}_B = \bar{S}_C = \bar{S}/c_4(n)$, where $\bar{S} = \sum_{i=1}^m S_i/m$.
One can also incorporate the pooled sample variance in estimating $\sigma$, given by
$$S_p^2 = \frac{\sum_{i=1}^m (n_i-1)S_i^2}{N-m}, \tag{9}$$
where $N = \sum_{i=1}^m n_i$ again. However, $S_p$ is not unbiased for $\sigma$ although $S_p^2$ is, because
$$E[S_p] = c_4(N-m+1)\,\sigma. \tag{10}$$
Based on this, Burr [13] suggested the following unbiased estimator of $\sigma$:
$$\bar{S}_D = \frac{S_p}{c_4(N-m+1)}. \tag{11}$$
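With all four unbiased estimators now defined, they can be computed together from the per-sample standard deviations. The following is a sketch under our own naming (`scale_estimators`); it assumes the per-sample SDs have already been calculated:

```python
import math

def c4(n):
    return math.sqrt(2.0 / (n - 1)) * math.exp(math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0))

def scale_estimators(sds, sizes):
    """Unbiased scale estimators from Eqs. (5)-(7) and (11)."""
    m, N = len(sds), sum(sizes)
    c = [c4(n) for n in sizes]
    s_a = sum(s / ci for s, ci in zip(sds, c)) / m                     # Eq. (5)
    s_b = sum(sds) / sum(c)                                            # Eq. (6)
    s_c = (sum(ci * s / (1 - ci**2) for s, ci in zip(sds, c))
           / sum(ci**2 / (1 - ci**2) for ci in c))                     # Eq. (7), BLUE
    sp = math.sqrt(sum((n - 1) * s**2 for s, n in zip(sds, sizes)) / (N - m))
    s_d = sp / c4(N - m + 1)                                           # Eq. (11)
    return s_a, s_b, s_c, s_d

# Hypothetical per-sample standard deviations with unequal sizes
est = scale_estimators([9.2, 10.5, 9.8], [3, 5, 7])
```

With equal sample sizes, the first three estimators coincide (Remark 1 below).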
Remark 2.
The weighted-average approach introduced in Subsection 6.3.2 of Montgomery [9] uses the pooled sample standard deviation $S_p$ from Equation (9) to estimate σ. However, since $c_4(x) < 1$, as will be shown in Lemma 1, it is immediate from Equation (10) that $S_p$ clearly underestimates the true parameter σ.
We have introduced four unbiased scale estimators of σ, denoted by $\bar{S}_A$, $\bar{S}_B$, $\bar{S}_C$, and $\bar{S}_D$. A natural question arises: which of the four estimators should be recommended for estimating σ in practical applications? In the following section, we answer this question by establishing inequalities among the variances of the estimators under consideration.

3. Inequalities of the Variances of the Scale Estimators

We first obtain the variance of $\bar{S}_A$ in Equation (5), which is given by
$$\operatorname{Var}(\bar{S}_A) = \frac{1}{m^2}\sum_{i=1}^m \frac{1}{c_4(n_i)^2}\operatorname{Var}(S_i). \tag{12}$$
Using Equation (3) and the unbiasedness of $S_i^2$, we have
$$\operatorname{Var}(S_i) = E(S_i^2) - E(S_i)^2 = \sigma^2 - c_4(n_i)^2\sigma^2 = \sigma^2\{1-c_4(n_i)^2\}. \tag{13}$$
Substituting Equation (13) into (12), we have
$$\operatorname{Var}(\bar{S}_A) = \frac{\sigma^2}{m^2}\sum_{i=1}^m \left\{\frac{1}{c_4(n_i)^2}-1\right\}. \tag{14}$$
Similarly, the variance of $\bar{S}_B$ in Equation (6) is easily obtained as
$$\operatorname{Var}(\bar{S}_B) = \frac{\sum_{i=1}^m \operatorname{Var}(S_i)}{\left\{\sum_{i=1}^m c_4(n_i)\right\}^2} = \sigma^2\cdot\frac{\sum_{i=1}^m \{1-c_4(n_i)^2\}}{\left\{\sum_{i=1}^m c_4(n_i)\right\}^2} \tag{15}$$
and that of $\bar{S}_C$ in Equation (7) is
$$\operatorname{Var}(\bar{S}_C) = \frac{\sigma^2}{\displaystyle\sum_{i=1}^m \frac{c_4(n_i)^2}{1-c_4(n_i)^2}}. \tag{16}$$
Finally, for the case of $\bar{S}_D$, we have
$$\operatorname{Var}(\bar{S}_D) = E(\bar{S}_D^2) - E(\bar{S}_D)^2. \tag{17}$$
Using Equation (11) and the unbiasedness of $\bar{S}_D$, we can rewrite Equation (17) as
$$\operatorname{Var}(\bar{S}_D) = \frac{E(S_p^2)}{c_4(N-m+1)^2} - \sigma^2.$$
Since $S_p^2$ is also unbiased for $\sigma^2$, we have
$$\operatorname{Var}(\bar{S}_D) = \sigma^2\left\{\frac{1}{c_4(N-m+1)^2}-1\right\}. \tag{18}$$
Next, as mentioned above, we answer the question of how to choose the best among the four estimators by comparing their variances. To be more specific, we prove the following inequality relations among the four scale estimators.
Lemma 1.
The function $c_4(x)$ defined in Equation (4) is monotonically increasing and
$$\sqrt{\frac{2x-3}{2x-2}} < c_4(x) < 1.$$
Proof. 
Using $\{(x-1)/2\}\cdot\Gamma((x-1)/2) = \Gamma((x+1)/2)$, we can rewrite $c_4(\cdot)$ in Equation (4) as
$$c_4(x) = \sqrt{\frac{2}{x-1}}\,\frac{\Gamma(x/2)}{\Gamma((x-1)/2)} = \left\{\frac{\Gamma(x/2)^2}{\Gamma((x-1)/2)\,\Gamma((x+1)/2)}\right\}^{1/2}, \tag{19}$$
where $x \ge 2$. Watson [14] showed that the function
$$\theta(x) = -x + \frac{x\,\Gamma(x)\,\Gamma(x+1)}{\Gamma(x+1/2)^2} \tag{20}$$
is monotonically decreasing. It is noteworthy that the function $\theta(x)$ was motivated by Wallis' famous infinite fraction for π [15]. We can rewrite $c_4(x)$ using $\theta(x)$ in Equation (20) as
$$c_4(x) = \left\{\frac{\theta((x-1)/2)}{(x-1)/2} + 1\right\}^{-1/2}. \tag{21}$$
Since both $\theta((x-1)/2)$ and $1/\{(x-1)/2\}$ are positive and monotonically decreasing,
$$\frac{1}{c_4(x)^2} = \theta\!\left(\frac{x-1}{2}\right)\cdot\frac{1}{(x-1)/2} + 1$$
is also decreasing. Thus, $c_4(x)$ is monotonically increasing. Watson [14] and Mortici [16] also showed that
$$\sqrt{x+\frac{1}{4}} < \frac{\Gamma(x+1)}{\Gamma(x+1/2)} \le \sqrt{x+\frac{1}{\pi}}.$$
Multiplying each term by $\sqrt{2/(2x+1)}$, some algebra shows that
$$\sqrt{\frac{4x+1}{4x+2}} < c_4(2x+2) \le \sqrt{\frac{2x+2/\pi}{2x+1}} < 1.$$
For convenience, we let $x^* = 2x+2$. Then, we have
$$\sqrt{\frac{2x^*-3}{2x^*-2}} < c_4(x^*) < 1,$$
which completes the proof.  □
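Lemma 1's bound and the monotonicity of $c_4$ are easy to verify numerically; the following is a quick check of our own, not from the paper:

```python
import math

def c4(n):
    return math.sqrt(2.0 / (n - 1)) * math.exp(math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0))

# Check sqrt((2x-3)/(2x-2)) < c4(x) < 1 and that c4 is increasing over a range of x
bounds_ok = all(math.sqrt((2 * x - 3) / (2 * x - 2)) < c4(x) < 1.0 for x in range(2, 500))
increasing = all(c4(x) < c4(x + 1) for x in range(2, 499))
```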
Lemma 2
(Chebyshev's sum inequality). If $a_1 \le a_2 \le \cdots \le a_m$ and $b_1 \le b_2 \le \cdots \le b_m$, then we have
$$m\sum_{i=1}^m a_i b_i \ge \sum_{i=1}^m a_i \cdot \sum_{i=1}^m b_i.$$
Similarly, if $a_1 \le a_2 \le \cdots \le a_m$ and $b_1 \ge b_2 \ge \cdots \ge b_m$, then we have
$$m\sum_{i=1}^m a_i b_i \le \sum_{i=1}^m a_i \cdot \sum_{i=1}^m b_i.$$
Proof. 
First, we consider the case where both $a_i$ and $b_i$ are increasing. In this case, $(a_i - a_j)$ and $(b_i - b_j)$ have the same sign, or at least one of them is zero. Then, the value of $(a_i - a_j)(b_i - b_j)$ is positive or zero for any i and j, which results in
$$\sum_{i=1}^m \sum_{j=1}^m (a_i - a_j)(b_i - b_j) \ge 0.$$
Expanding the double sum above, we have
$$m\sum_{i=1}^m a_i b_i \ge \sum_{i=1}^m a_i \cdot \sum_{i=1}^m b_i.$$
Next, we consider the case where $a_i$ is increasing and $b_i$ is decreasing. In this case, $(a_i - a_j)$ and $(b_i - b_j)$ have different signs, or at least one of them is zero. Then, the value of $(a_i - a_j)(b_i - b_j)$ is negative or zero for any i and j. Similar to the above approach, we have $\sum_{i=1}^m\sum_{j=1}^m (a_i - a_j)(b_i - b_j) \le 0$, which results in $m\sum_{i=1}^m a_i b_i \le \sum_{i=1}^m a_i \cdot \sum_{i=1}^m b_i$. This completes the proof.  □
It should be noted that this inequality is named after Pafnuty Lvovich Chebyshev (1821–1894), who stated it in a brief note [17]. He gave it in an integral form, and his original proof can be found in Chebyshev [18]. For more details, the reader is referred to Besenyei [19] and Section 2.17 of Hardy et al. [20].
Theorem 2.
We have
$$\operatorname{Var}(\bar{S}_A) \ge \operatorname{Var}(\bar{S}_B).$$
Proof. 
For convenience, we rearrange the sample sizes so that $n_1 \le n_2 \le \cdots \le n_m$. Then, it is immediate from Equations (14) and (15) that it suffices to show
$$\frac{1}{m^2}\sum_{i=1}^m\left\{\frac{1}{c_4(n_i)^2}-1\right\} \ge \frac{\sum_{i=1}^m\{1-c_4(n_i)^2\}}{\left\{\sum_{i=1}^m c_4(n_i)\right\}^2}.$$
Since $c_4(x)$ is increasing from Lemma 1, it is easily seen that $1/c_4(x)^2 - 1$ is decreasing. For convenience, let $a_i = c_4(n_i)$ and $b_i = 1/c_4(n_i)^2 - 1$. Then, we have $a_1 \le a_2 \le \cdots \le a_m$ and $b_1 \ge b_2 \ge \cdots \ge b_m$. Thus, we observe from Lemma 2 that
$$\sum_{i=1}^m c_4(n_i)\cdot\sum_{i=1}^m\left\{\frac{1}{c_4(n_i)^2}-1\right\} \ge m\cdot\sum_{i=1}^m\left\{\frac{1}{c_4(n_i)}-c_4(n_i)\right\}. \tag{22}$$
Applying Lemma 2 again with $a_i = c_4(n_i)$ (increasing) and $b_i = 1/c_4(n_i) - c_4(n_i)$ (decreasing), we have
$$\sum_{i=1}^m c_4(n_i)\cdot\sum_{i=1}^m\left\{\frac{1}{c_4(n_i)}-c_4(n_i)\right\} \ge m\cdot\sum_{i=1}^m\{1-c_4(n_i)^2\},$$
which results in
$$\sum_{i=1}^m\left\{\frac{1}{c_4(n_i)}-c_4(n_i)\right\} \ge \frac{m\sum_{i=1}^m\{1-c_4(n_i)^2\}}{\sum_{i=1}^m c_4(n_i)}. \tag{23}$$
Combining Equations (22) and (23), we have
$$\sum_{i=1}^m c_4(n_i)\cdot\sum_{i=1}^m\left\{\frac{1}{c_4(n_i)^2}-1\right\} \ge \frac{m^2\sum_{i=1}^m\{1-c_4(n_i)^2\}}{\sum_{i=1}^m c_4(n_i)},$$
which results in
$$\frac{1}{m^2}\sum_{i=1}^m\left\{\frac{1}{c_4(n_i)^2}-1\right\} \ge \frac{\sum_{i=1}^m\{1-c_4(n_i)^2\}}{\left\{\sum_{i=1}^m c_4(n_i)\right\}^2}.$$
This completes the proof.  □
Theorem 3.
We have
$$\operatorname{Var}(\bar{S}_B) \ge \operatorname{Var}(\bar{S}_C).$$
Proof. 
From Equations (15) and (16), the variances of $\bar{S}_B$ and $\bar{S}_C$ are given by
$$\operatorname{Var}(\bar{S}_B) = \sigma^2\cdot\frac{\sum_{i=1}^m \{1-c_4(n_i)^2\}}{\left\{\sum_{i=1}^m c_4(n_i)\right\}^2} \quad\text{and}\quad \operatorname{Var}(\bar{S}_C) = \frac{\sigma^2}{\displaystyle\sum_{i=1}^m \frac{c_4(n_i)^2}{1-c_4(n_i)^2}}.$$
Thus, it suffices to show
$$\left\{\sum_{i=1}^m c_4(n_i)\right\}^2 \le \sum_{i=1}^m \frac{c_4(n_i)^2}{1-c_4(n_i)^2} \cdot \sum_{i=1}^m \{1-c_4(n_i)^2\}. \tag{24}$$
For convenience, we let $a_i = c_4(n_i)/\sqrt{1-c_4(n_i)^2}$ and $b_i = \sqrt{1-c_4(n_i)^2}$. Then, it is immediate from the Cauchy–Schwarz inequality, $\left(\sum a_i b_i\right)^2 \le \sum a_i^2 \cdot \sum b_i^2$, that the inequality in Equation (24) holds. This completes the proof.  □
Lemma 3.
The function
$$\frac{c_4(x)^2}{1-c_4(x)^2}$$
is concave.
Proof. 
It is immediate from Equation (21) that we have
$$\frac{c_4(x)^2}{1-c_4(x)^2} = \frac{(x-1)/2}{\theta((x-1)/2)}. \tag{25}$$
For convenience, we let y = ( x 1 ) / 2 . Then, it suffices to show y / θ ( y ) is concave. Bustoz and Ismail [21] showed that θ ( y ) is completely monotonic on [ 1 / 2 , ) using the representation by Watson [14]. For more details on complete monotonicity, refer to Section XIII.4 of Feller [22]. It is well known that completely monotonic functions are log-convex. For example, see Lemma 4.3 of Merkle [23], Theorem 1 of Fink [24], Exercise 6 in Section 2.1 of Niculescu and Persson [25], and Equation (3.4) of van Haeringen [26] with n = 0 and m = 1 , among others.
Thus, $\log\theta(y)$ is convex, so $-\log\theta(y)$ is concave. Since $\log y$ is also concave, $\log(y/\theta(y)) = \log y - \log\theta(y)$ is concave. This implies that $y/\theta(y)$ is log-concave. The log-concavity of $y/\theta(y)$ guarantees that it is concave. This completes the proof.  □
Theorem 4.
We have
$$\operatorname{Var}(\bar{S}_C) \ge \operatorname{Var}(\bar{S}_D).$$
Proof. 
From Equations (16) and (18), we have
$$\operatorname{Var}(\bar{S}_C) = \frac{\sigma^2}{\displaystyle\sum_{i=1}^m \frac{c_4(n_i)^2}{1-c_4(n_i)^2}} \quad\text{and}\quad \operatorname{Var}(\bar{S}_D) = \sigma^2\left\{\frac{1}{c_4(N-m+1)^2}-1\right\}.$$
Thus, it suffices to show
$$\frac{c_4(N-m+1)^2}{1-c_4(N-m+1)^2} \ge \sum_{i=1}^m \frac{c_4(n_i)^2}{1-c_4(n_i)^2}. \tag{26}$$
It is immediate from Lemma 3 that $c_4(x)^2/\{1-c_4(x)^2\}$ is concave. Then, using Jensen's inequality, we have
$$\frac{c_4(\bar{n})^2}{1-c_4(\bar{n})^2} \ge \frac{1}{m}\sum_{i=1}^m \frac{c_4(n_i)^2}{1-c_4(n_i)^2},$$
where $\bar{n} = \sum_{i=1}^m n_i/m$. Thus, it suffices to show
$$\frac{c_4(N-m+1)^2}{1-c_4(N-m+1)^2} \ge \frac{m\,c_4(\bar{n})^2}{1-c_4(\bar{n})^2}.$$
Using Equation (25) in Lemma 3, we have
$$\frac{c_4(N-m+1)^2}{1-c_4(N-m+1)^2} = \frac{(N-m)/2}{\theta((N-m)/2)} \tag{27}$$
and
$$\frac{m\,c_4(\bar{n})^2}{1-c_4(\bar{n})^2} = \frac{m(\bar{n}-1)/2}{\theta((\bar{n}-1)/2)} = \frac{(N-m)/2}{\theta((\bar{n}-1)/2)}, \tag{28}$$
where $\theta(x)$ is defined in Equation (20). Comparing Equations (27) and (28), we need to show
$$\theta\!\left(\frac{N-m}{2}\right) \le \theta\!\left(\frac{\bar{n}-1}{2}\right).$$
Since $(\bar{n}-1)/2 \le (N-m)/2$ and $\theta(x)$ is decreasing from Watson [14], the inequality in Equation (26) holds. This completes the proof.  □
Combining the inequalities in Theorems 2–4, we have the following result:
$$\operatorname{Var}(\bar{S}_A) \ge \operatorname{Var}(\bar{S}_B) \ge \operatorname{Var}(\bar{S}_C) \ge \operatorname{Var}(\bar{S}_D). \tag{29}$$
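The variance chain in Equation (29) can be spot-checked numerically from Equations (14)–(16) and (18). This sketch uses our own helper names and arbitrary sample-size configurations:

```python
import math

def c4(n):
    return math.sqrt(2.0 / (n - 1)) * math.exp(math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0))

def variances(sizes, sigma=1.0):
    """Var(S_A), Var(S_B), Var(S_C), Var(S_D) from Eqs. (14)-(16) and (18)."""
    m, N = len(sizes), sum(sizes)
    c = [c4(n) for n in sizes]
    v_a = sigma**2 / m**2 * sum(1 / ci**2 - 1 for ci in c)       # Eq. (14)
    v_b = sigma**2 * sum(1 - ci**2 for ci in c) / sum(c)**2      # Eq. (15)
    v_c = sigma**2 / sum(ci**2 / (1 - ci**2) for ci in c)        # Eq. (16)
    v_d = sigma**2 * (1 / c4(N - m + 1)**2 - 1)                  # Eq. (18)
    return v_a, v_b, v_c, v_d

v = variances([3, 5, 7, 12])  # ordering Var(S_A) >= Var(S_B) >= Var(S_C) >= Var(S_D)
```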
Lemma 4.
The function $c_4(x)$ defined in Equation (4) satisfies
$$\frac{1}{m}\sum_{i=1}^m c_4(n_i) \le c_4(\bar{n}),$$
where $\bar{n} = \sum_{i=1}^m n_i/m$.
Proof. 
First, we show that $c_4(x)$ is concave. Taking the logarithm of $c_4(x)$ in Equation (19), we have
$$\log c_4(x) = \log\Gamma\!\left(\frac{x}{2}\right) - \frac{1}{2}\log\Gamma\!\left(\frac{x-1}{2}\right) - \frac{1}{2}\log\Gamma\!\left(\frac{x+1}{2}\right),$$
where $x \ge 2$. It is well known that the second derivative of $\log\Gamma(x)$ can be expressed as the series
$$\frac{d^2}{dx^2}\log\Gamma(x) = \sum_{k=0}^{\infty}\frac{1}{(x+k)^2}. \tag{30}$$
For more details, see Merkle [27] and Section 11.14 (iv) of Schilling [28].
Using Equation (30), we have
$$\begin{aligned}
\frac{d^2}{dx^2}\log c_4(x) &= \frac{1}{4}\left[\sum_{k=0}^{\infty}\frac{1}{(x/2+k)^2} - \frac{1}{2}\sum_{k=0}^{\infty}\frac{1}{(x/2-1/2+k)^2} - \frac{1}{2}\sum_{k=0}^{\infty}\frac{1}{(x/2+1/2+k)^2}\right]\\
&= \frac{1}{4}\sum_{k=0}^{\infty}\left\{\frac{1}{(x/2+k)^2} - \frac{1}{2(x/2-1/2+k)^2} - \frac{1}{2(x/2+1/2+k)^2}\right\}\\
&= \frac{1}{4}\sum_{k=0}^{\infty}\frac{-\frac{3}{4}(x/2-1/2+k)^2 - \frac{3}{4}(x/2-1/2+k) - \frac{1}{8}}{(x/2+k)^2\,(x/2-1/2+k)^2\,(x/2+1/2+k)^2} < 0.
\end{aligned}$$
Thus, $c_4(x)$ is log-concave, which guarantees that $c_4(x)$ is concave. Then, using Jensen's inequality, we have
$$\frac{1}{m}\sum_{i=1}^m c_4(n_i) \le c_4(\bar{n}),$$
where $\bar{n} = \sum_{i=1}^m n_i/m$. This completes the proof.  □
Remark 3.
It is worth mentioning that one can argue for an alternative based on the average sample size $\bar{n} = \sum_{i=1}^m n_i/m$; see, for example, Section 6.3.2 of Montgomery [9]. In such a setting, one may consider the following quotient as an estimator of σ:
$$\bar{S}^* = \frac{\bar{S}}{c_4(\bar{n})}, \tag{31}$$
where
$$\bar{S} = \frac{1}{m}\sum_{i=1}^m S_i. \tag{32}$$
However, this estimator is not unbiased because
$$E[\bar{S}^*] = \frac{1}{m}\sum_{i=1}^m \frac{E[S_i]}{c_4(\bar{n})} = \left\{\frac{1}{m}\sum_{i=1}^m \frac{c_4(n_i)}{c_4(\bar{n})}\right\}\cdot\sigma.$$
Because $\frac{1}{m}\sum_{i=1}^m c_4(n_i) \le c_4(\bar{n})$ from Lemma 4, $\bar{S}^*$ can underestimate the true parameter σ. However, for the case of an equal sample size ($n_1 = n_2 = \cdots = n_m = n$), we have $\bar{S}^* = \bar{S}/c_4(n)$, so that $\bar{S}_A = \bar{S}_B = \bar{S}_C = \bar{S}^*$. Thus, in this special case of an equal sample size, $\bar{S}^*$ is unbiased.
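The amount of underestimation in Remark 3 is easy to quantify: the ratio $E[\bar{S}^*]/\sigma$ depends only on the sample sizes. A small check of our own (the function name `expectation_ratio` is ours):

```python
import math

def c4(n):
    return math.sqrt(2.0 / (n - 1)) * math.exp(math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0))

def expectation_ratio(sizes):
    """E[S*]/sigma = sum c4(n_i) / (m * c4(nbar)); equals 1 only for equal sizes."""
    m = len(sizes)
    nbar = sum(sizes) / m
    return sum(c4(n) for n in sizes) / (m * c4(nbar))

r_unequal = expectation_ratio([3, 5, 7])  # strictly below 1: S* underestimates sigma
r_equal = expectation_ratio([5, 5, 5])    # exactly 1: unbiased in the equal-size case
```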
Remark 4.
There is another alternative in Subsection 3.8.B of ASTM [10], which is based on the weighted average of the sample standard deviations:
$$\bar{S}_w = \frac{n_1S_1+n_2S_2+\cdots+n_mS_m}{n_1+n_2+\cdots+n_m} = \frac{\sum_{i=1}^m n_iS_i}{N}. \tag{33}$$
Then, we have
$$E[\bar{S}_w] = \frac{\sum_{i=1}^m n_i\,c_4(n_i)}{N}\cdot\sigma.$$
It deserves mentioning that this estimator also underestimates the true parameter σ because $c_4(x) < 1$ from Lemma 1. In addition, for the case of an equal sample size, we have $\bar{S}_w = \bar{S}$, where $\bar{S} = \frac{1}{m}\sum_{i=1}^m S_i$.
Theorem 5.
Let $S_N^2 = \sum_{i=1}^m\sum_{j=1}^{n_i}(X_{ij}-\bar{\bar{X}})^2/(N-1)$ and $\bar{\bar{X}} = \frac{1}{N}\sum_{i=1}^m\sum_{j=1}^{n_i}X_{ij}$. Then, $\bar{\bar{X}}$ and $\bar{S}_E = S_N/c_4(N)$ are the uniform minimum variance unbiased estimators of μ and σ, respectively. Thus, we have
$$\operatorname{Var}(\bar{S}_D) \ge \operatorname{Var}(\bar{S}_E).$$
Proof. 
One can obtain the uniform minimum variance unbiased estimator (UMVUE) using complete sufficient statistics as described in Theorem 7.3.23 of Casella and Berger [29]. For more details on the UMVUE, one can refer to Definition 1.6 of Lehmann and Casella [30].
The control charts we have developed are under the assumption that the $X_{ij}$ are independent and identically distributed as normal with mean μ and variance $\sigma^2$, which leads to the joint complete sufficient statistics for μ and $\sigma^2$ given by $\bar{\bar{X}} = \frac{1}{N}\sum_{i=1}^m\sum_{j=1}^{n_i}X_{ij}$ and $\sum_{i=1}^m\sum_{j=1}^{n_i}(X_{ij}-\bar{\bar{X}})^2$, respectively. Since $E[\bar{\bar{X}}] = \mu$ and $E[\bar{S}_E] = \sigma$, $\bar{\bar{X}}$ and $\bar{S}_E$ are the UMVUEs of μ and σ, respectively. This completes the proof.  □
Remark 5.
It is noteworthy that, analogous to Equation (18), we have
$$\operatorname{Var}(\bar{S}_E) = \sigma^2\left\{\frac{1}{c_4(N)^2}-1\right\}, \tag{34}$$
which also results in $\operatorname{Var}(\bar{S}_D) \ge \operatorname{Var}(\bar{S}_E)$ since $c_4(x)$ is increasing from Lemma 1.
Remark 6.
Although $\bar{S}_E$ attains the minimum variance, we do not adopt this estimator to construct the control charts, for the following reason. Consider the case where the $X_{ij}$ are independent and normally distributed with different means $\mu_i$ for $j = 1, 2, \ldots, n_i$ and common variance $\sigma^2$. Then, the pooled sample variance $S_p^2$ is a complete sufficient statistic; see Example 2.3 in Section 2.2 of Lehmann and Casella [30]. This implies that $\bar{S}_E$ is better under the null hypothesis $H_0: \mu_1 = \mu_2 = \cdots = \mu_m$ (in control), whereas $\bar{S}_D$ is better under the alternative (out of control). Thus, if the process is out of control, the control charts using $\bar{S}_E$ could have wider control limits, which may increase the rate of incorrectly signaling that the process is in control when it is actually out of control.
In addition, one can consider the case of a non-constant σ. In this heteroscedastic case, Burr [13] mentioned that $\bar{S}_C$ seems preferable to $\bar{S}_D$. We think that this case should be investigated more thoroughly in a sequel paper.

4. Comparison of the Performance

In this section, we provide the relative efficiency of the methods to compare their performance. We also carried out Monte Carlo simulations to compare the empirical biases and variances.

4.1. Relative Efficiency

When we compare the performance of unbiased estimators (say, $\hat{\theta}_1$ and $\hat{\theta}_0$), the relative efficiency (RE) is widely used in the statistics literature; see Section 2.2 of Lehmann [31] for more details. The RE of $\hat{\theta}_1$ with respect to $\hat{\theta}_0$ is given by
$$\operatorname{RE}(\hat{\theta}_1 \mid \hat{\theta}_0) = \frac{\operatorname{Var}(\hat{\theta}_0)}{\operatorname{Var}(\hat{\theta}_1)},$$
where $\hat{\theta}_0$ is a reference estimator. In general, the estimator with the smaller variance is used as the reference so that $\operatorname{RE}(\hat{\theta}_1 \mid \hat{\theta}_0) \le 1$.
To estimate the location parameter, we considered $\bar{\bar{X}}_A$ in Equation (1) and $\bar{\bar{X}}_B$ in Equation (2). The RE of $\bar{\bar{X}}_A$ with respect to $\bar{\bar{X}}_B$ is easily obtained as
$$\operatorname{RE}(\bar{\bar{X}}_A \mid \bar{\bar{X}}_B) = \frac{m^2}{\sum_{i=1}^m n_i \cdot \sum_{i=1}^m n_i^{-1}}.$$
Note that $\operatorname{RE}(\bar{\bar{X}}_A \mid \bar{\bar{X}}_B) \le 1$, where the equality holds if and only if $n_1 = n_2 = \cdots = n_m$, due to the inequality of the arithmetic and harmonic means [32].
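The location RE above is a one-line computation; a quick numeric check of our own (the helper name `re_location` is ours):

```python
def re_location(sizes):
    """RE(Xbarbar_A | Xbarbar_B) = m^2 / (sum n_i * sum 1/n_i); <= 1 by the AM-HM inequality."""
    m = len(sizes)
    return m**2 / (sum(sizes) * sum(1.0 / n for n in sizes))

re_uneq = re_location([3, 5, 7])   # below 1: the weighted grand mean is more efficient
re_eq = re_location([6, 6, 6, 6])  # equality holds only for equal sizes
```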
For the scale parameter, we considered the five estimators. Among them, $\bar{S}_E$ has the smallest variance. Thus, it can be used as a reference to compare the performance of the scale estimators, and the RE is then given by
$$\operatorname{RE}(\bar{S}_j \mid \bar{S}_E) = \frac{\operatorname{Var}(\bar{S}_E)}{\operatorname{Var}(\bar{S}_j)},$$
where $j = A, B, C, D$. For notational brevity, we denote
$$\operatorname{RE}(\bar{S}_j) = \operatorname{RE}(\bar{S}_j \mid \bar{S}_E).$$
It is immediate from Equations (14)–(16), (18), and (34) that we have
$$\operatorname{RE}(\bar{S}_A) = \left\{\frac{1}{c_4(N)^2}-1\right\}\cdot\frac{m^2}{\sum_{i=1}^m\left\{\frac{1}{c_4(n_i)^2}-1\right\}}, \quad \operatorname{RE}(\bar{S}_B) = \left\{\frac{1}{c_4(N)^2}-1\right\}\cdot\frac{\left\{\sum_{i=1}^m c_4(n_i)\right\}^2}{\sum_{i=1}^m\{1-c_4(n_i)^2\}}, \quad \operatorname{RE}(\bar{S}_C) = \left\{\frac{1}{c_4(N)^2}-1\right\}\cdot\sum_{i=1}^m\frac{c_4(n_i)^2}{1-c_4(n_i)^2},$$
and
$$\operatorname{RE}(\bar{S}_D) = \left\{\frac{1}{c_4(N)^2}-1\right\}\cdot\frac{c_4(N-m+1)^2}{1-c_4(N-m+1)^2}.$$
It can be easily seen from Equations (29) and (34) that $\operatorname{RE}(\bar{S}_A) \le \operatorname{RE}(\bar{S}_B) \le \operatorname{RE}(\bar{S}_C) \le \operatorname{RE}(\bar{S}_D) \le 1$. In particular, when $n_1 = n_2 = \cdots = n_m$, we have $\operatorname{RE}(\bar{S}_A) = \operatorname{RE}(\bar{S}_B) = \operatorname{RE}(\bar{S}_C) \le \operatorname{RE}(\bar{S}_D)$.
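The RE chain for the unbiased scale estimators can be verified numerically from the formulas above; this is our own sketch with an arbitrary size configuration:

```python
import math

def c4(n):
    return math.sqrt(2.0 / (n - 1)) * math.exp(math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0))

def re_scale(sizes):
    """RE(S_A), RE(S_B), RE(S_C), RE(S_D) relative to the UMVUE S_E."""
    m, N = len(sizes), sum(sizes)
    c = [c4(n) for n in sizes]
    v_e = 1.0 / c4(N)**2 - 1.0
    re_a = v_e * m**2 / sum(1 / ci**2 - 1 for ci in c)
    re_b = v_e * sum(c)**2 / sum(1 - ci**2 for ci in c)
    re_c = v_e * sum(ci**2 / (1 - ci**2) for ci in c)
    cq = c4(N - m + 1)
    re_d = v_e * cq**2 / (1 - cq**2)
    return re_a, re_b, re_c, re_d

res = re_scale([3, 5, 7])  # RE(S_A) <= RE(S_B) <= RE(S_C) <= RE(S_D) <= 1
```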
We have used the RE to compare the performance of the above unbiased estimators. However, $\bar{S}^*$ in Equation (31) and $\bar{S}_w$ in Equation (33) are not unbiased, as mentioned in Remarks 3 and 4, respectively. In addition, $\bar{S}$ in Equation (32) is not unbiased, as is easily seen from Equation (3). Thus, it is reasonable to use the mean square error (MSE) to obtain the RE, since the MSE can be regarded as an overall measure of bias and dispersion. For other uses and modifications of the RE, one can refer to Park et al. [33,34] and Ouyang et al. [35], who considered the ratio of the determinants of the covariance matrices, that is, the generalized variances [36,37]. Since the MSE equals the variance for the unbiased estimators ($\bar{S}_A$, $\bar{S}_B$, $\bar{S}_C$, and $\bar{S}_D$), the RE based on the MSE is the same as that based on the variance in the unbiased case. In addition, the variance of $\bar{S}_E$ is the same as its MSE. Thus, we consider the RE of $\bar{S}$, $\bar{S}^*$, and $\bar{S}_w$ as follows. We denote them by
$$\operatorname{RE}(\bar{S}) = \frac{\operatorname{Var}(\bar{S}_E)}{\operatorname{MSE}(\bar{S})}, \quad \operatorname{RE}(\bar{S}^*) = \frac{\operatorname{Var}(\bar{S}_E)}{\operatorname{MSE}(\bar{S}^*)}, \quad\text{and}\quad \operatorname{RE}(\bar{S}_w) = \frac{\operatorname{Var}(\bar{S}_E)}{\operatorname{MSE}(\bar{S}_w)}.$$
Their biases are easily obtained using Equation (3):
$$\operatorname{Bias}(\bar{S}) = \left\{\frac{1}{m}\sum_{i=1}^m c_4(n_i) - 1\right\}\sigma, \quad \operatorname{Bias}(\bar{S}^*) = \left\{\frac{\sum_{i=1}^m c_4(n_i)}{m\,c_4(\bar{n})} - 1\right\}\sigma,$$
and
$$\operatorname{Bias}(\bar{S}_w) = \left\{\frac{\sum_{i=1}^m n_i\,c_4(n_i)}{N} - 1\right\}\sigma.$$
In addition, the variances are easily obtained using $\operatorname{Var}(S_i) = \sigma^2\{1-c_4(n_i)^2\}$, so that we have
$$\operatorname{Var}(\bar{S}) = \frac{\sigma^2}{m^2}\sum_{i=1}^m\{1-c_4(n_i)^2\}, \quad \operatorname{Var}(\bar{S}^*) = \frac{1}{c_4(\bar{n})^2}\cdot\frac{\sigma^2}{m^2}\sum_{i=1}^m\{1-c_4(n_i)^2\},$$
and
$$\operatorname{Var}(\bar{S}_w) = \frac{\sigma^2}{N^2}\sum_{i=1}^m n_i^2\{1-c_4(n_i)^2\}.$$
Considering that the MSE is the variance plus the squared bias, we obtain the RE based on the MSE:
$$\operatorname{RE}(\bar{S}) = \left\{\frac{1}{c_4(N)^2}-1\right\}\cdot\frac{m^2}{\sum_{i=1}^m\{1-c_4(n_i)^2\} + \left\{\sum_{i=1}^m c_4(n_i) - m\right\}^2}, \quad \operatorname{RE}(\bar{S}^*) = \left\{\frac{1}{c_4(N)^2}-1\right\}\cdot\frac{m^2\,c_4(\bar{n})^2}{\sum_{i=1}^m\{1-c_4(n_i)^2\} + \left\{\sum_{i=1}^m c_4(n_i) - m\,c_4(\bar{n})\right\}^2},$$
and
$$\operatorname{RE}(\bar{S}_w) = \left\{\frac{1}{c_4(N)^2}-1\right\}\cdot\frac{N^2}{\sum_{i=1}^m n_i^2\{1-c_4(n_i)^2\} + \left\{\sum_{i=1}^m n_i\,c_4(n_i) - N\right\}^2}.$$

4.2. Empirical Biases and Variances

The RE is a useful statistical tool to compare the performance of unbiased estimators. However, the conventional methods such as in Equations (31)–(33) are biased. To compare these with the methods provided here, we obtain the empirical biases and variances by carrying out Monte Carlo simulations.
We consider two cases: an equal sample size and unequal sample sizes. We first generated samples of equal size with $n_1 = n_2 = n_3 = 3$, with $n_1 = n_2 = n_3 = 10$, and with $n_1 = n_2 = n_3 = 20$. We next generated samples of unequal sizes with $n_1 = 3, n_2 = 5, n_3 = 7$, with $n_1 = 5, n_2 = 10, n_3 = 15$, and with $n_1 = 10, n_2 = 20, n_3 = 30$. Again, let $X_{ij}$ be the jth observation in the ith sample (subgroup) of size $n_i$. The $X_{ij}$ were generated from the normal distribution with mean $\mu_0 = 100$ and standard deviation $\sigma_0 = 10$, and we obtained the scale estimates including the unbiased estimators ($\bar{S}_A$, $\bar{S}_B$, $\bar{S}_C$, $\bar{S}_D$, $\bar{S}_E$) and the conventional methods ($\bar{S}$, $\bar{S}^*$, $\bar{S}_w$). To obtain the empirical biases and variances of these estimates, we repeated this simulation ten million times ($I = 10^7$); the results are summarized in Table 1 for the case of an equal sample size and Table 2 for the case of unequal sample sizes. In addition, the MSEs along with the squared empirical biases (red) and variances (light blue) are plotted in Figure 1 for the case of an equal sample size and in Figure 2 for the case of unequal sample sizes. In the figures, $\bar{S}_A$, $\bar{S}_B$, $\bar{S}_C$, $\bar{S}_D$, $\bar{S}$, $\bar{S}^*$, and $\bar{S}_w$ are denoted by A, B, C, D, S, S*, and Sw, respectively.
Comparing the simulation results in Table 1, we observe that the empirical results of $\bar S_A$, $\bar S_B$, $\bar S_C$, and $\bar S^*$ are identical for the case of an equal sample size. These results are quite reasonable, as pointed out in Remark 3. The empirical biases are not exactly zero, although they are quite negligible, because they reflect the random error inherent in Monte Carlo simulation. In addition, $\bar S$ and $\bar S_w$ yield the same results, as pointed out in Remark 4. On the other hand, for the case of unequal sample sizes, the estimators all yield different results.

For the case of an equal sample size, the empirical variances and biases differ noticeably for small sample sizes, but all of them converge as the sample size increases, as also shown in Figure 1. For the case of unequal sample sizes, the empirical biases shrink as the sample sizes increase; however, the variances remain noticeably different even with large sample sizes.

Comparing the empirical variances alone, $\bar S_w$ performs very well, but it is seriously biased. Thus, it is reasonable to compare the MSEs along with the biases, and we conclude that $\bar S_D$ is the best overall. Note that we do not recommend the use of $\bar S_E$, even though $\bar S_E$ has the best results in all the measures, because it can lead to degraded performance when the process is out of control, as mentioned in Remark 6.

5. Construction of the Control Charts with Unequal Sample Sizes

We briefly introduce how to construct the control charts and then discuss how to implement the estimators provided here to construct the S chart in Section 5.1 and improve the S chart using the probability limits in Section 5.2. We also discuss the construction of the X ¯ chart in Section 5.3.
In general, statistical quality control charts are constructed in two phases [9,38], usually denoted Phase-I and Phase-II. One first establishes control limits with a set of stable manufacturing process data in Phase-I and then monitors the process in Phase-II using those limits. We assume that we have m samples from a stable manufacturing process (Phase-I), where the $i$th sample has size $n_i$, $i=1,2,\ldots,m$, and that we monitor the process with a sample of size $n_k$ (Phase-II).

5.1. The S Chart

From statistical asymptotic theory (for example, Corollaries 6–10 of Arnold [39]), we have the approximate distribution
\[
\frac{S_k - E(S_k)}{\mathrm{SE}(S_k)} \;\dot\sim\; N(0,1),
\]
where $S_k$ is the sample standard deviation of a sample of size $n_k$. In order to construct the $\mathrm{CL}\pm 3\cdot\mathrm{SE}$ control limits, we set $\{S_k - E(S_k)\}/\mathrm{SE}(S_k) = \pm 3$. Solving this for $S_k$, we have
\[
E(S_k)\pm 3\cdot \mathrm{SE}(S_k) = c_4(n_k)\,\sigma \pm 3\sqrt{1-c_4(n_k)^2}\,\sigma.
\]
Since σ is generally unknown, we need to estimate σ . One can use S ¯ A in Equation (5), S ¯ B in Equation (6), S ¯ C in Equation (7), and  S ¯ D in Equation (11). Using S ¯ A , we can construct the S chart with sample size n k as follows:
\[
\begin{aligned}
\mathrm{UCL}_A &= \frac{c_4(n_k)}{m}\sum_{i=1}^{m}\frac{S_i}{c_4(n_i)} + 3\sqrt{1-c_4(n_k)^2}\cdot\frac{1}{m}\sum_{i=1}^{m}\frac{S_i}{c_4(n_i)},\\
\mathrm{CL}_A &= \frac{c_4(n_k)}{m}\sum_{i=1}^{m}\frac{S_i}{c_4(n_i)},\\
\mathrm{LCL}_A &= \frac{c_4(n_k)}{m}\sum_{i=1}^{m}\frac{S_i}{c_4(n_i)} - 3\sqrt{1-c_4(n_k)^2}\cdot\frac{1}{m}\sum_{i=1}^{m}\frac{S_i}{c_4(n_i)},
\end{aligned}
\]
where we assign zero to LCL if it is negative. Next, using S ¯ B , we construct the S chart as follows:
\[
\begin{aligned}
\mathrm{UCL}_B &= c_4(n_k)\cdot\frac{\sum_{i=1}^{m}S_i}{\sum_{i=1}^{m}c_4(n_i)} + 3\sqrt{1-c_4(n_k)^2}\cdot\frac{\sum_{i=1}^{m}S_i}{\sum_{i=1}^{m}c_4(n_i)},\\
\mathrm{CL}_B &= c_4(n_k)\cdot\frac{\sum_{i=1}^{m}S_i}{\sum_{i=1}^{m}c_4(n_i)},\\
\mathrm{LCL}_B &= c_4(n_k)\cdot\frac{\sum_{i=1}^{m}S_i}{\sum_{i=1}^{m}c_4(n_i)} - 3\sqrt{1-c_4(n_k)^2}\cdot\frac{\sum_{i=1}^{m}S_i}{\sum_{i=1}^{m}c_4(n_i)}.
\end{aligned}
\]
In addition, using S ¯ C , we can construct the S chart as follows:
\[
\begin{aligned}
\mathrm{UCL}_C &= c_4(n_k)\cdot\frac{\sum_{i=1}^{m}\frac{c_4(n_i)\,S_i}{1-c_4(n_i)^2}}{\sum_{i=1}^{m}\frac{c_4(n_i)^2}{1-c_4(n_i)^2}} + 3\sqrt{1-c_4(n_k)^2}\cdot\frac{\sum_{i=1}^{m}\frac{c_4(n_i)\,S_i}{1-c_4(n_i)^2}}{\sum_{i=1}^{m}\frac{c_4(n_i)^2}{1-c_4(n_i)^2}},\\
\mathrm{CL}_C &= c_4(n_k)\cdot\frac{\sum_{i=1}^{m}\frac{c_4(n_i)\,S_i}{1-c_4(n_i)^2}}{\sum_{i=1}^{m}\frac{c_4(n_i)^2}{1-c_4(n_i)^2}},\\
\mathrm{LCL}_C &= c_4(n_k)\cdot\frac{\sum_{i=1}^{m}\frac{c_4(n_i)\,S_i}{1-c_4(n_i)^2}}{\sum_{i=1}^{m}\frac{c_4(n_i)^2}{1-c_4(n_i)^2}} - 3\sqrt{1-c_4(n_k)^2}\cdot\frac{\sum_{i=1}^{m}\frac{c_4(n_i)\,S_i}{1-c_4(n_i)^2}}{\sum_{i=1}^{m}\frac{c_4(n_i)^2}{1-c_4(n_i)^2}}.
\end{aligned}
\]
Of particular note is that, for the case of an equal sample size ($n_1=n_2=\cdots=n_m=n$) in Phase-I, it is easily seen that $\mathrm{UCL}_A=\mathrm{UCL}_B=\mathrm{UCL}_C$, $\mathrm{CL}_A=\mathrm{CL}_B=\mathrm{CL}_C$, and $\mathrm{LCL}_A=\mathrm{LCL}_B=\mathrm{LCL}_C$. Thus, we have the S chart below with sample size $n_k$ to use in Phase-II:
\[
\begin{aligned}
\mathrm{UCL} &= \frac{c_4(n_k)}{c_4(n)}\,\bar S + 3\,\frac{\sqrt{1-c_4(n_k)^2}}{c_4(n)}\,\bar S,\\
\mathrm{CL} &= \frac{c_4(n_k)}{c_4(n)}\,\bar S,\\
\mathrm{LCL} &= \frac{c_4(n_k)}{c_4(n)}\,\bar S - 3\,\frac{\sqrt{1-c_4(n_k)^2}}{c_4(n)}\,\bar S,
\end{aligned}
\]
where $\bar S = \sum_{i=1}^{m} S_i/m$. We assign zero to LCL if it is negative. Furthermore, if we assume $n_k=n$, we have
\[
\begin{aligned}
\mathrm{UCL} &= \bar S + 3\,\frac{\sqrt{1-c_4(n)^2}}{c_4(n)}\,\bar S = B_4(n)\,\bar S,\\
\mathrm{CL} &= \bar S,\\
\mathrm{LCL} &= \bar S - 3\,\frac{\sqrt{1-c_4(n)^2}}{c_4(n)}\,\bar S = B_3(n)\,\bar S,
\end{aligned}
\]
where $B_3(n) = \max\!\left\{1 - 3\sqrt{1-c_4(n)^2}/c_4(n),\; 0\right\}$ and $B_4(n) = 1 + 3\sqrt{1-c_4(n)^2}/c_4(n)$. This is the well-known S chart introduced in the quality control literature; for example, see the chart in Equation (6.27) of Montgomery [9]. This indicates that the proposed S chart includes the existing S chart as a special case.
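Since $B_3(n)$ and $B_4(n)$ are simple functions of $c_4(n)$, they are easy to reproduce numerically. The sketch below is in Python rather than the R of Appendix B, and the helper names are ours; it evaluates $c_4$ through the log-gamma function and recovers the familiar tabled constants, e.g., $B_3(5)=0$ and $B_4(5)\approx 2.089$.

```python
from math import lgamma, exp, sqrt

def c4(n):
    # c4(n) computed on the log scale to avoid overflow for large n
    return sqrt(2.0 / (n - 1)) * exp(lgamma(n / 2.0) - lgamma((n - 1) / 2.0))

def B3(n):
    # lower S-chart constant, truncated at zero for small n
    return max(1.0 - 3.0 * sqrt(1.0 - c4(n) ** 2) / c4(n), 0.0)

def B4(n):
    # upper S-chart constant
    return 1.0 + 3.0 * sqrt(1.0 - c4(n) ** 2) / c4(n)
```

With these two functions, the equal-sample-size S chart reduces to the textbook limits $B_3(n)\bar S$ and $B_4(n)\bar S$.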
Using S ¯ D , we construct the S chart as follows:
\[
\begin{aligned}
\mathrm{UCL}_D &= c_4(n_k)\cdot\frac{S_p}{c_4(N-m+1)} + 3\sqrt{1-c_4(n_k)^2}\cdot\frac{S_p}{c_4(N-m+1)},\\
\mathrm{CL}_D &= c_4(n_k)\cdot\frac{S_p}{c_4(N-m+1)},\\
\mathrm{LCL}_D &= c_4(n_k)\cdot\frac{S_p}{c_4(N-m+1)} - 3\sqrt{1-c_4(n_k)^2}\cdot\frac{S_p}{c_4(N-m+1)},
\end{aligned}
\]
where $S_p = \left\{\sum_{i=1}^{m}(n_i-1)S_i^2/(N-m)\right\}^{1/2}$ and $N = \sum_{i=1}^{m} n_i$. Unlike the previous cases, this control chart is not the same as the one in Equation (36), even for the case of an equal sample size.

5.2. The S and S 2 Charts with Probability Limits

We can improve the above S charts by using probability limits, as mentioned in Section 4.7.4 of Ryan [40]. It follows from the Chi-square distribution result
\[
\frac{(n_k-1)S_k^2}{\sigma^2} \sim \chi^2_{n_k-1}
\]
that
\[
P\!\left(\chi^2_{1-\alpha/2,\,n_k-1} \le \frac{(n_k-1)S_k^2}{\sigma^2} \le \chi^2_{\alpha/2,\,n_k-1}\right) = 1-\alpha,
\]
where $\chi^2_{\gamma,\nu}$ is the $\gamma$th upper quantile of the Chi-square distribution with $\nu$ degrees of freedom. Solving Equation (37) for $S_k$, we then have
\[
P\!\left(\sigma\sqrt{\frac{\chi^2_{1-\alpha/2,\,n_k-1}}{n_k-1}} \le S_k \le \sigma\sqrt{\frac{\chi^2_{\alpha/2,\,n_k-1}}{n_k-1}}\right) = 1-\alpha.
\]
Thus, we can construct the S chart with probability limits such that $\mathrm{UCL} = \sigma\{\chi^2_{\alpha/2,\,n_k-1}/(n_k-1)\}^{1/2}$, $\mathrm{CL} = \sigma$, and $\mathrm{LCL} = \sigma\{\chi^2_{1-\alpha/2,\,n_k-1}/(n_k-1)\}^{1/2}$. In practice, since σ is unknown, we need to estimate it. With an estimator $\hat\sigma$, we obtain
\[
\begin{aligned}
\mathrm{UCL} &= \hat\sigma\sqrt{\frac{\chi^2_{\alpha/2,\,n_k-1}}{n_k-1}},\\
\mathrm{CL} &= \hat\sigma,\\
\mathrm{LCL} &= \hat\sigma\sqrt{\frac{\chi^2_{1-\alpha/2,\,n_k-1}}{n_k-1}}.
\end{aligned}
\]
In the above, one can construct the S chart with probability limits by using $\bar S_A$, $\bar S_B$, $\bar S_C$, or $\bar S_D$ as $\hat\sigma$.
Next, we consider the construction of the $S^2$ chart. Solving Equation (37) for $S_k^2$, we then have
\[
P\!\left(\sigma^2\,\frac{\chi^2_{1-\alpha/2,\,n_k-1}}{n_k-1} \le S_k^2 \le \sigma^2\,\frac{\chi^2_{\alpha/2,\,n_k-1}}{n_k-1}\right) = 1-\alpha.
\]
Using the above along with $\hat\sigma^2 = S_p^2$, where $S_p^2$ is the pooled sample variance defined in Equation (9), we can also construct the $S^2$ chart as follows:
\[
\begin{aligned}
\mathrm{UCL} &= S_p^2\,\frac{\chi^2_{\alpha/2,\,n_k-1}}{n_k-1},\\
\mathrm{CL} &= S_p^2,\\
\mathrm{LCL} &= S_p^2\,\frac{\chi^2_{1-\alpha/2,\,n_k-1}}{n_k-1}.
\end{aligned}
\]
Note that none of S ¯ A 2 , S ¯ B 2 , S ¯ C 2 , or S ¯ D 2 is unbiased for σ 2 , whereas S p 2 is unbiased. Thus, it is not recommended to use any of the four estimators to construct the S 2 chart.
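In software, the probability limits above only require a Chi-square quantile function (qchisq in R, or scipy.stats.chi2.ppf in Python). The sketch below stays within the Python standard library by substituting the Wilson-Hilferty approximation for the exact quantile, which is adequate for moderate $n_k$; the function names are ours, and a statistics package should be preferred when exact quantiles are needed.

```python
from math import sqrt
from statistics import NormalDist

def chi2_quantile(p, df):
    # Wilson-Hilferty approximation to the lower-tail chi-square quantile;
    # the upper quantile chi^2_{alpha, df} of the text corresponds to p = 1 - alpha
    z = NormalDist().inv_cdf(p)
    return df * (1.0 - 2.0 / (9.0 * df) + z * sqrt(2.0 / (9.0 * df))) ** 3

def s2_chart_limits(sp2, nk, alpha=0.0027):
    """Probability limits for the S^2 chart using the pooled variance sp2."""
    lcl = sp2 * chi2_quantile(alpha / 2.0, nk - 1) / (nk - 1)
    ucl = sp2 * chi2_quantile(1.0 - alpha / 2.0, nk - 1) / (nk - 1)
    return lcl, sp2, ucl
```

As noted above, $S_p^2$ is the unbiased choice here; substituting the square of one of the scale estimators would introduce bias.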

5.3. The X ¯ Chart

From statistical asymptotic theory, we have
\[
\frac{\bar X_k - E(\bar X_k)}{\mathrm{SE}(\bar X_k)} \;\dot\sim\; N(0,1),
\]
where $\bar X_k$ is the sample mean of a sample of size $n_k$. In order to construct the $\mathrm{CL}\pm 3\cdot\mathrm{SE}$ control limits, we set $\{\bar X_k - E(\bar X_k)\}/\mathrm{SE}(\bar X_k) = \pm 3$. Solving this for $\bar X_k$, we have
\[
E(\bar X_k)\pm 3\cdot \mathrm{SE}(\bar X_k) = \mu \pm \frac{3\sigma}{\sqrt{n_k}}.
\]
Since μ and σ are unknown in practice, we need to estimate them. With the estimates $\hat\mu$ and $\hat\sigma$, we have
\[
\begin{aligned}
\mathrm{UCL} &= \hat\mu + \frac{3\hat\sigma}{\sqrt{n_k}},\\
\mathrm{CL} &= \hat\mu,\\
\mathrm{LCL} &= \hat\mu - \frac{3\hat\sigma}{\sqrt{n_k}}.
\end{aligned}
\]
To estimate μ , one can use X ¯ ¯ A defined in Equation (1) or X ¯ ¯ B defined in Equation (2). To estimate σ , one can use any of S ¯ A , S ¯ B , S ¯ C , and S ¯ D .
As an illustration, using $\bar{\bar X}_B$ and $\bar S_A$, we have the $\bar X$ chart as follows:
\[
\begin{aligned}
\mathrm{UCL}_A &= \bar{\bar X}_B + \frac{3\bar S_A}{\sqrt{n_k}},\\
\mathrm{CL}_A &= \bar{\bar X}_B,\\
\mathrm{LCL}_A &= \bar{\bar X}_B - \frac{3\bar S_A}{\sqrt{n_k}}.
\end{aligned}
\]
Similarly, we can construct various X ¯ charts using a total of eight combinations of μ and σ  estimators.
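The construction above is straightforward to code. The following Python sketch is a transcription of the corresponding R function in Appendix B, restricted to the $\bar{\bar X}_B$ and $\bar S_A$ combination (the function name is ours); applied to the shipment data of Example 1 with $n_k = 25$, it reproduces the limits reported there.

```python
from math import lgamma, exp, sqrt

def c4(n):
    # unbiasing constant computed via log-gamma to avoid overflow
    return sqrt(2.0 / (n - 1)) * exp(lgamma(n / 2.0) - lgamma((n - 1) / 2.0))

def xbar_chart_A(xbar, s, ni, nk):
    """X-bar chart limits using Xbarbar_B (weighted grand mean) and S-bar_A."""
    m = len(ni)
    center = sum(n * x for n, x in zip(ni, xbar)) / sum(ni)   # Xbarbar_B
    sbar_a = sum(si / c4(n) for si, n in zip(s, ni)) / m      # S-bar_A
    half = 3.0 * sbar_a / sqrt(nk)
    return center - half, center, center + half
```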
Remark 7.
It should be noted that, when $\bar S_D$ is used, the $\bar X$ chart is somewhat different from the above chart and is given by
\[
\begin{aligned}
\mathrm{UCL}_D &= \bar{\bar X} + \frac{3\,S_p}{c_4(N-m+1)\sqrt{n_k}},\\
\mathrm{CL}_D &= \bar{\bar X},\\
\mathrm{LCL}_D &= \bar{\bar X} - \frac{3\,S_p}{c_4(N-m+1)\sqrt{n_k}}.
\end{aligned}
\]
As shown in Section 4, $\bar S_D$ performs better than $\bar S_A$, $\bar S_B$, and $\bar S_C$. However, to the best of our knowledge, the $\bar X$ chart based on $\bar S_D$ has not been widely used, probably because the calculation of the normal-consistent unbiasing factor $c_4$ is difficult, especially for a large argument value. Most textbooks provide the values of $c_4(n)$ only for $n \le 25$. In Appendix A, for easy calculation, we provide an approximation of $c_4(n)$ which is accurate within one unit in the ninth decimal place for $n > 25$. Thus, using this approximation, one can easily calculate the LCL and UCL of the $\bar X$ chart based on $\bar S_D$.

6. Average Run Length and Standard Deviation of Run Length

To compare the performance of the control charts based on the various estimators, we obtained empirical estimates of the ARL and the SDRL through extensive Monte Carlo simulation. For this simulation, we considered the $\bar X$ chart. In this chart, we estimated the location only with $\hat\mu = \bar{\bar X}_B$ because the RE of $\bar{\bar X}_B$ is better than that of $\bar{\bar X}_A$, as shown in Section 4. For the scale, we considered seven different estimates ($\bar S_A$, $\bar S_B$, $\bar S_C$, $\bar S_D$, $\bar S$, $\bar S^*$, $\bar S_w$), and we denote the charts based on these scale estimates by A, B, C, D, S, S*, and Sw, respectively.
We assume that we have m samples ($n_1, n_2, \ldots, n_m$) in Phase-I. Again, let $X_{ij}$ denote the $j$th observation in the $i$th sample (subgroup) of size $n_i$. The $X_{ij}$ were generated from the normal distribution with mean $\mu_0=10$ and standard deviation $\sigma_0=5$, and we obtained the location estimate ($\bar{\bar X}_B$) and the seven scale estimates. Using these estimates, we constructed the seven control charts based on $\mathrm{CL}\pm 3\cdot\mathrm{SE}$ control limits with FAR 0.27%. Then, we monitored the process with a new sample of size $n_k$ from the same normal distribution and obtained the run length in Phase-II. We repeated this simulation one million times ($I=10^6$) to obtain the run lengths and then estimated the ARL and SDRL from them. Note that the simulation results do not depend on the particular values of $\mu_0$ and $\sigma_0$, which is quite reasonable since the normal distribution is a location-scale family; the results do, however, depend on the number of samples and the combination of the sample sizes.
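The run-length mechanics of such a simulation can be sketched as follows (Python; the function name is ours). Here the in-control limits use the true parameters, so the estimated ARL should be near the ideal $1/0.0027 \approx 370$; since the in-control run length is geometric, the SDRL is close to the ARL. The subgroup mean is drawn directly from its $N(\mu_0, \sigma_0/\sqrt{n})$ distribution to save computation.

```python
import random
from math import sqrt

def simulate_arl(mu0=10.0, sigma0=5.0, n=10, runs=4000, seed=1):
    """Estimate the in-control ARL/SDRL of a 3-sigma X-bar chart
    whose limits are built from the true (known) parameters."""
    rng = random.Random(seed)
    se = sigma0 / sqrt(n)
    lcl, ucl = mu0 - 3.0 * se, mu0 + 3.0 * se
    lengths = []
    for _ in range(runs):
        rl = 0
        while True:
            rl += 1
            xbar = rng.gauss(mu0, se)     # subgroup mean drawn directly
            if xbar < lcl or xbar > ucl:  # out-of-control signal
                break
        lengths.append(rl)
    arl = sum(lengths) / runs
    sdrl = sqrt(sum((x - arl) ** 2 for x in lengths) / (runs - 1))
    return arl, sdrl
```

In the study reported below, the limits are instead built from Phase-I estimates, which is exactly why the observed ARLs deviate from 370.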
We generated m = 15 samples with the five different scenarios. The sample sizes of each scenario are given by
Scenario I: $n_1=n_2=\cdots=n_5=3$, $n_6=n_7=\cdots=n_{10}=10$, $n_{11}=n_{12}=\cdots=n_{15}=17$;
Scenario II: $n_1=n_2=\cdots=n_5=5$, $n_6=n_7=\cdots=n_{10}=10$, $n_{11}=n_{12}=\cdots=n_{15}=15$;
Scenario III: $n_1=n_2=\cdots=n_5=7$, $n_6=n_7=\cdots=n_{10}=10$, $n_{11}=n_{12}=\cdots=n_{15}=13$;
Scenario IV: $n_1=n_2=\cdots=n_5=9$, $n_6=n_7=\cdots=n_{10}=10$, $n_{11}=n_{12}=\cdots=n_{15}=11$;
Scenario V: $n_1=n_2=\cdots=n_5=10$, $n_6=n_7=\cdots=n_{10}=10$, $n_{11}=n_{12}=\cdots=n_{15}=10$.
We considered Scenarios I–IV for the cases of unequal sample sizes and Scenario V for the case of an equal sample size as a reference. Upon each of these scenarios, we estimated the ARL and SDRL as described above. The simulation results are provided in Table 3.
Note that the ideal ARL under the normal distribution is around $1/0.0027 \approx 370$ with FAR 0.27%. However, this ideal value is attained only when the true $\mu_0$ and $\sigma_0$ are used. In practice, we need to estimate these parameters in Phase-I. Thus, with the uncertainty due to estimation, the target ARL can differ from the ideal ARL of 370. We observed that the empirical results of A, B, C, and S* have the same value under the case of an equal sample size (Scenario V), consistent with their behavior under the RE measure in Section 4.2. These results are expected, as pointed out in Remark 3. The estimated ARL of D (362.58) is very close to that of A, B, C, and S* (364.36). In addition, the estimated SDRL of D (537.31) is very close to that of A, B, C, and S* (541.45). This minor difference is likely due to the randomness of the Monte Carlo simulation. Thus, it is quite reasonable to take the target ARL to be around 360 and the target SDRL to be around 540.
In what follows, we analyze and compare our results based on these target values. Note that we did not consider the values from the control charts based on S and Sw because  they have much smaller ARL values than the others, which is mainly due to the fact that they have serious negative bias even with the case of an equal sample size as seen in Table 1.
In Scenario I, the difference in sample sizes is severe (3, 10, 17). The ARL values of the charts based on A and B seriously overshoot the target value, and their SDRLs are far above the target value. The ARL values of the charts using S and Sw seriously undershoot the target value, while the ARL of the chart using S* undershoots it only slightly. These results are closely related to the RE. In Table 2, $\bar S$ and $\bar S_w$ have noticeably negative biases, while $\bar S^*$ has a relatively small negative bias. This implies that these scale estimates underestimate the true value, which results in narrower control limits; with narrower limits, one obtains a smaller ARL. When it comes to the SDRL, the SDRLs of the charts using A and B seriously overshoot the target and those using S and Sw undershoot it, while the chart using S* overshoots the target. These results are similar to those with the RE.
In Scenarios II–IV, we have observations similar to those in Scenario I. Because the differences in sample sizes are less severe, the effects are not as dramatic as in Scenario I, but the tendencies are quite similar. For example, in Scenario IV, the sample sizes differ only slightly, so the results are quite close to those in Scenario V (equal sample size).

We considered Scenario V as a reference, as mentioned earlier. In this case, we can also observe that the charts using S and Sw have the same results and that their ARL values seriously undershoot the target value. In Remark 4, we pointed out that $\bar S_w = \bar S$ for the case of an equal sample size; we can also observe this in Figure 1a. As mentioned earlier, their underperformance is due to a serious negative bias, as seen in Table 1. The results show that the charts based on C and D perform very well. When the samples are very limited (a small number of subgroups with small sample sizes), we expect the chart based on D to be slightly better than that based on C, as shown in Figure 1. In practice, however, with a decent number of samples, one can use the charts based on either C or D.

7. Illustrative Examples

Here, we provide three real-data examples to illustrate the application of the methods provided here to the $\bar X$ and S control charts. All computations were carried out using the R language [41,42]. The R functions for the $\bar X$ and S charts are given in Appendix B.
Example 1.
We consider the data set presented earlier in Table 30 in Section 3.31 of ASTM [10]. The data sets were obtained from ten shipments whose sample sizes were not equal. The sample sizes are 50, 50, 100, 25, 25, 50, 100, 50, 50, 50. The corresponding sample means are given by 55.7, 54.6, 52.6, 55.0, 53.4, 55.2, 53.3, 52.3, 53.7, 54.3 and the corresponding sample standard deviations are 4.35, 4.03, 2.43, 3.56, 3.10, 3.30, 4.18, 4.30, 2.09, 2.67, respectively. Using these, we have X ¯ ¯ A = 54.01 , X ¯ ¯ B = 53.8 , S ¯ A = 3.420251 , S ¯ B = 3.420254 , S ¯ C = 3.405517 , and  S ¯ D = 3.491055 .
Using the R functions provided in Appendix B, we can obtain the following results. In the R functions, A, B, C, and D denote the control limits based on $\bar S_A$, $\bar S_B$, $\bar S_C$, and $\bar S_D$, respectively. The R function Xbarchart() uses $\bar{\bar X}_B$ as the default for the CL of the $\bar X$ chart since $\bar{\bar X}_B$ performs better, as seen in Section 4.1.
> ni = c(50, 50, 100, 25, 25, 50, 100, 50, 50, 50)
> Xbari = c(55.7, 54.6, 52.6, 55.0, 53.4, 55.2, 53.3, 52.3, 53.7, 54.3)
> Si = c(4.35, 4.03, 2.43, 3.56, 3.10, 3.30, 4.18, 4.30, 2.09, 2.67)
> Xbarchart(Xbari, Si, ni=ni, nk=25)
   LCL CL  UCL
A 51.74785 53.8 55.85215
B 51.74785 53.8 55.85215
C 51.75669 53.8 55.84331
D 51.70537 53.8 55.89463
 
> Schart(Si, ni, nk=25)
   LCL  CL  UCL
A 1.911697 3.384818 4.857940
B 1.911699 3.384822 4.857945
C 1.903462 3.370238 4.837013
D 1.951272 3.454889 4.958505
The control limits are calculated using Methods A–D and the results are summarized in Table 4. The results using Methods A–C are quite close, but those using Method D are slightly different from the others.
In addition, the variances of S ¯ A , S ¯ B , S ¯ C , S ¯ D , and S ¯ E are obtained from Equations (14)–(16), (18), and (34), respectively, and we have Var ( S ¯ A ) = 0.0011375146 · σ 2 , Var ( S ¯ B ) = 0.0011348232 · σ 2 , Var ( S ¯ C ) = 0.0009301593 · σ 2 , Var ( S ¯ D ) = 0.0009263542 · σ 2 , and  Var ( S ¯ E ) = 0.0009111612 · σ 2 . Thus, the REs with respect to S ¯ E are easily obtained from Equation (35) and we have RE ( S ¯ A ) = 80.10 % , RE ( S ¯ B ) = 80.29 % , RE ( S ¯ C ) = 97.96 % , and  RE ( S ¯ D ) = 98.36 % .
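These numbers are easy to reproduce. Writing the variance expressions out explicitly, $\mathrm{Var}(\bar S_A)/\sigma^2 = m^{-2}\sum_i\{1/c_4(n_i)^2-1\}$, $\mathrm{Var}(\bar S_B)/\sigma^2 = \sum_i\{1-c_4(n_i)^2\}/\{\sum_i c_4(n_i)\}^2$, $\mathrm{Var}(\bar S_C)/\sigma^2 = 1/\sum_i c_4(n_i)^2/\{1-c_4(n_i)^2\}$, $\mathrm{Var}(\bar S_D)/\sigma^2 = 1/c_4(N-m+1)^2-1$, and $\mathrm{Var}(\bar S_E)/\sigma^2 = 1/c_4(N)^2-1$, the Python sketch below (function names ours) recovers the variances and REs quoted above for the Example 1 sample sizes.

```python
from math import lgamma, exp, sqrt

def c4(n):
    return sqrt(2.0 / (n - 1)) * exp(lgamma(n / 2.0) - lgamma((n - 1) / 2.0))

def variances(ni):
    """Var(estimator)/sigma^2 for S-bar_A through S-bar_E."""
    m, N = len(ni), sum(ni)
    vA = sum(1.0 / c4(n) ** 2 - 1.0 for n in ni) / m**2
    vB = sum(1.0 - c4(n) ** 2 for n in ni) / sum(c4(n) for n in ni) ** 2
    vC = 1.0 / sum(c4(n) ** 2 / (1.0 - c4(n) ** 2) for n in ni)
    vD = 1.0 / c4(N - m + 1) ** 2 - 1.0
    vE = 1.0 / c4(N) ** 2 - 1.0
    return vA, vB, vC, vD, vE
```

The REs then follow as ratios, e.g., $\mathrm{RE}(\bar S_A) = \mathrm{Var}(\bar S_E)/\mathrm{Var}(\bar S_A) \approx 80.10\%$ for these sample sizes.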
Example 2.
We consider the data set presented earlier in Table 32 in Section 3.31 of ASTM [10]. The data sets were obtained from 21 tension testing machines whose sample sizes were not equal. The sample sizes are 5, 5, 5, 5, 5, 5, 4, 5, 5, 5, 5, 5, 5, 5, 5, 4, 5, 5, 5, 5, 5. The corresponding sample means are given by 73.8, 71.0, 74.2, 71.0, 70.0, 67.0, 73.5, 71.2, 71.2, 71.2, 71.6, 71.2, 74.2, 74.6, 72.4, 75.3, 69.0, 71.8, 72.8, 69.8, 69.00, and the corresponding sample standard deviations are 1.10, 0.71, 0.45, 1.41, 0.00, 2.35, 1.91, 1.79, 0.45, 0.45, 0.55, 0.55, 0.84, 0.55, 0.55, 0.50, 0.71, 0.84, 0.45, 1.30, 0.00, respectively. Using these, we have X ¯ ¯ A = 71.70476 , X ¯ ¯ B = 71.65243 , S ¯ A = 0.8869858 , S ¯ B = 0.8861882 , S ¯ C = 0.8762927 , and  S ¯ D = 1.014672 .
Using the R functions in Appendix B with the above data sets, the control limits are easily calculated and we summarized the results in Table 5.
As was illustrated in Example 1, the variances of S ¯ A , S ¯ B , S ¯ C , S ¯ D , and S ¯ E are obtained from Equations (14)–(16), (18), and (34), respectively, and we have Var ( S ¯ A ) = 0.006484797 · σ 2 , Var ( S ¯ B ) = 0.006477515 · σ 2 , Var ( S ¯ C ) = 0.006434091 · σ 2 , Var ( S ¯ D ) = 0.006116037 · σ 2 , and  Var ( S ¯ E ) = 0.004913916 · σ 2 . Then, the REs with respect to S ¯ E are given by RE ( S ¯ A ) = 75.78 % , RE ( S ¯ B ) = 75.86 % , RE ( S ¯ C ) = 76.37 % , and  RE ( S ¯ D ) = 80.34 % .
Example 3.
We consider the data set presented earlier in Table 6.4 of Montgomery [9] which includes 113 measurements (in millimeters) of the diameters of piston rings for an automotive engine produced by a forging process. The data were obtained from 25 samples and the sample sizes are given by 5, 3, 5, 5, 5, 4, 4, 5, 4, 5, 5, 5, 3, 5, 3, 5, 4, 5, 5, 3, 5, 5, 5, 5, 5. The corresponding sample means are calculated as 74.010, 73.996, 74.008, 74.003, 74.003, 73.996, 73.999, 73.997, 74.004, 73.998, 73.994, 74.001, 73.994, 73.990, 74.008, 73.997, 73.999, 74.007, 73.998, 74.008, 74.000, 74.002, 74.002, 74.005, 73.998, and the corresponding sample standard deviations are 0.0148, 0.0046, 0.0147, 0.0091, 0.0122, 0.0099, 0.0055, 0.0123, 0.0064, 0.0063, 0.0029, 0.0042, 0.0100, 0.0153, 0.0087, 0.0078, 0.0115, 0.0070, 0.0085, 0.0068, 0.0122, 0.0074, 0.0119, 0.0087, 0.0162. Using these, we have X ¯ ¯ A = 74.00068 , X ¯ ¯ B = 74.00066 , S ¯ A = 0.01010231 , S ¯ B = 0.01012067 , S ¯ C = 0.01030545 , and  S ¯ D = 0.01032266 .
Similar to the two examples above, we can obtain the control limits using these data sets. The control limits are calculated and summarized in Table 6. In addition, the variances of S ¯ A , S ¯ B , S ¯ C , S ¯ D , and S ¯ E are obtained as Var ( S ¯ A ) = 0.006472658 · σ 2 , Var ( S ¯ B ) = 0.006390116 · σ 2 , Var ( S ¯ C ) = 0.006020000 · σ 2 , Var ( S ¯ D ) = 0.005697867 · σ 2 , and  Var ( S ¯ E ) = 0.004474206 · σ 2 . Using these, the REs with respect to S ¯ E are calculated as RE ( S ¯ A ) = 69.12 % , RE ( S ¯ B ) = 70.02 % , RE ( S ¯ C ) = 74.32 % , and  RE ( S ¯ D ) = 78.52 % .

8. Conclusions

In this paper, we have considered several unbiased location and scale estimators for the process parameters of the $\bar X$ and S control charts when the sample sizes are not necessarily equal. These estimators are essential for constructing the control limits of Shewhart-type control charts. A natural question is: among these unbiased estimators, which one should be recommended in practical applications? We answered this question by establishing, with rigorous proofs, the inequality relations among the variances of these estimators. We also showed that the conventional ad hoc methods can result in degraded performance of the control charts, mainly because the adopted estimators are all biased and tend to underestimate the true scale parameter.

We also provided the relative efficiency of the methods, along with the conventional methods, and empirical estimates of the ARL and the SDRL obtained through extensive Monte Carlo simulations. We observed that the chart based on $\bar S_D$ outperforms the others under consideration from both theoretical and numerical points of view. The only difficulty in using $\bar S_D$ lies in calculating the normal-consistent unbiasing factor $c_4$ for a large argument value (large sample size) without professional software. To resolve this problem, we provided an approximation of $c_4$ as a function of the sample size which can be evaluated with an ordinary calculator even for large argument values and is accurate within one unit in the ninth decimal place.

All of the theoretical results revealed an interesting and useful connection between the fields of statistics and mathematics. For example, the normal-consistent unbiasing factor $c_4$ can be expressed in closed form through the Watson representation, which helps one understand the behavior of $c_4$ in depth. We thus expect that the new findings about $c_4$ can help quality engineers develop more useful results in various engineering statistics fields.

It is worth mentioning that all the theoretical and empirical results of this paper require the assumption of homoscedasticity of σ. In ongoing work, we are investigating these estimators more thoroughly for the heteroscedastic case.

Author Contributions

C.P. developed methodology and R functions; M.W. investigated mathematical formulas; and C.P. and M.W. wrote the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (NRF-2017R1A2B4004169).

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CL	center line
LCL	lower control limit
UCL	upper control limit
ARL	average run length
SDRL	standard deviation of run length
FAR	false alarm rate
MSE	mean square error
BLUE	best linear unbiased estimator
UMVUE	uniformly minimum variance unbiased estimator
RE	relative efficiency

Appendix A. Calculation of c 4 (n)

Many textbooks provide tables of the values of $c_4(n)$; however, to our knowledge, these are typically limited to $n \le 25$. To construct the control charts using the methods provided here, it is important to calculate $c_4(n)$ accurately, especially for $n > 25$.

Calculating $c_4(n)$ requires evaluating the gamma function (or the factorial), which runs into overflow errors for large values of n. To avoid this problem, one can use the log-gamma function to calculate $c_4(n)$, or an approximation of $c_4(n)$ can be used for an easier and simpler calculation. For example, the two well-known approximations below are widely used for $c_4(n)$:
\[
c_4^{a}(n) \approx \frac{4n-4}{4n-3} \qquad\text{and}\qquad c_4^{b}(n) \approx \sqrt{\frac{4n-5}{4n-3}}.
\]
For more details, refer to Chapter 3 of ASTM [10]. These approximations are based on Stirling's formula [43] and are accurate within one unit in the fourth and fifth decimal places, respectively, for $n > 25$.
However, we can obtain better approximations using Wallis' inequality. For more details on Wallis' inequality, one can refer to Wallis [15], Stedall [44], and Kazarinoff [45]. Here, we provide two approximations,
\[
c_4^{c}(n) \approx \sqrt{\frac{1}{n-1}}\;\sqrt[4]{\,n^2-3n+\frac{5}{2}\,}
\]
and
\[
c_4^{d}(n) \approx \sqrt{\frac{1}{n-1}}\;\sqrt[8]{\,n^4-6n^3+14n^2-15n+6\,},
\]
which are accurate within one unit in the seventh and ninth decimal places, respectively, for $n > 25$. For more details, see Mortici [16].
To compare the above approximations with the true value, we calculate the relative error (multiplied by $10^6$) of each approximation, given by
\[
\epsilon_j(n) = \left|\frac{c_4^{j}(n)-c_4(n)}{c_4(n)}\right| \times 10^{6}, \qquad j=a,b,c,d.
\]
The relative errors for $n = 10, 20, 30, 40, 50$ are calculated and summarized in Table A1. As shown in the table, $c_4^{d}(n)$ provides the best approximation, and its accuracy improves for larger values of n, as expected. We can also observe that the approximation is quite accurate even for smaller values of n, so it can be used in practice for $n \ge 10$, say. This approximation can be useful for field engineers and practitioners because $c_4^{d}(n)$ can be evaluated with a regular calculator.
Table A1. The relative errors $\epsilon_j(n)$ for $n = 10, 20, 30, 40, 50$.

n                 10                                     20                                       30                                             40                                                  50
$c_4(n)$          $\frac{128}{105}\sqrt{\frac{2}{\pi}}$  $\frac{65536}{230945}\sqrt{\frac{38}{\pi}}$  $\frac{33554432}{145422675}\sqrt{\frac{58}{\pi}}$  $\frac{34359738368}{172308161025}\sqrt{\frac{78}{\pi}}$  $\frac{70368744177664}{56433306445425}\sqrt{\frac{2}{\pi}}$
$\epsilon_a(n)$   322.5166918                            79.761632484                             35.2405932138                                  19.7566428390                                       12.6174044235
$\epsilon_b(n)$   63.4846795                             6.814116424                              1.9195517402                                   0.7896677690                                        0.3982547452
$\epsilon_c(n)$   5.7906964                              0.264861432                              0.0472214077                                   0.0141995968                                        0.0056418668
$\epsilon_d(n)$   0.1547649                              0.001535605                              0.0001158804                                   0.0000191278                                        0.0000047817
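The comparison in Table A1 can be reproduced with a few lines of code. The Python sketch below (function names ours) evaluates the exact $c_4(n)$ through the log-gamma function, along with the approximations $c_4^{a}$ and $c_4^{d}$, and returns the relative error multiplied by $10^6$.

```python
from math import lgamma, exp, sqrt

def c4(n):
    # exact value via log-gamma, safe for large n
    return sqrt(2.0 / (n - 1)) * exp(lgamma(n / 2.0) - lgamma((n - 1) / 2.0))

def c4a(n):
    # Stirling-based approximation
    return (4.0 * n - 4.0) / (4.0 * n - 3.0)

def c4d(n):
    # eighth-root approximation based on Wallis' inequality (Mortici [16])
    return sqrt(1.0 / (n - 1)) * (n**4 - 6 * n**3 + 14 * n**2 - 15 * n + 6) ** 0.125

def rel_error_1e6(approx, n):
    # relative error times 10^6, matching the epsilon_j(n) of Table A1
    return abs(approx(n) - c4(n)) / c4(n) * 1e6
```

Evaluating `rel_error_1e6` at $n = 10, \ldots, 50$ reproduces the $\epsilon_a$ and $\epsilon_d$ rows of Table A1.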

Appendix B. R Codes for Illustrative Examples

Xbarchart = function(Xbari, Si, ni, nk, CL=c("XB","XA"), FAR=0.002699796){
 CL = match.arg(CL)
 if (CL=="XA") {
  Xbarbar = mean(Xbari)               # unweighted grand mean (Xbarbar_A)
 }
 else if (CL=="XB") {
  Xbarbar = sum(ni*Xbari) / sum(ni)   # weighted grand mean (Xbarbar_B)
 }
 else {
   stop("Choose the Xbarbar: \"XA\" or \"XB\".")
 }
 N = sum(ni)
 m = length(ni)
 # c4(n) via the log-gamma function to avoid overflow for large n
 c4ni = sqrt(2/(ni-1))*exp(lgamma(ni/2) - lgamma((ni-1)/2))
 one.minus.c4sq = 1-c4ni^2
 S = numeric(4)
 S[1] = sum(Si/c4ni) / m                                         # S-bar_A
 S[2] = sum(Si) / sum(c4ni)                                      # S-bar_B
 S[3] = sum(c4ni*Si/one.minus.c4sq) / sum(c4ni^2/one.minus.c4sq) # S-bar_C
 c4Nm1 = sqrt(2/(N-m))*exp(lgamma((N-m+1)/2) - lgamma((N-m)/2))
 S[4] = sqrt( sum((ni-1)*Si^2) / (N-m) ) / c4Nm1                 # S-bar_D
 z.cut = qnorm(1-FAR/2)
 OUT = array(dim=c(4,3))
 for ( i in 1:4 ) {
    OUT[i,] = Xbarbar + c(-z.cut*S[i]/sqrt(nk), 0, z.cut*S[i]/sqrt(nk))
 }
 colnames(OUT) = c("LCL", "CL", "UCL")
 rownames(OUT) = c("A", "B", "C", "D")
 return(OUT)
}
Schart = function(Si, ni, nk, FAR=0.002699796){
 z.cut = qnorm(1-FAR/2)
 # c4 constants for the monitoring sample size and the Phase-I samples
 c4nk = sqrt(2/(nk-1))*exp(lgamma(nk/2) - lgamma((nk-1)/2))
 c4ni = sqrt(2/(ni-1))*exp(lgamma(ni/2) - lgamma((ni-1)/2))
 N = sum(ni)
 m = length(ni)
 one.minus.c4sq = 1-c4ni^2
 S = numeric(4)
 S[1] = sum(Si/c4ni) / m                                         # S-bar_A
 S[2] = sum(Si) / sum(c4ni)                                      # S-bar_B
 S[3] = sum(c4ni*Si/one.minus.c4sq) / sum(c4ni^2/one.minus.c4sq) # S-bar_C
 c4Nm1 = sqrt(2/(N-m))*exp(lgamma((N-m+1)/2) - lgamma((N-m)/2))
 S[4] = sqrt( sum((ni-1)*Si^2) / (N-m) ) / c4Nm1                 # S-bar_D
 OUT = array(dim=c(4,3))
 for ( i in 1:4 ) {
   CL = c4nk*S[i]
   LCL = max(CL - z.cut*sqrt(1-c4nk^2)*S[i], 0)  # truncate LCL at zero
   UCL = CL + z.cut*sqrt(1-c4nk^2)*S[i]
   OUT[i,] = c(LCL, CL, UCL)
 }
 colnames(OUT) = c("LCL", "CL", "UCL")
 rownames(OUT) = c("A", "B", "C", "D")
 return(OUT)
}

References

  1. Shewhart, W.A. Quality Control Charts. Bell Syst. Tech. J. 1926, 5, 593–603. [Google Scholar] [CrossRef]
  2. Shewhart, W.A. Quality Control. Bell Syst. Tech. J. 1927, 6, 722–735. [Google Scholar] [CrossRef]
  3. Shewhart, W.A. Economic Control of Quality of Manufactured Product; Van Nostrand Reinhold: Princeton, NJ, USA, 1931. [Google Scholar]
  4. Kourti, T.; MacGregor, J.F. Multivariate SPC Methods for Process and Product Monitoring. J. Qual. Technol. 1996, 28, 409–428. [Google Scholar] [CrossRef]
  5. Flores, M.; Fernández-Casal, R.; Naya, S.; Tarrío-Saavedra, J.; Bossano, R. ILS: An R package for statistical analysis in Interlaboratory Studies. Chemom. Intell. Lab. Syst. 2018, 181, 11–20. [Google Scholar] [CrossRef]
  6. Golshan, M.; MacGregor, J.F.; Bruwer, M.J.; Mhaskar, P. Latent Variable Model Predictive Control (LV-MPC) for trajectory tracking in batch processes. J. Process Control 2010, 20, 538–550. [Google Scholar] [CrossRef]
  7. Flores, M. qcr: Quality Control Review; R Package Version 1.2. Available online: https://cran.r-project.org/package=qcr (accessed on 21 April 2020).
  8. Flores, M.; Naya, S.; Fernández-Casal, R.; Zaragoza, S.; Raña, P.; Tarrío-Saavedra, J. Constructing a Control Chart Using Functional Data. Mathematics 2020, 8, 58. [Google Scholar] [CrossRef] [Green Version]
  9. Montgomery, D.C. Statistical Quality Control: A Modern Introduction, 7th ed.; John Wiley & Sons: Singapore, 2013. [Google Scholar]
  10. ASTM E11. Manual on Presentation of Data and Control Chart Analysis, 9th ed.; Luko, S.N., Ed.; American Society for Testing and Materials: West Conshohocken, PA, USA, 2018. [Google Scholar]
  11. ASQC. ASQC Standard A-1 (Proposed): Definitions, Symbols, Formulas and Tables for Control Charts. Ind. Qual. Control 1967, 24, 217–221. [Google Scholar]
Figure 1. (a) n1 = n2 = n3 = 3, (b) n1 = n2 = n3 = 10, and (c) n1 = n2 = n3 = 20.

Figure 2. (a) n1 = 3, n2 = 5, n3 = 7, (b) n1 = 5, n2 = 10, n3 = 15, and (c) n1 = 10, n2 = 20, n3 = 30.
Table 1. The empirical bias, variance, MSE, and RE with an equal sample size.

          S̄_A      S̄_B      S̄_C      S̄_D      S̄_E      S̄        S̄*       S̄_w
(n1 = n2 = n3 = 3)
Bias    0.0005   0.0005   0.0005   0.0004   0.0008  -1.1372   0.0005  -1.1372
Var     9.1087   9.1087   9.1087   8.6503   6.4354   7.1539   9.1087   7.1539
MSE     9.1087   9.1087   9.1087   8.6503   6.4354   8.4473   9.1087   8.4473
RE      0.7065   0.7065   0.7065   0.7439   1.0000   0.7618   0.7065   0.7618
(n1 = n2 = n3 = 10)
Bias    0.0001   0.0001   0.0001   0.0002   0.0002  -0.2733   0.0001  -0.2733
Var     1.9007   1.9007   1.9007   1.8689   1.7388   1.7982   1.9007   1.7982
MSE     1.9007   1.9007   1.9007   1.8689   1.7388   1.8729   1.9007   1.8729
RE      0.9148   0.9148   0.9148   0.9304   1.0000   0.9284   0.9148   0.9284
(n1 = n2 = n3 = 20)
Bias    0.0001   0.0001   0.0001   0.0001   0.0002  -0.1305   0.0001  -0.1305
Var     0.8885   0.8885   0.8885   0.8811   0.8511   0.8655   0.8885   0.8655
MSE     0.8885   0.8885   0.8885   0.8811   0.8511   0.8825   0.8885   0.8825
RE      0.9578   0.9578   0.9578   0.9660   1.0000   0.9644   0.9578   0.9644
Table 2. The empirical bias, variance, MSE, and RE with unequal sample sizes.

          S̄_A      S̄_B      S̄_C      S̄_D      S̄_E      S̄        S̄*       S̄_w
(n1 = 3, n2 = 5, n3 = 7)
Bias    0.0008   0.0008   0.0009   0.0009   0.0007  -0.7140  -0.1211  -0.6164
Var     5.4627   5.2936   4.3850   4.2500   3.6337   4.5640   5.1653   3.8867
MSE     5.4627   5.2936   4.3850   4.2500   3.6337   5.0738   5.1800   4.2667
RE      0.6652   0.6864   0.8287   0.8550   1.0000   0.7162   0.7015   0.8517
(n1 = 5, n2 = 10, n3 = 15)
Bias    0.0006   0.0006   0.0003   0.0003   0.0002  -0.3496  -0.0783  -0.2792
Var     2.5013   2.4512   1.8990   1.8685   1.7388   2.2825   2.4126   1.7990
MSE     2.5013   2.4512   1.8990   1.8685   1.7388   2.4047   2.4188   1.8770
RE      0.6952   0.7094   0.9157   0.9306   1.0000   0.7231   0.7189   0.9264
(n1 = 10, n2 = 20, n3 = 30)
Bias   -0.0001  -0.0001   0.0001   0.0001   0.0002  -0.1634  -0.0331  -0.1319
Var     1.1229   1.1138   0.8885   0.8811   0.8511   1.0777   1.1064   0.8657
MSE     1.1229   1.1138   0.8885   0.8811   0.8511   1.1044   1.1075   0.8831
RE      0.7579   0.7641   0.9579   0.9659   1.0000   0.7706   0.7684   0.9637
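The bias, variance, and MSE entries of Tables 1 and 2 can be estimated by straightforward Monte Carlo simulation. The sketch below is an illustrative Python example (not the code used for the tables, which was written in R): it compares two generic pooled estimators of σ from subgroups of possibly unequal sizes, namely the plain average of the subgroup standard deviations S_k, which is biased low like S̄ in the tables, and the average of the per-subgroup unbiased versions S_k/c4(n_k). The function names `simulate` and `sample_sd` are our own.

```python
import math
import random

def c4(n):
    # Unbiasing constant: E[S] = c4(n) * sigma for a normal sample of size n.
    return math.sqrt(2.0 / (n - 1)) * math.exp(math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0))

def sample_sd(xs):
    # Usual (n - 1)-denominator sample standard deviation.
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def simulate(sizes, sigma=1.0, reps=20000, seed=1):
    # Empirical bias, variance, and MSE of two pooled estimators of sigma
    # built from m = len(sizes) subgroups with (possibly unequal) sizes n_k.
    rng = random.Random(seed)
    plain, adjusted = [], []
    for _ in range(reps):
        sds = [sample_sd([rng.gauss(0.0, sigma) for _ in range(n)]) for n in sizes]
        plain.append(sum(sds) / len(sds))                   # mean of the raw S_k
        adjusted.append(sum(s / c4(n) for s, n in zip(sds, sizes)) / len(sds))
    def summary(vals):
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        bias = mean - sigma
        return bias, var, var + bias ** 2                   # bias, Var, MSE = Var + bias^2
    return summary(plain), summary(adjusted)
```

Running `simulate((3, 5, 7))` shows the pattern of Table 2: the unadjusted pooled average has a clearly negative bias that shrinks as the subgroup sizes grow, while the c4-corrected version is essentially unbiased.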
Table 3. Estimated ARL and SDRL with n_k = 10.

            A         B         C         D        S̄        S̄*       S̄_w
Scenario I
ARL      475.03    456.02    363.61    361.84    257.78    343.39    270.79
SDRL    1301.18   1184.11    536.61    531.45    499.21    777.03    387.18
Scenario II
ARL      390.41    388.19    364.59    362.56    269.79    357.12    274.27
SDRL     654.28    643.59    540.99    533.90    421.37    586.10    391.26
Scenario III
ARL      370.63    370.25    363.84    361.77    273.95    361.84    275.39
SDRL     565.01    563.36    538.59    530.64    400.49    549.63    392.54
Scenario IV
ARL      364.28    364.23    363.63    361.77    275.28    363.39    275.32
SDRL     539.37    539.16    537.54    531.26    393.40    537.95    392.44
Scenario V (equal sample size)
ARL      364.36    364.36    364.36    362.58    275.95    364.36    275.95
SDRL     541.45    541.45    541.45    537.31    395.12    541.45    395.12
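The ARL is the average number of subgroups plotted before the chart signals, and the SDRL is the standard deviation of that run length; for an in-control 3-sigma X̄ chart with known parameters the run length is geometric with signal probability p ≈ 0.0027, so ARL ≈ SDRL ≈ 370. The following minimal Python sketch (assuming known μ and σ; it is not the paper's simulation code) estimates both quantities; for speed, each subgroup mean is drawn directly from its exact normal distribution instead of averaging n observations.

```python
import math
import random

def run_length(mu, se, lcl, ucl, rng, cap=10**6):
    # Number of plotted subgroup means until one falls outside (lcl, ucl).
    # The mean of an in-control subgroup of size n is N(mu, se^2) with
    # se = sigma / sqrt(n), so it is simulated in one draw.
    t = 0
    while t < cap:
        t += 1
        if not lcl < rng.gauss(mu, se) < ucl:
            return t
    return cap

def arl_sdrl(mu=0.0, sigma=1.0, n=10, k=3.0, reps=5000, seed=1):
    # Estimated ARL and SDRL of an X-bar chart with known in-control
    # parameters; the limits are mu +/- k * sigma / sqrt(n).
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)
    rls = [run_length(mu, se, mu - k * se, mu + k * se, rng) for _ in range(reps)]
    arl = sum(rls) / reps
    sdrl = math.sqrt(sum((r - arl) ** 2 for r in rls) / (reps - 1))
    return arl, sdrl
```

With known limits this returns values near 370/370; replacing the known limits with limits estimated from Phase I data is what produces the method-dependent ARL and SDRL values reported in Table 3.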
Table 4. Control limits for the X̄ and S charts for Example 1.

                 X̄ chart                            S chart
Method    LCL        CL       UCL         LCL        CL         UCL
Sample size n_k = 25
A       51.74785   53.80   55.85215    1.911697   3.384818   4.857940
B       51.74785   53.80   55.85215    1.911699   3.384822   4.857945
C       51.75669   53.80   55.84331    1.903462   3.370238   4.837013
D       51.70537   53.80   55.89463    1.951272   3.454889   4.958505
Sample size n_k = 50
A       52.34891   53.80   55.25109    2.369028   3.402846   4.436665
B       52.34891   53.80   55.25109    2.369030   3.402850   4.436669
C       52.35516   53.80   55.24484    2.358823   3.388188   4.417553
D       52.31887   53.80   55.28113    2.418070   3.473290   4.528509
Sample size n_k = 100
A       52.77392   53.80   54.82608    2.683351   3.411625   4.139899
B       52.77392   53.80   54.82608    2.683354   3.411629   4.139903
C       52.77834   53.80   54.82166    2.671792   3.396929   4.122065
D       52.75268   53.80   54.84732    2.738900   3.482250   4.225600
Table 5. Control limits for the X̄ and S charts for Example 2.

                 X̄ chart                           S chart
Method    LCL        CL         UCL        LCL     CL          UCL
Sample size n_k = 4
A       70.32195   71.65243   72.98291    0    0.8171958   1.851804
B       70.32314   71.65243   72.98171    0    0.8164609   1.850139
C       70.33799   71.65243   72.96687    0    0.8073440   1.829480
D       70.13042   71.65243   73.17444    0    0.9348355   2.118381
Sample size n_k = 5
A       70.46241   71.65243   72.84244    0    0.8337539   1.741710
B       70.46348   71.65243   72.84137    0    0.8330041   1.740144
C       70.47676   71.65243   72.82810    0    0.8237026   1.720713
D       70.29110   71.65243   73.01375    0    0.9537773   1.992439
Table 6. Control limits for the X̄ and S charts for Example 3.

                 X̄ chart                            S chart
Method    LCL        CL         UCL        LCL     CL            UCL
Sample size n_k = 3
A       73.98317   74.00066   74.01816    0    0.008952937   0.02299266
B       73.98313   74.00066   74.01819    0    0.008969207   0.02303445
C       73.98281   74.00066   74.01851    0    0.009132968   0.02345501
D       73.98278   74.00066   74.01854    0    0.009148216   0.02349417
Sample size n_k = 4
A       73.98551   74.00066   74.01582    0    0.009307435   0.02109109
B       73.98548   74.00066   74.01584    0    0.009324349   0.02112941
C       73.98521   74.00066   74.01612    0    0.009494595   0.02151520
D       73.98518   74.00066   74.01615    0    0.009510446   0.02155112
Sample size n_k = 5
A       73.98711   74.00066   74.01422    0    0.009496024   0.01983717
B       73.98709   74.00066   74.01424    0    0.009513281   0.01987322
C       73.98684   74.00066   74.01449    0    0.009686976   0.02023607
D       73.98681   74.00066   74.01451    0    0.009703148   0.02026986
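For reference, the textbook 3-sigma X̄ and S chart limits built from the grand mean and the average subgroup standard deviation S̄, via the unbiasing constant c4(n), can be sketched in Python as below. This is a generic Shewhart computation, not an exact reproduction of methods A through D in Tables 4 to 6, though it yields limits of the same form.

```python
import math

def c4(n):
    # E[S] = c4(n) * sigma for a normal sample of size n.
    return math.sqrt(2.0 / (n - 1)) * math.exp(math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0))

def shewhart_limits(xbarbar, sbar, n, k=3.0):
    # Generic k-sigma X-bar and S chart limits from the grand mean xbarbar and
    # the average subgroup standard deviation sbar (equal subgroup sizes n).
    sigma_hat = sbar / c4(n)                      # unbiased estimate of sigma
    hw = k * sigma_hat / math.sqrt(n)             # half-width of the X-bar limits
    x_chart = (xbarbar - hw, xbarbar, xbarbar + hw)
    # S chart: center line sbar (= c4 * sigma_hat); SD of S is sigma * sqrt(1 - c4^2).
    spread = k * sigma_hat * math.sqrt(1.0 - c4(n) ** 2)
    s_chart = (max(0.0, sbar - spread), sbar, sbar + spread)  # LCL truncated at 0
    return x_chart, s_chart
```

For small n the S-chart spread exceeds S̄ itself, so the lower limit truncates at zero; this is why Tables 5 and 6 (n_k ≤ 5) report zero lower limits for the S chart while Table 4 (n_k ≥ 25) does not.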

Park, C.; Wang, M. A Study on the X̄ and S Control Charts with Unequal Sample Sizes. Mathematics 2020, 8, 698. https://doi.org/10.3390/math8050698