Article

A New Class of Quantile Regression Ratio-Type Estimators for Finite Population Mean in Stratified Random Sampling

Department of Statistics, Faculty of Science, Cankiri Karatekin University, Cankiri 18100, Türkiye
*
Author to whom correspondence should be addressed.
Axioms 2023, 12(7), 713; https://doi.org/10.3390/axioms12070713
Submission received: 1 June 2023 / Revised: 18 July 2023 / Accepted: 20 July 2023 / Published: 22 July 2023
(This article belongs to the Section Mathematical Analysis)

Abstract

Quantile regression is one of the alternative regression techniques used when the assumptions of classical regression analysis are not met, and it estimates the values of the study variable at various quantiles of the distribution. This study proposes ratio-type estimators of a population mean that use information from quantile regression in stratified random sampling. The proposed ratio-type estimators are investigated through their mean square error equations. Conditions under which the proposed estimators are more efficient than the classical estimators are derived. Under these conditions, the proposed estimators outperform the classical estimators. In addition, the theoretical results are supported by a real data application.

1. Introduction

Sampling theory concerns the selection of a sample from a population. Today, sampling methods are used in many research fields, such as science, engineering, health and social sciences, opinion polls, and marketing research. A sample is easier to control than the whole population, so researchers prefer to work with a sample rather than the population. Many sampling methods are available in applications. If the population is homogeneous, simple random sampling is the most frequently used sampling method, and if the population is heterogeneous, stratified random sampling is the most frequently used [1]. Stratified sampling is used when the sampling frame contains subgroups (strata) of units. With stratified sampling, inferences about the population take the presence of these strata into account. This method divides a heterogeneous population into homogeneous strata and increases the precision of the estimates [2]. Stratified sampling is also useful for comparing estimates among various groups of the population [3].
In sample surveys, it is very common to use auxiliary information to increase the precision and efficiency of estimators of the total, mean, and variance of finite populations. Auxiliary information is used in ratio, product, regression, and difference estimators to improve precision. These estimators exploit the correlation between the auxiliary variable and the study variable. Under some conditions, they give more precise estimates, with smaller variance, than estimators based on the simple mean [4]. The ratio-type estimator is one of the most commonly used estimators of a finite population total with the help of an auxiliary variable when the correlation between the two variables is positive [5]. There are many studies on ratio-type estimators in the literature. Kadilar and Cingi [6] proposed a ratio estimator in stratified random sampling based on the Prasad [7] estimator. They showed that their proposed estimator gave more efficient results than the combined ratio estimator. Shabbir and Gupta [8] proposed an estimator using a simple transformation introduced by Bedi [9]. They demonstrated by theoretical and numerical results that the proposed estimator is more efficient than the classical combined ratio estimator and the Kadilar and Cingi [6] ratio estimator. On the other hand, Singh et al. [10], using the estimators of Bahl and Tuteja [11] and Kadilar and Cingi [12], proposed a ratio estimator for the population mean in stratified random sampling. They showed theoretically that the proposed estimators are more efficient than other estimators, and they supported this finding with a numerical example. Shahzad et al. [13] proposed an estimator of the population mean in stratified random sampling. Hussain et al. [14] proposed two estimators of the finite population distribution function using additional information about the distribution function and the mean of the auxiliary variable under simple and stratified random sampling. Muneer et al. [15] proposed a family of exponential ratio-type estimators for estimating the finite population mean in stratified random sampling. Cekim and Kadilar [16] proposed a new ratio estimator for the population variance using the ln function in stratified random sampling. Especially in recent years, many estimators have been proposed depending on the data structure. For example, outliers in the data have a negative effect on the estimators. Robust methods are used to eliminate this negative effect (Kadilar et al. [17], Subzar et al. [18], Zaman and Bulut [19], Zaman and Bulut [20], Zaman [21], Ali et al. [22], and Grover and Kaur [23]). For count data, Koc [24] provided a class of estimators for the population mean using Poisson regression.
Quantile regression is useful for visualizing changes in the conditional distribution of datasets and is a very effective method, especially when there are extreme values [25]. Shahzad et al. [26] proposed a class of quantile regression–ratio-type estimators for the population mean when the data are non-normal and contaminated with outliers. Anas et al. [27] presented a class of quantile regression–ratio-type estimators using L-moments to estimate the population mean for the non-normal dataset having outliers in simple random sampling. Anas et al. [28] have presented a modified class of estimators by adapting the idea of Zaman and Bulut [19,20]. Subsequently, they have defined a class of quantile regression-type estimators, which is an effective technique in the presence of extreme observations. Thus, the utilization of quantile regression from Zaman and Bulut’s work has empowered the proposed class of estimators, especially for estimating the population mean in the presence of missing data. Shahzad et al. [29] introduced a robust class of separate-type quantile regression estimators specifically designed to estimate the population means under a stratified random sampling design. Rueda and Arcos [30] investigated the application of the exponentiation method in estimating population quantiles. They developed a modified ratio estimator that is applicable to any sampling design. This modified estimator exhibits a smaller mean squared error when compared to both the conventional estimator and the ratio estimator. Shahzad et al. [31] proposed a robust estimation technique for the population mean utilizing quantile regression in the context of systematic sampling. Shahzad et al. [32] proposed the utilization of quantile regression with minimum covariance determinant-based measures of the location to derive a class of quantile regression-type mean estimators.
This study proposes ratio-type estimators of a population mean using the information on quantile regression for stratified random sampling.
Let a population of size $N$ consist of $L$ strata of sizes $N_1, N_2, \ldots, N_L$, which do not intersect and together form the whole population. We denote the stratum by $h$ and the unit by $i$. The subscript “st” stands for “stratified”. Let samples of sizes $n_1, n_2, \ldots, n_L$ be drawn from these $L$ strata, each of which is considered as a separate population. In stratified random sampling, the mean estimators of the study variable $Y$ and the auxiliary variable $X$ are as follows:
$$\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h, \tag{1}$$
$$\bar{x}_{st} = \sum_{h=1}^{L} W_h \bar{x}_h, \tag{2}$$
where $\bar{y}_h = \frac{1}{n_h} \sum_{i=1}^{n_h} y_{hi}$ is the sample mean of the study variable in the $h$th stratum, $\bar{x}_h = \frac{1}{n_h} \sum_{i=1}^{n_h} x_{hi}$ is the sample mean of the auxiliary variable in the $h$th stratum, and $W_h = \frac{N_h}{N}$ is the stratum weight. The population means of the study variable, $\bar{Y}$, and of the auxiliary variable, $\bar{X}$, are given below:
$$\bar{Y} = \bar{Y}_{st} = \sum_{h=1}^{L} W_h \bar{Y}_h, \tag{3}$$
$$\bar{X} = \bar{X}_{st} = \sum_{h=1}^{L} W_h \bar{X}_h, \tag{4}$$
where $\bar{Y}_h = \frac{1}{N_h} \sum_{i=1}^{N_h} Y_{hi}$ is the population mean of the study variable in the $h$th stratum and $\bar{X}_h = \frac{1}{N_h} \sum_{i=1}^{N_h} X_{hi}$ is the population mean of the auxiliary variable in the $h$th stratum.
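As a small illustration of the weighted means in Equations (3) and (4), the following Python sketch combines stratum means with the weights W_h = N_h / N. The function name `stratified_mean` is hypothetical, and the stratum means are the values read from Table 2 (assumed here to be 52.073, 8.292, 44.583 for X and 39.132, 23.597, 49.681 for Y).

```python
def stratified_mean(N_h, stratum_means):
    """Weighted mean sum_h W_h * mean_h with W_h = N_h / N."""
    N = sum(N_h)
    return sum((Nh / N) * m for Nh, m in zip(N_h, stratum_means))


# Stratum sizes and stratum means as read from Table 2 (assumed values).
N_h = [140, 140, 140]
X_bar_h = [52.073, 8.292, 44.583]   # auxiliary variable
Y_bar_h = [39.132, 23.597, 49.681]  # study variable

X_bar = stratified_mean(N_h, X_bar_h)
Y_bar = stratified_mean(N_h, Y_bar_h)
print(f"X_bar = {X_bar:.4f}, Y_bar = {Y_bar:.4f}")
```

The same function applies to Equations (1) and (2) when the sample means of each stratum are passed instead of the population means.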
The combined regression estimator in stratified random sampling is given in Equation (5), as follows:
$$\hat{\bar{Y}}_{lrc} = \bar{y}_{st} + b_c \left( \bar{X} - \bar{x}_{st} \right), \tag{5}$$
where $b_c$ is the combined regression coefficient in stratified random sampling, obtained from the classical covariance matrix as
$$b_c = \frac{\sum_{h=1}^{L} W_h^2 \lambda_h s_{yxh}}{\sum_{h=1}^{L} W_h^2 \lambda_h s_{xh}^2}. \tag{6}$$
The mean square error of the combined regression estimator given in Equation (5) is as given in Equation (7).
$$\mathrm{MSE}\left( \hat{\bar{Y}}_{lrc} \right) \cong \sum_{h=1}^{L} W_h^2 \lambda_h S_{yh}^2 + \beta_c^2 \sum_{h=1}^{L} W_h^2 \lambda_h S_{xh}^2 - 2 \beta_c \sum_{h=1}^{L} W_h^2 \lambda_h S_{yxh}, \tag{7}$$
where $\beta_c = \frac{\sum_{h=1}^{L} W_h^2 \lambda_h S_{yxh}}{\sum_{h=1}^{L} W_h^2 \lambda_h S_{xh}^2}$ is computed from the classical covariance matrix of the population, $\lambda_h = \frac{1 - n_h / N_h}{n_h}$ denotes the correction term, and $n_h$ is the number of sampled units in the $h$th stratum. Also, $S_{yh}^2$ is the population variance of the study variable in the $h$th stratum, $S_{xh}^2$ is the population variance of the auxiliary variable in the $h$th stratum, and $S_{yxh}$ is the population covariance in the $h$th stratum.
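The mean square error in Equation (7) can be evaluated directly from stratum-level quantities. The following Python sketch is one way to do this, assuming the correction term λ_h = (1 − n_h/N_h)/n_h and the stratum variances and covariances read from Table 2; the function names are hypothetical.

```python
# Stratum-level population quantities as read from Table 2 (assumed values).
N_h = [140, 140, 140]               # stratum sizes
n_h = [45, 45, 45]                  # proportional allocation with n = 135
S2_y = [664.912, 105.425, 614.283]  # variances of the study variable
S2_x = [493.754, 163.307, 360.492]  # variances of the auxiliary variable
S_yx = [448.018, 77.003, 305.43]    # covariances


def stratum_terms(N_h, n_h):
    """Return weights W_h = N_h / N and corrections lambda_h = (1 - n_h/N_h) / n_h."""
    N = sum(N_h)
    W = [Nh / N for Nh in N_h]
    lam = [(1 - nh / Nh) / nh for nh, Nh in zip(n_h, N_h)]
    return W, lam


def weighted_sum(W, lam, values):
    """Compute sum_h W_h^2 * lambda_h * values_h."""
    return sum(w * w * l * v for w, l, v in zip(W, lam, values))


def mse_combined(N_h, n_h, S2_y, S2_x, S_yx):
    """First-order MSE of the combined regression estimator, Equation (7)."""
    W, lam = stratum_terms(N_h, n_h)
    K = weighted_sum(W, lam, S2_y)
    M = weighted_sum(W, lam, S2_x)
    C = weighted_sum(W, lam, S_yx)
    beta_c = C / M  # population slope used in Equation (7)
    return K + beta_c ** 2 * M - 2 * beta_c * C, beta_c


mse_c, beta_c = mse_combined(N_h, n_h, S2_y, S2_x, S_yx)
print(f"beta_c = {beta_c:.4f}, MSE = {mse_c:.4f}")
```

With these inputs, the three weighted sums come out close to the values K ≈ 2.3199, M ≈ 1.7049, and N ≈ 1.3914 reported for the proportional allocation in Section 5.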
In stratified random sampling, the separate regression estimator is as given in Equation (8).
$$\hat{\bar{Y}}_{lrs} = \sum_{h=1}^{L} W_h \left[ \bar{y}_h + b_h \left( \bar{X}_h - \bar{x}_h \right) \right], \tag{8}$$
where $b_h = \frac{s_{xyh}}{s_{xh}^2}$ is obtained by the least squares method. Also, for the auxiliary variable, $s_{xh}^2$ is the sample variance in the $h$th stratum and $s_{xyh}$ is the sample covariance in the $h$th stratum. The mean square error of the separate regression estimator given in Equation (8) is as given in Equation (9).
$$\mathrm{MSE}\left( \hat{\bar{Y}}_{lrs} \right) \cong \sum_{h=1}^{L} W_h^2 \lambda_h S_{yh}^2 + \sum_{h=1}^{L} W_h^2 \lambda_h \beta_h^2 S_{xh}^2 - 2 \sum_{h=1}^{L} W_h^2 \lambda_h \beta_h S_{yxh}, \tag{9}$$
where $\beta_h = \frac{S_{xyh}}{S_{xh}^2}$ is the least squares regression coefficient in the $h$th stratum and $S_{xyh}$ denotes the population covariance in the $h$th stratum.
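Equation (9) differs from Equation (7) only in that each stratum uses its own slope. A minimal Python sketch follows, assuming the same correction term λ_h = (1 − n_h/N_h)/n_h and the stratum values read from Table 2; `mse_separate` is a hypothetical name.

```python
def mse_separate(N_h, n_h, S2_y, S2_x, S_yx):
    """First-order MSE of the separate regression estimator, Equation (9)."""
    N = sum(N_h)
    total = 0.0
    for Nh, nh, s2y, s2x, syx in zip(N_h, n_h, S2_y, S2_x, S_yx):
        W = Nh / N                  # stratum weight
        lam = (1 - nh / Nh) / nh    # correction term
        beta_h = syx / s2x          # least squares slope in stratum h
        total += W * W * lam * (s2y + beta_h ** 2 * s2x - 2 * beta_h * syx)
    return total


# Stratum values as read from Table 2 (assumed), proportional allocation.
mse_s = mse_separate(
    N_h=[140, 140, 140],
    n_h=[45, 45, 45],
    S2_y=[664.912, 105.425, 614.283],
    S2_x=[493.754, 163.307, 360.492],
    S_yx=[448.018, 77.003, 305.43],
)
print(f"MSE (separate) = {mse_s:.4f}")
```

Because each stratum slope minimizes its own term, this value cannot exceed the combined-estimator value computed from the same inputs.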
Section 2 provides a description of the quantile regression model. The structure of the proposed estimator based on the quantile regression model in stratified random sampling is presented in Section 3. The efficiency comparisons of the proposed estimator with the classical estimator for stratified random sampling are given in Section 4. Section 5 provides an application of proposed estimators. Finally, Section 6 summarizes the results of this study.

2. Quantile Regression Model

Quantile regression is an alternative regression technique that does not require the normality of the error terms or the constant variance assumption of the classical linear regression model. Because it is a flexible approach, it requires fewer assumptions. Quantile regression is a way of estimating the conditional quantiles of the distribution of the dependent variable in the linear model [33]. In the quantile regression model, the coefficients are determined depending on the quantiles [34]. In practice, the quantile levels are usually taken as 0.25, 0.50, and 0.75 [35]. The classical regression model for the average response is given as follows:
$$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}, \quad i = 1, 2, \ldots, n, \tag{10}$$
where $y_i$ is the dependent random variable, $x_{ij}$ is the $j$th independent variable for the $i$th observation, $\beta_0, \ldots, \beta_k$ are regression parameters, and the $\beta_j$s are estimated by solving the least squares minimization problem
$$\min_{\beta_0, \ldots, \beta_k} \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{k} x_{ij} \beta_j \right)^2. \tag{11}$$
In contrast, the regression model for quantile level $\tau$ of the response is
$$Q_\tau(y_i) = \beta_0(\tau) + \beta_1(\tau) x_{i1} + \cdots + \beta_k(\tau) x_{ik}, \quad i = 1, 2, \ldots, n, \tag{12}$$
and the $\beta_j(\tau)$s are estimated by solving the minimization problem
$$\min_{\beta_0(\tau), \beta_1(\tau), \ldots, \beta_k(\tau)} \sum_{i=1}^{n} \rho_\tau \left( y_i - \beta_0(\tau) - \sum_{j=1}^{k} x_{ij} \beta_j(\tau) \right), \tag{13}$$
where $\rho_\tau(r) = \tau \max(r, 0) + (1 - \tau) \max(-r, 0)$. The function $\rho_\tau(r)$ is referred to as the check function [36]. The estimation of the covariance matrix in quantile regression models is important for examining assumptions such as constant variance and symmetry. Let $0 < \tau_1 < \cdots < \tau_k < 1$ and let $\hat{\beta}_{\tau_j}$ be the corresponding estimates of $\beta_{\tau_j}$ in the quantile regression model for $j = 1, \ldots, k$.
Here, $\sqrt{n} \left( \hat{\beta}_\tau - \beta_\tau \right) \xrightarrow{L} N(0, \Lambda_\tau)$ holds; that is, $\hat{\beta}_\tau$ has an asymptotic normal distribution [37].
Under alternative assumptions,
$$\Lambda_\tau = \tau (1 - \tau) \left( E\left[ f_{U_\tau}(0 \mid x_i) \, x_i x_i' \right] \right)^{-1} E\left[ x_i x_i' \right] \left( E\left[ f_{U_\tau}(0 \mid x_i) \, x_i x_i' \right] \right)^{-1}, \tag{14}$$
$$\Lambda_\tau = \frac{\tau (1 - \tau)}{f_{U_\tau}^2(0)} \left( E\left[ x_i x_i' \right] \right)^{-1}. \tag{15}$$
The asymptotic covariance of the estimated $\hat{\beta}_\tau$ parameters in the quantile regression model is derived from the equations provided above. The covariance matrix can be estimated using various estimators [37].
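To make the role of the check function concrete, the following Python sketch evaluates ρ_τ for an intercept-only model (no covariates), where minimizing the total check loss yields a sample quantile. This is a toy illustration with made-up data, not the fitting routine used in the study; the function names are hypothetical.

```python
def check_loss(r, tau):
    """Check function: rho_tau(r) = tau * max(r, 0) + (1 - tau) * max(-r, 0)."""
    return tau * max(r, 0.0) + (1.0 - tau) * max(-r, 0.0)


def total_check_loss(c, ys, tau):
    """Sum of check losses of the residuals y_i - c for a constant fit c."""
    return sum(check_loss(y - c, tau) for y in ys)


def best_constant(ys, tau):
    """Minimize the total check loss over the observed values.

    For an intercept-only model a minimizer is always attained at a data
    point, so searching the sample values is sufficient.
    """
    return min(sorted(ys), key=lambda c: total_check_loss(c, ys, tau))


# Made-up data with one extreme value (illustrative only).
ys = [2.0, 3.0, 5.0, 8.0, 13.0, 21.0, 100.0]
for tau in (0.25, 0.50, 0.75):
    print(tau, best_constant(ys, tau))
```

The extreme value 100.0 shifts the sample mean substantially but leaves the fitted quantiles unchanged, which is the robustness property that motivates quantile regression here.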

3. Suggested Estimators

For the estimation of the population mean, we propose the following estimators, which use the quantile regression method and the quantile variance–covariance matrix in place of the classical estimators presented in Equations (5) and (8).
For the combined quantile regression estimator,
$$\hat{\bar{Y}}_{lrcq_i(tk)} = \begin{cases} \bar{y}_{st} + b_{cq_1} \left( \bar{X} - \bar{x}_{st} \right), & \text{for } q_1 = 0.25 \\ \bar{y}_{st} + b_{cq_2} \left( \bar{X} - \bar{x}_{st} \right), & \text{for } q_2 = 0.50 \\ \bar{y}_{st} + b_{cq_3} \left( \bar{X} - \bar{x}_{st} \right), & \text{for } q_3 = 0.75 \end{cases} \quad i = 1, 2, 3, \tag{16}$$
where the $b_{cq_i}$ are obtained from the quantile regression covariance matrix for $i = 1, 2, 3$. The mean squared error of the combined quantile regression estimator is as given in Equation (17).
$$\mathrm{MSE}\left( \hat{\bar{Y}}_{lrcq_i(tk)} \right) \cong \sum_{h=1}^{L} W_h^2 \lambda_h S_{yhq_i}^2 + \beta_{cq_i}^2 \sum_{h=1}^{L} W_h^2 \lambda_h S_{xhq_i}^2 - 2 \beta_{cq_i} \sum_{h=1}^{L} W_h^2 \lambda_h S_{yxhq_i}, \quad i = 1, 2, 3. \tag{17}$$
The mean squared error equation proposed here has the same form as the one given in Equation (7). However, in this case, the values of $\beta_{cq_i}$, $S_{yhq_i}^2$, $S_{xhq_i}^2$, and $S_{yxhq_i}$ are used instead of $\beta_c$, $S_{yh}^2$, $S_{xh}^2$, and $S_{yxh}$. These values are obtained from the quantile regression covariance matrix, with $\beta_{cq_i} = \frac{\sum_{h=1}^{L} W_h^2 \lambda_h S_{yxhq_i}}{\sum_{h=1}^{L} W_h^2 \lambda_h S_{xhq_i}^2}$.
For the separate quantile regression estimator,
$$\hat{\bar{Y}}_{lrsq_i(tk)} = \begin{cases} \sum_{h=1}^{L} W_h \left[ \bar{y}_h + b_{hq_1} \left( \bar{X}_h - \bar{x}_h \right) \right], & \text{for } q_1 = 0.25 \\ \sum_{h=1}^{L} W_h \left[ \bar{y}_h + b_{hq_2} \left( \bar{X}_h - \bar{x}_h \right) \right], & \text{for } q_2 = 0.50 \\ \sum_{h=1}^{L} W_h \left[ \bar{y}_h + b_{hq_3} \left( \bar{X}_h - \bar{x}_h \right) \right], & \text{for } q_3 = 0.75 \end{cases} \quad i = 1, 2, 3, \tag{18}$$
where $b_{hq_i}$ is the slope coefficient obtained from the quantile regression model for each stratum for $i = 1, 2, 3$.
Using Equation (18) and the variance–covariance matrices from the quantile regression method, the mean squared error of the proposed estimator is obtained as follows:
$$\mathrm{MSE}\left( \hat{\bar{Y}}_{lrsq_i(tk)} \right) \cong \sum_{h=1}^{L} W_h^2 \lambda_h S_{yhq_i}^2 + \sum_{h=1}^{L} W_h^2 \lambda_h \beta_{hq_i}^2 S_{xhq_i}^2 - 2 \sum_{h=1}^{L} W_h^2 \lambda_h \beta_{hq_i} S_{yxhq_i}, \quad i = 1, 2, 3. \tag{19}$$
The coefficient $\beta_{hq_i}$, for $i = 1, 2, 3$, is obtained from the quantile regression model for the $h$th stratum. The quantities $S_{yhq_i}^2$, $S_{xhq_i}^2$, and $S_{yxhq_i}$ are obtained from the quantile regression covariance matrix: $S_{yhq_i}^2$ is the population variance of the study variable in the $h$th stratum, $S_{xhq_i}^2$ is the population variance of the auxiliary variable, and $S_{yxhq_i}$ is the population covariance.

4. Efficiency Comparisons

We compare the mean square error of the proposed estimators given in Equations (16) and (18) with the mean square error of the classical combined and separate estimators given in Equations (5) and (8).
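For the combined quantile regression estimator,
$$\mathrm{MSE}\left( \hat{\bar{Y}}_{lrcq_i(tk)} \right) < \mathrm{MSE}\left( \hat{\bar{Y}}_{lrc} \right), \quad i = 1, 2, 3, \tag{20}$$
$$\sum_{h=1}^{L} W_h^2 \lambda_h S_{yhq_i}^2 + \beta_{cq_i}^2 \sum_{h=1}^{L} W_h^2 \lambda_h S_{xhq_i}^2 - 2 \beta_{cq_i} \sum_{h=1}^{L} W_h^2 \lambda_h S_{yxhq_i} < \sum_{h=1}^{L} W_h^2 \lambda_h S_{yh}^2 + \beta_c^2 \sum_{h=1}^{L} W_h^2 \lambda_h S_{xh}^2 - 2 \beta_c \sum_{h=1}^{L} W_h^2 \lambda_h S_{yxh}. \tag{21}$$
Let $K_{q_i} = \sum_{h=1}^{L} W_h^2 \lambda_h S_{yhq_i}^2$, $M_{q_i} = \sum_{h=1}^{L} W_h^2 \lambda_h S_{xhq_i}^2$, $N_{q_i} = \sum_{h=1}^{L} W_h^2 \lambda_h S_{yxhq_i}$, $\beta_{cq_i} = \frac{N_{q_i}}{M_{q_i}}$, $K = \sum_{h=1}^{L} W_h^2 \lambda_h S_{yh}^2$, $M = \sum_{h=1}^{L} W_h^2 \lambda_h S_{xh}^2$, $N = \sum_{h=1}^{L} W_h^2 \lambda_h S_{yxh}$, and $\beta_c = \frac{N}{M}$.
Thus, Equation (21) becomes
$$K_{q_i} + \left( \frac{N_{q_i}}{M_{q_i}} \right)^2 M_{q_i} - 2 \frac{N_{q_i}}{M_{q_i}} N_{q_i} < K + \left( \frac{N}{M} \right)^2 M - 2 \frac{N}{M} N, \tag{22}$$
$$K_{q_i} - K - \left( \frac{N_{q_i}^2}{M_{q_i}} - \frac{N^2}{M} \right) < 0. \tag{23}$$
When the condition in Equation (23) is satisfied, the proposed estimators given in Equation (16) are more efficient than the regression estimator given in Equation (5).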
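The step from Equation (22) to Equation (23) can be spelled out. The following worked derivation is a sketch that uses only the symbols defined for Equation (22): each side simplifies once the slope is written as a ratio, and the condition follows by moving all terms to the left.

```latex
\begin{aligned}
K_{q_i} + \left(\frac{N_{q_i}}{M_{q_i}}\right)^{2} M_{q_i} - 2\,\frac{N_{q_i}}{M_{q_i}}\,N_{q_i}
  &= K_{q_i} - \frac{N_{q_i}^{2}}{M_{q_i}}, \\
K + \left(\frac{N}{M}\right)^{2} M - 2\,\frac{N}{M}\,N
  &= K - \frac{N^{2}}{M}, \\
K_{q_i} - \frac{N_{q_i}^{2}}{M_{q_i}} < K - \frac{N^{2}}{M}
  &\iff
  K_{q_i} - K - \left(\frac{N_{q_i}^{2}}{M_{q_i}} - \frac{N^{2}}{M}\right) < 0 .
\end{aligned}
```

Note that here $N$ denotes the weighted covariance sum defined for Equation (22), not the population size.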
Similarly, for the separate quantile regression estimator,
$$\mathrm{MSE}\left( \hat{\bar{Y}}_{lrsq_i(tk)} \right) < \mathrm{MSE}\left( \hat{\bar{Y}}_{lrs} \right), \quad i = 1, 2, 3, \tag{24}$$
$$\sum_{h=1}^{L} W_h^2 \lambda_h S_{yhq_i}^2 + \sum_{h=1}^{L} W_h^2 \lambda_h \beta_{hq_i}^2 S_{xhq_i}^2 - 2 \sum_{h=1}^{L} W_h^2 \lambda_h \beta_{hq_i} S_{yxhq_i} < \sum_{h=1}^{L} W_h^2 \lambda_h S_{yh}^2 + \sum_{h=1}^{L} W_h^2 \lambda_h \beta_h^2 S_{xh}^2 - 2 \sum_{h=1}^{L} W_h^2 \lambda_h \beta_h S_{yxh}. \tag{25}$$
Let $K_{q_i} = \sum_{h=1}^{L} W_h^2 \lambda_h S_{yhq_i}^2$, $T_{q_i} = \sum_{h=1}^{L} W_h^2 \lambda_h \beta_{hq_i}^2 S_{xhq_i}^2$, $H_{q_i} = \sum_{h=1}^{L} W_h^2 \lambda_h \beta_{hq_i} S_{yxhq_i}$, $K = \sum_{h=1}^{L} W_h^2 \lambda_h S_{yh}^2$, $T = \sum_{h=1}^{L} W_h^2 \lambda_h \beta_h^2 S_{xh}^2$, and $H = \sum_{h=1}^{L} W_h^2 \lambda_h \beta_h S_{yxh}$.
Thus, Equation (25) becomes
$$K_{q_i} - K + T_{q_i} - T - 2 \left( H_{q_i} - H \right) < 0. \tag{26}$$
When the condition in Equation (26) is satisfied, the proposed estimators given in Equation (18) are more efficient than the regression estimator given in Equation (8).
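The two efficiency conditions are simple enough to check numerically. The following Python sketch evaluates the left-hand sides of Equations (23) and (26), using the sums reported in Section 5 for q = 0.50 under proportional allocation. The function names are hypothetical, and the quantile covariance sums N_q and H_q are taken as negative here, an assumption that is in line with the negative quantile covariances listed in Table 2.

```python
def combined_condition(K_q, M_q, N_q, K, M, N):
    """Left-hand side of Equation (23); negative favours the quantile estimator."""
    return (K_q - K) - (N_q ** 2 / M_q - N ** 2 / M)


def separate_condition(K_q, T_q, H_q, K, T, H):
    """Left-hand side of Equation (26); negative favours the quantile estimator."""
    return (K_q - K) + (T_q - T) - 2 * (H_q - H)


# Sums reported for q = 0.50, proportional allocation (n_h = 45 per stratum).
lhs_23 = combined_condition(
    K_q=0.050944, M_q=9.88639e-5, N_q=-0.00128,   # sign of N_q assumed
    K=2.31991, M=1.704895, N=1.391407393,
)
lhs_26 = separate_condition(
    K_q=0.050943988, T_q=3.31295e-5, H_q=-0.00100916,  # sign of H_q assumed
    K=2.319910097, T=1.175528076, H=1.1755280,
)
print(f"Equation (23): {lhs_23:.4f}")
print(f"Equation (26): {lhs_26:.4f}")
```

Both values are negative, so for this case the conditions hold and the quantile-based estimators have the smaller first-order mean square error.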

5. Applications

Three air quality monitoring stations with different characteristics in Ankara, Türkiye (Keçiören, Çubuk, and Sincan) were considered, with each station treated as a stratum. Particulate matter (µg/m³) was chosen as the dependent variable from the air pollution parameters selected according to the air quality criteria recommended by the World Health Organization, and relative humidity (%), one of the climate elements, was chosen as the independent variable. Daily data from 1 January 2021 to 20 May 2021 were used. The data were obtained from the Turkish State Meteorological Service [URL1]. Analyses were performed using R software, and the 0.25, 0.50, and 0.75 quantiles were used in the analysis. We randomly selected samples from each stratum using the proportional and Neyman allocations.
Under the Neyman allocation, the sample size for each stratum is calculated from Equation (27), as follows:
$$n_h = n \frac{N_h S_h}{\sum_{h=1}^{L} N_h S_h}, \quad h = 1, 2, 3. \tag{27}$$
Under the proportional allocation, the sample size for each stratum is calculated from Equation (28), as follows:
$$n_h = n \frac{N_h}{N}. \tag{28}$$
The statistics of the original dataset are given in Table 1 and Table 2.
We use a total sample size of $n = 135$. According to the proportional allocation, a sample of $n_1 = n_2 = n_3 = 45$ units from each stratum was randomly selected. According to the Neyman allocation, $n_1 = 57$ units from the first stratum, $n_2 = 23$ from the second stratum, and $n_3 = 55$ from the third stratum were randomly selected. Also, the correlation between the auxiliary variable and the study variable is 0.704. With the help of the summarized information in Table 1 and Table 2, the efficiency conditions of the proposed estimators were obtained as follows:
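The stratum sample sizes quoted above can be reproduced from Equations (27) and (28). The following Python sketch assumes that S_h in the Neyman allocation is the standard deviation of the study variable in stratum h (the square root of the stratum variance of Y read from Table 2); the source does not state this explicitly, and the function names are hypothetical.

```python
from math import sqrt


def proportional_allocation(n, N_h):
    """n_h = n * N_h / N, rounded to the nearest integer (Equation (28))."""
    N = sum(N_h)
    return [round(n * Nh / N) for Nh in N_h]


def neyman_allocation(n, N_h, S_h):
    """n_h = n * N_h * S_h / sum(N_h * S_h), rounded (Equation (27)).

    Rounding each stratum separately need not preserve the total n in
    general; it does for the values used below.
    """
    total = sum(Nh * Sh for Nh, Sh in zip(N_h, S_h))
    return [round(n * Nh * Sh / total) for Nh, Sh in zip(N_h, S_h)]


N_h = [140, 140, 140]
# Assumption: S_h is the standard deviation of Y in each stratum (Table 2).
S_h = [sqrt(v) for v in (664.912, 105.425, 614.283)]

n_prop = proportional_allocation(135, N_h)
n_neyman = neyman_allocation(135, N_h, S_h)
print(n_prop, n_neyman)
```

Under this assumption the sketch returns 45 units per stratum for the proportional allocation and 57, 23, and 55 units for the Neyman allocation, matching the sample sizes used in the application.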
a. A sample of $n_1 = n_2 = n_3 = 45$ units is taken from each stratum.
For the combined quantile regression estimator,
i. $q = 0.25$;
$K = 2.182808962$, $M = 0.96486519$, $N = 1.391407393$,
$K_{q_{0.25}} = 0.085967$, $M_{q_{0.25}} = 3.38713 \times 10^{-5}$, $N_{q_{0.25}} = -0.001536863$,
$K_{q_{0.25}} - K - \left( \frac{N_{q_{0.25}}^2}{M_{q_{0.25}}} - \frac{N^2}{M} \right) = -0.001606 < 0$,
ii. $q = 0.5$;
$K = 2.31991$, $M = 1.704895$, $N = 1.391407393$,
$K_{q_{0.5}} = 0.050944$, $M_{q_{0.5}} = 9.88639 \times 10^{-5}$, $N_{q_{0.5}} = -0.00128$,
$K_{q_{0.5}} - K - \left( \frac{N_{q_{0.5}}^2}{M_{q_{0.5}}} - \frac{N^2}{M} \right) = -1.15 < 0$,
iii. $q = 0.75$;
$K = 2.31991$, $M = 1.445169$, $N = 1.391407393$,
$K_{q_{0.75}} = 0.159558$, $M_{q_{0.75}} = 7.25245 \times 10^{-5}$, $N_{q_{0.75}} = -0.00277$,
$K_{q_{0.75}} - K - \left( \frac{N_{q_{0.75}}^2}{M_{q_{0.75}}} - \frac{N^2}{M} \right) = -0.9267 < 0$.
The condition given in Equation (23) is satisfied for the proposed estimators. Under this condition, the proposed quantile-based estimators were more efficient than the classical estimators.
For the separate quantile regression estimator,
i. $q = 0.25$;
$K = 2.182808962$, $T = 0.566246$, $H = 1.1755280$,
$K_{q_{0.25}} = 0.085967$, $T_{q_{0.25}} = 0.0617939$, $H_{q_{0.25}} = -0.0010878$,
$K_{q_{0.25}} - K + T_{q_{0.25}} - T - 2 \left( H_{q_{0.25}} - H \right) = -0.248 < 0$.
ii. $q = 0.50$;
$K = 2.319910097$, $T = 1.175528076$, $H = 1.1755280$,
$K_{q_{0.5}} = 0.050943988$, $T_{q_{0.5}} = 3.31295 \times 10^{-5}$, $H_{q_{0.5}} = -0.00100916$,
$K_{q_{0.5}} - K + T_{q_{0.5}} - T - 2 \left( H_{q_{0.5}} - H \right) = -1.0913 < 0$.
iii. $q = 0.75$;
$K = 2.319910097$, $T = 1.1146939$, $H = 1.17552807$,
$K_{q_{0.75}} = 0.159558207$, $T_{q_{0.75}} = 6.37745 \times 10^{-5}$, $H_{q_{0.75}} = -0.0026841$,
$K_{q_{0.75}} - K + T_{q_{0.75}} - T - 2 \left( H_{q_{0.75}} - H \right) = -0.91855 < 0$.
The condition given in Equation (26) is satisfied for the proposed estimators. Under this condition, the proposed quantile-based estimators were more efficient than the classical estimators.
b. Samples of $n_1 = 57$, $n_2 = 23$, and $n_3 = 55$ units are taken from the first, second, and third strata, respectively.
For the combined quantile regression estimator,
i. $q = 0.25$;
$K = 1.617135113$, $M = 1.161654708$, $N = 1.203265131$,
$K_{q_{0.25}} = 0.06308$, $M_{q_{0.25}} = 2.82825 \times 10^{-5}$, $N_{q_{0.25}} = -0.001143226$,
$K_{q_{0.25}} - K - \left( \frac{N_{q_{0.25}}^2}{M_{q_{0.25}}} - \frac{N^2}{M} \right) = -0.354 < 0$.
ii. $q = 0.5$;
$K = 1.4919338$, $M = 1.333797359$, $N = 0.896308$,
$K_{q_{0.5}} = 0.037234445$, $M_{q_{0.5}} = 0.0001896$, $N_{q_{0.5}} = -0.00125$,
$K_{q_{0.5}} - K - \left( \frac{N_{q_{0.5}}^2}{M_{q_{0.5}}} - \frac{N^2}{M} \right) = -0.861 < 0$.
iii. $q = 0.75$;
$K = 2.293129021$, $M = 1.30291628$, $N = 1.436153$,
$K_{q_{0.75}} = 0.124761986$, $M_{q_{0.75}} = 8.74612 \times 10^{-5}$, $N_{q_{0.75}} = -0.00232$,
$K_{q_{0.75}} - K - \left( \frac{N_{q_{0.75}}^2}{M_{q_{0.75}}} - \frac{N^2}{M} \right) = -0.647 < 0$.
The condition given in Equation (23) is satisfied for the proposed estimators. Under this condition, the proposed quantile-based estimators were more efficient than the classical estimators.
For the separate quantile regression estimator,
i. $q = 0.25$;
$K = 1.617135113$, $T = 0.513538133$, $H = 0.93379075$,
$K_{q_{0.25}} = 0.063089$, $T_{q_{0.25}} = 0.042644709$, $H_{q_{0.25}} = -0.0007815$,
$K_{q_{0.25}} - K + T_{q_{0.25}} - T - 2 \left( H_{q_{0.25}} - H \right) = -0.1558 < 0$.
ii. $q = 0.50$;
$K = 1.4919338$, $T = 0.655266737$, $H = 0.655266737$,
$K_{q_{0.5}} = 0.037234445$, $T_{q_{0.5}} = 4.43256 \times 10^{-5}$, $H_{q_{0.5}} = -0.00086146$,
$K_{q_{0.5}} - K + T_{q_{0.5}} - T - 2 \left( H_{q_{0.5}} - H \right) = -0.79767 < 0$.
iii. $q = 0.75$;
$K = 2.293129021$, $T = 0.99851971$, $H = 1.145106206$,
$K_{q_{0.75}} = 0.124761986$, $T_{q_{0.75}} = 6.90035 \times 10^{-5}$, $H_{q_{0.75}} = -0.00219519$,
$K_{q_{0.75}} - K + T_{q_{0.75}} - T - 2 \left( H_{q_{0.75}} - H \right) = -0.87221 < 0$.
The condition given in Equation (26) is satisfied for the proposed estimators. Under this condition, the proposed quantile-based estimators were more efficient than the classical estimators.
We calculate the mean square error values of the classical estimators given in Equations (5) and (8) and the proposed estimators given in Equations (16) and (18). These values are given in Table 3. Using these mean square error values, we compute the relative efficiency values of each proposed estimate with the help of the following equation:
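$$\mathrm{RE}\left( \hat{\bar{Y}}_{lrsq_i(tk)} \right) = \frac{\mathrm{MSE}\left( \hat{\bar{Y}}_{lrsq_i(tk)} \right)}{\mathrm{MSE}\left( \hat{\bar{Y}}_{lrs} \right)} \quad \text{and} \quad \mathrm{RE}\left( \hat{\bar{Y}}_{lrcq_i(tk)} \right) = \frac{\mathrm{MSE}\left( \hat{\bar{Y}}_{lrcq_i(tk)} \right)}{\mathrm{MSE}\left( \hat{\bar{Y}}_{lrc} \right)}. \tag{29}$$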
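The relative efficiency is a plain ratio of two mean square errors, with values below 1 favouring the proposed estimator. The following Python sketch illustrates the calculation for the combined estimator at q = 0.50 under proportional allocation. It does not use the Table 3 values; instead, as an assumption, each mean square error is rebuilt from the sums reported in Section 5 using the simplified form K − N²/M, and the function names are hypothetical.

```python
def mse_from_sums(K, M, N):
    """MSE of a combined regression-type estimator at slope N / M: K - N^2 / M."""
    return K - N ** 2 / M


def relative_efficiency(mse_proposed, mse_classical):
    """Ratio of MSEs; a value below 1 means the proposed estimator is more efficient."""
    return mse_proposed / mse_classical


# Sums reported for q = 0.50, proportional allocation (sign of N_q assumed).
mse_classical = mse_from_sums(K=2.31991, M=1.704895, N=1.391407393)
mse_quantile = mse_from_sums(K=0.050944, M=9.88639e-5, N=-0.00128)

re_combined = relative_efficiency(mse_quantile, mse_classical)
print(f"RE (combined, q = 0.50) = {re_combined:.4f}")
```

Because these inputs are rounded sums rather than the tabulated mean square errors, the resulting ratio is only indicative of the order of magnitude.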
We proposed quantile regression–ratio-type estimators for stratified random sampling. The combined regression-type estimators based on the 0.25, 0.50, and 0.75 quantiles are given by Equation (16), and their mean square error is given in Equation (17). Similarly, the separate regression-type estimators based on the same quantiles are given by Equation (18), and their mean square error is given in Equation (19). When the conditions in Equations (23) and (26) are satisfied, the proposed estimators based on quantile regression are more efficient than the classical estimators.
According to the cases where the sample size is equal and different for this dataset, 24 relative efficiency values were obtained. These values are given in Table 4. It can be seen from Table 4 that all of the relative efficiency values are less than 1. This shows that the mean square errors of the proposed quantile regression-based estimators are smaller than the mean square errors of the classical estimators when the sample size is both equal and different. This is an expected outcome due to Equations (23) and (26) being satisfied.

6. Conclusions

Ratio-type estimators are proposed utilizing quantile regression in the stratified random sampling method. The proposed estimators present a more effective method of estimation compared to traditional ratio estimators. The mean squared error equations of these estimators were derived. The proposed estimators in Equations (16) and (18), as well as the classical estimators in Equations (5) and (8), were theoretically compared using the mean squared error equations. Table 4 demonstrates that the estimators in Equations (16) and (18) yield more efficient predictions for the population mean in stratified random sampling. The mean squared errors of the estimators in Equations (16) and (18) are smaller than those of the estimator in Equations (5) and (8). Based on theoretical and numerical comparisons, it has been demonstrated that the proposed quantile ratio-type estimators have a lower mean squared error compared to other commonly used estimators. This indicates that these estimators provide more accurate and precise predictions. Furthermore, quantile regression provides us with the ability to obtain more reliable results in air pollution data, where outliers and asymmetrical distributions are common. By specifically estimating the conditional quantiles, quantile regression reduces the influence of outliers and extreme values on the forecasts. This results in more robust and accurate predictions, enhancing the overall forecasting process in practice.

Author Contributions

T.K. and H.K. contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lohr, S.L. Sampling: Design and Analysis; CRC Press: Boston, MA, USA, 2021. [Google Scholar]
  2. Kish, L. Survey Sampling; John Wiley & Sons: New York, NY, USA, 1965. [Google Scholar]
  3. Zaman, T. An efficient exponential estimator of the mean under stratified random sampling. Math. Popul. Stud. 2017, 28, 104–121. [Google Scholar] [CrossRef]
  4. Thompson, S.K. Sampling, 3rd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2012. [Google Scholar]
  5. Cochran, W.G. Sampling Technique; John Wiley and Son: New York, NY, USA, 1977. [Google Scholar]
  6. Kadilar, C.; Cingi, H. A new ratio estimator in stratified random sampling. Commun. Stat.-Theory Methods 2005, 34, 597–602. [Google Scholar] [CrossRef]
  7. Prasad, B. Some improved ratio type estimators of population mean and ratio in finite population sample surveys. Commun. Stat.-Theory Methods 1989, 18, 379–392. [Google Scholar] [CrossRef]
  8. Shabbir, J.; Gupta, S. A new estimator of population mean in stratified sampling. Commun. Stat.-Theory Methods 2006, 35, 1201–1209. [Google Scholar] [CrossRef]
  9. Bedi, P.K. Efficient utilization of auxiliary information at estimation stage. Biom. J. 1996, 38, 973–976. [Google Scholar] [CrossRef]
  10. Singh, R.; Kumar, M.; Singh, R.D.; Chaudhry, M.K. Exponential ratio type estimators in stratified random sampling. arXiv 2013, arXiv:1301.5086. [Google Scholar]
  11. Bahl, S.; Tuteja, R. Ratio and product type exponential estimators. J. Inf. Optim. Sci. 1991, 12, 159–164. [Google Scholar] [CrossRef]
  12. Kadilar, C.; Cingi, H. Ratio estimators in stratified random sampling. Biom. J. 2003, 45, 218–225. [Google Scholar] [CrossRef]
  13. Shahzad, U.; Hanif, M.; Koyuncu, N.; Luengo, A.G. A family of ratio estimators in stratified random sampling utilizing auxiliary attribute alongside the nonresponse issue. J. Stat. Theory Appl. 2019, 18, 12–25. [Google Scholar]
  14. Hussain, S.; Ahmad, S.; Saleem, M.; Akhtar, S. Finite population distribution function estimation with dual use of auxiliary information under simple and stratified random sampling. PLoS One 2020, 15, e0239098. [Google Scholar] [CrossRef]
  15. Muneer, S.; Khalil, A.; Shabbir, J. A parent-generalized family of chain ratio exponential estimators in stratified random sampling using supplementary variables. Commun. Stat.-Simul. Comput. 2020; in press. [Google Scholar]
  16. Cekim, H.O.; Kadilar, C. In-type estimators for the population variance in stratified random sampling. Commun. Stat.-Simul. Comput. 2020, 49, 1665–1677. [Google Scholar] [CrossRef]
  17. Kadilar, C.; Candan, M.; Cingi, H. Ratio estimators using robust regression. Hacet. J. Math. Stat. 2007, 36, 181–188. [Google Scholar]
  18. Subzar, M.; Bouza, C.N.; Al-Omari, A.I. Utilization of different robust regression techniques for estimation of finite population mean in SRSWOR in case of presence of outliers through ratio method of estimation. Investig. Oper. 2019, 40, 600–609. [Google Scholar]
  19. Zaman, T.; Bulut, H. Modified ratio estimators using robust regression methods. Commun. Stat.-Theory Methods 2019, 48, 2039–2048. [Google Scholar] [CrossRef]
  20. Zaman, T.; Bulut, H. Modified regression estimators using robust regression methods and covariance matrices in stratified random sampling. Commun. Stat.-Theory Methods 2020, 49, 3407–3420. [Google Scholar] [CrossRef]
  21. Zaman, T. Improvement of modified ratio estimators using robust regression methods. Appl. Math. Comput. 2019, 348, 627–631. [Google Scholar] [CrossRef]
  22. Ali, N.; Ahmad, I.; Hanif, M.; Shahzad, U. Robust-regression-type estimators for improving mean estimation of sensitive variables by using auxiliary information. Commun. Stat.-Theory Methods 2021, 50, 979–992. [Google Scholar] [CrossRef]
  23. Grover, L.K.; Kaur, A. An improved regression type estimator of population mean with two auxiliary variables and its variant using robust regression method. J. Comput. Appl. Math. 2021, 382, 113072. [Google Scholar] [CrossRef]
  24. Koç, H. Ratio-type estimators for improving mean estimation using Poisson Regression method. Commun. Stat.-Theory Methods 2021, 50, 4685–4691.
  25. Baur, D.; Saisana, M.; Schulze, N. Modeling the effects of meteorological variables on ozone concentration: A quantile regression approach. Atmos. Environ. 2004, 38, 4689–4699.
  26. Shahzad, U.; Hanif, M.; Sajjad, I.; Anas, M.M. Quantile regression-ratio-type estimators for mean estimation under complete and partial auxiliary information. Sci. Iran. 2022, 29, 1705–1715.
  27. Anas, M.M.; Huang, Z.; Alilah, D.A.; Shafqat, A.; Hussain, S. Mean estimators using robust quantile regression and L-moments’ characteristics for complete and partial auxiliary information. Math. Probl. Eng. 2021, 2021, 1–8.
  28. Anas, M.M.; Huang, Z.; Shahzad, U.; Zaman, T.; Shahzadi, S. Compromised imputation based mean estimators using robust quantile regression. Commun. Stat.-Theory Methods 2022, 1–16.
  29. Shahzad, U.; Ahmad, I.; Al-Noor, N.H.; Iftikhar, S.; Abd Ellah, A.H.; Benedict, T.J. Särndal Approach and Separate Type Quantile Robust Regression Type Mean Estimators for Nonsensitive and Sensitive Variables in Stratified Random Sampling. J. Math. 2022, 2022, 14.
  30. Rueda, M.; Arcos, A. Improving ratio-type quantile estimates in a finite population. Stat. Pap. 2004, 45, 231–248.
  31. Shahzad, U.; Ahmad, I.; Al-Noor, N.H.; Hanif, M.; Almanjahie, I.M. Robust estimation of the population mean using quantile regression under systematic sampling. Math. Popul. Stud. 2022, 30, 1–13.
  32. Shahzad, U.; Al-Noor, N.H.; Afshan, N.; Alilah, D.A.; Hanif, M.; Anas, M.M. Minimum Covariance Determinant-Based Quantile Robust Regression-Type Estimators for Mean Parameter. Math. Probl. Eng. 2021, 2021.
  33. Tareghian, R.; Rasmussen, P.F. Statistical downscaling of precipitation using quantile regression. J. Hydrol. 2013, 487, 122–135.
  34. Chen, C.; Wei, Y. Computational issues for quantile regression. Sankhyā Indian J. Stat. 2005, 67, 399–417.
  35. Algamal, Z.Y.; Rasheed, K.B. Re-sampling in Linear Regression Model Using Jackknife and Bootstrap. Iraqi J. Stat. Sci. 2010, 10, 59–73.
  36. Hao, L.; Naiman, D.Q. Quantile Regression; Sage Publications: London, UK, 2007.
  37. Buchinsky, M. Recent Advances in Quantile Regression Models: A Practical Guideline for Empirical Research. J. Hum. Resour. 1998, 33, 88–126. Available online: https://www.mgm.gov.tr/eng/forecast-cities.aspx (accessed on 22 May 2021).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Table 1. Descriptive statistics for the population.

| Statistic | Symbol | Total |
| --- | --- | --- |
| Population size | $N$ | 420 |
| Sample size | $n$ | 135 |
| Population mean of X | $\bar{X}$ | 34.98289 |
| Population mean of Y | $\bar{Y}$ | 37.4703 |
| Population variance of X | $S_x^2$ | 703.9922 |
| Population variance of Y | $S_y^2$ | 574.3866 |
| Population correlation coefficient between X and Y | $\rho_{xy}$ | 0.704 |
Table 2. Descriptive statistics for the population in the hth stratum.

| Symbol | Stratum 1 | Stratum 2 | Stratum 3 |
| --- | --- | --- | --- |
| $N_h$ | 140 | 140 | 140 |
| $\bar{X}_h$ | 52.073 | 8.292 | 44.583 |
| $\bar{Y}_h$ | 39.132 | 23.597 | 49.681 |
| $S_{xh}^2$ | 493.754 | 163.307 | 360.492 |
| $S_{yh}^2$ | 664.912 | 105.425 | 614.283 |
| $S_{yxh}$ | 448.018 | 77.003 | 305.43 |
| $\rho_{xyh}$ | 0.781 | 0.587 | 0.649 |
| $\beta_h$ | 0.908 | 0.472 | 0.847 |
| $S_{xhq_{0.25}}^2$ | 0.0057 | 0.0014 | 0.0131 |
| $S_{yhq_{0.25}}^2$ | 15.0004 | 0.4338 | 35.8744 |
| $S_{yxhq_{0.25}}$ | −0.2681 | −0.013 | −0.6358 |
| $\beta_{hq_{0.25}}$ | 0.8414 | 0.0302 | 0.6655 |
| $S_{xhq_{0.5}}^2$ | 0.0055 | 0.0418 | 0.0115 |
| $S_{yhq_{0.5}}^2$ | 15.3304 | 0.3654 | 14.7909 |
| $S_{yxhq_{0.5}}$ | −0.2682 | −0.1190 | −0.3772 |
| $\beta_{hq_{0.5}}$ | 0.8700 | 0.4161 | 0.8467 |
| $S_{xhq_{0.75}}^2$ | 0.0094 | 0.0124 | 0.0214 |
| $S_{yhq_{0.75}}^2$ | 38.4669 | 3.8007 | 52.9633 |
| $S_{yxhq_{0.75}}$ | −0.5549 | −0.1174 | −0.9822 |
| $\beta_{hq_{0.75}}$ | 0.9312 | 0.8083 | 1.0080 |
| $w_h$ | 0.33 | 0.33 | 0.33 |
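The classical rows of Table 2 can be cross-checked against Table 1 with a few lines of Python. This is only a sketch under stated assumptions: the variable names are illustrative, the stratum means of X are read as 52.073, 8.292 and 44.583 (the reading that reproduces the overall mean in Table 1), and the slope and correlation are assumed to follow the usual definitions $\beta_h = S_{yxh}/S_{xh}^2$ and $\rho_{xyh} = S_{yxh}/\sqrt{S_{xh}^2 S_{yh}^2}$, which the tabulated values appear to satisfy.

```python
import math

# Stratum-level values transcribed from Table 2 (strata h = 1, 2, 3).
# Variable names are illustrative and not taken from the paper.
N_h = [140, 140, 140]
xbar_h = [52.073, 8.292, 44.583]
ybar_h = [39.132, 23.597, 49.681]
s2_x = [493.754, 163.307, 360.492]
s2_y = [664.912, 105.425, 614.283]
s_yx = [448.018, 77.003, 305.430]

N = sum(N_h)
w_h = [n_h / N for n_h in N_h]  # stratum weights N_h / N

# Weighted stratum means; these should be close to the overall means in Table 1.
xbar = sum(w * m for w, m in zip(w_h, xbar_h))
ybar = sum(w * m for w, m in zip(w_h, ybar_h))

# Assumed definitions: classical slope and correlation within each stratum.
beta_h = [c / vx for c, vx in zip(s_yx, s2_x)]
rho_h = [c / math.sqrt(vx * vy) for c, vx, vy in zip(s_yx, s2_x, s2_y)]

print(f"N = {N}, weighted mean of X = {xbar:.4f}, weighted mean of Y = {ybar:.4f}")
for h, (b, r) in enumerate(zip(beta_h, rho_h), start=1):
    print(f"stratum {h}: beta_h = {b:.3f}, rho_xyh = {r:.3f}")
```

With equal stratum sizes, each weight is exactly 1/3, which Table 2 reports rounded to 0.33.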
Table 3. Data statistics used for simple random sampling.

| Type | Estimator | Mean Square Error (sample sizes are equal) | Mean Square Error (sample sizes are different) |
| --- | --- | --- | --- |
| Classical | $\hat{\bar{Y}}_{lrc}$ | 2.7750 | 2.2264 |
| Classical | $\hat{\bar{Y}}_{lrs}$ | 1.3603 | 1.2831 |
| Proposed | $\hat{\bar{Y}}_{lrcq_{0.25}(tk)}$ | 0.0859 | 0.0169 |
| Proposed | $\hat{\bar{Y}}_{lrcq_{0.50}(tk)}$ | 0.0509 | 0.0290 |
| Proposed | $\hat{\bar{Y}}_{lrcq_{0.75}(tk)}$ | 0.1594 | 0.0632 |
| Proposed | $\hat{\bar{Y}}_{lrsq_{0.25}(tk)}$ | 0.0881 | 0.1072 |
| Proposed | $\hat{\bar{Y}}_{lrsq_{0.50}(tk)}$ | 0.0530 | 0.0390 |
| Proposed | $\hat{\bar{Y}}_{lrsq_{0.75}(tk)}$ | 0.1649 | 0.1292 |
Table 4. Theoretical results for the relative efficiencies.

| Relative Efficiency | $\hat{\bar{Y}}_{lrc}$ (sample sizes are equal) | $\hat{\bar{Y}}_{lrs}$ (sample sizes are equal) | $\hat{\bar{Y}}_{lrc}$ (sample sizes are different) | $\hat{\bar{Y}}_{lrs}$ (sample sizes are different) |
| --- | --- | --- | --- | --- |
| $\hat{\bar{Y}}_{lrcq_{0.25}(tk)}$ | 0.0309 | 0.0631 | 0.0075 | 0.0131 |
| $\hat{\bar{Y}}_{lrcq_{0.50}(tk)}$ | 0.0183 | 0.0374 | 0.0130 | 0.0226 |
| $\hat{\bar{Y}}_{lrcq_{0.75}(tk)}$ | 0.0574 | 0.11725 | 0.0283 | 0.0492 |
| $\hat{\bar{Y}}_{lrsq_{0.25}(tk)}$ | 0.0317 | 0.0648 | 0.0481 | 0.0836 |
| $\hat{\bar{Y}}_{lrsq_{0.50}(tk)}$ | 0.0191 | 0.0389 | 0.0175 | 0.0303 |
| $\hat{\bar{Y}}_{lrsq_{0.75}(tk)}$ | 0.0594 | 0.1212 | 0.0580 | 0.8800 |
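The entries of Table 4 appear to be ratios of the mean square errors in Table 3, that is, MSE(proposed) / MSE(classical), so values below 1 favour the proposed estimator. The sketch below recomputes them under that assumption; the dictionary keys and the helper `relative_efficiency` are illustrative names, not the paper's notation. Nearly all tabulated entries agree with this ratio to the printed precision. The last entry of the final row (0.8800) does not: the ratio gives about 0.10. The sketch makes no attempt to resolve that difference.

```python
# MSE values transcribed from Table 3 as (equal sample sizes, different sample sizes).
MSE_CLASSICAL = {"lrc": (2.7750, 2.2264), "lrs": (1.3603, 1.2831)}
MSE_PROPOSED = {
    "lrcq0.25": (0.0859, 0.0169),
    "lrcq0.50": (0.0509, 0.0290),
    "lrcq0.75": (0.1594, 0.0632),
    "lrsq0.25": (0.0881, 0.1072),
    "lrsq0.50": (0.0530, 0.0390),
    "lrsq0.75": (0.1649, 0.1292),
}


def relative_efficiency(mse_proposed, mse_classical):
    """Assumed definition: MSE(proposed) / MSE(classical)."""
    return mse_proposed / mse_classical


# re_table[proposed][classical] -> (ratio for equal sizes, ratio for different sizes)
re_table = {
    name: {
        ref: tuple(relative_efficiency(p, c) for p, c in zip(mse_p, mse_c))
        for ref, mse_c in MSE_CLASSICAL.items()
    }
    for name, mse_p in MSE_PROPOSED.items()
}

for name, by_ref in re_table.items():
    for ref, (equal, different) in by_ref.items():
        print(f"{name} vs {ref}: equal = {equal:.4f}, different = {different:.4f}")
```

Every ratio computed this way is well below 1, which matches the conclusion that the proposed estimators have smaller mean square error than the classical ones for these data.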

MDPI and ACS Style

Koç, T.; Koç, H. A New Class of Quantile Regression Ratio-Type Estimators for Finite Population Mean in Stratified Random Sampling. Axioms 2023, 12, 713. https://doi.org/10.3390/axioms12070713

