Article

Quantile-Adaptive Sufficient Variable Screening by Controlling False Discovery

1 Department of Statistics, Wuhan University of Technology, Wuhan 430070, China
2 Department of Epidemiology and Biostatistics, University of South Florida, Tampa, FL 33612, USA
* Author to whom correspondence should be addressed.
Entropy 2023, 25(3), 524; https://doi.org/10.3390/e25030524
Submission received: 29 December 2022 / Revised: 6 March 2023 / Accepted: 14 March 2023 / Published: 17 March 2023
(This article belongs to the Special Issue Entropy in Soft Computing and Machine Learning Algorithms II)

Abstract
Sufficient variable screening rapidly reduces dimensionality with high probability in ultra-high dimensional modeling. To rapidly screen out the null predictors, a quantile-adaptive sufficient variable screening framework is developed by controlling the false discovery. Without specifying an actual model, we first introduce a compound testing procedure, based on the conditionally imputed marginal rank correlation at different quantile levels of the response, to select active predictors in high dimensionality. The testing statistic can capture sufficient dependence through two paths: one controls the false discovery adaptively, and the other controls the false discovery rate at a prespecified threshold. It is computationally efficient and easy to implement. We establish the theoretical properties under mild conditions. Numerical studies, including simulation studies and real data analysis, provide supporting evidence that the proposal performs reasonably well in practical settings.

1. Introduction

When the dimension p grows exponentially with the sample size n, the computational burden that ultra-high dimensionality imposes on classical variable selection not only slows down algorithms severely but also yields unstable solutions [1]. To rapidly screen out the inactive predictors, variable screening methods for ultra-high dimensional data have been developed to reduce the dimension while retaining all the active variables in the reduced variable space with high probability [2]. This is referred to as the sure screening property. Fan and Lv (2008) proposed sure independence screening (SIS) based on the marginal Pearson correlation coefficient in the linear regression model [1]. Since then, a series of variable screening methods have been proposed, such as screening frameworks based on the generalized linear model, the additive model, and more general models [3,4,5,6]. These methods rest on specific model assumptions. In many scientific applications, however, the relationship between the predictors and the response is difficult to specify for ultra-high dimensional data [7]. Model-based screening procedures enjoy fast computation but suffer the risk of model misspecification [7,8].
To avoid the inconsistency between the assumptions of the regression model and the actual distribution of data, model-free variable screening methods have been initially designed for the continuous outcome variables. For example, Zhu et al. (2011) reported a screening procedure to detect active predictors named SIRS; Li et al. (2012) proposed a distance-based sure screening procedure called DC-SIS; He et al. (2013) introduced a quantile-adaptive model-free variable screening for high dimensional data; Lin et al. (2013) discussed a nonparametric ranking feature screening (NRS) through local information flows of the predictors, and Lu and Lin (2020) studied feature screening procedure based on the unconditional model [7,8,9,10,11]. For ultra-high dimensional covariates coupled with the categorical response, Mai and Zou (2013) advocated the Kolmogorov–Smirnov distance for binary classification problems [12]. With a possibly diverging number of classes, the marginal feature screening procedure for ultra-high dimensional discriminant analysis was introduced by Huang et al. (2014) and Cui et al. (2015) [13,14]. Han (2019) researched a general and unified nonparametric screening framework under conditional strictly convex loss [15]. Zhou et al. (2020) established a forward screening procedure based on a new measure called cumulative divergence [16]. Xie et al. (2020) explored a category-adaptive screening procedure with ultrahigh dimensional heterogeneous categorical data [17].
As reported in Hao and Zhang (2017), variable screening results depend on the signal-to-noise ratio (SNR): when the signal is weak amid massive noise variables, it can be difficult to separate the active variables from the noise variables, and the sure screening property may fail to hold [18]. In this situation, one path is to control the false discoveries. In this regard, Tang et al. (2021) explored a quantile correlation-based screening framework (QCS), which screens variables by conducting multiple testing to control the false discovery rate (FDR) [19]. Liu et al. (2022) proposed a two-step approach that specifies the screening threshold with the help of knockoff features so that the FDR is controlled at a prespecified level [20]. Also for FDR control, Guo et al. (2022) advocated a data-adaptive threshold selection procedure based on sample splitting [21].
However, most of the above sure screening methods are not sufficient variable screening (SVS) methods, a notion first proposed in Cook (2004) and further discussed by Yin and Hilafu (2015) and Yuan et al. (2022) [22,23,24]. For illustration, consider a population with a response variable Y and a p-dimensional vector of predictors $X = (X_1, \ldots, X_p)^T$; let $X_A$ be a subset of $X$, and let $X_{\bar{A}}$ denote the complement of $X_A$. Following Yin and Hilafu (2015) and Yuan et al. (2022), sufficient variable screening aims to find the smallest and unique active variable set $X_A$ such that $Y \perp\!\!\!\perp X_{\bar{A}} \mid X_A$, that is, given the set $X_A$, Y is independent of $X_{\bar{A}}$ [23,24].
In this paper, without any specific regression or parametric assumptions, we advocate a new sufficient variable screening procedure that uses a robust multiple testing procedure with false discovery control to distinguish active variables, splitting the continuous response at different quantile levels. We thus achieve quantile-adaptive sufficient variable screening by controlling the false discovery (QA-SVS-FD). The proposed procedure is based on a one-versus-rest (OVR) test statistic with an asymptotic chi-square distribution under the null hypothesis. With this asymptotic distribution, the sufficient variable set can be estimated precisely either by controlling the FDR accurately at a given level in high dimensionality or by controlling the number of false discoveries adaptively with an expectation of 1. In addition, the proposed procedure is model-free, measuring dependence without any specified distributional model; it is therefore robust for detecting sufficient relevant variables across different model types.
The rest of this paper is organized as follows: Section 2 develops the sufficient variable screening test statistic using the conditionally imputed marginal rank correlation at different quantile levels of the response. The false discovery control procedures are studied under mild conditions in Section 3. Section 4 and Section 5 evaluate the proposed procedure's performance via extensive numerical studies, including simulations and two real data examples, which verify the robustness and flexibility of our methods. Section 6 gives a short concluding discussion. All theoretical properties are proved in Appendix A.

2. Sufficient Screening Utility

As stated in Yuan et al. (2022), existing sufficient variable screening relies on an iterative two-step screening procedure, which involves complex computation [24]. This section proposes a novel sufficient variable screening statistic based on a quantile-adaptive correlation test (QA-SVS).

2.1. A Quantile-Adaptive Correlation Test Statistic

Throughout this paper, denote by $(Y, X^T)$ a pair consisting of a continuous scalar response and a p-dimensional vector of covariates, where $X = (X_1, \ldots, X_p)^T$. The complete observations $\{Y_i, X_i^T\}$, $i = 1, \ldots, n$, are independent and identically distributed copies of $(Y, X^T)$. Following Yin and Hilafu (2015) [23] and Yuan et al. (2022) [24], define two index sets: the sufficient active (or active) variable index set $A$, consisting of variables relevant to Y, and $A^c$, containing all redundant (noise) predictors, where $X_A$ is a subset of $X$ and $X_{A^c}$ denotes the complement of $X_A$. Sufficient variable screening aims to find the smallest and unique active variable set $X_A$ such that $Y \perp\!\!\!\perp X_{A^c} \mid X_A$; that is, given $X_A$, Y is independent of $X_{A^c}$. Clearly, the full index set is $I = A \cup A^c = \{1, 2, \ldots, p\}$. Without any specific assumption on the functional relationship between the covariates and the response, inspired by Tang et al. (2021) [19], one may test sufficient independence simultaneously between each $X_j$ and Y for $1 \le j \le p$ to detect important variables in the high-dimensional setting:
$$H_{0,j}: Y \perp\!\!\!\perp X_j \mid X_2 \quad \text{versus} \quad H_{1,j}: Y \not\perp\!\!\!\perp X_j \mid X_2 \tag{1}$$
for $X_j \in X_1$, where $\perp\!\!\!\perp$ denotes independence and $(X_1, X_2)^T$ is a split of $X$. When the part $X_2$ satisfies $X_A \subseteq X_2$, $X_j \in X_1$ is regarded as a sufficient active predictor if and only if $H_{0,j}$ is rejected.
Recall that the category-adaptive screening procedure in Xie et al. (2020) [17] measures marginal independence in high-dimensional heterogeneous data. Motivated by this marginal utility, a test statistic based on quantile-split responses is introduced for testing (1) in high dimensionality without assuming any parametric relationship between Y and $X$. Let $0 = \gamma_0 < \gamma_1 < \gamma_2 < \cdots < \gamma_K = 1$ be a sequence of quantile grid points of Y, where K is a prespecified positive integer. Denote the $\gamma_k$-th ($1 \le k \le K$) quantile of Y by $Q_k$. The theoretical quantiles can be estimated consistently by the sample quantiles; following Hyndman and Fan (1996) [25], $\hat{Q}_k = (1 - \gamma_k) Y_{(\lfloor n\gamma_k \rfloor)} + \gamma_k Y_{(\lfloor n\gamma_k \rfloor + 1)}$ is the $\gamma_k$-th ($1 \le k \le K-1$) sample quantile of Y, where $Y_{(i)}$ denotes the i-th order statistic. For convenience, let $\hat{Q}_0 = -\infty$ and $\hat{Q}_K = +\infty$, and denote $G_k = [Q_{k-1}, Q_k)$ and $\hat{G}_k = [\hat{Q}_{k-1}, \hat{Q}_k)$. Obviously, $p_k = \Pr(Y \in G_k) = \gamma_k - \gamma_{k-1}$.
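As a concrete illustration, the quantile grids and the grid proportions $\hat{p}_k$ can be computed directly from the sample. The sketch below assumes equally spaced levels $\gamma_k = k/K$ and uses NumPy's default linear interpolation for sample quantiles, which matches the weighted-average form of Hyndman and Fan (1996) up to the choice of plotting position; the function name is illustrative, not the authors' code.

```python
import numpy as np

def quantile_grids(y, K):
    """Split the response into K grids G_k = [Q_{k-1}, Q_k) using sample
    quantiles at the equally spaced levels gamma_k = k/K (an assumed
    choice; any increasing grid 0 = gamma_0 < ... < gamma_K = 1 works)."""
    gammas = np.arange(1, K) / K
    q = np.quantile(y, gammas)                       # interior cut points
    edges = np.concatenate(([-np.inf], q, [np.inf])) # Q_hat_0 = -inf, Q_hat_K = +inf
    # eta[i, k] = I(Y_i in G_k); p_hat[k] = n_k / n
    eta = (y[:, None] >= edges[None, :-1]) & (y[:, None] < edges[None, 1:])
    p_hat = eta.mean(axis=0)
    return eta, p_hat

rng = np.random.default_rng(0)
y = rng.normal(size=500)
eta, p_hat = quantile_grids(y, K=5)
# every sample falls in exactly one grid, and each grid holds about 1/K of them
```

Each row of `eta` is the one-hot grid indicator $\eta_{ik} = I(Y_i \in \hat{G}_k)$ used throughout Section 2.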
To explore the nature and provide a complete picture of the conditional distribution of the outcome variable given the predictor vector, we assume the following assumptions:
(I) 
(Quantile-Heterogeneity) the index set of sufficient active predictors, $A_k = \{1 \le j \le p : \Pr(Y \in G_k \mid X) \text{ functionally depends on } X_j\}$, may differ across $k = 1, 2, \ldots, K$;
(II) 
(Sparsity) the dimensionality satisfies $p = o\{\exp(n^\alpha)\}$ for some constant $\alpha > 0$, but $|A_k| = s_k = o(n)$, where $|A_k|$ is the cardinality of $A_k$ and n is the sample size.
Assumption (I) describes the heterogeneity of the sufficiently correlated predictor set at different quantile levels of the response, and Assumption (II) specifies the high-dimensional setting.
Remark 1.
The sufficient active variable index set at the $\gamma_k$-th quantile defined in Assumption (I) screens active variables sufficiently when the response belongs to the grid $[Q_{k-1}, Q_k)$. That is, $A_k^c = \{1 \le j \le p : I(Y \in G_k) \perp\!\!\!\perp X_j \mid X_{A_k}\}$, where $A_k^c$ is the complement of $A_k$. The proof of this equivalence is given in Appendix A.1. In other words, $A = \{1 \le j \le p : H_{0,j} \text{ is false}\} = \bigcup_{k=1}^{K} A_k$.
Remark 2.
If the response $Y \in \{y_1, y_2, \ldots, y_K\}$ is discrete, then $A_k$ in Assumption (I) should be rewritten as $A_k = \{1 \le j \le p : \Pr(Y = y_k \mid X) \text{ functionally depends on } X_j\}$.
Remark 3.
For the special case of $K = 2$, by the definition of $A_k$ in Assumption (I), it is clear that $A_1 = A_2$, which reduces to the active set $A_\gamma = \{1 \le j \le p : Q_\gamma(Y \mid X) \text{ functionally depends on } X_j\}$ defined in He et al. (2013). We do not elaborate on it further, as it is a special case of the proposed framework.
Lemma 1.
For any $j = 1, \ldots, p$ and $k = 1, \ldots, K$, if $j \notin A_k$, then $F_{jk}(x) = F_j(x)$, where $F_{jk}(x) = \Pr(X_j \le x \mid Y \in G_k)$ and $F_j(x) = \Pr(X_j \le x)$.
Lemma 1 is proved in Appendix A.2. According to Yuan et al. (2022) [24], the sufficient active variable set is actually screened based on the structure of $(Y, X_{A_k}, X_{A_k^c})$. Lemma 1 shows that this structure can be reduced to the marginal structure of $(Y, X_j)$ by judging the difference between $F_{jk}(x)$ and $F_j(x)$ for each $j = 1, \ldots, p$ and $k = 1, \ldots, K$.
In terms of the quantile-heterogeneity of the response, consider a series of tests to detect sufficient active variables simultaneously at different quantile levels, that is,
$$H_{0,j,k}: I(Y \in G_k) \perp\!\!\!\perp X_j \mid X_{A_k} \quad \text{versus} \quad H_{1,j,k}: I(Y \in G_k) \not\perp\!\!\!\perp X_j \mid X_{A_k} \tag{2}$$
for $1 \le j \le p$ and $1 \le k \le K$. Rewrite the test in Equation (1) as
$$H_{0,j}: Y \perp\!\!\!\perp X_j \mid X_A \quad \text{versus} \quad H_{1,j}: Y \not\perp\!\!\!\perp X_j \mid X_A, \tag{3}$$
where $A = \bigcup_{k=1}^{K} A_k$. To investigate the difference in the conditional distribution of $X_j$ ($j = 1, \ldots, p$) at different quantile levels, for given $k \in \{1, \ldots, K\}$, a variable screening approach that captures the dependence between $I(Y \in G_k)$ and $X_j$ is specified through the following screening utility:
$$\upsilon_{jk} = 12 \cdot (n+1) \cdot \frac{p_k}{1 - p_k} \cdot \tau_{jk}^2, \tag{4}$$
where $\tau_{jk} = E_{X_j}\{F_{jk}(x) - F_j(x)\} = E_x\{F_{jk}(x)\} - \frac{1}{2}$, and $F_{jk}(x) - F_j(x)$ reflects the difference between the conditional cumulative distribution function (CDF) and the marginal CDF of $X_j$ at each quantile level. In fact, $\upsilon_{jk} = 12 \cdot (n+1) \cdot \mathrm{Var}_{\mathrm{OVR}}\{\tau_{jk}\} = 12 \cdot (n+1) \cdot \left(p_k \cdot \tau_{jk}^2 + (1 - p_k) \cdot \tau_{j\bar{k}}^2\right)$, where $\tau_{j\bar{k}} = E_{X_j}\{F_{j\bar{k}}(x) - F_j(x)\} = E_x\{F_{j\bar{k}}(x)\} - \frac{1}{2}$, $F_{j\bar{k}}(x) = \Pr(X_j \le x \mid Y \notin [Q_{k-1}, Q_k))$, and $\mathrm{Var}_{\mathrm{OVR}}\{\tau_{jk}\}$ represents the variance of $\tau_{jk}$ in the one-versus-rest test of the k-th series $H_{0,j,k}$. By the definition in Equation (4), a higher $\upsilon_{jk}$ represents a stronger correlation between the variable $X_j$ and $I(Y \in G_k)$.
Then, given the complete sample $(X, Y)$, a sample estimate of $\tau_{jk}$, $j = 1, \ldots, p$, $k = 1, \ldots, K$, is suggested as
$$\hat{\tau}_{jk} = \frac{1}{n+1} \sum_{l=1}^{n} \frac{\frac{1}{n} \sum_{i=1}^{n} I(X_{ij} \le x_{lj},\, Y_i \in \hat{G}_k)}{\hat{p}_k} - \frac{1}{2} = \frac{1}{(n+1)\, n_k} \sum_{i=1}^{n} \eta_{ik} \cdot \xi_{ij} - \frac{1}{2},$$
where $\eta_{ik} = I(Y_i \in \hat{G}_k)$ indicates whether the response of the i-th sample belongs to the k-th grid, $\xi_{ij} = \sum_{l=1}^{n} I(X_{ij} \le x_{lj})$, $\hat{p}_k = n_k / n = n^{-1} \sum_{i=1}^{n} \eta_{ik}$, and $k = 1, \ldots, K$. Correspondingly, the test statistic is specified as the following conditional rank correlation:
$$\hat{\upsilon}_{jk} = 12 \cdot (n+1) \cdot \frac{n_k}{n - n_k}\, \hat{\tau}_{jk}^2 = 12 \cdot (n+1) \cdot \frac{n_k}{n - n_k} \left( \frac{1}{(n+1)\, n_k} \sum_{i=1}^{n} \eta_{ik} \cdot \xi_{ij} - \frac{1}{2} \right)^2. \tag{5}$$
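The whole matrix of statistics can be computed with one rank pass per column. The sketch below assumes the rank form of $\hat{\tau}_{jk}$ given in the text (mean scaled rank within a grid, centered at 1/2) and equally spaced quantile grids; the function name and the toy model are illustrative, not the authors' code, and ties are ignored since the covariates are continuous.

```python
import numpy as np

def qa_svs_stat(X, y, K):
    """For each predictor j and grid k, compute
    tau_hat_jk = sum_i eta_ik * xi_ij / ((n+1) n_k) - 1/2, with xi_ij the
    rank of X_ij in column j, and
    upsilon_hat_jk = 12 (n+1) n_k / (n - n_k) * tau_hat_jk**2."""
    n, p = X.shape
    gammas = np.arange(1, K) / K
    edges = np.concatenate(([-np.inf], np.quantile(y, gammas), [np.inf]))
    eta = (y[:, None] >= edges[:-1]) & (y[:, None] < edges[1:])   # (n, K)
    n_k = eta.sum(axis=0)                                          # (K,)
    ranks = X.argsort(axis=0).argsort(axis=0) + 1                  # (n, p)
    # tau_hat: (K, p) -- mean scaled rank within each grid, centered at 1/2
    tau = eta.T.astype(float) @ ranks / ((n + 1) * n_k[:, None]) - 0.5
    upsilon = 12 * (n + 1) * (n_k / (n - n_k))[:, None] * tau**2
    return upsilon                                                 # (K, p)

rng = np.random.default_rng(1)
n, p, K = 400, 50, 4
X = rng.normal(size=(n, p))
y = X[:, 0] + 0.1 * rng.normal(size=n)   # only the first column is active
ups = qa_svs_stat(X, y, K)
# the active column should dominate the (approximately chi2_1) null columns
```

For the active column, the mean rank in the lowest grid is far below $(n+1)/2$, so its statistic is orders of magnitude larger than the null ones.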

2.2. Asymptotic Properties of the Test Statistic

According to the approximation distribution for a sample sum under sampling without replacement from a finite population in Mohamed and Mirakhmedov (2016) [26], the asymptotic properties of $\hat{\tau}_{jk}$ and $\hat{\upsilon}_{jk}$ are obtained as follows.
Lemma 2 (Asymptotic Distribution of $\hat{\tau}_{jk}$).
If $H_{0,j,k}$ is true for all $k = 1, 2, \ldots, K$ and $j = 1, 2, \ldots, p$, and $\lim_{n \to \infty} \hat{p}_k (1 - \hat{p}_k) > 0$, then we obtain the following asymptotic distribution:
$$\hat{\tau}_{jk} \xrightarrow{L} N\left(0,\ \frac{1 - \hat{p}_k}{12 (n+1)\, \hat{p}_k}\right).$$
Denote $\Phi(u) = (2\pi)^{-1/2} \int_{-\infty}^{u} \exp\{-v^2/2\}\, dv$; then $F_{\hat{\tau}_{jk}}(u) = \Pr\left\{\hat{\tau}_{jk} < u \sqrt{\tfrac{1 - \hat{p}_k}{12(n+1)\hat{p}_k}}\right\}$ satisfies
$$\frac{1 - F_{\hat{\tau}_{jk}}(u)}{1 - \Phi(u)} = \frac{F_{\hat{\tau}_{jk}}(-u)}{\Phi(-u)} = 1 + O(n^{-1/2}),$$
where $u = O(n^\beta) > 0$ and $\beta \le 1/2$.
Corollary 1 (Asymptotic Distribution of $\hat{\upsilon}_{jk}$).
If $H_{0,j,k}$ is true for any $k = 1, 2, \ldots, K$ and $j = 1, 2, \ldots, p$, and $\lim_{n \to \infty} \hat{p}_k (1 - \hat{p}_k) > 0$, we have $\hat{\upsilon}_{jk} \xrightarrow{D} \chi_1^2$, where $\chi_m^2$ denotes the chi-square distribution with m degrees of freedom. If $H_{0,j}$ is true for any $j = 1, 2, \ldots, p$, we obtain $\hat{\upsilon}_j \xrightarrow{D} \chi_{K-1}^2$, where $\hat{\upsilon}_j = (K-1)/K \cdot \sum_{k=1}^{K} \hat{\upsilon}_{jk}$. Then, $\hat{\upsilon}_{jk}$ and $\hat{\upsilon}_j$ satisfy
$$\frac{F_{\hat{\upsilon}_{jk}}(u)}{F_{\chi_1^2}(u)} = 1 + O(n^{-1/2}), \qquad \frac{F_{\hat{\upsilon}_j}(u)}{F_{\chi_{K-1}^2}(u)} = 1 + O(n^{-1/2}),$$
where $F_{\chi_{K-1}^2}(u) = \Pr(\chi_{K-1}^2 \le u)$ is the CDF of $\chi_{K-1}^2$.
Lemma 2 and Corollary 1 are proved in Appendix A.3 and Appendix A.4, respectively. By Lemma 2, the asymptotic normal distribution of $\hat{\tau}_{jk}$ depends on $\hat{p}_k$. Thus, to remove the influence of $\hat{p}_k$ on the asymptotic distribution and to accommodate the composite hypothesis test in (3), $\hat{\upsilon}_{jk}$ is introduced in this paper.
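The $\chi_1^2$ null limit in Corollary 1 is easy to check by Monte Carlo. The sketch below implements the statistic for a single predictor and a single grid in the rank form assumed in Section 2.1 (function name and simulation design are illustrative assumptions) and compares its null moments with $E[\chi_1^2] = 1$ and $\mathrm{Var}[\chi_1^2] = 2$.

```python
import numpy as np

def upsilon_hat(x, eta_k):
    """Statistic for one predictor and one grid, in the assumed rank form:
    tau_hat = sum(eta * rank) / ((n+1) n_k) - 1/2 and
    upsilon_hat = 12 (n+1) n_k / (n - n_k) * tau_hat**2."""
    n, n_k = x.size, eta_k.sum()
    ranks = x.argsort().argsort() + 1
    tau = (eta_k * ranks).sum() / ((n + 1) * n_k) - 0.5
    return 12 * (n + 1) * n_k / (n - n_k) * tau**2

# Monte Carlo under H0: X independent of the grid indicator, p_k = 1/4
rng = np.random.default_rng(3)
n, reps = 300, 2000
vals = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    eta = np.zeros(n)
    eta[: n // 4] = 1          # a fixed grid indicator, independent of x
    vals[r] = upsilon_hat(x, eta)
# sample mean and variance of vals should sit near 1 and 2, respectively
```

A short calculation with the finite-population variance of a mean of ranks shows $E[\hat{\upsilon}_{jk}] = 1$ exactly under the null, consistent with the simulation.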
Given the additional conditions below, we can obtain Theorems 1 and 2.
(C1) 
There exist constants $c_1 > 0$ and $c_2 > 0$ such that $c_1/K \le \min_{1 \le k \le K} p_k \le \max_{1 \le k \le K} p_k \le c_2/K$;
(C2) 
There exists $\rho_0 = O(1/p^2) > 0$ such that $\liminf_{p \to \infty} \min_{j_1 \in A_{k_1}} \upsilon_{j_1 k_1} \ge \rho_0 \ge \limsup_{p \to \infty} \max_{j_2 \notin A_{k_2}} \upsilon_{j_2 k_2}$;
(C3) 
The number of grids of the response satisfies $K = O(n^\xi)$, where $\xi > 0$ and $\kappa + \xi < 1/2$.
Condition (C1) requires that the proportion of samples in each grid be neither too small nor too large. Condition (C2) guarantees the existence of a threshold $\rho_0$ such that a value $\upsilon_{jk} \le \rho_0$ represents a weaker correlation. Condition (C3) allows the number of grids to diverge as n increases, which ensures the rationality of the series of hypothesis tests. Conditions (C1)–(C3) provide concise foundations and do not impose any distributional models or moment assumptions on the variables.
Theorem 1 (Sure Screening Property).
Suppose conditions (C1) and (C3) hold. Then, for any constant $c_3 > 0$, we have
$$\Pr\left(\max_{1 \le j \le p} |\hat{\upsilon}_{jk} - \upsilon_{jk}| \ge c_3 n^{-\kappa}\right) \le 8\, p \exp\left(-c_4\, n^{1 - 2\kappa - 2\xi}\right),$$
where $c_4 > 0$ and $k = 1, \ldots, K$.
Theorem 2 (Ranking Consistency Property).
Suppose conditions (C1) and (C2) hold. If $K \log(p) = o(n \rho_0^2)$, then
$$\liminf_{n \to \infty} \left\{ \min_{j_1 \in A_{k_1}} \hat{\upsilon}_{j_1 k_1} - \max_{j_2 \notin A_{k_2}} \hat{\upsilon}_{j_2 k_2} \right\} > 0.$$
We provide the proofs of Theorems 1 and 2 in Appendix A.5 and Appendix A.6, respectively. Note that Theorem 1 is established for a fixed number of variables p. As long as $4(n+2)\, p \exp(-c_4 n^{1-2\kappa-\xi})$ tends to 0 asymptotically, the sure screening property of QA-SVS is robust to heavy-tailed distributions of the predictors and the presence of potential outliers. The ranking consistency property in Theorem 2 indicates that, with high probability, the values of $\hat{\upsilon}_{jk}$ of the sufficient active variables for the k-th grid rank above those of all the inactive ones, which implies that QA-SVS can separate the active and inactive variables with a certain threshold. Theorems 1 and 2 mainly describe the properties of the marginal utility itself; the estimation of that threshold for partitioning the sufficient variable sets is taken up in Section 3.

3. False Discovery Control Model

Based on Theorems 1 and 2, we design two routes for screening sufficient active variables while accounting for false discoveries (FD): one controls the number of FDs adaptively by outlier detection, and the other controls the false discovery rate (FDR) accurately via the survival function.

3.1. Adaptive False Discovery Control Model

Recall property (iii) of controlling the false discovery in Xie et al. (2020) [17]: the number of screened variables is bounded with high probability. If we let $\rho$ satisfy $S_{\chi_1^2}(\rho) = 1 - F_{\chi_1^2}(\rho) = 1/p$, so that $\rho$ is the $1/p$-th upper quantile of $\chi_1^2$, we obtain an adaptive path to control the false discovery by an outlier method. Without assuming any actual distribution, using the proposed test statistic $\hat{\upsilon}_{jk}$, the false discovery is
$$\mathrm{FD}_{k,\rho} = \sum_{j \in A_k^c} I(\hat{\upsilon}_{jk} \ge \rho),$$
where the expectation of the false discovery number is $\mathrm{EFD}_{k,\rho} = E(\mathrm{FD}_{k,\rho})$ and its variance is $\mathrm{VFD}_{k,\rho} = \mathrm{Var}(\mathrm{FD}_{k,\rho})$ for any given $\rho$. By Corollary 1 in Section 2.2, under $H_{0,j,k}$, each $\hat{\upsilon}_{jk}$ converges in distribution to $\chi_1^2$ under certain mild conditions. Let $q_k = p - s_k$ be the cardinality of $A_k^c$; for any given $\rho$, we have $s_k / p \to 0$ as $p \to \infty$.
Theorem 3 (Adaptively Controlling False Discovery).
Suppose conditions (C1)–(C3) hold. For a fixed $\rho = \hat{\rho}_0$ satisfying $S_{\chi_1^2}(\hat{\rho}_0) = 1/p = O(n^{-2\beta})$, where $\beta > 1/2$, we obtain
$$\lim_{p \to \infty} \Pr\{\mathrm{FD}_{k,\hat{\rho}_0} > 0\} = 1 - e^{-1} \quad (k = 1, \ldots, K),$$
$$\lim_{n \to \infty} \mathrm{EFD}_{k,\hat{\rho}_0} = \lim_{n \to \infty} \mathrm{VFD}_{k,\hat{\rho}_0} = 1,$$
where $\mathrm{EFD}_{k,\hat{\rho}_0} = 1 + O(n^{-1/2})$ and $\mathrm{VFD}_{k,\hat{\rho}_0} = 1 + O(n^{-1/2})$.
We prove this property in Appendix A.7. Theorem 3 implies that the adaptive threshold $\hat{\rho}_0$ can separate the active and inactive variables with a low number of false discoveries; the probability of making any false discovery converges to $1 - e^{-1}$ as p increases. The expectation and variance of the FD number are both controlled at $1 + O(n^{-1/2})$, indicating that the number of selected variables is sufficiently controlled. The sufficient screened set is defined as
$$\hat{A}_{k,\hat{\rho}_0} \triangleq \{j : \hat{\upsilon}_{jk} \ge \hat{\rho}_0,\ 1 \le j \le p\}. \tag{10}$$
The definition of $\hat{A}_{k,\hat{\rho}_0}$ estimates the smallest and unique active variable set $X_A$ such that $Y \perp\!\!\!\perp X_{A^c} \mid X_A$. Furthermore, we obtain the sufficient screening property of $\hat{A}_{k,\hat{\rho}_0}$.
Corollary 2 (Sufficient Screening Property by AFD).
Suppose conditions (C1)–(C3) hold; then we have
$$\Pr(A_k \subseteq \hat{A}_{k,\hat{\rho}_0}) \ge 1 - 8\, s_k \exp\left(-c_6\, n^{1 - 2\kappa - 2\xi}\right),$$
where $c_6$ is some positive constant and $s_k = |A_k|$ is the true model size, $k = 1, \ldots, K$.
Corollary 2 is proved in Appendix A.8. In fact, Corollary 2 can also be regarded as the sure screening property of Fan and Lv (2008). Under the definition of sufficient variables, screening by controlling the false discovery can lead to more precise results, so we rename the property in Corollary 2 the sufficient screening property. We call the proposed AFD control procedure QA-SVS-AFD. It is computationally efficient, and its validity in detecting active variables is guaranteed by Corollary 2. A stock-in-trade of existing screening methods such as Xie et al. (2020) [17] is to bound the cardinality of the screened active variable set by a certain threshold, so that the number of screened variables becomes negligible relative to the ultra-high dimensionality; however, the number of false discoveries remains non-negligible. In contrast, the QA-SVS-AFD procedure controls the false discovery precisely by driving the expectation and variance of the FD number to converge to 1.
The threshold estimation in Theorem 3 determines the rejection region at level $O(1/p)$. In other words, we reject the null hypothesis $H_{0,j,k}$ at a significance level of about $1/p$. As a result, the maximal subset of variables in the rejection region is the AFD estimate of the sufficient active variable set. The AFD control path is summarized in the following Algorithm 1:
Algorithm 1 QA-SVS-AFD algorithm.
Input: observation sample $(X, Y)$ and the number of grids K
Output: the screened sufficient variable set $\hat{A}_{k,\hat{\rho}_0}$ ($k = 1, \ldots, K$)
Step 1: Calculate $\hat{\upsilon}_{1k}, \ldots, \hat{\upsilon}_{pk}$ of Equation (5) for each $k = 1, \ldots, K$;
Step 2: Compute $\hat{\rho}_0$ from $S_{\chi_1^2}(\hat{\rho}_0) = 1/p$;
Step 3: Collect the screened sufficient active variable set $\hat{A}_{k,\hat{\rho}_0}$ in Equation (10).
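Algorithm 1 can be sketched in a few lines once the statistics are available; the adaptive threshold is simply the upper $1/p$ quantile of $\chi_1^2$. The function name and the toy statistic values below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2

def qa_svs_afd(upsilon):
    """Sketch of Algorithm 1: given a (K, p) matrix of statistics
    upsilon_hat_jk, pick the adaptive threshold rho_0 as the 1/p upper
    quantile of chi-square(1) and keep the predictors exceeding it."""
    K, p = upsilon.shape
    rho_0 = chi2.isf(1.0 / p, df=1)     # solves S_{chi2_1}(rho_0) = 1/p
    return [np.flatnonzero(upsilon[k] >= rho_0) for k in range(K)]

# toy statistics: one clearly active predictor per grid (hypothetical values)
ups = np.full((2, 100), 0.5)
ups[0, 3] = ups[1, 7] = 50.0
sets = qa_svs_afd(ups)
# grid 0 keeps index 3; grid 1 keeps index 7
```

With $p = 100$, the threshold is $\chi_1^2$'s upper 1% point ($\approx 6.63$), so only the two spiked statistics survive.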
Remark 4.
Alternatively, if one focuses on selecting sufficient predictors relevant to the response Y, one can consider the refined version
$$\hat{A}_{\hat{\rho}_0^*} \triangleq \{j : \hat{\upsilon}_j \ge \hat{\rho}_0^*,\ 1 \le j \le p\}$$
for testing $H_{0,j}$ in Equation (1), where $\hat{\upsilon}_j = (K-1)/K \cdot \sum_{k=1}^{K} \hat{\upsilon}_{jk}$, $j = 1, \ldots, p$, and $\hat{\rho}_0^*$ satisfies $S_{\chi_{K-1}^2}(\hat{\rho}_0^*) = 1/p = O(n^{-2\beta})$. The estimates in Algorithm 1 are replaced by $\hat{\upsilon}_j$ and $\hat{A}_{\hat{\rho}_0^*}$. As a special case, when $K = 2$, we have $\hat{A}_{\hat{\rho}_0^*} = \hat{A}_{1,\hat{\rho}_0} = \hat{A}_{2,\hat{\rho}_0}$. The result can be proved simply from Corollary 2, and we omit it.

3.2. False Discovery Rate Control Model

The adaptive false discovery control model sets the rejection region adaptively, fixing the probability of rejecting the null hypothesis, which may lead to a large type-II error in hypothesis testing (2). Therefore, similar to Tang et al. (2021) [19], we instead control the type-I error of hypothesis testing (2): a false discovery rate (FDR) control procedure is developed for testing $H_{0,j,k}$ simultaneously for $j = 1, \ldots, p$ and $k = 1, \ldots, K$. Without assuming any prespecified distribution, to sufficiently detect active variables at different quantile levels, we provide a suitable estimate of the threshold $\rho$ that separates the sufficient active variables while controlling the FDR of each $H_{0,j,k}$.
With the proposed test statistic $\hat{\upsilon}_{jk}$, the false discovery proportion is
$$\mathrm{FDP}_{k,\rho} = \frac{\mathrm{FD}_{k,\rho}}{\max\{\sum_{j \in I} I(\hat{\upsilon}_{jk} \ge \rho),\ 1\}} = \frac{\sum_{j \in A_k^c} I(\hat{\upsilon}_{jk} \ge \rho)}{\max\{\sum_{j \in I} I(\hat{\upsilon}_{jk} \ge \rho),\ 1\}}$$
for any given $\rho$, and the false discovery rate is $\mathrm{FDR}_{k,\rho} = E(\mathrm{FDP}_{k,\rho})$. By Corollary 1 in Section 2.2, under $H_{0,j,k}$, each $\hat{\upsilon}_{jk}$ converges in distribution to $\chi_1^2$ under conditions (C1)–(C3). Let $q_k = p - s_k$ be the cardinality of $A_k^c$; for any given $\rho$, we work under the assumption that $s_k / p \to 0$ as $p \to \infty$. Intuitively, an estimate of $\mathrm{FDR}_{k,\rho}$ could use the approximation $\mathrm{FD}_{k,\rho} / q_k \approx \max\{\sum_{j \in I} I(\hat{\upsilon}_{jk} \ge \rho),\ 1\} / p$. However, the separation of the null set $A_k^c$ and $q_k$ is still intractable. Thus, we estimate the FDR by replacing $\mathrm{FD}_{k,\rho} / q_k$ with $S_{\chi_1^2}(\rho) = 1 - F_{\chi_1^2}(\rho)$, the survival function of the $\chi_1^2$ distribution. Hence, for any given $\rho$, the estimated $\mathrm{FDR}_{k,\rho}$ is defined as
$$\widehat{\mathrm{FDR}}_{k,\rho} = \frac{p\, S_{\chi_1^2}(\rho)}{\max\{\sum_{j \in I} I(\hat{\upsilon}_{jk} \ge \rho),\ 1\}}. \tag{11}$$
Consequently, similar to the procedures of Benjamini and Hochberg (1995) [27] and Tang et al. (2021) [19] for controlling the FDR at a prespecified level $\alpha \in (0, 1)$, we suggest estimating the threshold $\rho$ for screening the sufficient active variables by
$$\hat{\rho}_k = \inf\{0 \le \rho \le \rho_0 : \widehat{\mathrm{FDR}}_{k,\rho} \le \alpha\}$$
for the constant $\rho_0$ given in Condition (C2). In practical implementation, $\rho$ ranges over the observed values $\hat{\upsilon}_{1k}, \ldots, \hat{\upsilon}_{pk}$ when evaluating $\widehat{\mathrm{FDR}}_{k,\rho}$. Thus, the screened set is defined as
$$\hat{A}_{k,\alpha} \triangleq \{j : \widehat{\mathrm{FDR}}_{k,\hat{\upsilon}_{jk}} \le \alpha,\ 1 \le j \le p\}. \tag{12}$$
Define $\hat{\upsilon}_{lk} \triangleq \arg\max_{j \in \hat{A}_{k,\alpha}} \widehat{\mathrm{FDR}}_{k,\hat{\upsilon}_{jk}}$. In other words, $\hat{\upsilon}_{lk}$ is the threshold $\rho$ such that $\mathrm{FDR}_{k,\rho}$ is maximized subject to $\widehat{\mathrm{FDR}}_{k,\rho} \le \alpha$; hence the FDR estimate is $\widehat{\mathrm{FDR}}_{k,\hat{\upsilon}_{lk}}$. The proposed FDR control path is summarized in the following Algorithm 2:
Algorithm 2 QA-SVS-FDR(K) algorithm.
Input: observation sample $(X, Y)$, the number of grids K, and the prespecified level $\alpha$
Output: the screened sufficient variable sets $\hat{A}_{k,\alpha}$ ($k = 1, \ldots, K$)
Step 1: Calculate $\hat{\upsilon}_{1k}, \ldots, \hat{\upsilon}_{pk}$ of Equation (5) for each $k = 1, \ldots, K$;
Step 2: Compute $\widehat{\mathrm{FDR}}_{k,\rho}$ of Equation (11) with $\rho$ taking each value of $\hat{\upsilon}_{1k}, \ldots, \hat{\upsilon}_{pk}$;
Step 3: For given $\alpha$, search for the set $\hat{A}_{k,\alpha} = \{j : \widehat{\mathrm{FDR}}_{k,\hat{\upsilon}_{jk}} \le \alpha,\ 1 \le j \le p\}$ in Equation (12);
Step 4: Find $\hat{\upsilon}_{lk} \triangleq \arg\max_{j \in \hat{A}_{k,\alpha}} \widehat{\mathrm{FDR}}_{k,\hat{\upsilon}_{jk}}$ and let $\hat{\rho}_k = \hat{\upsilon}_{lk}$;
Step 5: Separate the screened sufficient active set $\hat{A}_{k,\alpha}$ of Equation (12) by $\hat{\rho}_k$.
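For one grid k, Algorithm 2 amounts to a BH-style step-up search over the observed statistics, with the null tail estimated by the $\chi_1^2$ survival function. The sketch below is an assumed implementation (function name and toy data are illustrative), not the authors' code.

```python
import numpy as np
from scipy.stats import chi2

def qa_svs_fdr(upsilon_k, alpha):
    """Sketch of Algorithm 2 for one grid k: estimate
    FDR_hat(rho) = p * S_{chi2_1}(rho) / max(#{j : upsilon_j >= rho}, 1)
    at each observed statistic and keep predictors whose estimated FDR
    is at most alpha (a step-up thresholding, as in Benjamini-Hochberg)."""
    p = upsilon_k.size
    order = np.argsort(upsilon_k)[::-1]            # statistics, descending
    # when rho equals the m-th largest statistic, #{upsilon >= rho} = m
    counts = np.arange(1, p + 1)
    fdr_hat = p * chi2.sf(upsilon_k[order], df=1) / counts
    keep = fdr_hat <= alpha
    if not keep.any():
        return np.array([], dtype=int)
    # rho_hat_k is the smallest statistic still meeting the FDR level
    m = int(np.max(np.flatnonzero(keep)))
    return np.sort(order[: m + 1])

rng = np.random.default_rng(2)
stats_null = rng.chisquare(df=1, size=95)                 # null statistics
stats_k = np.concatenate([stats_null, np.full(5, 60.0)])  # 5 strong signals
sel = qa_svs_fdr(stats_k, alpha=0.1)
# the five spiked predictors (indices 95..99) should be among those selected
```

The step-up form means a handful of borderline null statistics may occasionally be kept, which is exactly the behavior the FDR level $\alpha$ tolerates.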
We call the proposed FDR control path QA-SVS-FDR. Its computational cost is of order $O(Kp)$, so it is also computationally efficient, and its validity in detecting active variables is guaranteed by the following theorem.
Theorem 4 (Sufficient Screening Property by Controlling FDR).
Suppose conditions (C1)–(C3) hold; we obtain
$$\Pr(A_k \subseteq \hat{A}_{k,\alpha}) \ge 1 - 8 \exp\left(-c_7\, n^{1 - 2\kappa - \xi}\right),$$
where $c_7$ is some positive constant and $s_k = |A_k|$ is the true model size, $k = 1, \ldots, K$. For a prespecified level $\alpha$, if $s_k = |A_k| = O(n^\varsigma)$ for some $\varsigma < 1/2$, the FDR of the proposed multiple testing procedure satisfies
$$\lim_{n \to \infty} \Pr\left(\widehat{\mathrm{FDR}}_{\hat{\rho}_k} \le \alpha\right) = 1,$$
where $\hat{\rho}_k$ is given via Equation (11).
We prove Theorem 4 in Appendix A.9. Theorem 4 shows the sufficient screening property of the estimate obtained by controlling the FDR accurately. Screening by bounding the cardinality of the selected set with an empirical threshold leaves the FDR non-negligible. Hence, thanks to the asymptotic null distribution of the test statistic, the FDR of QA-SVS-FDR can be controlled accurately at a prespecified level $\alpha$, as the FDR estimate is approximated sufficiently well for large n.
Remark 5.
Alternatively, if one focuses on selecting sufficient predictors relevant to the response Y by testing $H_{0,j}$ in Equation (1), one can consider the refined version
$$\mathrm{FDP}_{\rho^*} = \frac{\sum_{j \in A^c} I(\hat{\upsilon}_j \ge \rho^*)}{\max\{\sum_{j \in I} I(\hat{\upsilon}_j \ge \rho^*),\ 1\}},$$
with the estimate of $\mathrm{FDR}_{\rho^*}$ given by
$$\widehat{\mathrm{FDR}}_{\rho^*} = \frac{p\, S_{\chi_{K-1}^2}(\rho^*)}{\max\{\sum_{j \in I} I(\hat{\upsilon}_j \ge \rho^*),\ 1\}},$$
where $\hat{\upsilon}_j = (K-1)/K \cdot \sum_{k=1}^{K} \hat{\upsilon}_{jk}$. Consequently, select the threshold $\rho^*$ by
$$\hat{\rho}^* = \inf\{0 \le \rho^* \le \rho_0 : \widehat{\mathrm{FDR}}_{\rho^*} \le \alpha\}.$$
As a result, the screened sufficient active variable set is defined as
$$\hat{A}_\alpha \triangleq \{j : \widehat{\mathrm{FDR}}_{\hat{\upsilon}_j} \le \alpha,\ 1 \le j \le p\}.$$
Define $\hat{\upsilon}_l \triangleq \arg\max_{j \in \hat{A}_\alpha} \widehat{\mathrm{FDR}}_{\hat{\upsilon}_j}$; the FDR estimate is then $\widehat{\mathrm{FDR}}_{\hat{\upsilon}_l}$. The path to $\hat{A}_\alpha$ is summarized in Algorithm 3. Under the given level $\alpha$, the FDR of testing (3) satisfies $\lim_{n \to \infty} \Pr(\widehat{\mathrm{FDR}}_{\hat{\rho}^*} \le \alpha) = 1$. The conclusion can be proved simply from Corollary 2, and we omit it.
Algorithm 3 QA-SVS-FDR-S algorithm.
Input: observation sample $(X, Y)$, the number of grids K, and the prespecified level $\alpha$
Output: the screened sufficient variable set $\hat{A}_\alpha$
Step 1: Calculate $\hat{\upsilon}_1, \ldots, \hat{\upsilon}_p$ in Remark 5;
Step 2: Compute $\widehat{\mathrm{FDR}}_{\rho^*}$ in Remark 5 with $\rho^*$ taking each value of $\hat{\upsilon}_1, \ldots, \hat{\upsilon}_p$;
Step 3: For given $\alpha$, determine the set $\hat{A}_\alpha$ in Remark 5;
Step 4: Find $\hat{\upsilon}_l \triangleq \arg\max_{j \in \hat{A}_\alpha} \widehat{\mathrm{FDR}}_{\hat{\upsilon}_j}$ and let $\hat{\rho}^* = \hat{\upsilon}_l$;
Step 5: Separate the screened sufficient active set $\hat{A}_\alpha$ in Remark 5 by $\hat{\rho}^*$.
Thus far, we have presented the two paths of sufficient variable screening by controlling the false discovery. The two paths have different underlying frameworks: one uses an adaptive threshold and an outlier-detection model to control the false discovery, and the other controls the false discovery rate accurately via survival-function estimation under a given prespecified level $\alpha$. Both paths can control the false discovery to sufficiently screen active predictors, which simplifies the two-step sufficient screening procedure in Yuan et al. (2022) [24].

4. Simulation Studies

In this section, the performance of the proposed procedure is demonstrated via several simulated examples. In practice, a sample-splitting idea is adopted to avoid the mathematical challenges caused by the reuse of the sample. Let $\{(Y_i^{(1)}, X_i^{(1)T}), i = 1, \ldots, n_1\}$ and $\{(Y_i^{(2)}, X_i^{(2)T}), i = 1, \ldots, n_2\}$ be a random disjoint partition of $\{(Y_i, X_i^T), i = 1, \ldots, n\}$. The proposed sufficient screening procedure consists of two steps: QA-SVS-S, to screen all active variables, and QA-SVS-FD, to control the FD adaptively (QA-SVS-AFD) or to control the FDR accurately (QA-SVS-FDR). The two steps are specified as follows:
(1) QA-SVS-S: the p covariates are ranked in descending order according to Remark 5 based on $\{(Y_i^{(1)}, X_i^{(1)T}), i = 1, \ldots, n_1\}$, and the minimum model size that includes all active variables is evaluated.
(2) QA-SVS-FD: based on $\{(Y_i^{(2)}, X_i^{(2)T}), i = 1, \ldots, n_2\}$, (i) the sufficient predictors are screened according to Equation (10) at different quantile levels, denoted by $\hat{A}_k^{\mathrm{AFD}}$; (ii) given an FDR level $\alpha$, the threshold $\hat{\rho}_k$ is estimated from Equation (11), and the selected set $\hat{A}_{k,\alpha}$ is defined by Equation (12).

4.1. Performance of QA-SVS-A

In this subsection, the variable screening performance of our proposed QA-SVS is compared with SIS (Fan and Lv, 2008) [1], the distance correlation-based screening (DC-SIS; Li et al., 2012) [8], the quantile-adaptive model-free sure independence screening (QA-SIS; He et al., 2013) [9], and the quantile-based correlation screening (QCS; Tang et al., 2013) [19]. The performance of each procedure is evaluated via the 5%, 25%, 50%, 75%, and 95% quantiles of the minimum model size that contains all active variables, based on 100 replications. The closer this size is to the true model size, the better the variable screening performs.
In the simulation, the predictors X = ( X 1 , … , X p ) T are generated from a p-variate normal distribution with mean 0 and covariance matrix Σ = ( σ i j ) p × p , where σ i j = ρ^{ | i − j | } . We set ρ = 0 and 0.5 . Let the number of quantile grid points be K = 5 , 6 , … , 11 . To simulate a high-dimensional scenario, we set n = 500 and p = 1000 or 5000 for each scenario. The response variable is sampled from the following models:
Scenario 1.1: Z 1 = 0.5 X 1 + 0.5 X 2 + 0.5 X 101 + ε ;
Scenario 1.2: Z 2 = 0.8 X 3 + 0.5 ( X 4 + 1 )² + 0.5 tan ( π ( X 102 + 1 ) / 4 ) + ε ;
Scenario 1.3: Z 3 = 0.5 exp ( 3 X 5 ) + sin ( π X 6 / 2 ) + 5 X 103 I ( X 103 > Q 0.8 , X 103 ) + ε ;
Scenario 1.4: Z 4 = ( 1 + X 7 + X 8 )³ exp ( 1 + 3 sin ( π X 104 / 2 ) ) + ε ;
Scenario 1.5: Z 5 = 0.5 X 9 + tan ( π ( X 10 − 1 ) ( X 105 + 1 ) / 4 ) + ε ;
Scenario 1.6: Z 6 = 2 ( X 11 + 1 )² X 12 X 106 I ( X 12 > Q 0.5 , X 12 , X 106 < Q 0.5 , X 106 ) + ε .
The error term ε follows N ( 0 , 1 ) , independent of X . The quantiles of the minimum model size that includes all active variables in Scenarios 1.1 and 1.2 with p = 1000 and p = 5000 are shown in Table 1 and Table 2. Due to limited space, the simulation results of the remaining scenarios are presented in Appendix B Table A1, Table A2, Table A3 and Table A4.
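Under our reading of the design, predictors with covariance σ_ij = ρ^|i−j| can be generated recursively, since the AR(1) construction X_j = ρ X_{j−1} + √(1 − ρ²) Z_j with i.i.d. standard normal Z_j yields exactly this correlation structure. A minimal sketch for Scenario 1.1 (function names are our own; this is an illustration, not the authors' code):

```python
import math
import random

def ar1_row(p, rho, rng):
    """One draw of X = (X_1, ..., X_p) with Corr(X_i, X_j) = rho^|i-j|,
    built recursively: X_j = rho*X_{j-1} + sqrt(1 - rho^2)*Z_j."""
    x = [rng.gauss(0.0, 1.0)]
    scale = math.sqrt(1.0 - rho * rho)
    for _ in range(1, p):
        x.append(rho * x[-1] + scale * rng.gauss(0.0, 1.0))
    return x

def scenario_1_1(n, p, rho, seed=0):
    """Scenario 1.1: Z1 = 0.5*X_1 + 0.5*X_2 + 0.5*X_101 + eps,
    with eps ~ N(0, 1) independent of X."""
    rng = random.Random(seed)
    X, Y = [], []
    for _ in range(n):
        x = ar1_row(p, rho, rng)
        # indices 0, 1, 100 correspond to X_1, X_2, X_101
        y = 0.5 * x[0] + 0.5 * x[1] + 0.5 * x[100] + rng.gauss(0.0, 1.0)
        X.append(x)
        Y.append(y)
    return X, Y

X, Y = scenario_1_1(n=200, p=150, rho=0.5)
```

A smaller p is used here only to keep the sketch fast; the paper's settings are n = 500 with p = 1000 or 5000.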
Under Scenario 1.1, a linear model with a small signal-to-noise ratio, all five methods perform well. Under Scenario 1.2, an additive model, QCS, DC-SIS, and QA-SIS perform comparably to the proposed QA-SVS-A procedure, while SIS fails to detect the active predictors. Under Scenario 1.3, with a nonlinear relationship between the response and predictors, all methods effectively screen out the inactive predictors, except for QA-SIS at the high quantile level. Under Scenario 1.4, also nonlinear, the proposed QA-SVS-A and QCS screening procedures behave effectively, QA-SIS is a little weaker at the 0.5th quantile level, and the other screening procedures struggle to maintain a reasonable model size at all quantiles. Under Scenarios 1.5 and 1.6, which involve interactions, the proposed QA-SVS-A and QCS perform relatively stably, although both weaken slightly in the presence of higher-order effects. QA-SIS suffers a major setback at extremely low or high quantile levels, whereas the proposed QA-SVS-A screens robustly. In addition, the performance of the proposed QA-SVS-A degrades only slightly when p increases from 1000 to 5000, which is not true of the other methods. Furthermore, the results of the proposed QA-SVS-A in all scenarios under ρ = 0.5 indicate that correlation among the covariates does not prevent sufficient screening. Across the different settings of the number of grid points K, QA-SVS-A detects the active predictors more effectively as K increases, whereas QCS shows the opposite trend.

4.2. Performance of QA-SVS-FD

In this subsection, several scenarios are simulated to examine the proposed QA-SVS-FD as well as the sufficient screening property of the proposed procedure. We compare the variable screening performance of our proposed QA-SVS-FD with the quantile-based correlation screening under FDR control (QCS-FDR; Tang et al., 2013) [19]. The predictors X = ( X 1 , … , X p ) T are generated from a p-variate normal distribution with mean 0 and covariance matrix Σ = ( σ i j ) p × p . The response is generated from the following models:
Scenario 2.1: Y = Σ_{j=1}^{10} X j + ε , Σ = ( ρ^{ | i − j | } ) p × p , and ρ = 0.5 ;
Scenario 2.2: Y = Σ_{j=1}^{50} X j + ε , Σ = ( ρ^{ | i − j | } ) p × p , and ρ = 0.5 ;
Scenario 2.3: Y = exp ( Σ_{j=1}^{10} X j ) + ε , Σ = ( ρ^{ | i − j | } ) p × p , and ρ = 0.5 ;
Scenario 2.4: Y = exp ( Σ_{j=1}^{50} X j ) + ε , Σ = ( ρ^{ | i − j | } ) p × p , and ρ = 0.5 ;
Scenario 2.5: Y = Σ_{j=1}^{10} ( − 1 )^{ j − 1 } X j + ε ;
Scenario 2.6: Y = Σ_{j=1}^{50} ( − 1 )^{ j − 1 } X j + ε ;
Scenario 2.7: Y = Σ_{j=1}^{10} X j / { 0.5 + ( 1.5 + Σ_{j=2}^{4} ( − 1 )^{ j − 1 } X j )² } + 0.1 ε ;
Scenario 2.8: Y = Σ_{j=1}^{50} X j / { 0.5 + ( 1.5 + Σ_{j=21}^{40} ( − 1 )^{ j − 1 } X j )² } + 0.1 ε .
Σ in Scenarios 2.5–2.8 has diagonal element 1 and sub-diagonal element 0.2. The covariates are independent in Scenarios 2.1–2.4 and weakly dependent in Scenarios 2.5–2.8. We consider n = 500 and p = 1000 or 5000 for all scenarios. Set the number of quantile grid points K = 2 , 3 , 4 , 5 , 6 . The nominal false discovery rate is α = 0.05 . We evaluate the performance based on the following criteria:
  • | A ^ | : the average number of screened variables;
  • FDR: the average of empirical FDP;
  • F 1 - score : the average of 2 · | A ∩ A ^ | / ( | A | + | A ^ | ) .
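The last two criteria can be computed directly from the active set A and a screened set Â; a minimal sketch (the helper name is ours):

```python
def fdp_and_f1(active, selected):
    """Empirical false-discovery proportion and F1-score of a selected set.

    active:   index set A of truly active predictors
    selected: index set A-hat returned by a screening procedure
    """
    active, selected = set(active), set(selected)
    true_pos = len(active & selected)
    # FDP: share of selected variables that are actually null
    fdp = (len(selected) - true_pos) / max(len(selected), 1)
    # F1: 2 |A ∩ A-hat| / (|A| + |A-hat|)
    f1 = (2.0 * true_pos / (len(active) + len(selected))
          if (active or selected) else 1.0)
    return fdp, f1

fdp, f1 = fdp_and_f1(active={1, 2, 3, 4}, selected={1, 2, 3, 9})
```

Averaging these two quantities over the 100 replications gives the reported FDR and F1-score columns.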
Based on 100 replications, the results of the QA-SVS-FDR and the QCS procedure are stored in Table 3, and the results with p = 5000 are presented in Appendix B Table A5.
Under Scenarios 2.1–2.4, the proposed QA-SVS-FDR performs as well as QCS-FDR. The proposed QA-SVS-AFD performs equally well for small K, whereas it misses some active predictors as K increases. All three procedures control the empirical FDR under the prespecified level α in most scenarios. As the number of active predictors increases, the F 1 - score of the proposed QA-SVS-FD (QA-SVS-AFD and QA-SVS-FDR) improves slightly, e.g., from 0.92 to 0.97, while QCS-FDR shows the opposite trend. Combined with | A ^ | , this indicates that our proposed method screens out the null predictors more accurately but may lose some active predictors. With sufficient screening under FDR control, our procedure retains as many active predictors as possible. Under Scenarios 2.5 and 2.7, our method works slightly better than QCS; in particular, the FDR and F 1 - score of QA-SVS-AFD reach 0 and 1, respectively. Under Scenarios 2.6 and 2.8, the proposed QA-SVS-AFD and QCS-FDR both fail. However, it is worth mentioning that QA-SVS-FDR attains larger | A ^ | and F 1 - score than QCS-FDR, indicating that QA-SVS-FDR is more effective. In addition, our QA-SVS-FD procedure continues to work reasonably well as p increases from 1000 to 5000, whereas QCS deteriorates slightly. In summary, our proposed method performs at least as well as, and is often more effective than, QCS-FDR in various practical settings.
Given the high sensitivity of model-free methods to factors that can distort the underlying relationships between the covariates and the response, we suggest reducing this sensitivity by running the QA-SVS procedure with different numbers of grid points. Different K values lead to different model complexity: a large K can lead to overfitting, and a small K can lead to underfitting.

5. Real Dataset Research

In the era of rapid development of machine learning and pattern recognition, image recognition technologies are increasingly applied in the medical field. For example, by processing lung CT images, we can identify whether the lungs are diseased. The following two methods are often used to quantitatively evaluate the severity of emphysema. One is CT density measurement: based on the digital CT pixel image, one calculates the patient's average lung density, establishes a threshold, calculates the proportion of the area below the threshold, and thereby evaluates the emphysema. The other is the percentile density (PD) measurement technique: one analyzes the attenuation distribution curve of lung density, fixes a percentile (commonly 5% or 95%), calculates the area below the percentile density curve, and evaluates the symptoms of emphysema [28]. In this section, we apply our proposed method to analyze a lung CT image dataset downloaded from Kaggle, in which the lungs can be segmented accurately.
A picture of one subject is shown in Appendix C, Figure A1. We regard the 5% and 95% PD data as the corresponding continuous response variables. For smokers, these values are usually high, indicating that other substances have accumulated in the lungs. The data include 267 instances and 512 × 512 continuous covariates obtained by stretching the picture pixels.
Setting the number of quantile grid points K = 2 , 3 , … , 6 and the FDR threshold at the given prespecified level α = 0.05 , we obtain different segmentations and extractions. The numbers of selected picture pixels are displayed in Table 4. It is clear that QCS-FDR loses efficacy, and QA-SVS-AFD works when K ≥ 3 . Fortunately, QA-SVS-FDR works effectively under all values of K = 2 , 3 , … , 6 . Comparing the QA-SVS-AFD(K) and QA-SVS-FDR(K) paths under hypothesis testing (2), the screened active variable set estimated by the rejection region of the QA-SVS-AFD(K) path controls the probability around 1 / ( 512 × 512 ) and retains too few active variables, whereas QA-SVS-FDR(K) selects the active variables sufficiently by testing the null hypothesis of testing (2) under the given prespecified FDR level α = 0.05 . We illustrate the extraction by plotting the segmented lung CT with the average of the values of the selected predictors, as presented in Appendix C, Figure A2, Figure A3, Figure A4, Figure A5 and Figure A6. These results may provide some information for measuring important clinical parameters (lung volume, PD, etc.); for reasons of space, we do not go further.

6. Conclusions

In this paper, we propose a multiple testing procedure with false discovery control to detect active variables sufficiently. The multiple testing procedure can be applied with the quantile-adaptive screening method when the dimensionality is ultra-high. Although the QA-SVS procedure is built on a quantile-adaptive marginal screening statistic, by controlling the FD of the marginal structural tests, the QA-SVS procedure can screen out sufficient variables through a precise separation of the sufficient variable set. As the results in this paper show, if the number of grid points K grows faster than n and p, the QA-SVS statistic captures subtle signals better than QCS, which is in line with the definition of a sufficient variable. In addition, the convergence rate of the asymptotic null distribution of our proposed procedure is larger than that of QCS for large K. In the simulation studies, we set different values of K to inspect the performance of QA-SVS. Nevertheless, it would be of interest to study a data-driven way to select K. We leave this for future research.

Author Contributions

Conceptualization, J.C.; Data curation, Z.Y.; Formal analysis, Z.Y.; Funding acquisition, J.C.; Investigation, Z.Y.; Methodology, Z.Y.; Project administration, H.Q. and Y.H.; Supervision, J.C.; Visualization, H.Q.; Validation, Y.H.; Writing—original draft, Z.Y.; Writing—review and editing, Z.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China under Grant No. 81671633 to J.C.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data in this paper have been presented in the manuscript.

Acknowledgments

Many thanks to reviewers for their positive feedback, valuable comments, and constructive suggestions that helped improve the quality of this article. Many thanks to the editors’ great help and coordination for the publication of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Main Proof

Appendix A.1. Proof of Remark 1

Proof. 
According to the definition of A k in Assumption (I), for any x A k c ∈ R A k c , we have
Pr ( Y ∈ G k ∣ X ) = Pr ( Y ∈ G k ∣ X A k ) .
Since X = ( X A k , X A k c ) , for any x A k c ∈ R A k c , multiplying Pr ( X A k c = x A k c ∣ X A k ) on both sides of Equation (A1), we obtain that
Pr ( Y ∈ G k , X A k c = x A k c ∣ X A k ) = Pr ( X A k c = x A k c ∣ X A k ) · Pr ( Y ∈ G k ∣ X A k ) .
Thus, we obtain that
Pr ( Y ∈ G k , X A k c ≤ x A k c ∣ X A k ) = Pr ( X A k c ≤ x A k c ∣ X A k ) · Pr ( Y ∈ G k ∣ X A k ) .
By the multiplicative law of probability, it is clear that
I ( Y ∈ G k ) ⊥ X A k c ∣ X A k .
In terms of invertibility, we have proved that
A k = { 1 ≤ j ≤ p : Pr ( Y ∈ G k ∣ X ) functionally depends on X j } = { 1 ≤ j ≤ p : I ( Y ∈ G k ) ⊥̸ X j ∣ X A k } .
Thus, A k is the sufficient screening variable index set. □

Appendix A.2. Proof of Lemma 1

Proof. 
Note that A k c = { 1 ≤ j ≤ p : I ( Y ∈ G k ) ⊥ X j ∣ X A k } ; this indicates that
Pr ( Y ∈ G k , X A k c ≤ x A k c ∣ X A k ) = Pr ( Y ∈ G k ∣ X A k ) · Pr ( X A k c ≤ x A k c ∣ X A k ) ⟹ Pr ( X A k c ≤ x A k c ∣ Y ∈ G k , X A k ) = Pr ( X A k c ≤ x A k c ∣ X A k ) ⟹ E X A k [ Pr ( X A k c ≤ x A k c ∣ Y ∈ G k , X A k ) ] = E X A k [ Pr ( X A k c ≤ x A k c ∣ X A k ) ] ⟹ Pr ( X A k c ≤ x A k c ∣ Y ∈ G k ) = Pr ( X A k c ≤ x A k c ) .
Thus, we obtain that F j ( x ∣ Y ∈ G k ) − F j ( x ) = 0 holds when j ∉ A k . □

Appendix A.3. Proof of Lemma 2

Proof. 
Note that Ω = { 0 , 1 , … , n − 1 } , and Ω e = { ξ 1 , ξ 2 , … , ξ e } ( e = 1 , … , n ) is a sample of size e drawn at random, with equal probability and without replacement, from the population, where ξ i ( i = 1 , 2 , … , e ) denotes the i-th random sample, which follows a discrete uniform distribution on { 0 , … , n − 1 } , that is, ξ i ∼ DiscreteU ( 0 , n − 1 ) . ξ i has the following properties:
E ξ i = ( n − 1 ) / 2 , E ξ i ² = ( n − 1 ) ( 2 n − 1 ) / 6 , Var ξ i = ( n ² − 1 ) / 12 .
In addition, for all i 1 ≠ i 2 , where i 1 , i 2 ∈ { 1 , 2 , … , n } ,
E ξ i 1 ξ i 2 = ( n − 2 ) ( 3 n − 1 ) / 12 , Cov ( ξ i 1 , ξ i 2 ) = − ( n + 1 ) / 12 .
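These moment formulas can be verified numerically by direct summation over { 0 , 1 , … , n − 1 }; a small sketch (separate from the proof, with our own helper name):

```python
from fractions import Fraction

def discrete_uniform_moments(n):
    """Exact mean, second moment and variance of DiscreteU(0, n-1),
    obtained by direct summation rather than the closed forms."""
    vals = list(range(n))
    mean = Fraction(sum(vals), n)
    second = Fraction(sum(v * v for v in vals), n)
    return mean, second, second - mean * mean

n = 25
mean, second, var = discrete_uniform_moments(n)

# cross moment of two distinct draws without replacement from {0,...,n-1}
cross = Fraction(sum(a * b for a in range(n) for b in range(n) if a != b),
                 n * (n - 1))
cov = cross - mean * mean
```

Exact rational arithmetic makes the comparison with ( n − 1 ) / 2 , ( n − 1 )( 2 n − 1 ) / 6 , ( n ² − 1 ) / 12 , ( n − 2 )( 3 n − 1 ) / 12 , and − ( n + 1 ) / 12 unambiguous.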
For continuous variables following an arbitrary distribution, X j = ( x 1 j , x 2 j , … , x n j ) , let ξ i j = Σ_{ k ≠ i } I ( X i j ≤ x k j ) . It is easy to see that the random variable ξ i j ( i = 1 , 2 , … , n ) is a special case of Mohamed and Mirakhmedov (2016) [26] with e = n . According to the definitions of τ ^ j k and ξ i j , it clearly holds that Pr ( ξ i = r ) = 1 / n ( i = 1 , 2 , … , n ; r = 0 , 1 , … , n − 1 ) and ξ i 1 ≠ ξ i 2 ( i 1 ≠ i 2 ) . By the definition of ξ i j , for all j = 1 , 2 , … , p , k = 1 , 2 , … , K ,
1 n + 1 k = 1 n 1 n i = 1 n I ( x i j x k j , Y i G ^ k ) p ^ k = 1 n ( n + 1 ) p ^ k k = 1 n i = 1 n I Y i G ^ k I ( x i j x k j ) = 1 n ( n + 1 ) p ^ k k = 1 n i k n I Y i G ^ k I ( x i j x k j ) + k = 1 n I Y i G ^ k = 1 n ( n + 1 ) p ^ k i = 1 n I Y i G ^ k ξ i j + 1 n + 1 .
If H 0 , j , k ( j = 1 , 2 , , p ; k = 1 , 2 , , K ) is true, the expectation and variance of τ ^ j k can be obtained as follows:
E τ ^ j k = E 1 n + 1 k = 1 n 1 n i = 1 n I ( x i j x k j , Y i G ^ k ) p ^ k 1 2 = E 1 n ( n + 1 ) p ^ k i = 1 n I Y i G ^ k ξ i j + 1 n + 1 1 2 = 1 n ( n + 1 ) p ^ k i = 1 n I Y i G ^ k E ξ i j + 1 n + 1 1 2 = 1 n ( n + 1 ) p ^ k n 1 2 n p ^ k + 1 n + 1 1 2 = 0
and
Var τ ^ j k = Var 1 n + 1 k = 1 n 1 n i = 1 n I ( x i j x k j , Y i G ^ k ) p ^ k 1 2 = Var 1 n ( n + 1 ) p ^ k i = 1 n I Y i G ^ k ξ i j = 1 n 2 ( n + 1 ) 2 p ^ k 2 i = 1 n I Y i G ^ k Var ξ i j + i 1 i 2 n I Y i 1 G ^ k , Y i 2 G ^ k Cov ξ i 1 j , ξ i 2 j = 1 n 2 ( n + 1 ) 2 p ^ k 2 n p ^ k n 2 1 12 n p ^ k n p ^ k 1 n + 1 12 = 1 p ^ k 12 ( n + 1 ) p ^ k .
Let Ω n = ( ζ 1 n , … , ζ n n ) be a random permutation of { 0 , 1 , … , n − 1 } , and let r = ( r 1 , … , r n ) be a random vector independent of ζ 1 n , … , ζ n n satisfying P { r 1 = k 1 , … , r n = k n } = 1 / ( n ! ) , where k = ( k 1 , … , k n ) is also a random permutation of 1 , … , n . Note that S n p ^ k , n = ζ r 1 n + ⋯ + ζ r n p ^ k n represents a sum of n p ^ k random samples chosen at random without replacement from the population Ω n . It can be expressed equivalently as S n p ^ k , n = I { Q ^ k − 1 ≤ Y 1 < Q ^ k } ζ 1 n + ⋯ + I { Q ^ k − 1 ≤ Y n < Q ^ k } ζ n n , where I ( · ) denotes the indicator function. It is shown that Σ_{i=1}^{n} I ( Y i ∈ G ^ k ) ξ i j and S n p ^ k , n have the same distribution. Denote
τ = n 1 2 , b 2 = n 2 1 12 , ζ ^ m n = ζ m n τ b , σ 2 = ( 1 p ^ k ) ( n 2 1 ) 12 , σ * 2 = ( 1 p ^ k ) ( n + 1 ) n 12 , A k = 1 n m = 1 n ζ ^ m n k , B k = 1 n m = 1 n | ζ ^ m n | k .
Let F n ( u ) = Pr S n p ^ k , n < u σ * n p ^ k + n p ^ k τ . Based on the same distribution of S n p ^ k , n and i = 1 n I ( Y i G ^ k ) ξ i j , for any j { 1 , , p } , we have
F n ( u ) = Pr i = 1 n I Y i G ^ k ξ i j < u σ * n p ^ k + n p ^ k τ = Pr 1 n p ^ k ( n + 1 ) i = 1 n I Y i G ^ k ξ i j n p ^ k n 1 2 < u σ * n p ^ k ( n + 1 ) = Pr τ ^ j k < u 1 p ^ k 12 ( n + 1 ) p ^ k : = F τ ^ j k ( u ) .
Considering
τ ^ j k = 1 n + 1 k = 1 n 1 n i = 1 n I ( x i j x k j , Y i G ^ k ) p ^ k 1 2 = 1 n ( n + 1 ) p ^ k k = 1 n i k n I Y i G ^ k I ( x i j x k j ) + k = 1 n I Y i G ^ k 1 2 = 1 n ( n + 1 ) p ^ k k = 1 n i k n I Y i G ^ k I ( x i j x k j ) n 1 2 ( n + 1 ) = n 1 2 ( n + 1 ) 1 n ( n + 1 ) p ^ k k = 1 n i k n I Y i G ^ k I ( x i j x k j ) = 1 n ( n + 1 ) p ^ k i = 1 n I Y i G ^ k ξ i j n 1 2 ( n + 1 ) ,
where the continuous variable x j satisfies Pr ( x i j = x k j ) = 0 . Define ξ i j ′ = Σ_{ k ≠ i } I ( X i j ≥ X k j ) ; then ξ i j ′ has the same distribution as S n p ^ k , n and, as a result, ξ i j ′ has the same distribution as ξ i j . In other words, τ ^ j k ′ and τ ^ j k have the same distribution, that is, F τ ^ j k ′ ( u ) = 1 − F τ ^ j k ( − u ) . If lim_{n→∞} p ^ k ( 1 − p ^ k ) > 0 , we have the following relationships by using Theorem 3.4, Corollary 3.4, Corollary 3.5, and Corollary 3.6 of Mohamed and Mirakhmedov (2016) [26]:
  • For all u = o ( n ) 0 , we obtain that
    1 F τ ^ j k ( u ) 1 Φ ( u ) = F τ ^ j k ( u ) Φ ( u ) = exp u 3 n L n u n 1 + O u + 1 n .
    where L n ( v ) is an odd power series that, for all sufficiently large n, is majorized by a power series with coefficients not depending on n and convergent in some disc, and L n ( v ) converges uniformly in n for sufficiently small values of v, with L n ( 0 ) = 0 .
  • For all u = o ( n 1 / 4 ) 0 , we obtain that
    1 F τ ^ j k ( u ) 1 Φ ( u ) = F τ ^ j k ( u ) Φ ( u ) = exp u 3 n O u n 1 + O u + 1 n = 1 + O u 4 n 1 + O u + 1 n .
  • For all u = o ( n 1 / 6 ) 0 , we obtain that
    1 F τ ^ j k ( u ) 1 Φ ( u ) = F τ ^ j k ( u ) Φ ( u ) = 1 + O ( u + 1 ) 3 n .
  • For all 0 u C min n * q ^ / max Y ^ m N , ( n * q ^ ) 1 / 6 / B 3 1 / 3 , we have
    1 F τ ^ j k ( u ) = ( 1 Φ ( u ) ) 1 + O ( u + 1 ) 3 B 3 / n * q ^
Based on the above cases, using a Taylor expansion of Equations (A2)–(A5), the following conclusion can be obtained by comparing the orders of u and n: if H 0 , j , k ( j = 1 , 2 , … , p ; k = 1 , 2 , … , K ) is true, then
( 1 − F τ ^ j k ( u ) ) / ( 1 − Φ ( u ) ) = F τ ^ j k ( − u ) / Φ ( − u ) = 1 + O ( n^{ − 1 / 2 } ) ,
where u satisfies 1 − Φ ( u ) = 1 / ( 2 p ) , p = O ( exp { n^α } ) and α ≤ 1 / 2 . In other words, τ ^ j k has the asymptotic normal distribution N ( 0 , ( 1 − p ^ k ) / ( 12 ( n + 1 ) p ^ k ) ) for all j ∈ { 1 , 2 , … , p } and all k ∈ { 1 , 2 , … , K } . □
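As an informal numerical sanity check on Lemma 2 (separate from the proof), one can simulate the statistic under the null with a single slab Ĝ = { Y below its sample median }, so that p̂ = 1/2, and compare the Monte Carlo mean and variance with 0 and ( 1 − p̂ ) / ( 12 ( n + 1 ) p̂ ). The helper below is our own illustration of the τ̂ formula:

```python
import random

def tau_hat(x, w):
    """tau-hat statistic for one covariate and one quantile slab:
    total / (n(n+1) p-hat) - 1/2, where total counts pairs (l, i)
    with w_i = 1 and x_i <= x_l."""
    n = len(x)
    p_hat = sum(w) / n
    total = sum(1 for l in range(n) for i in range(n) if w[i] and x[i] <= x[l])
    return total / (n * (n + 1) * p_hat) - 0.5

rng = random.Random(1)
n, reps = 100, 300
stats = []
for _ in range(reps):
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [rng.gauss(0, 1) for _ in range(n)]   # X independent of Y (the null)
    cut = sorted(y)[n // 2]                   # lower-half slab, p-hat = 1/2
    w = [1 if yi < cut else 0 for yi in y]
    stats.append(tau_hat(x, w))
mean = sum(stats) / reps
var = sum((s - mean) ** 2 for s in stats) / reps
```

With n = 100 and p̂ = 1/2, the target variance is 0.5 / ( 12 · 101 · 0.5 ) ≈ 8.25 × 10⁻⁴, and the simulated mean and variance should fall close to 0 and this value.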

Appendix A.4. Proof of Corollary 1

Proof. 
By the definition of υ ^ j k , we can easily find that 12 ( n + 1 ) υ ^ j k has the asymptotic distribution χ 1 2 , where χ 1 2 is the chi-square distribution with one degree of freedom. Since Σ_{k=1}^{K} p ^ k τ ^ j k = 0 , 12 ( n + 1 ) Σ_{k=1}^{K} υ ^ j k = 12 ( n + 1 ) υ ^ j has the asymptotic distribution χ K − 1 2 with K − 1 degrees of freedom. □

Appendix A.5. Proof of Theorem 1

Proof. 
For each j = 1 , … , p , let { X i j : i = 1 , … , n } be a random sample of X j . We first introduce some notation. Let p k = Pr ( Y ∈ G k ) and p k * = Pr ( Y ∈ G ^ k ) . Write W k = I ( Y ∈ G k ) , W k * = I ( Y ∈ G ^ k ) , W k i = I ( Y i ∈ G ^ k ) , f j ( k , x ) = I ( X j ≤ x , Y ∈ G k ) , f j * ( k , x ) = I ( X j ≤ x , Y ∈ G ^ k ) , f i j ( k , x ) = I ( X i j ≤ x , Y i ∈ G ^ k ) , ζ j ( k ) = E X { F j k ( x ) } ,
ζ j * ( k ) = E X { F j k * ( x ) } = E X { Pr ( X j ≤ x ∣ Y ∈ G ^ k ) } , ζ ˜ j ( k ) = ( 1 / n ² ) Σ_{l=1}^{n} Σ_{i=1}^{n} I ( X i j ≤ X l j , Y i ∈ G ^ k ) / p ^ k ,
and
ζ ^ j ( k ) = ( 1 / ( n ( n + 1 ) ) ) Σ_{l=1}^{n} Σ_{i=1}^{n} I ( X i j ≤ X l j , Y i ∈ G ^ k ) / p ^ k .
Let F n ( · ) be the empirical distribution function. By Hoeffding's inequality, for any k and a certain constant c 8 > 0 ,
Pr ( | F ( Q k ) − F n ( Q k ) | ≥ ϵ ) ≤ exp ( − 2 n c 8 ϵ ² )
holds for any ϵ ∈ ( 0 , 1 ) , where R x j is the support of the continuous variable x j , k = 1 , … , K and j = 1 , … , p . Since | W k * − W k | = I ( Q ^ k − 1 ≤ Y < Q k − 1 ) + I ( Q k ≤ Y < Q ^ k ) , we obtain that
Pr ( E | W k * − W k | ≥ 2 ϵ ) = 1 − Pr { E | W k * − W k | < 2 ϵ } ≤ 1 − Pr { | F ( Q k ) − F n ( Q k ) | < ϵ , | F ( Q k − 1 ) − F n ( Q k − 1 ) | < ϵ } ≤ 1 − ( 1 − exp ( − 2 n c 8 ϵ ² ) ) ² ≤ 2 exp ( − 2 n c 8 ϵ ² ) .
Similarly, since p k * − p k = ( F ( Q k ) − F n ( Q k ) ) − ( F ( Q k − 1 ) − F n ( Q k − 1 ) ) , we have
Pr ( | p k * − p k | ≥ 2 ϵ ) = 1 − Pr { | p k * − p k | < 2 ϵ } ≤ 1 − Pr { | F ( Q k ) − F n ( Q k ) | < ϵ , | F ( Q k − 1 ) − F n ( Q k − 1 ) | < ϵ } ≤ 2 exp ( − 2 n c 8 ϵ ² ) .
Note that
| ζ j * ( k ) ζ j ( k ) | = E { f j * ( k , x ) } p k * E f j ( k , x ) p k E { f j * ( k , x ) } | p k * p k | p k p k * + E { f j * ( k , x ) f j ( k , x ) } p k sup x R x j E { f j * ( k , x ) } | p k * p k | p k p k * + sup x R x j E { I ( X j x ) · ( W j * ( k ) W j ( k ) ) } p k = | p k * p k | + E W k * W k p k ,
where the last equality holds due to the fact that
sup x R x j E { f j * ( k , x ) } = sup x R x j Pr x j x , Y G ^ k = p k *
and
sup x R x j E { f j * ( k , x ) f j ( k , x ) } = sup x R x j E { I ( X j x ) · ( W j * ( k ) W j ( k ) ) } = sup x R x j Pr x j x , Y G k , Y G ^ k + sup x R x j Pr x j x , Y G k , Y G ^ k = E W k * W k .
According to Equations (A6) and (A8), we obtain that
Pr ( | ζ j * ( k ) ζ j ( k ) | ϵ ) Pr | p k * p k | + E W k * W k p k ϵ Pr | p k * p k | + E W k * W k ϵ c 1 / 2 K Pr | p k * p k | ϵ c 1 / 4 K + Pr E W k * W k ϵ c 1 / 4 K 4 exp ( n c 9 ϵ 2 / K 2 )
holds for any ϵ ∈ ( 0 , 1 ) , k = 1 , … , K and j = 1 , … , p . According to Lemmas 1 and 2 of Xie et al. (2020) [17], under Conditions (C1) and (C3), for any ϵ ∈ ( 0 , 1 / 2 ) and j = 1 , … , p , there exists a positive constant c 3 that satisfies
Pr ζ ˜ j ( k ) ζ j * ( k ) ϵ 4 ( n + 2 ) exp c 3 n ϵ 2 / R .
Note that
| ζ ^ j ( k ) − ζ j * ( k ) | ≤ ( n / ( n + 1 ) ) | ζ ˜ j ( k ) − ζ j * ( k ) | + Δ ,
where Δ = ζ j * ( k ) / ( n + 1 ) .
It follows from Condition (C3) and Equations (A5) and (A9) that K = O ( n^ξ ) for ξ + κ < 1 / 2 . By letting ϵ = 2 c 3 * n^{ − κ } = 2 c 3 * n^{ − ( 1 / 2 − β ) } for 0 ≤ κ < 1 / 4 and c 3 * > 0 , we have
Pr max 1 j p τ ^ j k τ j , k 2 c 3 * n κ p Pr τ ^ j k τ j , k 2 c 3 * n κ = p Pr ζ ^ j ( k ) ζ j ( k ) 2 c 3 * n κ = p Pr ( ζ ^ j ( k ) ζ j * ( k ) ) + ( ζ j * ( k ) ζ j ( k ) ) 2 c 3 * n κ p 1 Pr ζ ^ j ( k ) ζ j * ( k ) < c 3 * n κ · Pr ζ j * ( k ) ζ j ( k ) < c 3 * n κ p 1 p I 1 · p I 2 ,
where p I 1 Pr ζ ^ j ( k ) ζ j * ( k ) < c 3 * n κ and p I 2 Pr ζ j * ( k ) ζ j ( k ) < c 3 * n κ . For p I 1 , we have that
p I 1 = 1 Pr ζ ^ j ( k ) ζ j * ( k ) c 3 * n κ 1 Pr ζ ˜ j ( k ) ζ j * ( k ) ( n + 1 ) c 4 * n κ | Δ | n 1 4 ( n + 2 ) exp c 10 n 1 2 κ ξ ,
where c 10 is a positive constant. For p I 2 , we have that
p I 2 = 1 Pr ζ j * ( k ) ζ j ( k ) c 3 * n κ 1 4 exp ( c 11 n 1 2 κ 2 ξ )
where c 11 > 0 is a constant. The last inequality above holds since Δ = O ( 1 / n ) . Thus,
Pr max 1 j p τ ^ j k τ j , k 2 c 3 * n κ p 1 1 4 ( n + 2 ) exp c 10 n 1 2 κ ξ · 1 4 exp ( c 11 n 1 2 κ 2 ξ ) 8 p exp ( c 4 n 1 2 κ 2 ξ )
Let c 3 = 4 p ^ k ( c 3 * ) ² / ( 1 − p ^ k ) ; by the definition of υ ^ j k , we have
Pr ( max_{1 ≤ j ≤ p} | υ ^ j k − υ j k | ≥ c 4 n^{ − κ } ) ≤ 8 p exp ( − c 4 n^{ 1 − 2 κ − ξ } ) .

Appendix A.6. Proof of Theorem 2

Proof. 
It follows from Condition (C2) and Theorem 1 that
Pr min j 1 A k 1 υ ^ j k max j 2 A k 2 υ ^ j k < ρ 0 2 Pr min j 1 A k 1 υ ^ j k max j 2 A k 2 υ ^ j k min j 1 A k 1 υ j , k max j 2 A k 2 υ j , k < ρ 0 2 Pr min j 1 A k 1 υ ^ j k max j 2 A k 2 υ ^ j k min j 1 A k 1 υ j , k max j 2 A k 2 υ j , k > ρ 0 2 Pr 2 max 1 j p υ ^ j k υ j , k > ρ 0 2 8 p exp c 7 n ρ 0 2 / K ,
where c 7 > 0 is a constant. K log ( p ) = o ( n ρ 0 ² ) ensures that there exists some n 0 > 0 such that, for n > n 0 , p ≤ exp ( c 7 n ρ 0 ² / 2 K ) . Consequently, we have that lim inf_{n→∞} { min_{j 1 ∈ A k 1} υ ^ j 1 k − max_{j 2 ∈ A k 2} υ ^ j 2 k } > 0 almost surely.

Appendix A.7. Proof of Theorem 3

Proof. 
Due to the equivalence of 12 ( n + 1 ) υ ^ j k ≥ ρ and | τ ^ j k | ≥ √( ρ ( 1 − p ^ k ) / ( 12 ( n + 1 ) p ^ k ) ) , the proof of Theorem 3 can be obtained through | τ ^ j k | . Note that
F | τ ^ j k | ( u ) : = Pr ( | τ ^ j k | < u √( ( 1 − p ^ k ) / ( 12 ( n + 1 ) p ^ k ) ) ) , ( j ∉ A k ; k = 1 , … , K ) ; F | τ ^ k | ( u ) : = Pr ( | τ ^ j k | < u √( ( 1 − p ^ k ) / ( 12 ( n + 1 ) p ^ k ) ) for all j ∉ A k ) , ( k = 1 , … , K ) ; F | τ ^ k | c ( u ) : = Pr ( | τ ^ j k | ≥ u √( ( 1 − p ^ k ) / ( 12 ( n + 1 ) p ^ k ) ) for some j ∉ A k ) , ( k = 1 , … , K ) .
Then, denoting q k = p − s k , we have
F | τ ^ j k | ( u ) = F τ ^ j k ( u ) − F τ ^ j k ( − u ) = 1 − 2 F τ ^ j k ( − u ) , F | τ ^ k | ( u ) = [ F | τ ^ j k | ( u ) ]^{ q k } = [ 1 − 2 Pr ( τ ^ j k ≤ − u ) ]^{ q k } , F | τ ^ k | c ( u ) = 1 − F | τ ^ k | ( u ) = 1 − [ 1 − 2 Pr ( τ ^ j k ≤ − u ) ]^{ q k } .
When β ∈ ( 2 , + ∞ ) , we have ρ = o ( n^{ 1 / 4 } ) as a constant. Hence, we obtain
( 1 − F τ ^ j k ( ρ ) ) / ( 1 / ( 2 p ) ) = F τ ^ j k ( − ρ ) / ( 1 / ( 2 p ) ) = 1 + O ( n^{ − 1 / 2 } ) .
By the definition of F | τ ^ k | c ( u ) , it follows that
F | τ ^ k | c ( ρ ) = 1 − [ 1 − ( 1 / p ) ( 1 + O ( n^{ − 1 / 2 } ) ) ]^{ q k } .
From the definition of ultra-high dimensional data, we have p = o ( exp { n^α } ) , α > 0 and s k = o ( n ) . If α > 1 / 2 , according to Equations (A10) and (A11), we have
[ 1 − ( 1 / p ) ( 1 + O ( n^{ − 1 / 2 } ) ) ]^{ q k } = exp ( q k · log [ 1 − ( 1 / p ) ( 1 + O ( n^{ − 1 / 2 } ) ) ] ) = exp ( − q k · [ ( 1 / p ) ( 1 + O ( n^{ − 1 / 2 } ) ) + o ( 1 / p ) ] ) = exp ( − ( 1 + O ( n^{ − 1 / 2 } ) ) ) exp ( − o ( 1 ) ) → e^{ − 1 } .
To conclude, F | τ ^ k | c ( ρ ) → 1 − e^{ − 1 } a.s. as n → ∞ ; in other words,
lim_{p→∞} Pr { FD k , ρ ^ 0 > 0 } = lim_{p→∞} Pr { Σ_{j ∈ A k c} I ( υ ^ j k ≥ ρ ) > 0 } = 1 − e^{ − 1 } ( k = 1 , … , K ) .
The number of variables screened into the adaptive FD set is thus governed by p Bernoulli trials; the expectation and variance of the number of false discoveries are then
EFD = q k · ( 1 / p ) = ( 1 + O ( n^{ − 1 / 2 } ) ) ( 1 − o ( p^{ − 1 } ) ) = 1 + O ( n^{ − 1 / 2 } ) , VFD = q k · ( 1 / p ) · ( 1 − 1 / p ) = ( 1 + O ( n^{ − 1 / 2 } ) ) ( 1 − o ( p^{ − 1 } ) ) ² = 1 + O ( n^{ − 1 / 2 } ) .
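Numerically, the Bernoulli approximation behaves exactly as Theorem 3 predicts: with q_k ≈ p null predictors each retained with probability about 1/p, FD is approximately Binomial(q_k, 1/p), so EFD ≈ 1 and Pr(FD > 0) = 1 − (1 − 1/p)^{q_k} → 1 − e^{−1}. A quick check (illustrative only; the helper name is ours):

```python
import math

def prob_any_false_discovery(q, p):
    """Pr(FD > 0) when FD ~ Binomial(q, 1/p): 1 - (1 - 1/p)^q."""
    return 1.0 - (1.0 - 1.0 / p) ** q

p = 10 ** 6
approx = prob_any_false_discovery(q=p, p=p)  # q_k ~ p for sparse models
limit = 1.0 - math.exp(-1.0)                 # the 1 - e^{-1} limit
```

For p = 10⁶ the finite-p probability already agrees with the limit 1 − e⁻¹ ≈ 0.632 to several decimal places.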

Appendix A.8. Proof of Corollary 2

Proof. 
According to Condition (C2), the definition of ρ ^ 0 in Section 3.1, and max_{j ∈ A k} | υ ^ j , k − υ j , k | ≤ c n^{ − κ } in Theorem 1, we have
min_{j ∈ A k} | υ ^ j k | ≥ min_{j ∈ A k} ( | υ j k | − | υ ^ j , k − υ j k | ) ≥ min_{j ∈ A k} | υ j k | − max_{j ∈ A k} | υ j k − υ ^ j , k | ≥ c n^{ − κ } .
Therefore, we obtain that
Pr ( A k ⊆ A ^ k , ρ ^ 0 ) ≥ Pr ( max_{j ∈ A k} | υ ^ j , k − υ j , k | ≤ c n^{ − κ } ) ≥ 1 − 8 p s k exp ( − c 6 n^{ 1 − 2 κ − ξ } )
holds for some constant c 6 > 0 .

Appendix A.9. Proof of Theorem 4

Proof. 
Similar to the proof of Corollary 2, it is clear that
Pr ( A k ⊆ A ^ k , α ) ≥ 1 − 8 p s k exp ( − c 7 n^{ 1 − 2 κ − ξ } )
holds for some constant c 7 > 0 .
In order to prove FDR ^ ρ ^ k ≤ α in probability, under the assumption that q k / p → 1 as p → ∞ and for any ρ > 0 , by Corollary 1 and Hoeffding's inequality, it suffices to show that
sup_{0 ≤ ρ ≤ ϵ} Σ_{j ∈ A k c} I ( υ ^ j k ≥ ρ ) / ( p S χ 1 2 ( ρ ) ) → 1
in probability as n → ∞ , where ϵ > 0 is a constant and S χ 1 2 ( ρ ) = 1 − F χ 1 2 ( ρ ) is the survival function of the χ 1 2 distribution. Thus,
FDP k , ρ = FD k , ρ / max { Σ_{j ∈ I} I ( υ ^ j k ≥ ρ ) , 1 } = [ Σ_{j ∈ A k c} I ( υ ^ j k ≥ ρ ) / q k ] / [ max { Σ_{j ∈ I} I ( υ ^ j k ≥ ρ ) , 1 } / p ] = p S χ 1 2 ( ρ ) / max { Σ_{j ∈ I} I ( υ ^ j k ≥ ρ ) , 1 } = p S χ 1 2 ( ρ ) / max { p S χ 1 2 ( ρ ) + Σ_{j ∈ A k} I ( υ ^ j k ≥ ρ ) , 1 } .
Notice that Σ_{j ∈ A k} I ( υ ^ j k ≥ ρ ) is monotone in ρ and asymptotically converges to s k , and S χ 1 2 ( ρ ) is continuous and monotone. Then, there exists a unique constant 0 < ρ ^ k ≤ C n^β such that
p S χ 1 2 ( ρ ^ k ) / max { Σ_{j ∈ I} I ( υ ^ j k ≥ ρ ^ k ) , 1 } = α
in probability as n → ∞ . Therefore, according to Equations (A12) and (A13), we obtain that
lim_{n→∞} Pr ( FDR ^ ρ ^ k ≤ α ) = 1 .

Appendix B

Table A1. The quantiles of minimum model size in Scenario 1.3 of Section 4.1.
Method | ρ = 0: 5%, 25%, 50%, 75%, 95% | ρ = 0.5: 5%, 25%, 50%, 75%, 95%
p = 1000
QA-SVS-A(4) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 4.0, 5.0
QA-SVS-A(5) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 4.0, 4.0, 5.0
QA-SVS-A(6) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 4.0, 4.0, 5.0
QA-SVS-A(7) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 4.0, 5.0
QA-SVS-A(8) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 4.0, 4.0, 5.0
QA-SVS-A(9) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 4.0, 4.0, 5.0
QA-SVS-A(10) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 4.0, 5.0
QCS(4) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.5, 5.0
QCS(5) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 4.0, 7.0
QCS(6) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 4.0, 5.5
QCS(7) | 3.0, 3.0, 3.0, 3.0, 3.5 | 3.0, 3.0, 3.0, 4.0, 11.5
QCS(8) | 3.0, 3.0, 3.0, 3.0, 4.0 | 3.0, 3.0, 3.0, 4.0, 12.0
QCS(9) | 3.0, 3.0, 3.0, 3.0, 4.5 | 3.0, 3.0, 3.0, 4.0, 7.5
QCS(10) | 3.0, 3.0, 3.0, 3.0, 3.5 | 3.0, 3.0, 3.0, 4.0, 10.0
SIS | 3.0, 3.0, 3.0, 3.0, 6.5 | 3.0, 3.0, 4.0, 5.0, 6.0
DC-SIS | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 4.0, 4.5
QA-SIS(0.1) | 3.0, 3.0, 3.0, 3.0, 4.5 | 3.0, 3.0, 3.0, 3.5, 16.5
QA-SIS(0.3) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 4.0
QA-SIS(0.5) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 5.0
QA-SIS(0.7) | 3.0, 3.0, 3.0, 3.0, 4.0 | 3.0, 3.0, 4.0, 4.0, 10.5
QA-SIS(0.9) | 4.0, 15.0, 32.5, 74.5, 217.0 | 4.5, 14.5, 36.5, 81.5, 299.0
p = 5000
QA-SVS-A(4) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 4.0, 5.0
QA-SVS-A(5) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 4.0, 4.0, 5.0
QA-SVS-A(6) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 4.0, 4.0, 6.0
QA-SVS-A(7) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 4.0, 4.0, 8.0
QA-SVS-A(8) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 4.0, 6.0
QA-SVS-A(9) | 3.0, 3.0, 3.0, 3.0, 3.5 | 3.0, 3.0, 3.0, 4.0, 7.0
QA-SVS-A(10) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 4.0, 4.0, 6.5
QCS(4) | 3.0, 3.0, 3.0, 3.0, 4.0 | 3.0, 3.0, 3.0, 4.0, 6.0
QCS(5) | 3.0, 3.0, 3.0, 3.0, 7.0 | 3.0, 3.0, 3.0, 4.0, 6.0
QCS(6) | 3.0, 3.0, 3.0, 3.0, 5.0 | 3.0, 3.0, 3.0, 4.0, 54.0
QCS(7) | 3.0, 3.0, 3.0, 3.0, 5.0 | 3.0, 3.0, 3.0, 4.0, 32.0
QCS(8) | 3.0, 3.0, 3.0, 3.0, 8.0 | 3.0, 3.0, 3.0, 4.0, 60.5
QCS(9) | 3.0, 3.0, 3.0, 3.0, 6.5 | 3.0, 3.0, 3.0, 4.0, 19.5
QCS(10) | 3.0, 3.0, 3.0, 3.0, 7.5 | 3.0, 3.0, 4.0, 6.0, 92.0
SIS | 3.0, 3.0, 3.0, 3.0, 18.5 | 3.0, 3.0, 4.0, 5.0, 11.5
DC-SIS | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 4.0, 5.0
QA-SIS(0.1) | 3.0, 3.0, 3.0, 3.0, 31.0 | 3.0, 3.0, 3.0, 4.0, 28.0
QA-SIS(0.3) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 4.5
QA-SIS(0.5) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 4.0, 7.5
QA-SIS(0.7) | 3.0, 3.0, 3.0, 4.0, 18.5 | 3.0, 3.0, 3.0, 7.0, 27.5
QA-SIS(0.9) | 16.0, 58.5, 130.5, 323.5, 1382.0 | 6.0, 31.0, 124.5, 421.5, 1844.0
Notes: QA-SVS-A(4), QA-SVS-A(5), …, and QA-SVS-A(10), our proposed method defined in Remark 3 with different quantile grid points (K = 4, …, 10); QCS(4), QCS(5), …, and QCS(10), the quantile-correlation-based screening method (Tang et al., 2013) [19] with different quantile grid points (K = 4, …, 10); SIS, the sure independence screening (Fan and Lv, 2008) [1]; DC-SIS, the distance correlation-based screening (Li et al., 2012) [8]; QA-SIS(0.1), QA-SIS(0.3), …, QA-SIS(0.9), the quantile-adaptive model-free sure independence screening (He et al., 2013) [9] at different quantile levels.
Table A2. The quantiles of minimum model size in Scenario 1.4 of Section 4.1.
Method | ρ = 0: 5%, 25%, 50%, 75%, 95% | ρ = 0.5: 5%, 25%, 50%, 75%, 95%
p = 1000
QA-SVS-A(4) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QA-SVS-A(5) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QA-SVS-A(6) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QA-SVS-A(7) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QA-SVS-A(8) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QA-SVS-A(9) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QA-SVS-A(10) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QCS(4) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 4.0
QCS(5) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 7.0
QCS(6) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 5.0
QCS(7) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 5.0
QCS(8) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 8.0
QCS(9) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 6.5
QCS(10) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 7.5
SIS | 414.0, 686.0, 781.5, 894.0, 979.5 | 558.5, 706.5, 824.5, 932.0, 988.0
DC-SIS | 441.5, 619.0, 742.5, 840.0, 962.0 | 341.5, 601.0, 747.5, 895.0, 971.0
QA-SIS(0.1) | 135.5, 223.0, 321.5, 428.5, 626.5 | 82.5, 134.5, 201.5, 303.0, 543.0
QA-SIS(0.3) | 14.0, 28.0, 49.5, 93.5, 237.5 | 10.0, 24.5, 65.0, 116.0, 593.5
QA-SIS(0.5) | 8.0, 20.0, 32.0, 54.5, 162.5 | 3.0, 5.0, 6.0, 8.0, 15.0
QA-SIS(0.7) | 37.5, 73.0, 146.5, 223.0, 394.5 | 9.0, 15.5, 20.0, 32.5, 56.5
QA-SIS(0.9) | 152.5, 291.0, 418.0, 562.0, 816.0 | 60.5, 145.0, 215.5, 307.5, 548.5
p = 5000
QA-SVS-A(4) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QA-SVS-A(5) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QA-SVS-A(6) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QA-SVS-A(7) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QA-SVS-A(8) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QA-SVS-A(9) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QA-SVS-A(10) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QCS(4) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QCS(5) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QCS(6) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QCS(7) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QCS(8) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QCS(9) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
QCS(10) | 3.0, 3.0, 3.0, 3.0, 3.0 | 3.0, 3.0, 3.0, 3.0, 3.0
SIS | 2126.0, 3353.0, 3956.5, 4469.5, 4874.5 | 3038.5, 3633.5, 4071.0, 4528.5, 4954.5
DC-SIS | 1949.0, 3324.0, 4045.5, 4387.5, 4832.0 | 1485.5, 3097.5, 3959.0, 4368.0, 4702.0
QA-SIS(0.1) | 507.5, 1137.5, 1470.0, 1981.5, 3183.0 | 433.0, 747.5, 1177.5, 1573.5, 2852.5
QA-SIS(0.3) | 54.5, 97.0, 177.0, 348.5, 1122.5 | 39.0, 126.0, 288.0, 595.5, 2742.0
QA-SIS(0.5) | 33.5, 95.0, 184.5, 351.5, 711.0 | 6.0, 11.0, 15.0, 29.0, 62.5
QA-SIS(0.7) | 143.0, 419.5, 690.0, 1263.5, 2217.5 | 34.5, 60.5, 100.0, 175.0, 402.0
QA-SIS(0.9) | 670.5, 1140.0, 1775.5, 2586.5, 3707.5 | 439.5, 717.0, 1022.0, 1547.0, 2662.5
Notes: All notations are the same as those in Table 1.
Table A3. The quantiles of minimum model size in Scenario 1.5 of Section 4.1.

                ρ = 0                                 | ρ = 0.5
Method          5%      25%     50%     75%     95%   | 5%      25%     50%     75%     95%
p = 1000
QA-SVS-A(4)     3.0     4.0     9.0     28.5    317.0 | 3.0     3.0     6.0     24.0    156.5
QA-SVS-A(5)     3.0     3.0     8.0     40.0    179.5 | 3.0     3.0     4.0     8.0     42.5
QA-SVS-A(6)     3.0     3.0     5.0     12.0    93.5  | 3.0     3.0     4.0     5.0     23.0
QA-SVS-A(7)     3.0     3.0     5.0     13.0    82.0  | 3.0     3.0     4.0     9.5     59.5
QA-SVS-A(8)     3.0     3.0     4.0     11.5    52.0  | 3.0     3.0     3.0     7.5     71.0
QA-SVS-A(9)     3.0     3.0     4.0     11.0    47.5  | 3.0     3.0     3.0     6.0     53.5
QA-SVS-A(10)    3.0     3.0     4.0     10.5    89.5  | 3.0     3.0     4.0     6.5     59.0
QCS(4)          3.0     3.0     4.0     10.0    46.0  | 3.0     3.0     3.0     6.0     60.5
QCS(5)          3.0     3.0     4.0     8.0     58.0  | 3.0     3.0     4.0     6.5     66.5
QCS(6)          3.0     3.0     4.0     11.0    251.0 | 3.0     3.0     3.0     6.0     44.0
QCS(7)          3.0     3.0     4.0     11.5    76.5  | 3.0     3.0     4.0     6.0     58.5
QCS(8)          3.0     3.0     6.0     17.5    118.0 | 3.0     3.0     4.0     9.5     76.5
QCS(9)          3.0     4.0     6.0     22.5    118.0 | 3.0     3.0     4.0     15.0    66.5
QCS(10)         3.0     4.0     7.0     20.0    156.5 | 3.0     3.0     5.0     9.5     72.5
SIS             260.0   559.0   802.5   922.5   981.5 | 227.5   574.5   769.5   907.0   988.0
DC-SIS          3.0     3.0     6.0     28.0    450.0 | 3.0     3.0     4.0     19.5    350.0
QA-SIS(0.1)     82.0    168.5   300.0   521.5   849.5 | 70.0    132.0   241.5   395.0   737.5
QA-SIS(0.3)     10.5    21.0    42.5    98.0    195.0 | 4.5     12.0    21.0    53.0    306.0
QA-SIS(0.5)     5.5     20.5    56.0    166.5   517.5 | 3.0     9.5     24.5    90.5    439.5
QA-SIS(0.7)     4.0     11.0    30.5    85.5    309.0 | 7.5     15.5    35.5    114.5   355.0
QA-SIS(0.9)     93.5    191.0   382.0   587.0   798.5 | 111.5   259.5   467.5   660.5   852.5
p = 5000
QA-SVS-A(4)     3.0     12.0    58.5    200.0   831.0 | 3.0     5.0     13.0    81.0    615.5
QA-SVS-A(5)     3.0     6.0     18.0    98.0    791.0 | 3.0     4.0     8.0     33.0    242.5
QA-SVS-A(6)     3.0     4.0     9.0     45.0    322.0 | 3.0     4.0     8.0     31.5    241.0
QA-SVS-A(7)     3.0     4.0     6.5     42.0    549.5 | 3.0     4.0     7.0     29.0    165.0
QA-SVS-A(8)     3.0     4.0     7.0     26.0    152.5 | 3.0     3.0     6.0     19.5    183.5
QA-SVS-A(9)     3.0     4.0     12.5    70.0    552.0 | 3.0     3.0     5.5     19.0    423.5
QA-SVS-A(10)    3.0     5.0     12.5    44.0    539.0 | 3.0     3.0     6.0     28.5    334.5
QCS(4)          3.0     3.0     6.0     14.5    166.0 | 3.0     3.0     4.0     14.5    163.5
QCS(5)          3.0     4.0     6.5     30.0    209.5 | 3.0     3.0     4.0     8.5     73.5
QCS(6)          3.0     4.0     9.5     40.0    594.5 | 3.0     3.0     5.0     17.0    160.5
QCS(7)          3.0     4.0     12.0    50.0    417.0 | 3.0     3.0     7.0     25.0    125.5
QCS(8)          3.0     5.0     20.0    64.5    619.0 | 3.0     3.0     7.0     25.0    94.0
QCS(9)          3.0     6.0     22.0    97.0    642.5 | 3.0     4.0     9.0     39.5    324.0
QCS(10)         3.0     6.0     21.5    163.0   1131.5 | 3.0    4.0     8.0     37.5    431.5
SIS             1341.0  3294.5  4114.0  4586.0  4968.0 | 1252.5  3027.5  3972.0  4524.0  4937.5
DC-SIS          3.0     5.0     12.0    110.0   1952.0 | 3.0     4.0     10.0    57.5    1297.5
QA-SIS(0.1)     423.0   1111.5  1827.0  3080.0  4736.5 | 350.0   709.5   1175.5  1655.0  3101.0
QA-SIS(0.3)     19.0    77.0    198.0   556.0   1755.5 | 14.0    45.5    128.0   402.0   1319.0
QA-SIS(0.5)     13.5    82.5    377.5   787.0   2140.0 | 4.0     32.0    177.5   469.5   2936.0
QA-SIS(0.7)     16.5    48.0    177.5   472.0   1475.0 | 22.5    96.5    336.0   777.5   2261.5
QA-SIS(0.9)     549.0   1025.5  1727.0  2413.0  3997.5 | 434.5   1132.5  2093.5  3309.0  4599.5
Notes: All notations are the same as those in Table 1.
Table A4. The quantiles of minimum model size in Scenario 1.6 of Section 4.1.

                ρ = 0                                 | ρ = 0.5
Method          5%      25%     50%     75%     95%   | 5%      25%     50%     75%     95%
p = 1000
QA-SVS-A(4)     3.0     11.0    41.5    211.0   717.0 | 3.0     3.0     3.0     4.0     5.0
QA-SVS-A(5)     3.0     4.5     26.5    126.5   380.5 | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(6)     3.0     3.0     13.0    69.5    319.0 | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(7)     3.0     3.0     6.0     18.0    309.0 | 3.0     3.0     3.0     3.0     3.5
QA-SVS-A(8)     3.0     3.0     6.0     16.5    107.0 | 3.0     3.0     3.0     3.0     3.0
QA-SVS-A(9)     3.0     3.0     4.5     10.0    94.0  | 3.0     3.0     3.0     3.0     3.0
QA-SVS-A(10)    3.0     3.0     4.0     13.0    112.5 | 3.0     3.0     3.0     3.0     3.0
QCS(4)          4.0     31.5    95.3    329.5   730.5 | 3.0     3.0     3.0     5.5     71.0
QCS(5)          8.0     39.0    148.5   437.0   742.0 | 3.0     3.0     3.0     4.0     10.5
QCS(6)          4.0     28.5    88.0    287.0   664.5 | 3.0     3.0     3.0     4.0     46.0
QCS(7)          3.5     22.5    95.5    256.0   563.5 | 3.0     3.0     3.0     3.0     6.0
QCS(8)          3.0     19.0    67.0    201.0   535.0 | 3.0     3.0     3.0     3.0     11.0
QCS(9)          3.0     8.0     50.0    165.5   494.5 | 3.0     3.0     3.0     3.0     11.0
QCS(10)         3.0     9.5     38.0    124.0   539.0 | 3.0     3.0     3.0     3.0     7.5
SIS             3.0     3.0     3.0     3.0     7.0   | 3.0     3.0     3.0     3.0     3.0
DC-SIS          3.0     3.0     3.0     3.0     4.0   | 3.0     3.0     3.0     3.0     3.0
QA-SIS(0.1)     4.0     7.5     14.5    28.0    152.0 | 3.0     4.0     5.0     7.5     15.0
QA-SIS(0.3)     4.0     35.0    110.0   247.0   840.5 | 3.0     3.0     3.0     5.0     17.0
QA-SIS(0.5)     27.5    141.0   262.0   516.5   880.5 | 3.0     4.0     7.0     35.5    258.0
QA-SIS(0.7)     49.5    185.0   425.0   736.5   929.0 | 8.5     32.0    160.5   385.5   865.0
QA-SIS(0.9)     241.5   456.5   648.5   878.0   985.0 | 67.0    275.0   541.5   754.5   972.0
p = 5000
QA-SVS-A(4)     6.0     52.0    347.0   1262.5  3451.5 | 3.0     3.0     3.0     3.0     5.5
QA-SVS-A(5)     3.0     17.5    149.0   472.0   2285.5 | 3.0     3.0     3.0     3.5     6.0
QA-SVS-A(6)     3.0     11.0    47.5    293.0   1288.0 | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(7)     3.0     5.0     19.5    89.5    1288.0 | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(8)     3.0     3.0     10.0    95.0    923.0  | 3.0     3.0     3.0     3.0     3.0
QA-SVS-A(9)     3.0     4.0     8.0     33.5    783.5  | 3.0     3.0     3.0     3.0     3.0
QA-SVS-A(10)    3.0     4.0     11.0    58.0    725.5  | 3.0     3.0     3.0     3.0     3.0
QCS(4)          40.5    329.0   853.5   1965.0  3801.0 | 3.0     3.0     4.0     32.0    277.0
QCS(5)          12.5    164.5   515.0   1677.5  3445.5 | 3.0     3.0     3.0     3.0     112.0
QCS(6)          6.5     62.5    262.0   917.0   2845.5 | 3.0     3.0     3.0     5.5     35.0
QCS(7)          18.5    105.0   404.5   937.0   2855.0 | 3.0     3.0     3.0     4.0     31.5
QCS(8)          5.5     84.0    333.0   788.5   3004.0 | 3.0     3.0     3.0     4.0     16.0
QCS(9)          3.0     46.0    173.5   596.5   1803.5 | 3.0     3.0     3.0     4.0     28.5
QCS(10)         6.5     40.5    213.5   939.0   2590.5 | 3.0     3.0     3.0     5.0     13.0
SIS             3.0     3.0     3.0     3.0     43.5   | 3.0     3.0     3.0     3.0     7.5
DC-SIS          3.0     3.0     3.0     3.0     4.0    | 3.0     3.0     3.0     3.0     3.0
QA-SIS(0.1)     8.0     23.0    52.5    149.0   919.0  | 5.0     11.0    17.5    27.0    55.5
QA-SIS(0.3)     3.0     117.5   562.0   1466.0  3196.5 | 3.0     3.0     4.0     6.5     109.0
QA-SIS(0.5)     69.5    611.5   1754.5  3333.5  4554.0 | 3.5     8.0     59.0    552.5   1786.5
QA-SIS(0.7)     184.5   1131.0  2307.0  3419.0  4573.0 | 17.0    287.5   886.5   1798.0  3692.5
QA-SIS(0.9)     867.0   1881.5  3007.0  3863.5  4763.5 | 841.5   1862.5  2873.0  3834.0  4603.0
Notes: All notations are the same as those in Table 1.
Table A5. The result of criteria in all scenarios under p = 5000 with α = 0.05 of Section 4.2.

Method      QA-SVS-AFD(K)                  | QA-SVS-FDR(K)                  | QCS-FDR(K)
K           2     3     4     5     6      | 2     3     4     5     6      | 2     3     4     5     6
Scenario 2.1
|Â|         12.08 10.69 10.52 10.23 10.04  | 11.68 11.69 11.92 11.88 11.53  | 11.17 11.22 11.25 11.29 11.10
FDR         0.16  0.06  0.05  0.02  0.00   | 0.14  0.14  0.16  0.15  0.13   | 0.10  0.10  0.10  0.11  0.09
F1-score    0.91  0.97  0.98  0.99  1.00   | 0.92  0.92  0.91  0.92  0.93   | 0.95  0.94  0.94  0.94  0.95
Scenario 2.2
|Â|         50.33 48.23 45.02 38.21 27.37  | 52.35 51.98 52.22 52.19 52.05  | 50.22 50.67 49.77 49.41 48.43
FDR         0.02  0.00  0.00  0.00  0.00   | 0.05  0.04  0.05  0.05  0.05   | 0.05  0.05  0.04  0.05  0.04
F1-score    0.98  0.98  0.95  0.86  0.70   | 0.97  0.98  0.97  0.97  0.97   | 0.96  0.96  0.95  0.95  0.94
Scenario 2.3
|Â|         12.09 10.67 10.40 10.10 10.02  | 11.74 11.58 11.52 11.60 11.57  | 11.08 10.90 11.16 10.90 10.90
FDR         0.17  0.06  0.04  0.01  0.00   | 0.14  0.13  0.13  0.13  0.13   | 0.09  0.08  0.10  0.08  0.08
F1-score    0.91  0.97  0.98  1.00  1.00   | 0.92  0.93  0.93  0.93  0.93   | 0.95  0.96  0.95  0.96  0.96
Scenario 2.4
|Â|         50.05 45.91 39.33 25.62 15.27  | 52.23 51.96 51.91 52.07 51.56  | 49.53 48.03 47.22 44.47 43.53
FDR         0.02  0.00  0.00  0.00  0.00   | 0.05  0.05  0.05  0.05  0.04   | 0.05  0.05  0.05  0.04  0.05
F1-score    0.98  0.96  0.88  0.67  0.46   | 0.97  0.97  0.97  0.97  0.97   | 0.95  0.93  0.92  0.90  0.89
Scenario 2.5
|Â|         10.89 9.88  9.30  7.95  5.90   | 10.47 10.43 10.54 10.47 10.40  | 10.00 10.09 9.88  9.62  9.48
FDR         0.08  0.00  0.00  0.00  0.00   | 0.04  0.04  0.05  0.04  0.04   | 0.05  0.05  0.05  0.04  0.05
F1-score    0.96  0.99  0.96  0.88  0.73   | 0.98  0.98  0.97  0.98  0.98   | 0.94  0.95  0.94  0.93  0.92
Scenario 2.6
|Â|         14.81 3.65  0.49  0.08  0.00   | 12.65 13.93 12.88 10.88 9.95   | 1.83  1.30  0.94  0.72  0.45
FDR         0.05  NaN   NaN   NaN   NaN    | 0.04  0.04  0.05  0.04  0.04   | NaN   NaN   NaN   NaN   NaN
F1-score    0.43  0.13  0.02  0.00  0.00   | 0.38  0.41  0.38  0.33  0.31   | 0.06  0.05  0.03  0.02  0.02
Scenario 2.7
|Â|         10.92 10.02 9.99  9.99  9.90   | 10.39 10.51 10.46 10.52 10.60  | 10.66 10.72 10.63 10.53 10.50
FDR         0.08  0.00  0.00  0.00  0.00   | 0.03  0.04  0.04  0.05  0.05   | 0.06  0.06  0.05  0.05  0.04
F1-score    0.96  1.00  1.00  1.00  0.99   | 0.98  0.98  0.98  0.98  0.97   | 0.97  0.97  0.97  0.98  0.98
Scenario 2.8
|Â|         38.60 18.94 6.36  1.24  0.19   | 43.52 41.46 40.05 38.34 36.21  | 22.75 17.21 11.87 8.08  6.15
FDR         0.02  0.00  0.00  NaN   NaN    | 0.05  0.04  0.05  0.05  0.04   | 0.04  0.05  NaN   NaN   NaN
F1-score    0.85  0.55  0.22  0.05  0.01   | 0.89  0.86  0.84  0.83  0.80   | 0.59  0.48  0.35  0.26  0.20
Notes: All notations are the same as those in Table 3.
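The QA-SVS-FDR and QCS-FDR columns above rest on a Benjamini–Hochberg-type step-up rule [27] applied to marginal p-values. The paper's exact test statistic is defined in the main text and is not reproduced here; the sketch below shows only the generic BH step, with a hypothetical p-value vector as input:

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Indices of hypotheses rejected by the BH step-up rule at level alpha."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    sorted_p = p[order]
    # find the largest k with p_(k) <= (k / m) * alpha
    below = sorted_p <= alpha * np.arange(1, m + 1) / m
    if not below.any():
        return np.array([], dtype=int)   # nothing is discovered
    k = np.max(np.nonzero(below)[0])
    return np.sort(order[: k + 1])       # reject all hypotheses up to rank k
```

At level α = 0.05 the rule rejects every hypothesis whose sorted p-value sits below the line kα/m, which is what keeps the empirical FDR in these tables near the nominal 0.05.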

Appendix C

Figure A1. A lung CT figure of a subject in the dataset.
Figure A2. Screened important pixels of lung CT by QA-SVS-FDR under K = 2. The black pixels indicate the useless pixels, and the white pixels represent the important pixels.
Figure A3. Screened important pixels of lung CT by QA-SVS-FDR under K = 3.
Figure A4. Screened important pixels of lung CT by QA-SVS-FDR under K = 4.
Figure A5. Screened important pixels of lung CT by QA-SVS-FDR under K = 5.
Figure A6. Screened important pixels of lung CT by QA-SVS-FDR under K = 6.

References

  1. Fan, J.; Lv, J. Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B-Stat. Methodol. 2008, 70, 849–883. [Google Scholar]
  2. Liu, W.; Li, R. Variable Selection and Feature Screening. In Macroeconomic Forecasting in the Era of Big Data: Theory and Practice; Fuleky, P., Ed.; Springer: Cham, Switzerland, 2020; pp. 293–326. [Google Scholar]
  3. Fan, J.; Song, R. Sure Independence Screening in Generalized Linear Models with Np-Dimensionality. Ann. Stat. 2010, 38, 3567–3604. [Google Scholar] [CrossRef]
  4. Fan, J.; Feng, Y.; Song, R. Nonparametric Independence Screening in Sparse Ultra-High-Dimensional Additive Models. J. Am. Stat. Assoc. 2011, 106, 544–557. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  5. Li, G.; Peng, H.; Zhang, J.; Zhu, L. Robust Rank Correlation Based Screening. Ann. Stat. 2012, 40, 1846–1877. [Google Scholar] [CrossRef] [Green Version]
  6. Chang, J.; Tang, C.Y.; Wu, Y. Marginal Empirical Likelihood In addition, Sure Independence Feature Screening. Ann. Stat. 2013, 41, 2123–2148. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  7. Zhu, L.P.; Li, L.; Li, R.; Zhu, L.X. Model-Free Feature Screening for Ultrahigh-Dimensional Data. J. Am. Stat. Assoc. 2011, 106, 1464–1475. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  8. Li, R.; Zhong, W.; Zhu, L. Feature Screening via Distance Correlation Learning. J. Am. Stat. Assoc. 2012, 107, 1129–1139. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  9. He, X.; Wang, L.; Hong, H.G. Quantile-Adaptive Model-Free Variable Screening for High-Dimensional Heterogeneous Data. Ann. Stat. 2013, 41, 342–369. [Google Scholar] [CrossRef]
  10. Lin, L.; Sun, J.; Zhu, L. Nonparametric feature screening. Comput. Stat. Data Anal. 2013, 67, 162–174. [Google Scholar] [CrossRef]
  11. Lu, J.; Lin, L. Model-free conditional screening via conditional distance correlation. Stat. Pap. 2020, 61, 225–244. [Google Scholar] [CrossRef]
  12. Mai, Q.; Zou, H. The Kolmogorov filter for variable screening in high-dimensional binary classification. BIOMETRIKA 2013, 100, 229–234. [Google Scholar] [CrossRef]
  13. Huang, D.; Li, R.; Wang, H. Feature Screening for Ultrahigh Dimensional Categorical Data with Applications. J. Bus. Econ. Stat. 2014, 32, 237–244. [Google Scholar] [CrossRef]
  14. Cui, H.; Li, R.; Zhong, W. Model-Free Feature Screening for Ultrahigh Dimensional Discriminant Analysis. J. Am. Stat. Assoc. 2015, 110, 630–641. [Google Scholar] [CrossRef]
  15. Han, X. Nonparametric screening under conditional strictly convex loss for ultrahigh dimensional sparse data. Ann. Stat. 2019, 47, 1995–2022. [Google Scholar]
  16. Zhou, T.; Zhu, L.; Xu, C.; Li, R. Model-free forward screening via cumulative divergence. J. Am. Stat. Assoc. 2020, 115, 1393–1405. [Google Scholar] [CrossRef]
  17. Xie, J.; Lin, Y.; Yan, X.; Tang, N. Category-Adaptive Variable Screening for Ultra-High Dimensional Heterogeneous Categorical Data. J. Am. Stat. Assoc. 2020, 115, 747–760. [Google Scholar] [CrossRef]
  18. Hao, N.; Zhang, H.H. A note on high-dimensional linear regression with interactions. Am. Stat. 2017, 71, 291–297. [Google Scholar] [CrossRef] [Green Version]
  19. Tang, W.; Xie, J.; Lin, Y.; Tang, N. Quantile Correlation Based Variable Selection. J. Bus. Econ. Stat. 2021, 40, 1801–1903. [Google Scholar] [CrossRef]
  20. Liu, W.; Ke, Y.; Liu, J.; Li, R. Model-free feature screening and fdr control with knockoff features. J. Am. Stat. Assoc. 2022, 117, 428–443. [Google Scholar] [CrossRef]
  21. Guo, X.; Ren, H.; Zou, C.; Li, R. Threshold selection in feature screening for error rate control. J. Am. Stat. Assoc. 2022, 1–13. [Google Scholar] [CrossRef]
  22. Cook, R.D. Testing predictor contributions in sufficient dimension reduction. Ann. Stat. 2004, 32, 1062–1092. [Google Scholar] [CrossRef] [Green Version]
  23. Yin, X.; Hilafu, H. Sequential Sufficient Dimension Reduction for Large p, Small n Problems. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2015, 77, 879–892. [Google Scholar] [CrossRef]
  24. Yuan, Q.; Chen, X.; Ke, C.; Yin, X. Independence index sufficient variable screening for categorical responses. Comput. Stat. Data Anal. 2022, 174, 107530. [Google Scholar] [CrossRef]
  25. Hyndman, R.J.; Fan, Y. Sample Quantiles in Statistical Packages. Am. Stat. 1996, 50, 361–365. [Google Scholar]
  26. Mohamed, I.B.; Mirakhmedov, S.M. Approximation by Normal Distribution for A Sample Sum in Sampling Without Replacement from a Finite Population. Sankhya A 2016, 78, 188–220. [Google Scholar] [CrossRef] [Green Version]
  27. Benjamini, Y.; Hochberg, Y. Controlling The False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc. Ser. B (Methodol.) 1995, 57, 289–300. [Google Scholar] [CrossRef]
  28. Shalmon, T.; Salazar, P.; Horie, M.; Hanneman, K.; Pakkal, M.; Anwari, V.; Fratesi, J. Predefined and data driven CT densitometric features predict critical illness and hospital length of stay in COVID-19 patients. Sci. Rep. 2022, 12, 8143. [Google Scholar] [CrossRef]
Table 1. The quantiles of minimum model size in Scenario 1.1 of Section 4.1.

                ρ = 0                                 | ρ = 0.5
Method          5%      25%     50%     75%     95%   | 5%      25%     50%     75%     95%
p = 1000
QA-SVS-A(4)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(5)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(6)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(7)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(8)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(9)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(10)    3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     3.0
QCS(4)          3.0     3.0     3.0     3.0     4.0   | 3.0     3.0     3.0     3.0     4.0
QCS(5)          3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     7.0
QCS(6)          3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     5.0
QCS(7)          3.0     3.0     3.0     3.0     4.0   | 3.0     3.0     3.0     3.0     5.0
QCS(8)          3.0     3.0     3.0     3.0     5.0   | 3.0     3.0     3.0     3.0     8.0
QCS(9)          3.0     3.0     3.0     3.0     10.5  | 3.0     3.0     3.0     3.0     6.5
QCS(10)         3.0     3.0     3.0     4.0     7.5   | 3.0     3.0     3.0     3.0     7.5
SIS             3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
DC-SIS          3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SIS(0.1)     3.0     3.0     5.0     10.0    60.0  | 3.0     3.0     3.0     5.0     23.0
QA-SIS(0.3)     3.0     3.0     3.0     3.0     4.0   | 3.0     3.0     3.0     3.0     4.5
QA-SIS(0.5)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.5
QA-SIS(0.7)     3.0     3.0     3.0     3.0     4.0   | 3.0     3.0     3.0     3.0     5.0
QA-SIS(0.9)     3.0     3.0     4.0     12.0    56.0  | 3.0     3.0     3.0     7.5     38.5
p = 5000
QA-SVS-A(4)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     3.5
QA-SVS-A(5)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(6)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(7)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(8)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(9)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(10)    3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     3.0
QCS(4)          3.0     3.0     3.0     3.0     5.0   | 3.0     3.0     3.0     3.0     7.0
QCS(5)          3.0     3.0     3.0     3.0     6.0   | 3.0     3.0     3.0     3.0     5.0
QCS(6)          3.0     3.0     3.0     3.0     5.0   | 3.0     3.0     3.0     3.0     8.5
QCS(7)          3.0     3.0     3.0     3.0     10.0  | 3.0     3.0     3.0     3.0     14.5
QCS(8)          3.0     3.0     3.0     3.0     30.0  | 3.0     3.0     3.0     3.0     24.0
QCS(9)          3.0     3.0     3.0     4.0     28.5  | 3.0     3.0     3.0     4.0     46.5
QCS(10)         3.0     3.0     3.0     5.0     36.5  | 3.0     3.0     3.0     5.5     71.5
SIS             3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
DC-SIS          3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SIS(0.1)     3.0     4.0     11.5    56.5    241.0 | 3.0     3.0     5.0     18.0    254.0
QA-SIS(0.3)     3.0     3.0     3.0     3.0     8.5   | 3.0     3.0     3.0     4.0     6.5
QA-SIS(0.5)     3.0     3.0     3.0     3.0     4.0   | 3.0     3.0     3.0     3.0     4.5
QA-SIS(0.7)     3.0     3.0     3.0     3.0     14.0  | 3.0     3.0     3.0     3.0     9.0
QA-SIS(0.9)     3.0     4.0     13.0    55.0    160.0 | 3.0     3.0     6.0     14.0    222.5
Notes: QA-SVS-A(4), QA-SVS-A(5), …, and QA-SVS-A(10), our proposed method defined in Remark 3 with different quantile grid points (K = 4, …, 10); QCS(4), QCS(5), …, and QCS(10), the quantile correlation-based screening method (Tang et al., 2021) [19] with different quantile grid points (K = 4, …, 10); SIS, the sure independence screening (Fan and Lv, 2008) [1]; DC-SIS, the distance correlation-based screening (Li et al., 2012) [8]; QA-SIS(0.1), QA-SIS(0.3), …, QA-SIS(0.9), the quantile-adaptive model-free sure independence screening (He et al., 2013) [9] at different quantile levels.
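The minimum model size reported in these tables is the smallest number of top-ranked predictors a screener must keep before every truly active predictor is covered; the 5%–95% quantiles are then taken over replications. A minimal sketch of the criterion (the screening statistic here is hypothetical noise plus signal, not the paper's statistic):

```python
import numpy as np

rng = np.random.default_rng(0)

def minimum_model_size(scores, active):
    """Smallest top-ranked model containing every active predictor.

    scores: 1-D array of marginal screening statistics (larger = stronger);
    active: indices of the truly active predictors.
    """
    order = np.argsort(-scores)              # rank 1 = largest score
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(scores) + 1)
    # the model must extend down to the worst-ranked active predictor
    return int(ranks[list(active)].max())

# toy illustration: 200 replications, p = 1000, 3 active predictors
p, active = 1000, [0, 1, 2]
sizes = []
for _ in range(200):
    scores = rng.random(p)
    scores[active] += 0.8                    # hypothetical signal strength
    sizes.append(minimum_model_size(scores, active))
print(np.quantile(sizes, [0.05, 0.25, 0.50, 0.75, 0.95]))
```

A value of 3.0 in the tables therefore means the three active predictors were ranked in the top three positions at that quantile of the replications.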
Table 2. The quantiles of minimum model size in Scenario 1.2 of Section 4.1.

                ρ = 0                                 | ρ = 0.5
Method          5%      25%     50%     75%     95%   | 5%      25%     50%     75%     95%
p = 1000
QA-SVS-A(4)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(5)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(6)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(7)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(8)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(9)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     4.0
QA-SVS-A(10)    3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     3.0
QCS(4)          3.0     3.0     3.0     3.0     11.0  | 3.0     3.0     3.0     3.0     4.0
QCS(5)          3.0     3.0     3.0     3.0     17.0  | 3.0     3.0     3.0     3.0     7.0
QCS(6)          3.0     3.0     3.0     4.0     12.5  | 3.0     3.0     3.0     3.0     5.0
QCS(7)          3.0     3.0     3.0     4.0     15.5  | 3.0     3.0     3.0     3.0     5.0
QCS(8)          3.0     3.0     3.0     5.0     35.0  | 3.0     3.0     3.0     3.0     8.0
QCS(9)          3.0     3.0     3.0     9.0     127.0 | 3.0     3.0     3.0     3.0     6.5
QCS(10)         3.0     3.0     3.0     6.0     32.5  | 3.0     3.0     3.0     3.0     7.5
SIS             287.0   486.5   697.5   870.0   986.5 | 127.5   330.5   573.5   824.0   971.0
DC-SIS          3.0     3.0     3.0     3.0     260.0 | 3.0     3.0     3.0     3.0     17.0
QA-SIS(0.1)     176.5   262.0   394.5   576.5   814.5 | 61.5    147.5   257.0   394.0   630.5
QA-SIS(0.3)     3.0     3.0     4.0     6.0     21.5  | 3.0     3.0     3.0     3.0     4.0
QA-SIS(0.5)     3.0     3.0     3.0     3.0     6.5   | 3.0     3.0     3.0     3.0     3.0
QA-SIS(0.7)     3.0     5.0     8.5     23.5    67.0  | 3.0     3.0     3.0     4.0     5.5
QA-SIS(0.9)     100.5   238.0   368.0   517.5   866.5 | 35.5    85.5    143.5   301.0   601.0
p = 5000
QA-SVS-A(4)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     3.0
QA-SVS-A(5)     3.0     3.0     3.0     3.0     5.0   | 3.0     3.0     3.0     3.0     3.0
QA-SVS-A(6)     3.0     3.0     3.0     3.0     4.0   | 3.0     3.0     3.0     3.0     3.0
QA-SVS-A(7)     3.0     3.0     3.0     3.0     3.0   | 3.0     3.0     3.0     3.0     3.0
QA-SVS-A(8)     3.0     3.0     3.0     3.0     4.0   | 3.0     3.0     3.0     3.0     3.0
QA-SVS-A(9)     3.0     3.0     3.0     3.0     4.0   | 3.0     3.0     3.0     3.0     3.0
QA-SVS-A(10)    3.0     3.0     3.0     3.0     4.0   | 3.0     3.0     3.0     3.0     3.0
QCS(4)          3.0     3.0     3.0     3.5     46.0  | 3.0     3.0     3.0     3.0     3.0
QCS(5)          3.0     3.0     3.0     8.0     72.5  | 3.0     3.0     3.0     3.0     3.0
QCS(6)          3.0     3.0     3.0     6.5     38.5  | 3.0     3.0     3.0     3.0     3.0
QCS(7)          3.0     3.0     3.0     10.5    171.5 | 3.0     3.0     3.0     3.0     3.0
QCS(8)          3.0     3.0     4.0     12.0    124.5 | 3.0     3.0     3.0     3.0     3.0
QCS(9)          3.0     3.0     4.0     15.5    353.5 | 3.0     3.0     3.0     3.0     3.0
QCS(10)         3.0     3.0     4.0     20.0    221.5 | 3.0     3.0     3.0     3.0     3.0
SIS             1391.0  2628.5  3487.0  4254.5  4810.5 | 1165.5  2144.0  3288.5  4218.5  4937.5
DC-SIS          3.0     3.0     3.0     4.0     1197.5 | 3.0     3.0     3.0     3.0     57.5
QA-SIS(0.1)     439.0   1062.5  1653.5  2543.5  3717.0 | 260.5   629.0   1142.0  1813.5  3317.0
QA-SIS(0.3)     4.0     6.0     9.5     20.5    110.5  | 3.0     3.0     4.0     5.0     9.5
QA-SIS(0.5)     3.0     3.0     3.0     5.0     13.0   | 3.0     3.0     3.0     3.0     3.0
QA-SIS(0.7)     5.0     11.0    25.0    61.0    377.0  | 3.0     3.0     4.0     5.0     9.5
QA-SIS(0.9)     625.0   1510.5  2279.0  3218.0  4569.0 | 175.0   500.0   868.0   1592.5  2403.0
Notes: All notations are the same as those in Table 1.
Table 3. The result of criteria in all scenarios under p = 1000 with α = 0.05 of Section 4.2.

Method      QA-SVS-AFD(K)                  | QA-SVS-FDR(K)                  | QCS-FDR(K)
K           2     3     4     5     6      | 2     3     4     5     6      | 2     3     4     5     6
Scenario 2.1
|Â|         12.17 10.81 10.57 10.23 10.05  | 11.70 11.74 11.76 11.75 11.52  | 11.33 11.24 11.29 11.09 11.10
FDR         0.17  0.07  0.05  0.02  0.00   | 0.14  0.14  0.14  0.14  0.13   | 0.11  0.10  0.11  0.09  0.09
F1-score    0.90  0.96  0.97  0.99  1.00   | 0.92  0.92  0.92  0.92  0.93   | 0.94  0.94  0.94  0.95  0.95
Scenario 2.2
|Â|         50.40 48.14 44.87 37.95 26.82  | 52.16 52.29 52.28 52.21 52.35  | 50.53 50.52 50.16 49.59 48.59
FDR         0.02  0.00  0.00  0.00  0.00   | 0.05  0.05  0.05  0.05  0.05   | 0.05  0.05  0.05  0.05  0.05
F1-score    0.98  0.98  0.95  0.86  0.69   | 0.97  0.97  0.97  0.97  0.97   | 0.96  0.95  0.95  0.94  0.94
Scenario 2.3
|Â|         11.98 10.66 10.38 10.13 10.02  | 11.48 11.51 11.70 11.40 11.59  | 11.11 11.02 11.22 10.96 11.00
FDR         0.16  0.06  0.03  0.01  0.00   | 0.12  0.12  0.14  0.12  0.13   | 0.09  0.09  0.10  0.08  0.08
F1-score    0.91  0.97  0.98  0.99  1.00   | 0.93  0.93  0.92  0.94  0.93   | 0.95  0.95  0.94  0.96  0.95
Scenario 2.4
|Â|         50.31 46.07 39.30 26.54 14.55  | 52.22 52.26 51.75 51.88 51.77  | 49.77 47.84 46.93 44.97 43.69
FDR         0.02  0.00  0.00  0.00  0.00   | 0.05  0.05  0.05  0.05  0.05   | 0.05  0.04  0.05  0.05  0.05
F1-score    0.98  0.96  0.88  0.69  0.45   | 0.97  0.97  0.97  0.97  0.97   | 0.95  0.93  0.92  0.90  0.89
Scenario 2.5
|Â|         10.82 9.93  9.16  7.84  5.71   | 10.36 10.73 10.50 10.44 10.55  | 9.93  9.99  9.76  9.57  9.09
FDR         0.08  0.00  0.00  0.00  0.00   | 0.04  0.06  0.05  0.04  0.05   | 0.05  0.05  0.04  0.05  0.05
F1-score    0.96  0.99  0.95  0.87  0.72   | 0.98  0.97  0.97  0.98  0.97   | 0.94  0.95  0.94  0.93  0.90
Scenario 2.6
|Â|         14.87 3.63  0.53  0.07  0.01   | 12.86 15.04 12.49 10.79 9.23   | 1.54  1.35  1.07  0.61  0.48
FDR         0.06  NaN   NaN   NaN   NaN    | 0.05  0.05  0.03  NaN   NaN    | NaN   NaN   NaN   NaN   NaN
F1-score    0.43  0.13  0.02  0.00  0.00   | 0.38  0.43  0.38  0.33  0.29   | 0.05  0.04  0.03  0.02  0.02
Scenario 2.7
|Â|         11.11 10.00 10.00 9.99  9.93   | 10.66 10.44 10.65 10.45 10.54  | 10.62 10.60 10.64 10.45 10.61
FDR         0.09  0.00  0.00  0.00  0.00   | 0.06  0.04  0.06  0.04  0.05   | 0.05  0.05  0.05  0.04  0.05
F1-score    0.95  1.00  1.00  1.00  1.00   | 0.97  0.98  0.97  0.98  0.98   | 0.97  0.97  0.97  0.98  0.97
Scenario 2.8
|Â|         38.86 19.20 5.93  1.26  0.29   | 43.22 42.34 40.35 37.54 36.33  | 23.00 16.93 11.91 8.62  6.20
FDR         0.03  0.00  0.00  NaN   NaN    | 0.05  0.05  0.05  0.04  0.05   | 0.05  0.04  0.04  0.05  NaN
F1-score    0.85  0.55  0.21  0.05  0.01   | 0.88  0.87  0.85  0.82  0.80   | 0.59  0.48  0.36  0.27  0.20
Notes: QA-SVS-AFD(K), our proposed method by controlling FD adaptively defined in Remark 4 with different quantile grid points (K = 2, …, 6); QA-SVS-FDR(K), our proposed method by controlling FDR defined in Remark 5 with different quantile grid points (K = 2, …, 6); QCS-FDR(K), the quantile correlation-based screening method (Tang et al., 2021) [19] with different quantile grid points (K = 2, …, 6). | A ^ | : the average number of selected predictors; FDR: the average of the empirical false discovery proportion, where ‘NaN’ indicates that the method loses validity; F1-score: the average F1-score.
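The empirical false discovery proportion and F1-score in these tables compare the selected set A^ with the true active set. A small self-contained sketch of both criteria (the function and variable names are illustrative, not from the paper's code):

```python
def fdp_and_f1(selected, active):
    """Empirical false discovery proportion and F1-score of a selected set.

    selected, active: collections of predictor indices.
    """
    selected, active = set(selected), set(active)
    tp = len(selected & active)
    # FDP is undefined when nothing is selected ('NaN' in the tables)
    fdp = float('nan') if not selected else (len(selected) - tp) / len(selected)
    precision = tp / len(selected) if selected else 0.0
    recall = tp / len(active) if active else 0.0
    f1 = 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)
    return fdp, f1
```

For example, selecting 12 predictors of which 10 are truly active gives FDP = 2/12 ≈ 0.17 and F1 ≈ 0.91, matching the order of magnitude of the Scenario 2.1 entries.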
Table 4. The numbers of selected picture pixels in applications of Section 5.

K                2        3        4        5        6
5% PD
QA-SVS-AFD(K)    20,152   2435     28       0        0
QA-SVS-FDR(K)    89,353   85,426   83,991   76,492   74,106
QCS-FDR(K)       262,144  262,144  262,144  262,144  262,144
95% PD
QA-SVS-AFD(K)    5151     502      15       0        0
QA-SVS-FDR(K)    76,800   76,442   75,863   78,157   66,473
QCS-FDR(K)       262,144  262,144  262,144  262,144  262,144
Note: The number of complete figure pixels is 262,144. QA-SVS-AFD(K), our proposed method by controlling FD adaptively defined in Remark 4 with different quantile grid points; QA-SVS-FDR(K), our proposed method by controlling FDR defined in Remark 5 with different quantile grid points; QCS-FDR(K), the quantile correlation-based screening method (Tang et al., 2021) [19] with different quantile grid points.
Yuan, Z.; Chen, J.; Qiu, H.; Huang, Y. Quantile-Adaptive Sufficient Variable Screening by Controlling False Discovery. Entropy 2023, 25, 524. https://doi.org/10.3390/e25030524