Article

Detection of Interaction Effects in a Nonparametric Concurrent Regression Model

Rui Pan, Zhanfeng Wang and Yaohua Wu
1 School of Data Science, University of Science and Technology of China, Hefei 230026, China
2 Department of Statistics and Finance, Management School, University of Science and Technology of China, Hefei 230026, China
* Author to whom correspondence should be addressed.
Entropy 2023, 25(9), 1327; https://doi.org/10.3390/e25091327
Submission received: 1 June 2023 / Revised: 5 August 2023 / Accepted: 8 September 2023 / Published: 12 September 2023
(This article belongs to the Special Issue Statistical Methods for Modeling High-Dimensional and Complex Data)

Abstract

Many methods have been developed to study nonparametric function-on-function regression models. Nevertheless, there is a lack of model selection approaches for a regression function that is a functional with functional covariate inputs. To study interaction effects among these functional covariates, in this article, we first construct a tensor product space of reproducing kernel Hilbert spaces and build an analysis of variance (ANOVA) decomposition of the tensor product space. We then use a model selection method with the L_1 criterion to estimate the functional with functional covariate inputs and to detect interaction effects among the functional covariates. The proposed method is evaluated using simulations and stroke rehabilitation data.

1. Introduction

Functional data can be found in various fields, such as biology, economics, engineering, geology, medicine and psychology. Recently, statistical methods and theories for functional data have been widely studied ([1,2,3,4,5,6]). Functional data sometimes have more complicated structures. For example, the motivating data of this paper, the stroke rehabilitation data, come from a collection of 3D video games known as Circus Challenge used to enhance upper limb function in stroke patients ([7,8,9]). The patients were scheduled to play the movement game at specified times over three months. At each visit time t, the level of impairment of stroke subject i was measured using the CAHAI (Chedoke Arm and Hand Activity Inventory) score, denoted as y_i(t), and movements of the patients' upper limbs, such as the forward circle movement and the sawing movement, were also recorded. The movement data at time t and frequency s from the ith patient are denoted as x_i(t, s). Determining how to model the relationship between y_i(t) and the functional covariate x_i(t, ·) is key to studying whether the movements are helpful to the rehabilitation level of a stroke patient. Furthermore, there is the question of whether there are interaction effects among the movements on the patient's rehabilitation. Zhai et al. [9] developed a nonparametric concurrent regression model to study the relationship between the functional movements and the CAHAI score. However, they did not consider interaction effects of the functional movements on the CAHAI score. In this paper, we aim to examine the interaction effects of the movements on the CAHAI scores and to predict the rehabilitation level of stroke patients.
In this paper, we apply the following nonparametric concurrent regression model (NCRM) to model stroke rehabilitation data:
\[ y_i(t) = f(t, x_i(t,\cdot)) + \epsilon_i(t), \quad i = 1, \ldots, n, \tag{1} \]
where f is a bivariate functional to be estimated nonparametrically, the response y_i(t) is a function of t, the covariate x_i(t, ·) is a vector of functions of length q, and ε_i(t) is a random error. To explore the interaction effects among the components of the covariate x_i(t, ·), we use the smoothing spline analysis of variance (SS ANOVA) method [10,11] to decompose the regression function f.
A multivariate function can be decomposed into main effects and interaction effects via the SS ANOVA method ([10,11]). When the dimension q of the covariates is large, the decomposed model contains a large number of interaction effects. Even if only the main effects and second-order interaction terms are considered, the number of decomposition terms is of order O(q^2), which leads to a highly complicated model. For the stroke rehabilitation data with q = 3, there are 22 such terms, including the main effects and interaction effects. This challenges the estimation method for the NCRM model. To avoid this difficulty, Zhai et al. [9] took all functional covariates as a whole and did not consider interaction effects among them. Following [12], this paper develops a model selection method for the NCRM model with all main effects and interaction effects, in which the regression function is estimated and the significant components of the decomposition are selected simultaneously.
Model selection is a crucial step in building statistical models that accurately capture the relationships between variables ([13,14]). It chooses the most suitable model from a set of candidate models based on criteria such as goodness-of-fit, predictive performance, and interpretability. Within the SS ANOVA approach, model selection is crucial for determining the contribution of each component of the decomposition to the overall variance of the response variable. Several methods have been proposed for selecting models with SS ANOVA, including forward selection, backward elimination, and stepwise regression ([15,16,17,18,19,20]). However, these methods are limited in their ability to handle high-dimensional data and identify complex interactions among variables. Hence, regularization methods such as the L_1 penalty, which allow for the selection of sparse and robust models, have gained popularity in recent years ([12,21,22,23,24,25]). For example, Zhang et al. [23] developed a nonparametric penalized likelihood method with the likelihood basis pursuit and used it for variable selection and model construction. Lin and Zhang [22] proposed a component selection and smoothing method for multivariate nonparametric regression models by penalizing the sum of the component norms of the SS ANOVA decomposition. Furthermore, Wang et al. [12] developed a unified framework for estimation and model selection in nonparametric function-on-function regression, which performs well when using L_1 penalty methods for model selection. Dong and Wang [24] proposed a nonparametric method for learning the conditional dependence structure in graphical models by applying L_1 regularization to detect the neighborhoods of edges, where an SS ANOVA decomposition is used to describe the interaction effects of edges in the graphical model. In this paper, we borrow the L_1 regularization idea and build model selection by penalizing the sum of the norms of the ANOVA decomposition components of the NCRM model. In addition, Bayesian methods can also be used to study interaction effects; for example, Ren et al. [26] proposed a novel semiparametric Bayesian variable selection model for investigating linear and nonlinear gene-environment interactions simultaneously, allowing for structural identification.
This paper proposes an estimation and model selection approach for the NCRM model (1). Following [12,22], the SS ANOVA decomposition of the tensor product space of reproducing kernel Hilbert spaces (RKHS) is constructed, and an L_1 penalty on the components of the decomposition is implemented. We use estimation procedures under either an L_1 penalty or a joint L_1 and L_2 penalty to fit the NCRM model. We study the interaction effects of the covariate x_i(t, ·) in model (1) via an ANOVA decomposition of the regression function, where the tensor product RKHS is built from Gaussian kernels. The decomposition is different from that of Zhai et al. [9], who took the covariate as a whole variable and did not consider interaction effects among its components. Based on this decomposition, model selection in the tensor product RKHS is conducted using the L_1 penalty method. Because of the form of the covariate x_i(t, ·), the models of Wang et al. [12] are not suitable for analyzing the stroke data. In this paper, we apply the proposed method to the stroke rehabilitation data and study the relationship between the movements and the patients' CAHAI scores. Besides the main effects, an interaction effect of the movements is also detected.
The remainder of the article is organized as follows. In Section 2, we present the tensor product RKHS with the Gaussian kernel and the SS ANOVA decomposition of the regression function. In Section 3, we describe the model selection and estimation procedures, and Section 4 establishes statistical properties of the estimator. The simulation study and the application to the stroke rehabilitation data are presented in Section 5 and Section 6, respectively. We conclude in Section 7.

2. Nonparametric Concurrent Regression Model

For the NCRM model (1), we consider x_i(t, ·) = (x_{i1}(t, ·), …, x_{iq}(t, ·)), where x_{ij}(t, s): S → R for any fixed time t ∈ T is a function of s in a space denoted by X_j, j ∈ {1, …, q}. Generally, t and s can be transformed into [0, 1]. For simplicity, we let T = [0, 1] and S = [0, 1], and let X_j ⊂ L_2[0, 1], j = 1, …, q, which are independent of t. Furthermore, we assume that y_i(t) ∈ Y ⊂ L_2[0, 1] and that ε_i(t), i = 1, …, n, are independently and identically distributed in L_2[0, 1] with mean zero and ∫_0^1 E[ε_i(t)^2] dt < ∞. The regression function f is thus a functional with a functional covariate input x_i(t, ·). To estimate f nonparametrically, the SS ANOVA decomposition method is used to construct a tensor product RKHS to which f belongs.
When f is treated as a function of its first argument t ∈ T, we consider the Sobolev space [10],
\[ H^{(1)} = \left\{ f : f \text{ and } f' \text{ absolutely continuous}, \ \int_0^1 (f'')^2\, dt < \infty \right\}, \]
where H^{(1)} can be rewritten as
\[ H^{(1)} = \{1\} \oplus \{t\} \oplus H_2^{(1)}, \]
where {1} is the constant space, {t} is the linear function space with t as an independent variable, and H_2^{(1)} is a smooth function space orthogonal to the constant and linear function spaces. The reproducing kernels (RKs) of these three subspaces are K_0^{(1)}(t, t') = 1, K_1^{(1)}(t, t') = k_1(t)k_1(t'), and K_2^{(1)}(t, t') = k_2(t)k_2(t') - k_4(|t - t'|), where k_1, k_2 and k_4 are defined as
\[ k_1(x) = x - 0.5, \quad k_2(x) = \frac{1}{2}\left( k_1^2(x) - \frac{1}{12} \right), \quad k_4(x) = \frac{1}{24}\left( k_1^4(x) - \frac{1}{2} k_1^2(x) + \frac{7}{240} \right). \]
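To make these kernels concrete, the following Python snippet (an illustrative sketch, not code from the paper; all function names are ours) evaluates k_1, k_2, k_4 and the cubic spline RK K_2^{(1)}(t, t') = k_2(t)k_2(t') - k_4(|t - t'|) on a grid of time points.

```python
import numpy as np

def k1(x):
    return x - 0.5

def k2(x):
    # scaled Bernoulli polynomial B_2(x) / 2!
    return 0.5 * (k1(x) ** 2 - 1.0 / 12.0)

def k4(x):
    # scaled Bernoulli polynomial B_4(x) / 4!
    return (k1(x) ** 4 - 0.5 * k1(x) ** 2 + 7.0 / 240.0) / 24.0

def K2_1(t, t_prime):
    """Reproducing kernel of the smooth subspace H_2^(1) on [0, 1]."""
    return k2(t) * k2(t_prime) - k4(np.abs(t - t_prime))

# example: kernel matrix on a grid of time points
t_grid = np.linspace(0.0, 1.0, 5)
K = K2_1(t_grid[:, None], t_grid[None, :])
print(np.round(K, 4))
```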
For the functional arguments x(t, ·), the RK and its corresponding RKHS for f as a function of functions in X = X_1 × ⋯ × X_q are constructed as follows. For any u_j, u_j' ∈ X_j, we construct a Gaussian kernel as
\[ K_{2,j}^{(2)}(u_j, u_j') = \exp\left( -\frac{\| u_j - u_j' \|^2}{2} \right), \]
where ‖u_j‖^2 = ∫_0^1 u_j^2(s) ds. We can show that when the space X_j is complete, K_{2,j}^{(2)} is symmetric and strictly positive definite. The unique RKHS H_{2,j}^{(2)} derived from K_{2,j}^{(2)} is separable and does not contain any non-zero constants. To construct an SS ANOVA decomposition, we let H_j^{(2)} = {1} ⊕ H_{2,j}^{(2)}. Then, the tensor product space in this paper is H^{(2)} = H_1^{(2)} ⊗ ⋯ ⊗ H_q^{(2)} with the following decomposition:
\[ H^{(2)} = H_1^{(2)} \otimes \cdots \otimes H_q^{(2)} = \{1\} \oplus H_{2,1}^{(2)} \oplus \cdots \oplus H_{2,q}^{(2)} \oplus H_{2,1}^{(2)} \otimes H_{2,2}^{(2)} \oplus \cdots \oplus H_{2,q-1}^{(2)} \otimes H_{2,q}^{(2)} \oplus \cdots \oplus H_{2,1}^{(2)} \otimes \cdots \otimes H_{2,q}^{(2)}. \tag{4} \]
Decomposition (4) is different from that of Zhai et al. [9], where H^{(2)} is decomposed into the constant space {1} and another RKHS that does not consider interactions among the components of x(t, ·).
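For intuition, the short Python sketch below (a hypothetical helper, not the authors' implementation) evaluates the Gaussian kernel K_{2,j}^{(2)} for two covariate curves observed on a common grid, approximating the L_2 distance by the trapezoidal rule.

```python
import numpy as np

def gaussian_functional_kernel(u, u_prime, s_grid):
    """Gaussian kernel exp(-||u - u'||^2 / 2) for two curves sampled on s_grid."""
    sq_dist = np.trapz((u - u_prime) ** 2, s_grid)  # approximates the squared L2 distance
    return np.exp(-0.5 * sq_dist)

# example with two toy covariate curves on [0, 1]
s_grid = np.linspace(0.0, 1.0, 101)
u1 = np.sin(2 * np.pi * s_grid)
u2 = np.cos(2 * np.pi * s_grid)
print(gaussian_functional_kernel(u1, u2, s_grid))
```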
Next, we consider the tensor product space H = H^{(1)} ⊗ H^{(2)}, which has the following decomposition:
\[
\begin{aligned}
H ={} & \left[ \{1\} \oplus \{t\} \oplus H_2^{(1)} \right] \otimes \left[ \{1\} \oplus \Big\{ \bigoplus_{j=1}^{q} H_{2,j}^{(2)} \Big\} \oplus \cdots \oplus H_{2,1}^{(2)} \otimes \cdots \otimes H_{2,q}^{(2)} \right] \\
={} & \{1\} \oplus \{t\} \oplus H_2^{(1)} \oplus \Big\{ \bigoplus_{j=1}^{q} H_{2,j}^{(2)} \Big\} \oplus \Big\{ \bigoplus_{j=1}^{q} \{t\} \otimes H_{2,j}^{(2)} \Big\} \oplus \Big\{ \bigoplus_{j=1}^{q} H_2^{(1)} \otimes H_{2,j}^{(2)} \Big\} \oplus \cdots \\
& \oplus H_{2,1}^{(2)} \otimes \cdots \otimes H_{2,q}^{(2)} \oplus \{t\} \otimes H_{2,1}^{(2)} \otimes \cdots \otimes H_{2,q}^{(2)} \oplus H_2^{(1)} \otimes H_{2,1}^{(2)} \otimes \cdots \otimes H_{2,q}^{(2)}.
\end{aligned}
\]
Here, the null space {1} ⊕ {t} represents the parametric main effect of t, H_2^{(1)} is the nonparametric main effect of t, H_{2,j}^{(2)} is the nonparametric main effect of u_j, {t} ⊗ H_{2,j}^{(2)} is the linear-nonparametric interaction between t and u_j, H_2^{(1)} ⊗ H_{2,j}^{(2)} is the nonparametric-nonparametric interaction between t and u_j, and so on; H_2^{(1)} ⊗ H_{2,1}^{(2)} ⊗ ⋯ ⊗ H_{2,q}^{(2)} is the nonparametric-nonparametric interaction between t and u, where u = (u_1, …, u_q). We denote φ_1(t, u) = 1 and φ_2(t, u) = k_1(t) as the basis functions of the null space H_0. For example, with q = 3, the RKs corresponding to the above sub-RKHSs are
\[
\begin{aligned}
&H_0 := \{1\} \oplus \{t\}, && K_0((t,u),(t',u')) = 1 + k_1(t)k_1(t'),\\
&H_1 := H_2^{(1)}, && K_1((t,u),(t',u')) = K_2^{(1)}(t,t'),\\
&H_{1+j} := H_{2,j}^{(2)}, && K_{1+j}((t,u),(t',u')) = K_{2,j}^{(2)}(u_j,u_j'),\\
&H_{4+j} := \{t\} \otimes H_{2,j}^{(2)}, && K_{4+j}((t,u),(t',u')) = k_1(t)k_1(t')\,K_{2,j}^{(2)}(u_j,u_j'),\\
&H_{7+j} := H_2^{(1)} \otimes H_{2,j}^{(2)}, && K_{7+j}((t,u),(t',u')) = K_2^{(1)}(t,t')\,K_{2,j}^{(2)}(u_j,u_j'),\\
&H_{8+j+l} := H_{2,j}^{(2)} \otimes H_{2,l}^{(2)}, && K_{8+j+l}((t,u),(t',u')) = K_{2,j}^{(2)}(u_j,u_j')\,K_{2,l}^{(2)}(u_l,u_l'),\\
&H_{11+j+l} := \{t\} \otimes H_{2,j}^{(2)} \otimes H_{2,l}^{(2)}, && K_{11+j+l}((t,u),(t',u')) = k_1(t)k_1(t')\,K_{2,j}^{(2)}(u_j,u_j')\,K_{2,l}^{(2)}(u_l,u_l'),\\
&H_{14+j+l} := H_2^{(1)} \otimes H_{2,j}^{(2)} \otimes H_{2,l}^{(2)}, && K_{14+j+l}((t,u),(t',u')) = K_2^{(1)}(t,t')\,K_{2,j}^{(2)}(u_j,u_j')\,K_{2,l}^{(2)}(u_l,u_l'),\\
&H_{20} := H_{2,1}^{(2)} \otimes H_{2,2}^{(2)} \otimes H_{2,3}^{(2)}, && K_{20}((t,u),(t',u')) = \prod_{j=1}^{3} K_{2,j}^{(2)}(u_j,u_j'),\\
&H_{21} := \{t\} \otimes H_{2,1}^{(2)} \otimes H_{2,2}^{(2)} \otimes H_{2,3}^{(2)}, && K_{21}((t,u),(t',u')) = k_1(t)k_1(t')\prod_{j=1}^{3} K_{2,j}^{(2)}(u_j,u_j'),\\
&H_{22} := H_2^{(1)} \otimes H_{2,1}^{(2)} \otimes H_{2,2}^{(2)} \otimes H_{2,3}^{(2)}, && K_{22}((t,u),(t',u')) = K_2^{(1)}(t,t')\prod_{j=1}^{3} K_{2,j}^{(2)}(u_j,u_j'),
\end{aligned}
\]
for j, l = 1, 2, 3 and j < l, where the left and right columns give the tensor product spaces and their corresponding RKs, respectively.
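The following Python sketch (again hypothetical; the names and grid choices are ours) assembles the 23 sub-kernels K_0, …, K_22 for q = 3 as products of the time kernels and the functional Gaussian kernels, mirroring the list above.

```python
import numpy as np

def k1(x): return x - 0.5
def k2(x): return 0.5 * (k1(x) ** 2 - 1.0 / 12.0)
def k4(x): return (k1(x) ** 4 - 0.5 * k1(x) ** 2 + 7.0 / 240.0) / 24.0
def K2_1(t, tp): return k2(t) * k2(tp) - k4(abs(t - tp))           # smooth time kernel
def K0(t, tp): return 1.0 + k1(t) * k1(tp)                          # null-space kernel

def gauss_K(u, up, s_grid):
    return np.exp(-0.5 * np.trapz((u - up) ** 2, s_grid))           # functional Gaussian kernel

def sub_kernels(t, tp, u, up, s_grid):
    """Return the list [K_0, ..., K_22] for q = 3, following the ANOVA decomposition."""
    g = [gauss_K(u[j], up[j], s_grid) for j in range(3)]            # K_{2,j}^{(2)}, j = 1, 2, 3
    pairs = [(0, 1), (0, 2), (1, 2)]
    K = [K0(t, tp), K2_1(t, tp)]                                    # K_0, K_1
    K += g                                                          # K_2..K_4: main effects of u_j
    K += [k1(t) * k1(tp) * gj for gj in g]                          # K_5..K_7: {t} x u_j
    K += [K2_1(t, tp) * gj for gj in g]                             # K_8..K_10: H_2^(1) x u_j
    K += [g[j] * g[l] for j, l in pairs]                            # K_11..K_13: u_j x u_l
    K += [k1(t) * k1(tp) * g[j] * g[l] for j, l in pairs]           # K_14..K_16
    K += [K2_1(t, tp) * g[j] * g[l] for j, l in pairs]              # K_17..K_19
    prod3 = g[0] * g[1] * g[2]
    K += [prod3, k1(t) * k1(tp) * prod3, K2_1(t, tp) * prod3]       # K_20..K_22
    return K

# toy evaluation at two design points
s = np.linspace(0, 1, 101)
u = [np.sin(2 * np.pi * s), np.cos(2 * np.pi * s), s ** 2]
up = [np.cos(np.pi * s), s, np.sqrt(s)]
print(len(sub_kernels(0.3, 0.7, u, up, s)))   # 23 sub-kernels
```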

3. Model Selection and Estimation

We let the projection of f onto H_0 be ∑_{k=1}^{2} d_k φ_k(t, u), with u = (u_1, …, u_q), and let {H_1, …, H_Q} be the sub-RKHSs generated by the tensor product construction in Section 2, where Q is the number of sub-RKHSs. L_1 penalties are applied to the coefficients d_k of the space H_0 and to the components of the decomposition of f (the projections of f onto H_v, v = 1, …, Q). We estimate f by minimizing the following penalized least squares (PLS):
\[ \frac{1}{n} \sum_{i=1}^{n} \int_0^1 \big( y_i(t) - f(t, x_i(t,\cdot)) \big)^2 dt + \lambda_1 \sum_{k=1}^{2} w_{1k} |d_k| + \lambda_2 \sum_{v=1}^{Q} w_{2,v} \| P_v f \|_H, \tag{5} \]
where f ∈ H, P_v is the projection operator onto H_v, ‖·‖_H is the norm induced by H, λ_1 and λ_2 are tuning parameters, and 0 ≤ w_{1k}, w_{2,v} < ∞ are pre-specified weights. We may set w_{11} = 0 when φ_1 = 1 to avoid penalizing the constant function.
Since the response function is a stochastic process in the L_2[0, 1] space, there exists a set of orthonormal basis functions {η_k(t), k = 1, 2, …} in L_2[0, 1], where {η_k(t), k = 1, …, n} are the empirical functional principal components (EFPCs) of {y_1(t), …, y_n(t)} ([27]). We let ν_{ik} = ⟨y_i(t), η_k(t)⟩ and L_{ik} f = ∫_0^1 f(t, x_i(t, ·)) η_k(t) dt for i = 1, …, n and k = 1, …, n, and we assume that the {L_{ik}} are bounded linear functionals. With EFPCs, functional data can be transformed into scalar data so that modeling and analysis can be conducted using traditional statistical methods. It can be shown that the PLS (5) based on the functional data y_i(t) reduces to the following PLS based on the scalar data {ν_{ik}}:
\[ \frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{n} ( \nu_{ik} - L_{ik} f )^2 + \lambda_1 \sum_{k=1}^{2} w_{1k} |d_k| + \lambda_2 \sum_{v=1}^{Q} w_{2,v} \| P_v f \|_H. \tag{6} \]
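As an illustration of how the scores ν_{ik} can be obtained in practice, here is a minimal Python sketch (assuming curves observed on a common uniform grid; the helper names are ours) that computes the EFPCs from the empirical covariance and the corresponding inner products.

```python
import numpy as np

def efpc_scores(Y, t_grid):
    """Empirical functional principal components (EFPC) and scores.

    Y is an (n, m) array holding curves y_i(t) sampled on a uniform t_grid.
    Returns the EFPC basis eta (m, n) and scores nu[i, k] = <y_i, eta_k>.
    """
    n, m = Y.shape
    dt = t_grid[1] - t_grid[0]                     # uniform grid spacing
    Yc = Y - Y.mean(axis=0)                        # centered curves
    C = Yc.T @ Yc * dt / n                         # discretized covariance operator (symmetric)
    evals, evecs = np.linalg.eigh(C)
    eta = evecs[:, ::-1][:, :n]                    # leading n eigenfunctions (columns)
    eta /= np.sqrt(np.sum(eta ** 2, axis=0) * dt)  # normalize to unit L2[0, 1] norm
    nu = Y @ eta * dt                              # nu_{ik} = int_0^1 y_i(t) eta_k(t) dt
    return eta, nu

# toy usage: five noisy sine curves on [0, 1]
t_grid = np.linspace(0.0, 1.0, 50)
rng = np.random.default_rng(0)
Y = np.sin(2 * np.pi * t_grid)[None, :] + 0.1 * rng.standard_normal((5, 50))
eta, nu = efpc_scores(Y, t_grid)
print(nu.shape)   # (5, 5)
```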
By Lemma 3.1 in Wang et al. [12], minimizing the PLS (6) is equivalent to minimizing the following PLS:
\[ \frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{n} ( \nu_{ik} - L_{ik} f )^2 + \lambda_1 \sum_{k=1}^{2} w_{1k} |d_k| + \tau_0 \sum_{v=1}^{Q} w_{2,v} \theta_v^{-1} \| P_v f \|_H^2 + \tau_1 \sum_{v=1}^{Q} w_{2,v} \theta_v, \tag{7} \]
subject to θ_v ≥ 0 for 1 ≤ v ≤ Q, where λ_1, τ_0 and τ_1 are tuning parameters.
We let H^* = H_1 ⊕ ⋯ ⊕ H_Q. To equip H^* with an RK that is a linear combination of the RKs of its subspaces, we define a new inner product on H^*,
\[ \langle f, g \rangle_* = \sum_{v=1}^{Q} w_{2,v} \theta_v^{-1} \langle P_v f, P_v g \rangle, \]
where ⟨·, ·⟩ is the inner product in H. Under the new inner product, the RK of H^* is
\[ K^*\big( (t,u), (t',u') \big) = \sum_{v=1}^{Q} w_{2,v}^{-1} \theta_v K_v\big( (t,u), (t',u') \big), \]
where the coefficient θ_v measures the contribution of the corresponding component of the decomposition to the model. Next, we use the reproducing property of the kernel function to transform the infinite-dimensional optimization problem (7) into a finite-dimensional one. We let H_{1n} = span{ ∫_0^1 K^*((t, x(t, ·)), (t', x_i(t', ·))) η_k(t') dt', i = 1, …, n, k = 1, …, n }, which is a subspace of H^*. Then, any f ∈ H can be decomposed as
\[ f = f_0 + f_{1n} + \rho, \]
where f_0 ∈ H_0, f_{1n} ∈ H_{1n}, and ρ ∈ H^* ⊖ H_{1n}. We denote
\[ K^*_{(t, x_i(t,\cdot))}\big( t', x(t',\cdot) \big) = K^*\big( (t', x(t',\cdot)), (t, x_i(t,\cdot)) \big) \]
as the evaluation function of the point (t, x_i(t, ·)), and we let f_1 = f_{1n} + ρ. Then, we can rewrite the PLS (7) as
\[
\begin{aligned}
& \frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{n} \Big( \nu_{ik} - u_{ik} - \big\langle f_1(t, x_i(t,\cdot)), \eta_k(t) \big\rangle \Big)^2 + \tau_0 \sum_{v=1}^{Q} w_{2,v} \theta_v^{-1} \| P_v f \|_{H^*}^2 + \lambda_1 \sum_{k=1}^{2} w_{1k} |d_k| + \tau_1 \sum_{v=1}^{Q} w_{2,v} \theta_v \\
={} & \frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{n} \Big( \nu_{ik} - u_{ik} - \big\langle \langle f_1, K^*_{(t, x_i(t,\cdot))} \rangle_{H^*}, \eta_k(t) \big\rangle \Big)^2 + \tau_0 \sum_{v=1}^{Q} w_{2,v} \theta_v^{-1} \| P_v f \|_{H^*}^2 + \lambda_1 \sum_{k=1}^{2} w_{1k} |d_k| + \tau_1 \sum_{v=1}^{Q} w_{2,v} \theta_v \\
={} & \frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{n} \Big( \nu_{ik} - u_{ik} - \big\langle f_1, \int_0^1 K^*_{(t, x_i(t,\cdot))} \eta_k(t)\, dt \big\rangle_{H^*} \Big)^2 + \tau_0 \sum_{v=1}^{Q} w_{2,v} \theta_v^{-1} \| P_v f \|_{H^*}^2 + \lambda_1 \sum_{k=1}^{2} w_{1k} |d_k| + \tau_1 \sum_{v=1}^{Q} w_{2,v} \theta_v \\
={} & \frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{n} \Big( \nu_{ik} - u_{ik} - \big\langle f_{1n}, \int_0^1 K^*_{(t, x_i(t,\cdot))} \eta_k(t)\, dt \big\rangle_{H^*} \Big)^2 + \tau_0 \sum_{v=1}^{Q} w_{2,v} \theta_v^{-1} \| P_v f_{1n} \|_{H^*}^2 + \tau_0 \sum_{v=1}^{Q} w_{2,v} \theta_v^{-1} \| P_v \rho \|_{H^*}^2 \\
& + \lambda_1 \sum_{k=1}^{2} w_{1k} |d_k| + \tau_1 \sum_{v=1}^{Q} w_{2,v} \theta_v,
\end{aligned} \tag{9}
\]
where u_{ik} = ∫_0^1 f_0(t, x_i(t, ·)) η_k(t) dt. The first equality uses the reproducing property, and the third equality uses the fact that ρ is orthogonal to H_{1n}. The minimizer of (9) must have ρ = 0, and we obtain the following representer theorem:
Theorem 1
(Representer Theorem). The solution to PLS (9) is
\[ f(t, x(t,\cdot)) = \sum_{j=1}^{2} d_j \varphi_j(t) + \sum_{v=1}^{Q} w_{2,v}^{-1} \theta_v \sum_{i=1}^{n} \sum_{k=1}^{n} c_{ik}\, \xi_{ik}^{v}(t, x(t,\cdot)), \tag{10} \]
where φ_1(t) = 1, φ_2(t) = k_1(t), and ξ_{ik}^{v}(t, x(t, ·)) = ∫_0^1 K_v((t, x(t, ·)), (t', x_i(t', ·))) η_k(t') dt'.
From this representer theorem, the PLS (9) reduces to
\[ \frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{n} \Big( \nu_{ik} - \sum_{j=1}^{2} a_{ikj} d_j - \sum_{v=1}^{Q} w_{2,v}^{-1} \theta_v \sum_{j=1}^{n} \sum_{l=1}^{n} c_{jl} b_{ikjl}^{v} \Big)^2 + \lambda_1 \sum_{k=1}^{2} w_{1k} |d_k| + \tau_0 \sum_{v=1}^{Q} w_{2,v}^{-1} \theta_v \sum_{i=1}^{n} \sum_{k=1}^{n} \sum_{j=1}^{n} \sum_{l=1}^{n} c_{ik} b_{ikjl}^{v} c_{jl} + \tau_1 \sum_{v=1}^{Q} w_{2,v} \theta_v, \tag{11} \]
where a_{ikj} = ∫_0^1 φ_j(t) η_k(t) dt, b_{ikjl} = ∑_{v=1}^{Q} w_{2,v}^{-1} θ_v b_{ikjl}^{v}, and b_{ikjl}^{v} = ∫_0^1 ξ_{jl}^{v}(t, x_i(t, ·)) η_k(t) dt. We let Σ_v be the n² × n² matrix whose (i + (k-1)n, j + (l-1)n)th element is b_{ikjl}^{v}, and Σ = ∑_{v=1}^{Q} w_{2,v}^{-1} θ_v Σ_v, so that the (i + (k-1)n, j + (l-1)n)th element of Σ is b_{ikjl}. We let Y_k = (ν_{1k}, …, ν_{nk})ᵀ, Y = (Y_1ᵀ, …, Y_nᵀ)ᵀ, c = (c_{11}, c_{21}, …, c_{nn})ᵀ, d = (d_1, d_2)ᵀ, w_2 = (w_{2,1}, …, w_{2,Q})ᵀ, and T be the n² × 2 matrix with a_{ikj} as its (i + (k-1)n, j)th element. Then, the PLS (11) reduces to
\[ \frac{1}{n} \| Y - T d - \Sigma c \|^2 + \lambda_1 \sum_{k=1}^{2} w_{1k} |d_k| + \tau_0\, c^\top \Sigma c + \tau_1\, w_2^\top \theta, \tag{12} \]
subject to θ_v ≥ 0, v = 1, …, Q.
The backfitting algorithm in Wang et al. [12] is applied to solve the PLS (12) as follows (Algorithm 1):
Algorithm 1 Model Selection Algorithm
  Set initial values d = d_0, θ = θ_0.
  repeat
      Update c by minimizing (1/n)‖Y - Td - Σc‖² + τ_0 cᵀΣc.
      Calculate Y* = Y - Rθ, where R is the n² × Q matrix whose v-th column is w_{2,v}^{-1} Σ_v c.
      Update d by minimizing (1/n)‖Y* - Td‖² + λ_1 ∑_{k=1}^{2} w_{1k}|d_k|.
      Select the tuning parameter M by k-fold cross-validation or the BIC method.
      Update θ by minimizing (1/n)‖Y - Td - Rθ‖² + τ_0 cᵀRθ subject to θ_v ≥ 0 for 1 ≤ v ≤ Q and w_2ᵀθ ≤ M.
  until c, d and θ converge
  return c, d and θ
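To make the updates concrete, the sketch below gives one possible Python implementation of Algorithm 1 (not the authors' code): the function signature, the small ridge term added purely for numerical stability, and the use of an off-the-shelf SLSQP solver for the constrained θ-step are all our assumptions, and the bound M is taken as given.

```python
import numpy as np
from scipy.optimize import minimize

def soft_threshold(z, a):
    return np.sign(z) * max(abs(z) - a, 0.0)

def backfit(Y, T, Sigma_list, n, w1, w2, lam1, tau0, M, n_iter=20):
    """One possible implementation of the backfitting updates in Algorithm 1.

    Y: score vector of length N = n^2; T: N x 2 null-space design matrix;
    Sigma_list: list of Q matrices Sigma_v (N x N); w1, w2: penalty weights;
    lam1, tau0: tuning parameters; M: bound on w2' theta (selected by CV/BIC elsewhere).
    """
    N, Q = Y.shape[0], len(Sigma_list)
    d, theta, c = np.zeros(2), np.ones(Q), np.zeros(N)
    for _ in range(n_iter):
        # c-step: minimize (1/n)||Y - T d - Sigma c||^2 + tau0 c' Sigma c
        Sigma = sum(theta[v] / w2[v] * Sigma_list[v] for v in range(Q))
        r = Y - T @ d
        A = Sigma @ Sigma / n + tau0 * Sigma + 1e-8 * np.eye(N)  # tiny ridge for stability
        c = np.linalg.solve(A, Sigma @ r / n)
        # express the kernel part as R theta and form the partial residual
        R = np.column_stack([Sigma_list[v] @ c / w2[v] for v in range(Q)])
        Y_star = Y - R @ theta
        # d-step: two-coefficient lasso via coordinate-wise soft thresholding
        for j in range(2):
            rj = Y_star - np.delete(T, j, axis=1) @ np.delete(d, j)
            d[j] = soft_threshold(T[:, j] @ rj, n * lam1 * w1[j] / 2.0) / (T[:, j] @ T[:, j])
        # theta-step: nonnegative update subject to w2' theta <= M
        def obj(th):
            res = Y - T @ d - R @ th
            return res @ res / n + tau0 * c @ (R @ th)
        cons = [{"type": "ineq", "fun": lambda th: M - w2 @ th}]
        theta = minimize(obj, theta, method="SLSQP",
                         bounds=[(0.0, None)] * Q, constraints=cons).x
    return c, d, theta
```

In practice, one would wrap this routine in an outer loop over candidate values of M (and of λ_1, τ_0), as in the cross-validation or BIC step of Algorithm 1.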

4. Statistical Properties

In this section, we assume that X and Y are complete measurable spaces. We let P be a probability measure on X^q × L_2(T) and M = T × X^q. Without loss of generality, we combine the penalty terms λ_1 ∑_{k=1}^{2} w_{1k}|d_k| and λ_2 ∑_{v=1}^{Q} w_{2,v}‖P_v f‖_H in (6) into a single term λ‖f‖_H.
We define a loss function,
\[ L(f; x, y) = \int_0^1 \big( y(t) - f(t, x(t,\cdot)) \big)^2 dt, \]
where y(t) ∈ Y and x ∈ X^q. The corresponding L-risk function (Steinwart and Christmann [28]) is
\[ R_{L,P}(f) = E_P\big[ L(f; x, y) \big]. \]
We let f^* = argmin_{f ∈ H} R_{L,P}(f), R^*_{L,P,H} = R_{L,P}(f^*), and
\[ f_{P,\lambda} = \arg\min_{f \in H} \big\{ R_{L,P}(f) + \lambda \| f \|_H \big\}. \]
Obviously, f̂ = f_{D,λ}, where D denotes the empirical distribution of the observed data. We state the convergence property in the following theorem and give its proof in Appendix B.
Theorem 2.
Assume that f : M → R is measurable for any f ∈ H, that M is a complete measurable space, and that |P|_2 = ∫_{X^q × L_2(T)} ‖y‖_2^2 dP(x, y) < ∞. When λ → 0 and λ^6 n → ∞ as n → ∞, we have
\[ \big| R_{L,P}(\hat f) - R^*_{L,P,H} \big| = O_p(\lambda). \]
Theorem 2 states that, as λ tends to 0 and λ^6 n tends to infinity with n, the estimate f̂ is L-risk consistent (Steinwart and Christmann [28]).

5. Simulation

In this section, numerical experiments are conducted to evaluate the performance of the proposed model selection approach. The functional covariate is x_i(t, ·) = (x_{i1}(t, ·), x_{i2}(t, ·)), where x_{ij}(t, ·) = cos(2π x*_{ij}(t, ·)) and x*_{ij}(t, ·) follows a Gaussian process with mean function μ(t) = t. The kernel function of the Gaussian process is the RBF kernel k_g(s_1, s_2) = exp(-(s_1 - s_2)²/2) for j = 1 and the rational quadratic kernel k_l(s_1, s_2) = 1/(1 + (s_1 - s_2)²) for j = 2. Three choices of f(t, x(t, ·)) are considered: for t ∈ [0, 1],
\[
\begin{aligned}
M1 &: \; f(t, x(t,\cdot)) = 1 + \frac{5\cos(2\pi t)}{3}, \\
M2 &: \; f(t, x(t,\cdot)) = 1 + 0.5t + 10\int_0^1 x_1^3(t,s)\, ds + 5\int_0^1 x_2^3(t,s)\, ds + 10\int_0^1 x_1^3(t,s)\, ds \int_0^1 x_2^3(t,s)\, ds, \\
M3 &: \; f(t, x(t,\cdot)) = 1 + 5\cos(2\pi t) + 10\int_0^1 x_1(t,s)\, x_2(t,s)\, ds.
\end{aligned}
\]
We see that M1 has only the main effect of t; M2 consists of three main effects and the nonparametric-nonparametric interaction of x_1 and x_2; and M3 consists of the main effect of t and the nonparametric-nonparametric interaction of x_1 and x_2. The random error ε_i(t) follows N(0, 0.2²) or N(0, 0.5²). All simulations are repeated 200 times.
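As an illustration of the covariate-generating mechanism, the following Python sketch (hypothetical; the grid size, jitter, and random seed are our choices) draws x*_{ij}(t, ·) as Gaussian process paths in s for a fixed t, with the two kernels above, and transforms them into the covariates.

```python
import numpy as np

def gp_sample(mean, cov, s_grid, rng):
    """Draw one Gaussian-process path on s_grid with the given mean and covariance functions."""
    S1, S2 = np.meshgrid(s_grid, s_grid, indexing="ij")
    K = cov(S1, S2) + 1e-8 * np.eye(len(s_grid))   # jitter for numerical stability
    return rng.multivariate_normal(mean(s_grid), K)

rbf = lambda s1, s2: np.exp(-((s1 - s2) ** 2) / 2.0)        # kernel for x_1*
rq = lambda s1, s2: 1.0 / (1.0 + (s1 - s2) ** 2)            # rational quadratic kernel for x_2*
mu = lambda s: s                                            # mean function mu(t) = t

rng = np.random.default_rng(1)
s_grid = np.linspace(0.0, 1.0, 50)
x1_star = gp_sample(mu, rbf, s_grid, rng)
x2_star = gp_sample(mu, rq, s_grid, rng)
x1, x2 = np.cos(2 * np.pi * x1_star), np.cos(2 * np.pi * x2_star)   # x_ij = cos(2*pi*x_ij*)
print(x1.shape, x2.shape)
```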
We generate n samples {y_i(t), x_i(t, ·): i = 1, …, n} as training data and n_t = 50 samples {ỹ_i(t), x̃_i(t, ·): i = 1, …, n_t} as test data. For comparison, we evaluate performance using the following root mean squared error (RMSE) on the test data:
\[ \mathrm{RMSE} = \sqrt{ \frac{1}{n_t} \sum_{i=1}^{n_t} \big\| f(t, \tilde{x}_i(t,\cdot)) - \hat{f}(t, \tilde{x}_i(t,\cdot)) \big\|_2^2 }, \]
where ‖·‖_2 is the norm of L_2(T).
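A possible Python helper for this criterion (hypothetical; it assumes the true and fitted surfaces are evaluated on a common time grid) is:

```python
import numpy as np

def rmse(f_true, f_hat, t_grid):
    """RMSE between true and fitted regression functions evaluated on the test curves.

    f_true, f_hat: (n_t, m) arrays of f(t, x_i(t, .)) and f_hat(t, x_i(t, .)) on t_grid.
    """
    sq_l2 = np.trapz((f_true - f_hat) ** 2, t_grid, axis=1)   # ||f - f_hat||_2^2 per test curve
    return np.sqrt(np.mean(sq_l2))

# toy check: identical surfaces give RMSE 0
t_grid = np.linspace(0, 1, 20)
F = np.outer(np.ones(5), np.sin(t_grid))
print(rmse(F, F, t_grid))   # 0.0
```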
The proposed model selection method, used to train the model and predict the test data, is denoted by L_1. As a comparison without model selection, we estimate the NCRM model with the L_2 penalty method, denoted by L_2, which minimizes the following objective function:
\[ \frac{1}{n} \sum_{i=1}^{n} \int_0^1 \big( y_i(t) - f(t, x_i(t,\cdot)) \big)^2 dt + \lambda \sum_{v=1}^{Q} \| P_v f \|_H^2, \]
where λ is the tuning parameter. After model selection, the selected model is re-estimated with the L_2 penalty method; this procedure is denoted by L_1 + L_2. Table 1 reports the average RMSEs and their standard deviations (in parentheses) for the three estimation methods L_1, L_2 and L_1 + L_2. Under models M1 and M3, L_1 + L_2 has the smallest RMSE among the three methods. Under model M2, L_1 performs better than L_2 and is comparable to L_1 + L_2. In addition, for all three methods, prediction performance improves as σ decreases and as the training sample size increases.
To evaluate the performance of model selection using the L_1 penalty method, we adopt three measures from Wang et al. [12]: specificity (SPE), sensitivity (SEN) and the F_1 score,
\[ \mathrm{SPE} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}}, \quad \mathrm{SEN} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \quad F_1 = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FN} + \mathrm{FP}}, \]
where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives, respectively. The non-zero components of the decomposition of the regression function are treated as positive cases; for a component with θ_v > 0, an estimated value larger than 0 counts as a true positive.
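A small Python sketch of these selection measures (hypothetical helper; components with estimated θ_v above a tolerance are declared selected) is:

```python
import numpy as np

def selection_scores(theta_true, theta_hat, tol=0.0):
    """Specificity, sensitivity and F1 for component selection based on theta."""
    pos_true = theta_true > 0
    pos_hat = theta_hat > tol
    tp = np.sum(pos_true & pos_hat)
    tn = np.sum(~pos_true & ~pos_hat)
    fp = np.sum(~pos_true & pos_hat)
    fn = np.sum(pos_true & ~pos_hat)
    spe = tn / (tn + fp)
    sen = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fn + fp)
    return spe, sen, f1

# toy example: 5 components, two truly active
print(selection_scores(np.array([1.0, 0, 0, 2.0, 0]),
                       np.array([0.8, 0, 0.1, 1.5, 0])))
```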
Table 2 shows the sensitivities, specificities, and F1 scores. Overall, the L_1 penalty method performs well across the different simulation settings. In addition, model selection improves as σ decreases and the training sample size increases.

6. Application

The proposed model selection approach is applied to analyze stroke rehabilitation data with 70 stroke survivors ([7]).
The data consist of 34 acute patients whose stroke occurred less than a month before enrollment and 36 chronic patients whose stroke occurred more than six months before enrollment. To improve upper limb function for stroke patients, a convenient home-based rehabilitation system using action video games with 3D-position movement behaviors has been developed [7,8]. The patients played the movement game at scheduled times. At each visit time t, the impairment level of subject i was assessed using the CAHAI (Chedoke Arm and Hand Activity Inventory) score, denoted as y_i(t), and movements such as the forward circular movement and the sawing movement were recorded. In this paper, three movements, the forward circular movement of the paretic limb in the x-axis direction (x_{i1} = LA05.lx), the sawing movement of the paretic limb in the y-axis direction (x_{i2} = LA09.ly), and the movement of the non-paretic limb in the x-axis direction (x_{i3} = LA28.rqx), are taken as functional covariates. For the purpose of illustrating the proposed method, we use the data from the acute patients. During the three-month study period, each acute patient received up to seven assessments, resulting in 173 observations. The CAHAI scores were normalized before analysis.
In this paper, we focus on interaction effects up to the second order and take the following decomposition:
\[
\begin{aligned}
K_0\big( (t,u), (t',u') \big) &= 1 + k_1(t) k_1(t'), \\
K_1\big( (t,u), (t',u') \big) &= K_2^{(1)}(t,t'), \\
K_{1+j}\big( (t,u), (t',u') \big) &= K_{2,j}^{(2)}(u_j, u_j'), \\
K_{4+j}\big( (t,u), (t',u') \big) &= k_1(t) k_1(t') K_{2,j}^{(2)}(u_j, u_j') + K_2^{(1)}(t,t') K_{2,j}^{(2)}(u_j, u_j'), \\
K_{5+j+l}\big( (t,u), (t',u') \big) &= K_{2,j}^{(2)}(u_j, u_j') K_{2,l}^{(2)}(u_l, u_l'),
\end{aligned}
\]
for j, l = 1, 2, 3 and j < l. Readers can also choose other SS ANOVA decompositions by merging kernel functions according to their own needs. From Section 3, we have
\[ K^*\big( (t,u), (t',u') \big) = \sum_{v=1}^{10} w_{2,v}^{-1} \theta_v K_v\big( (t,u), (t',u') \big), \]
where the coefficient θ_v for the kernel function K_v gives the level of contribution of K_v to the overall model.
The L_1-regularized penalty method for model selection is applied to the stroke rehabilitation data. The parameters {θ_v} are computed, and those with values larger than 0 are θ_2 = 4.157, θ_3 = 0.819, θ_4 = 0.636, θ_7 = 0.592 and θ_10 = 0.741. This shows that the main effects of x_{i1}(t, ·), x_{i2}(t, ·) and x_{i3}(t, ·), the linear-nonparametric interaction of t and x_3(t, ·), and the nonparametric-nonparametric interaction of x_2(t, ·) and x_3(t, ·) have nonzero contributions to the CAHAI score. Thus, the three movements, the forward circular movement of the paretic limb and the sawing movements of the paretic and non-paretic limbs, may be helpful to the recovery of stroke patients. In addition, the interaction of the sawing movements of the paretic and non-paretic limbs may contribute to the level of dependence in daily life or the impairment of upper limb function. Figure 1 plots the estimates of the nonparametric regression functions for four stroke patients. The fitted regression function in the NCRM model follows the same trend as the CAHAI scores and, on the whole, both show an increasing trend with fluctuations, suggesting that the movements may improve upper limb function for stroke patients.
The prediction performance of the proposed method is evaluated using tenfold cross-validation,
\[ \mathrm{RPE} = \frac{1}{10} \sum_{j=1}^{10} \frac{1}{n_j} \sum_{i \in j\text{th fold}} \big\| Y_i(t) - \hat{Y}_i^{(j)}(t) \big\|_2^2, \]
where Ŷ_i^{(j)}(t) is the predicted value of Y_i(t) based on the selected model fitted with the L_1 + L_2 penalty to the data excluding the jth fold, and n_j is the number of subjects in the jth fold. The RPE for the stroke data is 1.0690, which is smaller than the 1.1700 obtained from the method of Zhai et al. [9].
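A possible sketch of this tenfold procedure in Python (hypothetical; it assumes scikit-learn for the fold splits and a user-supplied fit-and-predict routine) is:

```python
import numpy as np
from sklearn.model_selection import KFold

def rpe(Y, t_grid, fit_predict, n_splits=10, seed=0):
    """Tenfold cross-validated prediction error for functional responses.

    Y: (n, m) array of response curves on t_grid; fit_predict(train_idx, test_idx)
    is assumed to return the predicted curves for the test indices.
    """
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold_errors = []
    for train_idx, test_idx in kf.split(Y):
        Y_hat = fit_predict(train_idx, test_idx)
        sq_l2 = np.trapz((Y[test_idx] - Y_hat) ** 2, t_grid, axis=1)
        fold_errors.append(np.mean(sq_l2))
    return np.mean(fold_errors)

# toy usage with a "predict the training mean" baseline
t_grid = np.linspace(0, 1, 30)
Y = np.sin(2 * np.pi * t_grid)[None, :] + 0.1 * np.random.default_rng(2).standard_normal((40, 30))
baseline = lambda tr, te: np.tile(Y[tr].mean(axis=0), (len(te), 1))
print(rpe(Y, t_grid, baseline))
```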

7. Conclusions

For functional data with functional covariate inputs, this paper uses a Gaussian kernel to construct a tensor product RKHS for the regression function, which leads to a nonparametric concurrent regression model. The L_1 penalty method is used to detect the components of the SS ANOVA decomposition of the regression function that make nonzero contributions to the model fit, and a backfitting algorithm is developed for estimation and model selection. The proposed method is applied to stroke rehabilitation data, and the results show that, besides the main effects, there are interaction effects of the movements on the CAHAI score. This indicates that the movements may help improve the level of daily life dependence or the impairment of upper limb function of a stroke patient.

Author Contributions

Conceptualization, Z.W. and Y.W.; methodology, Z.W. and R.P.; data curation, R.P.; writing draft preparation, R.P.; writing and editing, Z.W. and Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant numbers 11971457 and 12201601) and the Anhui Provincial Natural Science Foundation (grant number 2208085).

Data Availability Statement

The data presented in this study are available on reasonable request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
SS ANOVA    Smoothing spline analysis of variance
RKHS        Reproducing kernel Hilbert space
NCRM        Nonparametric concurrent regression model

Appendix A. Tensor Product and Reproducing Kernel Hilbert Space

In this section, we provide a brief description of tensor product space, reproducing kernel Hilbert space (RKHS) and SS ANOVA.
Tensor Product Space. A tensor product space is formed from multiple vector spaces. For two vector spaces, denoted by V and W, the tensor product space V ⊗ W is the space spanned by the elements v ⊗ w with v ∈ V and w ∈ W. For details, please see Lin [29].
Reproducing Kernel Hilbert Space (RKHS). A reproducing kernel is a kernel function with the reproducing property, and a reproducing kernel Hilbert space is a Hilbert space that possesses a reproducing kernel. Mathematically, for a reproducing kernel K and its induced RKHS H, the reproducing property is f(x) = ⟨f, K(x, ·)⟩, where f is in H and x is an input variable. An RKHS provides an effective tool for modeling nonlinear relationships and handling high-dimensional data. In the context of regression, an RKHS is utilized as the foundation for model selection and estimation. For details, please see Wainwright [30].
Smoothing spline analysis of variance. Smoothing spline analysis of variance (SS ANOVA) is a powerful tool that combines the strengths of smoothing splines and analysis of variance to facilitate the simultaneous exploration of main effects and interactions among variables. SS ANOVA is an important and useful method for modeling nonlinear relationships within the regression framework [31,32,33,34,35,36]. For example, Wahba [31] presented theory and applications of smoothing spline models, with a special focus on function estimation from noisy functional data, including univariate smoothing splines, multidimensional thin-plate splines, splines on the sphere, additive splines, and interaction splines. Furthermore, Wahba et al. [32] extended the SS ANOVA model to exponential family distributions and used the developed method to estimate the risk of diabetic retinopathy progression. In addition, Gao et al. [34] combined an SS ANOVA model with a log-linear model to fit multivariate Bernoulli data.
To illustrate the SS ANOVA approach, we consider a nonparametric model represented as follows:
\[ y = f(x_1, \ldots, x_p) + \epsilon, \tag{A1} \]
where f is an unknown smooth function and ε is an error term. Applying SS ANOVA to model (A1), we decompose f as follows:
\[ f(x_1, \ldots, x_p) = \mu + \sum_{i=1}^{p} f_i(x_i) + \sum_{i<j} f_{i,j}(x_i, x_j) + \cdots + f_{1,\ldots,p}(x_1, \ldots, x_p). \tag{A2} \]
In this decomposition, μ denotes the overall mean; the functions f_1(x_1), f_2(x_2), …, f_p(x_p) capture the main effects in model (A1); the functions f_{i,j}(x_i, x_j) describe the interactions between the variables x_i and x_j; and so forth. One way to model these functions is to use smoothing splines, such as cubic splines.
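For instance (an illustrative special case, with side conditions stated here in one common form rather than taken from the paper), with p = 2 the decomposition reads
\[ f(x_1, x_2) = \mu + f_1(x_1) + f_2(x_2) + f_{1,2}(x_1, x_2), \]
where uniqueness of the components is typically ensured by side conditions such as requiring each f_i, and each marginal average of f_{1,2}, to average to zero.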

Appendix B. Proof of Theorem 2

From the triangle inequality, we have
\[ \big| R_{L,P}(\hat f) - R^*_{L,P,H} \big| \le \big| R_{L,P}(\hat f) - R_{L,P}(f_{P,\lambda}) \big| + \big| R_{L,P}(f_{P,\lambda}) - R^*_{L,P,H} \big|. \]
Hence, we separately derive the convergence rates of |R_{L,P}(f̂) - R_{L,P}(f_{P,λ})| and |R_{L,P}(f_{P,λ}) - R^*_{L,P,H}|.
Since |P|_2 and ‖K_H‖_H are bounded, without loss of generality we assume that q = 1, |P|_2 = 1, and ‖K_H‖_H = 1, where K_H is the RK of H; then R_{L,P}(0) ≤ |P|_2 = 1. From the proof of Theorem 3 in Zhai et al. [9], we have
\[ \big| R_{L,P}(\hat f) - R_{L,P}(f_{P,\lambda}) \big| \le c \big( 2 + \| \hat f \|_\infty + \| f_{P,\lambda} \|_\infty \big) \big\| \hat f - f_{P,\lambda} \big\|_\infty, \]
where c is an indeterminate constant depending on L.
We know that
\[ \lambda \| f_{P,\lambda} \|_H \le \inf_{f \in H} \big\{ R_{L,P}(f) + \lambda \| f \|_H \big\} \le R_{L,P}(0) \le 1. \]
Hence, when ‖f_{P,λ} - f̂‖_H ≤ 1, we have ‖f_{P,λ}‖_∞ ≤ ‖K_H‖_H ‖f_{P,λ}‖_H ≤ cλ^{-1} and ‖f̂‖_∞ ≤ ‖f_{P,λ}‖_∞ + ‖f_{P,λ} - f̂‖_∞ ≤ cλ^{-1} + 1. Therefore, we have
\[ \big| R_{L,P}(\hat f) - R_{L,P}(f_{P,\lambda}) \big| \le c \lambda^{-1} \big\| f_{P,\lambda} - \hat f \big\|_H. \]
Meanwhile,
\[ \lambda \| f_{P,\lambda} \|_H + R_{L,P}(f_{P,\lambda}) - R^*_{L,P,H} = \inf_{f \in H} \big\{ \lambda \| f \|_H + R_{L,P}(f) - R^*_{L,P,H} \big\} \le \lambda \| f^* \|_H + R_{L,P}(f^*) - R^*_{L,P,H}, \]
which shows that
\[ R_{L,P}(f_{P,\lambda}) - R^*_{L,P,H} \le \lambda \big( \| f^* \|_H - \| f_{P,\lambda} \|_H \big) \le c_2 \lambda, \]
where c_2 > 0 is a constant. Taking the Fréchet derivative of R_{L,P}(f) + λ‖f‖_H with respect to f and setting it to zero, we have
\[ \lambda f_{P,\lambda} - E_P\big[ ( y - f_{P,\lambda}(x) )\, \Phi \big] = 0, \]
where Φ((t, x(t, ·))) = K_H(·, (t, x(t, ·))) is the canonical feature map. We let h(x, y) = 2(y - f_{P,λ}(x)). Following the proof of Theorem 5.9 in [28], we can show that
\[ \big\langle f_{\bar{P},\lambda} - f_{P,\lambda},\, E_{\bar{P}}[h\Phi] - E_P[h\Phi] \big\rangle + \lambda \big\| f_{P,\lambda} - f_{\bar{P},\lambda} \big\|_H^2 \le 0, \tag{A3} \]
where P̄ is any distribution defined on X^q × L_2(T). According to (A3), we know that
\[ \lambda \big\| f_{P,\lambda} - f_{\bar{P},\lambda} \big\|_H^2 \le \big\langle f_{P,\lambda} - f_{\bar{P},\lambda},\, E_{\bar{P}}[h\Phi] - E_P[h\Phi] \big\rangle \le \big\| f_{P,\lambda} - f_{\bar{P},\lambda} \big\|_H \cdot \big\| E_{\bar{P}}[h\Phi] - E_P[h\Phi] \big\|_H, \]
which indicates that
\[ \big\| f_{P,\lambda} - f_{\bar{P},\lambda} \big\|_H \le \frac{1}{\lambda} \big\| E_{\bar{P}}[h\Phi] - E_P[h\Phi] \big\|_H. \]
Letting P̄ = D and using Lemma 9.2 of Steinwart and Christmann [28], we have
\[ P\Big( R_{L,P}(\hat f) - R^*_{L,P,H} \ge \epsilon \Big) \le P\Big( \frac{c}{\lambda^2} \big\| E_P[h\Phi] - E_D[h\Phi] \big\|_H + c_2 \lambda > \epsilon \Big) \le O\big( n^{-1} \lambda^{-6} \big), \]
with ε = O(λ). Hence, we obtain that R_{L,P}(f̂) - R^*_{L,P,H} = O_p(λ).

References

  1. Aue, A.; Rice, G.; Sönmez, O. Detecting and dating structural breaks in functional data without dimension reduction. J. R. Stat. Soc. Ser. B Stat. Methodol. 2018, 80, 509–529. [Google Scholar] [CrossRef]
  2. Aristizabal, J.P.; Giraldo, R.; Mateu, J. Analysis of variance for spatially correlated functional data: Application to brain data. Spat. Stat. 2019, 32, 100381. [Google Scholar] [CrossRef]
  3. Slaoui, Y. Recursive nonparametric regression estimation for independent functional data. Stat. Sin. 2020, 30, 417–437. [Google Scholar] [CrossRef]
  4. Yao, F.; Yang, Y. Online estimation for functional data. J. Am. Stat. Assoc. 2021, 1–15. [Google Scholar]
  5. Smida, Z.; Cucala, L.; Gannoun, A.; Durif, G. A median test for functional data. J. Nonparametric Stat. 2022, 34, 520–553. [Google Scholar] [CrossRef]
  6. De Silva, J.; Abeysundara, S. Functional Data Analysis on Global COVID-19 Data. Asian J. Probab. Stat. 2023, 21, 12–28. [Google Scholar] [CrossRef]
  7. Serradilla, J.; Shi, J.; Cheng, Y.; Morgan, G.; Lambden, C.; Eyre, J.A. Automatic assessment of upper limb function during play of the action video game, circus challenge: Validity and sensitivity to change. In Proceedings of the 2014 IEEE 3rd International Conference on Serious Games and Applications for Health (SeGAH), Rio de Janeiro, Brazil, 14–16 May 2014; pp. 1–7. [Google Scholar]
  8. Shi, J.Q.; Cheng, Y.; Serradilla, J.; Morgan, G.; Lambden, C.; Ford, G.A.; Price, C.; Rodgers, H.; Cassidy, T.; Rochester, L.; et al. Evaluating functional ability of upper limbs after stroke using video game data. In Proceedings of the Brain and Health Informatics: International Conference, BHI 2013, Maebashi, Japan, 29–31 October 2013; pp. 181–192. [Google Scholar]
  9. Zhai, Y.; Wang, Z.; Wang, Y. A nonparametric concurrent regression model with multivariate functional inputs. Stat. Its Interface 2023, to appear. [Google Scholar]
  10. Wang, Y. Smoothing Splines: Methods and Applications; Chapman and Hall: New York, NY, USA, 2011. [Google Scholar]
  11. Gu, C. Smoothing Spline ANOVA Models, 2nd ed.; Springer: New York, NY, USA, 2013. [Google Scholar]
  12. Wang, Z.; Dong, H.; Ma, P.; Wang, Y. Estimation and model selection for nonparametric function-on-function regression. J. Comput. Graph. Stat. 2022, 31, 835–845. [Google Scholar] [CrossRef]
  13. Vapnik, V.; Izmailov, R. Rethinking statistical learning theory: Learning using statistical invariants. Mach. Learn. 2019, 108, 381–423. [Google Scholar] [CrossRef]
  14. Hsu, H.L.; Ing, C.K.; Tong, H. On model selection from a finite family of possibly misspecified time series models. Ann. Stat. 2019, 47, 1061–1087. [Google Scholar] [CrossRef]
  15. Guo, W. Inference in smoothing spline analysis of variance. J. R. Stat. Soc. Ser. B Stat. Methodol. 2002, 64, 887–898. [Google Scholar] [CrossRef]
  16. Yuan, M.; Lin, Y. Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B Stat. Methodol. 2006, 68, 49–67. [Google Scholar] [CrossRef]
  17. Olusegun, A.M.; Dikko, H.G.; Gulumbe, S.U. Identifying the limitation of stepwise selection for variable selection in regression analysis. Am. J. Theor. Appl. Stat. 2015, 4, 414–419. [Google Scholar] [CrossRef]
  18. Malik, N.A.M.; Jamshaid, F.; Yasir, M.; Hussain, A.A.N. Time series Model selection via stepwise regression to predict GDP Growth of Pakistan. Indian J. Econ. Bus. 2021, 20, 1881–1894. [Google Scholar]
  19. Untadi, A.; Li, L.D.; Li, M.; Dodd, R. Modeling Socioeconomic Determinants of Building Fires through Backward Elimination by Robust Final Prediction Error Criterion. Axioms 2023, 12, 524. [Google Scholar] [CrossRef]
  20. Radman, M.; Chaibakhsh, A.; Nariman-zadeh, N.; He, H. Generalized sequential forward selection method for channel selection in EEG signals for classification of left or right hand movement in BCI. In Proceedings of the 2019 9th International Conference on Computer and Knowledge Engineering (ICCKE), Mashhad, Iran, 24–25 October 2019; pp. 137–142. [Google Scholar]
  21. Storlie, C.B.; Bondell, H.D.; Reich, B.J.; Zhang, H.H. Surface estimation, variable selection, and the nonparametric oracle property. Stat. Sin. 2011, 21, 679–705. [Google Scholar] [CrossRef]
  22. Lin, Y.; Zhang, H.H. Component selection and smoothing in multivariate nonparametric regression. Ann. Stat. 2006, 34, 2272–2297. [Google Scholar] [CrossRef]
  23. Zhang, H.H.; Wahba, G.; Lin, Y.; Voelker, M.; Ferris, M.; Klein, R.; Klein, B. Variable selection and model building via likelihood basis pursuit. J. Am. Stat. Assoc. 2004, 99, 659–672. [Google Scholar] [CrossRef]
  24. Dong, H.; Wang, Y. Nonparametric Neighborhood Selection in Graphical Models. J. Mach. Learn. Res. 2022, 23, 1–36. [Google Scholar]
  25. Dong, H. Nonparametric Learning Methods for Graphical Models; University of California: Santa Barbara, CA, USA, 2022. [Google Scholar]
  26. Ren, J.; Zhou, F.; Li, X.; Chen, Q.; Zhang, H.; Ma, S.; Jiang, Y.; Wu, C. Semiparametric Bayesian variable selection for gene-environment interactions. Stat. Med. 2020, 39, 617–638. [Google Scholar] [CrossRef]
  27. Hsing, T.; Eubank, R. Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators; John Wiley & Sons: Hoboken, NJ, USA, 2015; p. 997. [Google Scholar]
  28. Steinwart, I.; Christmann, A. Support Vector Machines; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2008. [Google Scholar]
  29. Lin, Y. Tensor product space ANOVA models. Ann. Stat. 2000, 28, 734–755. [Google Scholar] [CrossRef]
  30. Wainwright, M.J. High-Dimensional Statistics: A Non-Asymptotic Viewpoint; Cambridge University Press: Cambridge, UK, 2019; Volume 48. [Google Scholar]
  31. Wahba, G. Spline Models for Observational Data; SIAM: Philadelphia, PA, USA, 1990. [Google Scholar]
  32. Wahba, G.; Wang, Y.; Gu, C.; Klein, R.; Klein, B. Smoothing spline ANOVA for exponential families, with application to the Wisconsin Epidemiological Study of Diabetic Retinopathy: The 1994 Neyman Memorial Lecture. Ann. Stat. 1995, 23, 1865–1895. [Google Scholar] [CrossRef]
  33. Gu, C.; Wahba, G. Smoothing spline ANOVA with component-wise Bayesian “confidence intervals”. J. Comput. Graph. Stat. 1993, 2, 97–117. [Google Scholar]
  34. Gao, F.; Wahba, G.; Klein, R.; Klein, B. Smoothing spline ANOVA for multivariate Bernoulli observations with application to ophthalmology data. J. Am. Stat. Assoc. 2001, 96, 127–160. [Google Scholar] [CrossRef]
  35. Guo, W.; Dai, M.; Ombao, H.C.; Von Sachs, R. Smoothing spline ANOVA for time-dependent spectral analysis. J. Am. Stat. Assoc. 2003, 98, 643–652. [Google Scholar] [CrossRef]
  36. Chiu, C.Y.; Liu, A.; Wang, Y. Smoothing spline mixed-effects density models for clustered data. Stat. Sin. 2020, 30, 397–416. [Google Scholar] [CrossRef]
Figure 1. CAHAI scores (red line) and their corresponding fitted values (blue line) for 4 patients.
Table 1. Average values and standard deviations of RMSEs. (For each setting, the best-performing method is the one with the smallest average RMSE.)
n    σ    Model   L_1               L_2               L_1 + L_2
20   0.2  M1      0.2906 (0.1386)   0.3010 (0.0287)   0.0981 (0.0420)
20   0.2  M2      0.8876 (0.1147)   0.9577 (0.0782)   0.9493 (0.0707)
20   0.2  M3      1.2065 (0.8556)   0.9423 (0.1713)   0.7108 (0.6317)
20   0.5  M1      0.3027 (0.1323)   0.4238 (0.0654)   0.1205 (0.0591)
20   0.5  M2      0.9486 (0.1166)   1.0039 (0.0760)   0.9996 (0.0794)
20   0.5  M3      1.3906 (0.9134)   1.0044 (0.1952)   0.8008 (0.5486)
40   0.2  M1      0.1655 (0.0127)   0.2390 (0.0416)   0.0792 (0.0113)
40   0.2  M2      0.7722 (0.0517)   0.8088 (0.0877)   0.7928 (0.0630)
40   0.2  M3      0.6664 (0.2255)   0.5773 (0.0456)   0.3968 (0.0423)
40   0.5  M1      0.1913 (0.0944)   0.3257 (0.0860)   0.0913 (0.0179)
40   0.5  M2      0.8605 (0.0678)   0.8918 (0.0856)   0.8819 (0.0746)
40   0.5  M3      0.7824 (0.2060)   0.6869 (0.0420)   0.4952 (0.0764)
80   0.2  M1      0.1358 (0.0860)   0.1416 (0.0238)   0.0737 (0.0076)
80   0.2  M2      0.6040 (0.0620)   0.6599 (0.0894)   0.6595 (0.0708)
80   0.2  M3      0.3227 (0.0218)   0.3793 (0.0237)   0.2635 (0.0248)
80   0.5  M1      0.1635 (0.1412)   0.2419 (0.0704)   0.0844 (0.0206)
80   0.5  M2      0.7338 (0.0552)   0.7589 (0.0696)   0.7578 (0.0605)
80   0.5  M3      0.4080 (0.1488)   0.5511 (0.0475)   0.3566 (0.0470)
Table 2. Average values and standard deviations of SPE, SEN, F1 scores under models M1, M2, M3.
n    σ    Model   SPE               SEN               F1
20   0.2  M1      0.9956 (0.0291)   0.9800 (0.1404)   0.9783 (0.1421)
20   0.2  M2      0.9700 (0.0601)   0.7017 (0.1472)   0.7883 (0.1166)
20   0.2  M3      0.9906 (0.0366)   0.9850 (0.1219)   0.9592 (0.1527)
20   0.5  M1      0.9961 (0.0233)   0.9850 (0.1219)   0.9800 (0.1279)
20   0.5  M2      0.9786 (0.0511)   0.6917 (0.1529)   0.7881 (0.1201)
20   0.5  M3      0.9889 (0.0386)   0.9850 (0.1219)   0.9553 (0.1543)
40   0.2  M1      1.0000 (0.0000)   1.0000 (0.0000)   1.0000 (0.0000)
40   0.2  M2      0.9614 (0.0667)   0.9817 (0.0762)   0.9512 (0.0872)
40   0.2  M3      0.9833 (0.0428)   1.0000 (0.0000)   0.9517 (0.1212)
40   0.5  M1      0.9989 (0.0111)   0.9950 (0.0707)   0.9933 (0.0744)
40   0.5  M2      0.9407 (0.0720)   0.9600 (0.1086)   0.9177 (0.1060)
40   0.5  M3      0.9844 (0.0417)   1.0000 (0.0000)   0.9550 (0.1178)
80   0.2  M1      0.9983 (0.0175)   1.0000 (0.0000)   0.9958 (0.0424)
80   0.2  M2      0.9743 (0.0737)   1.0000 (0.0000)   0.9635 (0.0701)
80   0.2  M3      1.0000 (0.0000)   1.0000 (0.0000)   1.0000 (0.0000)
80   0.5  M1      0.9956 (0.0291)   0.9850 (0.1219)   0.9817 (0.1259)
80   0.5  M2      0.9546 (0.0853)   1.0000 (0.0000)   0.9317 (0.1031)
80   0.5  M3      0.9983 (0.0135)   1.0000 (0.0000)   0.9950 (0.0406)