Article

High-Dimensional Statistics: Non-Parametric Generalized Functional Partially Linear Single-Index Model

1 Ecole Nationale des Sciences Appliquées, Université Cadi Ayyad, Marrakech 40000, Morocco
2 Laboratoire AGEIS EA 7407, Université Grenoble Alpes, AGIM Team, UFR SHS, BP. 47, CEDEX 09, 38040 Grenoble, France
3 Institut de Mathématiques de Toulouse, Université Paul Sabatier, CEDEX 09, 31062 Toulouse, France
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(15), 2704; https://doi.org/10.3390/math10152704
Submission received: 27 June 2022 / Revised: 18 July 2022 / Accepted: 26 July 2022 / Published: 30 July 2022

Abstract: We study the non-parametric estimation of partially linear generalized single-index functional models, where the systematic component of the model has a flexible functional semi-parametric form with a general link function. We suggest an efficient and practical approach to estimating (I) the single-index link function, (II) the single-index coefficients, as well as (III) the non-parametric functional component of the model. The estimation procedure combines quasi-likelihood, polynomial splines and kernel smoothing. We then derive the asymptotic properties, with rates, of the estimators of each component of the model. Their asymptotic normality is also established. By making use of the spline approximation and the Fisher scoring algorithm, we show that our approach has numerical advantages in terms of practical efficiency and computational stability. A computational study on data is provided to illustrate the good practical behavior of our methodology.

1. Introduction

Generalized linear models (GLM) encompass several parametric regression models by positing a parametric relationship between the mean response and certain covariates through a link function that is often canonical or assumed known; see [1,2]. This is not always desirable, because in practice the link function may be unknown or more complicated.
Several models have been developed to overcome this problem, such as non-parametric and semi-parametric regression models. However, the curse of dimensionality arises and limits the use of such models, and efforts have been made to circumvent this difficulty. The remedy proceeds along two lines: the approximation of the link functions and the reduction of dimension. To this end, the generalized additive model (GAM), in which the non-parametric component is replaced by a sum of univariate functions, was recommended by Hastie et al. [3] and detailed by Wood [4]. The main criticism of this type of model is that it does not take into account the interactions between the explanatory variables. Thus, the single-index model (SIM) was developed by Härdle et al. [5] and Hristache et al. [6], because it makes it possible to reduce the dimension and to soften restrictive parametric assumptions by replacing several covariates by a linear combination of them. Subsequently, the partially linear single-index model (PLSIM), which makes it possible to model discrete explanatory variables in the linear part, was developed by Liang et al. [7] and by Chen et al. [8] for the case of longitudinal data. Generalized partially linear single-index (GPLSIM) models based on kernel smoothing to estimate the single-index link function first appeared in the work of Carroll et al. [9], while Cao and Wang [10] used penalized spline smoothing of the quasi-likelihood together with the Fisher scoring technique, which is theoretically more reliable and relevant.
Notice also that some covariates may be functional and are not taken into account by these models. It should be remembered that several works have focused on the study of functional variables (see, for example, Ramsay and Silverman [11] and Ferraty and Vieu [12]). Note also that semi-functional partial linear regression was studied by Aneiros-Perez and Vieu [13], and then partial linear modeling with multi-functional covariates by Aneiros-Perez and Vieu [14]. We can also refer to several works on this subject, such as Horvath and Kokoszka [15], Kokoszka and Reimherr [16], Schumaker [17], Ould-Said et al. [18], Ouassou and Rachdi [19,20], Laksaci et al. [21], Cao et al. [22], Li and Lu [23] and Yao et al. [24]. We specifically cite, for example, Yu et al. [25] for the study of the partially functional linear single-index regression model and Yu and Ruppert [26] for a comprehensive review of the penalized spline smoothing methodology for the PLSIM, in which the underlying regression function is assumed to be a spline function with a fixed number of knots. Partially linear generalized single-index models for functional data (PLGSIMF) have been studied by Rachdi et al. [27] and Alahiane et al. [28] using B-spline expansions and the quasi-likelihood function, where the functional part is linear.
In this paper, we study the generalized non-parametric functional partially linear single-index model (GNPFPLSIM), in which functional covariates are taken into account. Notice that in this model (I) the link function is unknown, (II) the number of knots increases with the sample size, and (III) the functional regression component is estimated jointly with the unknown link function and the non-parametric single-index function, using an iterative algorithm based on spline smoothing and on the maximization of the quasi-likelihood function.
We use Fisher's scoring algorithm to solve the maximization problem iteratively. We also provide a generalized cross-validation method to select the number of knots in the spline approximation, and we use kernel methods for the functional data.
We also provide the convergence rates of our different estimators of the different parameters of GNPFPLSIM.
The rest of this paper is organized as follows. In Section 2 and Section 3, we present some preliminaries, develop the estimation methodology and describe an iterative algorithm, based on the maximization of the quasi-likelihood function, for computing the proposed estimators. Some asymptotic properties of the proposed estimators are given in Section 4. A simulation study and an application to real data are presented in Section 5. The technical lemmas allowing us to prove Theorems 1, 2 and 3 are presented in Appendix A.
Notice finally that in order to save space, proofs of various results obtained are grouped in a supplementary file to this paper.

2. Some Preliminaries

Let Y be a scalar response variable and $(X, Z) \in \mathbb{R}^d \times \mathcal{H}$ be the predictor vector, where $X = (X_1, \ldots, X_d)^\top$ and Z belongs to $\mathcal{H}$, where $(\mathcal{H}, \delta)$ is a semi-metric space of functions defined on $[0, 1]$; i.e., Z is a functional random variable and δ is a semi-metric.
For a fixed $(x, z) \in \mathbb{R}^d \times \mathcal{H}$, we assume that the conditional density function of the response Y given $(X, Z) = (x, z)$ belongs to a canonical exponential family, which is given by
\[ f_{Y \mid X = x,\, Z = z}(y) = \exp\left\{ y\, \xi(x, z) - B(\xi(x, z)) + C(y) \right\}, \tag{1} \]
where B and C are two known functions, and where ξ is the unknown natural parameter of the generalized parametric linear models, which is linked to the dependent variable by
\[ \mu(x, z) = \mathbb{E}\left[ Y \mid X = x, Z = z \right] = B'(\xi(x, z)), \]
where $B'$ denotes the first derivative of the function B (see [10,29]).
In what follows, we model $g(\mu(x, z))$ as a generalized non-parametric functional partially linear single-index model (GNPFPLSIM):
\[ g(\mu(x, z)) = \eta_0(\alpha^\top x) + r(z), \]
where $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_d)^\top \in \mathbb{R}^d$ is the single-index coefficient vector of dimension d, $r(\cdot)$ is the unknown non-parametric functional component, and $\eta_0(\cdot)$ is the unknown single-index link function, which is assumed to be sufficiently smooth.
Remark 1.
Notice the following:
  • For identifiability reasons, we assume that $\|\alpha\|_d = 1$ and that the first component of α is positive, i.e., $\alpha_1 > 0$, where $\|\cdot\|_d$ denotes the Euclidean norm on $\mathbb{R}^d$.
  • In order to identify the function $\eta_0(\cdot)$, we define its support as $[a, b]$, where $a = \inf \alpha^\top X$ and $b = \sup \alpha^\top X$.
  • The GNPFPLSIM includes as special cases the linear model (LM), the single-index model (SIM), as well as the partially linear model (PLM), the PLSIM, and the non-parametric models.
  • In the definition of the real canonical link function g, we assume that the functional random variable $Z = \{ Z(t),\ t \in [0, 1] \}$ is valued in $\mathcal{H}$ and such that
\[ \mathbb{E}[Z] = 0, \quad \mathbb{E}(\varepsilon \mid X, Z) = 0 \quad \text{and} \quad \mathrm{var}(\varepsilon \mid X, Z) = \sigma^2. \]
  • If the conditional variance $\mathrm{var}(Y \mid X = x, Z = z) = \sigma^2 V(\mu(x, z))$, where $V(\cdot)$ is an unknown positive function, then the estimation of the mean function $g(\mu)$ may be obtained by replacing the log-likelihood $f_{Y \mid X = x, Z = z}$, given by (1), by the quasi-likelihood $Q(u, v)$, which is given, for any real numbers u and v, by
\[ \frac{\partial Q(u, v)}{\partial u} = \frac{v - u}{\sigma^2 V(u)} = \frac{v - u}{\mathrm{var}(Y \mid X = x, Z = z)}, \]
    and which may be written as follows:
\[ Q(u, v) = \int_v^u \frac{v - t}{\sigma^2 V(t)}\, dt. \]
  • The regression operator $r(\cdot)$, which is a nonlinear operator from $\mathcal{H}$ into $\mathbb{R}$, satisfies
\[ r \in \mathcal{C}_{\mathcal{H}}^{0}, \quad \text{where} \quad \mathcal{C}_{\mathcal{H}}^{0} = \left\{ f : \mathcal{H} \to \mathbb{R} \ \text{such that} \ \lim_{\delta(Z, Z') \to 0} f(Z') = f(Z) \right\}, \]
    or there exists $\beta > 0$ such that $r \in \mathrm{Lip}_{\mathcal{H}, \beta}$, where
\[ \mathrm{Lip}_{\mathcal{H}, \beta} = \left\{ f : \mathcal{H} \to \mathbb{R} \ :\ \exists\, C \in \mathbb{R}_+^*,\ \forall\, Z, Z' \in \mathcal{H},\ \left| f(Z) - f(Z') \right| < C\, \delta(Z, Z')^{\beta} \right\}. \]
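As a concrete illustration of the quasi-likelihood above (this sketch is not part of the paper), the following Python snippet evaluates $Q(u,v) = \int_v^u (v - t)/(\sigma^2 V(t))\,dt$ numerically and checks it against the closed form obtained for an assumed Poisson-type variance function $V(t) = t$ with $\sigma^2 = 1$:

```python
import numpy as np

def quasi_likelihood(u, y, V=lambda t: t, sigma2=1.0, n_grid=100_001):
    """Q(u, y) = integral from y to u of (y - t) / (sigma^2 V(t)) dt,
    so that dQ/du = (y - u) / (sigma^2 V(u)); trapezoidal rule on a fine grid."""
    t = np.linspace(y, u, n_grid)
    return np.trapz((y - t) / (sigma2 * V(t)), t)

# For V(t) = t and sigma^2 = 1, the integral has the closed form
# Q(u, y) = y*log(u/y) - (u - y), used here as a sanity check.
u, y = 2.0, 3.0
assert abs(quasi_likelihood(u, y) - (y * np.log(u / y) - (u - y))) < 1e-8
```

In the Gaussian case ($V \equiv 1$), the same integral reduces to $-(v - u)^2/(2\sigma^2)$, i.e., the usual least-squares criterion up to a constant.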

3. Estimation Methodology

Let $(X_i, Y_i, Z_i)$, for $i = 1, \ldots, n$, be an independent and identically distributed (i.i.d.) n-sample of $(X, Y, Z)$. Then, for each $i = 1, \ldots, n$,
\[ g\left( \mu(X_i, Z_i) \right) = \eta_0(\alpha^\top X_i) + r(Z_i). \tag{6} \]
Let $v \in \mathbb{N}^*$ and $\kappa \in (0, 1]$ be such that $p = v + \kappa > 1.5$. We denote by $\mathcal{H}^{(p)}$ the collection of functions g defined on $[a, b]$ whose vth derivative $g^{(v)}$ exists and satisfies the following Lipschitz condition of order κ:
\[ \left| g^{(v)}(m') - g^{(v)}(m) \right| \le C \left| m' - m \right|^{\kappa} \quad \text{for all } a \le m', m \le b. \]
We introduce a knot sequence $(k_m)$ in the interval $[a, b]$ with J interior knots, such that
\[ k_{-r+1} = \cdots = k_{-1} = k_0 = a < k_1 < \cdots < k_J < k_{J+1} = \cdots = k_{J+r} = b, \]
where $J = J_n$ increases with the sample size n.
Definition 1.
A function $s(\cdot)$ is said to belong to the space of polynomial splines $S_n$ of order $\nu \ge 1$ on an interval $[a, b]$ if
  • $s(\cdot)$ is a polynomial of degree $\nu - 1$ on each sub-interval $I_j = [k_j, k_{j+1})$, for $j = 0, \ldots, J_n - 1$, and on $I_{J_n} = [k_{J_n}, b]$;
  • $s(\cdot)$ is $(\nu - 2)$-times continuously differentiable on $[a, b]$.
Let $N_n = J_n + \nu$ be the number of B-spline basis functions, and let $B_j(u)$, $j = 1, \ldots, N_n$, be the B-spline basis functions of order ν. Moreover, let $h = (b - a)/(J_n + 1)$ be the distance between neighboring knots. Then, a function $\eta_0(\cdot) \in \mathcal{H}^{(p)}$ may be approximated by a function $\tilde{\eta} \in S_n$ with $\tilde{\eta}(\cdot) = \tilde{\gamma}^\top B(\cdot)$, where $B(\cdot) = \left( B_1(\cdot), B_2(\cdot), \ldots, B_{N_n}(\cdot) \right)^\top$ is the vector of B-splines of order ν (see de Boor [30]).
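To make the spline setup concrete, here is a small Python sketch (illustrative, not from the paper) that builds the $N_n = J_n + \nu$ B-spline basis functions of order $\nu = 4$ (cubic splines) on $[a, b]$, with the boundary knots repeated as described above, via the standard Cox–de Boor recursion; the knot placement and evaluation point are arbitrary choices:

```python
import numpy as np

def bspline_basis(u, knots, nu):
    """Evaluate all N = len(knots) - nu B-spline basis functions of order nu
    (degree nu - 1) at a point u, using the Cox-de Boor recursion."""
    # order-1 (piecewise-constant) splines
    B = np.array([1.0 if knots[j] <= u < knots[j + 1] else 0.0
                  for j in range(len(knots) - 1)])
    for k in range(2, nu + 1):                    # build up to order nu
        Bk = np.zeros(len(knots) - k)
        for j in range(len(knots) - k):
            left = right = 0.0
            if knots[j + k - 1] > knots[j]:       # skip zero-length supports
                left = (u - knots[j]) / (knots[j + k - 1] - knots[j]) * B[j]
            if knots[j + k] > knots[j + 1]:
                right = (knots[j + k] - u) / (knots[j + k] - knots[j + 1]) * B[j + 1]
            Bk[j] = left + right
        B = Bk
    return B

# Cubic splines (order nu = 4) on [a, b] = [0, 1] with J = 5 interior knots:
a, b, nu, J = 0.0, 1.0, 4, 5
interior = np.linspace(a, b, J + 2)[1:-1]
knots = np.r_[[a] * nu, interior, [b] * nu]       # boundary knots repeated nu times
B = bspline_basis(0.37, knots, nu)
assert B.shape == (J + nu,)                       # N_n = J_n + nu basis functions
assert abs(B.sum() - 1.0) < 1e-12                 # partition of unity on [a, b)
```

The partition-of-unity check reflects a basic property of B-spline bases on their base interval, and the spline approximation $\tilde{\eta}(u) = \tilde{\gamma}^\top B(u)$ is then just a dot product with these basis values.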
So, our estimation process consists of two steps as follows.

3.1. The First Step

Let, with a slight abuse of notation,
\[ Y_i = g\left( \mu(X_i, Z_i) \right) \quad \text{for } 1 \le i \le n \]
denote the transformed responses. A non-parametric estimator of the regression operator $r(\cdot)$ is defined by
\[ \hat{r}(z) = \frac{ \sum_{i=1}^n Y_i\, K\left( \delta(z, Z_i)/h_1 \right) }{ \sum_{i=1}^n K\left( \delta(z, Z_i)/h_1 \right) } = \sum_{i=1}^n \omega_{i, h_1}(z)\, Y_i, \]
where $\omega_{i, h_1}(z) = K\left( \delta(z, Z_i)/h_1 \right) \big/ \sum_{j=1}^n K\left( \delta(z, Z_j)/h_1 \right)$, and the kernel $K : \mathbb{R} \to \mathbb{R}^+$, which is supported within $(0, 1)$, is of
  • Type 1, if $\int K = 1$ and $c_1 \mathbf{1}_{[0,1]} \le K \le c_2 \mathbf{1}_{[0,1]}$ for some constants $0 < c_1 < c_2$,
or
  • Type 2, if $\int K = 1$, K is differentiable on $(0, 1)$ and $c_2 \le K' \le c_1$ for some constants $-\infty < c_2 < c_1 < 0$,
and the sequence $h_1 = h_{1,n} > 0$ is the bandwidth (the smoothing parameter).
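This first-step estimator is a functional Nadaraya–Watson smoother. A minimal Python sketch follows (illustrative, not from the paper; the $L^2$ semi-metric and the one-sided Epanechnikov kernel, a kernel supported on $[0,1)$, are assumed choices):

```python
import numpy as np

def r_hat(z, Z_sample, Y, t, h1):
    """Functional Nadaraya-Watson estimator
       r_hat(z) = sum_i w_i(z) Y_i,  w_i(z) = K(d(z,Z_i)/h1) / sum_j K(d(z,Z_j)/h1).
    `z` is a curve sampled on the grid `t`; `Z_sample` stacks the sample curves
    row-wise; `Y` holds the (transformed) responses."""
    d = np.sqrt(np.trapz((Z_sample - z) ** 2, t, axis=1))   # L2 semi-metric delta(z, Z_i)
    u = d / h1
    K = np.where(u < 1.0, 0.75 * (1.0 - u ** 2), 0.0)       # one-sided Epanechnikov kernel
    if K.sum() == 0.0:
        return np.nan                                       # no curve in the h1-ball around z
    return (K / K.sum()) @ Y
```

With a very large bandwidth every curve receives positive weight and the estimator returns a weighted mean of the responses; with a very small bandwidth it may find no neighbors at all, which is why bandwidth selection matters in practice.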

3.2. The Second Step

By plugging the non-parametric estimator $\hat{r}(\cdot)$ into (6), we consider the model
\[ g\left( \mu(X_i, Z_i) \right) = \gamma^\top B(\alpha^\top X_i) + \hat{r}(Z_i) \quad \text{for } i = 1, \ldots, n. \tag{7} \]
The mean function estimator $\hat{\mu}$ is obtained by evaluating the parameters $\hat{\theta} = (\hat{\alpha}, \hat{\gamma})$ and inverting Equation (7). In fact, $\hat{\theta} = (\hat{\alpha}, \hat{\gamma})$ is determined by maximizing the following quasi-likelihood:
\[ \hat{\theta} = (\hat{\alpha}, \hat{\gamma}) = \arg\max_{\theta = (\alpha, \gamma) \in \mathbb{R}^d \times \mathbb{R}^N} L(\theta), \]
where
\[ L(\theta) := L(\alpha, \gamma) = \frac{1}{n} \sum_{i=1}^n Q\left( g^{-1}\left( \gamma^\top B(\alpha^\top X_i) + \hat{r}(Z_i) \right), Y_i \right) = \frac{1}{n} \sum_{i=1}^n Q\left( g^{-1}(m_i), Y_i \right), \]
with
\[ m_i = \gamma^\top B(\alpha^\top X_i) + \hat{r}(Z_i) = \gamma^\top B(U_i) + \hat{r}(Z_i), \quad \text{where } U_i = \alpha^\top X_i, \]
\[ m_{0i} = \gamma_0^\top B(\alpha_0^\top X_i) + \hat{r}(Z_i) = \gamma_0^\top B(U_{0i}) + \hat{r}(Z_i), \quad \text{where } U_{0i} = \alpha_0^\top X_i, \]
and
\[ m_0 = \gamma_0^\top B(\alpha_0^\top X) + \hat{r}(Z) = \gamma_0^\top B(U_0) + \hat{r}(Z), \quad \text{where } U_0 = \alpha_0^\top X, \]
where $\alpha_0$, $\gamma_0$ and $\eta_0(\cdot)$ denote the true values of $\alpha$, $\gamma$ and $\eta(\cdot)$, respectively.
In order to handle the constraints $\|\alpha\| = 1$ and $\alpha_1 > 0$ on the d-dimensional index α, we proceed by a re-parametrization (see Yu and Ruppert [26]):
\[ \alpha = \alpha(\tau) = \left( \sqrt{1 - \|\tau\|^2},\ \tau^\top \right)^{\top} \quad \text{for } \tau \in \mathbb{R}^{d-1}. \]
The true value $\tau_0$ of τ must satisfy $\|\tau_0\| \le 1$; we assume that $\|\tau_0\| < 1$.
The Jacobian matrix of $\alpha : \tau \mapsto \alpha(\tau)$, of dimension $d \times (d - 1)$, is
\[ J(\tau) = \begin{pmatrix} -\tau^\top / \sqrt{1 - \|\tau\|^2} \\ I_{(d-1) \times (d-1)} \end{pmatrix}. \]
Notice that τ is unconstrained and of dimension one lower than that of α.
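The re-parametrization and its Jacobian translate directly into code. The following Python sketch is illustrative (the finite-difference comparison is only a sanity check of the formula for $J(\tau)$):

```python
import numpy as np

def alpha(tau):
    """Map tau in R^{d-1} (with ||tau|| < 1) to alpha on the unit sphere, alpha_1 > 0."""
    tau = np.asarray(tau, dtype=float)
    return np.concatenate([[np.sqrt(1.0 - tau @ tau)], tau])

def jacobian(tau):
    """d x (d-1) Jacobian of tau -> alpha(tau):
       first row -tau^T / sqrt(1 - ||tau||^2), then the identity I_{d-1}."""
    tau = np.asarray(tau, dtype=float)
    top = -tau / np.sqrt(1.0 - tau @ tau)
    return np.vstack([top, np.eye(tau.size)])

tau = np.array([0.3, -0.2])
a = alpha(tau)
assert abs(a @ a - 1.0) < 1e-12 and a[0] > 0          # unit norm, positive first component

# finite-difference check of the Jacobian
eps = 1e-7
fd = np.column_stack([(alpha(tau + eps * e) - alpha(tau)) / eps for e in np.eye(2)])
assert np.allclose(fd, jacobian(tau), atol=1e-6)
```

Because τ is unconstrained, standard unconstrained optimizers (or the Fisher scoring iteration below) can be applied to τ, and the constraint on α is restored afterwards through $\alpha(\tau)$.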
Recall that, since $\eta_0 \in \mathcal{H}^{(p)}$, there exists $\tilde{\eta} \in S_n$ such that $\|\eta_0 - \tilde{\eta}\|_\infty = O(h^p)$, with $\tilde{\eta}(\cdot) = \tilde{\gamma}^\top B(\cdot)$. Thus, let
\[ \tilde{\alpha} = \arg\max_{\|\alpha\|_d = 1} \frac{1}{n} \sum_{i=1}^n Q\left( g^{-1}\left( \tilde{\eta}(\alpha^\top X_i) + \hat{r}(Z_i) \right), Y_i \right). \]
Then
\[ \tilde{\tau} = \arg\max_{\tau \in \mathbb{R}^{d-1}} \tilde{l}(\tau), \]
where
\[ \tilde{l}(\tau) = \frac{1}{n} \sum_{i=1}^n Q\left( g^{-1}\left( \tilde{\eta}\left( \alpha(\tau)^\top X_i \right) + \hat{r}(Z_i) \right), Y_i \right). \]
We define $\tilde{\theta}_\tau = (\tilde{\tau}, \tilde{\gamma})$ such that
\[ (\tilde{\tau}, \tilde{\gamma}) = \arg\max_{\tau \in \mathbb{R}^{d-1},\, \gamma \in \mathbb{R}^N} \frac{1}{n} \sum_{i=1}^n Q\left( g^{-1}\left( \gamma^\top B\left( \alpha(\tau)^\top X_i \right) + \hat{r}(Z_i) \right), Y_i \right). \]
Notice that $\theta_\tau = (\tau, \gamma)$ is of dimension $(d - 1) + N$, while $\theta = (\alpha(\tau), \gamma)$ is of dimension $d + N$. Then $\tilde{l}(\theta_\tau)$ becomes
\[ \tilde{l}(\theta_\tau) = \frac{1}{n} \sum_{i=1}^n Q\left( g^{-1}\left( \gamma^\top B\left( \alpha(\tau)^\top X_i \right) + \hat{r}(Z_i) \right), Y_i \right) = \frac{1}{n} \sum_{i=1}^n Q\left( g^{-1}(m_i), Y_i \right). \]
For $l = 1, 2$, we denote
\[ \rho_l(m) = \frac{ \left( \frac{d}{dm} g^{-1}(m) \right)^{l} }{ \sigma^2 V\left( g^{-1}(m) \right) } \quad \text{and} \quad q_l(m, y) = \frac{\partial^l}{\partial m^l} Q\left( g^{-1}(m), y \right). \]
Then
\[ q_1(m, y) = \left( y - g^{-1}(m) \right) \rho_1(m) \quad \text{and} \quad q_2(m, y) = \left( y - g^{-1}(m) \right) \rho_1'(m) - \rho_2(m). \]
The score vector is
\[ S(\theta_\tau) = \frac{\partial L}{\partial \theta_\tau}(\theta) \Big|_{\theta = \theta_\tau} = \frac{1}{n} \sum_{i=1}^n q_1(m_i, Y_i)\, \xi_i(\tau, \gamma), \]
where
\[ \xi_i(\tau, \gamma) = \begin{pmatrix} \gamma^\top B'\left( \alpha(\tau)^\top X_i \right) J(\tau)^\top X_i \\ B\left( \alpha(\tau)^\top X_i \right) \end{pmatrix}. \]
The (negative expected) Hessian matrix of the quasi-likelihood function is therefore
\[ H(\theta_\tau) = -\mathbb{E}\left[ \frac{\partial S(\theta)}{\partial \theta_\tau^\top} \Big|_{\theta = \theta_\tau} \right] = \frac{1}{n} \sum_{i=1}^n \rho_2(m_i)\, \xi_i(\tau, \gamma)\, \xi_i(\tau, \gamma)^{\top}. \]
We have
\[ \tilde{\theta}_\tau = (\tilde{\tau}, \tilde{\gamma}) = \arg\max_{\theta_\tau = (\tau, \gamma) \in \mathbb{R}^{d-1} \times \mathbb{R}^N} \tilde{L}(\theta_\tau). \]
Then, expanding the score around $\hat{\theta}_\tau$,
\[ 0 = \frac{\partial \tilde{L}}{\partial \theta_\tau}(\theta) \Big|_{\theta = \tilde{\theta}_\tau} \approx \frac{\partial \tilde{L}}{\partial \theta_\tau}(\theta) \Big|_{\theta = \hat{\theta}_\tau} + \frac{\partial^2 \tilde{L}}{\partial \theta_\tau\, \partial \theta_\tau^\top}(\theta) \Big|_{\theta = \hat{\theta}_\tau} \left( \tilde{\theta}_\tau - \hat{\theta}_\tau \right). \]
By replacing the observed Hessian $\frac{\partial^2 \tilde{L}}{\partial \theta_\tau\, \partial \theta_\tau^\top}(\theta) \big|_{\theta = \hat{\theta}_\tau}$ by its expectation, we obtain
\[ S(\hat{\theta}_\tau) - H(\hat{\theta}_\tau) \left( \tilde{\theta}_\tau - \hat{\theta}_\tau \right) = 0, \]
and then
\[ \tilde{\theta}_\tau = \hat{\theta}_\tau + H(\hat{\theta}_\tau)^{-1} S(\hat{\theta}_\tau). \]
The Fisher scoring update equations therefore become
\[ \theta_\tau^{(k+1)} = \theta_\tau^{(k)} + H\left(\theta_\tau^{(k)}\right)^{-1} S\left(\theta_\tau^{(k)}\right) = \theta_\tau^{(k)} + \left[ \sum_{i=1}^n \rho_2\left(m_i^{(k)}\right) \xi_i\left(\tau^{(k)}, \gamma^{(k)}\right) \xi_i\left(\tau^{(k)}, \gamma^{(k)}\right)^{\top} \right]^{-1} \sum_{i=1}^n \left( Y_i - \mu_i^{(k)} \right) \rho_1\left(m_i^{(k)}\right) \xi_i\left(\tau^{(k)}, \gamma^{(k)}\right), \]
where, for $1 \le i \le n$,
\[ m_i^{(k)} = \gamma^{(k)\top} B\left( \alpha(\tau^{(k)})^\top X_i \right) + \hat{r}(Z_i), \qquad \mu_i^{(k)} = g^{-1}\left( m_i^{(k)} \right), \]
\[ \hat{\eta}(t) = \hat{\gamma}^\top B(t) \approx \gamma^{(k)\top} B(t) = \sum_{j=1}^{N} \gamma_j^{(k)} B_j(t), \qquad \hat{m}_i = \hat{\gamma}^\top B\left( \alpha(\hat{\tau})^\top X_i \right) + \hat{r}(Z_i) \approx \sum_{j=1}^{N} \gamma_j^{(k)} B_j\left( \alpha(\tau^{(k)})^\top X_i \right) + \hat{r}(Z_i). \]
Then $\hat{\mu}_i = g^{-1}(\hat{m}_i)$ and, at convergence, $\hat{\alpha} = \alpha(\tau^{(k)})$ is the estimator of the single-index coefficient vector of the GNPFPLSIM model.
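As a rough illustration of one Fisher scoring update, the following Python sketch implements the simplest special case, the Gaussian/identity-link model, where $\rho_2 \equiv 1$ and $q_1(m, y) = y - m$ (all names are hypothetical; the spline basis and its derivative are passed in as callables):

```python
import numpy as np

def fisher_scoring_step(tau, gamma, X, r_hat_Z, Y, basis, dbasis):
    """One Fisher-scoring update for theta_tau = (tau, gamma) in the
    Gaussian / identity-link case (rho_2 = 1, rho_1 = 1, q_1(m, y) = y - m).
    `basis(u)` and `dbasis(u)` return B(u) and B'(u) as (N,) arrays."""
    a = np.concatenate([[np.sqrt(1.0 - tau @ tau)], tau])      # alpha(tau)
    J = np.vstack([-tau / np.sqrt(1.0 - tau @ tau), np.eye(tau.size)])
    U = X @ a
    S = np.zeros(tau.size + gamma.size)                        # score vector
    H = np.zeros((S.size, S.size))                             # information matrix
    for i in range(len(Y)):
        m_i = gamma @ basis(U[i]) + r_hat_Z[i]
        # xi_i = ( gamma^T B'(U_i) J(tau)^T X_i , B(U_i) )
        xi = np.concatenate([(gamma @ dbasis(U[i])) * (J.T @ X[i]), basis(U[i])])
        S += (Y[i] - m_i) * xi
        H += np.outer(xi, xi)
    step = np.linalg.solve(H, S)
    return tau + step[:tau.size], gamma + step[tau.size:]
```

At the true parameters with noiseless responses the score vanishes, so the update is a fixed point; in practice the step is iterated from an initial value until the change in $\theta_\tau$ is negligible.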

4. Some Asymptotic Properties

In this section, we present the asymptotic properties of the estimators for the non-parametric components, the parametric components, the single-index and the almost complete convergence of the functional regression operator of the GNPFPLSIM model. For this aim, we will need some assumptions.

4.1. Assumptions

Let φ, $\varphi_1$ and $\varphi_2$ be measurable functions on $[a, b]$. We define the empirical inner product and its corresponding norm as follows:
\[ \langle \varphi_1, \varphi_2 \rangle_n = \frac{1}{n} \sum_{i=1}^n \varphi_1(U_i)\, \varphi_2(U_i) \quad \text{and} \quad \|\varphi\|_n^2 = \frac{1}{n} \sum_{i=1}^n \varphi^2(U_i), \quad \text{where } U_i = \alpha^\top X_i. \]
If φ, $\varphi_1$ and $\varphi_2$ are $L^2$-integrable, we define the theoretical inner product and its corresponding norm as follows:
\[ \langle \varphi_1, \varphi_2 \rangle = \mathbb{E}\left[ \varphi_1(U)\, \varphi_2(U) \right] \quad \text{and} \quad \|\varphi\|_2^2 = \mathbb{E}\left[ \varphi^2(U) \right] = \int_a^b \varphi^2(u) f(u)\, du. \]
We assume that
(C1)
η 0 ( · ) H ( p ) .
(C2)
For all $m \in \mathbb{R}$ and for all y in the range of the response variable Y, we have, for $k = 1, 2$,
\[ q_2(m, y) < 0 \quad \text{and} \quad c_q < \left| q_k(m, y) \right| < C_q, \]
for some positive constants $c_q$ and $C_q$.
(C3)
The νth-order partial derivative of the joint density function of X satisfies the Lipschitz condition of order κ ($\kappa \in (0, 1]$).
The marginal density function of $\alpha^\top X$ is continuous, bounded away from zero, and supported within $[a, b]$.
(C4)
For any vector τ, there exist positive constants $c_\tau$ and $C_\tau$ such that
\[ c_\tau I_{t \times t} \le \mathbb{E}\left[ \begin{pmatrix} 1 \\ X \end{pmatrix} \begin{pmatrix} 1 \\ X \end{pmatrix}^{\top} \Big|\ \alpha(\tau)^\top X = \alpha(\tau)^\top x \right] \le C_\tau I_{t \times t}, \]
where $t = 1 + d + N_n$.
(C5)
The number $N_n$ of knots satisfies $n^{\frac{1}{2(p+1)}} \ll N_n \ll n^{\frac{1}{8}}$ (with $p > 3$).
(C6)
The variable $Z \in (\mathcal{H}, \delta)$, where $(\mathcal{H}, \delta)$ is a semi-metric space.
(C7)
1. The operator $r(\cdot) \in \mathcal{C}_{\mathcal{H}}^{0}$, where $\mathcal{C}_{\mathcal{H}}^{0} = \left\{ f : \mathcal{H} \to \mathbb{R} \text{ such that } \lim_{\delta(x, x') \to 0} f(x') = f(x) \right\}$.
2. There exists $\beta > 0$ such that $r(\cdot) \in \mathrm{Lip}_{\mathcal{H}, \beta}$.
3. For all $\varepsilon > 0$, $P\left( Z \in B(z, \varepsilon) \right) = \varphi_z(\varepsilon) > 0$.
4. The bandwidth $h_1$ satisfies $h_1 \to 0$ and $\log n / (n\, \varphi_z(h_1)) \to 0$ as $n \to \infty$.
5. The kernel K is of Type 1 or Type 2.
6. For all $m \ge 2$, $\mathbb{E}\left[ |Y|^m \mid Z = z \right] \le \sigma_m(z) < \infty$, where $\sigma_m(\cdot)$ is a continuous function of z.
(C8)
There exist positive constants $C_\rho$, $C_\rho^*$ and $M_0$, with $0 < C_\rho, C_\rho^*, M_0 < \infty$, such that
\[ \left| \rho_1(m_0) \right| \le C_\rho \quad \text{and} \quad \left| \rho_1(m) - \rho_1(m_0) \right| \le C_\rho^* \left| m - m_0 \right| \quad \text{for all } \left| m - m_0 \right| \le M_0. \]
(C9)
There exist positive constants $C_g$, $C_g^*$ and $M_1$, with $0 < C_g, C_g^*, M_1 < \infty$, such that the link function g satisfies
\[ \left| \frac{d}{dm} g^{-1}(m_0) \right| \le C_g \quad \text{and} \quad \left| \frac{d}{dm} g^{-1}(m) - \frac{d}{dm} g^{-1}(m_0) \right| \le C_g^* \left| m - m_0 \right| \]
for all $\left| m - m_0 \right| \le M_1$.
(C10)
 There exists a positive constant $C_0$ such that $\mathbb{E}\left[ \varepsilon^2 \mid U_{\tau,0} \right] \le C_0$ almost surely (a.s.).
Remark 2.
1. If the kernel K is of Type 1, then there exist two generic constants $c' > 0$ and $c'' > 0$ such that
\[ c'\, \varphi_z(h_1) \le \mathbb{E}\left[ K\left( h_1^{-1} \delta(z, Z) \right) \right] \le c''\, \varphi_z(h_1). \]
2. If the kernel K is of Type 2 and if there exist $c_3 > 0$ and $\varepsilon_0 > 0$ such that, for all $\varepsilon < \varepsilon_0$, $\int_0^\varepsilon \varphi_z(u)\, du > c_3\, \varepsilon\, \varphi_z(\varepsilon)$, then there exist two generic constants $c' > 0$ and $c'' > 0$ such that, for $h_1$ small enough,
\[ c'\, \varphi_z(h_1) \le \mathbb{E}\left[ K\left( h_1^{-1} \delta(z, Z) \right) \right] \le c''\, \varphi_z(h_1). \]

4.2. The Consistency Study

Lemma 1.
Under assumptions (C1)–(C5), we have
\[ \sqrt{n}\left( \alpha(\tilde{\tau}) - \alpha(\tau_0) \right) \xrightarrow{\;\mathcal{D}\;} \mathcal{N}\left( 0,\; J(\tau_0)\, A^{-1} \Sigma_1 A^{-1} J(\tau_0)^{\top} \right) \]
and
\[ \alpha(\tilde{\tau}) - \alpha(\tau_0) = O_P\left( \frac{1}{\sqrt{n}} \right), \]
where $\xrightarrow{\;\mathcal{D}\;}$ (respectively, $O_P$) denotes convergence in distribution (respectively, boundedness in probability), and $\Sigma_1$ and A are two matrices defined in Appendix A.
Lemma 2.
Under assumptions (C1)–(C5), we have
\[ \hat{\theta} - \tilde{\theta} = O_P\left( \sqrt{N_n}\, h^{p+1} + \sqrt{\frac{N_n}{n h}} \right). \]
The proofs of the previous results are based on the following lemmas and, among others, on the papers of Pollard [31] and Stone [32].
Lemma 3
(Lemma A.1 in Huang [33]). For any $\lambda > 0$, let $\Theta_n = \left\{ \eta(\alpha_0^\top x) : \eta \in S_n,\ \|\eta - \eta_0\|_2 \le \lambda \right\}$. Then, for any $\varepsilon \le \lambda$,
\[ \log N_{[\,]}\left( \varepsilon, \Theta_n, L_2(P) \right) \le C\, N_n \log\left( \frac{\lambda}{\varepsilon} \right). \]
Lemma 4
(Lemma A.2 in Wang and Yang [29] and Lemma A.4 in Xue and Yang [34]). Under assumptions (C1)–(C5), we have
\[ A_n = \sup_{\eta_1, \eta_2 \in S_n} \left| \frac{ \langle \eta_1, \eta_2 \rangle_n - \langle \eta_1, \eta_2 \rangle }{ \|\eta_1\|_2\, \|\eta_2\|_2 } \right| = O_{a.co.}\left( \sqrt{\frac{\log n}{n h}} \right), \]
where $O_{a.co.}$ denotes the Landau symbol "O" for almost-complete convergence.
Let $D_{i,\theta} = \begin{pmatrix} \gamma^\top B'\left( \alpha(\tilde{\tau})^\top X_i \right) J(\tau)^\top & 0 \\ 0 & B\left( \alpha(\tilde{\tau})^\top X_i \right) \end{pmatrix}$, and denote
\[ W_{n,\theta} = \frac{1}{n} \sum_{i=1}^n D_{i,\theta}^\top \begin{pmatrix} X_i \\ 1 \end{pmatrix} \begin{pmatrix} X_i \\ 1 \end{pmatrix}^{\top} D_{i,\theta} \quad \text{and} \quad W_\theta = \frac{1}{n} \sum_{i=1}^n \mathbb{E}\left[ D_{i,\theta}^\top \begin{pmatrix} X_i \\ 1 \end{pmatrix} \begin{pmatrix} X_i \\ 1 \end{pmatrix}^{\top} D_{i,\theta} \right]. \]
Then, we have the following lemma.
Lemma 5
(Lemma A.3 in the Supplementary Material of Wang and Yang [29]). Under assumptions (C1)–(C8), there exists $C > 0$ such that
\[ \sup_\theta \left\| W_\theta^{-1} \right\|_2 \le C N_n \quad a.co. \qquad \text{and} \qquad \sup_\theta \left\| W_{n,\theta}^{-1} \right\|_2 \le C N_n \quad a.co., \]
where $\|M\|_2 = \sup_{x \ne 0} \frac{\|Mx\|}{\|x\|} = \sup_{\|x\| = 1} \|Mx\|$.

4.2.1. Almost Complete Convergence of the Functional Kernel Estimator of r ^

Theorem 1.
Under assumptions (C1)–(C7), we have
\[ \hat{r}(z) \xrightarrow{\;a.co.\;} r(z) \quad \text{as } n \to \infty. \]
In order to prove Theorem 1, we will need the following lemmas, whose proofs are given in the Supplementary Material. The proof is based on the following decomposition:
\[ \hat{r}(z) - r(z) = \frac{1}{\hat{r}_1(z)} \left[ \left( \hat{r}_2(z) - \mathbb{E}\,\hat{r}_2(z) \right) - \left( r(z) - \mathbb{E}\,\hat{r}_2(z) \right) \right] - \frac{r(z)}{\hat{r}_1(z)} \left( \hat{r}_1(z) - 1 \right), \]
where
\[ \hat{r}(z) = \frac{\hat{r}_2(z)}{\hat{r}_1(z)}, \quad \text{with} \quad \hat{r}_2(z) = \frac{1}{n} \sum_{i=1}^n Y_i \Delta_i, \quad \hat{r}_1(z) = \frac{1}{n} \sum_{i=1}^n \Delta_i \quad \text{and} \quad \Delta_i = \frac{ K\left( h_1^{-1} \delta(z, Z_i) \right) }{ \mathbb{E}\left[ K\left( h_1^{-1} \delta(z, Z) \right) \right] }. \]
Lemma 6.
Under assumptions (C1)–(C6), (C7)-1 and (C7)-4, we have
\[ \lim_{n \to +\infty} \mathbb{E}\,\hat{r}_2(z) = r(z). \]
Lemma 7.
Under assumptions (C1)–(C7), we have the following:
(i) If assumptions (C7)-3 to (C7)-5 are satisfied, then
\[ \hat{r}_2(z) - \mathbb{E}\,\hat{r}_2(z) = O_{a.co.}\left( \sqrt{\frac{\log n}{n\, \varphi_z(h_1)}} \right). \]
(ii) If assumptions (C7)-3 and (C7)-4 are satisfied, then
\[ \hat{r}_1(z) - 1 = O_{a.co.}\left( \sqrt{\frac{\log n}{n\, \varphi_z(h_1)}} \right). \]
Lemma 8
(Corollary of Bernstein’s inequality).
(i) If, for all $m \ge 2$, there exists $c_m > 0$ such that $\mathbb{E}|W_1|^m \le c_m a^{2(m-1)}$, then, for all $\varepsilon > 0$,
\[ P\left( \left| \frac{1}{n} \sum_{i=1}^n W_i \right| > \varepsilon \right) \le 2 \exp\left( -\frac{\varepsilon^2 n}{2 a^2 (1 + \varepsilon)} \right). \]
(ii) If $W_i = W_{i,n}$ depends on n and if, for all $m \ge 2$, there exists $c_m > 0$ such that $\mathbb{E}|W_1|^m \le c_m a_n^{2(m-1)}$, with $u_n = a_n^2 \log n / n \to 0$ as $n \to \infty$, then
\[ \frac{1}{n} \sum_{i=1}^n W_i = O_{a.co.}\left( \sqrt{u_n} \right). \]

4.2.2. Estimation of the Non-Parametric Function

Theorem 2.
Under assumptions (C1)–(C7), we have
\[ \left\| \hat{\eta} - \eta_0 \right\|_2 = O_P\left( \sqrt{N_n}\left( \frac{1}{\sqrt{n h}} + h^{p} \right) \right) \]
and
\[ \left\| \hat{\eta} - \eta_0 \right\|_n = O_P\left( \sqrt{N_n}\left( \frac{1}{\sqrt{n h}} + h^{p} \right) \right). \]
The proof of Theorem 2 is given in Appendix A.

4.2.3. Estimation of the Parametric Components

Theorem 3.
Under assumptions (C1)–(C10), the quasi-likelihood estimator $\hat{\alpha}$, with the constraint $\|\hat{\alpha}\| = 1$, is asymptotically normal; i.e.,
\[ \sqrt{n}\left( \hat{\alpha} - \alpha_0 \right) \xrightarrow{\;\mathcal{D}\;} \mathcal{N}\left( 0,\; J(\tau_0)\, D^{-1} J(\tau_0)^{\top} \right), \]
where $D = \mathbb{E}\left[ \rho_2(m_0(T)) \left( \eta_0'(U_{\tau,0})\, J(\tau_0)^\top \Phi(X) \right) \left( \eta_0'(U_{\tau,0})\, J(\tau_0)^\top \Phi(X) \right)^{\top} \right]$.
The proof of Theorem 3 and the used technical lemmas are given in Appendix A.

5. A Simulation Study

We aim to illustrate numerically the convergence of different estimators of the parameters τ , γ , the non-parametric function η and the regression operator r of Y on Z. We conduct this numerical study in the Gaussian and in the logistic cases.
The conditional density of $Y \mid X = x, Z = z$ is given by
\[ f_{Y \mid X = x,\, Z = z}(y) = \exp\left\{ y\, \xi(x, z) - B(\xi(x, z)) + C(y) \right\}. \]
We deduce that
\[ \mu(x, z) = \mathbb{E}\left[ Y \mid X = x, Z = z \right] = B'(\xi(x, z)) \quad \text{and} \quad \mathrm{var}(Y \mid X = x, Z = z) = \sigma^2 V\left( \mu(x, z) \right) = \sigma^2 B''(\xi(x, z)). \]
We consider the model given by the following equation:
\[ g\left( \mu(X_i, Z_i) \right) = \sin\left( \frac{\pi\left( \alpha^\top X_i - A \right)}{B - A} \right) + r(Z_i) + \varepsilon_i \quad \text{for } i = 1, \ldots, n. \tag{15} \]
The responses $Y_i$ are simulated according to Equation (15) (see Figure 1), the components of $X_i$ are drawn uniformly over the interval $[-0.5, 0.5]$, and the errors satisfy $\varepsilon_i \sim \mathcal{N}(0, 0.025)$. Moreover, we take the following coefficients:
\[ \alpha = \frac{1}{\sqrt{3}}(1, 1, 1)^\top, \quad A = \frac{\sqrt{3}}{2} - \frac{1.645}{\sqrt{12}} \quad \text{and} \quad B = \frac{\sqrt{3}}{2} + \frac{1.645}{\sqrt{12}}. \]
The functional random variable $Z_i(\cdot)$ is taken as $Z(t) = f(U([0, T]))$, where $f(t) = g(\mathcal{B}(1, 0.5), t)$ and $g(a, t) = 2a(1 - 2a)\sin(t\, \pi\, \mathcal{N}(0, 1))$.
The regression operator r of Y on Z is defined as follows:
\[ r(Z) = \int_0^1 \frac{1}{1 + Z(t)}\, dt. \]
The number of knots is selected according to the formula $C\, n^{\frac{1}{2r}} \log(n)$, where $C \in [0.3, 1]$ (see Cao and Wang [10]). We choose $C = 0.6$ and we generate 2000 sample replications of size $n = 500$.
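For concreteness, the Gaussian-case data generation can be sketched in Python as follows (illustrative, not the paper's code: the exact construction of the functional covariate Z is more involved, so a simple stand-in curve is used, and the reading of the constants A and B is an assumption):

```python
import numpy as np

rng = np.random.default_rng(2022)
n = 500
t = np.linspace(0.0, 1.0, 100)                       # discretization grid for Z(t)

alpha = np.ones(3) / np.sqrt(3)                      # alpha = (1, 1, 1)' / sqrt(3)
A = np.sqrt(3) / 2 - 1.645 / np.sqrt(12)             # assumed reading of A and B
B = np.sqrt(3) / 2 + 1.645 / np.sqrt(12)

X = rng.uniform(-0.5, 0.5, size=(n, 3))
# Simplified stand-in for the paper's functional covariate Z(t), valued in [0, 1]:
Z = np.abs(np.sin(np.outer(rng.normal(size=n), np.pi * t)))
r_Z = np.trapz(1.0 / (1.0 + Z), t, axis=1)           # r(Z) = \int_0^1 dt / (1 + Z(t))
eps = rng.normal(0.0, np.sqrt(0.025), size=n)        # Var(eps) = 0.025

eta = np.sin(np.pi * (X @ alpha - A) / (B - A))      # single-index component
Y = eta + r_Z + eps                                  # Gaussian case: identity link
```

The logistic case would instead pass the linear predictor through the inverse logit and draw Bernoulli responses; the rest of the pipeline is unchanged.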
Then, the computed bias, the standard deviation (SD) and the mean squared error (MSE) with respect to (I) the parameter τ , and (II) the parameter γ are summarized in Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7, Table 8, Table 9 and Table 10.
Notice that, in the first step, we estimate the regression operator r using the functional kernel regression estimator, implemented via the R routine funopare.knn.lcv. The obtained mean squared error is MSE = 0.17.
In the second (plug-in) step, we estimate the parameters of the following model by using our GNFPLSIM algorithm, as described before:
\[ g\left( \mu(X_i, Z_i) \right) = \sin\left( \frac{\pi\left( \alpha^\top X_i - A \right)}{B - A} \right) + \hat{r}(Z_i) + \varepsilon_i \quad \text{for } i = 1, \ldots, n. \]
To compute the bias, SD and MSE, we recorded 2000 replications of the GNFPLSIM algorithm in the Gaussian case (Table 1, Table 2, Table 3, Table 4 and Table 5) and in the logistic case (Table 6, Table 7, Table 8, Table 9 and Table 10) with n = 500 as follows.
The quality of the estimators is illustrated through these simulations: the method works quite well, and the bias, SD and MSE are generally reasonably low. The parametric and non-parametric components, the single index and the nonlinear regression operator r of Y on Z are computed by the procedure described above. The tables also indicate the consistency of $\hat{\alpha}$, in the sense that the bias, SD and MSE decrease as the sample size increases.
We developed our algorithm in both cases: the identity link function and the logistic link function. The simulations show that the GNFPLSIM algorithm works well in both cases.
In Figure 2, we illustrate 500 realizations of the functional random variable Z and the predicted response versus the true response.
We present below, in Figure 3, the single index estimated by the model in both cases: Gaussian and logistic cases.
We observe that the single index estimated by our model fits the true single index well.
We present below, in Figure 4, the systematic component $\eta(\cdot)$ estimated by the model in both the Gaussian and logistic cases.
Our model provides a good approximation of the non-parametric function $\eta(\cdot)$. To quantify this, we use the square root of the average squared errors (RASE) criterion (see Lai et al. [35]) in both the Gaussian and the logistic cases:
\[ RASE = \left( \frac{1}{n} \sum_{i=1}^n \left( \hat{\eta}(u_i) - \eta(u_i) \right)^2 \right)^{1/2}. \]
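The RASE criterion transcribes directly into Python (illustrative; the test function $\eta$ below is an arbitrary choice, not the one from the simulation):

```python
import numpy as np

def rase(eta_hat, eta_true, u):
    """Square root of the average squared errors over the evaluation points u."""
    return np.sqrt(np.mean((eta_hat(u) - eta_true(u)) ** 2))

u = np.linspace(0.0, 1.0, 200)
eta0 = lambda v: np.sin(np.pi * v)
assert rase(eta0, eta0, u) == 0.0                            # perfect fit gives RASE = 0
assert abs(rase(lambda v: eta0(v) + 0.1, eta0, u) - 0.1) < 1e-12  # constant offset of 0.1
```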
Table 11 summarizes the sample means, medians and variances of the RASE for different sample sizes in the Gaussian case.
Table 12 summarizes the same quantities in the logistic case.
We conclude that, as the sample size n increases from 500 to 1000, the sample mean, median and variance of the RASE decrease.

Application to Tecator Data

In this paragraph, we apply the GNFPLSIM model to the Tecator data, which are well known in functional data analysis. These data can be downloaded from the following link: http://lib.stat.cmu.edu/datasets/tecator (accessed on 1 March 2022). For more details, see Ferraty and Vieu [12].
For 215 finely chopped pieces of meat (see Figure 5), the Tecator data contain the corresponding fat contents ($Y_i$, $i = 1, \ldots, 215$), the near-infrared absorbance spectra ($Z_i$, $i = 1, \ldots, 215$) observed at 100 equally spaced wavelengths in the range 850–1050 nm, the protein content $X_{1,i}$ and the moisture content $X_{2,i}$.
We try to predict the fat content of the finely chopped meat samples.
The following figure shows the absorbance curves.
We randomly split the sample into two sub-samples: a training sample $I_1$ of size 160 and a test sample $I_2$ of size 55. The training sample is used to estimate the parameters, and the test sample is employed to assess the quality of the predictions. To evaluate our model, we use the mean squared error of prediction (MSEP), as in Aneiros-Pérez and Vieu [13], defined as follows:
\[ MSEP = \frac{1}{\# I_2} \sum_{i \in I_2} \left( Y_i - \hat{Y}_i \right)^2 \Big/ \mathrm{var}_{I_2}(Y_i), \]
where $\hat{Y}_i$ is the predicted value based on the training sample and $\mathrm{var}_{I_2}$ is the empirical variance of the response variable over the test sample.
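The MSEP can be transcribed in Python as follows (illustrative; written for a generic test sample rather than the specific 160/55 split used here):

```python
import numpy as np

def msep(y_test, y_pred):
    """Mean squared error of prediction over the test sample,
    normalized by the empirical variance of the test responses."""
    y_test = np.asarray(y_test, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_test - y_pred) ** 2) / np.var(y_test)

y = np.array([1.0, 2.0, 3.0, 4.0])
assert msep(y, y) == 0.0                               # perfect prediction
assert abs(msep(y, np.full(4, y.mean())) - 1.0) < 1e-12  # predicting the mean gives MSEP = 1
```

The normalization by $\mathrm{var}_{I_2}(Y_i)$ makes the criterion scale-free: an MSEP below 1 means the model predicts better than the naive test-sample mean.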
Table 13 and Table 14 show the performance of our GNPFPLSIM model in comparison with other models. We can conclude that GNPFPLSIM is competitive for such data.
Figure 6 shows the estimator of the non-parametric function $\eta(\cdot)$ obtained by the model in both the Gaussian and logistic cases.
Figure 7 compares the fat content with its estimate by the model in both the Gaussian and logistic cases.
We can see that our model fits the fat content of the 215 pieces of meat well.

6. Summary

In this paper, we introduced estimators for the non-parametric generalized functional partially linear single-index model (GNPFPLSIM). Our estimators are obtained via kernel methods and the Fisher scoring update equation derived from the quasi-likelihood function, together with the normalized B-spline basis and its derivatives.
We prove the $\sqrt{n}$-consistency and asymptotic normality of our estimators. First, we define the estimator $\hat{r}$, which converges almost completely to the true regression operator r. Second, we establish, with rates, the convergence of the estimator $\hat{\eta}$ to the true non-parametric function η. Finally, we establish, with rates, the convergence of the estimator $\hat{\alpha}$ to the true single-index parameter α.
A numerical study reveals that our estimation procedure performs well in higher dimensions. The quality of the estimators is illustrated via simulations and real data.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/math10152704/s1.

Author Contributions

Formal analysis, M.A.; Investigation, I.O.; Methodology, M.R. and P.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

In what follows, we present the technical lemmas that are used for the proof of the previous theorems.
In what follows, for any probability measure Q, define $L_2(Q) = \left\{ f : Q f^2 = \int f^2\, dQ < \infty \right\}$. Let $\mathcal{F}$ be a subclass of $L_2(Q)$ and, for all $f \in \mathcal{F}$, let $\|f\| = \left( \int f^2\, dQ \right)^{1/2}$. Denote by $N\left( \delta, \mathcal{F}, L_2(Q) \right)$ the δ-covering number of $\mathcal{F}$, i.e., the smallest value of N for which there exist functions $f_1, f_2, \ldots, f_N$ such that, for each $f \in \mathcal{F}$, there exists $j \in \{1, \ldots, N\}$ with $\|f - f_j\| < \delta$; that is, $\mathcal{F} \subset \bigcup_{j=1}^N B(f_j, \delta)$. Notice that the $f_j$'s are not necessarily in $\mathcal{F}$.
For two functions l and u, a bracket $[l, u]$ is the set of functions f such that $l \le f \le u$: $[l, u] = \{ f : l \le f \le u \}$.
The δ-covering number with bracketing, $N_{[\,]}\left( \delta, \mathcal{F}, L_2(Q) \right)$, is defined as the smallest value of N necessary to cover the whole of $\mathcal{F}$, for which there exist pairs of functions $\left( f_j^L, f_j^U \right)$, $j = 1, \ldots, N$, with $\|f_j^U - f_j^L\| \le \delta$, such that, for each $f \in \mathcal{F}$, there is a $j \in \{1, \ldots, N\}$ with $f_j^L \le f \le f_j^U$ (the $f_j^U$ and $f_j^L$ do not necessarily belong to $\mathcal{F}$).
The δ-entropy with bracketing is $\log N_{[\,]}\left( \delta, \mathcal{F}, L_2(Q) \right)$. The uniform entropy integral $J_{[\,]}\left( \delta, \mathcal{F}, L_2(Q) \right)$ is defined as
\[ J_{[\,]}\left( \delta, \mathcal{F}, L_2(Q) \right) = \int_0^\delta \left( 1 + \log N_{[\,]}\left( \varepsilon, \mathcal{F}, L_2(Q) \right) \right)^{1/2} d\varepsilon. \]
Let $Q_n$ be the empirical measure associated with Q, i.e., $Q_n = \frac{1}{n} \sum_{i=1}^n \delta_{X_i}(\cdot)$, so that
\[ Q_n f = \mathbb{E}_{Q_n}[f] = \int f\, dQ_n = \frac{1}{n} \sum_{i=1}^n f(X_i). \]
Denote by $\mathbb{G}_n = \sqrt{n}\left( Q_n - Q \right)$ the standardized empirical process indexed by $\mathcal{F}$, and by $\left\| \mathbb{G}_n \right\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} \left| \mathbb{G}_n f \right|$ for any measurable class of functions $\mathcal{F}$.
For all $f \in \mathcal{F}$, we have $Q f = \mathbb{E}_Q[f(X)] = \int f\, dQ$ and
\[ \mathbb{G}_n f = \sqrt{n}\left( Q_n f - Q f \right) = \frac{1}{\sqrt{n}} \sum_{i=1}^n \left( f(X_i) - \mathbb{E}[f(X)] \right). \]
Lemma A1
(Lemma 3.4.2 in Van Der Vaart and Wellner [36]). Let $M_0 > 0$ and let $\mathcal{F}$ be a uniformly bounded class of measurable functions such that
\[ \text{for all } f \in \mathcal{F}, \quad \|f\|_\infty < M_0 \quad \text{and} \quad Q f^2 < \delta^2. \]
Then
\[ \mathbb{E}_Q \left\| \mathbb{G}_n \right\|_{\mathcal{F}} \le c_0\, J_{[\,]}\left( \delta, \mathcal{F}, L_2(Q) \right) \left( 1 + \frac{ J_{[\,]}\left( \delta, \mathcal{F}, L_2(Q) \right) }{ \delta^2 \sqrt{n} } M_0 \right), \]
where $c_0$ is a finite constant that does not depend on n.
Lemma A2
(Lemma A.1 in Huang [33]). For any $\lambda > 0$, let $\Theta_n = \left\{ \eta(\alpha_0^\top x) : \eta \in S_n,\ \|\eta - \eta_0\|_2 \le \lambda \right\}$. Then, for any $\varepsilon \le \lambda$,
\[ \log N_{[\,]}\left( \varepsilon, \Theta_n, L_2(P) \right) \le c\, N_n \log\left( \frac{\lambda}{\varepsilon} \right). \]
Recall that $N_n$ is the number of B-spline basis functions of order r.
Lemma A3
(Lemma A.2, page 3, in Wang and Yang [29] and Lemma A.4, page 1442, in Xue and Yang [34]). Let $S_n$ be the space of all polynomial spline functions of order r on $[a, b]$. Under conditions (C1)–(C5), we have
\[ A_n = \sup_{\eta_1, \eta_2 \in S_n} \left| \frac{ \langle \eta_1, \eta_2 \rangle_n - \langle \eta_1, \eta_2 \rangle }{ \|\eta_1\|_2\, \|\eta_2\|_2 } \right| = O_{a.s.}\left( \sqrt{\frac{\log n}{n h}} \right). \]
Let
\[ D_{i,\theta} = \begin{pmatrix} \gamma^\top B'\left( \alpha(\tilde{\tau})^\top X_i \right) J(\tau)^\top & 0 \\ 0 & B\left( \alpha(\tilde{\tau})^\top X_i \right) \end{pmatrix}. \]
Denote
\[ W_{n,\theta} = \frac{1}{n} \sum_{i=1}^n D_{i,\theta}^\top \begin{pmatrix} X_i \\ 1 \end{pmatrix} \begin{pmatrix} X_i \\ 1 \end{pmatrix}^{\top} D_{i,\theta} \]
and
\[ W_\theta = \frac{1}{n} \sum_{i=1}^n \mathbb{E}\left[ D_{i,\theta}^\top \begin{pmatrix} X_i \\ 1 \end{pmatrix} \begin{pmatrix} X_i \\ 1 \end{pmatrix}^{\top} D_{i,\theta} \right]. \]
Lemma A4
(Lemma A.3 in Wang and Yang [29]). Under conditions (C1)–(C8), there exists $C > 0$ such that
$$ \sup_{\theta} \big\| W_{\theta}^{-1} \big\|_2 \leq C\, N_n \ \ a.s. \quad \text{and} \quad \sup_{\theta} \big\| W_{n,\theta}^{-1} \big\|_2 \leq C\, N_n \ \ a.s. $$
Recall that
$$ \|A\|_2 = \sup_{x \neq 0} \frac{\|A x\|}{\|x\|} = \sup_{\|x\| = 1} \|A x\|. $$
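This operator norm equals the largest singular value of $A$ and dominates the ratio $\|Ax\|/\|x\|$ over any set of test directions. A small numerical sketch, with a randomly generated $A$ chosen purely for illustration:

```python
import numpy as np

# The spectral norm ||A||_2 = sup_{x != 0} ||Ax|| / ||x|| equals the largest
# singular value of A; np.linalg.norm(A, ord=2) computes exactly that.
rng = np.random.default_rng(1)
A = rng.normal(size=(5, 3))                    # a random matrix, for illustration

spectral = np.linalg.norm(A, ord=2)            # largest singular value of A

# Approximate the supremum over many random unit directions x
xs = rng.normal(size=(3, 20_000))
xs /= np.linalg.norm(xs, axis=0)               # normalize columns so ||x|| = 1
sup_ratio = np.linalg.norm(A @ xs, axis=0).max()

print(spectral, sup_ratio)                     # sup_ratio <= spectral, nearly equal
```

No sampled direction can exceed the spectral norm, and with enough directions the supremum is approached from below.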
In what follows, we state the lemmas needed to prove Theorem 3. The proofs of these lemmas and of the theorem are given in Appendix A.
Lemma A5.
Under conditions (C1)–(C8), we have
$$ \frac{1}{n} \sum_{i=1}^{n} \rho_2(m_{0i}) \big\{ \hat{\eta}(U_{\tau,0i}) - \eta_0(U_{\tau,0i}) \big\}\, \eta_0'(U_{\tau,0i})\, J(\tau_0)\, \Phi(X_i) = O_P\!\left( \frac{1}{\sqrt{n}} \right), $$
$$ \frac{1}{n} \sum_{i=1}^{n} \rho_2(m_{0i})\, \eta_0'(U_{\tau,0i})\, \Phi(X_i)\, \Upsilon(U_{\tau,0i})^{\top} J(\tau_0) \big( \hat{\tau} - \tau_0 \big) = O_P\!\left( \frac{1}{\sqrt{n}} \right), $$
where
$$ \Upsilon(u_{\tau,0}) = \frac{\mathbb{E}\big[ X \rho_2(m_0(T)) \mid U_{\tau,0} = u_{\tau,0} \big]}{\mathbb{E}\big[ \rho_2(m_0(T)) \mid U_{\tau,0} = u_{\tau,0} \big]}, \qquad \Gamma(u_{\tau,0}) = \frac{\mathbb{E}\big[ W \rho_2(m_0(T)) \mid U_{\tau,0} = u_{\tau,0} \big]}{\mathbb{E}\big[ \rho_2(m_0(T)) \mid U_{\tau,0} = u_{\tau,0} \big]}, $$
$$ \Phi(x) = \Phi(U_{\tau,0}, x) = x - \Upsilon(u_{\tau,0}) \quad \text{and} \quad \Psi(w) = \Psi(U_{\tau,0}, w) = w - \Gamma(u_{\tau,0}). $$
Lemma A6.
Under conditions (C1)–(C8), we have
$$ \frac{1}{n} \sum_{i=1}^{n} \rho_2(m_{0i}) \big\{ \hat{\eta}(U_{\tau,0i}) - \eta_0(U_{\tau,0i}) \big\}\, \Psi(T_i) = O_P\!\left( \frac{1}{\sqrt{n}} \right), $$
$$ \frac{1}{n} \sum_{i=1}^{n} \rho_2(m_{0i})\, \eta_0'(U_{\tau,0i})\, \Psi(T_i)\, \Upsilon(U_{\tau,0i})^{\top} J(\tau_0) \big( \hat{\tau} - \tau_0 \big) = O_P\!\left( \frac{1}{\sqrt{n}} \right). $$

References

  1. McCullagh, P.; Nelder, J.A. Generalized Linear Models, 2nd ed.; Chapman and Hall: London, UK, 1989.
  2. Nelder, J.A.; Wedderburn, R.W.M. Generalized Linear Models. J. R. Stat. Soc. Ser. A 1972, 135, 370–384.
  3. Hastie, T.J.; Tibshirani, R.J. Generalized Additive Models; Chapman and Hall: London, UK, 1990.
  4. Wood, S.N. Generalized Additive Models: An Introduction with R, 2nd ed.; Chapman and Hall/CRC: Boca Raton, FL, USA, 2017.
  5. Härdle, W.; Hall, P.; Ichimura, H. Optimal Smoothing in Single-Index Models. Ann. Stat. 1993, 21, 157–178.
  6. Hristache, M.; Juditsky, A.; Spokoiny, V. Direct Estimation of the Index Coefficient in a Single-Index Model. Ann. Stat. 2001, 29, 595–623.
  7. Liang, H.; Wang, N. Partially Linear Single-Index Measurement Error Models. Stat. Sin. 2005, 15, 99–116.
  8. Chen, J.; Li, D.; Liang, H.; Wang, S. Semiparametric GEE Analysis of Partially Linear Single-Index Models for Longitudinal Data. Ann. Stat. 2015, 43, 1682–1715.
  9. Carroll, R.J.; Fan, J.; Gijbels, I.; Wand, M.P. Generalized Partially Linear Single-Index Models. J. Am. Stat. Assoc. 1997, 92, 477–489.
  10. Wang, L.; Cao, G. Efficient Estimation for Generalized Partially Linear Single-Index Models. Bernoulli 2018, 24, 1101–1127.
  11. Ramsay, J.O.; Silverman, B.W. Functional Data Analysis; Springer: New York, NY, USA, 2005.
  12. Ferraty, F.; Vieu, P. Nonparametric Functional Data Analysis: Theory and Practice; Springer Series in Statistics; Springer: Berlin, Germany, 2006.
  13. Aneiros-Perez, G.; Vieu, P. Semi-Functional Partial Linear Regression. Stat. Probab. Lett. 2006, 76, 1102–1110.
  14. Aneiros-Perez, G.; Vieu, P. Partial Linear Modelling with Multi-Functional Covariates. Comput. Stat. 2015, 30, 647–671.
  15. Horváth, L.; Kokoszka, P. Inference for Functional Data with Applications; Springer Series in Statistics; Springer: New York, NY, USA, 2012.
  16. Kokoszka, P.; Reimherr, M. Introduction to Functional Data Analysis; Chapman and Hall/CRC Press: Boca Raton, FL, USA, 2021.
  17. Schumaker, L.L. Spline Functions: Basic Theory; Wiley: New York, NY, USA, 1981; Volume 14.
  18. Ould-Said, E.; Ouassou, I.; Rachdi, M. Functional Statistics and Applications; Contributions to Statistics; Springer: Berlin, Germany, 2013.
  19. Ouassou, I.; Rachdi, M. Stein Type Estimation of the Regression Operator for Functional Data. Adv. Appl. Stat. Sci. 2010, 1, 233–250.
  20. Ouassou, I.; Rachdi, M. Regression Operator Estimation by Delta-Sequences Method for Functional Data and its Applications. AStA Adv. Stat. Anal. 2012, 92, 451–465.
  21. Laksaci, A.; Kaid, Z.; Alahiane, M.; Ouassou, I.; Rachdi, M. Non-Parametric Estimations of the Conditional Density and Mode When the Regressor and the Response are Curves. Commun. Stat. Theory Methods 2022.
  22. Cao, R.; Du, J.; Zhou, J.; Xie, T. FPCA-Based Estimation for Generalized Functional Partially Linear Models. Stat. Pap. 2020, 61, 2715–2735.
  23. Li, C.S.; Lu, M. A Lack-of-Fit Test for Generalized Linear Models via Single-Index Techniques. Comput. Stat. 2018, 33, 731–756.
  24. Yao, D.S.; Chen, W.X.; Long, C.X. Parametric Estimation for the Simple Linear Regression Model under Moving Extremes Ranked Set Sampling Design. Appl. Math. J. Chin. Univ. 2021, 36, 269–277.
  25. Yu, P.; Du, J.; Zhang, Z. Single-Index Partially Functional Linear Regression Model. Stat. Pap. 2020, 61, 1107–1123.
  26. Yu, Y.; Ruppert, D. Penalized Spline Estimation for Partially Linear Single-Index Models. J. Am. Stat. Assoc. 2002, 97, 1042–1054.
  27. Rachdi, M.; Alahiane, M.; Ouassou, I.; Vieu, P. Generalized Functional Partially Linear Single-Index Models. In Functional and High-Dimensional Statistics and Related Fields; Springer International Publishing: Cham, Switzerland, 2020; pp. 221–228.
  28. Alahiane, M.; Ouassou, I.; Rachdi, M.; Vieu, P. Partially Linear Generalized Single Index Models for Functional Data (PLGSIMF). Stats 2021, 4, 793–813.
  29. Wang, L.; Yang, L. Spline Estimation of Single-Index Models. Stat. Sin. 2009, 19, 765–783.
  30. De Boor, C. A Practical Guide to Splines, Revised ed.; Applied Mathematical Sciences; Springer: Berlin, Germany, 2001; Volume 27.
  31. Pollard, D. Asymptotics for Least Absolute Deviation Regression Estimators. Econom. Theory 1991, 7, 186–199.
  32. Stone, C.J. The Dimensionality Reduction Principle for Generalized Additive Models. Ann. Stat. 1986, 14, 590–606.
  33. Huang, J. Efficient Estimation of the Partly Linear Additive Cox Model. Ann. Stat. 1999, 27, 1536–1563.
  34. Xue, L.; Yang, L. Additive Coefficient Modeling via Polynomial Spline. Stat. Sin. 2006, 16, 1423–1446.
  35. Lai, P.; Tian, Y.; Lian, H. Estimation and Variable Selection for Generalised Partially Linear Single-Index Models. J. Nonparametr. Stat. 2014, 26, 171–185.
  36. Van der Vaart, A.W.; Wellner, J.A. Weak Convergence and Empirical Processes: With Applications to Statistics; Springer: New York, NY, USA, 1996.
Figure 1. Responses of the testing sample versus predicted responses (step 1).
Figure 2. On the left plot: 500 realizations of the functional random variable Z; on the right plot: the predicted response (x-axis) compared to the true response (y-axis).
Figure 3. On the left plot: single-index versus predicted single-index, Gaussian case. On the right plot: single-index versus predicted single-index, logistic case.
Figure 4. On the left plot: the non-parametric function η ( . ) versus its estimator η ^ ( . ) , Gaussian case. On the right plot: the non-parametric function η ( . ) versus its estimator η ^ ( . ) , logistic case.
Figure 5. A sample of 100 absorbance curves Z .
Figure 6. On the left plot: estimated non-parametric function η ^ ( . ) , Gaussian case. On the right plot: estimated non-parametric function η ^ ( . ) , logistic case.
Figure 7. On the left plot: the fat content and its estimation, Gaussian case. On the right plot: the fat content and its estimation, logistic case.
Table 1. Bias, SD and MSE according to the parameter τ for GNFPLSIM with the identity link function and n = 500.

        τ₁             τ₂
Bias    0.0004         −0.0005
SD      0.0006         0.0013
MSE     5.20 × 10⁻⁷    1.94 × 10⁻⁶
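Since the reported MSE is the sum of the squared bias and the squared standard deviation, the Table 1 entries can be reproduced directly (a quick consistency check, not part of the original simulation code):

```python
# Consistency check of Table 1 (identity link, n = 500): MSE = Bias^2 + SD^2.
bias = [0.0004, -0.0005]   # tau_1, tau_2
sd = [0.0006, 0.0013]
mse = [b**2 + s**2 for b, s in zip(bias, sd)]
print(mse)   # ≈ [5.2e-07, 1.94e-06], matching the reported values
```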
Table 2. Bias, SD and MSE evolutions with respect to the parameter γ variation for GNFPLSIM with the identity link function and n = 500.

        γ₁              γ₂              γ₃              γ₄              γ₅
Bias    −0.0054         0.0258          −0.0387         0.0289          −0.0093
SD      0.0123          0.0165          0.0214          0.0152          0.0064
MSE     1.8045 × 10⁻⁴   9.3789 × 10⁻⁴   1.9556 × 10⁻³   1.0662 × 10⁻³   1.2745 × 10⁻⁴
Table 3. Bias, SD and MSE evolutions with respect to the parameter γ variation for GNFPLSIM with the identity link function and n = 500.

        γ₆             γ₇             γ₈             γ₉             γ₁₀
Bias    0.0072         0.0006         −0.0043        0.0006         −0.0056
SD      0.0062         0.0046         0.0024         0.0036         0.0042
MSE     9.028 × 10⁻⁵   2.152 × 10⁻⁵   2.425 × 10⁻⁵   1.332 × 10⁻⁵   4.900 × 10⁻⁵
Table 4. Bias, SD and MSE evolutions with respect to the parameter γ variation for GNFPLSIM with the identity link function and n = 500.

        γ₁₁            γ₁₂            γ₁₃            γ₁₄            γ₁₅
Bias    0.0009         0.0027         −0.0057        0.0031         0.0092
SD      0.0056         0.0034         0.0028         0.0057         0.0051
MSE     3.217 × 10⁻⁵   1.885 × 10⁻⁵   4.033 × 10⁻⁵   4.210 × 10⁻⁵   1.1065 × 10⁻⁴
Table 5. Bias, SD and MSE evolutions with respect to the parameter γ variation for GNFPLSIM with the identity link function and n = 500.

        γ₁₆            γ₁₇            γ₁₈             γ₁₉             γ₂₀             γ₂₁
Bias    0.0009         0.0027         −0.0057         0.0031          0.0092          0.0051
SD      0.0046         0.0084         0.0142          0.0154          0.0232          0.0131
MSE     2.792 × 10⁻⁵   8.017 × 10⁻⁵   2.2373 × 10⁻⁴   4.1140 × 10⁻⁴   7.1780 × 10⁻⁴   6.3386 × 10⁻⁴
Table 6. Bias, SD and MSE according to the parameter τ for GNFPLSIM with the logistic link function and n = 500.

        τ₁              τ₂
Bias    −0.0084         0.0047
SD      0.0103          0.0108
MSE     1.7665 × 10⁻⁴   1.3873 × 10⁻⁴
Table 7. Bias, SD and MSE evolutions with respect to the parameter γ variation for GNFPLSIM with the logistic link function and n = 500.

        γ₁              γ₂              γ₃              γ₄              γ₅
Bias    −0.0043         0.0352          −0.0389         0.0383          −0.0065
SD      0.0107          0.0234          0.0223          0.0136          0.0058
MSE     1.3298 × 10⁻³   1.7866 × 10⁻³   2.0105 × 10⁻³   1.6518 × 10⁻³   7.589 × 10⁻⁴
Table 8. Bias, SD and MSE evolutions with respect to the parameter γ variation for GNFPLSIM with the logistic link function and n = 500.

        γ₆              γ₇             γ₈              γ₉              γ₁₀
Bias    0.0087          0.0006         0.0467          0.0003          −0.0054
SD      0.0070          0.0061         0.0051          0.0047          0.0026
MSE     1.2469 × 10⁻⁴   3.757 × 10⁻⁵   2.2069 × 10⁻³   2.2180 × 10⁻⁵   3.5920 × 10⁻⁵
Table 9. Bias, SD and MSE evolutions with respect to the parameter γ variation for GNFPLSIM with the logistic link function and n = 500.

        γ₁₁             γ₁₂             γ₁₃             γ₁₄             γ₁₅
Bias    0.0006          0.0053          −0.0083         0.0036          −0.0072
SD      0.0041          0.0027          0.0072          0.0064          0.0052
MSE     1.7170 × 10⁻⁵   3.5380 × 10⁻⁵   1.2073 × 10⁻⁴   5.3920 × 10⁻⁵   7.888 × 10⁻⁵
Table 10. Bias, SD and MSE evolutions with respect to the parameter γ variation for GNFPLSIM with the logistic link function and n = 500.

        γ₁₆            γ₁₇            γ₁₈             γ₁₉             γ₂₀             γ₂₁
Bias    0.0027         0.0035         −0.0048         0.0215          −0.0187         −0.0214
SD      0.0063         0.0078         0.0127          0.0215          0.0254          0.0213
MSE     4.698 × 10⁻⁵   7.309 × 10⁻⁵   1.8433 × 10⁻⁴   9.2450 × 10⁻⁴   9.9485 × 10⁻⁴   9.1165 × 10⁻⁴
Table 11. The RASE criterion for the non-parametric function η(·) for both cases n = 500 and n = 1000.

Gaussian case    Mean     Median   Variance
n = 500          0.028    0.024    0.004
n = 1000         0.027    0.022    0.002
Table 12. The RASE criterion for the non-parametric function η(·) for both cases n = 500 and n = 1000.

Logistic case    Mean     Median   Variance
n = 500          0.038    0.043    0.027
n = 1000         0.029    0.039    0.016
Table 13. The MSEPs for different models: Gaussian case.

Functional models                                                    MSEP
Model 1 (GNPFPLSIM): g(μ(Xᵢ, Zᵢ)) = η(α₁X₁,ᵢ + α₂X₂,ᵢ) + r(Zᵢ)       0.019
Model 2 (GNPFPLM):   g(μ(Xᵢ, Zᵢ)) = α₁X₁,ᵢ + α₂X₂,ᵢ + r(Zᵢ)          0.059
Model 3 (SIM):       Yᵢ = η(α₁X₁,ᵢ + α₂X₂,ᵢ) + εᵢ                    1.102
Model 4 (FM):        Yᵢ = r(Zᵢ) + εᵢ                                 1.831
Table 14. The MSEPs for different models: logistic case.

Functional models                                                    MSEP
Model 1 (GNPFPLSIM): g(μ(Xᵢ, Zᵢ)) = η(α₁X₁,ᵢ + α₂X₂,ᵢ) + r(Zᵢ)       0.039
Model 2 (GNPFPLM):   g(μ(Xᵢ, Zᵢ)) = α₁X₁,ᵢ + α₂X₂,ᵢ + r(Zᵢ)          0.093
Model 3 (SIM):       Yᵢ = η(α₁X₁,ᵢ + α₂X₂,ᵢ) + εᵢ                    1.102
Model 4 (FM):        Yᵢ = r(Zᵢ) + εᵢ                                 1.831
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Alahiane, M.; Ouassou, I.; Rachdi, M.; Vieu, P. High-Dimensional Statistics: Non-Parametric Generalized Functional Partially Linear Single-Index Model. Mathematics 2022, 10, 2704. https://doi.org/10.3390/math10152704
