Article

Robust Z-Estimators for Semiparametric Moment Condition Models

Aida Toma 1,2
1 Department of Applied Mathematics, Bucharest University of Economic Studies, 010374 Bucharest, Romania
2 “Gheorghe Mihoc-Caius Iacob” Institute of Mathematical Statistics and Applied Mathematics of the Romanian Academy, 050711 Bucharest, Romania
Entropy 2023, 25(7), 1013; https://doi.org/10.3390/e25071013
Submission received: 8 May 2023 / Revised: 28 June 2023 / Accepted: 29 June 2023 / Published: 30 June 2023
(This article belongs to the Special Issue Information and Divergence Measures)

Abstract

In the present paper, we introduce a class of robust Z-estimators for moment condition models. These new estimators can be seen as robust alternatives to the minimum empirical divergence estimators. Using the multidimensional Huber function, we first define robust estimators of the element that realizes the supremum in the dual form of the divergence. A linear relationship between the influence function of a minimum empirical divergence estimator and the influence function of the estimator of the element realizing the supremum in the dual form of the divergence leads to the idea of defining new Z-estimators for the parameter of the model by using robust estimators in the dual form of the divergence. We prove the asymptotic properties of the proposed estimators, including consistency and asymptotic normality. We then derive the influence functions of the estimators and demonstrate their robustness.

1. Introduction

A moment condition model is a family $\mathcal{M}^1$ of probability measures, all defined on the same measurable space $(\mathbb{R}^m, \mathcal{B}(\mathbb{R}^m))$, such that
$$\int g(x,\theta)\, dQ(x) = 0, \quad \text{for all } Q \in \mathcal{M}^1.$$
The parameter $\theta$ belongs to $\Theta \subset \mathbb{R}^d$; the function $g := (g_1, \dots, g_l)^\top$ is defined on $\mathbb{R}^m \times \Theta$, each of the $g_i$'s being real-valued, $l \geq d$, and the functions $g_1, \dots, g_l$ and $\mathbb{1}_X$ are supposed to be linearly independent. Denote by $M^1$ the set of all probability measures on $(\mathbb{R}^m, \mathcal{B}(\mathbb{R}^m))$ and
$$\mathcal{M}^1_\theta := \left\{ Q \in M^1 : \int g(x,\theta)\, dQ(x) = 0 \right\},$$
such that
$$\mathcal{M}^1 = \bigcup_{\theta \in \Theta} \mathcal{M}^1_\theta.$$
Let $X_1, \dots, X_n$ be an i.i.d. sample on the random vector $X$ with unknown probability distribution $P_0$. We consider the problem of estimating the parameter $\theta_0$ for which the constraints of the model are satisfied:
$$\int g(x,\theta_0)\, dP_0(x) = 0.$$
We suppose that $\theta_0$ is the unique solution of Equation (4). Thus, we assume that information about $\theta_0$ and $P_0$ is available in the form of $l \geq d$ functionally independent unbiased estimating functions, and we use this information to estimate $\theta_0$.
Among the best-known estimation methods for moment condition models, we mention the generalized method of moments (GMM) [1], the continuous updating (CU) estimator [2], the empirical likelihood (EL) estimator [3,4], the exponential tilting (ET) estimator [5], and the generalized empirical likelihood (GEL) estimators [6]. Although the EL estimator is superior to other estimators in terms of higher-order asymptotic properties, these properties hold only under the correct specification of the moment conditions. The exponentially tilted empirical likelihood (ETEL) estimator was proposed in [7]; it has the same higher-order properties as the EL estimator under correct specification, while maintaining the usual asymptotic properties, such as consistency and asymptotic normality, under misspecification. The so-called information and entropy econometric techniques have been proposed to improve the finite-sample performance of GMM estimators and tests (see, e.g., [4,5]).
Some recent methods for the estimation and testing of moment condition models are based on divergences. Divergences between probability measures are widely used in statistics and data science in order to perform inference in models of various kinds, parametric or semiparametric. Statistical methods based on divergence minimization extend the likelihood paradigm and often have the advantage of providing a trade-off between efficiency and robustness [8,9,10,11]. A general methodology for the estimation and testing of moment condition models was developed in [12]. This approach is based on minimizing divergences in their dual form and allows the asymptotic study of the estimators, called minimum empirical divergence estimators, and of the associated test statistics, both under the model and under misspecification of the model. The approach based on minimizing dual forms of divergences was initially used in the case of parametric models, the results being published in a series of articles [13,14,15,16]. The broad class of minimum empirical divergence estimators contains, in particular, the EL estimator, the CU estimator, and the ET estimator mentioned above. Using the influence function as the robustness measure, it has been shown that the minimum empirical divergence estimators are not robust, because the corresponding influence functions are generally not bounded [17]. On the other hand, the minimum empirical divergence estimators have the same first-order efficiency, and moreover, the EL estimator, which belongs to this class, is superior in higher-order efficiency. Therefore, proposing robust versions of the minimum empirical divergence estimators would bring a trade-off between robustness and efficiency. These aspects motivated our studies in the present paper.
Some robust estimation methods for moment condition models have been proposed in the literature, for example in [18,19,20,21,22]. In the present paper, we introduce a class of robust Z-estimators for moment condition models. These new estimators can be seen as robust alternatives to the minimum empirical divergence estimators. Using the multidimensional Huber function, we first define robust estimators of the element that realizes the supremum in the dual form of the divergence. A linear relationship between the influence function of a minimum empirical divergence estimator and the influence function of the estimator of the element realizing the supremum in the dual form of the divergence leads to the idea of defining new Z-estimators for the parameter of the model by using robust estimators in the dual form of the divergence. We prove the asymptotic properties of the proposed estimators, including consistency and asymptotic normality. We then derive the influence functions of the estimators and demonstrate their robustness.
The paper is organized as follows. In Section 2, we briefly recall the context and the definitions of the minimum empirical divergence estimators, these being necessary for defining the new estimators. In Section 3, the new Z-estimators for moment condition models are defined; we prove their asymptotic properties, including consistency and asymptotic normality, and then derive their influence functions and demonstrate their robustness. The proofs of the theoretical results are deferred to Appendix A.

2. Minimum Empirical Divergence Estimators

2.1. Statistical Divergences

Let $\varphi$ be a convex function defined on $\mathbb{R}$ and $[0, +\infty]$-valued, such that $\varphi(1) = 0$, and let $P \in M^1$ be some probability measure. For any signed finite measure $Q$ defined on the same measurable space $(\mathbb{R}^m, \mathcal{B}(\mathbb{R}^m))$, absolutely continuous (a.c.) with respect to $P$, the $\varphi$-divergence between $Q$ and $P$ is defined by
$$D_\varphi(Q, P) := \int \varphi\left(\frac{dQ}{dP}(x)\right) dP(x).$$
When $Q$ is not a.c. with respect to $P$, we set $D_\varphi(Q, P) = +\infty$. This extension was adopted in order to have a unique definition of divergences, appropriate for both continuous and discrete probability laws. The definition above extends that of divergences between probability measures [23], and the necessity of working with signed finite measures will be explained in Section 2.2.
Largely used in information theory, the Kullback–Leibler divergence is associated with the real convex function $\varphi(x) := x \log x - x + 1$ and is defined by
$$KL(Q, P) := \int \log\left(\frac{dQ}{dP}\right) dQ.$$
The modified Kullback–Leibler divergence is associated with the convex function $\varphi(x) := -\log x + x - 1$ and is defined through
$$KL_m(Q, P) := -\int \log\left(\frac{dQ}{dP}\right) dP.$$
Other divergences, largely used in inferential statistics, are the $\chi^2$ and the modified $\chi^2$ divergences, namely
$$\chi^2(Q, P) := \frac{1}{2} \int \left(\frac{dQ}{dP} - 1\right)^2 dP, \qquad \chi^2_m(Q, P) := \frac{1}{2} \int \frac{\left(\frac{dQ}{dP} - 1\right)^2}{\frac{dQ}{dP}}\, dP,$$
these being associated with the convex functions $\varphi(x) := \frac{1}{2}(x-1)^2$ and $\varphi(x) := \frac{1}{2}(x-1)^2 / x$, respectively. The Hellinger distance and the $L_1$ distance are also $\varphi$-divergences. They are associated with the convex functions $\varphi(x) := 2(\sqrt{x} - 1)^2$ and $\varphi(x) := |x - 1|$, respectively.
All the preceding examples, except the $L_1$ distance, belong to the class of power divergences introduced by Cressie and Read [24] and defined by the convex functions
$$\varphi_\gamma(x) := \frac{x^\gamma - \gamma x + \gamma - 1}{\gamma(\gamma - 1)}, \quad x \in \mathbb{R}_+^*,$$
for $\gamma \in \mathbb{R} \setminus \{0, 1\}$, together with $\varphi_0(x) := -\log x + x - 1$ and $\varphi_1(x) := x \log x - x + 1$. The Kullback–Leibler divergence is associated with $\varphi_1$, the modified Kullback–Leibler divergence with $\varphi_0$, the $\chi^2$ divergence with $\varphi_2$, the modified $\chi^2$ divergence with $\varphi_{-1}$, and the Hellinger distance with $\varphi_{1/2}$. When $\varphi_\gamma$ is not defined on $(-\infty, 0)$ or when $\varphi_\gamma$ is not convex there, the definition of the corresponding power divergence function $Q \in M^1 \mapsto D_{\varphi_\gamma}(Q, P)$ can be extended to the whole set of signed finite measures by taking the following extension of $\varphi_\gamma$:
$$\varphi_\gamma : x \in \mathbb{R} \mapsto \varphi_\gamma(x)\, \mathbb{1}_{[0, +\infty)}(x) + (+\infty)\, \mathbb{1}_{(-\infty, 0)}(x).$$
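As a concrete illustration of these definitions (ours, not part of the original paper), the following Python sketch evaluates $\varphi_\gamma$ and the resulting power divergence for discrete distributions on a common finite support; the function names are assumptions of this sketch, and we assume $Q \ll P$ (all $p_i > 0$).

```python
import numpy as np

def phi_gamma(x, gamma):
    """Cressie-Read convex function phi_gamma; gamma = 0 and gamma = 1
    are the limiting cases (modified KL and KL, respectively)."""
    x = np.asarray(x, dtype=float)
    if gamma == 0:
        return -np.log(x) + x - 1.0
    if gamma == 1:
        return x * np.log(x) - x + 1.0
    return (x**gamma - gamma * x + gamma - 1.0) / (gamma * (gamma - 1.0))

def phi_divergence(q, p, gamma):
    """D_phi(Q, P) = sum_i p_i * phi(q_i / p_i) for discrete laws q, p
    on a common finite support, assuming p > 0 (so Q << P)."""
    q, p = np.asarray(q, dtype=float), np.asarray(p, dtype=float)
    return float(np.sum(p * phi_gamma(q / p, gamma)))

# Example: gamma = 2 recovers the chi^2 divergence 0.5 * sum (q-p)^2 / p.
p = np.array([0.25, 0.25, 0.5])
q = np.array([0.2, 0.3, 0.5])
print(phi_divergence(q, p, gamma=2.0))  # 0.01
```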
The $\varphi$-divergence between some set $\Omega$ of signed finite measures and a probability measure $P$ is defined by
$$D_\varphi(\Omega, P) = \inf_{Q \in \Omega} D_\varphi(Q, P).$$
Assuming that $D_\varphi(\Omega, P)$ is finite, a measure $Q^* \in \Omega$ is called a $\varphi$-projection of $P$ on $\Omega$ if
$$D_\varphi(Q^*, P) \leq D_\varphi(Q, P), \quad \text{for all } Q \in \Omega.$$

2.2. Minimum Empirical Divergence Estimators

Let $X_1, \dots, X_n$ be an i.i.d. sample on the random vector $X$ with the probability distribution $P_0$. The “plug-in” estimator of the $\varphi$-divergence $D_\varphi(\mathcal{M}^1_\theta, P_0)$ between the set $\mathcal{M}^1_\theta$ and the probability measure $P_0$ is defined by replacing $P_0$ with the empirical measure associated with the sample. More precisely,
$$\widehat{D}_\varphi(\mathcal{M}^1_\theta, P_0) = \inf_{Q \in \mathcal{M}^1_\theta} D_\varphi(Q, P_n) = \inf_{Q \in \mathcal{M}^1_\theta} \int \varphi\left(\frac{dQ}{dP_n}(x)\right) dP_n(x),$$
where $P_n := \frac{1}{n} \sum_{i=1}^n \delta_{X_i}$ is the empirical measure associated with the sample, $\delta_x$ being the Dirac measure putting all its mass at $x$. If the projection of the measure $P_n$ on $\mathcal{M}^1_\theta$ exists, it is a law a.c. with respect to $P_n$. Then, it is natural to consider
$$\mathcal{M}^{(n)}_\theta = \left\{ Q \in M^1 : Q \text{ a.c. with respect to } P_n \text{ and } \sum_{i=1}^n g(X_i, \theta)\, Q(X_i) = 0 \right\},$$
and then, the plug-in estimator (8) can be written as
$$\widehat{D}_\varphi(\mathcal{M}^1_\theta, P_0) = \inf_{Q \in \mathcal{M}^{(n)}_\theta} \frac{1}{n} \sum_{i=1}^n \varphi(n\, Q(X_i)).$$
The infimum in the above expression (10) may be achieved at a point situated on the frontier of the set $\mathcal{M}^{(n)}_\theta$, in which case the Lagrange method for characterizing the infimum and computing $\widehat{D}_\varphi(\mathcal{M}^1_\theta, P_0)$ cannot be applied. In order to avoid this difficulty, Broniatowski and Keziou [12,25] proposed to work on sets of signed finite measures and defined
$$\mathcal{M}_\theta := \left\{ Q \in M : \int dQ = 1 \text{ and } \int g(x,\theta)\, dQ(x) = 0 \right\},$$
where $M$ denotes the set of all signed finite measures on the measurable space $(\mathbb{R}^m, \mathcal{B}(\mathbb{R}^m))$, and
$$\mathcal{M} := \bigcup_{\theta \in \Theta} \mathcal{M}_\theta.$$
They showed that, if the projection $Q^*_1$ of $P_n$ on $\mathcal{M}^1_\theta$ is an interior point of $\mathcal{M}^1_\theta$ and the projection $Q^*$ of $P_n$ on $\mathcal{M}_\theta$ is an interior point of $\mathcal{M}_\theta$, then the two approaches for defining minimum divergence estimators, the one based on signed finite measures and the one based on probability measures, coincide. On the other hand, when $Q^*_1$ is a frontier point of $\mathcal{M}^1_\theta$, the estimator of the parameter $\theta_0$ defined in the context of signed finite measures still converges to $\theta_0$. These aspects justify the substitution of $\mathcal{M}^1_\theta$ by $\mathcal{M}_\theta$.
In the following, we briefly recall the definitions of the estimators for moment condition models proposed in [12] in the context of signed finite measure sets.
Denote by $\bar{g}$ the function defined on $\mathbb{R}^m \times \Theta$ and $\mathbb{R}^{l+1}$-valued:
$$\bar{g}(x, \theta) := \big( \mathbb{1}_X(x),\, g_1(x,\theta),\, \dots,\, g_l(x,\theta) \big)^{\!\top}.$$
Given a $\varphi$-divergence, when the function $\varphi$ is strictly convex on its domain, denote by
$$\varphi^*(u) := u\, \varphi'^{-1}(u) - \varphi\big(\varphi'^{-1}(u)\big)$$
the convex conjugate of the function $\varphi$. For a given probability measure $P \in M^1$ and a fixed $\theta \in \Theta$, define
$$\Lambda_\theta(P) := \left\{ t \in \mathbb{R}^{l+1} : \int \left| \varphi^*\Big( t_0 + \sum_{j=1}^l t_j g_j(x,\theta) \Big) \right| dP(x) < \infty \right\}.$$
We also use the notations $\Lambda_\theta$ for $\Lambda_\theta(P_0)$ and $\Lambda^{(n)}_\theta$ for $\Lambda_\theta(P_n)$.
Supposing that $P_0$ admits a projection $Q^*_\theta$ on $\mathcal{M}_\theta$ with the same support as $P_0$, and that the function $\varphi$ is strictly convex on its domain, the $\varphi$-divergence $D_\varphi(\mathcal{M}_\theta, P_0)$ admits the dual representation
$$D_\varphi(\mathcal{M}_\theta, P_0) = \sup_{t \in \Lambda_\theta} \int m(x, \theta, t)\, dP_0(x),$$
where $m(x, \theta, t) := t_0 - \varphi^*\big( t^\top \bar{g}(x,\theta) \big)$.
The supremum in (15) is reached at a unique point, which we denote by $t_\theta = t_\theta(P_0)$:
$$t_\theta := \arg\sup_{t \in \Lambda_\theta} \int m(x, \theta, t)\, dP_0(x).$$
Then, $D_\varphi(\mathcal{M}_\theta, P_0)$, $t_\theta$, $D_\varphi(\mathcal{M}, P_0)$, and $\theta_0$ can be estimated, respectively, by
$$\widehat{D}_\varphi(\mathcal{M}_\theta, P_0) := \sup_{t \in \Lambda^{(n)}_\theta} \int m(x, \theta, t)\, dP_n(x),$$
$$\hat{t}_\theta := \arg\sup_{t \in \Lambda^{(n)}_\theta} \int m(x, \theta, t)\, dP_n(x),$$
$$\widehat{D}_\varphi(\mathcal{M}, P_0) := \inf_{\theta \in \Theta} \sup_{t \in \Lambda^{(n)}_\theta} \int m(x, \theta, t)\, dP_n(x),$$
$$\hat{\theta}_\varphi := \arg\inf_{\theta \in \Theta} \sup_{t \in \Lambda^{(n)}_\theta} \int m(x, \theta, t)\, dP_n(x).$$
The estimators defined in (20) are called minimum empirical divergence estimators. We refer to [12] for the complete study of the existence and the asymptotic properties of the above estimators.
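To make the dual construction concrete, here is a minimal numerical sketch (our illustration, not the author's implementation) of the estimators $\hat{t}_\theta$ and $\hat{\theta}_\varphi$ for the $\chi^2$ divergence, for which the convex conjugate has the closed form $\varphi^*(u) = u + u^2/2$. The mean model $g(x, \theta) = x - \theta$, the optimizer choices, and all names below are assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import minimize

def gbar(x, theta):
    # \bar g(x, theta) = (1, g(x, theta)) for the mean model g(x, theta) = x - theta
    return np.stack([np.ones_like(x), x - theta], axis=-1)

def m_fun(x, theta, t):
    u = gbar(x, theta) @ t             # t^T \bar g(x, theta)
    return t[0] - (u + 0.5 * u ** 2)   # m(x, theta, t) = t_0 - phi*(t^T gbar)

def t_hat(theta, sample):
    # inner supremum: \hat t_theta = argsup_t of the sample mean of m
    return minimize(lambda t: -np.mean(m_fun(sample, theta, t)),
                    x0=np.zeros(2), method="BFGS")

def theta_hat(sample):
    # outer infimum over theta of the inner supremum
    crit = lambda th: -t_hat(th[0], sample).fun
    return minimize(crit, x0=np.array([sample.mean()]),
                    method="Nelder-Mead").x[0]

rng = np.random.default_rng(0)
sample = rng.normal(loc=1.0, scale=1.0, size=200)
print(theta_hat(sample))  # close to the true mean 1.0
```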
The influence functions of these estimators and the corresponding robustness properties were studied in [17]. According to those results, for $\theta \in \Theta$ fixed, the influence function of the estimator $\hat{t}_\theta$ is given by
$$\mathrm{IF}(x; t_\theta, P_0) = -\left[ \int \frac{\partial^2}{\partial t^2} m\big(y, \theta, t_\theta(P_0)\big)\, dP_0(y) \right]^{-1} \frac{\partial}{\partial t} m\big(x, \theta, t_\theta(P_0)\big),$$
where
$$\frac{\partial}{\partial t} m(x, \theta, t) = (1, 0_l^\top)^{\!\top} - \varphi'^{-1}\big( t^\top \bar{g}(x,\theta) \big)\, \bar{g}(x,\theta),$$
$$\frac{\partial^2}{\partial t^2} m(x, \theta, t) = -\frac{1}{\varphi''\big( \varphi'^{-1}( t^\top \bar{g}(x,\theta) ) \big)}\, \bar{g}(x,\theta)\, \bar{g}(x,\theta)^\top,$$
with the particular case $\theta = \theta_0$:
$$\mathrm{IF}(x; t_{\theta_0}, P_0) = -\varphi''(1) \left[ \int \bar{g}(y,\theta_0)\, \bar{g}(y,\theta_0)^\top dP_0(y) \right]^{-1} \big( 0,\, g(x,\theta_0)^\top \big)^{\!\top}.$$
On the other hand, the influence function of the estimator $\hat{\theta}_\varphi$ is given by
$$\mathrm{IF}(x; T_\varphi, P_0) = -\left[ \left( \int \partial_\theta g(y,\theta_0)\, dP_0(y) \right)^{\!\top} \left[ \int g(y,\theta_0)\, g(y,\theta_0)^\top dP_0(y) \right]^{-1} \int \partial_\theta g(y,\theta_0)\, dP_0(y) \right]^{-1} \cdot \left( \int \partial_\theta g(y,\theta_0)\, dP_0(y) \right)^{\!\top} \left[ \int g(y,\theta_0)\, g(y,\theta_0)^\top dP_0(y) \right]^{-1} g(x,\theta_0).$$
Since the function $x \mapsto g(x,\theta)$ is usually not bounded, for example when the constraints are linear, the influence function $\mathrm{IF}(x; T_\varphi, P_0)$ is not bounded; therefore, the minimum empirical divergence estimators $\hat{\theta}_\varphi$ defined in (20) are generally not robust.
A direct calculation shows that there is a connection between the influence functions $\mathrm{IF}(x; t_{\theta_0}, P_0)$ and $\mathrm{IF}(x; T_\varphi, P_0)$, namely the relation
$$\left( \int \partial_\theta \bar{g}(y,\theta_0)\, dP_0(y) \right)^{\!\top} \left( \frac{\partial}{\partial \theta} t(\theta_0, P_0)\; \mathrm{IF}(x; T_\varphi, P_0) + \mathrm{IF}(x; t_{\theta_0}, P_0) \right) = 0.$$
Since $\mathrm{IF}(x; T_\varphi, P_0)$ is linearly related to $\mathrm{IF}(x; t_{\theta_0}, P_0)$, using a robust estimator of $t_\theta = t_\theta(P_0)$ in the original duality Formula (15) would lead to a new robust estimator of $\theta_0$. This is the idea underlying our proposal in this paper for constructing new robust estimators for moment condition models.

3. Robust Estimators for Moment Condition Models

3.1. Definitions of New Estimators

In this section, we define robust versions of the estimators $\hat{t}_\theta$ from (18) and robust versions of the minimum empirical divergence estimators $\hat{\theta}_\varphi$ from (20). First, we define robust estimators of $t_\theta$ by using a truncated version of the function $x \mapsto \partial_t m(x, \theta, t)$, and then, we insert such a robust estimator into the estimating equation corresponding to the minimum empirical divergence estimator. The truncated function is based on the multidimensional Huber function and contains a shift vector $\tau_\theta$ and a scale matrix $A_\theta$ for calibration; thus, $t_\theta$, which realizes the supremum in the duality formula, is also the solution of a new equation based on the truncated function.
For simplicity, for fixed $\theta \in \Theta$, we also use the notation $m_\theta(x, t) := m(x, \theta, t)$. With this notation, $t_\theta = t_\theta(P_0)$ defined in (16) is the unique solution of the equation
$$\int \frac{\partial}{\partial t} m_\theta(x, t_\theta(P_0))\, dP_0(x) = 0.$$
Consider the system
$$\int \frac{\partial}{\partial t} m_\theta(y, t)\, dP_0(y) = 0,$$
$$\int H_c\big( A\, [\partial_t m_\theta(y, t) - \tau] \big)\, dP_0(y) = 0,$$
$$\int H_c\big( A\, [\partial_t m_\theta(y, t) - \tau] \big)\, H_c\big( A\, [\partial_t m_\theta(y, t) - \tau] \big)^{\!\top} dP_0(y) = I_{l+1},$$
where
$$H_c(y) := \begin{cases} y \cdot \min\left(1, \dfrac{c}{\|y\|}\right), & \text{if } y \neq 0, \\ 0, & \text{if } y = 0, \end{cases}$$
is the multidimensional Huber function, with $c > 0$, $I_{l+1}$ the identity matrix of order $l+1$, $A$ an $(l+1) \times (l+1)$ matrix, and $\tau \in \mathbb{R}^{l+1}$. For fixed $\theta$, this system admits a unique solution $(t, A, \tau) = (t_\theta(P_0), A_\theta(P_0), \tau_\theta(P_0))$ (according to [18], p. 17).
The multidimensional Huber function is useful for defining robust estimators; it maps each point outside the hypersphere of radius $c$ to the nearest point of the hypersphere and leaves the points inside unchanged (see [26], p. 239, [27]). By applying the multidimensional Huber function to the function $y \mapsto \partial_t m_\theta(y, t)$, together with the scale matrix $A_\theta$ and the shift vector $\tau_\theta$, a modification is produced exactly where the norm exceeds the bound $c$, while the original $t_\theta$ remains the solution of the equation based on the new truncated function. For parametric models, the multidimensional Huber function has also been used in other contexts, for example to define optimal $B_s$-robust estimators or optimal $B_i$-robust estimators (see [26], p. 244). A minimal implementation of $H_c$ is sketched below.
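In code, the definition above is a one-liner; the following Python sketch (ours, not from the paper) implements $H_c$ and illustrates the projection onto the ball of radius $c$.

```python
import numpy as np

def huber_multidim(y, c):
    """Multidimensional Huber function H_c: points inside the ball of
    radius c are left unchanged; points outside are projected onto its
    boundary; H_c(0) = 0."""
    y = np.asarray(y, dtype=float)
    norm = np.linalg.norm(y)
    if norm == 0.0:
        return np.zeros_like(y)
    return y * min(1.0, c / norm)

# A point at distance 5 from the origin is pulled back to distance 2.
print(np.linalg.norm(huber_multidim(np.array([3.0, 4.0]), c=2.0)))  # 2.0
```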
The above arguments can be used for each probability measure $P$ from the moment condition model $\mathcal{M}^1$. This context allows defining the truncated version of the function $y \mapsto \partial_t m_\theta(y, t)$, which we denote by $\psi_\theta(y, t)$, such that the original $t_\theta(P_0)$, the solution of Equation (26), is also the solution of the equation $\int \psi_\theta(y, t_\theta(P_0))\, dP_0(y) = 0$.
For $\theta$ fixed and $P$ a probability measure, the equation $\int \partial_t m_\theta(y, t)\, dP(y) = 0$ has a unique solution $t = t_\theta(P) \in \Lambda_\theta(P)$, attaining the supremum in the dual form of the divergence $D_\varphi(\mathcal{M}_\theta, P)$ (see [12]). For each $t$, we define $A_\theta(t)$ and $\tau_\theta(t)$ as the solutions of the system
$$\int H_c\big( A_\theta(t)\, [\partial_t m_\theta(y, t) - \tau_\theta(t)] \big)\, dP(y) = 0,$$
$$\int H_c\big( A_\theta(t)\, [\partial_t m_\theta(y, t) - \tau_\theta(t)] \big)\, H_c\big( A_\theta(t)\, [\partial_t m_\theta(y, t) - \tau_\theta(t)] \big)^{\!\top} dP(y) = I_{l+1}.$$
We define a new estimator $\hat{t}^c_\theta$ of $t_\theta = t_\theta(P_0)$ as the Z-estimator corresponding to the $\psi$-function
$$\psi_\theta(x, t) := H_c\big( A_\theta(t)\, [\partial_t m_\theta(x, t) - \tau_\theta(t)] \big);$$
more precisely, $\hat{t}^c_\theta$ is defined by
$$\int \psi_\theta(y, \hat{t}^c_\theta)\, dP_n(y) = 0, \quad \text{or} \quad \sum_{i=1}^n H_c\big( A_\theta(\hat{t}^c_\theta)\, [\partial_t m_\theta(X_i, \hat{t}^c_\theta) - \tau_\theta(\hat{t}^c_\theta)] \big) = 0,$$
the theoretical counterpart of this estimating equation being
$$\int \psi_\theta(y, t_\theta(P_0))\, dP_0(y) = 0.$$
For a given probability measure $P$, the statistical functional $t^c_\theta(P)$ associated with the estimator $\hat{t}^c_\theta$, whenever it exists, is defined by
$$\int \psi_\theta(y, t^c_\theta(P))\, dP(y) = \int H_c\big( A_\theta(t^c_\theta(P))\, [\partial_t m_\theta(y, t^c_\theta(P)) - \tau_\theta(t^c_\theta(P))] \big)\, dP(y) = 0.$$
Note that
$$t^c_\theta(P_0) = t_\theta(P_0),$$
by construction.
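The following Python sketch (our illustration; the score array, dimensions, and starting values are assumptions) shows how, for a fixed $t$, the pair $(A_\theta(t), \tau_\theta(t))$ solving the empirical analogues of (31) and (32) can be computed with a root finder. $A$ is parameterized as lower triangular so that the number of unknowns matches the number of equations, and $S$ stands for the $n \times (l+1)$ array of scores $\partial_t m_\theta(X_i, t)$; the full estimator $\hat{t}^c_\theta$ then couples this calibration step with the estimating Equation (34).

```python
import numpy as np
from scipy.optimize import root

def Hc(Y, c):
    # row-wise multidimensional Huber function
    norms = np.linalg.norm(Y, axis=1, keepdims=True)
    return Y * np.minimum(1.0, c / np.where(norms == 0.0, 1.0, norms))

def calibrate(S, c):
    """Solve the empirical versions of (31)-(32) for (A, tau), given
    the n x k score array S; A is taken lower triangular."""
    n, k = S.shape
    tril = np.tril_indices(k)

    def unpack(p):
        tau, A = p[:k], np.zeros((k, k))
        A[tril] = p[k:]
        return A, tau

    def equations(p):
        A, tau = unpack(p)
        Z = Hc((S - tau) @ A.T, c)
        e1 = Z.mean(axis=0)              # empirical analogue of (31)
        e2 = (Z.T @ Z) / n - np.eye(k)   # empirical analogue of (32)
        return np.concatenate([e1, e2[tril]])

    # start from the untruncated (c = infinity) moment solution
    tau0 = S.mean(axis=0)
    A0 = np.linalg.inv(np.linalg.cholesky(np.cov(S.T, bias=True)))
    sol = root(equations, np.concatenate([tau0, A0[tril]]), method="hybr")
    return unpack(sol.x)

rng = np.random.default_rng(1)
A, tau = calibrate(rng.normal(size=(500, 2)), c=3.0)  # A near I, tau near 0
```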
Remark 1.
We notice a similarity between the Z-estimator defined in (34) and the classical optimal $B_s$-robust estimator for parametric models from [26]. In the case of parametric models, the M-estimator corresponding to the $\psi$-function (33), but defined for the classical score function $\partial_t \ln f_t(x) = \partial_t f_t(x) / f_t(x)$ instead of the function $\partial_t m_\theta(x, t)$ (including in the system (31) and (32) defining $A_\theta(t)$ and $\tau_\theta(t)$), is the classical optimal $B_s$-robust estimator, where $f_t(x)$ denotes the density corresponding to a parametric model indexed by the parameter $t$. The classical optimal $B_s$-robust estimator for parametric models has the optimality property that it minimizes a measure of the asymptotic mean-squared error among all Fisher-consistent estimators with a self-standardized sensitivity smaller than the positive constant $c$.
In the following, for a given divergence, using the estimators $\hat{t}^c_\theta$ for $t_\theta(P_0)$, we construct new estimators of the parameter $\theta_0$ of the model. In Section 3.3, we prove that all the estimators $\hat{t}^c_\theta$ are robust, and this property is transferred to the new estimators that we define for the parameter $\theta_0$.
To define new estimators for $\theta_0$, we use the dual representation (15) of the divergence $D_\varphi(\mathcal{M}_\theta, P_0)$. Since
$$\theta_0 = \arg\inf_{\theta \in \Theta} D_\varphi(\mathcal{M}_\theta, P_0) = \arg\inf_{\theta \in \Theta} \sup_{t \in \Lambda_\theta} \int m_\theta(y, t)\, dP_0(y) = \arg\inf_{\theta \in \Theta} \int m_\theta(y, t_\theta(P_0))\, dP_0(y),$$
$\theta = \theta_0$ is the solution of the equation
$$\int \frac{\partial}{\partial \theta}\big[ m(y, \theta, t(\theta, P_0)) \big]\, dP_0(y) = 0,$$
where we use the notation $t(\theta, P) := t_\theta(P)$. Equation (40) may be written as
$$\int \frac{\partial}{\partial \theta} m(y, \theta_0, t(\theta_0, P_0))\, dP_0(y) + \left( \frac{\partial}{\partial \theta} t(\theta_0, P_0) \right)^{\!\top} \int \frac{\partial}{\partial t} m(y, \theta_0, t(\theta_0, P_0))\, dP_0(y) = 0.$$
On the basis of the definition of $t_\theta(P_0) = t(\theta, P_0)$, for $\theta = \theta_0$, we have
$$\int \frac{\partial}{\partial t} m(y, \theta_0, t(\theta_0, P_0))\, dP_0(y) = 0;$$
therefore, we deduce that $\theta = \theta_0$ is the solution of the equation
$$\int \frac{\partial}{\partial \theta} m(y, \theta, t(\theta, P_0))\, dP_0(y) = 0.$$
Using (37), namely $t^c(\theta, P_0) = t(\theta, P_0)$, we obtain that $\theta = \theta_0$ is in fact the solution of the equation
$$\int \frac{\partial}{\partial \theta} m(y, \theta, t^c(\theta, P_0))\, dP_0(y) = 0.$$
Then, we define a new estimator $\hat{\theta}^c_\varphi$ of $\theta_0$ as a plug-in solution of the equation
$$\int \frac{\partial}{\partial \theta} m(y, \hat{\theta}^c_\varphi, t^c(\hat{\theta}^c_\varphi, P_n))\, dP_n(y) = 0.$$
For a probability measure $P$, the statistical functional $T^c$ corresponding to the estimator $\hat{\theta}^c_\varphi$, whenever it exists, is defined by
$$\int \frac{\partial}{\partial \theta} m(y, T^c(P), t^c(T^c(P), P))\, dP(y) = 0.$$
The functional $T^c$ is Fisher-consistent, because
$$T^c(P_0) = \theta_0.$$
This equality is obtained by using (46) for $P = P_0$, the fact that $t^c(T^c(P_0), P_0) = t(T^c(P_0), P_0)$, and the definition of $t_\theta(P_0) = t(\theta, P_0)$ for $\theta = T^c(P_0)$, all of which lead to
$$\int \frac{\partial}{\partial \theta} m(y, T^c(P_0), t(T^c(P_0), P_0))\, dP_0(y) + \left( \frac{\partial}{\partial \theta} t(T^c(P_0), P_0) \right)^{\!\top} \int \frac{\partial}{\partial t} m(y, T^c(P_0), t(T^c(P_0), P_0))\, dP_0(y) = 0.$$
Since $\theta_0$ is the unique solution of Equation (41) and, according to (48), $T^c(P_0)$ would be another solution of the same equation, we deduce (47).
From (34) and (45), we have
$$\int \psi_{\hat{\theta}^c_\varphi}(y, t^c(\hat{\theta}^c_\varphi, P_n))\, dP_n(y) = 0, \qquad \int \frac{\partial}{\partial \theta} m(y, \hat{\theta}^c_\varphi, t^c(\hat{\theta}^c_\varphi, P_n))\, dP_n(y) = 0,$$
and then,
$$\int \psi(y, \hat{\theta}^c_\varphi, \hat{t}^c_{\hat{\theta}^c_\varphi})\, dP_n(y) = 0, \qquad \int \frac{\partial}{\partial \theta} m(y, \hat{\theta}^c_\varphi, \hat{t}^c_{\hat{\theta}^c_\varphi})\, dP_n(y) = 0,$$
with $\psi(y, \theta, t) := \psi_\theta(y, t)$. The couple of estimators $(\hat{\theta}^c_\varphi, \hat{t}^c_{\hat{\theta}^c_\varphi})$ can be viewed as a Z-estimator solution of the above system. Denoting
$$\Psi(y, \theta, t) := \big( \psi(y, \theta, t)^\top,\; (\partial_\theta m(y, \theta, t))^\top \big)^{\!\top},$$
the Z-estimators $(\hat{\theta}^c_\varphi, \hat{t}^c_{\hat{\theta}^c_\varphi})$ are the solution of the system
$$\int \Psi(y, \hat{\theta}^c_\varphi, \hat{t}^c_{\hat{\theta}^c_\varphi})\, dP_n(y) = 0,$$
and the theoretical counterpart is given by
$$\int \Psi(y, \theta_0, t_{\theta_0})\, dP_0(y) = 0.$$

3.2. Asymptotic Properties

In this section, we establish the consistency and the asymptotic distributions of the estimators $\hat{\theta}^c_\varphi$ and $\hat{t}^c_{\hat{\theta}^c_\varphi}$. In order to prove the consistency of the estimators, we adopt results from the general theory of Z-estimators, as presented for example in [28]. Then, using the consistency of the estimators, together with supplementary conditions, we prove that the asymptotic distributions of the estimators are multivariate normal.
Assumption 1.
(a)
There exist compact neighbourhoods $V_{\theta_0}$ of $\theta_0$ and $V_{t_{\theta_0}}$ of $t_{\theta_0}$ such that
$$\int \sup_{\theta \in V_{\theta_0},\, t \in V_{t_{\theta_0}}} \| \Psi(y, \theta, t) \|\, dP_0(y) < \infty.$$
(b)
For any positive $\varepsilon$, the following condition holds:
$$\inf_{(\theta, t) \in M} \left\| \int \Psi(y, \theta, t)\, dP_0(y) \right\| > 0 = \left\| \int \Psi(y, \theta_0, t_{\theta_0})\, dP_0(y) \right\|,$$
where $M := \{ (\theta, t) \text{ s.t. } \|(\theta, t) - (\theta_0, t_{\theta_0})\| > \varepsilon \}$.
Proposition 1.
Under Assumption 1, $\hat{\theta}^c_\varphi$ converges in probability to $\theta_0$, and $\hat{t}^c_{\hat{\theta}^c_\varphi}$ converges in probability to $t_{\theta_0}$.
Assumption 2.
(a)
Both estimators $\hat{\theta}^c_\varphi$ and $\hat{t}^c_{\hat{\theta}^c_\varphi}$ converge in probability to $\theta_0$ and $t_{\theta_0}$, respectively.
(b)
The function $(\theta, t) \mapsto \psi(x, \theta, t)$ is $C^2$ on some neighbourhood $V(\theta_0, t_{\theta_0})$ for all $x$ ($P_0$-a.s.), and the partial derivatives of order two of the functions $\{ (\theta, t) \mapsto \psi(x, \theta, t);\ (\theta, t) \in V(\theta_0, t_{\theta_0}) \}$ are dominated by some $P_0$-integrable function $H_1(x)$.
(c)
The function $(\theta, t) \mapsto m(x, \theta, t)$ is $C^3$ on some neighbourhood $U(\theta_0, t_{\theta_0})$ for all $x$ ($P_0$-a.s.), and the partial derivatives of order three of the functions $\{ (\theta, t) \mapsto m(x, \theta, t);\ (\theta, t) \in U(\theta_0, t_{\theta_0}) \}$ are dominated by some $P_0$-integrable function $H_2(x)$.
(d)
$\int \| \partial_\theta m(y, \theta_0, t_{\theta_0}) \|^2\, dP_0(y)$ is finite, and the matrix
$$S := \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix},$$
with $S_{11} := \int \partial_t \psi(y, \theta_0, t_{\theta_0})\, dP_0(y)$, $S_{12} := \int \partial_\theta \psi(y, \theta_0, t_{\theta_0})\, dP_0(y)$, $S_{21} := \int \partial^2_{\theta\, \partial t} m(y, \theta_0, t_{\theta_0})\, dP_0(y)$, and $S_{22} := \int \partial^2_{\theta^2} m(y, \theta_0, t_{\theta_0})\, dP_0(y)$, exists and is invertible.
Proposition 2.
Let $P_0$ belong to the model $\mathcal{M}^1$, and suppose that Assumption 2 holds. Then, $\sqrt{n}(\hat{\theta}^c_\varphi - \theta_0)$ and $\sqrt{n}(\hat{t}^c_{\hat{\theta}^c_\varphi} - t_{\theta_0})$ converge in distribution to centred multivariate normal variables, with covariance matrices given by
$$\big[ [S_{21} S_{11}^{-1} S_{12}]^{-1} S_{21} S_{11}^{-1} \big] \times \big[ [S_{21} S_{11}^{-1} S_{12}]^{-1} S_{21} S_{11}^{-1} \big]^{\!\top}$$
and
$$\big[ S_{11}^{-1} - S_{11}^{-1} S_{12} [S_{21} S_{11}^{-1} S_{12}]^{-1} S_{21} S_{11}^{-1} \big] \times \big[ S_{11}^{-1} - S_{11}^{-1} S_{12} [S_{21} S_{11}^{-1} S_{12}]^{-1} S_{21} S_{11}^{-1} \big]^{\!\top},$$
respectively.
The condition of Type (a) from Assumption 1 is usually considered in order to apply the uniform law of large numbers. For many choices of the divergence (for example, those from the Cressie–Read family), the function $\Psi$ is continuous in $(\theta, t)$, and consequently, this condition is satisfied. The second condition from Assumption 1 is imposed for the uniqueness of $(\theta_0, t_{\theta_0})$ as a solution of the estimating equation and is satisfied, for example, whenever $\Psi$ is continuous and the parameter space is compact ([28], p. 46). Furthermore, the conditions of Type (b)-(d) in Assumption 2 are often imposed in order to apply the law of large numbers or the central limit theorem and can be verified for the functions appearing in the definitions of the estimators proposed in the present paper.
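For completeness, once estimates of the blocks $S_{11}$, $S_{12}$, and $S_{21}$ are available ($S_{22} = 0$, as computed in the Appendix), the covariance matrices of Proposition 2 are straightforward to assemble numerically; the following sketch (ours) does this with plain linear algebra.

```python
import numpy as np

def proposition2_covariances(S11, S12, S21):
    """Assemble the asymptotic covariance matrices of Proposition 2.
    Expected shapes: S11 (l+1, l+1), S12 (l+1, d), S21 (d, l+1)."""
    S11_inv = np.linalg.inv(S11)
    K = np.linalg.inv(S21 @ S11_inv @ S12) @ S21 @ S11_inv
    P = S11_inv - S11_inv @ S12 @ K
    cov_theta = K @ K.T  # covariance of sqrt(n) * (theta_hat - theta_0)
    cov_t = P @ P.T      # covariance of sqrt(n) * (t_hat - t_theta0)
    return cov_theta, cov_t
```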

3.3. Influence Functions and Robustness

In this section, we derive the influence functions of the estimators $\hat{t}^c_\theta$ and $\hat{\theta}^c_\varphi$ and prove their B-robustness. The corresponding statistical functionals are defined by (36) and (46), respectively.
Recall that a map $T$, defined on a set of probability measures and parameter-space-valued, is a statistical functional corresponding to an estimator $\hat{\theta}$ of the parameter $\theta_0$ from the model $P_0$ if $\hat{\theta} = T(P_n)$, $P_n$ being the empirical measure corresponding to the sample. The influence function of $T$ at $P_0$ is defined by
$$\mathrm{IF}(x; T, P_0) := \frac{\partial}{\partial \varepsilon} T(\widetilde{P}_{\varepsilon x}) \Big|_{\varepsilon = 0},$$
where $\widetilde{P}_{\varepsilon x} := (1 - \varepsilon) P_0 + \varepsilon \delta_x$, $\delta_x$ being the Dirac measure. An unbounded influence function implies an unbounded asymptotic bias of a statistic under single-point contamination of the model. Therefore, a natural robustness requirement on a statistical functional is the boundedness of its influence function. Whenever the influence function is bounded with respect to $x$, the corresponding estimator is called B-robust [26].
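The definition above also suggests a simple numerical check of (non-)robustness: approximate the influence function at the empirical measure by a finite difference in $\varepsilon$. The sketch below is our illustration; the weighted-mean functional is chosen only to show the mechanics (its influence function, $x - E[X]$, is unbounded).

```python
import numpy as np

def influence_numeric(T, sample, x, eps=1e-4):
    """Finite-difference approximation of the influence function, where
    T takes (weights, points) and P_eps = (1 - eps) P_n + eps * delta_x."""
    n = len(sample)
    w0 = np.full(n, 1.0 / n)
    points = np.append(sample, x)
    w_eps = np.append((1.0 - eps) * w0, eps)
    return (T(w_eps, points) - T(w0, sample)) / eps

weighted_mean = lambda w, s: float(np.sum(w * s))
rng = np.random.default_rng(2)
s = rng.normal(size=1000)
print(influence_numeric(weighted_mean, s, x=5.0))  # approx 5 - mean(s)
```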
Proposition 3.
For fixed $\theta$, the influence function of the functional $t^c_\theta$ is given by
$$\mathrm{IF}(x; t^c_\theta, P_0) = -\left[ \int \partial_t \psi_\theta(y, t_\theta(P_0))\, dP_0(y) \right]^{-1} \psi_\theta(x, t_\theta(P_0)).$$
Proposition 4.
The influence function of the functional $T^c$ is given by
$$\mathrm{IF}(x; T^c, P_0) = \left[ \left( \int \partial_\theta \bar{g}(y, \theta_0)\, dP_0(y) \right)^{\!\top} \left[ \int \bar{g}(y, \theta_0)\, \bar{g}(y, \theta_0)^\top dP_0(y) \right]^{-1} \int \partial_\theta \bar{g}(y, \theta_0)\, dP_0(y) \right]^{-1} \cdot \left( \int \partial_\theta \bar{g}(y, \theta_0)\, dP_0(y) \right)^{\!\top} \frac{1}{\varphi''(1)}\, \mathrm{IF}(x; t^c_{\theta_0}, P_0).$$
On the basis of Propositions 3 and 4, since $x \mapsto \psi_\theta(x, t_\theta(P_0))$ is bounded, all the estimators $\hat{\theta}^c_\varphi$ are B-robust.

4. Conclusions

We introduced a class of robust Z-estimators for moment condition models. These new estimators can be seen as robust alternatives to the minimum empirical divergence estimators. By using truncated functions based on the multidimensional Huber function, we defined robust estimators of the element that realizes the supremum in the dual form of the divergence, as well as new robust estimators for the parameter of the model. The asymptotic properties were proven, including the consistency and the limit laws. The influence functions of all the proposed estimators are bounded; therefore, these estimators are B-robust. The truncated function that we used to define the new robust Z-estimators contains implicitly defined functions for which analytic forms are not available; the implementation of the estimation method will be addressed in a future research study. The idea of using the multidimensional Huber function, together with a scale matrix and a shift vector, to create a bounded version of the function corresponding to the estimating equation for the parameter of interest could be considered in other contexts as well and would lead to new robust Z-estimators. As one of the Referees suggested, other bounded functions could be used to define new robust Z-estimators for moment condition models. For example, the Tukey biweight function, applied to the norm of its argument so as to be appropriate for vector-valued functions, could also be considered; again, the original parameter of interest should remain the solution of the estimating equation based on the new bounded function. It would be interesting to analyse such ideas in future studies, in order to provide new robust versions of the minimum empirical divergence estimators or robust Z-estimators in other contexts.

Funding

This work was supported by a grant of the Ministry of Research, Innovation and Digitization, CNCS CCCDI—UEFISCDI, Project Number PN-III-P4-ID-PCE-2020-1112, within PNCDI III.

Data Availability Statement

Not applicable.

Acknowledgments

We are very grateful to the Referees for their helpful comments and suggestions.

Conflicts of Interest

The author declares no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
i.i.d.  independent and identically distributed
a.c.  absolutely continuous
GMM  generalized method of moments
CU  continuous updating
EL  empirical likelihood
ET  exponential tilting
GEL  generalized empirical likelihood
ETEL  exponentially tilted empirical likelihood

Appendix A

Proof of Proposition 1.
Since $(\theta, t) \mapsto \Psi(y, \theta, t)$ is continuous, by the uniform law of large numbers, Assumption 1 (a) implies
$$\sup_{\theta \in V_{\theta_0},\, t \in V_{t_{\theta_0}}} \left\| \int \Psi(y, \theta, t)\, dP_n(y) - \int \Psi(y, \theta, t)\, dP_0(y) \right\| \to 0,$$
in probability. This result, together with Assumption 1 (b), ensures the convergence in probability of the estimators $\hat{\theta}^c_\varphi$ and $\hat{t}^c_{\hat{\theta}^c_\varphi}$ toward $\theta_0$ and $t_{\theta_0}$, respectively. The proof is the same as that of Theorem 5.9 from [28], p. 46. □
Proof of Proposition 2.
By the definitions of $\hat{\theta}^c_\varphi$ and $\hat{t}^c_{\hat{\theta}^c_\varphi}$, they both satisfy
$$\int \psi(y, \hat{\theta}^c_\varphi, \hat{t}^c_{\hat{\theta}^c_\varphi})\, dP_n(y) = 0, \quad (E1)$$
$$\int \frac{\partial}{\partial \theta} m(y, \hat{\theta}^c_\varphi, \hat{t}^c_{\hat{\theta}^c_\varphi})\, dP_n(y) = 0. \quad (E2)$$
Using a Taylor expansion in (E1), there exists $(\tilde{\theta}^c_\varphi, \tilde{t}^c_\varphi)$ inside the segment that links $(\hat{\theta}^c_\varphi, \hat{t}^c_{\hat{\theta}^c_\varphi})$ and $(\theta_0, t_{\theta_0})$ such that
$$0 = \int \psi(y, \theta_0, t_{\theta_0})\, dP_n(y) + \left[ \left( \int \partial_t \psi(y, \theta_0, t_{\theta_0})\, dP_n(y) \right), \left( \int \partial_\theta \psi(y, \theta_0, t_{\theta_0})\, dP_n(y) \right) \right] \cdot a_n + \frac{1}{2}\, a_n^\top A_n a_n,$$
where
$$a_n := \big( (\hat{t}^c_{\hat{\theta}^c_\varphi} - t_{\theta_0})^\top,\ (\hat{\theta}^c_\varphi - \theta_0)^\top \big)^{\!\top}$$
and
$$A_n := \begin{pmatrix} \int \partial^2_{t^2} \psi(y, \tilde{\theta}^c_\varphi, \tilde{t}^c_\varphi)\, dP_n(y) & \int \partial^2_{\theta\, \partial t} \psi(y, \tilde{\theta}^c_\varphi, \tilde{t}^c_\varphi)\, dP_n(y) \\ \int \partial^2_{t\, \partial \theta} \psi(y, \tilde{\theta}^c_\varphi, \tilde{t}^c_\varphi)\, dP_n(y) & \int \partial^2_{\theta^2} \psi(y, \tilde{\theta}^c_\varphi, \tilde{t}^c_\varphi)\, dP_n(y) \end{pmatrix}.$$
By Assumption 2 (b), the law of large numbers implies that $A_n = O_P(1)$. Then, using Assumption 2 (a), the last term in (A2) can be written as $o_P(1) \| a_n \|$. On the other hand, by Assumption 2 (d), using the law of large numbers, we can write
$$\left[ \left( \int \partial_t \psi(y, \theta_0, t_{\theta_0})\, dP_n(y) \right), \left( \int \partial_\theta \psi(y, \theta_0, t_{\theta_0})\, dP_n(y) \right) \right] = \left[ \left( \int \partial_t \psi(y, \theta_0, t_{\theta_0})\, dP_0(y) \right), \left( \int \partial_\theta \psi(y, \theta_0, t_{\theta_0})\, dP_0(y) \right) \right] + o_P(1).$$
Consequently, (A2) becomes
$$-\int \psi(y, \theta_0, t_{\theta_0})\, dP_n(y) = \left[ \left( \int \partial_t \psi(y, \theta_0, t_{\theta_0})\, dP_0(y) \right) + o_P(1), \left( \int \partial_\theta \psi(y, \theta_0, t_{\theta_0})\, dP_0(y) \right) + o_P(1) \right] \cdot a_n.$$
In the same way, using a Taylor expansion in (E2), there exists $(\bar{\theta}^c_\varphi, \bar{t}^c_\varphi)$ inside the segment that links $(\hat{\theta}^c_\varphi, \hat{t}^c_{\hat{\theta}^c_\varphi})$ and $(\theta_0, t_{\theta_0})$ such that
$$0 = \int \partial_\theta m(y, \theta_0, t_{\theta_0})\, dP_n(y) + \left[ \left( \int \partial^2_{\theta\, \partial t} m(y, \theta_0, t_{\theta_0})\, dP_n(y) \right), \left( \int \partial^2_{\theta^2} m(y, \theta_0, t_{\theta_0})\, dP_n(y) \right) \right] \cdot a_n + \frac{1}{2}\, a_n^\top B_n a_n,$$
where
$$B_n := \begin{pmatrix} \int \partial^3_{\theta\, \partial t^2} m(y, \bar{\theta}^c_\varphi, \bar{t}^c_\varphi)\, dP_n(y) & \int \partial^3_{\theta^2\, \partial t} m(y, \bar{\theta}^c_\varphi, \bar{t}^c_\varphi)\, dP_n(y) \\ \int \partial^3_{\theta\, \partial t\, \partial \theta} m(y, \bar{\theta}^c_\varphi, \bar{t}^c_\varphi)\, dP_n(y) & \int \partial^3_{\theta^3} m(y, \bar{\theta}^c_\varphi, \bar{t}^c_\varphi)\, dP_n(y) \end{pmatrix}.$$
Similarly as in (A5), we obtain
$$-\int \partial_\theta m(y, \theta_0, t_{\theta_0})\, dP_n(y) = \left[ \left( \int \partial^2_{\theta\, \partial t} m(y, \theta_0, t_{\theta_0})\, dP_0(y) \right) + o_P(1), \left( \int \partial^2_{\theta^2} m(y, \theta_0, t_{\theta_0})\, dP_0(y) \right) + o_P(1) \right] \cdot a_n.$$
Using (A5) and (A8), we obtain
$$\sqrt{n}\, a_n = -\begin{pmatrix} \left( \int \partial_t \psi(y, \theta_0, t_{\theta_0})\, dP_0(y) \right) & \left( \int \partial_\theta \psi(y, \theta_0, t_{\theta_0})\, dP_0(y) \right) \\ \left( \int \partial^2_{\theta\, \partial t} m(y, \theta_0, t_{\theta_0})\, dP_0(y) \right) & \left( \int \partial^2_{\theta^2} m(y, \theta_0, t_{\theta_0})\, dP_0(y) \right) \end{pmatrix}^{-1} \times \sqrt{n} \begin{pmatrix} \int \psi(y, \theta_0, t_{\theta_0})\, dP_n(y) \\ \int \partial_\theta m(y, \theta_0, t_{\theta_0})\, dP_n(y) \end{pmatrix} + o_P(1).$$
Consider $S$ the $(l + 1 + d) \times (l + 1 + d)$ matrix
$$S := \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix},$$
with $S_{11} := \int \partial_t \psi(y, \theta_0, t_{\theta_0})\, dP_0(y)$, $S_{12} := \int \partial_\theta \psi(y, \theta_0, t_{\theta_0})\, dP_0(y)$, $S_{21} := \int \partial^2_{\theta\, \partial t} m(y, \theta_0, t_{\theta_0})\, dP_0(y)$, and $S_{22} := \int \partial^2_{\theta^2} m(y, \theta_0, t_{\theta_0})\, dP_0(y)$. Through calculations, we have
$$S_{21} = -\Big[ 0_d,\ \Big( \int \partial_\theta g(y, \theta_0)\, dP_0(y) \Big)^{\!\top} \Big], \qquad S_{22} = [\, 0_d, \dots, 0_d \,].$$
From (A9), we deduce that
$$\sqrt{n} \begin{pmatrix} \hat{t}^c_{\hat{\theta}^c_\varphi} - t_{\theta_0} \\ \hat{\theta}^c_\varphi - \theta_0 \end{pmatrix} = -S^{-1}\, \sqrt{n} \begin{pmatrix} \int \psi(y, \theta_0, t_{\theta_0})\, dP_n(y) \\ 0_d \end{pmatrix} + o_P(1).$$
On the other hand, under Assumption 2 (d), using the central limit theorem,
$$\sqrt{n} \begin{pmatrix} \int \psi(y, \theta_0, t_{\theta_0})\, dP_n(y) \\ 0_d \end{pmatrix}$$
converges in distribution to a centred multivariate normal variable with covariance matrix
$$M := \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix},$$
with
$$M_{11} := \mathrm{cov}[\psi(X, \theta_0, t_{\theta_0})], \quad M_{12} := 0_{(l+1) \times d}, \quad M_{21} := 0_{d \times (l+1)}, \quad M_{22} := 0_{d \times d}.$$
Since $E[\psi(X, \theta_0, t_{\theta_0})] = 0$ by the construction of $\psi$, we obtain
$$M_{11} = \mathrm{cov}[\psi(X, \theta_0, t_{\theta_0})] = \int \psi(y, \theta_0, t_{\theta_0})\, \psi(y, \theta_0, t_{\theta_0})^\top\, dP_0(y) = I_{l+1},$$
on the basis of (29) for $\theta = \theta_0$.
Using then (A13) and the Slutsky theorem, we obtain that
$$\sqrt{n} \begin{pmatrix} \hat{t}^c_{\hat{\theta}^c_\varphi} - t_{\theta_0} \\ \hat{\theta}^c_\varphi - \theta_0 \end{pmatrix}$$
converges in distribution to a centred multivariate normal variable with the covariance matrix given by
$$C = S^{-1} M\, [S^{-1}]^\top.$$
If we denote
$$C := \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix},$$
through calculation, we obtain
$$C_{11} = \big[ S_{11}^{-1} - S_{11}^{-1} S_{12} [S_{21} S_{11}^{-1} S_{12}]^{-1} S_{21} S_{11}^{-1} \big] \times \big[ S_{11}^{-1} - S_{11}^{-1} S_{12} [S_{21} S_{11}^{-1} S_{12}]^{-1} S_{21} S_{11}^{-1} \big]^{\!\top},$$
$$C_{12} = \big[ S_{11}^{-1} - S_{11}^{-1} S_{12} [S_{21} S_{11}^{-1} S_{12}]^{-1} S_{21} S_{11}^{-1} \big] \times \big[ [S_{21} S_{11}^{-1} S_{12}]^{-1} S_{21} S_{11}^{-1} \big]^{\!\top},$$
$$C_{21} = \big[ [S_{21} S_{11}^{-1} S_{12}]^{-1} S_{21} S_{11}^{-1} \big] \times \big[ S_{11}^{-1} - S_{11}^{-1} S_{12} [S_{21} S_{11}^{-1} S_{12}]^{-1} S_{21} S_{11}^{-1} \big]^{\!\top},$$
$$C_{22} = \big[ [S_{21} S_{11}^{-1} S_{12}]^{-1} S_{21} S_{11}^{-1} \big] \times \big[ [S_{21} S_{11}^{-1} S_{12}]^{-1} S_{21} S_{11}^{-1} \big]^{\!\top}. \ \square$$
Proof of Proposition 3.
For the contaminated model $\widetilde{P}_{\varepsilon x} = (1 - \varepsilon) P_0 + \varepsilon \delta_x$, whenever it exists, $t^c_\theta(\widetilde{P}_{\varepsilon x})$ is defined as the solution of the equation
$$\int H_c\big( A_\theta(t^c_\theta(\widetilde{P}_{\varepsilon x}))\, [\partial_t m_\theta(y, t^c_\theta(\widetilde{P}_{\varepsilon x})) - \tau_\theta(t^c_\theta(\widetilde{P}_{\varepsilon x}))] \big)\, d\widetilde{P}_{\varepsilon x}(y) = 0.$$
It follows that
$$(1 - \varepsilon) \int H_c\big( A_\theta(t^c_\theta(\widetilde{P}_{\varepsilon x}))\, [\partial_t m_\theta(y, t^c_\theta(\widetilde{P}_{\varepsilon x})) - \tau_\theta(t^c_\theta(\widetilde{P}_{\varepsilon x}))] \big)\, dP_0(y) + \varepsilon\, H_c\big( A_\theta(t^c_\theta(\widetilde{P}_{\varepsilon x}))\, [\partial_t m_\theta(x, t^c_\theta(\widetilde{P}_{\varepsilon x})) - \tau_\theta(t^c_\theta(\widetilde{P}_{\varepsilon x}))] \big) = 0.$$
Differentiation with respect to $\varepsilon$ in (A25), at $\varepsilon = 0$, yields
$$-\int H_c\big( A_\theta(t_\theta(P_0))\, [\partial_t m_\theta(y, t_\theta(P_0)) - \tau_\theta(t_\theta(P_0))] \big)\, dP_0(y) + \int \frac{\partial}{\partial t} \Big[ H_c\big( A_\theta(t)\, [\partial_t m_\theta(y, t) - \tau_\theta(t)] \big) \Big]_{t = t_\theta(P_0)} dP_0(y)\ \mathrm{IF}(x; t^c_\theta, P_0) + H_c\big( A_\theta(t_\theta(P_0))\, [\partial_t m_\theta(x, t_\theta(P_0)) - \tau_\theta(t_\theta(P_0))] \big) = 0.$$
Since the first integral in (A26) equals zero, we obtain
$$\mathrm{IF}(x; t^c_\theta, P_0) = -\left[ \int \frac{\partial}{\partial t} \Big[ H_c\big( A_\theta(t)\, [\partial_t m_\theta(y, t) - \tau_\theta(t)] \big) \Big]_{t = t_\theta(P_0)} dP_0(y) \right]^{-1} \cdot H_c\big( A_\theta(t_\theta(P_0))\, [\partial_t m_\theta(x, t_\theta(P_0)) - \tau_\theta(t_\theta(P_0))] \big) = -\left[ \int \partial_t \psi_\theta(y, t_\theta(P_0))\, dP_0(y) \right]^{-1} \cdot \psi_\theta(x, t_\theta(P_0)).$$
For each $\theta$, the influence function (55) is bounded with respect to $x$; therefore, the estimators $\hat{t}^c_\theta$ are B-robust. □
Proof of Proposition 4.
For the contaminated model $\widetilde{P}_{\varepsilon x} = (1 - \varepsilon) P_0 + \varepsilon \delta_x$, $T^c(\widetilde{P}_{\varepsilon x})$ is defined as the solution of the equation
$$\int \frac{\partial}{\partial \theta} m\big( y, T^c(\widetilde{P}_{\varepsilon x}), t^c(T^c(\widetilde{P}_{\varepsilon x}), \widetilde{P}_{\varepsilon x}) \big)\, d\widetilde{P}_{\varepsilon x}(y) = 0,$$
whenever this solution exists. Then,
$$(1 - \varepsilon) \int \frac{\partial}{\partial \theta} m\big( y, T^c(\widetilde{P}_{\varepsilon x}), t^c(T^c(\widetilde{P}_{\varepsilon x}), \widetilde{P}_{\varepsilon x}) \big)\, dP_0(y) + \varepsilon\, \frac{\partial}{\partial \theta} m\big( x, T^c(\widetilde{P}_{\varepsilon x}), t^c(T^c(\widetilde{P}_{\varepsilon x}), \widetilde{P}_{\varepsilon x}) \big) = 0.$$
Differentiation with respect to $\varepsilon$ in (A29), at $\varepsilon = 0$, yields
$$-\int \partial_\theta m(y, \theta_0, t^c(\theta_0, P_0))\, dP_0(y) + \int \partial^2_{\theta^2} m[y, \theta, t] \Big|_{\theta = \theta_0,\, t = t_{\theta_0}(P_0)} dP_0(y)\ \mathrm{IF}(x; T^c, P_0) + \int \partial^2_{\theta\, \partial t} m[y, \theta, t] \Big|_{\theta = \theta_0,\, t = t_{\theta_0}(P_0)} dP_0(y) \cdot \Big( \frac{\partial}{\partial \theta} t^c(\theta_0, P_0)\ \mathrm{IF}(x; T^c, P_0) + \mathrm{IF}(x; t^c_{\theta_0}, P_0) \Big) + \partial_\theta [m(x, \theta, t)] \Big|_{\theta = \theta_0,\, t = t_{\theta_0}(P_0)} = 0.$$
Some calculations show that
$$\partial_\theta [m(x, \theta, t)] \Big|_{\theta = \theta_0,\, t = t_{\theta_0}(P_0)} = 0 \quad \text{and} \quad \partial^2_{\theta^2} [m(x, \theta, t)] \Big|_{\theta = \theta_0,\, t = t_{\theta_0}(P_0)} = 0,$$
for any $x$; therefore, (A30) reduces to
$$\int \partial^2_{\theta\, \partial t} m[y, \theta, t] \Big|_{\theta = \theta_0,\, t = t_{\theta_0}(P_0)} dP_0(y) \cdot \Big( \frac{\partial}{\partial \theta} t^c(\theta_0, P_0)\ \mathrm{IF}(x; T^c, P_0) + \mathrm{IF}(x; t^c_{\theta_0}, P_0) \Big) = 0.$$
On the other hand,
$$\int \partial^2_{\theta\, \partial t} m[y, \theta, t] \Big|_{\theta = \theta_0,\, t = t_{\theta_0}(P_0)} dP_0(y) = -\varphi'^{-1}\big( \varphi'(1) \big) \left( \int \partial_\theta \bar{g}(y, \theta_0)\, dP_0(y) \right)^{\!\top} = -\left( \int \partial_\theta \bar{g}(y, \theta_0)\, dP_0(y) \right)^{\!\top},$$
since $\varphi'^{-1}(\varphi'(1)) = \varphi'^{-1}(0) = 1$.
Taking into account that $t^c(\theta, P_0) = t(\theta, P_0)$ and that $t(\theta, P_0)$ verifies
$$\int \partial_t m(y, \theta, t(\theta, P_0))\, dP_0(y) = 0,$$
differentiation with respect to $\theta$ yields
$$\int \partial^2_{t\, \partial \theta} m(y, \theta, t(\theta, P_0))\, dP_0(y) + \int \partial^2_{t^2} m(y, \theta, t(\theta, P_0))\, dP_0(y) \cdot \frac{\partial}{\partial \theta} t(\theta, P_0) = 0,$$
which implies
$$\frac{\partial}{\partial \theta} t^c(\theta_0, P_0) = \frac{\partial}{\partial \theta} t(\theta_0, P_0) = -\left[ \int \partial^2_{t^2} m(y, \theta_0, t(\theta_0, P_0))\, dP_0(y) \right]^{-1} \int \partial^2_{t\, \partial \theta} m(y, \theta_0, t(\theta_0, P_0))\, dP_0(y) = -\varphi''(1) \left[ \int \bar{g}(y, \theta_0)\, \bar{g}(y, \theta_0)^\top\, dP_0(y) \right]^{-1} \int \partial_\theta \bar{g}(y, \theta_0)\, dP_0(y),$$
because
$$\int \partial^2_{t^2} m(y, \theta_0, t(\theta_0, P_0))\, dP_0(y) = -\frac{1}{\varphi''(1)} \int \bar{g}(y, \theta_0)\, \bar{g}(y, \theta_0)^\top\, dP_0(y).$$
Then, (A32) becomes
$$\left( \int \partial_\theta \bar{g}(y, \theta_0)\, dP_0(y) \right)^{\!\top} \left( -\varphi''(1) \left[ \int \bar{g}(y, \theta_0)\, \bar{g}(y, \theta_0)^\top\, dP_0(y) \right]^{-1} \int \partial_\theta \bar{g}(y, \theta_0)\, dP_0(y)\ \mathrm{IF}(x; T^c, P_0) + \mathrm{IF}(x; t^c_{\theta_0}, P_0) \right) = 0,$$
and consequently,
$$\mathrm{IF}(x; T^c, P_0) = \left[ \left( \int \partial_\theta \bar{g}(y, \theta_0)\, dP_0(y) \right)^{\!\top} \left[ \int \bar{g}(y, \theta_0)\, \bar{g}(y, \theta_0)^\top\, dP_0(y) \right]^{-1} \int \partial_\theta \bar{g}(y, \theta_0)\, dP_0(y) \right]^{-1} \cdot \left( \int \partial_\theta \bar{g}(y, \theta_0)\, dP_0(y) \right)^{\!\top} \frac{1}{\varphi''(1)}\, \mathrm{IF}(x; t^c_{\theta_0}, P_0). \ \square$$

References

  1. Hansen, L.P. Large sample properties of generalized method of moments estimators. Econometrica 1982, 50, 1029–1054. [Google Scholar] [CrossRef]
  2. Hansen, L.; Heaton, J.; Yaron, A. Finite-sample properties of some alternative gmm estimators. J. Bus. Econ. Stat. 1996, 14, 262–280. [Google Scholar]
  3. Qin, J.; Lawless, J. Empirical likelihood and general estimating equations. Ann. Stat. 1994, 22, 300–325. [Google Scholar] [CrossRef]
  4. Imbens, G.W. One-step estimators for over-identified generalized method of moments models. Rev. Econ. Stud. 1997, 64, 359–383. [Google Scholar] [CrossRef]
  5. Kitamura, Y.; Stutzer, M. An information-theoretic alternative to generalized method of moments estimation. Econometrica 1997, 65, 861–874. [Google Scholar] [CrossRef]
  6. Newey, W.K.; Smith, R.J. Higher order properties of GMM and generalized empirical likelihood estimators. Econometrica 2004, 72, 219–255. [Google Scholar] [CrossRef] [Green Version]
  7. Schennach, S.M. Point estimation with exponentially tilted empirical likelihood. Ann. Stat. 2007, 35, 634–672. [Google Scholar] [CrossRef] [Green Version]
  8. Pardo, L. Statistical Inference Based on Divergence Measures; Chapmann & Hall: London, UK, 2006. [Google Scholar]
  9. Basu, A.; Shioya, H.; Park, C. Statistical Inference: The Minimum Distance Approach; Chapmann & Hall: London, UK, 2011. [Google Scholar]
  10. Pardo, L.; Martín, N. Robust procedures for estimating and testing in the framework of divergence measures. Entropy 2021, 23, 430. [Google Scholar] [CrossRef]
  11. Riani, M.; Atkinson, A.C.; Corbellini, A.; Perrotta, D. Robust regression with density power divergence: Theory, comparisons, and data analysis. Entropy 2020, 22, 399. [Google Scholar] [CrossRef] [Green Version]
  12. Broniatowski, M.; Keziou, A. Divergences and duality for estimation and test under moment condition models. J. Stat. Plan. Inference 2012, 142, 2554–2573. [Google Scholar] [CrossRef] [Green Version]
  13. Broniatowski, M.; Keziou, A. Parametric estimation and tests through divergences and the duality technique. J. Multivar. Anal. 2009, 100, 16–36. [Google Scholar] [CrossRef] [Green Version]
  14. Toma, A.; Broniatowski, M. Dual divergence estimators and tests: Robustness results. J. Multivar. Anal. 2011, 102, 20–36. [Google Scholar] [CrossRef] [Green Version]
  15. Toma, A.; Leoni-Aubin, S. Robust tests based on dual divergence estimators and saddlepoint approximations. J. Multivar. Anal. 2010, 101, 1143–1155. [Google Scholar] [CrossRef] [Green Version]
  16. Toma, A. Model selection criteria using divergences. Entropy 2014, 16, 2686–2698. [Google Scholar] [CrossRef] [Green Version]
  17. Toma, A. Robustness of dual divergence estimators for models satisfying linear constraints. C. R. Math. Acad. Sci. Paris 2013, 351, 311–316. [Google Scholar] [CrossRef]
  18. Ronchetti, E.; Trojani, F. Robust inference with GMM estimators. J. Econom. 2001, 101, 37–69. [Google Scholar] [CrossRef] [Green Version]
  19. Lô, S.N.; Ronchetti, E. Robust small sample accurate inference in moment condition models. Comput. Stat. Data Anal. 2012, 56, 3182–3197. [Google Scholar] [CrossRef]
  20. Felipe, A.; Martín, N.; Miranda, P.; Pardo, L. Testing with exponentially tilted empirical likelihood. Methodol. Comput. Appl. Probab. 2018, 20, 1319–1358. [Google Scholar] [CrossRef]
  21. Keziou, A.; Toma, A. A robust version of the empirical likelihood estimator. Mathematics 2021, 9, 829. [Google Scholar] [CrossRef]
  22. Keziou, A.; Toma, A. Robust Empirical Likelihood. In Geometric Science of Information, Proceedings of the 5th International Conference, GSI 2021, Paris, France, 21–23 July 2021; Springer International Publishing: Cham, Switzerland, 2021. [Google Scholar]
  23. Rüschendorf, L. On the minimum discrimination information theorem. Stat. Decis. 1984, 263–283. [Google Scholar]
  24. Cressie, N.; Read, T.R.C. Multinomial goodness-of-fit tests. J. R. Stat. Soc. Ser. B 1984, 46, 440–464. [Google Scholar] [CrossRef]
  25. Broniatowski, M.; Keziou, A. Minimization of ϕ divergences on sets of signed measures. Stud. Sci. Math. Hung. 2006, 43, 403–442. [Google Scholar]
  26. Hampel, F.R.; Ronchetti, E.; Rousseeuw, P.J.; Stahel, W. Robust Statistics: The Approach Based on Influence Functions; Wiley: New York, NY, USA, 1986. [Google Scholar]
  27. Ronchetti, E.M.; Huber, P.J. Robust Statistics; John Wiley & Sons: Hoboken, NJ, USA, 2009. [Google Scholar]
  28. van der Vaart, A.W. Asymptotic Statistics; Cambridge Series in Statistical and Probabilistic Mathematics; Cambridge University Press: Cambridge, UK, 1998. [Google Scholar]