Article

Data Depth and Multiple Output Regression, the Distorted M-Quantiles Approach

Department of Statistics, Universidad Carlos III de Madrid, Avda. de la Universidad 30, 28911 Leganés, Spain
* Authors to whom correspondence should be addressed.
Mathematics 2022, 10(18), 3272; https://doi.org/10.3390/math10183272
Submission received: 15 July 2022 / Revised: 31 August 2022 / Accepted: 5 September 2022 / Published: 9 September 2022

Abstract:
For a univariate distribution, its M-quantiles are obtained as solutions to asymmetric minimization problems dealing with the distance of a random variable to a fixed point. The asymmetry refers to the different weights awarded to the values of the random variable at either side of the fixed point. We focus on M-quantiles whose associated losses are given in terms of a power. In this setting, the classical quantiles are obtained for the first power, while the expectiles correspond to quadratic losses. The M-quantiles considered here are computed over distorted distributions, which allows one to tune the weight awarded to the more central or peripheral parts of the distribution. These distorted M-quantiles are used in the multivariate setting to introduce novel families of central regions and their associated depth functions, which are further extended to the multiple output regression setting in the form of conditional and regression regions and conditional depths.

1. Introduction

In multivariate statistics, the degree of centrality of a point $x$ with respect to a multivariate probability distribution or data sample is known as the depth of $x$. Since the seminal introduction of the halfspace regions (sets of points whose halfspace depth is at least some given value) by Tukey [1] in the context of graphical representations of bivariate datasets, a large number of notions of depth have been proposed and studied in the statistical literature. We highlight here the simplicial depth introduced by Liu [2], who proved some properties that have become standard requirements for subsequent depth proposals; the first formal study of the halfspace depth by Rousseeuw and Ruts [3]; and the introduction of the zonoid depth by Koshevoy and Mosler [4]. See also Cascos [5], Liu et al. [6], Zuo and Serfling [7,8] for systematic reviews of depth functions, central (or depth) regions, and their properties.
There has been recent interest in the use of expectiles (and more generally M-quantiles) in the construction of central regions and depth functions, see Cascos and Ochoa [9], Daouia and Paindaveine [10]. These expectiles, first introduced by Newey and Powell [11] in the context of linear regression, are the solution to an asymmetrically weighted least squares minimization problem. Likewise, the M-quantiles constitute a natural generalization of the expectiles and the standard quantiles that keep the asymmetric weights, but whose loss function is not necessarily quadratic or an absolute value, see Breckling and Chambers [12].
As will be shown, the M-quantiles of a distribution can be tuned, in particular robustified, by introducing a distortion function which adjusts the weights awarded to the values in the support of the distribution in terms of the quantile they correspond to. The so-called distorted M-quantiles turn out to be the solution to an equation involving Choquet expectations with respect to a (distorted) non-additive probability. They constitute a novel extension of the classical quantiles whose applications in the construction of central regions and depth functions in the multivariate and multiple output regression frameworks will be described.
Throughout this manuscript, specific distortions that downweight (or upweight) the outermost observations of a dataset are considered. Based on them, distorted M-quantile regions and their associated depth functions, generalizing those in [9,10], are obtained. As an alternative to using central regions and depth functions in the modelling of multivariate datasets, some authors have opted for directly introducing geometric quantiles for multivariate data, see Chaudhuri [13], Koltchinskii [14]. In addition to being points with a certain degree of centrality with respect to a distribution, such geometric quantiles also carry information about their outward direction from the center.
In the last part of the manuscript, we follow the approach to multiple output regression with M-quantiles of Daouia and Paindaveine [10], Hallin et al. [15,16], Merlo et al. [17] with our distorted M-quantiles in order to present the notions of distorted M-quantile regression regions and conditional regions. The single output (distorted) M-quantile regression resembles the classical quantile regression introduced by Koenker and Bassett [18], see also Koenker [19], while its extension to multi-output models is based on univariate projections of the response variable. In this setting, we also consider a notion of conditional data depth for the output of a regression model given the regressors.
The M-quantile conditional regions are intended to serve as a descriptive tool when analyzing the explained variables given specific values of the predictors. Thereby, the relationship between response and explanatory variables is captured in the trend followed by the regions, the variability of the response variables in the volume of the regions, and the dependency relation among the response variables in the shape of the regions. Through some examples with both real and simulated datasets, we illustrate the robustness of the distorted M-quantile regression models against the presence of extreme points, as well as that of the corresponding regions and depth when using appropriate distortion functions.
This manuscript is organized as follows: Section 2 contains a summary of the preliminary concepts, where the notions of non-additive probabilities and Choquet expectations are used to introduce the concept of M-quantiles with power loss function. In Section 3, some relevant properties of the distorted M-quantiles are presented, including an inversion formula and a brief discussion of the sample distorted expectiles. Section 4 is devoted to the notion of distorted M-quantile depth, which is obtained in terms of the distorted M-quantile regions. In Section 5, we use a projection-based M-quantile regression technique for multi-output models from which we obtain the M-quantile conditional and regression regions. Finally, a list of concluding remarks and future work is presented in Section 6. Appendix A, at the end of the manuscript, describes an algorithm to compute the bivariate M-quantile depth.

2. Preliminaries

2.1. M-Quantiles with Power Loss Function

Consider a general non-atomic probability space $(\Omega, \mathcal{F}, P)$ and a random variable $X$ with finite moment of order $r \ge 1$ defined on it. For any $0 < \alpha < 1$, the power M-quantile of $X$ of order $r$ and level $\alpha$ is the solution to the minimization problem
$$q_\alpha^{(r)}(X) = \arg\min_{\theta \in \mathbb{R}} \mathrm{E}\big[\alpha (X - \theta)_+^r + (1-\alpha)(X - \theta)_-^r\big], \qquad (1)$$
where $x_+ = \max\{x, 0\}$ and $x_- = (-x)_+$ for any $x \in \mathbb{R}$.
The M-quantiles with power loss function defined in (1) satisfy appealing properties such as translation equivariance, positive homogeneity, and uniqueness for $r > 1$, see [20] for the proofs and further details.
Notice that when $r = 1$, we obtain the classical quantiles, for which uniqueness is only achieved if $X$ has a continuous distribution with strictly increasing cdf, while for $r = 2$, we obtain the expectiles, which we denote by $e_\alpha(X) = q_\alpha^{(2)}(X)$.
To lighten the notation, hereafter we omit the order of the M-quantile (unless there is some need to emphasize it) and write $q_\alpha(X)$ for the M-quantile of $X$ of order $r$, $q_\alpha^{(r)}(X)$.
Observe that the loss function in (1) is $\rho(x, \theta) = \alpha (x - \theta)_+^r + (1-\alpha)(x - \theta)_-^r$, and the asymmetric weights for the losses to the left and right of $\theta$ are, respectively, $1-\alpha$ and $\alpha$. Differentiating $\rho$ with respect to $\theta$, we obtain the first-order condition that renders the M-quantile of order $r > 1$ as the unique solution to the equation
$$\alpha\, \mathrm{E}(X - \theta)_+^{r-1} = (1-\alpha)\, \mathrm{E}(X - \theta)_-^{r-1}, \qquad (2)$$
which only requires $\mathrm{E}|X|^{r-1}$ to be finite. Equation (2) can also be used for $r = 1$, with the convention that $0^0 = 0$ and losing the uniqueness of the solution, in order to compute the standard quantiles.
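As a quick numerical illustration, Equation (2) can be solved by bisection, since its left-hand side decreases in $\theta$ while its right-hand side increases. The sketch below is our own (not code from the paper), and the function name `m_quantile` is a hypothetical choice; it computes the sample version of the first-order condition.

```python
import numpy as np

def m_quantile(x, alpha, r, tol=1e-10):
    """Sample M-quantile of order r > 1 and level alpha, found as the root
    of the empirical first-order condition
    alpha * E(X - t)_+^{r-1} = (1 - alpha) * E(X - t)_-^{r-1}."""
    x = np.asarray(x, dtype=float)

    def foc(t):
        # left minus right side of the first-order condition at t
        return (alpha * np.mean(np.maximum(x - t, 0.0) ** (r - 1))
                - (1 - alpha) * np.mean(np.maximum(t - x, 0.0) ** (r - 1)))

    lo, hi = x.min(), x.max()   # foc(lo) >= 0 >= foc(hi), and foc is decreasing
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $r = 2$ and $\alpha = 1/2$, the solution is the sample mean, in agreement with the expectile interpretation.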

2.2. Non-Additive Probabilities

A non-additive probability over the reference measurable space $(\Omega, \mathcal{F})$ is any normalized and monotone set function on the $\sigma$-algebra $\mathcal{F}$. That is, a function $T : \mathcal{F} \to \mathbb{R}$ such that $T(\emptyset) = 0$, $T(\Omega) = 1$, and $T(B) \le T(C)$ if $B \subseteq C$, see [21,22]. The dual to $T$ is the non-additive probability $\tilde{T}(B) = 1 - T(B^c)$, where $B^c$ is the complement of $B$.
Consider a measurable mapping $X$ from $(\Omega, \mathcal{F})$ into $\mathbb{R}_+$ equipped with the Borel $\sigma$-algebra. The asymmetric Choquet expectation of $X$ with respect to $T$ is
$$\mathrm{CE}_T X = \int_0^\infty T(X \ge x)\, dx, \qquad (3)$$
where the right-hand side is a Riemann integral. It is known that the asymmetric Choquet expectation with respect to a non-additive probability is a positively homogeneous ($\mathrm{CE}_T[\lambda X] = \lambda\, \mathrm{CE}_T X$ if $\lambda > 0$), monotone ($\mathrm{CE}_T X \le \mathrm{CE}_T Y$ if $X \le Y$), and translation equivariant ($\mathrm{CE}_T[b + X] = b + \mathrm{CE}_T X$ if $b \in \mathbb{R}$ with $X \ge \max\{0, -b\}$) operator, see Denneberg [21] Prop. 5.1.
Any general $\mathcal{F}$-measurable function $X : \Omega \to \mathbb{R}$ can be written as $X = X_+ - X_-$, and its symmetric Choquet expectation is $\mathrm{CE}_T X = \mathrm{CE}_T X_+ - \mathrm{CE}_{\tilde{T}} X_-$. Since we have restricted the asymmetric Choquet expectation to mappings into $\mathbb{R}_+$ and both Choquet expectations match on those non-negative mappings, we can use the same notation for both. The symmetric Choquet expectation is positively homogeneous, monotone, and symmetric ($\mathrm{CE}_T[-X] = -\mathrm{CE}_T X$). For non-negative mappings, $X \ge 0$, this symmetry results in the asymmetry-type relation $\mathrm{CE}_T[-X] = -\mathrm{CE}_{\tilde{T}} X$.

2.3. Distortion Functions

A distortion function is any non-decreasing function $g : [0, 1] \to [0, 1]$ such that $g(0) = 0$ and $g(1) = 1$. When a distortion function $g$ acts on a probability measure $P$, we obtain what is known as a distorted probability, defined as the non-additive probability $P_g(B) = g(P(B))$ for $B \in \mathcal{F}$. Correspondingly, when a distortion function $g$ acts on a cumulative distribution function (cdf) $F$, it transforms $F$ into the new cdf $g(F)$. The dual distortion function of $g$ is $\tilde{g}(x) = 1 - g(1 - x)$, and it acts on $P$ by transforming it into the dual of $P_g$. Finally, we say that a distortion function is symmetric when $\tilde{g} = g$; in such a case, the distorted probability also matches its dual. For further details, see Denneberg [21] Chapter 2.
Particular instances of symmetric distortion functions are the identity function, whose associated (distorted) probability is $P$ itself; the trim distortion of parameter $0 \le \beta < 1$,
$$g_\beta(x) = \begin{cases} 0 & \text{if } 0 \le x \le \beta/2, \\[2pt] \dfrac{x - \beta/2}{1 - \beta} & \text{if } \beta/2 < x < 1 - \beta/2, \\[2pt] 1 & \text{if } 1 - \beta/2 \le x \le 1; \end{cases}$$
and the sigmoid distortion of parameter $\delta > 0$, defined as
$$h_\delta(x) = \frac{1}{1 + \left(\frac{1-x}{x}\right)^{\delta}} \ \text{ if } 0 < x \le 1, \qquad h_\delta(0) = 0.$$
It is immediate that $h_1$ is the identity function, while if $0 < \delta < 1$ (respectively, $\delta > 1$), $h_\delta$ has only one inflection point, at $x = 1/2$, and it is concave (respectively, convex) on $(0, 1/2)$ and convex (respectively, concave) on $(1/2, 1)$.
Notice that h δ , with δ > 1 , penalises the lower and upper tails of a given cdf, while g β directly discards the outermost β / 2 fraction of points in each tail of the distribution.
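The two families of distortions above can be coded directly from the displayed formulas. The following sketch is our own, with hypothetical function names.

```python
import numpy as np

def trim_distortion(x, beta):
    """Trim distortion g_beta: 0 up to beta/2, linear in between,
    1 from 1 - beta/2 on, so each tail loses a beta/2 fraction."""
    x = np.asarray(x, dtype=float)
    return np.clip((x - beta / 2) / (1 - beta), 0.0, 1.0)

def sigmoid_distortion(x, delta):
    """Sigmoid distortion h_delta; h_1 is the identity and delta > 1
    penalises both tails."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    pos = x > 0
    out[pos] = 1.0 / (1.0 + ((1.0 - x[pos]) / x[pos]) ** delta)
    return out
```

Both functions are symmetric, which can be checked numerically via the identity $g(x) = 1 - g(1-x)$ on a grid.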

3. Distorted M-Quantiles and Expectiles

Imitating the definition of classical M-quantiles, the distorted M-quantile of $X$ of order $r \ge 1$ and level $0 < \alpha < 1$ with distortion function $g$ is defined as a minimizer of $\alpha\, \mathrm{CE}_{P_g}(X - \theta)_+^r + (1-\alpha)\, \mathrm{CE}_{\tilde{P}_g}(X - \theta)_-^r$, that is, as the solution $\theta \in \mathbb{R}$ to the equation
$$\alpha\, \mathrm{CE}_{P_g}(X - \theta)_+^{r-1} = (1-\alpha)\, \mathrm{CE}_{\tilde{P}_g}(X - \theta)_-^{r-1}, \qquad (4)$$
where $P_g = g(P)$ is the distorted probability and $\tilde{P}_g$ its dual.
If $r > 1$, it can be deduced from the alternative representation in Theorem 1 as the standard M-quantile of some other random variable that the solution is unique. In the special case $r = 1$, we take the largest solution to Equation (4), in order to fit our purpose of inserting the halfspace regions into the M-quantile framework.
As for standard M-quantiles, we also omit the order $r$ in the notation, and the distorted M-quantile of $X$ of level $\alpha$ (with distortion function $g$) is denoted by $q_\alpha(X)$ in what follows, while the distorted M-quantile associated with the dual distortion function $\tilde{g}$ is denoted by $\tilde{q}_\alpha(X)$. Observe that for symmetric distortion functions, such as the trim or the sigmoid ones, both Choquet expectations in (4) are taken with respect to the same distorted probability $P_g$.
The following result, which is derived using the inverse transform method and the properties of the Choquet expectation, sheds light on the nature and properties of distorted M-quantiles.
Theorem 1.
Given two random variables $X, Y$, two real numbers $r \ge 1$ and $0 < \alpha < 1$, and a distortion function $g$ such that $\mathrm{CE}_{P_g} X_+^{r-1}$, $\mathrm{CE}_{\tilde{P}_g} X_-^{r-1}$, $\mathrm{CE}_{P_g} Y_+^{r-1}$, $\mathrm{CE}_{\tilde{P}_g} Y_-^{r-1}$ are finite, then
1. 
The distorted M-quantile of $X$ of order $r$ and level $\alpha$ is the M-quantile of order $r$ and level $\alpha$ of a random variable $Z$ (on the same probability space) whose cdf is $\tilde{g}(F_X)$, that is,
$$q_\alpha(X) = q_\alpha(Z) \quad \text{with} \quad F_Z = \tilde{g}(F_X).$$
2. 
Monotonicity in the level. If $0 < \alpha \le \beta < 1$, then $q_\alpha(X) \le q_\beta(X)$.
3. 
Upper and lower M-quantiles. $q_\alpha(-X) = -\tilde{q}_{1-\alpha}(X)$ and, if $g$ is symmetric, $q_\alpha(-X) = -q_{1-\alpha}(X)$.
4. 
Translation equivariance. $q_\alpha(b + X) = b + q_\alpha(X)$ for $b \in \mathbb{R}$.
5. 
Positive homogeneity. $q_\alpha(\lambda X) = \lambda\, q_\alpha(X)$ if $\lambda \ge 0$.
6. 
Monotonicity. If $X \le Y$, then $q_\alpha(X) \le q_\alpha(Y)$.
Proof. 
In order to prove 1, let $Z$ be a random variable defined on the same probability space as $X$ and whose cdf is $F_Z = \tilde{g}(F_X)$. Due to the relation $g(1 - x) = 1 - \tilde{g}(x)$ for all $x$, the term $\mathrm{CE}_{P_g}(X - \theta)_+^{r-1}$ of Equation (4) can be written in terms of $Z$ as
$$\mathrm{CE}_{P_g}(X - \theta)_+^{r-1} = \int_0^\infty g\big(P\big((X - \theta)_+^{r-1} \ge x\big)\big)\, dx = \int_0^\infty g\big(1 - F_X(x^{1/(r-1)} + \theta)\big)\, dx = \int_0^\infty \big(1 - F_Z(x^{1/(r-1)} + \theta)\big)\, dx = \mathrm{E}(Z - \theta)_+^{r-1},$$
where the last expression appears in terms of an expectation with respect to the probability in the considered probability space, P.
Since a similar transformation applies to the other term of Equation (4), the distorted M-quantile of $X$ of level $\alpha$ matches the undistorted M-quantile of $Z$ of level $\alpha$.
Statement 2 is immediate from 1 and the monotonicity of the M-quantiles in $\alpha$, see [20] Prop. 5.
To prove 3, just observe that $\mathrm{CE}_{P_g}(-X - \theta)_+^{r-1} = \mathrm{CE}_{P_g}(X - (-\theta))_-^{r-1}$ and similarly $\mathrm{CE}_{\tilde{P}_g}(-X - \theta)_-^{r-1} = \mathrm{CE}_{\tilde{P}_g}(X - (-\theta))_+^{r-1}$, so the weights and distorted probabilities are interchanged between the terms of Equation (4).
Following 1, let $b \in \mathbb{R}$; then
$$\mathrm{CE}_{P_g}(X + b - \theta)_+^{r-1} = \int_0^\infty g\big(P\big((X - (\theta - b))_+^{r-1} \ge x\big)\big)\, dx = \int_0^\infty g\big(1 - F_X(x^{1/(r-1)} + (\theta - b))\big)\, dx = \int_0^\infty \big(1 - F_Z(x^{1/(r-1)} + (\theta - b))\big)\, dx = \mathrm{E}(Z + b - \theta)_+^{r-1}.$$
Similarly, we have that $\mathrm{CE}_{\tilde{P}_g}(X + b - \theta)_-^{r-1} = \mathrm{E}(Z + b - \theta)_-^{r-1}$, so we conclude that $q_\alpha(X + b) = q_\alpha(Z + b)$. Since $q_\alpha(Z + b) = q_\alpha(Z) + b$ and $q_\alpha(X) = q_\alpha(Z)$ hold, 4 is proved.
It also holds that $\mathrm{CE}_{P_g}(\lambda X - \theta)_+^{r-1} = \mathrm{E}(\lambda Z - \theta)_+^{r-1}$ and, analogously, $\mathrm{CE}_{\tilde{P}_g}(\lambda X - \theta)_-^{r-1} = \mathrm{E}(\lambda Z - \theta)_-^{r-1}$, which proves that $q_\alpha(\lambda X) = q_\alpha(\lambda Z)$; since the undistorted M-quantile satisfies $q_\alpha(\lambda Z) = \lambda\, q_\alpha(Z)$, 5 is proved.
If $X \le Y$, then $(X - \theta)_+^{r-1} \le (Y - \theta)_+^{r-1}$, hence $\{(X - \theta)_+^{r-1} \ge x\} \subseteq \{(Y - \theta)_+^{r-1} \ge x\}$, and since $g$ is a non-decreasing function, we have
$$\mathrm{CE}_{P_g}(X - \theta)_+^{r-1} = \int_0^\infty g\big(P\big((X - \theta)_+^{r-1} \ge x\big)\big)\, dx \le \int_0^\infty g\big(P\big((Y - \theta)_+^{r-1} \ge x\big)\big)\, dx = \mathrm{CE}_{P_g}(Y - \theta)_+^{r-1},$$
and in the same way $\mathrm{CE}_{\tilde{P}_g}(X - \theta)_-^{r-1} \ge \mathrm{CE}_{\tilde{P}_g}(Y - \theta)_-^{r-1}$, hence $q_\alpha(X) \le q_\alpha(Y)$.    □
Remark 1.
The distorted M-quantiles of order $r = 2$ (distorted expectiles) are denoted by $e_\alpha$. The most central distorted expectile (obtained at level $\alpha = 1/2$) of a random variable $X$ is the symmetric Choquet expectation of $X$ with respect to $P_g = g(P)$, that is, $e_{1/2}(X) = \mathrm{CE}_{P_g} X$, which is the mean of a random variable whose cdf is $\tilde{g}(F_X)$.
The inverse distorted M-quantile of $x \in \mathbb{R}$ with respect to a random variable $X$ is the level $\alpha$ for which $x$ matches the distorted M-quantile of $X$, $x = q_\alpha(X)$. It is computed by solving (4) for $\alpha$ and yields
$$q_X^{-1}(x) = \left(1 + \frac{\mathrm{CE}_{P_g}(X - x)_+^{r-1}}{\mathrm{CE}_{\tilde{P}_g}(X - x)_-^{r-1}}\right)^{-1}, \qquad (5)$$
with the convention $0/0 = 0$, which is only needed if $X$ follows a degenerate distribution at $x$. We use the symbol $e_X^{-1}(x)$ to indicate the inverse distorted expectile function, corresponding to (5) for $r = 2$.

Sample Distorted M-Quantiles

Let $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ be an ordered sample of real numbers. Let $g$ be a distortion function, and for $i = 1, \ldots, n$ define the weights
$$w_i = \tilde{g}\!\left(\frac{i}{n}\right) - \tilde{g}\!\left(\frac{i-1}{n}\right) = g\!\left(1 - \frac{i-1}{n}\right) - g\!\left(1 - \frac{i}{n}\right);$$
hence, it is immediate that $w_i \ge 0$ for all $i$ and $\sum_i w_i = 1$.
For $0 < \alpha < 1$, as in (4), the sample distorted M-quantile, denoted by $q_{n,\alpha}(x_{(1)}, \ldots, x_{(n)})$ or simply $q_{n,\alpha}$ when there is no possible confusion about the sample, can be computed as the solution to the equation
$$\alpha \sum_{x_{(i)} \ge q_{n,\alpha}} w_i \big(x_{(i)} - q_{n,\alpha}\big)^{r-1} = (1 - \alpha) \sum_{x_{(i)} < q_{n,\alpha}} w_i \big(q_{n,\alpha} - x_{(i)}\big)^{r-1}. \qquad (6)$$
Notice from (6) that the corresponding sample inverse distorted M-quantile is
$$q_n^{-1}(x) = \left(1 + \frac{\sum_{x_{(i)} > x} w_i (x_{(i)} - x)^{r-1}}{\sum_{x_{(i)} < x} w_i (x - x_{(i)})^{r-1}}\right)^{-1}. \qquad (7)$$
Like the sample expectile, the sample distorted expectile ($r = 2$), denoted by $e_{n,\alpha}$, can be calculated from (6) by means of iterated reweighting, while its inverse follows (7) and is denoted by $e_n^{-1}$.
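The weights $w_i$ and Equation (6) translate directly into code. The sketch below is our own (all names are hypothetical) and solves (6) by bisection, which works for any order $r > 1$ and any distortion $g$ passed as a vectorized function.

```python
import numpy as np

def distortion_weights(n, g):
    """Weights w_i = g(1 - (i-1)/n) - g(1 - i/n) on the ordered sample."""
    i = np.arange(1, n + 1)
    return g(1 - (i - 1) / n) - g(1 - i / n)

def sample_distorted_m_quantile(x, alpha, r, g, tol=1e-10):
    """Sample distorted M-quantile of order r > 1 and level alpha,
    solving equation (6) by bisection on q."""
    xs = np.sort(np.asarray(x, dtype=float))
    w = distortion_weights(len(xs), g)

    def foc(q):
        # left minus right side of equation (6)
        return (alpha * np.sum(w * np.maximum(xs - q, 0.0) ** (r - 1))
                - (1 - alpha) * np.sum(w * np.maximum(q - xs, 0.0) ** (r - 1)))

    lo, hi = xs[0], xs[-1]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With the identity distortion, $r = 2$, and $\alpha = 1/2$, this returns the sample mean; with a trim distortion, the outermost observations receive zero weight and stop influencing the result.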

4. Distorted M-Quantile Central Regions and Depth

4.1. Distorted M-Quantile Central Regions

A central region associated with a random vector contains valuable information: the location, scatter, and dependency structure among the components of the vector are reflected in the region's location, size, and shape. Based on the (univariate) distorted M-quantiles, we build the distorted M-quantile central regions as an intersection of halfspaces.
Consider a $d$-dimensional random vector $\mathbf{X}$ such that $\mathrm{CE}_{P_g}\|\mathbf{X}\|^{r-1}$ and $\mathrm{CE}_{\tilde{P}_g}\|\mathbf{X}\|^{r-1}$ are finite for some $r \ge 1$ and $0 < \alpha \le 1$. The distorted M-quantile region of order $r$ and level $\alpha$ is defined as
$$Q_\alpha(\mathbf{X}) = \bigcap_{u \in S^{d-1}} \big\{x \in \mathbb{R}^d : \langle x, u \rangle \le q_{1-\alpha}(\langle \mathbf{X}, u \rangle)\big\}, \qquad (8)$$
where $S^{d-1}$ is the unit sphere in $\mathbb{R}^d$ and $\langle \cdot, \cdot \rangle$ is the standard inner product in $\mathbb{R}^d$.
Notice that in the univariate case ($d = 1$), the region is the closed interval
$$Q_\alpha(X) = \big[-q_{1-\alpha}(-X),\, q_{1-\alpha}(X)\big] = \big[\tilde{q}_\alpha(X),\, q_{1-\alpha}(X)\big].$$
If the distortion function $g$ is the identity, the resulting (undistorted) M-quantile regions were already studied in [10], while if, in addition, $r = 2$, the expectile regions were the specific focus of [9].
From its construction and Theorem 1, it is clear that the distorted M-quantile regions satisfy the properties that are stated below.
Proposition 1.
Let $\mathbf{X}$ be a $d$-dimensional random vector, and fix $0 < \alpha \le 1$ and $r \ge 1$ such that $\mathrm{CE}_{P_g}\|\mathbf{X}\|^{r-1}$ and $\mathrm{CE}_{\tilde{P}_g}\|\mathbf{X}\|^{r-1}$ are finite; then
1. 
Compactness and convexity. $Q_\alpha(\mathbf{X})$ is a compact and convex subset of $\mathbb{R}^d$.
2. 
Nesting. The lower the level, the wider the region: $Q_\alpha(\mathbf{X}) \subseteq Q_\beta(\mathbf{X})$ if $0 < \beta \le \alpha \le 1$.
3. 
Affine equivariance. $Q_\alpha(A\mathbf{X} + b) = A\, Q_\alpha(\mathbf{X}) + b$ for any non-singular matrix $A \in \mathbb{R}^{d \times d}$ and $b \in \mathbb{R}^d$.
4. 
Contains the centermost point. If the distribution of $\mathbf{X}$ is centrally symmetric about $y_0$, that is, $P(y_0 - \mathbf{X} \in H) = P(\mathbf{X} - y_0 \in H)$ for every halfspace $H \subseteq \mathbb{R}^d$, then $y_0 \in Q_\alpha(\mathbf{X})$ whenever $Q_\alpha(\mathbf{X})$ is non-void.
Proof. 
Notice that the compactness and convexity of these regions follow from their construction as an intersection of closed halfspaces; the nesting is immediate from the monotonicity of the M-quantiles in $\alpha$; and the affine equivariance is a direct consequence of the translation equivariance and positive homogeneity of distorted M-quantiles, presented in Theorem 1.
We will only show here that, for centrally symmetric distributions, every non-void distorted M-quantile central region contains the point of symmetry. Denote the point of symmetry by $y_0$ and observe that the central region $Q_\alpha(\mathbf{X})$ introduced in (8) can be rewritten as
$$y_0 + Q_\alpha(\mathbf{X} - y_0) = \bigcap_{u \in S^{d-1}} \big\{y_0 + x \in \mathbb{R}^d : -q_{1-\alpha}(\langle y_0 - \mathbf{X}, u \rangle) \le \langle x, u \rangle \le q_{1-\alpha}(\langle \mathbf{X} - y_0, u \rangle)\big\}.$$
After [23] Lem. 2.1, the distribution of $\mathbf{X}$ is centrally symmetric about $y_0$ if, and only if, for every $u \in S^{d-1}$, the random variables $\langle \mathbf{X} - y_0, u \rangle$ and $\langle y_0 - \mathbf{X}, u \rangle$ share the same distribution, so their distorted M-quantiles must be identical. Consequently, either $q_{1-\alpha}(\langle \mathbf{X} - y_0, u \rangle) \ge 0$ for all $u \in S^{d-1}$, and then $y_0 \in Q_\alpha(\mathbf{X})$, or alternatively $Q_\alpha(\mathbf{X})$ is void.    □

4.2. Distorted M-Quantile Sample Regions

Given a sample $\{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^d$ and a unit vector $u \in S^{d-1}$, let $\pi_u$ be the permutation of $\{1, 2, \ldots, n\}$ such that $\langle x_{\pi_u(1)}, u \rangle \le \cdots \le \langle x_{\pi_u(n)}, u \rangle$; then, for $0 < \alpha \le 1$, the sample distorted M-quantile region is either void or, for sufficiently small $\alpha$, yields the polyhedral set
$$Q_{n,\alpha} = \bigcap_{u \in S^{d-1}} \big\{x \in \mathbb{R}^d : \langle x, u \rangle \le q_{n,1-\alpha}\big(\langle x_{\pi_u(1)}, u \rangle, \ldots, \langle x_{\pi_u(n)}, u \rangle\big)\big\}.$$
In Figure 1, we illustrate the robustness to outliers of the distorted M-quantile regions with the trim distortion function in comparison with the undistorted ones. This illustration includes the expectile regions and the halfspace ones.
For this example, we have simulated 190 observations of a bivariate normal random vector with mean vector $(10, 20)$ and covariance matrix $\begin{pmatrix} 5 & 4 \\ 4 & 4 \end{pmatrix}$, and 10 further observations of a bivariate normal random vector with mean $(16, 16)$ and identity covariance matrix. Specifically, Figure 1 presents, on the left, several distorted M-quantile regions (contours of the halfspace regions, $r = 1$, at the top; of the M-quantile regions for $r = 1.5$ in the second row; of the expectile regions, $r = 2$, in the third row; and contours of the M-quantile regions for $r = 3$ at the bottom) with a trim distortion level $\beta = 0.1$ and, on the right, the corresponding (undistorted) M-quantile regions. The depth levels considered are $0.005$, $0.02$, $0.05$, $0.1$, $0.2$, $0.3$, and $0.4$.
Observe that the outermost halfspace regions (top right) are highly affected by the presence of outliers, while their trimmed counterparts (top left) barely change in shape with respect to the inner regions; these inner regions are almost identical on both sides. The situation with the remaining outer M-quantile regions is much the same, but changes substantially for the inner regions: the greater the order $r$, the more the inner M-quantile regions are also affected by the outliers. This situation is corrected by the distorted M-quantile regions.

4.3. Distorted M-Quantile Data Depth Function

Once we have introduced the M-quantile regions, it is possible to construct the corresponding distorted M-quantile depth for any fixed order $r \ge 1$ in a natural way. The depth of each point $y \in \mathbb{R}^d$ is the maximum level $\alpha$ such that $y$ belongs to the region $Q_\alpha(\mathbf{X})$, that is,
$$\mathrm{QD}(y; \mathbf{X}) = \sup\{0 < \alpha \le 1 : y \in Q_\alpha(\mathbf{X})\}.$$
The so-defined distorted M-quantile depth satisfies the usual properties of a depth function.
Proposition 2.
Let $\mathbf{X}$ be a $d$-dimensional random vector, and fix $r \ge 1$ such that $\mathrm{CE}_{P_g}\|\mathbf{X}\|^{r-1}$ and $\mathrm{CE}_{\tilde{P}_g}\|\mathbf{X}\|^{r-1}$ are finite; then
1. 
Affine invariance. The depth of a point $y \in \mathbb{R}^d$ is independent of the underlying coordinate system; that is, for any non-singular matrix $A \in \mathbb{R}^{d \times d}$ and $b \in \mathbb{R}^d$,
$$\mathrm{QD}(Ay + b; A\mathbf{X} + b) = \mathrm{QD}(y; \mathbf{X}).$$
2. 
Upper semicontinuity. $\mathrm{QD}(\cdot\,; \mathbf{X})$ is an upper semicontinuous function.
3. 
Maximality at center. If the distribution of $\mathbf{X}$ is centrally symmetric about $y_0$, then $\mathrm{QD}(\cdot\,; \mathbf{X})$ attains its maximum value at $y_0$.
4. 
Decreasing on rays. If $y_0 \in \mathbb{R}^d$ is the deepest point and $y \in \mathbb{R}^d$, then for $0 \le \lambda \le 1$ it holds that
$$\mathrm{QD}(y; \mathbf{X}) \le \mathrm{QD}(\lambda y + (1 - \lambda) y_0; \mathbf{X}).$$
5. 
Vanishing at infinity. The depth of a point $y$ goes to zero as $\|y\| \to \infty$.
It is not hard to realize that these properties follow from those of the distorted M-quantile regions. Notice further that, unlike for the (undistorted) expectile depth, the maximum value of the distorted expectile depth of a given distribution is not necessarily $1/2$, while, in the univariate setting, the distorted M-quantile depth renders
$$\mathrm{QD}(y; X) = \min\big\{1 - q_X^{-1}(y),\; 1 - q_{-X}^{-1}(-y)\big\}.$$
The result below gives us a useful expression to compute the multivariate distorted M-quantile depth of a given point.
Theorem 2.
The distorted M-quantile depth of a point $y \in \mathbb{R}^d$ with respect to the distribution of the $d$-dimensional random vector $\mathbf{X}$ satisfies:
$$\mathrm{QD}(y; \mathbf{X}) = \left(1 + \sup_{u \in S^{d-1}} \frac{\mathrm{CE}_{\tilde{P}_g}\langle \mathbf{X} - y, u \rangle_-^{r-1}}{\mathrm{CE}_{P_g}\langle \mathbf{X} - y, u \rangle_+^{r-1}}\right)^{-1}.$$
Proof. 
Just notice that
$$\begin{aligned} \mathrm{QD}(y; \mathbf{X}) &= \sup\{0 < \alpha \le 1 : y \in Q_\alpha(\mathbf{X})\} \\ &= \sup\{0 < \alpha \le 1 : \langle y, u \rangle \le q_{1-\alpha}(\langle \mathbf{X}, u \rangle) \text{ for all } u \in S^{d-1}\} \\ &= \sup\{0 < \alpha \le 1 : q_{\langle \mathbf{X}, u \rangle}^{-1}(\langle y, u \rangle) \le 1 - \alpha \text{ for all } u \in S^{d-1}\} \\ &= \inf_{u \in S^{d-1}} \big(1 - q_{\langle \mathbf{X}, u \rangle}^{-1}(\langle y, u \rangle)\big) = \left(1 + \sup_{u \in S^{d-1}} \frac{\mathrm{CE}_{\tilde{P}_g}\langle \mathbf{X} - y, u \rangle_-^{r-1}}{\mathrm{CE}_{P_g}\langle \mathbf{X} - y, u \rangle_+^{r-1}}\right)^{-1} \end{aligned}$$
because of (5).    □
The special case of Theorem 2 for r = 2 is used in Appendix A to build Algorithm A1 for the exact computation of the bivariate distorted expectile depth. That algorithm, extensively discussed in the appendix, is not as fast as the one for the standard bivariate expectile depth in [9], because it must consider all possible sortings in terms of the less-or-equal order applied to the univariate projections of the data points. Nevertheless, it can be used with any distortion function, in particular with one that downweights the effect of the outlying observations, as can be seen in Figure 2. Since the deepest point with respect to the expectile depth is the sample mean, all expectile regions are sensitive to outliers and we find these modifications in terms of distortions rather convenient.
Observe that if the distortion function $g$ is symmetric, then the distorted M-quantile depth yields the very appealing expression
$$\mathrm{QD}(y; \mathbf{X}) = \inf_{u \in S^{d-1}} \frac{\mathrm{CE}_{P_g}\langle \mathbf{X} - y, u \rangle_+^{r-1}}{\mathrm{CE}_{P_g}\big|\langle \mathbf{X} - y, u \rangle\big|^{r-1}}.$$
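For a symmetric distortion, the last expression suggests a straightforward approximation of the bivariate depth: replace the infimum over $S^{d-1}$ by a minimum over a finite grid of directions, and the Choquet expectations by weighted sums over the ordered sample projections with the weights $w_i$ of Section 3. This sketch is our own and only approximates the exact algorithm of Appendix A; the function name is hypothetical.

```python
import numpy as np

def approx_distorted_mq_depth(y, data, r, g, n_dirs=360):
    """Approximate distorted M-quantile depth of y in R^2 for a symmetric
    distortion g, minimizing over a grid of directions u the ratio
    sum_i w_i (z_(i))_+^{r-1} / sum_i w_i |z_(i)|^{r-1},
    where z_(i) are the ordered projections of the data minus y on u."""
    data = np.asarray(data, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(data)
    i = np.arange(1, n + 1)
    w = g(1 - (i - 1) / n) - g(1 - i / n)   # weights for the ordered projections
    depth = np.inf
    for a in np.linspace(0.0, 2 * np.pi, n_dirs, endpoint=False):
        u = np.array([np.cos(a), np.sin(a)])
        z = np.sort((data - y) @ u)
        den = np.sum(w * np.abs(z) ** (r - 1))
        if den > 0:
            depth = min(depth, np.sum(w * np.maximum(z, 0.0) ** (r - 1)) / den)
    return depth
```

With the identity distortion and $r = 2$, this approximates the expectile depth of [9].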
Remark 2.
The trim distortion serves to remove the outermost observations of each univariate projection in the computation of any M-quantile depth. Every distribution or $d$-dimensional dataset has a (set of) point(s) whose halfspace depth is at least $1/(d+1)$, see Donoho and Gasko [24]. This makes it sensible to consider the trim distortion function with some trimming level $\beta < 2/(d+1)$ in order to build a depth function that is not affected by outlying observations. Further, the point of maximal depth would constitute a robust location estimate. This is particularly interesting for the expectile depth ($r = 2$), which, when not trimmed, attains its maximal value at the mean.

4.4. Multivariate Ranks Based on the Distorted M-Quantile Depth

Following the ideas summarized in [5], given a multivariate dataset, we can assign ranks to its points in terms of their distorted M-quantile depths with respect to the dataset. After computing the depth of every point, rank 1 is assigned to the $k$ points with the lowest depth, rank $k + 1$ to the points with the second lowest depth, and so on, until the highest rank is assigned to the deepest point of the dataset. By means of this rank, we introduce a notion of ordering in $\mathbb{R}^d$ which could be used in further statistical analysis, see, for example, [25].
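The ranking rule just described can be sketched as follows (our own helper, with a hypothetical name):

```python
import numpy as np

def depth_ranks(depths):
    """Ranks from depth values: the k points with the lowest depth all
    receive rank 1, the next tied group receives rank k + 1, and so on,
    so the deepest point(s) end up with the highest rank."""
    depths = np.asarray(depths, dtype=float)
    order = np.argsort(depths)
    ranks = np.empty(len(depths), dtype=int)
    i = 0
    while i < len(order):
        j = i + 1
        while j < len(order) and depths[order[j]] == depths[order[i]]:
            j += 1
        ranks[order[i:j]] = i + 1   # tied depths share the same rank
        i = j
    return ranks
```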
In Figure 2, we have plotted 39 points simulated from a bivariate normal distribution centered at the origin of coordinates $(0, 0)$ with covariance matrix $\begin{pmatrix} 1 & 0.9 \\ 0.9 & 1 \end{pmatrix}$, along with an extreme outlier at $(7, 7)$, and ranked them by means of several distorted M-quantile depth functions: the halfspace depth (top left), the expectile depth (top right), the distorted expectile depth with the trim distortion function $g_{0.1}$, i.e., 10% trimming (bottom left), and with the sigmoid distortion of parameter $\delta = 3$ (bottom right). Notice that for the expectile depth, the highest ranks are awarded to the points lying closest to the outlier among those simulated from the normal distribution centered at the origin. When some observations are trimmed, the highest rank for the distorted expectile depth moves towards the origin, but some inner points also receive rank 1. Finally, when the sigmoid distortion function is used, the points with the highest ranks are close to the origin, while all points in the interior of the convex hull of the data cloud receive a rank greater than 1. Since many rank numbers overlap, they are printed in color: the deeper an observation, the lighter the color used for it.
The halfspace depth was calculated using the R package ddalpha by Pokotylo et al. [26].

5. Distorted M-Quantile Depth and Regression Models

Assume that the response of a linear regression model is given in terms of a $d$-dimensional random vector $\mathbf{Y}$, while there are $p$ regressors forming a $p$-dimensional random vector $\mathbf{X}$. If $x \in \mathbb{R}^p$, we use the notation $\mathring{x}$ to represent the element in $\mathbb{R}^{p+1}$ whose first component is 1 and whose remaining components are equal to those of $x$. For each observation $i = 1, \ldots, n$, we can write the multivariate regression model as
$$\mathbf{Y}_i = \Theta^{\top} \mathring{\mathbf{X}}_i + \epsilon_i,$$
where $\Theta$ is a $(p+1) \times d$ matrix whose columns contain the regression coefficients of the multiple linear regression models that fit each of the components of the random vector $\mathbf{Y}$, and $\epsilon_i$ is the (vector) error term associated with the $i$-th observation.
Instead of estimating matrix Θ , we will focus on studying the conditional central regions of Y for any level α and given the value assumed by the explanatory vector X .

5.1. Distorted M-Quantile Regression Hyperplanes for a Univariate Response

By analogy with (1), for a univariate response $Y$ and a $p$-dimensional vector of regressors $\mathbf{X}$, the distorted M-quantile regression hyperplane of order $r \ge 1$ and level $0 < \alpha \le 1$ is defined through the vector of coefficients $\theta = (\theta_0, \ldots, \theta_p) \in \mathbb{R}^{p+1}$ minimizing the expression
$$q_\alpha(Y \,|\, \mathbf{X}) = \arg\min_{\theta \in \mathbb{R}^{p+1}} \alpha\, \mathrm{CE}_{P_g}\big(Y - \langle \mathring{\mathbf{X}}, \theta \rangle\big)_+^r + (1-\alpha)\, \mathrm{CE}_{\tilde{P}_g}\big(Y - \langle \mathring{\mathbf{X}}, \theta \rangle\big)_-^r,$$
and the hyperplane results to be $H_\alpha(Y \,|\, \mathbf{X}) = \{(x, y) \in \mathbb{R}^{p+1} : \langle q_\alpha(Y \,|\, \mathbf{X}), \mathring{x} \rangle = y\}$. When the distortion function is the identity, $g(x) = x$, we denote the solution as $q_\alpha(Y \,|\, \mathbf{X})$ and the M-quantile regression hyperplane of order $r \ge 1$ as $H_\alpha(Y \,|\, \mathbf{X})$, while the notations $\tilde{q}_\alpha(Y \,|\, \mathbf{X})$ and $\tilde{H}_\alpha(Y \,|\, \mathbf{X})$ correspond to the coefficients and hyperplane obtained for the dual distortion function.
The coefficients of the distorted M-quantile regression hyperplanes fulfill properties similar to those presented in Theorem 1 for the distorted M-quantiles.
Proposition 3.
For a response random variable $Y$, $r \ge 1$, $0 < \alpha \le 1$, and a $p$-variate vector of regressors $\mathbf{X}$, it holds that
1. 
Upper and lower M-quantiles. $q_\alpha(-Y \,|\, \mathbf{X}) = -\tilde{q}_{1-\alpha}(Y \,|\, \mathbf{X})$ and, if the distortion $g$ is symmetric, $q_\alpha(-Y \,|\, \mathbf{X}) = -q_{1-\alpha}(Y \,|\, \mathbf{X})$.
2. 
Translation equivariance. $q_\alpha(b + Y \,|\, \mathbf{X}) = (b, 0, \ldots, 0) + q_\alpha(Y \,|\, \mathbf{X})$ for any $b \in \mathbb{R}$.
3. 
Positive homogeneity. $q_\alpha(\lambda Y \,|\, \mathbf{X}) = \lambda\, q_\alpha(Y \,|\, \mathbf{X})$ if $\lambda \ge 0$.
In terms of the M-quantile regression hyperplane, 1. means that $H_\alpha(-Y \,|\, \mathbf{X})$ is identical to $\tilde{H}_{1-\alpha}(Y \,|\, \mathbf{X})$ except for a change of sign in the last coordinate; 2. states that the only coefficient affected by a translation in $Y$ is the one corresponding to the offset of the hyperplane, $\theta_0$, so $H_\alpha(b + Y \,|\, \mathbf{X})$ matches $H_\alpha(Y \,|\, \mathbf{X})$ with a translation of $b$ units in the last coordinate; finally, 3. states that when the response is rescaled, $H_\alpha(\lambda Y \,|\, \mathbf{X})$ matches $H_\alpha(Y \,|\, \mathbf{X})$ except that the last coordinate is multiplied by $\lambda$.
Due to the similarities in the way q α ( Y ) and q α ( Y | X ) are defined, the proof of Proposition 3 is similar to the proof of Theorem 1.

5.2. Distorted M-Quantile Regression and Conditional Regions

We use the name regression regions for sets in the (p+d)-dimensional space containing the observations of both the explanatory and the response variables, while by conditional regions, given some specific value of the explanatory variables, we refer to sets in the d-dimensional space of the response variables.

5.2.1. Univariate Response

The distorted M-quantile regression region of level α in a linear regression model is the set of points in R^{p+1} comprised between the regression hyperplanes H_{1−α}(Y|X) and H̃_α(Y|X), whenever H_{1−α}(Y|X) lies above H̃_α(Y|X) in the last coordinate (the one associated with the response variable). Since q̃_α(Y|X) = −q_{1−α}(−Y|X), see Proposition 3, we can write it as

\[ Q_\alpha(Y|\mathbf{X}) = \left\{ (x,y)\in\mathbb{R}^{p+1} : -\langle q_{1-\alpha}(-Y|\mathbf{X}), \mathring{x}\rangle \le y \le \langle q_{1-\alpha}(Y|\mathbf{X}), \mathring{x}\rangle \right\}, \]
and recall that, if g is symmetric, −q_{1−α}(−Y|X) = q_α(Y|X).
The distorted M-quantile conditional region when X = x_0 is the interval given by the projection on the last coordinate of the intersection of Q_α(Y|X) with the affine line constituted by all points whose first p coordinates are identical to x_0. We can write this region interchangeably in set or interval notation,
\[ Q_\alpha(Y|\mathbf{X}=x_0) = \{ y\in\mathbb{R} : (x_0,y)\in Q_\alpha(Y|\mathbf{X}) \} = \left[ -\langle q_{1-\alpha}(-Y|\mathbf{X}), \mathring{x}_0\rangle,\; \langle q_{1-\alpha}(Y|\mathbf{X}), \mathring{x}_0\rangle \right], \]
and it may be void for some values of α and x_0. If g is symmetric, the interval adopts the form [⟨q_α(Y|X), x̊_0⟩, ⟨q_{1−α}(Y|X), x̊_0⟩].
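Under the identity distortion and r = 2, the conditional interval can be approximated by fitting the two bounding hyperplanes numerically. In the sketch below (the routine and data are illustrative, not the authors' implementation), the upper bound comes from the level-(1−α) fit to Y and the lower bound from the level-(1−α) fit to −Y with a sign flip:

```python
import numpy as np
from scipy.optimize import minimize

def fit(X, y, alpha, r=2):
    # Undistorted sample M-quantile regression coefficients (see Section 5.1)
    Xr = np.column_stack([np.ones(len(y)), X])
    loss = lambda th: (alpha * np.sum(np.maximum(y - Xr @ th, 0.0) ** r)
                       + (1 - alpha) * np.sum(np.maximum(Xr @ th - y, 0.0) ** r))
    th0 = np.linalg.lstsq(Xr, y, rcond=None)[0]
    return minimize(loss, th0, method="Nelder-Mead").x

def conditional_interval(X, y, alpha, x0):
    """Approximate Q_alpha(Y | X = x0) as the interval [lower, upper]."""
    x0r = np.concatenate([[1.0], np.atleast_1d(x0)])
    upper = fit(X, y, 1 - alpha) @ x0r
    lower = -(fit(X, -y, 1 - alpha) @ x0r)
    return lower, upper
```

For symmetric distortions the lower bound could equivalently be computed from the level-α fit to Y itself.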
By Proposition 3, it is immediate that the univariate distorted M-quantile conditional regions are affine equivariant, since Q_α(b + aY|X = x_0) = b + aQ_α(Y|X = x_0) for a, b ∈ R.

5.2.2. General (Multivariate) Response

If the response Y is a d-dimensional random vector, we define the distorted M-quantile regression regions in terms of intersections of halfspaces. Specifically, all univariate projections of the response are obtained by multiplying it by every element of the unit sphere S^{d−1}; the regression hyperplanes of all these projections are computed, and the regression regions are finally produced as intersections of halfspaces whose boundaries are precisely those regression hyperplanes,
\[ Q_\alpha(\mathbf{Y}|\mathbf{X}) = \bigcap_{u\in S^{d-1}} \left\{ (x,y)\in\mathbb{R}^{p+d} : \langle y,u\rangle \le \langle q_{1-\alpha}(\langle\mathbf{Y},u\rangle|\mathbf{X}), \mathring{x}\rangle \right\}. \]
The distorted M-quantile conditional region of level α and given that the regressors assume value x 0 R p is the subset of R d obtained as the projection on the last d coordinates of the intersection of the distorted M-quantile regression region Q α ( Y | X ) with the d-dimensional affine space constituted by all points whose first p coordinates are identical to  x 0 ,
\[ Q_\alpha(\mathbf{Y}|\mathbf{X}=x_0) = \{ y\in\mathbb{R}^d : (x_0,y)\in Q_\alpha(\mathbf{Y}|\mathbf{X}) \} = \bigcap_{u\in S^{d-1}} \left\{ y\in\mathbb{R}^d : \langle y,u\rangle \le \langle q_{1-\alpha}(\langle\mathbf{Y},u\rangle|\mathbf{X}), \mathring{x}_0\rangle \right\}. \]
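For a bivariate response, the intersection over S^{d−1} can be approximated with a finite grid of directions. A membership-check sketch under the identity distortion follows (the grid size, fitting routine, and data in the usage below are our choices, not the paper's):

```python
import numpy as np
from scipy.optimize import minimize

def fit(X, y, alpha, r=2):
    # Undistorted sample M-quantile regression coefficients
    Xr = np.column_stack([np.ones(len(y)), X])
    loss = lambda th: (alpha * np.sum(np.maximum(y - Xr @ th, 0.0) ** r)
                       + (1 - alpha) * np.sum(np.maximum(Xr @ th - y, 0.0) ** r))
    return minimize(loss, np.linalg.lstsq(Xr, y, rcond=None)[0],
                    method="Nelder-Mead").x

def in_conditional_region(yq, Y, X, x0, alpha, n_dirs=60):
    """Test y in Q_alpha(Y | X = x0) over a grid of directions u in S^1:
    membership requires <y, u> <= <q_{1-alpha}(<Y, u>|X), x0_ring> for every u."""
    yq = np.asarray(yq, dtype=float)
    x0r = np.concatenate([[1.0], np.atleast_1d(x0)])
    for t in np.linspace(0.0, 2.0 * np.pi, n_dirs, endpoint=False):
        u = np.array([np.cos(t), np.sin(t)])
        if yq @ u > fit(X, Y @ u, 1 - alpha) @ x0r:
            return False
    return True
```

A finer direction grid shrinks the (outer) approximation of the region toward the true intersection of halfspaces.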
Despite the fact that distorted M-quantile conditional regions might be void for certain values of α and x 0 , they still satisfy some of the usual properties of the classical central regions:
  • Compactness and convexity. Q α ( Y | X = x 0 ) is a compact and convex set in R d .
  • Affine equivariance. Q_α(AY + b|X = x_0) = AQ_α(Y|X = x_0) + b for any non-singular matrix A ∈ R^{d×d} and vector b ∈ R^d.
These properties are immediately derived from those of the M-quantile regression hyperplanes. Observe that, among the properties of the conditional regions, there is no nesting property. The reason is that the regression hyperplanes at different levels may intersect and, consequently, a regression or conditional region of a given level may contain points that do not lie inside the corresponding region of a lower level.

5.3. Distorted M-Quantile Conditional Depth

Once the conditional regions have been defined, the distorted M-quantile conditional depth of a point y R d given that the regressors assume value x 0 R p is obtained, as usual, by
\[ \mathrm{QD}(y; \mathbf{Y}|\mathbf{X}=x_0) = \sup\{ 0<\alpha\le 1 : y\in Q_\alpha(\mathbf{Y}|\mathbf{X}=x_0) \}. \]
Notice that this depth can also be defined in terms of the distorted M-quantile regression regions as the supremum of all levels α , such that ( x 0 , y ) Q α ( Y | X ) .
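Given any membership oracle for the conditional regions, the depth is the supremum of the levels at which the point remains inside. Because the regions need not be nested (see Section 5.2.2), a full scan of a grid of levels is safer than bisection; a generic sketch (the oracle below is a toy stand-in, not an M-quantile region):

```python
import numpy as np

def conditional_depth(y, member, alphas):
    """Approximate QD(y; Y | X = x0) = sup{alpha : y in Q_alpha} on a finite
    grid of levels.  `member(y, alpha)` is any region-membership oracle; the
    whole grid is scanned because the regions may fail to be nested."""
    inside = [a for a in alphas if member(y, a)]
    return max(inside) if inside else 0.0

# Toy oracle for illustration: Q_alpha is the Euclidean ball of radius 1 - alpha
ball = lambda y, a: np.linalg.norm(y) <= 1.0 - a
```

With a nested family (as in the toy oracle) bisection would also work, but the grid scan remains correct when regions at different levels overlap irregularly.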
The distorted M-quantile conditional depth defined in this way satisfies some of the usual properties of a depth function:
  • Affine invariance. The conditional depth of a point y R d is independent of the underlying coordinate system in the space of the response variables, that is, for any non-singular matrix A R d × d and b R d , it holds QD ( A y + b ; A Y + b | X = x 0 ) = QD ( y ; Y | X = x 0 ) .
  • Upper semi-continuity. The function QD ( · ; Y | X = x 0 ) is upper semi-continuous.
  • Vanishing at infinity. The depth of a point y goes to zero as ‖y‖ → ∞.

5.4. Sample Distorted M-Quantile Conditional Regions and Depth

Consider a sample of n joint observations of a d-dimensional response y_i and a p-dimensional predictor x_i, 1 ≤ i ≤ n. Let u ∈ S^{d−1} be any unit vector, and denote by π_u : {1, …, n} → {1, …, n} the permutation that sorts the sample points so that the univariate projections of the responses are increasing, ⟨y_{π_u(1)}, u⟩ ≤ ⋯ ≤ ⟨y_{π_u(n)}, u⟩. Given 0 < α ≤ 1, the sample distorted M-quantile hyperplane of order r in direction u is determined by the coefficients θ = (θ_0, …, θ_p) that minimize the expression
\[ \operatorname*{arg\,min}_{\theta\in\mathbb{R}^{p+1}} \alpha \sum_{i=1}^{n} w_i \left( \langle y_{\pi_u(i)}, u\rangle - \langle \mathring{x}_{\pi_u(i)}, \theta\rangle \right)_{+}^{r} + (1-\alpha) \sum_{i=1}^{n} w_i \left( \langle y_{\pi_u(i)}, u\rangle - \langle \mathring{x}_{\pi_u(i)}, \theta\rangle \right)_{-}^{r}, \]
which are denoted by q_{n,α}(u). As usual, w_i = g̃(i/n) − g̃((i−1)/n) = g(1−(i−1)/n) − g(1−i/n) for all i.
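The weights are first differences of the distortion over a uniform grid, so they always telescope to a total of g(1) − g(0) = 1. A small sketch (the analytic form of the trim distortion g_β is our assumption, written here as g_β(t) = min{t/(1−β), 1}):

```python
import numpy as np

def weights(g, n):
    """Sample weights w_i = g(1 - (i - 1)/n) - g(1 - i/n), i = 1, ..., n."""
    i = np.arange(1, n + 1)
    return g(1 - (i - 1) / n) - g(1 - i / n)

# Hypothetical trim distortion with beta = 0.1 (form assumed for illustration)
g_trim = lambda t: np.minimum(t / 0.9, 1.0)
```

The identity distortion returns the uniform weights 1/n, while the trim distortion annuls the weight of the most extreme ordered projections.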
The sample (distorted) M-quantile conditional region at level α given x 0 R p is obtained as
\[ Q_{n,\alpha}(x_0) = \bigcap_{u\in S^{d-1}} \left\{ y\in\mathbb{R}^d : \langle y,u\rangle \le \langle q_{n,1-\alpha}(u), \mathring{x}_0\rangle \right\}, \]
while the depth of y R d given x 0 is
\[ \mathrm{QD}_n(y|x_0) = \sup\{ 0<\alpha\le 1 : y\in Q_{n,\alpha}(x_0) \}. \]
We have omitted the regression regions in this empirical section, since their basic goal is to serve in the construction of the conditional regions, but they can be introduced in the same manner as their population counterparts.
In Figure 3 we represent the averages of the educational index (Y_1), the life expectancy (Y_2), and the income index (X) as defined at http://hdr.undp.org/en/data (accessed on 9 May 2022). This database contains information about human development for countries all over the world, with records ranging from 1990 to 2019. The conditional regions were built given the 0.05, 0.5, and 0.95 quantiles of X by considering M-quantile regression models with several power loss functions, including r = 1 (halfspace depth regions) and r = 2 (expectile regions). The values of α are taken as 10 equispaced points in the interval [0, 0.45]. Notice that some of those regions remain unplotted since they happen to be void. The identity distortion function (undistorted regions) was used to obtain the charts on the left, while the charts on the right were built using the trim distortion function g_{0.1} with 10% of trimming. Recall that the undistorted regions were already presented in [10,15].
At first sight, all the charts suggest that the educational index (Y_1) and the life expectancy (Y_2) are both positively associated with the income index (X). A closer inspection of the shape of the conditional regions also suggests that the educational index and the life expectancy are positively associated with each other, which is more evident at the extreme values of the income index (see the regions conditioned on the 0.05 and 0.95 quantiles of X). Furthermore, the volume of the regions suggests that the higher the value of the regressor, the lower the variability of the conditional distribution of the response variables, which is particularly evident when considering the trimming level of 10%.

6. Conclusions

After a short introduction to the notion of distorted M-quantiles with power loss functions, emphasizing the distorted expectiles, some novel notions of central regions and data depth have been proposed. Both the distorted M-quantile depth and regions meet the classical properties required of any depth function and family of central regions; that is, the distorted M-quantile depth function is affine invariant, attains a unique maximum value at the center of the distribution (whenever such a center exists), decreases along rays from the deepest point, and vanishes at infinity, whereas the associated regions form a family of compact, convex, and nested sets which are equivariant under affine transformations. The sample distorted M-quantile depth and regions were introduced, and an algorithm was provided for their computation in the bivariate case.
The distorted M-quantile conditional regions, obtained through a multiple-output M-quantile regression model, seem to capture relevant information about the joint distribution of the explained variables conditioned on a given value of the regressors. Hence, we consider that the conditional regions, together with their associated depth, could be useful as a multivariate data analysis tool. In the future, we plan to design specific algorithms for the computation of these conditional depths.

Author Contributions

Conceptualization, I.C.; methodology, I.C. and M.O.; software, I.C. and M.O.; validation, M.O.; formal analysis, I.C. and M.O.; investigation, I.C. and M.O.; resources, I.C. and M.O.; data curation, M.O.; writing—original draft preparation, I.C. and M.O.; writing—review and editing, I.C. and M.O.; visualization, M.O.; supervision, I.C.; project administration, I.C.; funding acquisition, I.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by the Spanish Ministry of Science and Innovation under grant PID2021-123592OB-I00 and the V Regional Plan for Scientific Research and Technological Innovation 2016–2020 of the Community of Madrid, an agreement with Universidad Carlos III de Madrid in the action of “Excellence for University Professors”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data used in Figure 3 is available at http://hdr.undp.org/en/data (accessed on 9 May 2022).

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A. Bivariate Distorted Expectile Depth Routine

The computation of the exact (undistorted) bivariate expectile depth of a point can be performed by means of an algorithm with time complexity O(n log n), where n is the sample size; see [9]. Similarly to the algorithms for the bivariate halfspace and simplicial depths in [27], such a routine moves along the rays with origin at the point whose depth is computed and passing through the points of the dataset, and its time complexity is determined by that of the algorithm that sorts the angles formed by such rays and some fixed reference ray.
The computation of the distorted expectile depth is slightly more complicated due to the (possibly) different weights awarded to each data point by the distortion function for each univariate projection. Assume that we want to compute the depth of the origin of coordinates 0 R 2 , or alternatively subtract the point whose depth is to be computed from each of the points in the sample to reach that situation. Since all possible orderings of the univariate projections of the sample points must be considered, we use a circular sequence algorithm, which serves quite efficiently for this task.
First, the angles formed with the first coordinate axis by each of the n(n−1)/2 straight lines containing a pair of sample points are computed, as well as the angles formed by the coordinate axis and the rays from the origin of coordinates passing through the points of the dataset, and they are sorted, which limits the best possible time complexity to O(n² log n). Then, these angles are used to obtain all possible sortings in terms of a univariate projection and the sign of the projection itself. This type of algorithm is commonly used to compute the extreme points of a bivariate central region, see [28] for the halfspace regions, [29] for the zonoid regions, [30] for the expected convex hull regions, or [9] for the expectile regions. Nevertheless, it has also been used in the computation of some notions of depth, such as the bivariate projection depth in [31].

Appendix A.1. Preliminaries

Consider a sample x_1, …, x_n ∈ R^d, a distortion function g, and r = 2. By Theorem 2, the sample distorted expectile depth of the origin of coordinates with respect to this sample is

\[ \mathrm{QD}_n(\mathbf{0}) = \left( 1 + \sup_{u\in S^{d-1}} \frac{\sum_{i=1}^{n} w_i \langle x_{\pi_u(i)}, u\rangle_{-}}{\sum_{i=1}^{n} w_i \langle x_{\pi_u(i)}, u\rangle_{+}} \right)^{-1}. \tag{A1} \]
In the bidimensional setting, d = 2 , we can write the unit vector u S 1 as u = ( cos γ , sin γ ) for some γ [ 0 , 2 π ) . Consider the two points
\[ \bar{x}_u^{+w} = \sum_{\langle x_{\pi_u(i)}, u\rangle > 0} w_i\, x_{\pi_u(i)} \qquad\text{and}\qquad \bar{x}_u^{-w} = \sum_{\langle x_{\pi_u(i)}, u\rangle < 0} w_i\, x_{\pi_u(i)}, \]
which correspond to the weighted sums of the sample points, ordered according to their univariate projections onto u, first over the halfspace with inner normal vector u and then over the halfspace with inner normal vector −u.
Writing now the former points as their norms times a unit vector, which in terms of the angles φ_+ and φ_− gives x̄_u^{+w} = ‖x̄_u^{+w}‖(cos φ_+, sin φ_+) and x̄_u^{−w} = ‖x̄_u^{−w}‖(cos φ_−, sin φ_−), then (A1) renders:

\[ \mathrm{QD}_n(\mathbf{0}) = \left( 1 + \sup_{0\le\gamma<2\pi} \left( -\,\frac{\|\bar{x}_u^{-w}\|\cos(\gamma-\phi_-)}{\|\bar{x}_u^{+w}\|\cos(\gamma-\phi_+)} \right) \right)^{-1}. \]
Observe next that x̄_u^{−w} and x̄_u^{+w}, for u = (cos γ, sin γ), depend on γ, but are constant over each subinterval of [0, 2π) on which the sign of every ⟨x_i, u⟩ remains unchanged and the permutation π_u remains fixed. The second factor of the function to be maximized is

\[ f(\gamma) = \frac{\cos(\gamma-\phi_-)}{\cos(\gamma-\phi_+)}, \]
whose monotonicity depends only on the sign of sin(φ_− − φ_+), since the derivative of f is

\[ f'(\gamma) = \frac{\sin(\phi_- - \phi_+)}{\cos^2(\gamma-\phi_+)}. \]
This means that the maximum at each subinterval will be attained for γ lying at one of its endpoints.
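The derivative formula behind this endpoint argument can be sanity-checked numerically; a quick sketch (the angle values below are arbitrary):

```python
import math

# f(gamma) = cos(gamma - phi_minus) / cos(gamma - phi_plus)
phi_minus, phi_plus = 2.0, 0.7
f = lambda g: math.cos(g - phi_minus) / math.cos(g - phi_plus)

g0, h = 0.3, 1e-6
numeric = (f(g0 + h) - f(g0 - h)) / (2.0 * h)           # central difference
analytic = math.sin(phi_minus - phi_plus) / math.cos(g0 - phi_plus) ** 2
```

The sign of the analytic derivative does not depend on γ, so f is monotone on each subinterval between consecutive breakpoints, which is exactly why the supremum is attained at the endpoints.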
Finally, the proposed algorithm will consider the partition of [0, 2π) with breaking points at the angles γ for which either ⟨x_i, u⟩ = 0 for some i = 1, …, n or ⟨x_i, u⟩ = ⟨x_j, u⟩ for some i ≠ j.

Appendix A.2. The Routine

The source code for the bivariate distorted expectile depth computation in R is available on the GitHub repository https://github.com/icascos/expdepth as function distexpdepth (available since 28 June 2022).
The input consists of a matrix X of size n × 2 with all data points in general position (no more than two points on the same straight line) and a distortion function g, while the output is the distorted expectile depth of the origin of coordinates for the given data sample and distortion function.
Algorithm A1: Bivariate distorted expectile depth
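A brute-force sketch of the routine in Python (the packaged implementation is the R function distexpdepth referenced above; here the supremum of the depth ratio is approximated by evaluating near every breakpoint angle, and the helper names are ours):

```python
import numpy as np
from itertools import combinations

def ratio(X, w, gamma):
    # Weighted negative over positive parts of the sorted projections onto u
    u = np.array([np.cos(gamma), np.sin(gamma)])
    proj = np.sort(X @ u)                       # projections in increasing order
    neg = w @ np.where(proj < 0, -proj, 0.0)
    pos = w @ np.where(proj > 0, proj, 0.0)
    return np.inf if pos == 0 else neg / pos

def distorted_expectile_depth(X, g=lambda t: t, eps=1e-7):
    """Approximate distorted expectile depth of the origin, bivariate sample X."""
    n = len(X)
    i = np.arange(1, n + 1)
    w = g(1 - (i - 1) / n) - g(1 - i / n)       # distortion weights
    # Breakpoints: u orthogonal to some x_i (a projection changes sign)
    # or to some x_i - x_j (two projections swap order)
    base = [np.arctan2(v[1], v[0]) + np.pi / 2 for v in X]
    base += [np.arctan2(v[1], v[0]) + np.pi / 2
             for a, b in combinations(X, 2) for v in (a - b,)]
    cands = [b + k * np.pi + s for b in base for k in (0, 1) for s in (-eps, eps)]
    sup = max(ratio(X, w, gam) for gam in cands)
    return 1.0 / (1.0 + sup)
```

For a centrally symmetric sample and the identity distortion, the depth of the center is 1/2, the maximum attainable value for expectile-type depths.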
Remark A1.
Since there are n points, n + n(n−1)/2 angles must be considered; hence, computing and ordering the angles of each pair of points can be performed with an algorithm of time complexity O(n² log n). The main loop is run precisely 2n + n(n−1) times, but the number of operations at each iteration does not depend on the sample size n, since only one point is being added, subtracted, or reweighted.
Remark A2.
If the distortion function is symmetric, g = g ˜ , it is possible to skip step (*) as long as every time we update Ratio, we also compare with the inverse of the quotient of the projections of the weighted sums of the data points at each of the two halfplanes.

References

  1. Tukey, J. Mathematics and the picturing of data. Proc. Int. Congr. Math. 1975, 2, 523–531. [Google Scholar]
  2. Liu, R.Y. On a notion of data depth based on random simplices. Ann. Stat. 1990, 18, 405–414. [Google Scholar] [CrossRef]
  3. Rousseeuw, P.; Ruts, I. The depth function of a population distribution. Metrika 1999, 49, 213–244. [Google Scholar]
  4. Koshevoy, G.; Mosler, K. Zonoid trimming for multivariate distributions. Ann. Stat. 1997, 25, 1998–2017. [Google Scholar] [CrossRef]
  5. Cascos, I. Data depth: Multivariate statistics and geometry. In New Perspectives in Stochastic Geometry; Kendall, W.S., Molchanov, I., Eds.; Oxford University Press: Oxford, UK, 2010; pp. 398–423. [Google Scholar]
  6. Liu, R.Y.; Parelius, J.M.; Singh, K. Multivariate analysis by data depth: Descriptive statistics, graphics and inference, (with discussion and a rejoinder by Liu and Singh). Ann. Stat. 1999, 27, 783–858. [Google Scholar] [CrossRef]
  7. Zuo, Y.; Serfling, R. General notions of statistical depth function. Ann. Stat. 2000, 28, 461–482. [Google Scholar]
  8. Zuo, Y.; Serfling, R. Structural properties and convergence results for contours of sample statistical depth functions. Ann. Stat. 2000, 28, 483–499. [Google Scholar]
  9. Cascos, I.; Ochoa, M. Expectile depth: Theory and computation for bivariate datasets. J. Multivar. Anal. 2021, 184, 104757. [Google Scholar] [CrossRef]
  10. Daouia, A.; Paindaveine, D. From halfspace M-depth to multiple-output expectile regression. arXiv 2019, arXiv:1905.12718. [Google Scholar]
  11. Newey, W.; Powell, J. Asymmetric least squares estimation and testing. Econometrica 1987, 55, 819–847. [Google Scholar] [CrossRef]
  12. Breckling, J.; Chambers, R. M-quantiles. Biometrika 1988, 75, 761–771. [Google Scholar] [CrossRef]
  13. Chaudhuri, P. On a Geometric Notion of Quantiles for Multivariate Data. J. Am. Stat. Assoc. 1996, 91, 862–872. [Google Scholar] [CrossRef]
  14. Koltchinskii, V.I. M-estimation, convexity and quantiles. Ann. Stat. 1997, 25, 435–477. [Google Scholar] [CrossRef]
  15. Hallin, M.; Paindaveine, D.; Siman, M. Multivariate quantiles and multiple output regression quantiles: From L1 optimization to halfspace depth. Ann. Stat. 2010, 38, 635–669. [Google Scholar] [CrossRef]
  16. Hallin, M.; Lu, Z.; Paindaveine, D.; Siman, M. Local bilinear multiple-output quantile/depth regression. Bernoulli 2015, 21, 1435–1466. [Google Scholar] [CrossRef]
  17. Merlo, L.; Petrella, L.; Salvati, N.; Tzavidis, N. Marginal M-quantile regression for multivariate dependent data. Comput. Stat. Data Anal. 2022, 173, 107500. [Google Scholar] [CrossRef]
  18. Koenker, R.; Bassett, G. Regression Quantiles. Econometrica 1978, 46, 33–50. [Google Scholar] [CrossRef]
  19. Koenker, R. Quantile Regression; Cambridge University Press: Cambridge, UK, 2005. [Google Scholar]
  20. Bellini, F.; Klar, B.; Müller, A.; Gianin, E.R. Generalized quantiles as risk measures. Insur. Math. Econ. 2014, 54, 41–48. [Google Scholar] [CrossRef]
  21. Denneberg, D. Non-Additive Measure and Integral; Kluwer Academic Publishers: Norwell, MA, USA, 1994. [Google Scholar]
  22. Molchanov, I. Theory of Random Sets, 2nd ed.; Springer: London, UK, 2017. [Google Scholar]
  23. Zuo, Y.; Serfling, R. On the performance of some robust nonparametric location measures relative to a general notion of multivariate symmetry. J. Stat. Plan. Inference 2000, 84, 55–79. [Google Scholar] [CrossRef]
  24. Donoho, D.L.; Gasko, M. Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Ann. Stat. 1992, 20, 1803–1827. [Google Scholar] [CrossRef]
  25. Liu, R.Y.; Singh, K. A Quality Index Based on Data Depth and Multivariate Rank Tests. J. Am. Stat. Assoc. 1993, 88, 252–260. [Google Scholar]
  26. Pokotylo, O.; Mozharovskyi, P.; Dyckerhoff, R. Depth and Depth-Based Classification with R Package ddalpha. J. Stat. Softw. 2019, 91, 1–46. [Google Scholar] [CrossRef]
  27. Rousseeuw, P.; Ruts, I. Algorithm AS 307: Bivariate Location Depth. J. R. Stat. Soc. Ser. C 1996, 45, 516–526. [Google Scholar] [CrossRef]
  28. Ruts, I.; Rousseeuw, P. Computing depth contours of bivariate point clouds. Comput. Stat. Data Anal. 1996, 23, 153–168. [Google Scholar] [CrossRef]
  29. Dyckerhoff, R. Computing zonoid trimmed regions of bivariate data sets. In COMPSTAT 2000. Proceedings in Computational Statistics; Bethlehem, J., Heijden, P., Eds.; Physica-Verlag: Heidelberg, Germany, 2000; pp. 295–300. [Google Scholar]
  30. Cascos, I. The expected convex hull trimmed regions of a sample. Comput. Stat. 2007, 22, 557–569. [Google Scholar] [CrossRef]
  31. Zuo, Y.; Lai, S. Exact computation of bivariate projection depth and the Stahel–Donoho estimator. Comput. Stat. Data Anal. 2011, 55, 1173–1179. [Google Scholar] [CrossRef]
Figure 1. M-quantile regions including halfspace and expectile ones with 10 % of trimming and undistorted.
Figure 2. Depth-based ranks for a bivariate normal sample with an outlier with the halfspace depth ( r = 1 ), the expectile depth ( r = 2 ), and two distorted expectile depths.
Figure 3. Halfspace and M-quantile conditional regions for various power loss functions.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Ochoa, M.; Cascos, I. Data Depth and Multiple Output Regression, the Distorted M-Quantiles Approach. Mathematics 2022, 10, 3272. https://doi.org/10.3390/math10183272


