Inference for Convolutionally Observed Diffusion Processes

Nakakita, Shogo H; Uchida, Masayuki

doi:10.3390/e22091031

by Shogo H Nakakita ^{1,2,3,*} and Masayuki Uchida ^{1,4}

¹ Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyamacho, Toyonaka, Osaka 560-0043, Japan
² Japanese Society for the Promotion of Science, 5 Chome-3-1 Kōjimachi, Chiyoda City, Tokyo 102-0083, Japan
³ The Ronin Institute for Independent Scholarship, 127 Haddon Place, Montclair, NJ 07043, USA
⁴ Center for Mathematical Modeling and Data Science, Osaka University, 1-3 Machikaneyamacho, Toyonaka, Osaka 560-0043, Japan
^* Author to whom correspondence should be addressed.

Entropy 2020, 22(9), 1031; https://doi.org/10.3390/e22091031

Submission received: 21 August 2020 / Revised: 10 September 2020 / Accepted: 11 September 2020 / Published: 15 September 2020

(This article belongs to the Special Issue Machine Learning Meets Stochastic Processes: New Trends for Understanding Complex Systems)


Abstract

We propose a new statistical observation scheme for diffusion processes, named convolutional observation, which handles observations smoother than ordinary diffusion paths by considering the convolution of a diffusion process with a kernel function with respect to the time parameter. We discuss estimation and test theories for the parameter that determines the smoothness of the observation, as well as least-squares-type estimation for the parameters in the diffusion and drift coefficients of the latent diffusion process. In addition to the theoretical discussion, we examine the performance of the estimation and the test by computational simulation, and present a real data analysis of EEG data whose observation can be regarded, with statistical significance, as smoother than an ordinary diffusion process.

Keywords: convolutional observation; diffusion processes; parametric inference; partial observation; statistical modelling; stochastic differential equations

1. Introduction

We consider a d-dimensional diffusion process defined by the following stochastic differential equation,

\begin{matrix} d X_{t} = b (X_{t}, β) d t + a (X_{t}, α) d w_{t}, X_{- λ} = x_{- λ}, \end{matrix}

where $λ > 0$ , ${\{w_{t}\}}_{t \geq - λ}$ is a standard r-dimensional Wiener process, $x_{- λ}$ is an $R^{d}$ -valued random variable independent of ${\{w_{t}\}}_{t \geq - λ}$ , $α \in Θ_{1}$ and $β \in Θ_{2}$ are unknown parameters, $Θ_{1} \subset R^{m_{1}}$ and $Θ_{2} \subset R^{m_{2}}$ are compact and convex parameter spaces, and $a : R^{d} \times Θ_{1} \to R^{d} \otimes R^{r}$ and $b : R^{d} \times Θ_{2} \to R^{d}$ are known functions. Our concern is statistical estimation of $α$ and $β$ from observation. We denote by $θ_{⋆} = (α_{⋆}, β_{⋆})$ the true value of $θ : = (α, β)$ .

We denote the observation as the discretised process $\{{\bar{X}}_{i h_{n}, n} : i = 0, \dots, n\}$ with discretisation step $h_{n} > 0$ such that $h_{n} \to 0$ and $T_{n} : = n h_{n} \to \infty$ , where the convoluted process ${\{{\bar{X}}_{t, n}\}}_{t \geq 0}$ is defined as

\begin{matrix} {\bar{X}}_{t, n} : = \int_{t - \bar{ρ} h_{n}}^{t} V_{h_{n}} (t - s) X_{s} d s = \int_{R} V_{h_{n}} (t - s) X_{s} d s = (V_{h_{n}} * X) (t), \end{matrix}

where $V_{h_{n}}$ is an $R^{d} \otimes R^{d}$ -valued kernel function whose support is a subset of $[0, \bar{ρ} h_{n}]$ , and $\bar{ρ} > 0$ satisfies ${sup}_{n} \bar{ρ} h_{n} \leq λ$ . In this paper, we specify $V_{h_{n}} = V_{ρ, h_{n}}$ , a parametric kernel function whose support is a subset of $[0, \bar{ρ} h_{n}]$ , defined as

\begin{matrix} V_{ρ, h_{n}}^{(i, j)} (t) : = \{\begin{matrix} {(ρ^{(i)} h_{n})}^{- 1} 1_{[0, ρ^{(i)} h_{n}]} (t) & if i = j and ρ^{(i)} > 0, \\ δ (t) & if i = j and ρ^{(i)} = 0, \\ 0 & if i \neq j, \end{matrix} \end{matrix}

where $δ (t)$ is the Dirac delta function and $ρ = {[ρ^{(1)}, \dots, ρ^{(d)}]}^{T} \in Θ_{ρ} : = {[0, \bar{ρ}]}^{d}$ is the smoothing parameter determining the smoothness of the observation. That is to say, the observed process is defined as follows:

\begin{matrix} {\bar{X}}_{i h_{n}, n}^{(ℓ)} = \{\begin{matrix} {(ρ^{(ℓ)} h_{n})}^{- 1} \int_{(i - ρ^{(ℓ)}) h_{n}}^{i h_{n}} X_{s}^{(ℓ)} d s & if ρ^{(ℓ)} > 0, \\ X_{i h_{n}}^{(ℓ)} & if ρ^{(ℓ)} = 0, \end{matrix} \end{matrix}

for all $ℓ = 1, \dots, d$ . We consider both (i) the problem where $ρ$ is a known parameter, and (ii) the problem where $ρ$ is unknown, with true value denoted by $ρ_{⋆}$ , and is estimated from the observation $\{{\bar{X}}_{i h_{n}, n}\}$ . The parameter space is denoted by $Ξ : = Θ_{ρ} \times Θ$ , where $Θ : = Θ_{1} \times Θ_{2}$ .

When $ρ$ is assumed to be known, the literature contains parametric estimation of $α$ and/or $β$ under observation schemes that are special cases of ours for specific $ρ$ . If $ρ = 0$ , our scheme is simply equivalent to parametric inference based on discretely observed diffusion processes $\{X_{i h_{n}} : i = 0, \dots, n\}$ studied in [1,2,3,4,5,6,7,8] and references therein. If $ρ = {[1, \dots, 1]}^{T}$ , we can regard the problem as parametric estimation for integrated diffusion processes discussed in Gloter [9], Ditlevsen and Sørensen [10], Gloter [11], Gloter and Gobet [12], Sørensen [13]. Even for the case $ρ = {[0, \dots, 0, 1, \dots, 1]}^{T}$ , where some axes correspond to direct observation and the others to integrated observation, we give consistent estimators for $α$ and $β$ within the scheme of convolutionally observed diffusion processes; this is one of the contributions of our study.

A further contribution is to treat the scheme where $ρ$ is unknown, which lets us represent a microstructure noise that makes the observation smoother than the latent diffusion process itself, a feature seen in some biomedical time series data. Statistical modelling of biomedical time series data with stochastic differential equations has been studied actively, e.g., [14,15,16,17]. As [18] states, microstructure noise in financial data makes realised volatilities increase as the subsampling frequency gets higher; for instance, see Figure 7.1 on p. 217 in [19]. However, realised volatilities of some biomedical data such as EEG decrease as the subsampling frequency increases. For instance, some time series data for the 2nd participant in the dataset named Two class motor imagery (002-2014) of BNCI Horizon 2020 [20] show a clear tendency of decreasing realised volatilities as the subsampling frequency increases. Figure 1 shows the path of the 2nd axis of the data S02E.mat of BNCI Horizon 2020 [20] for all 222 s (the observation frequency is 512 Hz, and hence the entire data size is 113,664) and that for the first second; it appears to fluctuate like a diffusion process.

We define realised volatilities with subsampling for a sequence of real-valued observations ${\{Y_{i}\}}_{i = 0, \dots, n}$ as

\begin{matrix} R V (k) = \sum_{1 \leq i \leq [n / k]} {(Y_{i k} - Y_{(i - 1) k})}^{2}, \end{matrix}

where $k = 1, \dots, 100$ is the subsampling frequency parameter (every k-th observation is used, so a larger k means a lower subsampling frequency), and provide a plot of the realised volatilities of the 2nd axis of the data S02E.mat in Figure 2.
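The definition of $R V (k)$ can be sketched in a few lines of Python. This is only an illustration of the formula above; the function name `realised_volatility` and the toy sequence are our own assumptions and are not taken from the paper.

```python
def realised_volatility(y, k):
    """RV(k) = sum over 1 <= i <= [n/k] of (Y_{ik} - Y_{(i-1)k})^2 for y = [Y_0, ..., Y_n]."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    n = len(y) - 1  # observations are indexed 0, ..., n
    return sum((y[i * k] - y[(i - 1) * k]) ** 2 for i in range(1, n // k + 1))


# Toy sequence (hypothetical data, for illustration only).
y = [0.0, 1.0, 3.0, 2.0, 5.0]
print(realised_volatility(y, 1))  # 1 + 4 + 1 + 9 = 15.0
print(realised_volatility(y, 2))  # (3 - 0)^2 + (5 - 3)^2 = 13.0
```

Evaluating the function for $k = 1, \dots, 100$ on one column of a data set would give the kind of curve described for Figure 2.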

The height of the graph on the y-axis corresponds to the value of the realised volatility $R V (k)$ with subsampling at every k-th observation, shown on the x-axis. We observe that increasing the subsampling frequency (decreasing k) decreases the realised volatilities, which cannot be explained by the existing major models of microstructure noise, e.g., see [21,22,23,24,25]. To explain this phenomenon, we consider an observed process smoother than the latent one, because the quadratic variation of a sufficiently smooth function is zero; ordinarily, by contrast, microstructure noise makes the observation rougher than the latent process. One way to deal with observation smoother than the latent state is convolutional observation. As a concrete example, we show a convolutionally observed diffusion process and the behaviour of its realised volatilities. Let us consider the following 1-dimensional stochastic differential equation defining an Ornstein–Uhlenbeck (OU) process:

\begin{matrix} d X_{t} = - 20 X_{t} d t + 10 d w_{t}, X_{- λ} = 0, \end{matrix}

where $λ = 10^{- 2 / 5}$ . We simulate the stochastic differential equation by the Euler–Maruyama method (see [26]) with parameters $n = 10^{7}$ , $h_{n} = 10^{- 5}$ , and $T_{n} = 10^{2}$ , and approximate its convolution by summation with the smoothing parameter $ρ = 10$ (for details, see Section 5). Figure 3 shows the latent diffusion process and the convoluted observation on $[0, 1]$ , and we can see that the observation is indeed smoothed compared to the latent state.

In Figure 4, we also plot the realised volatilities of the convolutionally observed process with subsampling, as in Figure 2.

The convolutional observation of a diffusion process also exhibits realised volatilities that decrease as the subsampling frequency increases, as can be seen in some biological data such as BNCI Horizon 2020 [20]. Of course, graphically comparing characteristics of simulation and real data is insufficient to verify the convolutional observation with smoothing parameter $ρ > 0$ in the 1-dimensional case; therefore, in Section 3 we propose a statistical estimation method for unknown $ρ$ and a hypothesis test with the null hypothesis $H_{0} : ρ = 0$ and the alternative $H_{1} : ρ \neq 0$ based on convolutional observation. Moreover, in Section 6, we examine the real EEG data plotted in Figure 2 with the proposed hypothesis test, and find that it is more appropriate to regard the data as a convolutional observation of a latent diffusion process with $ρ \neq 0$ than as direct observation of the latent process, which supports studying the convolutional observation scheme with unknown $ρ$ .

The paper is composed of the following sections: Section 2 gives the notations and assumptions used in this paper; Section 3 discusses the estimation and test for the smoothing parameter $ρ$ , which provides the tools to examine whether we should consider the convolutional observation scheme; Section 4 proposes the quasi-likelihood functions for the parameters $α$ and $β$ of the diffusion process, and the corresponding consistent estimators; Section 5 presents computational simulations examining the theoretical results of the previous sections; and Section 6 shows an application of the proposed methods to real data analysis.

2. Notations and Assumptions

2.1. Notations

First of all, we set $A (x, α) : = a {(x, α)}^{\otimes 2}$ , $a (x) : = a (x, α_{⋆})$ , $A (x) : = A (x, α_{⋆})$ and $b (x) : = b (x, β_{⋆})$ . We also give the notation for a matrix-valued function $G (x, α | ρ)$ such that $G^{(i, j)} (x, α | ρ) : = A^{(i, j)} (x, α) f_{G} (ρ^{(i)}, ρ^{(j)})$ , where

\begin{matrix} f_{G} (ρ^{(i)}, ρ^{(j)}) : = \{\begin{matrix} 1 & if ρ^{(i)} = ρ^{(j)} = 0, \\ 1 - \frac{ρ^{(j)}}{2} & if ρ^{(i)} = 0, ρ^{(j)} \in (0, 1], \\ \frac{1}{2 ρ^{(j)}} & if ρ^{(i)} = 0, ρ^{(j)} \in (1, \bar{ρ}], \\ 1 - \frac{ρ^{(i)}}{2} & if ρ^{(i)} \in (0, 1], ρ^{(j)} = 0, \\ \frac{1}{2 ρ^{(i)}} & if ρ^{(i)} \in (1, \bar{ρ}], ρ^{(j)} = 0, \\ \frac{- 3 {(ρ^{(i)})}^{2} ρ^{(j)} + 3 ρ^{(i)} {(ρ^{(j)})}^{2} + 6 ρ^{(i)} ρ^{(j)} - 2 {(ρ^{(j)})}^{3}}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(i)}, ρ^{(j)} \in (0, 1], ρ^{(i)} > ρ^{(j)}, \\ \frac{3 {(ρ^{(i)})}^{2} ρ^{(j)} - 3 ρ^{(i)} {(ρ^{(j)})}^{2} + 6 ρ^{(i)} ρ^{(j)} - 2 {(ρ^{(i)})}^{3}}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(i)}, ρ^{(j)} \in (0, 1], ρ^{(i)} \leq ρ^{(j)}, \\ \frac{3 {(ρ^{(j)})}^{2} + 3 ρ^{(j)} - {(ρ^{(j)})}^{3}}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(i)} \in (1, \bar{ρ}], ρ^{(j)} \in (0, 1], ρ^{(i)} > ρ^{(j)} + 1, \\ \frac{6 ρ^{(j)} - 1}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(i)}, ρ^{(j)} \in (1, \bar{ρ}], ρ^{(i)} > ρ^{(j)} + 1, \\ \frac{{(ρ^{(i)} - ρ^{(j)})}^{3} - 3 {(ρ^{(i)})}^{2} + 6 ρ^{(i)} ρ^{(j)} + 3 ρ^{(i)} - 1 - {(ρ^{(j)})}^{3}}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(i)} \in (1, \bar{ρ}], ρ^{(j)} \in (0, 1], ρ^{(i)} \leq ρ^{(j)} + 1, \\ \frac{{(ρ^{(i)} - ρ^{(j)})}^{3} - 3 {(ρ^{(i)})}^{2} + 6 ρ^{(i)} ρ^{(j)} + 3 ρ^{(i)} - 3 {(ρ^{(j)})}^{2} + 3 ρ^{(j)} - 2}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(i)}, ρ^{(j)} \in (1, \bar{ρ}], ρ^{(j)} < ρ^{(i)} \leq ρ^{(j)} + 1, \\ \frac{- {(ρ^{(i)} - ρ^{(j)})}^{3} + 6 ρ^{(i)} ρ^{(j)} - {(ρ^{(i)})}^{3} - 3 {(ρ^{(j)})}^{2} + 3 ρ^{(j)} - 1}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(i)} \in (0, 1], ρ^{(j)} \in (1, \bar{ρ}], ρ^{(j)} \leq ρ^{(i)} + 1, \\ \frac{- {(ρ^{(i)} - ρ^{(j)})}^{3} - 3 {(ρ^{(i)})}^{2} - 3 {(ρ^{(j)})}^{2} + 6 ρ^{(i)} ρ^{(j)} + 3 ρ^{(i)} + 3 ρ^{(j)} - 2}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(i)}, ρ^{(j)} \in (1, \bar{ρ}], ρ^{(i)} \leq ρ^{(j)} \leq ρ^{(i)} + 1, \\ \frac{3 {(ρ^{(i)})}^{2} + 3 ρ^{(i)} - {(ρ^{(i)})}^{3}}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(i)} \in (0, 1], ρ^{(j)} \in (1, \bar{ρ}], ρ^{(j)} > ρ^{(i)} + 1, \\ \frac{6 ρ^{(i)} - 1}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(i)}, ρ^{(j)} \in (1, \bar{ρ}], ρ^{(j)} > ρ^{(i)} + 1 . \end{matrix} \end{matrix}

The continuity of the function $f_{G}$ is shown in the supplementary material. For the detailed discussion of the necessity of $f_{G}$ , see Remark 4.

In addition, we also give the notation used throughout this paper.

For every matrix A, $A^{T}$ is the transpose of A, and $A^{\otimes 2} : = A A^{T}$ .
For every set of matrices A and B of the same size, $A [B] : = tr (A B^{T})$ . Moreover, for any $m \in N$ , $A \in R^{m} \otimes R^{m}$ and $u, v \in R^{m}$ , $A [u, v] : = v^{T} A u$ .
$v^{(ℓ)}$ and $A^{(ℓ_{1}, ℓ_{2})}$ denote the ℓ-th element of a vector v and the $(ℓ_{1}, ℓ_{2})$ -th one of a matrix A, respectively.
For any vector v and any matrix A, $|v| : = \sqrt{tr (v^{T} v)}$ and $∥A∥ : = \sqrt{tr (A^{T} A)}$ .
$(Ω, P, F, F_{t})$ denotes the stochastic basis, where $F_{t} : = σ (x_{- λ}, w_{s} : s \leq t)$ .
$μ (f (\cdot))$ denotes the integral of a $μ$ -integrable function f where $μ$ is a measure.

2.2. Assumptions

With respect to $X_{t}$ , we assume the following conditions.

[A1]

(i): For a constant C, for all $x_{1}, x_{2} \in R^{d}$ ,

$\begin{matrix} sup_{α \in Θ_{1}} ∥a (x_{1}, α) - a (x_{2}, α)∥ + sup_{β \in Θ_{2}} |b (x_{1}, β) - b (x_{2}, β)| \leq C |x_{1} - x_{2}| . \end{matrix}$
(ii): For all $p \geq 0$ , ${sup}_{t \geq - λ} E_{θ_{⋆}} [{|X_{t}|}^{p}] < \infty$ .
(iii): There exists a unique invariant measure $ν_{0}$ on $(R^{d}, B (R^{d}))$ and for all $p \geq 1$ and $f \in L^{p} (ν_{0})$ with polynomial growth,

$\begin{matrix} \frac{1}{T} \int_{- λ}^{T} f (X_{t}) d t \to^{P} \int_{R^{d}} f (x) ν_{0} (d x) . \end{matrix}$

Remark 1.

For sufficient conditions for these regularity conditions, see [A1] and Remark 1 in Uchida and Yoshida [7].

[A2]

There exists $C > 0$ such that $a : R^{d} \times Θ_{1} \to R^{d} \otimes R^{r}$ and $b : R^{d} \times Θ_{2} \to R^{d}$ have continuous derivatives satisfying

\begin{matrix} sup_{α \in Θ_{1}} |\partial_{x}^{j} \partial_{α}^{i} a (x, α)| & \leq C {(1 + |x|)}^{C}, 0 \leq i \leq 2, 0 \leq j \leq 2, \\ sup_{β \in Θ_{2}} |\partial_{x}^{j} \partial_{β}^{i} b (x, β)| & \leq C {(1 + |x|)}^{C}, 0 \leq i \leq 2, 0 \leq j \leq 2 . \end{matrix}

With the invariant measure $ν_{0}$ , $ξ : = {[ρ^{T}, θ^{T}]}^{T}$ , and $ξ_{⋆}$ denoting the true value of $ξ$ , we define

\begin{matrix} V_{1} (α | ξ_{⋆}) & : = - \int_{R^{d}} {∥G (x, α | ρ_{⋆}) - G (x, α_{⋆} | ρ_{⋆})∥}^{2} ν_{0} (d x), \\ V_{2} (β | ξ_{⋆}) & : = - \int_{R^{d}} {|b (x, β) - b (x, β_{⋆})|}^{2} ν_{0} (d x) . \end{matrix}

For these functions, let us assume the following identifiability conditions hold.

[A3]

There exist $χ_{1} (α_{⋆}) > 0$ and $χ_{1}^{'} (β_{⋆}) > 0$ such that for all $α \in Θ_{1}$ and $β \in Θ_{2}$ , $V_{1} (α | ξ_{⋆}) \leq - χ_{1} (α_{⋆}) {|α - α_{⋆}|}^{2}$ and $V_{2} (β | ξ_{⋆}) \leq - χ_{1}^{'} (β_{⋆}) {|β - β_{⋆}|}^{2}$ .

3. Estimation and Test of the Smoothing Parameter

In this section, we discuss the case where the smoothing parameter $ρ$ of the kernel function $V_{ρ, h_{n}}$ is unknown. Estimating $ρ$ matters for the estimation of $α$ and $β$ , since we plug the estimate of $ρ$ into the quasi-likelihood functions of $α$ and $β$ . The test problem for the hypotheses $H_{0} : ρ = 0$ and $H_{1} : ρ \neq 0$ is also important for examining whether our framework of convolutional observation is meaningful.

3.1. Estimation of the Smoothing Parameter

For simplicity of notation, let us consider the case $\bar{ρ} > 2$ ; otherwise the discussion is quite parallel. We should note that for all $i = 1, \dots, d$ ,

\begin{matrix} G^{(i, i)} (x | ρ) = \{\begin{matrix} A^{(i, i)} (x) & if ρ^{(i)} = 0, \\ A^{(i, i)} (x) (1 - \frac{ρ^{(i)}}{3}) & if ρ^{(i)} \in (0, 1], \\ A^{(i, i)} (x) (\frac{1}{ρ^{(i)}} - \frac{1}{3 {(ρ^{(i)})}^{2}}) & if ρ^{(i)} \in (1, \bar{ρ}] . \end{matrix} \end{matrix}
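As a quick consistency check (our own worked substitution, not part of the original text), the diagonal form above follows from setting $ρ^{(i)} = ρ^{(j)} = ρ$ in the definition of $f_{G}$ in Section 2.1, using the branch $ρ^{(i)} \leq ρ^{(j)}$ for $ρ \in (0, 1]$ and the branch $ρ^{(i)} \leq ρ^{(j)} \leq ρ^{(i)} + 1$ for $ρ \in (1, \bar{ρ}]$ :

```latex
\begin{aligned}
\rho \in (0, 1]: \quad
f_{G}(\rho, \rho)
  &= \frac{3\rho^{3} - 3\rho^{3} + 6\rho^{2} - 2\rho^{3}}{6\rho^{2}}
   = 1 - \frac{\rho}{3}, \\
\rho \in (1, \bar{\rho}]: \quad
f_{G}(\rho, \rho)
  &= \frac{-0 - 3\rho^{2} - 3\rho^{2} + 6\rho^{2} + 3\rho + 3\rho - 2}{6\rho^{2}}
   = \frac{6\rho - 2}{6\rho^{2}}
   = \frac{1}{\rho} - \frac{1}{3\rho^{2}} .
\end{aligned}
```

Both expressions take the value $2 / 3$ at $ρ = 1$ , in line with the continuity of $f_{G}$ mentioned in Section 2.1.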

Let us consider the estimation of $ρ^{(i)}$ using the following statistics: the full quadratic variation

\begin{matrix} \frac{1}{n h_{n}} \sum_{k = 1}^{n} {({\bar{X}}_{k h_{n}, n}^{(i)} - {\bar{X}}_{(k - 1) h_{n}, n}^{(i)})}^{2} \to^{P} \{\begin{matrix} ν_{0} (A^{(i, i)} (\cdot)) & if ρ_{⋆}^{(i)} = 0, \\ ν_{0} (A^{(i, i)} (\cdot)) (1 - \frac{ρ_{⋆}^{(i)}}{3}) & if ρ_{⋆}^{(i)} \in (0, 1], \\ ν_{0} (A^{(i, i)} (\cdot)) (\frac{1}{ρ_{⋆}^{(i)}} - \frac{1}{3 {(ρ_{⋆}^{(i)})}^{2}}) & if ρ_{⋆}^{(i)} \in (1, \bar{ρ}], \end{matrix} \end{matrix}

because of Proposition 3 in supplementary material Appendix A, and the reduced quadratic variation defined as

\frac{1}{n h_{n}} \sum_{2 \leq 2 k \leq n} {({\bar{X}}_{2 k h_{n}, n}^{(i)} - {\bar{X}}_{(2 k - 2) h_{n}, n}^{(i)})}^{2}

converges in probability as follows.

Lemma 1.

Under [A1], we have the convergence in probability such that

\begin{matrix} \frac{1}{n h_{n}} \sum_{2 \leq 2 k \leq n} {({\bar{X}}_{2 k h_{n}, n}^{(i)} - {\bar{X}}_{(2 k - 2) h_{n}, n}^{(i)})}^{2} \\ \to^{P} \{\begin{matrix} ν_{0} (A^{(i, i)} (\cdot)) & if ρ_{⋆}^{(i)} = 0, \\ ν_{0} (A^{(i, i)} (\cdot)) (1 - \frac{ρ_{⋆}^{(i)}}{6}) & if ρ_{⋆}^{(i)} \in (0, 2], \\ ν_{0} (A^{(i, i)} (\cdot)) (\frac{2}{ρ_{⋆}^{(i)}} - \frac{4}{3 {(ρ_{⋆}^{(i)})}^{2}}) & if ρ_{⋆}^{(i)} \in (2, \bar{ρ}] . \end{matrix} \end{matrix}

Then we define the ratio of those statistics such that

\begin{matrix} R_{n}^{(i)} & : = (\frac{1}{n h_{n}} \sum_{k = 1}^{n} {({\bar{X}}_{k h_{n}, n}^{(i)} - {\bar{X}}_{(k - 1) h_{n}, n}^{(i)})}^{2}) {(\frac{1}{n h_{n}} \sum_{2 \leq 2 k \leq n} {({\bar{X}}_{2 k h_{n}, n}^{(i)} - {\bar{X}}_{(2 k - 2) h_{n}, n}^{(i)})}^{2})}^{- 1} \\ \to^{P} \{\begin{matrix} 1 & if ρ_{⋆}^{(i)} = 0, \\ (1 - \frac{ρ_{⋆}^{(i)}}{3}) {(1 - \frac{ρ_{⋆}^{(i)}}{6})}^{- 1} & if ρ_{⋆}^{(i)} \in (0, 1], \\ (\frac{1}{ρ_{⋆}^{(i)}} - \frac{1}{3 {(ρ_{⋆}^{(i)})}^{2}}) {(1 - \frac{ρ_{⋆}^{(i)}}{6})}^{- 1} & if ρ_{⋆}^{(i)} \in (1, 2], \\ (\frac{1}{ρ_{⋆}^{(i)}} - \frac{1}{3 {(ρ_{⋆}^{(i)})}^{2}}) {(\frac{2}{ρ_{⋆}^{(i)}} - \frac{4}{3 {(ρ_{⋆}^{(i)})}^{2}})}^{- 1} & if ρ_{⋆}^{(i)} \in (2, \bar{ρ}] \end{matrix} \\ = \{\begin{matrix} 1 & if ρ_{⋆}^{(i)} = 0, \\ (6 - 2 ρ_{⋆}^{(i)}) {(6 - ρ_{⋆}^{(i)})}^{- 1} & if ρ_{⋆}^{(i)} \in (0, 1], \\ (6 ρ_{⋆}^{(i)} - 2) {(6 {(ρ_{⋆}^{(i)})}^{2} - {(ρ_{⋆}^{(i)})}^{3})}^{- 1} & if ρ_{⋆}^{(i)} \in (1, 2], \\ (3 ρ_{⋆}^{(i)} - 1) {(6 ρ_{⋆}^{(i)} - 4)}^{- 1} & if ρ_{⋆}^{(i)} \in (2, \bar{ρ}], \end{matrix} \\ = : R (ρ_{⋆}^{(i)}), \end{matrix}

where R has the next property.

Lemma 2.

R is a $[(3 \bar{ρ} - 1) {(6 \bar{ρ} - 4)}^{- 1}, 1]$ -valued monotonically decreasing continuous function, and has a continuous inverse $R^{- 1} : [(3 \bar{ρ} - 1) {(6 \bar{ρ} - 4)}^{- 1}, 1] \to [0, \bar{ρ}]$ .

We define

{\hat{ρ}}_{n}

such that

\begin{matrix} {\hat{ρ}}_{n}^{(i)} & : = \{\begin{matrix} 0 & if R_{n}^{(i)} > 1, \\ R^{- 1} (R_{n}^{(i)}) & if R_{n}^{(i)} \in [(3 \bar{ρ} - 1) {(6 \bar{ρ} - 4)}^{- 1}, 1], \\ \bar{ρ} & if R_{n}^{(i)} < (3 \bar{ρ} - 1) {(6 \bar{ρ} - 4)}^{- 1}, \end{matrix} \end{matrix}

and then continuous mapping theorem for convergence in probability verifies the next result.

Theorem 1.

Under [A1], ${\hat{ρ}}_{n}$ is consistent, i.e., ${\hat{ρ}}_{n} \to^{P} ρ_{⋆}$ .

Remark 2.

We can compute $y = R^{- 1} (x), x \in [(3 \bar{ρ} - 1) {(6 \bar{ρ} - 4)}^{- 1}, 1]$ by solving the following equations:

\begin{matrix} (i) & x = (6 - 2 y) {(6 - y)}^{- 1} & if x \in (4 / 5, 1], \\ (i i) & x = (6 y - 2) {(6 y^{2} - y^{3})}^{- 1} & if x \in (5 / 8, 4 / 5], \\ (i i i) & x = (3 y - 1) {(6 y - 4)}^{- 1} & if x \in [(3 \bar{ρ} - 1) {(6 \bar{ρ} - 4)}^{- 1}, 5 / 8] . \end{matrix}
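The inversion in Remark 2 and the estimator ${\hat{ρ}}_{n}^{(i)}$ can be sketched as follows. This is a minimal illustration under our own assumptions, not the authors' implementation: the function names are hypothetical, branches (i) and (iii) are solved in closed form by rearranging the equations, branch (ii) is solved by bisection (any root finder would do), and the common factor $1 / (n h_{n})$ is dropped because it cancels in the ratio $R_{n}^{(i)}$ .

```python
def R(rho):
    """Limit of (full quadratic variation) / (reduced quadratic variation)."""
    if rho == 0:
        return 1.0
    if rho <= 1:
        return (6 - 2 * rho) / (6 - rho)
    if rho <= 2:
        return (6 * rho - 2) / (6 * rho ** 2 - rho ** 3)
    return (3 * rho - 1) / (6 * rho - 4)


def R_inverse(x):
    """Invert R for x in (1/2, 1]; assumes rho_bar > 2 as in Section 3.1."""
    if x > 4 / 5:                      # branch (i): x (6 - y) = 6 - 2 y
        return 6 * (1 - x) / (2 - x)
    if x > 5 / 8:                      # branch (ii): bisection on [1, 2]
        lo, hi = 1.0, 2.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if R(mid) > x:             # R is decreasing, so the root is to the right
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    return (4 * x - 1) / (6 * x - 3)   # branch (iii): x (6 y - 4) = 3 y - 1


def estimate_rho(obs, rho_bar):
    """Estimate rho for one axis from obs = [Xbar_0, ..., Xbar_n] (non-constant data)."""
    n = len(obs) - 1
    full = sum((obs[k] - obs[k - 1]) ** 2 for k in range(1, n + 1))
    reduced = sum((obs[2 * k] - obs[2 * k - 2]) ** 2 for k in range(1, n // 2 + 1))
    ratio = full / reduced
    if ratio > 1:
        return 0.0
    if ratio < R(rho_bar):
        return float(rho_bar)
    return R_inverse(ratio)
```

For example, `R_inverse(R(1.5))` recovers 1.5 up to the bisection tolerance, and a sequence whose ratio exceeds 1 is mapped to the estimate 0.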

3.2. Test for Smoothed Observation

For all $i = 1, \dots, d$ , we consider the following hypothesis test:

\begin{matrix} H_{0} : ρ^{(i)} = 0, H_{1} : ρ^{(i)} > 0 . \end{matrix}

Let us consider the following test statistic:

\begin{matrix} T_{i, n} & : = \sqrt{\frac{n}{\frac{2}{3 n h_{n}^{2}} \sum_{k = 1}^{n} {({\bar{X}}_{k h_{n}, n}^{(i)} - {\bar{X}}_{(k - 1) h_{n}, n}^{(i)})}^{4}}} \\ \times (\frac{1}{n h_{n}} \sum_{k = 1}^{n} {({\bar{X}}_{k h_{n}, n}^{(i)} - {\bar{X}}_{(k - 1) h_{n}, n}^{(i)})}^{2} - \frac{1}{n h_{n}} \sum_{2 \leq 2 k \leq n} {({\bar{X}}_{2 k h_{n}, n}^{(i)} - {\bar{X}}_{(2 k - 2) h_{n}, n}^{(i)})}^{2}) \\ = \sqrt{\frac{3 / 2}{\sum_{k = 1}^{n} {({\bar{X}}_{k h_{n}, n}^{(i)} - {\bar{X}}_{(k - 1) h_{n}, n}^{(i)})}^{4}}} \\ \times (\sum_{k = 1}^{n} {({\bar{X}}_{k h_{n}, n}^{(i)} - {\bar{X}}_{(k - 1) h_{n}, n}^{(i)})}^{2} - \sum_{2 \leq 2 k \leq n} {({\bar{X}}_{2 k h_{n}, n}^{(i)} - {\bar{X}}_{(2 k - 2) h_{n}, n}^{(i)})}^{2}), \end{matrix}

and we abbreviate $T_{i, n}$ to $T_{n}$ if $d = 1$ . Under $H_{0}$ , we have the following result.

Theorem 2.

Under $H_{0}$ and [A1], we have the convergence in law such that

\begin{matrix} T_{i, n} \to^{L} N (0, 1) . \end{matrix}

We also obtain the result to support the consistency of the test.

Theorem 3.

Under $H_{1}$ and [A1], we have convergence such that for any $c \in R$ ,

\begin{matrix} P (T_{i, n} < c) \to 1 . \end{matrix}

Remark 3.

These results are intuitive: if $H_{0}$ holds, the quadratic variation and the reduced one, with appropriate scaling, should converge to the same value, whereas if $H_{1}$ holds, the scaled quadratic variation should converge to a value smaller than the limit of the scaled reduced one.
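The behaviour under $H_{1}$ can be made explicit from the limits in Section 3.1. The display below is our own subtraction of the limit in Lemma 1 from the limit of the full quadratic variation, written for a single axis with $ρ = ρ_{⋆}^{(i)}$ and $ν_{0} = ν_{0} (A^{(i, i)} (\cdot))$ as shorthand:

```latex
\begin{aligned}
\rho \in (0, 1]: &\quad
  \nu_{0}\Bigl[\bigl(1 - \tfrac{\rho}{3}\bigr) - \bigl(1 - \tfrac{\rho}{6}\bigr)\Bigr]
  = -\frac{\rho}{6}\,\nu_{0}, \\
\rho \in (1, 2]: &\quad
  \nu_{0}\Bigl[\bigl(\tfrac{1}{\rho} - \tfrac{1}{3\rho^{2}}\bigr) - \bigl(1 - \tfrac{\rho}{6}\bigr)\Bigr]
  = \frac{\rho^{3} - 6\rho^{2} + 6\rho - 2}{6\rho^{2}}\,\nu_{0}, \\
\rho \in (2, \bar{\rho}]: &\quad
  \nu_{0}\Bigl[\bigl(\tfrac{1}{\rho} - \tfrac{1}{3\rho^{2}}\bigr) - \bigl(\tfrac{2}{\rho} - \tfrac{4}{3\rho^{2}}\bigr)\Bigr]
  = -\frac{\rho - 1}{\rho^{2}}\,\nu_{0} .
\end{aligned}
```

Assuming $ν_{0} (A^{(i, i)} (\cdot)) > 0$ , all three differences are negative: the cubic $ρ^{3} - 6 ρ^{2} + 6 ρ - 2$ equals $- 1$ at $ρ = 1$ and is decreasing on $(1, 2]$ , since its derivative $3 ρ^{2} - 12 ρ + 6$ is negative there. This is why the test rejects for large negative values of $T_{i, n}$ .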

Hence, when we set the significance level $α_{sig} \in (0, 1)$ , we have the rejection region

\begin{matrix} T_{i, n} < Φ^{- 1} (α_{sig}) \end{matrix}

where $Φ$ is the distribution function of the standard Gaussian distribution. Theorem 3 supports the consistency of the test.

This test is essential for examining whether the scheme of convolutional observation is warranted: if $ρ = 0$ , the ordinary observation scheme can be applied, but if $ρ \neq 0$ , we have a motivation to consider the convolutional observation scheme.

4. Least Square Estimation of the Diffusion and Drift Parameters

Let us set the least-square quasi-likelihood functions such that

\begin{matrix} H_{1, n} (α | ρ) & : = - \sum_{k = 1}^{n} {∥\frac{1}{h_{n}} {({\bar{X}}_{k h_{n}, n} - {\bar{X}}_{(k - 1) h_{n}, n})}^{\otimes 2} - G ({\bar{X}}_{(k - 1) h_{n}, n}, α | ρ)∥}^{2}, \\ H_{2, n} (β | ρ) & : = - \sum_{k = [{max}_{i} ρ^{(i)}] + 2}^{n} \frac{1}{h_{n}} {|{\bar{X}}_{k h_{n}, n} - {\bar{X}}_{(k - 1) h_{n}, n} - h_{n} b ({\bar{X}}_{(k - 2 - [{max}_{i} ρ^{(i)}]) h_{n}, n}, β)|}^{2}, \end{matrix}

and the least-square estimators ${\hat{α}}_{n}$ and ${\hat{β}}_{n}$ satisfying

\begin{matrix} H_{1, n} ({\hat{α}}_{n} | ρ_{⋆}) = sup_{α \in Θ_{1}} H_{1, n} (α | ρ_{⋆}), H_{2, n} ({\hat{β}}_{n} | ρ_{⋆}) = sup_{β \in Θ_{2}} H_{2, n} (β | ρ_{⋆}) \end{matrix}

when $ρ_{⋆}$ is known, and

\begin{matrix} H_{1, n} ({\hat{α}}_{n} | {\hat{ρ}}_{n}) = sup_{α \in Θ_{1}} H_{1, n} (α | {\hat{ρ}}_{n}), H_{2, n} ({\hat{β}}_{n} | {\hat{ρ}}_{n}) = sup_{β \in Θ_{2}} H_{2, n} (β | {\hat{ρ}}_{n}) \end{matrix}

when $ρ_{⋆}$ is unknown.
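For intuition, the two quasi-likelihoods can be maximised in closed form in a special case. The sketch below is our own illustration, not the authors' implementation, and assumes a 1-dimensional OU model $d X_{t} = (β^{(1)} X_{t} + β^{(2)}) d t + α d w_{t}$ with $α > 0$ . Then $G = α^{2} f_{G} (ρ, ρ)$ does not depend on x, so $H_{1, n}$ is maximised at $α^{2} = (n h_{n})^{- 1} \sum_{k} (Δ_{k})^{2} / f_{G} (ρ, ρ)$ , and $H_{2, n}$ is an ordinary linear least-squares problem with the lagged regressor ${\bar{X}}_{(k - 2 - [ρ]) h_{n}, n}$ . The function names are hypothetical and the compact parameter spaces $Θ_{1}$ , $Θ_{2}$ are ignored.

```python
import math


def f_g_diag(rho):
    """Diagonal value f_G(rho, rho), as given at the start of Section 3.1."""
    if rho == 0:
        return 1.0
    if rho <= 1:
        return 1.0 - rho / 3.0
    return 1.0 / rho - 1.0 / (3.0 * rho ** 2)


def fit_ou_least_squares(obs, h, rho):
    """Closed-form maximisers of H_{1,n} and H_{2,n} for the 1-dimensional OU model."""
    n = len(obs) - 1
    diffs = [obs[k] - obs[k - 1] for k in range(1, n + 1)]  # diffs[k - 1] = Delta_k
    # H_{1,n}: match the mean of Delta_k^2 / h with alpha^2 * f_G(rho, rho).
    alpha_hat = math.sqrt(sum(d * d for d in diffs) / (n * h) / f_g_diag(rho))
    # H_{2,n}: regress Delta_k / h on Xbar at index k - 2 - [rho], for k = [rho] + 2, ..., n.
    lag = int(rho) + 2
    u = [obs[k - lag] for k in range(lag, n + 1)]
    z = [diffs[k - 1] / h for k in range(lag, n + 1)]
    mean_u = sum(u) / len(u)
    mean_z = sum(z) / len(z)
    s_uz = sum((a - mean_u) * (b - mean_z) for a, b in zip(u, z))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    beta1_hat = s_uz / s_uu
    beta2_hat = mean_z - beta1_hat * mean_u
    return alpha_hat, (beta1_hat, beta2_hat)
```

In practice `rho` would be either the known value or the estimate ${\hat{ρ}}_{n}$ from Section 3.1; for general models with state-dependent coefficients a numerical optimiser over $Θ_{1}$ and $Θ_{2}$ would be needed instead of these closed forms.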

Theorem 4.

Under [A1]–[A3], ${\hat{α}}_{n}$ and ${\hat{β}}_{n}$ are consistent, i.e., ${\hat{α}}_{n} \to^{P} α_{⋆}$ and ${\hat{β}}_{n} \to^{P} β_{⋆}$ .

Remark 4.

The functions $G$ and $f_{G}$ are admittedly complicated, so one could consider alternative estimation methods based on subsampling or pre-averaging. However, those methods raise the question of what subsampling or pre-averaging size is appropriate, and the estimation result can depend on how that size is tuned. Therefore, our work proposes an estimation method that uses the observations without subsampling or pre-averaging.

5. Simulations

In this simulation section, we only consider the case where $ρ$ is unknown and is estimated from data with the method proposed in Section 3.

5.1. 1-Dimensional Simulation

We examine the following 1-dimensional stochastic differential equation whose solution is a 1-dimensional Ornstein–Uhlenbeck (OU) process:

\begin{matrix} d X_{t} = (β^{(1)} X_{t} + β^{(2)}) d t + α d w_{t}, X_{- λ} = 0, \end{matrix}

where $α \in Θ_{1} : = [0.01, 10]$ , $β \in Θ_{2} : = [- 10, - 0.01] \times [- 10, 10]$ , and $λ = 10^{- 7 / 3}$ . The procedure of the simulation is as follows. First, we generate an approximated OU process by the Euler–Maruyama scheme (for example, see [26]) with simulation parameters $n_{sim} = 10^{5 + m}$ , $h_{sim} = 10^{- 10 / 3 - m}$ , $T_{sim} = 10^{5 / 3}$ , where $m \in N$ is a parameter determining the precision of approximation. Secondly, we approximate the convolution by summation such that

\begin{matrix} {\bar{X}}_{i h_{n}, n} ≊ \{\begin{matrix} \frac{1}{[10^{m} ρ]} \sum_{k = 0}^{[10^{m} ρ] - 1} X_{i h_{n} - k h_{sim}} & if [10^{m} ρ] \geq 1, \\ X_{i h_{n}} & if [10^{m} ρ] < 1, \end{matrix} \end{matrix}

where $i = 0, \dots, n$ , the sampling step is $h_{n} = 10^{- 10 / 3}$ and $n = 10^{5}$ , so that $n h_{n} = T_{sim}$ . In this Section 5.1, we fix the true values of $α$ and $β$ as $α_{⋆} = 3$ and $β_{⋆} = {[- 2, 1]}^{T}$ , and vary the true value of $ρ \in Θ_{ρ} : = [0, 100]$ to see the corresponding changes in the performance of the estimation of $ξ$ and of the test for $ρ$ , in comparison with an existing method called local Gaussian approximation (LGA) for parametric estimation of discretely observed diffusion processes (e.g., see [4]), which does not take convolutional observation into account. The number of iterations for each value of $ρ$ is 1000.
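The two-step procedure (Euler–Maruyama on a fine grid, then averaging over the most recent fine-grid values) can be sketched as follows. This is a scaled-down illustration under our own assumptions, not the authors' simulation code: the function name and its arguments are hypothetical, the pre-history before time 0 is just long enough for the first averaging window, and the sample sizes in the usage note are far smaller than those of the paper.

```python
import math
import random


def simulate_convolved_ou(alpha, beta1, beta2, rho, n, h, m, seed=0):
    """Simulate dX = (beta1 X + beta2) dt + alpha dw and return block-averaged observations.

    The fine step is h_sim = h / 10**m. The observation at time i*h is the mean of the
    [10**m * rho] most recent fine-grid values, or the fine-grid value itself if that
    count is zero (direct observation).
    """
    rng = random.Random(seed)
    sub = 10 ** m                      # fine steps per observation step
    h_sim = h / sub
    width = int(sub * rho)             # [10^m rho], number of averaged fine-grid points
    burn = width                       # fine steps simulated before time 0
    x = 0.0
    path = [x]
    for _ in range(burn + n * sub):
        x += (beta1 * x + beta2) * h_sim + alpha * math.sqrt(h_sim) * rng.gauss(0.0, 1.0)
        path.append(x)
    obs = []
    for i in range(n + 1):
        j = burn + i * sub             # fine-grid index of time i*h
        if width >= 1:
            obs.append(sum(path[j - k] for k in range(width)) / width)
        else:
            obs.append(path[j])
    return obs
```

For instance, `simulate_convolved_ou(3.0, -2.0, 1.0, rho=1.0, n=2000, h=1e-3, m=1)` returns 2001 smoothed observations whose sum of squared increments is markedly smaller than that of a directly observed path (`rho=0.0`) generated with the same settings.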

First, we examine the estimation and test with small values of $ρ_{⋆}$ , namely $ρ_{⋆} = 0, 0.1, 0.2, \dots, 1$ , to observe how the performance of the statistics changes with $ρ$ . Table 1 summarises the simulation results for ${\hat{ρ}}_{n}$ for each $ρ$ , with the respective empirical means and root mean square errors (RMSE).

We can see that the proposed estimator ${\hat{ρ}}_{n}$ works well for small $ρ$ . With respect to the performance of the test statistic $T_{n}$ proposed in Section 3.2, Table 2 shows the empirical proportion of iterations in which $T_{n}$ is lower than some typical critical values, where $Φ$ denotes the distribution function of the 1-dimensional standard Gaussian distribution, as well as the maximum value of $T_{n}$ over the 1000 iterations.

Even for $ρ = 0.1$ , the simulation result supports the theoretical consistency of the test. Because $Φ^{- 1} (10^{- 16}) = - 8.222$ , all the iterations with $ρ \geq 0.3$ result in rejection of $H_{0}$ at a significance level as small as $10^{- 16}$ . Table 3 compares the estimation of $α$ and $β$ by our proposed method with that by the LGA.

Note that the biases of the estimation by LGA increase as the true value of $ρ$ gets larger, while the estimation by our proposed method is not influenced by the true value of $ρ$ . This simulation result supports the theoretical discussion in Section 4 on the consistency of ${\hat{θ}}_{n}$ , and the necessity of considering the convolutional observation scheme in situations where the LGA method does not work properly.

Secondly, we consider the estimation and test with large $ρ_{⋆}$ , namely $ρ_{⋆} = 10, 15, 20$ , to see whether our proposed methods work even for large $ρ$ . We note that the maximum values of $T_{n}$ for $ρ = 10, 15, 20$ over 1000 iterations are $- 55.091$ , $- 68.462$ and $- 79.105$ , and hence we can detect the smoothed observation easily. Table 4 shows the empirical means and RMSEs of ${\hat{ρ}}_{n}$ for $ρ = 10, 15, 20$ ; the RMSEs increase as $ρ$ increases, which indicates the difficulty of estimating $ρ$ accurately when $ρ_{⋆}$ is large.

Table 5 summarises the estimation of $θ$ by means and RMSEs, and shows that although the large RMSE of ${\hat{ρ}}_{n}$ increases the RMSE of ${\hat{α}}_{n}$ , estimation by our method is substantially better than that by LGA.

5.2. 2-Dimensional Simulation

We consider the following 2-dimensional stochastic differential equation whose solution is a 2-dimensional OU process:

\begin{matrix} d [\begin{matrix} X_{t}^{(1)} \\ X_{t}^{(2)} \end{matrix}] & = ([\begin{matrix} β^{(1)} & β^{(2)} \\ β^{(4)} & β^{(5)} \end{matrix}] [\begin{matrix} X_{t}^{(1)} \\ X_{t}^{(2)} \end{matrix}] + [\begin{matrix} β^{(3)} \\ β^{(6)} \end{matrix}]) d t + [\begin{matrix} α^{(1)} & α^{(2)} \\ α^{(2)} & α^{(3)} \end{matrix}] d w_{t}, X_{- λ} = [\begin{matrix} 0 \\ 0 \end{matrix}], \end{matrix}

where $λ = 10^{- 7 / 3}$ . The simulation is conducted with the following settings. First, we generate the OU process by the Euler–Maruyama scheme with the simulation sample size $n_{sim} = 10^{5 + m}$ , $T_{sim} = 10^{5 / 3}$ and discretisation step $h_{sim} = 10^{- 10 / 3 - m}$ , where $m = 2$ is the precision parameter for approximation of convolution. Secondly, we approximate the convoluted process with summation such that

\begin{matrix} {\bar{X}}_{i h_{n}, n}^{(j)} ≊ \{\begin{matrix} \frac{1}{[10^{m} ρ^{(j)}]} \sum_{k = 0}^{[10^{m} ρ^{(j)}] - 1} X_{i h_{n} - k h_{sim}}^{(j)} & if [10^{m} ρ^{(j)}] \geq 1, \\ X_{i h_{n}}^{(j)} & if [10^{m} ρ^{(j)}] < 1, \end{matrix} \end{matrix}

where $i = 0, \dots, n$ and $j = 1, 2$ . The sampling scheme for inference is defined as $n = 10^{5}$ and $h_{n} = 10^{- 10 / 3}$ ; the true values of $ρ$ , $α$ and $β$ are set as $ρ_{⋆} = {[2, 4]}^{T}$ , $α_{⋆} = {[2, 0, 3]}^{T}$ , $β_{⋆} = {[- 2, - 0.4, 0, 0.1, - 3, 5]}^{T}$ ; the parameter spaces are defined as $Θ_{ρ} = {[0, 10]}^{2}$ , $Θ_{1} = [1 + 10^{- 8}, 10] \times [- 1 + 10^{- 8}, 1 - 10^{- 8}] \times [1 + 10^{- 8}, 10]$ , and $Θ_{2} = {[- 10, 10]}^{6}$ ; the total number of iterations is 1000.

Table 6 summarises the estimation of $ρ$ with the method proposed in Section 3 (the inverse of R is obtained numerically), with empirical means and empirical RMSEs of ${\hat{ρ}}_{n}$ over 1000 iterations. In this result, ${\hat{ρ}}_{n}$ estimates the true value of $ρ$ with sufficient precision, which is important for estimating the other parameters $α$ and $β$ .

We also note that the maximum values of the test statistics for smoothed observation proposed in Section 3.2 over 1000 iterations are $- 17.947$ and $- 33.159$ for the respective axes. The corresponding p-values are smaller than $10^{- 16}$ ; therefore, this result indicates that the proposed test statistic can detect smoothed observation in the case $ρ^{(i)} = 2.0$ when $d = 2$ .

With respect to the estimation of $α$ and $β$ , we compare the estimates by our proposed method with those by LGA, which does not take convolutional observation into account. Table 7 summarises the estimates of $α$ by both methods:

We can see that the estimation of $α$ by our proposed method is more precise than that by LGA. This result supports the validity of our estimation method when diffusion processes are observed convolutionally. Regarding $β$ , the simulation result is summarised in Table 8:

Though the estimation for

β^{(3)}

by our method has the smaller bias in comparison to that by LGA, the RMSE of our method is larger than that of LGA; in the estimation for other parameters, our proposal method outperforms the method by LGA. We can conclude that our proposal for estimation of

α

and

β

, which takes convolutional observation into account, performs better than the method that ignores this observation scheme.

6. Real Data Analysis

In this section, we analyse the EEG dataset named S02E.mat provided in “2. Two class motor imagery (002-2014)” of BNCI Horizon 2020 [20]. The datasets including S02E.mat are also studied by Steyrl et al. [27].

6.1. Estimation and Test for the Smoothing Parameters

First, we select the first 15 axes of the dataset and compute

{\hat{ρ}}_{n}

and

T_{n}

proposed in Section 3.1 and Section 3.2 respectively. The results are shown in Table 9.

We can observe that all the 15 time series data have the smoothing parameter

ρ > 0

with statistical significance when we assume ordinary significance levels. These results motivate us to use our methods for parametric estimation proposed in Section 4 when we fit stochastic differential equations for these data.

6.2. Parametric Estimation for a Diffusion Process

We fit a 1-dimensional OU process to the time series data in the 2nd column of the data file S02E.mat, observed at 512 Hz for 222 s (the plot of the path can be seen in Figure 1), whose

{\hat{ρ}}_{n} = 1.037

is the largest among those for the 15 axes and it is larger than 0 with statistical significance. According to the simulation result shown in Section 5.1, this size of the smoothing parameter gives critical biases when we estimate

α

and

β

with the LGA method, which does not take the convolutional observation scheme into account.

The stochastic differential equation with parameters

α \in Θ_{1} : = [0.01, 200]

and

β \in Θ_{2} : = [- 100, - 0.01] \times [- 100, 100]

is defined as follows:

\begin{matrix} d X_{t} = (β^{(1)} X_{t} + β^{(2)}) d t + α d w_{t}, X_{- λ} = x_{- λ} . \end{matrix}

We set 5 s as the time unit: hence

n = 113,664

and

h_{n} = 1 / (5 \times 512)

. If we fit the OU process with the LGA method, i.e., without taking the convolutional observation scheme into account, we obtain the fitting result

\begin{matrix} d X_{t} = ((- 17.378) X_{t} + (- 1.091)) d t + (122.892) d w_{t}, X_{- λ} = x_{- λ} . \end{matrix}

Next, we fit

α

and

β

with the least-squares method proposed in Section 4, and obtain the following fitting result:

\begin{matrix} d X_{t} = ((- 2.146) X_{t} + (0.552)) d t + (151.919) d w_{t}, X_{- λ} = x_{- λ} . \end{matrix}

It is worth noting that this fitting result is substantially different from that by LGA shown above. These results indicate the importance of examining whether the observation is convoluted with the smoothing parameter

ρ > 0

, since otherwise the estimation is strongly biased.
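The gap between the two diffusion estimates can be checked with a short calculation. The sketch below is not the authors' estimation code. It assumes the uniform kernel of width ρ h_n and uses the fact that, for such a kernel, the one-step quadratic variation of the smoothed path is reduced by the factor 1 - ρ/3 for ρ in (0, 1] and (3ρ - 1)/(3ρ²) for ρ > 1 (a reconstruction that is consistent with the ratio appearing in the proof of Lemma 2). The names `f_G_diag` and `alpha_corrected` are hypothetical.

```python
import math

def f_G_diag(rho: float) -> float:
    """Assumed reduction factor of the one-step quadratic variation under a
    uniform smoothing kernel of width rho * h (reconstructed, not quoted)."""
    if rho < 0.0:
        raise ValueError("rho must be non-negative")
    if rho <= 1.0:
        return 1.0 - rho / 3.0
    return (3.0 * rho - 1.0) / (3.0 * rho ** 2)

# Sampling design reported in the text: 512 Hz for 222 s, with 5 s as the time unit.
n = 222 * 512
h = 1.0 / (5 * 512)

alpha_lga = 122.892   # diffusion estimate reported for LGA
rho_hat = 1.037       # reported estimate of the smoothing parameter

# Undo the assumed reduction factor to recover the latent diffusion coefficient.
alpha_corrected = alpha_lga / math.sqrt(f_G_diag(rho_hat))
print(n, n * h, round(alpha_corrected, 2))
```

Under these assumptions the corrected value agrees with the reported 151.919 up to the rounding of the inputs, which suggests that the difference in the diffusion coefficient is explained almost entirely by the smoothing factor.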

7. Summary

We have discussed the convolutional observation scheme which deals with the smoothness of observation in comparison to ordinary diffusion processes. The first contribution is to propose this new observation scheme with the statistical test to confirm whether this scheme is valid in real data. The second one is to prove consistency of the estimator

{\hat{ρ}}_{n}

for the smoothing parameter

ρ

, and of those for the parameters in the diffusion and drift coefficients, i.e.,

α

and

β

, of the latent diffusion process

\{X_{t}\}

. Thirdly, we have examined the performance of those estimators and the test statistics in computational simulation, and verified that these statistics work well in realistic settings. Fourthly, we have shown a real example of observation where

ρ \neq 0

holds with statistical significance.

If we combine the test for noise detection by Nakakita and Uchida [28] with that for smoothed observation proposed in this paper, we can test whether the observed process is a diffusion or not, in the sense of whether the observation is noisy or smoothed. Note that the realised volatilities of the noisy observation of diffusion processes increase as the observation frequency increases, while those of the smoothed observation decrease as the frequency grows. In this respect, the noisy observation in the ordinary sense and the smoothed one are ‘opposite’ to each other.
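The opposite behaviour of the realised volatilities can be reproduced in a toy simulation. The following is a minimal sketch under assumptions that are not taken from the paper: the latent process is a standard Wiener process on [0, 1], the noise is i.i.d. Gaussian with a hypothetical standard deviation `tau`, and the smoothed observation is a local average over one observation step (ρ = 1).

```python
import numpy as np

def realised_volatility(obs, h, step):
    """Sum of squared increments of every `step`-th observation, per unit time."""
    sub = np.asarray(obs, dtype=float)[::step]
    horizon = h * step * (len(sub) - 1)
    return float(np.sum(np.diff(sub) ** 2) / horizon)

rng = np.random.default_rng(0)      # fixed seed, illustrative only
n, m = 20_000, 10                   # n observation intervals, m fine steps in each
h = 1.0 / n                         # observation step
tau = 0.01                          # hypothetical noise standard deviation

# Standard Wiener path on the fine grid with mesh h / m.
w = np.cumsum(rng.normal(0.0, np.sqrt(h / m), size=(n + 1) * m))

latent = w[m - 1::m]                               # path read at multiples of h
noisy = latent + rng.normal(0.0, tau, size=n + 1)  # additive observation noise
smooth = w.reshape(n + 1, m).mean(axis=1)          # local average over one step

steps = (1, 2, 4, 8)                # step 1 is the highest observation frequency
rv_noisy = {k: realised_volatility(noisy, h, k) for k in steps}
rv_smooth = {k: realised_volatility(smooth, h, k) for k in steps}
print(rv_noisy)
print(rv_smooth)
```

In this setting the noisy series should show the largest realised volatility at step 1, whereas the smoothed series should show the smallest one there.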

These contributions, especially the third one, will motivate further study of statistical approaches for convolutionally observed diffusion processes, such as estimation of the kernel function V appearing in the convoluted diffusion

{\bar{X}}_{t} : = (V * X) (t)

, test theory for parameters

α

and

β

based on likelihood-ratio-type test statistics (see, for example, [29,30]), and large deviation inequalities for quasi-likelihood functions with the associated discussion of Bayes-type estimators (e.g., [6,31,32,33]). Through these future works, it is expected that the applicability of stochastic differential equations in real data analysis, and their contribution to areas with high-frequency observation of phenomena such as EEG, will be enhanced.

8. Proofs

8.1. The Results for Some Laws of Large Numbers

In this subsection, we give the notations and statements of propositions without proofs except for Proposition 3: the detailed proofs are given in supplementary material. We assume

Δ \leq λ

,

k \in N, M > 0

, and consider a class of

{\bar{R}}^{k} \otimes {\bar{R}}^{d}

-valued kernel functions on

R

denoted as

K (Δ, k, M)

such that for all

Φ_{Δ} \in K (Δ, k, M)

, it holds:

\begin{matrix} (i) & supp Φ_{Δ} \subset [0, Δ], \\ (ii) & for all f : [0, Δ] \times Ω \to R^{k}, ω \in Ω, |\int_{0}^{Δ} Φ_{Δ} (Δ - s) f (s, ω) d s| \leq M sup_{s \in [0, Δ]} |f (s, ω)| \\ (iii) & for all t_{0} \geq - λ, f : R^{d} \to R which is continuous and at most polynomial growth, \\ E [\int_{t - Δ}^{t} Φ_{Δ} (t - s) f (X_{s}) d s| F_{t_{0}}] = \int_{t - Δ}^{t} Φ_{Δ} (t - s) E [f (X_{s})| F_{t_{0}}] d s . \end{matrix}

Remark 5.

Note one sufficient condition for

Φ_{Δ} \in K (Δ, k, M)

is (i)

Φ_{Δ} : R \to R^{k} \otimes R^{d}

, (ii)

supp Φ_{Δ} \subset [0, Δ]

, (iii)

\int_{0}^{Δ} ∥Φ_{Δ} (Δ - s)∥ d s \leq M

and (iv)

B ([0, Δ])

-measurable since

\begin{matrix} |\int_{0}^{Δ} Φ_{Δ} (Δ - s) f (s, ω) d s| & \leq \int_{0}^{Δ} ∥Φ_{Δ} (Δ - s)∥ |f (s, ω)| d s \leq M sup_{s \in [0, Δ]} |f (s, ω)| \end{matrix}

by the Cauchy–Schwarz inequality and Fubini’s theorem.

It is easily checked that

V_{ρ, h_{n}} \in K ({max}_{i = 1, \dots, d} ρ_{i} h_{n}, d, d)

.

Let p denote an integer such that

{sup}_{n \in N} p h_{n} \leq λ

,

Δ_{n} : = p h_{n}

. We set the sequence of the kernels

{\{Φ_{Δ_{n}, n}\}}_{n \in N}

such that

Φ_{Δ_{n}, n} \in K (Δ_{n}, d, M)

for some

M > 0

,

\int_{0}^{Δ_{n}} Φ_{Δ_{n}, n} d s = I_{d}

and there exists a matrix

B \in R^{d} \otimes R^{d}

such that

\begin{matrix} ∥\int_{0}^{Δ_{n} + h_{n}} (Φ_{Δ_{n}, n} ((Δ_{n} + h_{n}) - s) - Φ_{Δ_{n}, n} (Δ_{n} - s)) s d s - h_{n} B∥ \leq C h_{n}^{2} {(1 + |x|)}^{C}, \end{matrix}

a set

L \subset \{0, \dots, p\}

such that there exist functions

D_{ℓ} : R^{d} \to R^{d} \otimes R^{d}

for

ℓ \in L

such that

\begin{matrix} ∥E [(\int_{0}^{Δ_{n} + (1 + ℓ) h_{n}} Φ_{Δ_{n}, n} (Δ_{n} - s_{1}) (\int_{0}^{s_{1}} a (x) d w_{s_{2}}) d s_{1}) \\ (\int_{0}^{Δ_{n} + (1 + ℓ) h_{n}} (Φ_{Δ_{n}, n} ((Δ_{n} + (1 + ℓ) h_{n}) - s_{1}) - Φ_{Δ_{n}, n} (Δ_{n} + ℓ h_{n} - s_{1})) \\ {\times (\int_{0}^{s_{1}} a (x) d w_{s_{2}}) d s_{1})}^{T}] - h_{n} D_{ℓ} (x)∥ \\ \leq C h_{n}^{2} {(1 + |x|)}^{C}, \end{matrix}

a function

G : R^{d} \to R^{d} \otimes R^{d}

such that

\begin{matrix} ∥E [(\int_{0}^{Δ_{n} + h_{n}} (Φ_{Δ_{n}, n} ((Δ_{n} + h_{n}) - s_{1}) - Φ_{Δ_{n}, n} (Δ_{n} - s_{1})) (\int_{0}^{s_{1}} a (x) d w_{s_{2}}) d s_{1}) \\ {(\int_{0}^{Δ_{n} + h_{n}} (Φ_{Δ_{n}, n} ((Δ_{n} + h_{n}) - s_{1}) - Φ_{Δ_{n}, n} (Δ_{n} - s_{1})) (\int_{0}^{s_{1}} a (x) d w_{s_{2}}) d s_{1})}^{T}] \\ - h_{n} G (x)∥ \\ \leq C h_{n}^{2} {(1 + |x|)}^{C} . \end{matrix}

We define

\begin{matrix} {\bar{X}}_{t, n} = \int_{t - Δ_{n}}^{t} Φ_{Δ_{n}, n} (t - s) X_{s} d s, \end{matrix}

and the following random quantities such that

\begin{matrix} {\bar{ν}}_{n} (f (\cdot, ξ)) & : = \frac{1}{n} \sum_{i = 1}^{n} f ({\bar{X}}_{i h_{n}, n}, ξ), \\ {\bar{I}}_{ℓ, n} (v (\cdot, ξ)) & : = \frac{1}{n h_{n}} \sum_{i = 1 + ℓ}^{n} v ({\bar{X}}_{(i - 1 - ℓ) h_{n}, n}, ξ) [{\bar{X}}_{i h_{n}, n} - {\bar{X}}_{(i - 1) h_{n}, n} - (h_{n} B) b ({\bar{X}}_{(i - 1 - ℓ) h_{n}, n})], \\ {\bar{Q}}_{n} (M (\cdot, ξ)) & : = \frac{1}{n h_{n}} \sum_{i = 1}^{n} M ({\bar{X}}_{(i - 1) h_{n}, n}, ξ) [{({\bar{X}}_{i h_{n}, n} - {\bar{X}}_{(i - 1) h_{n}, n})}^{\otimes 2}], \end{matrix}

where

f : R^{d} \times Ξ \to R

,

v : R^{d} \times Ξ \to R^{d}

,

M : R^{d} \times Ξ \to R^{d} \otimes R^{d}

are in

C^{2}

-class, and they and their first and second derivatives are of at most polynomial growth uniformly in

ξ \in Ξ

.

Proposition 1.

Under [A1],

{\bar{ν}}_{n} (f (\cdot, ξ)) \to^{P} ν_{0} (f (\cdot, ξ))

uniformly in

ξ \in Ξ .

Proposition 2.

If

ℓ \in L

and [A1] hold,

{\bar{I}}_{ℓ, n} (v (\cdot, ξ)) \to^{P} ν_{0} (\partial_{x} v [D_{ℓ}^{T}] (\cdot, ξ))

uniformly in

ξ \in Ξ .

Proposition 3.

Under [A1],

{\bar{Q}}_{n} (M (\cdot, ξ)) \to^{P} ν_{0} (M [G] (\cdot, ξ))

uniformly in

ξ \in Ξ .

We set

p = [\bar{ρ}] + 1

,

Δ_{n} = p h_{n}

and see the evaluation of B,

D_{ℓ}

and G when setting our kernel

\{Φ_{Δ, n}\} = \{V_{ρ, h_{n}}\}

as follows (for the derivation of the evaluation, see the supplementary material): we have

Δ_{n} = p h_{n}

,

B = I_{d}

,

D_{0} (x) = {D_{0} (x | ρ)|}_{ρ = ρ_{⋆}}

, where

D_{0}^{(i, j)} (x | ρ) = A^{(i, j)} (x) f_{D_{0}} (ρ^{(i)}, ρ^{(j)})

,

\begin{matrix} f_{D_{0}} (ρ^{(i)}, ρ^{(j)}) & : = \{\begin{matrix} 0 & if ρ^{(j)} = 0, \\ \frac{ρ^{(j)}}{2} & if ρ^{(i)} = 0, ρ^{(j)} \in (0, 1], \\ \frac{2 ρ^{(j)} - 1}{2 ρ^{(j)}} & if ρ^{(i)} = 0, ρ^{(j)} \in (1, \bar{ρ}], \\ \frac{6 ρ^{(i)} ρ^{(j)} - 3 {(ρ^{(i)})}^{2} - 3 ρ^{(i)}}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(i)} > 0, ρ^{(i)} + 1 < ρ^{(j)}, \\ \frac{{(ρ^{(i)} - ρ^{(j)})}^{3} + 3 {(ρ^{(j)})}^{2} - 3 ρ^{(j)} + 1}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(j)} > 1, ρ^{(i)} < ρ^{(j)} \leq ρ^{(i)} + 1, \\ \frac{3 {(ρ^{(j)})}^{2} - 3 ρ^{(j)} + 1}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(j)} > 1, ρ^{(i)} \geq ρ^{(j)}, \\ \frac{{(ρ^{(i)} - ρ^{(j)})}^{3} + {(ρ^{(j)})}^{3}}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(j)} \in (0, 1], ρ^{(i)} \in (0, ρ^{(j)}), \\ \frac{{(ρ^{(j)})}^{3}}{6 ρ^{(i)} ρ^{(j)}} & if ρ^{(j)} \in (0, 1], ρ^{(i)} \geq ρ^{(j)}, \end{matrix} \end{matrix}

D_{ℓ} = O

for

ℓ \geq [{max}_{i = 1, \dots, d} ρ_{⋆}^{(i)}] + 1

because of independent increments of the Wiener process, and

G (x) = {G (x | ρ)|}_{ρ = ρ_{⋆}}

where

G (x | ρ) = {G (x, α | ρ)|}_{α = α_{⋆}}

.

8.2. Proof of the Results in Section 3.1

Proof of Lemma 1.

By following the proof of Proposition 3, it is sufficient to evaluate

\begin{matrix} \int_{0}^{(p + 2) h_{n}} \int_{0}^{(p + 2) h_{n}} A^{(i, i)} (x) min \{s, s^{'}\} (V_{ρ, h_{n}}^{(i, i)} ((p + 2) h_{n} - s) - V_{ρ, h_{n}}^{(i, i)} (p h_{n} - s)) \\ (V_{ρ, h_{n}}^{(i, i)} ((p + 2) h_{n} - s^{'}) - V_{ρ, h_{n}}^{(i, i)} (p h_{n} - s^{'})) d s^{'} d s \end{matrix}

for the asymptotic behaviour of the reduced quadratic variation by choosing

\begin{matrix} M^{(ℓ_{1}, ℓ_{2})} = \{\begin{matrix} 1 & if ℓ_{1} = ℓ_{2} = i, \\ 0 & otherwise . \end{matrix} \end{matrix}

If

ρ^{(i)} = 0

,

\begin{matrix} \int_{0}^{(p + 2) h_{n}} \int_{0}^{(p + 2) h_{n}} A^{(i, i)} (x) min \{s, s^{'}\} (V_{ρ, h_{n}}^{(i, i)} ((p + 2) h_{n} - s) - V_{ρ, h_{n}}^{(i, i)} (p h_{n} - s)) \\ (V_{ρ, h_{n}}^{(i, i)} ((p + 2) h_{n} - s^{'}) - V_{ρ, h_{n}}^{(i, i)} (p h_{n} - s^{'})) d s^{'} d s \\ = A^{(i, i)} (x) \int_{0}^{(p + 2) h_{n}} \int_{0}^{(p + 2) h_{n}} min \{s, s^{'}\} (δ ((p + 2) h_{n} - s) - δ (p h_{n} - s)) \\ (δ ((p + 2) h_{n} - s^{'}) - δ (p h_{n} - s^{'})) d s^{'} d s \\ = A^{(i, i)} (x) ((p + 2) h_{n} - 2 p h_{n} + p h_{n}) \\ = 2 h_{n} A^{(i, i)} (x), \end{matrix}

and if

ρ^{(i)} \in (0, \bar{ρ}]

,

\begin{matrix} \int_{0}^{(p + 2) h_{n}} \int_{0}^{(p + 2) h_{n}} A^{(i, i)} (x) min \{s, s^{'}\} (V_{ρ, h_{n}}^{(i, i)} ((p + 2) h_{n} - s) - V_{ρ, h_{n}}^{(i, i)} (p h_{n} - s)) \\ (V_{ρ, h_{n}}^{(i, i)} ((p + 2) h_{n} - s^{'}) - V_{ρ, h_{n}}^{(i, i)} (p h_{n} - s^{'})) d s^{'} d s \\ = \frac{A^{(i, i)} (x)}{{(ρ^{(i)} h_{n})}^{2}} \int_{0}^{(p + 2) h_{n}} \int_{0}^{(p + 2) h_{n}} min \{s, s^{'}\} \\ \times (1_{[0, ρ^{(i)} h_{n}]} ((p + 2) h_{n} - s) - 1_{[0, ρ^{(i)} h_{n}]} (p h_{n} - s)) \\ \times (1_{[0, ρ^{(i)} h_{n}]} ((p + 2) h_{n} - s^{'}) - 1_{[0, ρ^{(i)} h_{n}]} (p h_{n} - s^{'})) d s^{'} d s \\ = \frac{A^{(i, i)} (x)}{{(ρ^{(i)} h_{n})}^{2}} \int_{(p + 2 - ρ^{(i)}) h_{n}}^{(p + 2) h_{n}} \int_{(p + 2 - ρ^{(i)}) h_{n}}^{(p + 2) h_{n}} min \{s, s^{'}\} d s^{'} d s \\ - \frac{2 A^{(i, i)} (x)}{{(ρ^{(i)} h_{n})}^{2}} \int_{(p + 2 - ρ^{(i)}) h_{n}}^{(p + 2) h_{n}} \int_{(p - ρ^{(i)}) h_{n}}^{p h_{n}} min \{s, s^{'}\} d s^{'} d s \\ + \frac{A^{(i, i)} (x)}{{(ρ^{(i)} h_{n})}^{2}} \int_{(p - ρ^{(i)}) h_{n}}^{p h_{n}} \int_{(p - ρ^{(i)}) h_{n}}^{p h_{n}} min \{s, s^{'}\} d s^{'} d s \\ = \frac{A^{(i, i)} (x)}{{(ρ^{(i)} h_{n})}^{2}} \int_{(p + 2 - ρ^{(i)}) h_{n}}^{(p + 2) h_{n}} (\int_{s}^{(p + 2) h_{n}} s d s^{'} + \int_{(p + 2 - ρ^{(i)}) h_{n}}^{s} s^{'} d s^{'}) d s \\ - \frac{2 A^{(i, i)} (x)}{{(ρ^{(i)} h_{n})}^{2}} 1_{(2, \bar{ρ}]} (ρ^{(i)}) \int_{p h_{n}}^{(p + 2) h_{n}} \int_{(p - ρ^{(i)}) h_{n}}^{p h_{n}} s^{'} d s^{'} d s \\ - \frac{2 A^{(i, i)} (x)}{{(ρ^{(i)} h_{n})}^{2}} 1_{(2, \bar{ρ}]} (ρ^{(i)}) \int_{(p + 2 - ρ^{(i)}) h_{n}}^{p h_{n}} (\int_{s}^{p h_{n}} s d s^{'} + \int_{(p - ρ^{(i)}) h_{n}}^{s} s^{'} d s^{'}) d s \\ - \frac{2 A^{(i, i)} (x)}{{(ρ^{(i)} h_{n})}^{2}} 1_{(0, 2]} (ρ^{(i)}) \int_{(p + 2 - ρ^{(i)}) h_{n}}^{(p + 2) h_{n}} \int_{(p - ρ^{(i)}) h_{n}}^{p h_{n}} s^{'} d s^{'} d s \\ + \frac{A^{(i, i)} (x)}{{(ρ^{(i)} h_{n})}^{2}} \int_{(p - ρ^{(i)}) h_{n}}^{p h_{n}} (\int_{s}^{p h_{n}} s d s^{'} + \int_{(p - ρ^{(i)}) h_{n}}^{s} s^{'} d s^{'}) d s 
\\ = \{\begin{matrix} 2 h_{n} A^{(i, i)} (x) (1 - \frac{ρ^{(i)}}{6}) & if ρ^{(i)} \in (0, 2], \\ 2 h_{n} A^{(i, i)} (x) (\frac{2}{ρ^{(i)}} - \frac{4}{3 {(ρ^{(i)})}^{2}}) & if ρ^{(i)} \in (2, \bar{ρ}] . \end{matrix} \end{matrix}

Hence, we obtain the proof (for details, see the supplementary material). □
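The closed form at the end of the computation above can be checked numerically. The sketch below is an illustration only: it sets A^{(i,i)}(x) = 1 and h_n = 1, takes the uniform kernel 1_{[0, ρ h_n]} / (ρ h_n) that appears in the proof, and approximates the double integral by a midpoint rule. The function names and the grid size are hypothetical choices.

```python
import numpy as np

def uniform_kernel(u, rho, h):
    """V(u) = 1 / (rho * h) on [0, rho * h] and zero elsewhere (rho > 0)."""
    return np.where((u >= 0.0) & (u <= rho * h), 1.0 / (rho * h), 0.0)

def two_step_variance_numeric(rho, h=1.0, p=4, cells=1200):
    """Midpoint-rule value of the double integral of min(s, s') g(s) g(s'),
    where g(s) = V((p + 2) h - s) - V(p h - s); requires rho <= p."""
    upper = (p + 2) * h
    dx = upper / cells
    s = (np.arange(cells) + 0.5) * dx
    g = uniform_kernel(upper - s, rho, h) - uniform_kernel(p * h - s, rho, h)
    return float(g @ np.minimum.outer(s, s) @ g) * dx * dx

def two_step_variance_closed(rho, h=1.0):
    """Closed form stated at the end of the proof, with A = 1."""
    if rho <= 2.0:
        return 2.0 * h * (1.0 - rho / 6.0)
    return 2.0 * h * (2.0 / rho - 4.0 / (3.0 * rho ** 2))

for rho in (0.5, 1.5, 3.0):
    print(rho, two_step_variance_numeric(rho), two_step_variance_closed(rho))
```

The values of ρ are chosen so that the kernel boundaries fall on cell edges, which keeps the quadrature error small for both branches of the closed form.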

Proof of Lemma 2.

Continuity is obvious, and monotonicity is obtained as follows: if

ρ^{(i)} \in (0, 1]

,

\begin{matrix} \frac{d}{d ρ^{(i)}} (6 - 2 ρ^{(i)}) {(6 - ρ^{(i)})}^{- 1} & = ((- 2) (6 - ρ^{(i)}) - (6 - 2 ρ^{(i)}) (- 1)) {(6 - ρ^{(i)})}^{- 2} \\ & = (- 6) {(6 - ρ^{(i)})}^{- 2} < 0, \end{matrix}

and if

ρ^{(i)} \in (1, 2]

,

\begin{matrix} \frac{d}{d ρ^{(i)}} (6 ρ^{(i)} - 2) {(6 {(ρ^{(i)})}^{2} - {(ρ^{(i)})}^{3})}^{- 1} \\ = (6 (6 {(ρ^{(i)})}^{2} - {(ρ^{(i)})}^{3}) - (6 ρ^{(i)} - 2) (12 ρ^{(i)} - 3 {(ρ^{(i)})}^{2})) {(6 {(ρ^{(i)})}^{2} - {(ρ^{(i)})}^{3})}^{- 2} \\ = 6 ρ^{(i)} (- 7 ρ^{(i)} + 4 + 2 {(ρ^{(i)})}^{2}) {(6 {(ρ^{(i)})}^{2} - {(ρ^{(i)})}^{3})}^{- 2} < 0, \end{matrix}

and if

ρ^{(i)} \in (2, \bar{ρ}]

,

\begin{matrix} \frac{d}{d ρ^{(i)}} (3 ρ^{(i)} - 1) {(6 ρ^{(i)} - 4)}^{- 1} & = (3 (6 ρ^{(i)} - 4) - 6 (3 ρ^{(i)} - 1)) {(6 ρ^{(i)} - 4)}^{- 2} = (- 6) {(6 ρ^{(i)} - 4)}^{- 2} < 0 . \end{matrix}

The inverse can be obtained directly. □
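When a closed-form inverse is not needed, the monotonicity shown above makes a numerical inversion straightforward. The following sketch collects the three branches differentiated in the proof into one function and inverts it by bisection. The names `ratio_r` and `invert_ratio`, the upper bound `rho_max = 10` (taken from the parameter space used in the simulation) and the clipping at the boundaries are assumptions of this illustration.

```python
def ratio_r(rho: float) -> float:
    """Piecewise function whose three branches are differentiated in the proof."""
    if rho <= 1.0:
        return (6.0 - 2.0 * rho) / (6.0 - rho)
    if rho <= 2.0:
        return (6.0 * rho - 2.0) / (6.0 * rho ** 2 - rho ** 3)
    return (3.0 * rho - 1.0) / (6.0 * rho - 4.0)

def invert_ratio(value: float, rho_max: float = 10.0, tol: float = 1e-12) -> float:
    """Solve ratio_r(rho) = value on [0, rho_max] by bisection.
    Values outside the attainable range are clipped to the nearest endpoint."""
    lo, hi = 0.0, rho_max
    if value >= ratio_r(lo):
        return lo
    if value <= ratio_r(hi):
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ratio_r(mid) > value:   # ratio_r is decreasing, so the root lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(invert_ratio(ratio_r(1.7)))
```

Bisection is used here only because it relies on nothing beyond continuity and monotonicity, which are exactly the properties established in Lemma 2.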

Proof of Theorem 1.

It follows from Lemmas 1 and 2 and the continuous mapping theorem. □

8.3. Proofs of the Results in Section 3.2

Proof of Theorem 2.

We can clearly prove the result by using Lemma 7 in Kessler [4], Proposition 7 in Nakakita and Uchida [28], and Slutsky’s theorem. □

Proof of Theorem 3.

By Lemma 1, there exists a number

ℓ < 0

such that

\begin{matrix} \frac{1}{n h_{n}} \sum_{k = 1}^{n} {({\bar{X}}_{k h_{n}, n}^{(i)} - {\bar{X}}_{(k - 1) h_{n}, n}^{(i)})}^{2} - \frac{1}{n h_{n}} \sum_{2 \leq 2 k \leq n} {({\bar{X}}_{2 k h_{n}, n}^{(i)} - {\bar{X}}_{(2 k - 2) h_{n}, n}^{(i)})}^{2} \to^{P} ℓ < 0, \end{matrix}

and hence it is sufficient to show that

\begin{matrix} sup_{n \in N} E [\frac{2}{3 n h_{n}^{2}} \sum_{k = 1}^{n} {({\bar{X}}_{k h_{n}, n}^{(i)} - {\bar{X}}_{(k - 1) h_{n}, n}^{(i)})}^{4}] < \infty; \end{matrix}

and it is obvious that

\begin{matrix} {({\bar{X}}_{k h_{n}, n}^{(i)} - {\bar{X}}_{(k - 1) h_{n}, n}^{(i)})}^{4} \leq C {({\bar{X}}_{k h_{n}, n}^{(i)} - X_{(k - \bar{ρ} - 1) h_{n}})}^{4} + C {({\bar{X}}_{(k - 1) h_{n}, n}^{(i)} - X_{(k - \bar{ρ} - 1) h_{n}})}^{4} \end{matrix}

and

\begin{matrix} E [{({\bar{X}}_{k h_{n}, n}^{(i)} - X_{(k - \bar{ρ} - 1) h_{n}})}^{4}| F_{(k - \bar{ρ} - 1) h_{n}}] \\ = E [{(\frac{1}{ρ_{⋆}^{(i)} h_{n}} \int_{(k - ρ_{⋆}^{(i)}) h_{n}}^{k h_{n}} (X_{s}^{(i)} - X_{(k - \bar{ρ} - 1) h_{n}}) d s)}^{4}| F_{(k - \bar{ρ} - 1) h_{n}}] \\ \leq E [{(\frac{1}{ρ_{⋆}^{(i)} h_{n}} \int_{(k - ρ_{⋆}^{(i)}) h_{n}}^{k h_{n}} |X_{s}^{(i)} - X_{(k - \bar{ρ} - 1) h_{n}}| d s)}^{4}| F_{(k - \bar{ρ} - 1) h_{n}}] \\ \leq E [{(\frac{1}{ρ_{⋆}^{(i)} h_{n}} \int_{(k - ρ_{⋆}^{(i)}) h_{n}}^{k h_{n}} sup_{s^{'} \in [(k - ρ_{⋆}^{(i)}) h_{n}, k h_{n}]} |X_{s^{'}}^{(i)} - X_{(k - \bar{ρ} - 1) h_{n}}| d s)}^{4}| F_{(k - \bar{ρ} - 1) h_{n}}] \\ = E [sup_{s \in [(k - ρ_{⋆}^{(i)}) h_{n}, k h_{n}]} {|X_{s}^{(i)} - X_{(k - \bar{ρ} - 1) h_{n}}|}^{4}| F_{(k - \bar{ρ} - 1) h_{n}}] \\ \leq C h_{n}^{2} {(1 + |X_{(k - \bar{ρ} - 1) h_{n}}|)}^{C} \end{matrix}

by Proposition A in Gloter [9], and a parallel result holds for

{({\bar{X}}_{(k - 1) h_{n}, n}^{(i)} - X_{(k - \bar{ρ} - 1) h_{n}})}^{4}

. Hence we obtain the result. □
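The two quantities handled in the proof above, namely the difference of the one-step and two-step normalised quadratic variations and the fourth-moment normalisation, can be combined into a studentised statistic. The sketch below assumes that the test statistic has the form sqrt(n) times the difference divided by the square root of the fourth-moment term; this form is inferred from the proof and is not quoted from Section 3.2. The simulation uses a standard Wiener process and a local average over one step (ρ = 1), both of which are illustrative choices.

```python
import numpy as np

def smoothing_test_statistic(obs, h):
    """Assumed form of the statistic: negative values point to smoothed observation."""
    obs = np.asarray(obs, dtype=float)
    n = len(obs) - 1
    d1 = np.diff(obs)              # one-step increments
    d2 = np.diff(obs[::2])         # two-step increments over even indices
    rv1 = np.sum(d1 ** 2) / (n * h)
    rv2 = np.sum(d2 ** 2) / (n * h)
    scale = 2.0 * np.sum(d1 ** 4) / (3.0 * n * h ** 2)
    return float(np.sqrt(n) * (rv1 - rv2) / np.sqrt(scale))

rng = np.random.default_rng(1)     # fixed seed, illustrative only
n, m = 20_000, 10                  # n observation intervals, m fine steps in each
h = 1.0 / n
w = np.cumsum(rng.normal(0.0, np.sqrt(h / m), size=(n + 1) * m))

plain = w[m - 1::m]                          # direct observation (rho = 0)
smooth = w.reshape(n + 1, m).mean(axis=1)    # local average over one step (rho = 1)

t_plain = smoothing_test_statistic(plain, h)
t_smooth = smoothing_test_statistic(smooth, h)
print(t_plain, t_smooth)
```

Under direct observation the statistic should behave like a standard normal draw, while under smoothing it should diverge to minus infinity at rate sqrt(n), in line with the negative limit used in the proof.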

8.4. Proof of the Results in Section 4

Proof of Theorem 4.

We only deal with the case where

ρ_{⋆}

is unknown because the discussion for the case where

ρ_{⋆}

is known is parallel. First of all, we prove the consistency of

{\hat{α}}_{n}

. We obtain that

\begin{matrix} |\frac{1}{n} H_{1, n} (α | {\hat{ρ}}_{n}) - \frac{1}{n} H_{1, n} (α | ρ_{⋆})| \\ = |- \frac{1}{n} \sum_{k = 1}^{n} {∥\frac{1}{h_{n}} {({\bar{X}}_{k h_{n}, n} - {\bar{X}}_{(k - 1) h_{n}, n})}^{\otimes 2} - G ({\bar{X}}_{(k - 1) h_{n}, n}, α | {\hat{ρ}}_{n})∥}^{2} \\ + \frac{1}{n} \sum_{k = 1}^{n} {∥\frac{1}{h_{n}} {({\bar{X}}_{k h_{n}, n} - {\bar{X}}_{(k - 1) h_{n}, n})}^{\otimes 2} - G ({\bar{X}}_{(k - 1) h_{n}, n}, α | ρ_{⋆})∥}^{2}| \\ \leq |\frac{2}{n h_{n}} \sum_{k = 1}^{n} (G ({\bar{X}}_{(k - 1) h_{n}, n}, α | {\hat{ρ}}_{n}) - G ({\bar{X}}_{(k - 1) h_{n}, n}, α | ρ_{⋆})) [{({\bar{X}}_{k h_{n}, n} - {\bar{X}}_{(k - 1) h_{n}, n})}^{\otimes 2}]| \\ + |\frac{1}{n} \sum_{k = 1}^{n} ({∥G ({\bar{X}}_{(k - 1) h_{n}, n}, α | {\hat{ρ}}_{n})∥}^{2} - {∥G ({\bar{X}}_{(k - 1) h_{n}, n}, α | ρ_{⋆})∥}^{2})| \\ \leq \frac{2}{n h_{n}} \sum_{k = 1}^{n} {|{\bar{X}}_{k h_{n}, n} - {\bar{X}}_{(k - 1) h_{n}, n}|}^{2} \\ \times \sum_{i = 1}^{d} \sum_{j = 1}^{d} |A^{(i, j)} ({\bar{X}}_{(k - 1) h_{n}, n}, α)| |f_{G} ({\hat{ρ}}_{n}^{(i)}, {\hat{ρ}}_{n}^{(j)}) - f_{G} (ρ_{⋆}^{(i)}, ρ_{⋆}^{(j)})| \\ + \frac{1}{n} \sum_{k = 1}^{n} \sum_{i = 1}^{d} \sum_{j = 1}^{d} {|A^{(i, j)} ({\bar{X}}_{(k - 1) h_{n}, n}, α)|}^{2} |f_{G}^{2} ({\hat{ρ}}_{n}^{(i)}, {\hat{ρ}}_{n}^{(j)}) - f_{G}^{2} (ρ_{⋆}^{(i)}, ρ_{⋆}^{(j)})| \\ \leq \frac{C}{n h_{n}} \sum_{k = 1}^{n} {(1 + |{\bar{X}}_{(k - 1) h_{n}, n}|)}^{C} {|{\bar{X}}_{k h_{n}, n} - {\bar{X}}_{(k - 1) h_{n}, n}|}^{2} \\ \times \sum_{i = 1}^{d} \sum_{j = 1}^{d} |f_{G} ({\hat{ρ}}_{n}^{(i)}, {\hat{ρ}}_{n}^{(j)}) - f_{G} (ρ_{⋆}^{(i)}, ρ_{⋆}^{(j)})| \\ + \frac{C}{n} \sum_{k = 1}^{n} {(1 + |{\bar{X}}_{(k - 1) h_{n}, n}|)}^{C} \sum_{i = 1}^{d} \sum_{j = 1}^{d} |f_{G}^{2} ({\hat{ρ}}_{n}^{(i)}, {\hat{ρ}}_{n}^{(j)}) - f_{G}^{2} (ρ_{⋆}^{(i)}, ρ_{⋆}^{(j)})| \\ \to^{P} 0 uniformly in α, \end{matrix}

by the continuous mapping theorem. Therefore, it follows from Proposition 1 and Proposition 3 that

\begin{matrix} \frac{1}{n} H_{1, n} (α | {\hat{ρ}}_{n}) - \frac{1}{n} H_{1, n} (α_{⋆} | ρ_{⋆}) \\ = \frac{2}{n h_{n}} \sum_{k = 1}^{n} G ({\bar{X}}_{(k - 1) h_{n}, n}, α | ρ_{⋆}) [{({\bar{X}}_{k h_{n}, n} - {\bar{X}}_{(k - 1) h_{n}, n})}^{\otimes 2}] \\ - \frac{1}{n} \sum_{k = 1}^{n} {∥G ({\bar{X}}_{(k - 1) h_{n}, n}, α | ρ_{⋆})∥}^{2} \\ - \frac{2}{n h_{n}} \sum_{k = 1}^{n} G ({\bar{X}}_{(k - 1) h_{n}, n}, α_{⋆} | ρ_{⋆}) [{({\bar{X}}_{k h_{n}, n} - {\bar{X}}_{(k - 1) h_{n}, n})}^{\otimes 2}] \\ + \frac{1}{n} \sum_{k = 1}^{n} {∥G ({\bar{X}}_{(k - 1) h_{n}, n}, α_{⋆} | ρ_{⋆})∥}^{2} \\ + o_{P}^{*} (1) \\ \to^{P} V_{1} (α | ξ_{⋆}) uniformly in α \end{matrix}

where

o_{P}^{*} (1)

indicates the term converging in probability to zero uniformly in

θ

. Then we obtain that

{\hat{α}}_{n} \to^{P} α_{⋆}

in the same way as Kessler [4] with Assumption [A3].

Next, we consider the consistency of

{\hat{β}}_{n}

. Firstly, we consider the case

{max}_{i} ρ_{⋆}^{(i)} \in (ℓ - 1, ℓ)

for an integer

ℓ \in \{1, \dots, [\bar{ρ}] + 1\}

. Then it is sufficient to show

\begin{matrix} \frac{1}{n h_{n}} H_{2, n} (β | {\hat{ρ}}_{n}) - \frac{1}{n h_{n}} H_{2, n} (β_{⋆} | ρ_{⋆}) \to^{P} V_{2} (β | ξ_{⋆}) uniformly in β \end{matrix}

due to Assumption [A3]. Because the evaluation

D_{j} (x) = O

holds for

j \geq [{max}_{i = 1, \dots, d} ρ_{⋆}^{(i)}] + 1

by the independent increments of the Wiener process, Proposition 1 and Proposition 2 verify

\begin{matrix} F_{ℓ} (β) - \frac{1}{n h_{n}} H_{2, n} (β_{⋆} | ρ_{⋆}) \to^{P} V_{2} (β | ξ_{⋆}) uniformly in β, \end{matrix}

where

\begin{matrix} F_{j} (β) : = - \frac{1}{n h_{n}^{2}} \sum_{k = 1 + j}^{n} {|{\bar{X}}_{k h_{n}, n} - {\bar{X}}_{(k - 1) h_{n}, n} - h_{n} b ({\bar{X}}_{(k - 1 - j) h_{n}, n}, β)|}^{2} . \end{matrix}

In addition, the exact convergences such that

\begin{matrix} P (1_{\{ℓ\}} ([max_{i} {\hat{ρ}}_{n}^{(i)}] + 1) = 1) \to 1, P (1_{\{j\}} ([max_{i} {\hat{ρ}}_{n}^{(i)}] + 1) = 1) \to 0 \end{matrix}

hold for all

j \neq ℓ

, since for all

j = 1, \dots, [\bar{ρ}] + 1

,

\begin{matrix} P (1_{\{j\}} ([max_{i} {\hat{ρ}}_{n}^{(i)}] + 1) = 1) = P (max_{i} {\hat{ρ}}_{n}^{(i)} \in [j - 1, j)) . \end{matrix}

Therefore, for any

ϵ > 0

,

\begin{matrix} P (sup_{β \in Θ_{2}} |\frac{1}{n h_{n}} H_{2, n} (β | {\hat{ρ}}_{n}) - \frac{1}{n h_{n}} H_{2, n} (β_{⋆} | ρ_{⋆}) - V_{2} (β | ξ_{⋆})| > ϵ) \\ \leq \sum_{j \neq ℓ} P (1_{\{j\}} ([max_{i} {\hat{ρ}}_{n}^{(i)}] + 1) = 1) \\ + P (\{1_{\{ℓ\}} ([max_{i} {\hat{ρ}}_{n}^{(i)}] + 1) = 1\} \\ \cap \{sup_{β \in Θ_{2}} |F_{ℓ} (β) - \frac{1}{n h_{n}} H_{2, n} (β_{⋆} | ρ_{⋆}) - V_{2} (β | ξ_{⋆})| > ϵ\}) \\ \to 0 . \end{matrix}

For the case

{max}_{i} ρ_{⋆}^{(i)} = ℓ

for an integer

ℓ \in \{0, \dots, [\bar{ρ}] + 1\}

, we similarly obtain

\begin{matrix} \frac{1}{n h_{n}} H_{2, n} (β | {\hat{ρ}}_{n}) - \frac{1}{n h_{n}} H_{2, n} (β_{⋆} | ρ_{⋆}) \to^{P} V_{2} (β | ξ_{⋆}) uniformly in β \end{matrix}

because we have

\begin{matrix} F_{ℓ} (β) - \frac{1}{n h_{n}} H_{2, n} (β_{⋆} | ρ_{⋆}) \to^{P} V_{2} (β | ξ_{⋆}) uniformly in β, \\ F_{ℓ + 1} (β) - \frac{1}{n h_{n}} H_{2, n} (β_{⋆} | ρ_{⋆}) \to^{P} V_{2} (β | ξ_{⋆}) uniformly in β, \end{matrix}

and

\begin{matrix} P (1_{\{ℓ\}} ([max_{i} {\hat{ρ}}_{n}^{(i)}] + 1) + 1_{\{ℓ + 1\}} ([max_{i} {\hat{ρ}}_{n}^{(i)}] + 1) = 1) \to 1, \\ P (1_{\{j\}} ([max_{i} {\hat{ρ}}_{n}^{(i)}] + 1) = 0) \to 1, for all j \neq ℓ, ℓ + 1, \end{matrix}

and it holds that for any

ϵ > 0

,

\begin{matrix} P (sup_{β \in Θ_{2}} |\frac{1}{n h_{n}} H_{2, n} (β | {\hat{ρ}}_{n}) - \frac{1}{n h_{n}} H_{2, n} (β_{⋆} | ρ_{⋆}) - V_{2} (β | ξ_{⋆})| > ϵ) \\ \leq \sum_{j \neq ℓ, ℓ + 1} P (1_{\{j\}} ([max_{i} {\hat{ρ}}_{n}^{(i)}] + 1) = 1) \\ + P (\{1_{\{ℓ\}} ([max_{i} {\hat{ρ}}_{n}^{(i)}] + 1) = 1\} \\ \cap \{sup_{β \in Θ_{2}} |F_{ℓ} (β) - \frac{1}{n h_{n}} H_{2, n} (β_{⋆} | ρ_{⋆}) - V_{2} (β | ξ_{⋆})| > ϵ\}) \\ + P (\{1_{\{ℓ + 1\}} ([max_{i} {\hat{ρ}}_{n}^{(i)}] + 1) = 1\} \\ \cap \{sup_{β \in Θ_{2}} |F_{ℓ + 1} (β) - \frac{1}{n h_{n}} H_{2, n} (β_{⋆} | ρ_{⋆}) - V_{2} (β | ξ_{⋆})| > ϵ\}) \\ \to 0 . \end{matrix}

Hence it is shown that

{\hat{β}}_{n} \to^{P} β_{⋆}

with Assumption [A3]. □
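The lagged contrast F_j and the data-driven choice of the lag used in the proof above can be sketched as follows. This is a scalar illustration under assumptions: the observation is one-dimensional, `drift` is any vectorised drift function supplied by the user, and the bracket [·] is read as the floor function, as suggested by the events of the form [j - 1, j) in the proof. The names `lagged_drift_contrast` and `select_lag` are hypothetical.

```python
import numpy as np

def lagged_drift_contrast(obs, h, lag, drift):
    """F_j(beta) for a scalar series: the drift is evaluated `lag` steps before
    the increment so that smoothing windows do not overlap with it."""
    obs = np.asarray(obs, dtype=float)
    n = len(obs) - 1
    k = np.arange(1 + lag, n + 1)
    residual = obs[k] - obs[k - 1] - h * drift(obs[k - 1 - lag])
    return float(-np.sum(residual ** 2) / (n * h ** 2))

def select_lag(rho_hat):
    """Lag [max_i rho_hat_i] + 1 computed from the estimated smoothing parameters."""
    return int(np.floor(np.max(rho_hat))) + 1

# Toy check: a linear ramp with slope 0.5 is fitted exactly by a constant drift 0.5.
h, n = 0.01, 100
ramp = 0.5 * h * np.arange(n + 1)
lag = select_lag([0.3, 1.2])
exact = lagged_drift_contrast(ramp, h, lag, lambda x: np.full_like(x, 0.5))
wrong = lagged_drift_contrast(ramp, h, lag, lambda x: np.zeros_like(x))
print(lag, exact, wrong)
```

Maximising this contrast over the drift parameter, with the lag fixed by `select_lag`, corresponds to the case distinction on [max_i ρ̂_n^{(i)}] + 1 in the proof.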

Supplementary Materials

The following are available online at https://www.mdpi.com/1099-4300/22/9/1031/s1.

Author Contributions

Conceptualization, S.H.N. and M.U.; Methodology, S.H.N. and M.U.; Software, S.H.N.; Validation, S.N.; Formal Analysis, S.H.N.; Investigation, S.H.N.; Resources, S.H.N. and M.U.; Data Curation, S.H.N.; Writing—Original Draft Preparation, S.H.N.; Writing—Review & Editing, M.U.; Visualization, S.H.N.; Supervision, M.U.; Project Administration, M.U.; Funding Acquisition, S.H.N. and M.U. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially funded by JST CREST, JSPS KAKENHI Grant Number JP17H01100 and JP20J10058.

Acknowledgments

This work was partially supported by Cooperative Research Program of the Institute of Statistical Mathematics.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Florens-Zmirou, D. Approximate discrete time schemes for statistics of diffusion processes. Statistics 1989, 20, 547–557.
2. Yoshida, N. Estimation for diffusion processes from discrete observation. J. Multivar. Anal. 1992, 41, 220–242.
3. Bibby, B.M.; Sørensen, M. Martingale estimating functions for discretely observed diffusion processes. Bernoulli 1995, 1, 17–39.
4. Kessler, M. Estimation of an ergodic diffusion from discrete observations. Scand. J. Stat. 1997, 24, 211–229.
5. Kessler, M.; Sørensen, M. Estimating equations based on eigenfunctions for a discretely observed diffusion process. Bernoulli 1999, 5, 299–314.
6. Yoshida, N. Polynomial type large deviation inequalities and quasi-likelihood analysis for stochastic differential equations. Ann. Inst. Stat. Math. 2011, 63, 431–479.
7. Uchida, M.; Yoshida, N. Adaptive estimation of an ergodic diffusion process based on sampled data. Stoch. Process. Their Appl. 2012, 122, 2885–2924.
8. Uchida, M.; Yoshida, N. Adaptive Bayes type estimators of ergodic diffusion processes from discrete observations. Stat. Inference Stoch. Process. 2014, 17, 181–219.
9. Gloter, A. Discrete sampling of an integrated diffusion process and parameter estimation of the diffusion coefficient. ESAIM Probab. Stat. 2000, 4, 205–227.
10. Ditlevsen, S.; Sørensen, M. Inference for observations of integrated diffusion processes. Scand. J. Stat. 2004, 31, 417–429.
11. Gloter, A. Parameter estimation for a discretely observed integrated diffusion process. Scand. J. Stat. 2006, 33, 83–104.
12. Gloter, A.; Gobet, E. LAMN property for hidden processes: The case of integrated diffusions. Ann. l’Institut Henri Poincaré Probab. Stat. 2008, 44, 104–128.
13. Sørensen, M. Prediction-based estimating functions: Review and new developments. Braz. J. Probab. Stat. 2011, 25, 362–391.
14. Donnet, S.; Samson, A. A review on estimation of stochastic differential equations for pharmacokinetic/pharmacodynamic models. Adv. Drug Deliv. Rev. 2013, 65, 929–939.
15. Delattre, M.; Genon-Catalot, V.; Samson, A. Mixtures of stochastic differential equations with random effects: Application to data clustering. J. Stat. Plan. Inference 2016, 173, 109–124.
16. Picchini, U.; Forman, J.L. Bayesian inference for stochastic differential equation mixed effects models of a tumour xenography study. J. R. Stat. Soc. Ser. C Appl. Stat. 2019, 68, 887–913.
17. Ruse, M.G.; Samson, A.; Ditlevsen, S. Inference for biomedical data by using diffusion models with covariates and mixed effects. J. R. Stat. Soc. Ser. C Appl. Stat. 2020, 69, 167–193.
18. Zhang, L.; Mykland, P.A.; Aït-Sahalia, Y. A Tale of Two Time Scales: Determining Integrated Volatility with Noisy High-Frequency Data. J. Am. Stat. Assoc. 2005, 100, 1394–1411.
19. Aït-Sahalia, Y.; Jacod, J. High-Frequency Financial Econometrics; Princeton University Press: Princeton, NJ, USA, 2014.
20. BNCI Horizon 2020. Two Class Motor Imagery. 2014. Available online: http://bnci-horizon-2020.eu/database/data-sets (accessed on 20 April 2019).
21. Jacod, J.; Li, Y.; Mykland, P.A.; Podolskij, M.; Vetter, M. Microstructure noise in the continuous case: The pre-averaging approach. Stoch. Process. Their Appl. 2009, 119, 2249–2276.
22. Jacod, J.; Podolskij, M.; Vetter, M. Limit theorems for moving averages of discretized processes plus noise. Ann. Stat. 2010, 38, 1478–1545.
23. Bibinger, M.; Hautsch, N.; Malec, P.; Reiß, M. Estimating the quadratic covariation matrix from noisy observations: Local method of moments and efficiency. Ann. Stat. 2014, 42, 1312–1346.
24. Koike, Y. Quadratic covariation estimation of an irregularly observed semimartingale with jumps and noise. Bernoulli 2016, 22, 1894–1936.
25. Ogihara, T. Parametric inference for nonsynchronously observed diffusion processes in the presence of market microstructure noise. Bernoulli 2018, 24, 3318–3383.
26. Iacus, S.M. Simulation and Inference for Stochastic Differential Equations: With R Examples; Springer: New York, NY, USA, 2008.
27. Steyrl, D.; Scherer, R.; Faller, J.; Müller-Putz, G.R. Random forests in non-invasive sensorimotor rhythm brain-computer interfaces: A practical and convenient non-linear classifier. Biomed. Tech. 2016, 61, 77–86.
28. Nakakita, S.H.; Uchida, M. Inference for ergodic diffusions plus noise. Scand. J. Stat. 2019, 46, 470–516.
29. Kitagawa, H.; Uchida, M. Adaptive test statistics for ergodic diffusion processes sampled at discrete times. J. Stat. Plan. Inference 2014, 150, 84–110.
30. Nakakita, S.H.; Uchida, M. Adaptive test for ergodic diffusions plus noise. J. Stat. Plan. Inference 2019, 203, 131–150.
31. Ogihara, T.; Yoshida, N. Quasi-likelihood analysis for the stochastic differential equation with jumps. Stat. Inference Stoch. Process. 2011, 14, 189–229.
32. Clinet, S.; Yoshida, N. Statistical inference for ergodic point processes and application to Limit Order Book. Stoch. Process. Their Appl. 2017, 127, 1800–1839.
33. Nakakita, S.H.; Uchida, M. Quasi-likelihood analysis of an ergodic diffusion plus noise. arXiv 2018, arXiv:1806.09401.

Figure 1. The path of the second column of S02E.mat of BNCI Horizon 2020 [20] for all 222 seconds (left) and the first one second (right).

Figure 2. Realised volatilities with subsampling of the 2nd axis of data S02E.mat in two class motor imagery (002-2014) [20].

Figure 3. The left figure is the plot of the latent diffusion process, and the right one is that of the convolutionally observed process on

[0, 1]

respectively.

Figure 4. The realised volatilities of the convolutionally observed diffusion process with subsampling.

Table 1. Estimation performance of

ρ

with small

ρ

.

$ρ$	0.0	0.1	0.2	0.3	0.4	0.5	0.6	0.7	0.8	0.9	1.0
mean	$0.00990$	$0.0971$	$0.198$	$0.298$	$0.398$	$0.498$	$0.598$	$0.699$	$0.799$	$0.899$	$0.999$
RMSE	$(0.0182)$	$(0.0256)$	$(0.0235)$	$(0.0215)$	$(0.0197)$	$(0.0180)$	$(0.0164)$	$(0.0150)$	$(0.0135)$	$(0.0123)$	$(0.0110)$

Table 2. Empirical ratio of test statistic

T_{n}

less than some critical values, and the maximum value of

T_{n}

in 1000 iterations.

	Empirical Ratio of $T_{n}$ Less Than…					Max. of $T_{n}$
	$Φ^{- 1} (0.10)$	$Φ^{- 1} (0.05)$	$Φ^{- 1} (0.025)$	$Φ^{- 1} (0.01)$	$Φ^{- 1} (0.001)$	Max. of $T_{n}$
$ρ = 0.0$	$0.101$	$0.053$	$0.025$	$0.005$	$0.000$	$3.060$
$ρ = 0.1$	$0.989$	$0.980$	$0.966$	$0.914$	$0.759$	$- 0.710$
$ρ = 0.2$	$1.000$	$1.000$	$1.000$	$1.000$	$1.000$	$- 4.593$
$ρ = 0.3$	$1.000$	$1.000$	$1.000$	$1.000$	$1.000$	$- 9.341$
$ρ = 0.4$	$1.000$	$1.000$	$1.000$	$1.000$	$1.000$	$- 13.985$
$ρ = 0.5$	$1.000$	$1.000$	$1.000$	$1.000$	$1.000$	$- 19.152$
$ρ = 0.6$	$1.000$	$1.000$	$1.000$	$1.000$	$1.000$	$- 24.816$
$ρ = 0.7$	$1.000$	$1.000$	$1.000$	$1.000$	$1.000$	$- 30.848$
$ρ = 0.8$	$1.000$	$1.000$	$1.000$	$1.000$	$1.000$	$- 37.557$
$ρ = 0.9$	$1.000$	$1.000$	$1.000$	$1.000$	$1.000$	$- 44.829$
$ρ = 1.0$	$1.000$	$1.000$	$1.000$	$1.000$	$1.000$	$- 52.759$
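The column headers of Table 2 are lower quantiles $Φ^{- 1} (p)$ of the standard normal distribution, and each entry is the share of simulated $T_{n}$ values falling below that quantile. The sketch below illustrates the computation with Python's standard library; the function names and the sample statistics are hypothetical, and it assumes a one-sided comparison $T_{n} < Φ^{- 1} (p)$ as the table header suggests.

```python
from statistics import NormalDist


def critical_value(level):
    """Lower quantile of the standard normal, i.e. the inverse CDF at `level`."""
    return NormalDist().inv_cdf(level)


def empirical_ratio(test_statistics, level):
    """Fraction of test statistics strictly below the critical value."""
    c = critical_value(level)
    return sum(t < c for t in test_statistics) / len(test_statistics)


# Hypothetical test statistics from four replications.
sample_stats = [-3.0, -2.0, 0.0, 1.0]
ratio_at_5_percent = empirical_ratio(sample_stats, 0.05)  # critical value is about -1.645
```

Under this reading, the row $ρ = 0.0$ reports the empirical size of the test, which should be close to the nominal level $p$, while the rows with $ρ > 0$ report its empirical power.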

Table 3. Estimation of $θ$ by the proposed method and LGA with small $ρ$.

		The Proposed Method			LGA
		$α$	$β^{(1)}$	$β^{(2)}$	$α$	$β^{(1)}$	$β^{(2)}$
True Value		$3.0$	$- 2.0$	$1.0$	$3.0$	$- 2.0$	$1.0$
$ρ = 0.0$	mean	$3.004$	$- 2.091$	$1.036$	$2.999$	$- 2.095$	$1.037$
$ρ = 0.0$	RMSE	$(0.0109)$	$(0.318)$	$(0.497)$	$(0.00679)$	$(0.320)$	$(0.497)$
$ρ = 0.1$	mean	$2.999$	$- 2.091$	$1.035$	$2.949$	$- 2.026$	$1.003$
$ρ = 0.1$	RMSE	$(0.0146)$	$(0.319)$	$(0.496)$	$(0.0509)$	$(0.297)$	$(0.480)$
$ρ = 0.2$	mean	$2.998$	$- 2.091$	$1.035$	$2.898$	$- 1.955$	$0.967$
$ρ = 0.2$	RMSE	$(0.0142)$	$(0.319)$	$(0.496)$	$(0.102)$	$(0.290)$	$(0.464)$
$ρ = 0.3$	mean	$2.998$	$- 2.092$	$1.036$	$2.846$	$- 1.885$	$0.932$
$ρ = 0.3$	RMSE	$(0.0139)$	$(0.319)$	$(0.497)$	$(0.155)$	$(0.299)$	$(0.452)$
$ρ = 0.4$	mean	$2.998$	$- 2.091$	$1.036$	$2.792$	$- 1.815$	$0.897$
$ρ = 0.4$	RMSE	$(0.0135)$	$(0.319)$	$(0.497)$	$(0.208)$	$(0.324)$	$(0.442)$
$ρ = 0.5$	mean	$2.998$	$- 2.092$	$1.036$	$2.738$	$- 1.744$	$0.862$
$ρ = 0.5$	RMSE	$(0.0132)$	$(0.319)$	$(0.497)$	$(0.262)$	$(0.361)$	$(0.436)$
$ρ = 0.6$	mean	$2.998$	$- 2.091$	$1.036$	$2.683$	$- 1.674$	$0.827$
$ρ = 0.6$	RMSE	$(0.0129)$	$(0.319)$	$(0.497)$	$(0.317)$	$(0.408)$	$(0.434)$
$ρ = 0.7$	mean	$2.998$	$- 2.092$	$1.036$	$2.626$	$- 1.604$	$0.792$
$ρ = 0.7$	RMSE	$(0.0126)$	$(0.319)$	$(0.497)$	$(0.374)$	$(0.460)$	$(0.434)$
$ρ = 0.8$	mean	$2.998$	$- 2.092$	$1.036$	$2.568$	$- 1.534$	$0.757$
$ρ = 0.8$	RMSE	$(0.0124)$	$(0.319)$	$(0.496)$	$(0.432)$	$(0.517)$	$(0.439)$
$ρ = 0.9$	mean	$2.998$	$- 2.092$	$1.036$	$2.509$	$- 1.464$	$0.722$
$ρ = 0.9$	RMSE	$(0.0121)$	$(0.319)$	$(0.497)$	$(0.491)$	$(0.578)$	$(0.445)$
$ρ = 1.0$	mean	$2.998$	$- 2.091$	$1.036$	$2.449$	$- 1.394$	$0.687$
$ρ = 1.0$	RMSE	$(0.0119)$	$(0.319)$	$(0.497)$	$(0.551)$	$(0.640)$	$(0.456)$

Table 4. The performance of ${\hat{ρ}}_{n}$ for $ρ = 10, 15, 20$ in 1000 iterations.

	$ρ = 10$	$ρ = 15$	$ρ = 20$
mean of ${\hat{ρ}}_{n}$	$9.919$	$14.980$	$19.751$
RMSE of ${\hat{ρ}}_{n}$	$(0.145)$	$(0.240)$	$(0.409)$

Table 5. Estimation of $θ$ by the proposed method with large $ρ$.

		The Proposed Method			LGA
		$α$	$β^{(1)}$	$β^{(2)}$	$α$	$β^{(1)}$	$β^{(2)}$
True Value		$3.0$	$- 2.0$	$1.0$	$3.0$	$- 2.0$	$1.0$
$ρ = 10$	mean	$2.989$	$- 2.101$	$1.030$	$0.933$	$- 0.204$	$0.0811$
$ρ = 10$	RMSE	$(0.0347)$	$(0.323)$	$(0.496)$	$(2.067)$	$(1.796)$	$(0.920)$
$ρ = 15$	mean	$2.996$	$- 2.095$	$1.027$	$0.765$	$- 0.138$	$0.0473$
$ρ = 15$	RMSE	$(0.0475)$	$(0.321)$	$(0.495)$	$(2.235)$	$(1.862)$	$(0.953)$
$ρ = 20$	mean	$2.977$	$- 2.090$	$1.024$	$0.664$	$- 0.104$	$0.0302$
$ρ = 20$	RMSE	$(0.0526)$	$(0.319)$	$(0.493)$	$(2.336)$	$(1.896)$	$(0.970)$

Table 6. Summary for $ρ$ estimate.

	$ρ^{(1)}$	$ρ^{(2)}$
true value	$2.0$	$4.0$
empirical mean	$1.988$	$3.966$
empirical RMSE	$(0.0207)$	$(0.0514)$

Table 7. Summary for $α$ estimate.

		$α^{(1)}$	$α^{(2)}$	$α^{(3)}$
	True Value	$2.0$	$0.0$	$3.0$
Our proposal	mean	$1.993$	$0.000256$	$2.992$
Our proposal	RMSE	$(0.0115)$	$(0.00739)$	$(0.0213)$
LGA	mean	$1.295$	$- 0.00320$	$1.442$
LGA	RMSE	$(0.705)$	$(0.0154)$	$(1.558)$

Table 8. Summary for $β$ estimate.

		$β^{(1)}$	$β^{(2)}$	$β^{(3)}$	$β^{(4)}$	$β^{(5)}$	$β^{(6)}$
	True Value	$- 2.0$	$- 0.4$	$0.0$	$0.1$	$- 3.0$	$5.0$
Our proposal	mean	$- 2.137$	$- 0.408$	$- 0.0439$	$0.0788$	$- 3.103$	$5.091$
Our proposal	RMSE	$(0.362)$	$(0.252)$	$(0.540)$	$(0.473)$	$(0.399)$	$(0.777)$
LGA	mean	$- 0.917$	$0.340$	$- 0.326$	$- 0.696$	$0.221$	$1.243$
LGA	RMSE	$(1.093)$	$(0.802)$	$(0.386)$	$(0.804)$	$(3.242)$	$(3.765)$

Table 9. The values of ${\hat{ρ}}_{n}$ and $T_{n}$ for the first 15 axes of S02.mat by BNCI Horizon 2020 [20].

	1st axis	2nd axis	3rd axis	4th axis	5th axis
${\hat{ρ}}_{n}$	$0.449$	$1.037$	$0.894$	$0.736$	$0.937$
$T_{n}$	$- 20.398$	$- 58.631$	$- 46.649$	$- 35.201$	$- 49.741$
	6th axis	7th axis	8th axis	9th axis	10th axis
${\hat{ρ}}_{n}$	$0.951$	$0.971$	$1.017$	$0.958$	$0.967$
$T_{n}$	$- 51.392$	$- 52.607$	$- 55.455$	$- 51.221$	$- 51.797$
	11th axis	12th axis	13th axis	14th axis	15th axis
${\hat{ρ}}_{n}$	$0.949$	$0.649$	$0.952$	$0.977$	$0.932$
$T_{n}$	$- 50.457$	$- 30.094$	$- 50.633$	$- 50.978$	$- 48.842$

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
