Article

Measurement and Analysis of High Frequency Asset Volatility Based on Functional Data Analysis

1 School of Economics, Xiamen University, Xiamen 361005, China
2 School of Medicine, Xiamen University, Xiamen 361005, China
3 National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
4 Data Mining Research Center, Xiamen University, Xiamen 361005, China
5 School of Economics and Management, East China Jiaotong University, Nanchang 330013, China
6 School of Mathematical Sciences, Ocean University of China, Qingdao 266100, China
7 National Economic Engineering Laboratory, Dongbei University of Finance and Economics, Dalian 116025, China
8 School of Statistics, Huaqiao University, Xiamen 361005, China
9 School of Business Administration, Hunan University, Changsha 410082, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Mathematics 2022, 10(7), 1140; https://doi.org/10.3390/math10071140
Submission received: 1 March 2022 / Revised: 30 March 2022 / Accepted: 31 March 2022 / Published: 1 April 2022
(This article belongs to the Topic Data Science and Knowledge Discovery)

Abstract

Information and communication technology has enabled the collection of high-frequency financial asset time series data. However, the high temporal resolution of these data makes it challenging to compare patterns in asset characteristics and to identify risk. To address this challenge, a method for calculating realized volatility based on functional data analysis (FDA) is proposed. A time–price functional curve is constructed from the discrete observations, and realized volatility is computed as the curvature integral of this curve. The method effectively eliminates the interference of market microstructure noise: it allows the asset price to be decomposed into a continuous term and a noise term by asymptotic convergence, thereby decoupling the noise from the discrete time series. It can also deliver the value of volatility at any given time, without concern for correlations between repeated samples, mixed sampling frequencies, or unequally spaced observations, and it relaxes the structural constraints and distributional assumptions imposed on data acquisition. To demonstrate the method, we analyze a second-level financial asset dataset. We further conduct a sensitivity analysis with unequally spaced samples and added noise to verify the robustness of the method, and we discuss its practical implications, particularly for finer-grained analysis of financial market volatility and for understanding rapid intraday changes.

1. Introduction

In recent years, with the rapid and convenient acquisition of high-frequency data on asset returns, scholars and investors have paid increasing attention to volatility modeling [1,2]. ARCH-class, stochastic volatility (SV), and realized volatility models are commonly used to calculate volatility [3,4,5]. However, these three classes of models characterize volatility indirectly through the rate of return and cannot depict dynamic changes at the intraday level. Intraday volatility is usually modeled with continuous-time stochastic process methods [6,7,8], which assume that volatility is generated by a potentially unknown diffusion process; such methods, however, cannot describe the long memory and periodicity of intraday fluctuations. These approaches share a dependence on the discrete time points at which measurements are taken and are often called point-based realized volatility calculation models [9]. When the data sampling frequency is high enough and the sampled data are not contaminated by market microstructure effects, these realized volatility estimators are, in theory, consistent estimators of the true volatility.
With the development of data acquisition technology, the frequency of financial data collection keeps increasing. As the sampling frequency rises, the interference of market microstructure noise, such as the bid–ask spread and infrequent trading, becomes more pronounced [10,11,12]. If each sample is noisy, point-based realized volatility calculation models are challenged. First, the sum of squared returns depends heavily on pairs of consecutive price sampling points from t−1 to t, where t denotes a time variable; the sampling points themselves are biased, and different sampling points may lead to different results [13]. Second, high-frequency data exhibit long memory, meaning that past shocks persist and strongly affect future expectations [14,15,16]. In this framework, systemic changes in trading should be considered. From the perspective of temporal evolution, if prices are analyzed as a discrete time series, the underlying stochastic process that generates the observations cannot be ascertained by statistical analysis alone. That is, the shift of the asset price from time t−1 to t is not independent of its current characteristics but should be regarded as part of a systemic change. Faced with these challenges, it is therefore necessary to revisit volatility modeling for high-frequency data.
This paper proposes a curve-based realized volatility calculation model. Discrete time series can be regarded as samples of underlying curves observed over time [17,18]. On the one hand, functional data analysis can extract additional information from discrete time series, such as derivatives that measure the velocity and acceleration of the price curve [19]. On the other hand, applying the functional data analysis method helps identify repeated patterns of price changes in financial markets [20,21]. High-frequency data contain a rich source of information, which provides opportunities to analyze dynamic changes over short time intervals. However, these studies do not consider decoupling noise from the discrete time series. The price is assumed to follow a Brownian semimartingale process, and the integrated volatility can be estimated by the realized volatility [22]. Therefore, this study first constructs a time–price functional curve from the discrete observations using the functional data analysis method with the Bernstein basis functions. Second, drawing on differential geometry, it takes the curvature integral of the time–price functional curve as the realized volatility of high-frequency data over a given period of time, which we call functional volatility (FV). Because the period of time can be set arbitrarily, the realized volatility can be calculated at any time scale; when the time scale is small enough, instantaneous realized volatility can be estimated.
Different from point-based realized volatility calculation methods, the curve-based method has three advantages. First, it captures information about how the price evolves between the sampling time points t−1 and t, where t denotes a time variable. Point-based methods ignore the underlying dynamic shift of the asset price from one time point to the next: for example, if two adjacent time points have equal prices, the return is zero, yet the shift between them contains considerable information about price changes. Second, the curve-based method effectively eliminates the interference of market microstructure noise. The functional data analysis method allows the asset price to be decomposed into a continuous term and a noise term by asymptotic convergence, decoupling the noise from the discrete time series. Third, the curve-based method treats the whole curve as a single entity, so there is no concern about correlations between repeated samples, mixed sampling frequencies, or unequally spaced observations [23]. Point-based methods are limited by the correlation of repeated samples and by the sampling process for irregular data. The functional data analysis method relaxes the structural constraints and distributional assumptions of data acquisition, representing a change of framework for handling discrete time series.
The remainder of this study is organized as follows: Section 2 presents the methods, Section 3 describes the simulation experiments, and Section 4 gives the conclusions and future work.

2. Methods

2.1. Determination of Basis Function

Functional data analysis (FDA) takes functional data as the research object, regarding the observed data as a whole [24]. Consider one-dimensional functional data $X_1(t), X_2(t), \ldots, X_n(t)$, which are realizations of a stochastic process $X(t)$ on a closed interval $\mathcal{F}$. Viewed along the time dimension $t$, functional data are a kind of infinite-dimensional data [25]. In practice, it is difficult to observe the curves completely and without measurement error. Therefore, we first assume that
$$W_{ij} = X_i(T_{ij}) + U_{ij}, \quad T_{ij} \in \mathcal{F}, \quad 1 \le i \le n, \quad 1 \le j \le M_i \tag{1}$$
where $U_{ij}$ is the independent and identically distributed observation error, independent of $X_i$, satisfying $E(U_{ij}) = 0$ and $E(U_{ij}^2) = \sigma_u^2$. The actual observations are $(T_{ij}, W_{ij})$, $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, M_i$.
Because functional data are infinite-dimensional, it is important to reduce the dimension. The common method is to expand the functional data in a set of basis functions. In more detail, suppose that $\phi_1(t), \phi_2(t), \ldots$ is a set of orthogonal bases defined on a closed interval $I$:
$$X(t) = \sum_{k=1}^{\infty} \xi_k \phi_k(t) \tag{2}$$
where $\xi_k = \int_I X(t)\,\phi_k(t)\,dt$ is the projection of $X(t)$ on $\phi_k(t)$. In practice, we truncate the expansion, that is, $X(t) \approx \sum_{k=1}^{K} \xi_k \phi_k(t)$. In this way, the infinite-dimensional functional datum $X(t)$ is approximated by a finite sum, and the information it contains can be expressed by the finite-dimensional vector $(\xi_1, \xi_2, \ldots, \xi_K)$, achieving dimension reduction.
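To make the truncated expansion in Equation (2) concrete, the following minimal Python sketch (an illustration, not the authors' code) projects a toy curve onto an orthonormal cosine basis on [0, 1] and rebuilds it from its first K coefficients; the cosine basis and the value of K are assumptions chosen only for demonstration.

```python
# Minimal sketch of Equation (2): project a curve onto an orthonormal basis on [0, 1]
# and approximate it by the truncated sum of its first K expansion terms.
import numpy as np
from scipy.integrate import trapezoid

t = np.linspace(0.0, 1.0, 1001)
x = np.exp(-t) * np.sin(6 * np.pi * t)          # a toy "functional datum" X(t)

def cosine_basis(k, t):
    """Orthonormal cosine basis on [0, 1]: phi_0 = 1, phi_k = sqrt(2) cos(k pi t)."""
    return np.ones_like(t) if k == 0 else np.sqrt(2.0) * np.cos(k * np.pi * t)

K = 15
# xi_k = integral of X(t) * phi_k(t) dt, approximated with the trapezoidal rule
xi = [trapezoid(x * cosine_basis(k, t), t) for k in range(K)]
x_trunc = sum(c * cosine_basis(k, t) for k, c in enumerate(xi))
print(np.max(np.abs(x - x_trunc)))              # truncation error of the K-term expansion
```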
Let the time series data be $Y_i$, $i = 0, 1, \ldots, n$. The fitting model can be constructed as
$$Y(t) = \sum_{j=0}^{m} \beta_j \varphi_j(t) + \varepsilon(t), \quad 0 \le t \le 1, \quad m < n \tag{3}$$
where $\varphi_j(t)$, $j = 0, 1, \ldots, m$, are a set of basis functions; $\beta_j$, $j = 0, 1, \ldots, m$, denote the coefficients to be determined; and $\varepsilon(t)$ represents the noise.
The series $Y(t)$ given here results from parameterizing the original time series data $Y_i$ ($i = 0, 1, \ldots, n$) over the partition $\Delta t: t_0 < t_1 < \cdots < t_n$. We are then faced with the question of which basis functions to choose. First, it should be noted that polynomial functions can meet the requirements of extracting useful information from complex data, and their values and derivatives of every order are easy to compute, which facilitates visualization.
All polynomials of degree at most m constitute the m-degree polynomial space, and any group of m + 1 linearly independent polynomials in this space can be regarded as a basis [26]. To better reflect the regularity of complex data, the number of peaks and valleys in the data is described by m. Through computer input and interactive modification of the fitting curve, the desired description can be achieved.
The same curve can be represented by different polynomial bases, each with different properties. The power polynomials $t^j$, $j = 0, 1, \ldots, m$, form the simplest polynomial basis [27]. A curve fitted with the power basis has a simple form and is easy to compute; however, the geometric meaning of the coefficient vector in a power-basis polynomial equation is not obvious, and when the order is large, the coefficient matrix of the linear system that must be solved becomes ill-conditioned [28]. The Lagrange basis functions $L_j(t) = \prod_{i=0,\, i \ne j}^{m} \frac{t - t_i}{t_j - t_i}$, $j = 0, 1, \ldots, m$, are normalized and have an obvious regularity [29]; however, their derivatives are complicated, and all data points need to be recalculated every time data are added, which is not suitable for data mining. The Fourier basis $e^{i\omega t}$ can reveal the internal relationship between time and spectrum, but the Fourier transform uses all the time-domain information of the signal and therefore lacks time-domain localization [30]. Considering the characteristics of human–computer interaction and data mining, we select the Bernstein basis functions in this paper. Bernstein polynomials underlie classical Bézier curves and are the foundation for developing complex curves and surfaces; they have many excellent properties, such as normalization, symmetry, recurrence, segmentation, and the convex hull property [31]. The model fitted by Bernstein basis functions is
$$Y(t) = \sum_{j=0}^{m} \beta_j B_{j,m}(t) + \varepsilon(t), \quad 0 \le t \le 1 \tag{4}$$
Here, $\beta_j$, $j = 0, 1, \ldots, m$, denotes the coefficient vector, whose entries are called the control points of the fitting curve. The basis function
$$B_{j,m}(t) = C_m^j\, t^j (1 - t)^{m-j}, \quad 0 \le t \le 1, \quad j = 0, 1, \ldots, m \tag{5}$$
is known as the Bernstein basis function.
Besides the good properties of normalization, symmetry, recurrence, and segmentation, the Bernstein basis also has the convex hull property [32]. The convex hull of a point set is the set of all convex combinations of its elements. The convex hull property of a curve fitted with the Bernstein basis means that the curve always lies in the convex hull of its control vertices (see Figure 1).
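The following minimal Python sketch (not the authors' code) evaluates the Bernstein basis of Equation (5) and checks the normalization property that the basis functions sum to one at every t; the helper name bernstein_basis is our own and is reused in the later sketches.

```python
# Minimal sketch of Equation (5): evaluate B_{j,m}(t) = C(m, j) t^j (1 - t)^(m - j)
# and verify the normalization (partition-of-unity) property.
import numpy as np
from scipy.special import comb

def bernstein_basis(t, m):
    """Return an array of shape (len(t), m + 1) whose column j holds B_{j,m}(t)."""
    t = np.asarray(t, dtype=float)
    j = np.arange(m + 1)
    return comb(m, j) * t[:, None] ** j * (1.0 - t[:, None]) ** (m - j)

t = np.linspace(0.0, 1.0, 5)
B = bernstein_basis(t, m=3)
print(B.sum(axis=1))   # all ones: the Bernstein basis functions sum to one at each t
```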

2.2. Bernstein Basis Function Modeling

Consider a time series $Y(t)$, $0 \le t \le 1$. Take the $m$th-degree Bernstein polynomials as basis functions [31]:
$$B_{j,m}(t) = C_m^j\, t^j (1 - t)^{m-j}, \quad 0 \le t \le 1, \quad j = 0, 1, \ldots, m \tag{6}$$
and construct the actual model
$$Y(t) = \sum_{j=0}^{m} \beta_j B_{j,m}(t) + \varepsilon(t) \tag{7}$$
Fitting the time series data points, the sample regression equation is
$$\hat{Y}(t) = \sum_{j=0}^{m} \hat{\beta}_j B_{j,m}(t), \quad 0 \le t \le 1 \tag{8}$$
Here, the parameter estimate of $\beta_j$ is denoted by $\hat{\beta}_j$. Therefore, the model based on the Bernstein basis functions is
$$Y(t) = \sum_{j=0}^{m} \hat{\beta}_j B_{j,m}(t) + e(t) \tag{9}$$
where $\hat{\beta}_j$, $j = 0, 1, \ldots, m$, is the estimator of the control vertex, $B_{j,m}(t)$ denotes the Bernstein basis function, and $e(t) = Y(t) - \hat{Y}(t)$ expresses the error term. We can further use the properties of the constructed curve to analyze the development law of the studied phenomenon.
It is noteworthy that $\hat{Y}(t)$ is the value of the fitted curve at data point $Y(t)$ (Equation (8)), while $Y(t)$ in Equation (9) is the actual value obtained after adjusting for the disturbance. Moreover, the stochastic variable $\varepsilon(t)$ represents an error, including data measurement error and random error. Suppose $\varepsilon(t) \sim N(0, \sigma^2)$ and $\mathrm{cov}(\varepsilon(t_1), \varepsilon(t_2)) = 0$ for $t_1 \ne t_2$.
In this paper, the least-squares method is used to estimate the control points $\beta_j$, $j = 0, 1, \ldots, m$. The time series data $Y_i$, $i = 0, 1, \ldots, n$, are first parameterized. Let $\tau_i$ be the index corresponding to $Y_i$, $i = 0, 1, \ldots, n$, with $\tau_i \ge 0$.
Normalizing these parameterization results yields
$$t_i = \frac{\tau_i}{\max(\tau_i)}, \quad i = 0, 1, \ldots, n \tag{10}$$
In the measurement of high-frequency asset volatility, n is the number of samples per day. The fitted asset price curve can then be determined by the least-squares approach. Let the fitted curve be
$$\hat{Y}(t_i) = \sum_{j=0}^{m} \hat{\beta}_j B_{j,m}(t_i), \quad i = 0, 1, \ldots, n \tag{11}$$
The sample model is
$$Y(t_i) = \sum_{j=0}^{m} \hat{\beta}_j B_{j,m}(t_i) + e(t_i) \tag{12}$$
The control points are calculated by minimizing
$$E = \sum_{i=0}^{n} \left( Y(t_i) - \hat{Y}(t_i) \right)^2 \tag{13}$$
that is,
$$E(\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_m) = \sum_{i=0}^{n} \left( Y(t_i) - \sum_{j=0}^{m} \hat{\beta}_j B_{j,m}(t_i) \right)^2 \tag{14}$$
According to the least-squares method, the control vertices are obtained as
$$\left( \hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_m \right)^{T} = \left( \Phi^{T} \Phi \right)^{-1} \Phi^{T} \left( Y(t_0), Y(t_1), \ldots, Y(t_n) \right)^{T} \tag{15}$$
where
$$\Phi = \begin{pmatrix} B_{0,m}(t_0) & B_{1,m}(t_0) & \cdots & B_{m,m}(t_0) \\ B_{0,m}(t_1) & B_{1,m}(t_1) & \cdots & B_{m,m}(t_1) \\ \vdots & \vdots & \ddots & \vdots \\ B_{0,m}(t_n) & B_{1,m}(t_n) & \cdots & B_{m,m}(t_n) \end{pmatrix} \tag{16}$$
and $\Phi^{T}$ is the transpose of $\Phi$.
In this way, the $m + 1$ control points $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_m$ in Equation (8) are estimated. The model requires that the fitting curve coincide with the beginning and end of the original curve. Therefore, we modify the head and tail control points, setting $\hat{\beta}_0 = Y(t_0)$ and $\hat{\beta}_m = Y(t_n)$, so that the piecewise fitting curves can be spliced together into an overall fitting curve representing the whole data sample.
We then obtain the fitting curve
$$\hat{Y}(t) = \sum_{j=0}^{m} \hat{\beta}_j B_{j,m}(t), \quad 0 \le t \le 1 \tag{17}$$
In our model, the only parameter to be determined is $m$, which is chosen by minimizing the generalized cross-validation (GCV) criterion:
$$GCV(m) = \frac{\sum_{i=1}^{N} \left( Y(t_i) - \sum_{j=0}^{m} \hat{\beta}_j B_{j,m}(t_i) \right)^2}{\left( 1 - M(m)/N \right)^2} \tag{18}$$
where $M(m)$ denotes the number of effective parameters in the model and $N$ is the number of actual observation samples.
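A minimal Python sketch of the estimation steps in Equations (10)–(18), assuming the bernstein_basis() helper defined above; the search range for m and the use of M(m) = m + 1 effective parameters are illustrative assumptions, not choices stated in the paper.

```python
# Minimal sketch: least-squares control points (Equation (15)), endpoint splicing,
# and degree selection by generalized cross-validation (Equation (18)).
import numpy as np

def fit_bernstein(y, m):
    """Fit an m-th degree Bernstein curve to data y observed at normalized times t_i."""
    n = len(y)
    t = np.arange(n) / (n - 1)                      # t_i = tau_i / max(tau_i), Equation (10)
    Phi = bernstein_basis(t, m)                     # design matrix of Equation (16)
    beta, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # (Phi^T Phi)^{-1} Phi^T Y
    beta[0], beta[-1] = y[0], y[-1]                 # pin head/tail control points for splicing
    return beta, Phi @ beta, t

def gcv(y, m):
    """GCV score for degree m, taking M(m) = m + 1 effective parameters (an assumption)."""
    n = len(y)
    _, y_hat, _ = fit_bernstein(y, m)
    return np.sum((y - y_hat) ** 2) / (1.0 - (m + 1) / n) ** 2

# usage on one 200-point segment of a toy price path
y = 100.0 + np.cumsum(np.random.default_rng(0).normal(scale=0.05, size=200))
best_m = min(range(3, 31), key=lambda m: gcv(y, m))
beta, y_hat, t = fit_bernstein(y, best_m)
```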

2.3. Volatility Measurement

Intuitively, after constructing a functional curve from the raw high-frequency data, it is more natural to quantify volatility through the characteristics of this function than through conventional point-based realized volatility measures. As a result, we can measure asset volatility by measuring the degree of fluctuation of the fitted asset curve.
The curvature of the function can be used to measure the degree of fluctuation of the curve. The curvature of a curve is the rotation rate, with respect to arc length, of the tangent direction angle at a point on the curve. It is defined through differentiation and represents how much the curve bends at that point. The greater the curvature, the greater the bending of the curve, and hence the greater the degree of fluctuation [33].
Let the constructed asset fluctuation curve be $y = f(t)$, assumed to be twice differentiable. The curvature of the function curve at a point $M$ is defined as
$$K = \frac{\left| y'' \right|}{\left( 1 + y'^2 \right)^{3/2}} \tag{19}$$
According to Equation (11), the first and second derivatives of the asset price curve are
$$y' = \frac{\partial \hat{Y}(t)}{\partial t} = \sum_{j=0}^{m} \hat{\beta}_j \, \frac{j - mt}{t(1 - t)} \, B_{j,m}(t), \tag{20}$$
$$y'' = \frac{\partial^2 \hat{Y}(t)}{\partial t^2} = \sum_{j=0}^{m} \hat{\beta}_j \, \frac{m(m-1)t^2 + 2j(1 - m)t + j(j - 1)}{t^2 (1 - t)^2} \, B_{j,m}(t), \tag{21}$$
where $0 \le t \le 1$. Based on differential geometry theory, the total curvature of a curve equals the integral of its curvature:
$$FV_t = \int_{t_j}^{t_{j+1}} \frac{\left| y'' \right|}{\left( 1 + y'^2 \right)^{3/2}} \, dt \tag{22}$$
The formula above characterizes the volatility of high-frequency asset fluctuations and can therefore be used as a volatility measure, which we call functional volatility (FV). Note that the time interval here can be any period.
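A minimal Python sketch of Equations (20)–(22), assuming bernstein_basis() and the fitted control points beta from the sketches above; the evaluation grid and the small offset that keeps the rational derivative expressions away from t = 0 and t = 1 are illustrative choices.

```python
# Minimal sketch: analytic derivatives of the fitted Bernstein curve (Equations (20)-(21))
# and the functional volatility as the numerical curvature integral (Equation (22)).
import numpy as np

def bernstein_derivatives(beta, t):
    """First and second derivatives of y_hat(t) = sum_j beta_j B_{j,m}(t) on 0 < t < 1."""
    m = len(beta) - 1
    j = np.arange(m + 1)
    B = bernstein_basis(t, m)
    d1 = (j - m * t[:, None]) / (t[:, None] * (1.0 - t[:, None])) * B
    num = m * (m - 1) * t[:, None] ** 2 + 2 * j * (1 - m) * t[:, None] + j * (j - 1)
    d2 = num / (t[:, None] ** 2 * (1.0 - t[:, None]) ** 2) * B
    return d1 @ beta, d2 @ beta

def functional_volatility(beta, t_start=0.0, t_end=1.0, n_grid=2001, eps=1e-6):
    """Integrate the curvature |y''| / (1 + y'^2)^(3/2) over [t_start, t_end]."""
    t = np.linspace(max(t_start, eps), min(t_end, 1.0 - eps), n_grid)
    y1, y2 = bernstein_derivatives(beta, t)
    curvature = np.abs(y2) / (1.0 + y1 ** 2) ** 1.5
    return np.sum(0.5 * (curvature[1:] + curvature[:-1]) * np.diff(t))  # trapezoidal rule

fv = functional_volatility(beta)   # functional volatility of the fitted segment
```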

3. Experimental Analysis

To verify the effectiveness of our model, we use real data for empirical analysis and test the effects of unequally spaced samples and noise on a high-frequency asset dataset. Owing to the large amount of data, a single low-degree polynomial cannot fit the data well. Therefore, following [34,35], we use 200 data points for each fitted segment.
The data were obtained from the Thomson Reuters Tick History (TRTH) database and record asset prices at one-second intervals; in particular, the original data are equidistant. We randomly select 1000 days and use the above model to calculate the daily functional volatility as the gold standard. Figure 2 and Figure 3 show the asset prices from 2012 to 2016 used in this paper and the daily volatility measured from the complete dataset.
The relative error is chosen as the evaluation criterion in this paper, which can be expressed as
$$\mathrm{Error} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{\widehat{FV}_i - FV_i}{FV_i} \right| \tag{23}$$
where $\widehat{FV}_i$, $i = 1, 2, \ldots, N$, denotes the daily volatility obtained from the original dataset, and $FV_i$ indicates the corresponding volatility under the simulated conditions.
In detail, we design the following two simulation settings (a minimal sketch of both follows the list).
  • We randomly remove a certain proportion of observations from the original daily data, producing non-equidistant high-frequency data. The removed proportion is controlled by DropRate.
  • We randomly add noise $r \cdot \sigma$ to the original data, where $r$ is drawn uniformly from $[0, 1]$ and the parameter $\sigma$ (sigma) determines the magnitude of the added noise.
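A minimal Python sketch of the two settings and the relative error of Equation (23); the function names, the seed, and the handling of endpoints are illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch: random removal of samples (DropRate), addition of r * sigma noise,
# and the mean relative error between two daily functional volatility series.
import numpy as np

rng = np.random.default_rng(42)

def drop_points(prices, drop_rate):
    """Randomly discard a proportion drop_rate of samples, giving non-equidistant data."""
    keep = rng.random(len(prices)) >= drop_rate
    keep[0] = keep[-1] = True                   # keep segment endpoints for splicing
    return prices[keep], np.flatnonzero(keep)   # kept prices and their original indices

def add_noise(prices, sigma):
    """Add r * sigma to each observation, with r drawn uniformly from [0, 1]."""
    return prices + rng.random(len(prices)) * sigma

def relative_error(fv_hat, fv):
    """Mean relative error of Equation (23) between two daily FV series."""
    fv_hat = np.asarray(fv_hat, dtype=float)
    fv = np.asarray(fv, dtype=float)
    return np.mean(np.abs(fv_hat - fv) / fv)
```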
We report the maximum (sub), average (avg), and standard deviation (std) of the relative error across 500 replications in Table 1 and Table 2. The results demonstrate the efficiency of our method in handling non-equidistant and noisy data: the average relative error is below 10%. In terms of the maximum relative error, the errors exceed 10% only when DropRate equals 0.3 or sigma reaches 0.4.
It is worth noting that in the first simulation setting we randomly remove observations from the original data to produce non-equidistant asset data; equivalently, we estimate the model with missing data. This also verifies that our model can handle the case of missing values.
Stock prices are commonly modeled by lognormal distributions [36]. Therefore, we generate 500 days of high-frequency asset price data from a lognormal distribution to further verify the effectiveness of our model. Following the settings above, we examine the deviation of the model under random sample loss and added noise. Table 3 and Table 4 report the relative error results under different drop rates and sigma values. The results show that our method is robust.

4. Conclusions and Future Work

In recent decades, financial data collection technology has evolved to allow more intensive sampling of temporal, spatial, and other continuous measurements. At the same time, the available financial data are increasingly complex; for example, data may be recorded continuously over an interval or intermittently at several discrete points in time. Faced with this situation, this paper proposed an FDA approach to measure volatility indicators.
The FDA approach represents a change of philosophy in financial time series data processing. Classical multivariate statistical techniques obtain information only at the sampled time points; they do not exploit the additional information implied by the behavior between sampling points.
The FDA approach assumes that financial time series data reflect underlying smooth functions that generate the observations. Additional information can be extracted from the smoothness of these underlying functions [37]; for example, geometric information such as derivatives and curvature can be obtained by operating on the curves [38,39]. We therefore develop a volatility calculation model based on the concept of curvature integration.
Measuring volatility with the FDA approach provides a new analytical perspective on financial markets for scholars, investors, and policy makers. On the one hand, the FDA approach can measure volatility over any period, including instantaneous volatility, which helps us probe financial markets at a micro time scale; in particular, policy makers can monitor markets in real time. On the other hand, the FDA approach represents financial time series data as whole curves, of which volatility measurement is one example. Time series data are a common form of data in financial markets, so in research on financial markets it is worth using the FDA approach, or nesting it in existing methods such as dimension reduction, clustering, and classification, from which new, valuable, and interesting analyses and findings may emerge.
For future research, this work can be extended in three directions. First, Bernstein polynomials underlie classical Bézier curves, which have become the foundation for developing complex curves and surfaces; more complex Bézier-type polynomials as basis functions for volatility measurement therefore deserve study. Second, how to produce a curvature graph for the realized volatility curve is also an interesting topic, for which [40] may serve as a reference. Finally, regarding the measurement of functional volatility, future research can consider obtaining a better fitting curve through kernel methods and verifying its mathematical properties.

Author Contributions

Conceptualization, M.Z. and C.Y.; methodology, Z.L. and F.W.; software, F.W. and Y.M.; validation, Y.X., F.W., and C.Y.; formal analysis, Z.L.; investigation, F.W.; data curation, Z.L. and Y.M.; writing—original draft preparation, F.W. and C.Y.; writing—review and editing, F.W. and C.Y.; visualization, M.Z.; supervision, Y.X.; project administration, Y.X.; funding acquisition, M.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the major project of the National Social Science Foundation (20&ZD137).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Weng, F.; Zhang, H.; Yang, C. Volatility forecasting of crude oil futures based on a genetic algorithm regularization online extreme learning machine with a forgetting factor: The role of news during the COVID-19 pandemic. Resour. Policy 2021, 73, 102148. [Google Scholar] [CrossRef] [PubMed]
  2. Duttilo, P.; Gattone, S.; Di Battista, T. Volatility modeling: An overview of equity markets in the euro area during COVID-19 Pandemic. Mathematics 2021, 9, 1212. [Google Scholar] [CrossRef]
  3. Andersen, T.; Bollerslev, T.; Diebold, F.; Labys, P. The distribution of realized exchange rate volatility. J. Am. Stat. Assoc. 2001, 96, 42–55. [Google Scholar] [CrossRef]
  4. Bollerslev, T. Generalized autoregressive conditional heteroskedasticity. J. Econom. 1986, 31, 307–327. [Google Scholar] [CrossRef] [Green Version]
  5. Hansen, P.; Huang, Z.; Shek, H. Realized GARCH: A joint model for returns and realized measures of volatility. J. Appl. Econom. 2012, 27, 877–906. [Google Scholar] [CrossRef]
  6. Poon, S.; Granger, C. Practical issues in forecasting volatility. Financ. Anal. J. 2005, 61, 45–56. [Google Scholar] [CrossRef] [Green Version]
  7. Andersen, T.; Bollerslev, T.; Diebold, F. Parametric and nonparametric volatility measurement. In Handbook of Financial Econometrics: Tools and Techniques; North-Holland: Chicago, IL, USA, 2010; pp. 67–137. [Google Scholar]
  8. Hurvich, C.; Moulines, E.; Soulier, P. Estimating long memory in volatility. Econometrica 2005, 73, 1283–1328. [Google Scholar] [CrossRef] [Green Version]
  9. Liu, C.; Maheu, J. Are there structural breaks in realized volatility? J. Financ. Econom. 2008, 6, 326–360. [Google Scholar] [CrossRef]
  10. Bandi, F.M.; Russell, J.R. Separating microstructure noise from volatility. J. Financ. Econ. 2006, 79, 655–692. [Google Scholar] [CrossRef]
  11. Hansen, P.; Lunde, A. Realized variance and market microstructure noise. J. Bus. Econ. Stat. 2006, 24, 127–161. [Google Scholar] [CrossRef]
  12. Zhang, L.; Mykland, P.A.; Aït-Sahalia, Y. A tale of two time scales: Determining integrated volatility with noisy high-frequency data. J. Am. Stat. Assoc. 2005, 100, 1394–1411. [Google Scholar] [CrossRef]
  13. Duong, D.; Swanson, N. Empirical evidence on the importance of aggregation, asymmetry, and jumps for volatility prediction. J. Econom. 2015, 187, 606–621. [Google Scholar] [CrossRef] [Green Version]
  14. Breidt, F.; Crato, N.; Delima, P. The detection and estimation of long memory in stochastic volatility. J. Econom. 1998, 83, 325–348. [Google Scholar] [CrossRef] [Green Version]
  15. Baillie, R.; Cecen, A.; Han, Y. High frequency Deutsche Mark-US Dollar returns: FIGARCH representations and non linearities. Multinatl. Financ. J. 2000, 4, 247–267. [Google Scholar] [CrossRef]
  16. Granger, C. Long memory relationships and the aggregation of dynamic models. J. Econom. 1980, 14, 227–238. [Google Scholar] [CrossRef]
  17. Alvarez, A.; Panloup, F.; Pontier, M.; Savy, N. Estimation of the instantaneous volatility. Stat. Inference Stoch. Process. 2012, 15, 27–59. [Google Scholar] [CrossRef] [Green Version]
  18. Müller, H.; Sen, R.; Stadtmüller, U. Functional data analysis for volatility. J. Econom. 2011, 165, 233–245. [Google Scholar] [CrossRef] [Green Version]
  19. Shang, H. Forecasting intraday S&P 500 index returns: A functional time series approach. J. Forecast. 2017, 36, 741–755. [Google Scholar]
  20. Kokoszka, P.; Miao, H.; Zhang, X. Functional dynamic factor model for intraday price curves. J. Financ. Econom. 2015, 13, 456–477. [Google Scholar] [CrossRef] [Green Version]
  21. Shang, H.; Yang, Y.; Kearney, F. Intraday forecasts of a volatility index: Functional time series methods with dynamic updating. Ann. Oper. Res. 2019, 282, 331–354. [Google Scholar] [CrossRef] [Green Version]
  22. Yu, C.; Fang, Y.; Li, Z.; Zhang, B.; Zhao, X. Non-Parametric Estimation of High-Frequency Spot Volatility for Brownian Semimartingale with Jumps. J. Time Ser. Anal. 2014, 35, 572–591. [Google Scholar] [CrossRef]
  23. Wang, J.; Chiou, J.; Müller, H. Functional data analysis. Annu. Rev. Stat. Its Appl. 2016, 3, 257–295. [Google Scholar] [CrossRef] [Green Version]
  24. Ramsay, J. When the data are functions. Psychometrika 1982, 47, 379–396. [Google Scholar] [CrossRef]
  25. Kokoszka, P.; Reimherr, M. Introduction to Functional Data Analysis; Chapman and Hall/CR: London, UK, 2017. [Google Scholar]
  26. Ler, K. A brief proof of a maximal rank theorem for generic double points in projective space. Trans. Am. Math. Soc. 2001, 353, 1907–1920. [Google Scholar]
  27. Beaton, A.; Tukey, J. The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data. Technometrics 1974, 16, 147–185. [Google Scholar] [CrossRef]
  28. Hatefi, E.; Hatefi, A. Nonlinear Statistical Spline Smoothers for Critical Spherical Black Hole Solutions in 4-dimension. arXiv 2022, arXiv:2201.00949. [Google Scholar]
  29. Dahiya, V. Analysis of Lagrange Interpolation Formula. IJISET-Int. J. Innov. Sci. Eng. Technol. 2014, 1, 619–624. [Google Scholar]
  30. Wang, X.; Wang, J.; Wang, X.; Yu, C. A Pseudo-Spectral Fourier Collocation Method for Inhomogeneous Elliptical Inclusions with Partial Differential Equations. Mathematics 2022, 10, 296. [Google Scholar] [CrossRef]
  31. Farouki, R. The Bernstein polynomial basis: A centennial retrospective. Comput. Aided Geom. Des. 2012, 29, 379–419. [Google Scholar] [CrossRef]
  32. Farouki, R.; Goodman, T. On the optimal stability of the Bernstein basis. Math. Comput. 1996, 65, 1553–1566. [Google Scholar] [CrossRef] [Green Version]
  33. Kühnel, W. Differential Geometry; American Mathematical Society: Providence, RI, USA, 2015. [Google Scholar]
  34. Jianping, Z.; Zhiguo, L.; Caiyun, C. A New Predictive Model on Data Mining—Predicting Arithmetic of Bernstein Basic Function Fitting and Its Application for Stock Market. Syst. Eng. Theory Pract. 2003, 9, 35–41. [Google Scholar]
  35. Shaojun, Z.; Hong, L. An improved model based on fitting predictions to Bernstein Basic Function. Stat. Decis. 2015, 8, 20–23. [Google Scholar]
  36. Wang, S. A class of distortion operators for pricing financial and insurance risks. J. Risk Insur. 2000, 1, 15–36. [Google Scholar] [CrossRef] [Green Version]
  37. Levitin, D.; Nuzzo, R.; Vines, B.; Ramsay, J. Introduction to functional data analysis. Can. Psychol. Can. 2007, 48, 135. [Google Scholar] [CrossRef] [Green Version]
  38. Ferraty, F.; Mas, A.; Vieu, P. Nonparametric regression on functional data: Inference and practical aspects. Aust. N. Z. J. Stat. 2007, 49, 267–286. [Google Scholar] [CrossRef] [Green Version]
  39. Mas, A.; Pumo, B. Functional linear regression with derivatives. J. Nonparametr. Stat. 2009, 21, 19–40. [Google Scholar] [CrossRef] [Green Version]
  40. Farin, G. Class A Bézier curves. Comput. Aided Geom. Des. 2006, 7, 573–581. [Google Scholar] [CrossRef]
Figure 1. Convex hull diagram with four control points $P_0, P_1, P_2, P_3$.
Figure 2. Asset prices from 2012 to 2016.
Figure 3. Daily volatility measured using the complete dataset.
Table 1. Relative error results under different drop rates on real data.

DropRate    sub         avg         std
0.1         0.003993    0.000017    0.000192
0.2         0.035771    0.000225    0.001836
0.3         0.115362    0.000767    0.006560
Table 2. Relative error results under different sigma values on real data.

Sigma       sub         avg         std
0.1         0.032560    0.016152    0.009657
0.2         0.065114    0.031076    0.019011
0.3         0.097928    0.049194    0.029173
0.4         0.130392    0.064272    0.038869
Table 3. Relative error results under different drop rates on simulation data.

DropRate    sub         avg         std
0.1         0.001301    0.000209    0.000108
0.2         0.012924    0.000368    0.000861
0.3         0.239079    0.001688    0.012562
Table 4. Relative error results under different sigma values on simulation data.

Sigma       sub         avg         std
0.1         0.000072    0.00004     0.000571
0.2         0.000145    0.000114    0.002248
0.3         0.097928    0.000116    0.001764
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
