Article

Forecast Bitcoin Volatility with Least Squares Model Averaging

College of Business, Shanghai University of Finance and Economics, Shanghai 200433, China
Econometrics 2019, 7(3), 40; https://doi.org/10.3390/econometrics7030040
Submission received: 11 July 2019 / Revised: 6 September 2019 / Accepted: 11 September 2019 / Published: 14 September 2019
(This article belongs to the Special Issue Bayesian and Frequentist Model Averaging)

Abstract

In this paper, we study the problem of forecasting Bitcoin realized volatility computed on data from the largest crypto exchange, Binance. Given the unique features of the crypto asset market, we find that conventional regression models exhibit strong model specification uncertainty. To circumvent this issue, we suggest using least squares model-averaging methods to model and forecast Bitcoin volatility. The empirical results demonstrate that least squares model-averaging methods in general outperform many other conventional regression models that ignore specification uncertainty.
JEL Classification:
C52; C53; G12; G17

1. Introduction

Bitcoin, the first and still one of the foremost applications of blockchain technology, was introduced in 2008. As of the end of December 2018, the market capitalization of Bitcoin was roughly $65 billion, at about $3800 per token. Across the whole Bitcoin network at that date, there were more than 10,000 full nodes distributed around the world and roughly $2.5 billion of value transacted on the main network. With the growth of the Bitcoin market, many investors are starting to view it as an emerging new asset class. In September 2015, the Commodity Futures Trading Commission (CFTC) in the United States officially designated Bitcoin as a commodity. Improved measures of Bitcoin volatility enable us to better gauge the current level of volatility and to understand its dynamics. Most importantly, Bitcoin volatility is now directly tradable,1 which underscores the importance of Bitcoin volatility forecasting.
How to model and predict the volatility of financial assets is an interesting topic in risk management. Traditional approaches employ parametric models such as the generalized autoregressive conditional heteroskedasticity (GARCH) or stochastic volatility models. Recently a new approach to modeling volatility dynamics has relied on improved measures of ex post volatility composed from high-frequency intraday data. This new measure is called realized volatility (RV), which possesses a slowly decaying autocorrelation function, sometimes known as long-memory.2 Various models have been proposed to capture stylized facts of realized volatility series, such as the fractionally integrated autoregressive moving average (ARFIMA) models3 used in Andersen et al. (2001b) and the heterogeneous autoregressive (HAR) model proposed by Corsi (2009). Compared with the ARFIMA model, the HAR model soon gained popularity because of its computational simplicity (e.g., ordinary least squares) and excellent out-of-sample performance.4
The HAR model has an intuitive economic interpretation: agents trading at three frequencies (daily, weekly, and monthly) perceive and respond to volatility at their own horizons, which drives the corresponding components of volatility.5 Nevertheless, the suitability of such a specification has not been sufficiently verified. Craioveanu and Hillebrand (2012) employed a parallel computing method to investigate all the possible combinations of lags in the additive model. Others tested the validity of the lag structure in the conventional HAR model from a model selection perspective; see, e.g., Audrino et al. (2015, 2016) and Audrino and Knaus (2016), among others. While the lag terms in the HAR model survive the tests based on the least absolute shrinkage and selection operator (LASSO) and the adaptive LASSO (Audrino et al. 2015; Audrino and Knaus 2016) only in the case of data simulated from the HAR model, there is strong evidence in Audrino et al. (2016) that casts doubt on the fixed choice of aggregation frequencies in the HAR model. In particular, Audrino et al. (2016) found that a conventional fixed lag structure was not statistically supported by the group LASSO estimates for certain individual stocks in an unstable market environment such as the 2007–2009 crisis. They addressed this issue with a proposed flexible HAR model, built dynamically from the group LASSO estimates.
The above conclusions may or may not hold in Bitcoin volatility forecasting considering the unique features of the crypto asset market. To tackle this question from a different angle, we consider the forecast implication of a flexible lag structure generated by the least squares model-averaging method. Unlike the model selection approach that picks only one winning model out of a pool of candidate models, model averaging calculates the weighted average of a group of candidate models. Barnard (1963) first discussed the concept of “model combination” in a paper studying airline passenger data. Buckland et al. (1997) suggested using the exponential Akaike information criterion (AIC) estimates as the model weights and proposed the model averaged AIC. There exist many other averaging-type approaches that provide a means to tackle model uncertainty, for instance, the Bayesian model-averaging method discussed at length in Hoeting et al. (1999), the weighted-average least squares method by Magnus et al. (2010), and the random forest method by Breiman (2001), among others.
The performance of the model-averaging method relies heavily on the weights chosen for the estimation process. In a pioneering study, Hansen (2007) proposed the Mallows model averaging (MMA) method, which is asymptotically optimal in the sense of achieving the lowest possible mean squared error. Wan et al. (2010) completed the theoretical foundation of the MMA. Extensions of the MMA that allow for possible structural breaks, near unit roots, and heteroskedasticity can be found in Hansen (2009, 2010) and Hansen and Racine (2012), respectively. Xie (2015) proposed the prediction model averaging (PMA) method. Zhao et al. (2016) extended the PMA method to allow for heteroskedastic error terms (HPMA). Liu and Okui (2013) also proposed a heteroskedasticity-robust Mallows’ $C_p$ model-averaging method (HRCP).
There is a growing literature on solving the model uncertainty issue in volatility forecasting with least squares model averaging. Lehrer et al. (2018) proposed the model averaging HAR (MAHAR) method that optimally averages the forecasts of HAR models with different lag indexes. Qiu et al. (2019) showed that the above method can be extended to a more complicated HAR model with estimators of the variation of positive and negative returns (semi-variance components). Besides the above methods, we consider the approach designed by Qiu and Xie (2018), who proposed the heteroskedasticity-robust model averaging HAR method (H-MAHAR) that mainly applies the HPMA as the core model averaging estimator to exchange rate volatility. As a complement to the HPMA, we also include the jackknife model averaging (JMA) and the heteroskedasticity-robust $C_p$ (HRCP) model averaging estimators as companion methods in this paper.
In the empirical exercise, we consider a series of estimators including 9 conventional regression methods, 1 LASSO method, and 4 model-averaging methods to model and forecast the realized variance of Bitcoin prices. We show that the model-averaging methods that account for model uncertainty generally outperform the conventional regressions and the model-selection-based LASSO method. Moreover, the heteroskedasticity-robust methods tend to perform relatively better. Compared with non-model-averaging methods, the H-MAHAR method yields the highest forecasting accuracy in most of the exercises. The improvement that H-MAHAR provides is statistically significant at the 5% level, as confirmed by the Giacomini–White test (Giacomini and White 2006).
The remainder of the paper is arranged as follows. Section 2 provides a more detailed overview of existing HAR strategies. Section 3 discusses how to handle model uncertainty under heteroskedasticity using least squares model averaging. Section 4 describes the data. Section 5 presents the empirical results, where we compare 14 methods in rolling window exercises. In all cases, model-averaging methods tend to have the dominating performance. To examine the robustness of the results, we try different experimental settings in Section 6. Section 7 concludes this paper.

2. Prior HAR-Type Strategies to Forecast Volatility

Following Andersen and Bollerslev (1998), we estimate daily RV at day $t$ ($\mathrm{RV}_t$) by summing the corresponding $M$ equally spaced intra-daily squared returns $r_{t,j}$. Here, the subscript $t$ indexes the day and $j$ indexes the time within day $t$,
$$\mathrm{RV}_t \equiv \sum_{j=1}^{M} r_{t,j}^2, \tag{1}$$
where $t = 1, 2, \ldots, T$, $j = 1, 2, \ldots, M$, and $r_{t,j}$ denotes the continuously compounded high-frequency return obtained by differencing log-prices $p_{t,j}$ ($r_{t,j} = p_{t,j} - p_{t,j-1}$).
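As a minimal illustration of Equation (1), the sketch below computes one day's realized variance from a list of equally spaced intraday prices. The function name `realized_variance` and the sample prices are hypothetical and only serve to make the arithmetic concrete.

```python
import math

def realized_variance(prices):
    """Sum of squared log-returns over one day of equally spaced prices."""
    log_p = [math.log(p) for p in prices]
    # r_{t,j} = p_{t,j} - p_{t,j-1} on the log-price scale
    returns = [b - a for a, b in zip(log_p, log_p[1:])]
    return sum(r * r for r in returns)

# Hypothetical intraday prices for a single day (illustrative values only)
day_prices = [100.0, 101.0, 100.5, 102.0]
rv = realized_variance(day_prices)
```

With 5-minute sampling on a market that trades around the clock, each day would contribute 288 returns to the sum.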
Among the RV models, the HAR model proposed by Corsi (2009) is quite prevalent, both because it accurately approximates the long-memory and multiscaling properties of RV and because it is very easy to implement in practice. The standard HAR model in Corsi (2009) postulates that the $h$-step-ahead daily $\mathrm{RV}_{t+h}$ can be described by
$$\mathrm{RV}_{t+h} = \beta_0 + \beta_d \mathrm{RV}_t^{(1)} + \beta_w \mathrm{RV}_t^{(5)} + \beta_m \mathrm{RV}_t^{(22)} + e_{t+h}, \tag{2}$$
where the explanatory variables take the general form $\mathrm{RV}_t^{(l)}$, defined by
$$\mathrm{RV}_t^{(l)} \equiv l^{-1} \sum_{s=1}^{l} \mathrm{RV}_{t-s}, \tag{3}$$
where $l$ is the number of days over which daily RV is averaged, the $\beta$s are coefficients, and $\{e_t\}_t$ is a zero-mean innovation process. The standard HAR model in Equation (2) is pinned down by the lag index vector $l = [1, 5, 22]$.
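The following sketch shows one way to build the HAR regressors of Equation (3) and fit Equation (2) by ordinary least squares. It assumes NumPy is available, follows the indexing of Equation (3) literally (the average runs over days $t-1$ to $t-l$), and uses a synthetic placeholder series rather than the Bitcoin data; `har_component` and `har_design` are hypothetical helper names.

```python
import numpy as np

def har_component(rv, t, l):
    """RV_t^{(l)}: mean of the l daily values RV_{t-1}, ..., RV_{t-l}."""
    return float(np.mean(rv[t - l:t]))

def har_design(rv, lags=(1, 5, 22), h=1):
    """Stack rows [1, RV_t^{(l_1)}, ...] together with targets RV_{t+h}."""
    max_lag = max(lags)
    rows, targets = [], []
    for t in range(max_lag, len(rv) - h):
        rows.append([1.0] + [har_component(rv, t, l) for l in lags])
        targets.append(rv[t + h])
    return np.array(rows), np.array(targets)

rng = np.random.default_rng(0)
rv = np.abs(rng.normal(size=300)) + 0.1   # synthetic placeholder, not real RV data
X, y = har_design(rv)
# OLS estimates of (beta_0, beta_d, beta_w, beta_m)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Changing `lags` to `(1, 7, 30)` gives the calendar-day variant discussed later for a market that trades every day.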
Andersen et al. (2007) extended the standard HAR model in two ways. First, they added the daily jump component $J_t$ to Equation (2) to explicitly capture its impact. The extended model is denoted the HAR-J model:
$$\mathrm{RV}_{t+h} = \beta_0 + \beta_d \mathrm{RV}_t^{(1)} + \beta_w \mathrm{RV}_t^{(5)} + \beta_m \mathrm{RV}_t^{(22)} + \beta_j J_t + e_{t+h},$$
where the empirical measurement of the squared jumps is $J_t = \max(\mathrm{RV}_t - \mathrm{BPV}_t, 0)$ and the standardized realized bipower variation (BPV) is defined as
$$\mathrm{BPV}_t \equiv (2/\pi)^{-1} \sum_{j=2}^{M} |r_{t,j-1}|\,|r_{t,j}|.$$
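The bipower variation and the jump measure $J_t$ can be sketched directly from one day of intraday returns. The function names and the two sample return vectors are hypothetical; the second vector contains one large return to mimic a jump.

```python
import math

def bipower_variation(returns):
    """(2/pi)^{-1} * sum_{j=2}^{M} |r_{j-1}| * |r_j| for one day of returns."""
    scale = math.pi / 2.0
    return scale * sum(abs(a) * abs(b) for a, b in zip(returns, returns[1:]))

def jump_component(returns):
    """J_t = max(RV_t - BPV_t, 0)."""
    rv = sum(r * r for r in returns)
    return max(rv - bipower_variation(returns), 0.0)

smooth_day = [0.01, -0.01, 0.01]    # no outlying return
jump_day = [0.001, 0.05, 0.001]     # one large return dominates RV
j_smooth = jump_component(smooth_day)
j_jump = jump_component(jump_day)
```

Because BPV multiplies adjacent absolute returns, a single large return is paired with small neighbours and contributes little, which is why the difference from RV isolates the jump.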
Second, through a decomposition of RV into the continuous sample path and the jump component based on the $Z_{1,t}$ statistic, Andersen et al. (2007) reconstructed the HAR-J model by explicitly incorporating the two types of volatility components mentioned above. The $Z_{1,t}$ statistic identifies the “significant” jumps $\mathrm{CJ}_t$ and the continuous sample path components $\mathrm{CSP}_t$, respectively, as
$$\mathrm{CSP}_t \equiv I(Z_t \le \Phi_\alpha) \cdot \mathrm{RV}_t + I(Z_t > \Phi_\alpha) \cdot \mathrm{BPV}_t, \qquad \mathrm{CJ}_t = I(Z_t > \Phi_\alpha) \cdot \max(\mathrm{RV}_t - \mathrm{BPV}_t, 0),$$
where $Z_t$ is the ratio statistic in Huang and Tauchen (2005)6 and $\Phi_\alpha$ is the critical value implied by the cumulative distribution function (CDF) of a standard Gaussian distribution at the $\alpha$ level of significance. The daily, weekly, and monthly average components of $\mathrm{CSP}_t$ and $\mathrm{CJ}_t$ are then constructed in the same manner as $\mathrm{RV}^{(l)}$ in Equation (3). The model specification for the continuous HAR-J, in other words, the HAR-CJ, is given by
$$\mathrm{RV}_{t+h} = \beta_0 + \beta_d^c \mathrm{CSP}_t^{(1)} + \beta_w^c \mathrm{CSP}_t^{(5)} + \beta_m^c \mathrm{CSP}_t^{(22)} + \beta_d^j \mathrm{CJ}_t^{(1)} + \beta_w^j \mathrm{CJ}_t^{(5)} + \beta_m^j \mathrm{CJ}_t^{(22)} + e_{t+h}.$$
Note that the HAR-CJ model explicitly controls for the daily, weekly, and monthly effects of the jump components through the $\mathrm{CJ}_t^{(1)}$, $\mathrm{CJ}_t^{(5)}$, and $\mathrm{CJ}_t^{(22)}$ terms, whereas the HAR-J model contains only one aggregate jump term $J_t$. Thus, the HAR-J model can be regarded as a special and restrictive case of the HAR-CJ model for $\beta_d = \beta_d^c + \beta_d^j$, $\beta_j = \beta_d^j$, $\beta_w = \beta_w^c + \beta_w^j$, and $\beta_m = \beta_m^c + \beta_m^j$.
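A minimal sketch of the continuous/jump split, taking the ratio statistic $Z_t$ as given. The function name and the default level `alpha=0.99` are assumptions made for illustration; the text does not state which level is used, and the computation of $Z_t$ itself (Huang and Tauchen 2005) is not reproduced here.

```python
from statistics import NormalDist

def split_continuous_jump(rv, bpv, z, alpha=0.99):
    """Return (CSP_t, CJ_t) for one day given RV_t, BPV_t and the statistic Z_t.

    alpha is an assumed level; the threshold is the standard normal quantile.
    """
    crit = NormalDist().inv_cdf(alpha)
    if z > crit:
        # Significant jump: the continuous part is BPV, the rest is the jump
        return bpv, max(rv - bpv, 0.0)
    # No significant jump: all of RV is attributed to the continuous path
    return rv, 0.0

csp_jump, cj_jump = split_continuous_jump(rv=2.0, bpv=1.5, z=5.0)
csp_calm, cj_calm = split_continuous_jump(rv=2.0, bpv=1.5, z=0.5)
```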
To capture the role of the “leverage effect” in predicting volatility dynamics, Patton and Sheppard (2015) developed a group of models using signed realized measures. The first model, denoted as HAR-RS-I, decomposes the daily RV in the standard HAR model (Equation (2)) into two asymmetric semi-variances, $\mathrm{RS}_t^{+}$ and $\mathrm{RS}_t^{-}$:
$$\mathrm{RV}_{t+h} = \beta_0 + \beta_d^{+} \mathrm{RS}_t^{+} + \beta_d^{-} \mathrm{RS}_t^{-} + \beta_w \mathrm{RV}_t^{(5)} + \beta_m \mathrm{RV}_t^{(22)} + e_{t+h},$$
where $\mathrm{RS}_t^{-} = \sum_{j=1}^{M} r_{t,j}^2 \cdot I(r_{t,j} < 0)$ and $\mathrm{RS}_t^{+} = \sum_{j=1}^{M} r_{t,j}^2 \cdot I(r_{t,j} > 0)$. To verify whether the realized semi-variances add something beyond the classical leverage effect, Patton and Sheppard (2015) augmented the HAR-RS-I model with a term interacting the lagged RV with an indicator for negative lagged daily returns, $\mathrm{RV}_t^{(1)} \cdot I(r_t < 0)$. The second model in Equation (7) is named HAR-RS-II.
$$\mathrm{RV}_{t+h} = \beta_0 + \beta_1 \mathrm{RV}_t^{(1)} \cdot I(r_t < 0) + \beta_d^{+} \mathrm{RS}_t^{+} + \beta_d^{-} \mathrm{RS}_t^{-} + \beta_w \mathrm{RV}_t^{(5)} + \beta_m \mathrm{RV}_t^{(22)} + e_{t+h}, \tag{7}$$
where $\mathrm{RV}_t^{(1)} \cdot I(r_t < 0)$ is designed to capture the effect of negative daily returns. As in the HAR-CJ model, the third and fourth models in Patton and Sheppard (2015), denoted as HAR-SJ-I and HAR-SJ-II respectively, disentangle the signed jump variations and the BPV from the volatility process.
$$\mathrm{RV}_{t+h} = \beta_0 + \beta_d^{j} \mathrm{SJ}_t + \beta_d^{bpv} \mathrm{BPV}_t + \beta_w \mathrm{RV}_t^{(5)} + \beta_m \mathrm{RV}_t^{(22)} + e_{t+h},$$
$$\mathrm{RV}_{t+h} = \beta_0 + \beta_d^{j-} \mathrm{SJ}_t^{-} + \beta_d^{j+} \mathrm{SJ}_t^{+} + \beta_d^{bpv} \mathrm{BPV}_t + \beta_w \mathrm{RV}_t^{(5)} + \beta_m \mathrm{RV}_t^{(22)} + e_{t+h},$$
where $\mathrm{SJ}_t = \mathrm{RS}_t^{+} - \mathrm{RS}_t^{-}$, $\mathrm{SJ}_t^{+} = \mathrm{SJ}_t \cdot I(\mathrm{SJ}_t > 0)$, and $\mathrm{SJ}_t^{-} = \mathrm{SJ}_t \cdot I(\mathrm{SJ}_t < 0)$. The HAR-SJ-II model extends the HAR-SJ-I model by distinguishing the effect of a positive jump variation from that of a negative jump variation.
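The signed measures above reduce to a few sums over one day of intraday returns. The sketch below uses hypothetical function names and illustrative return values.

```python
def realized_semivariances(returns):
    """Return (RS_t^+, RS_t^-): squared returns split by the sign of the return."""
    rs_pos = sum(r * r for r in returns if r > 0)
    rs_neg = sum(r * r for r in returns if r < 0)
    return rs_pos, rs_neg

def signed_jumps(returns):
    """Return (SJ_t, SJ_t^+, SJ_t^-) with SJ_t = RS_t^+ - RS_t^-."""
    rs_pos, rs_neg = realized_semivariances(returns)
    sj = rs_pos - rs_neg
    # SJ * I(SJ > 0) and SJ * I(SJ < 0), respectively
    return sj, max(sj, 0.0), min(sj, 0.0)

day_returns = [0.02, -0.01, 0.0]    # illustrative values only
rs_pos, rs_neg = realized_semivariances(day_returns)
sj, sj_pos, sj_neg = signed_jumps(day_returns)
```

By construction the two semi-variances add up to the day's RV, so the HAR-RS-I model nests the standard HAR model when the two daily coefficients are equal.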

3. Model Uncertainty

It has been a tradition in the literature to assume the lag structure of the HAR model to be $l = [1, 5, 22]$, which mimics the daily, weekly, and monthly traders in traditional financial markets that only open on workdays. However, given the 24/7 nonstop nature of Bitcoin trading, it may not be appropriate to set the lag index at $[1, 5, 22]$. An initial guess for the lag index would be $l = [1, 7, 30]$, which preserves the tradition of daily, weekly, and monthly averages. However, the suitability of such a specification is itself subject to statistical investigation, which is likely to reveal evident model uncertainty.
Suppose the dependent variable is $y = [\mathrm{RV}_1, \ldots, \mathrm{RV}_T]^\top$ and the explanatory variable is $X = [x_1, \ldots, x_T]^\top$,7 where the specification of $x_t$ takes the general form of the HAR model
$$x_t = \left[1, \mathrm{RV}_{t-h}^{(l_1)}, \mathrm{RV}_{t-h}^{(l_2)}, \ldots, \mathrm{RV}_{t-h}^{(l_p)}\right]. \tag{10}$$
Here, we do not restrict the lag index $l = [l_1, l_2, \ldots, l_p]$ to be $[1, 5, 22]$. Instead, we acknowledge the specification uncertainty in $l$ and consider a group of $M$ candidate models to approximate the true data generating process. Following a usual approach in the model averaging literature, the set of $M$ candidate models is constructed by taking a full permutation of all the lags from $\mathrm{RV}_{t-h}^{(l_1)}$ to $\mathrm{RV}_{t-h}^{(l_p)}$ (that is, $\mathrm{RV}_{t-h}^{(l_1)}, \ldots, \mathrm{RV}_{t-h}^{(l_p)}$ with $[l_1, \ldots, l_p] = [1, \ldots, 30]$). The maximum lag order $l_p$ is chosen as 30. In this way, distinct model weights are assigned to each HAR-type model with a different lag combination. Moreover, as the underlying data sets vary, the relevant model weights change, which effectively makes the method dynamic and data-driven.
Note that the model averaging estimator with pre-screened candidate models is implemented in this paper, since keeping the total number of candidate models manageable, or slowing its growth to infinity, is a necessary condition for maintaining the asymptotic optimality of least squares model averaging estimators. However, in the context of the HAR model with a maximum lag order of $l_p$, we could end up with $2^{l_p}$ candidate models, and the number of potential models grows exponentially with $l_p$. To solve this issue, we first apply a model screening method, for example, the adaptive regression by mixing with model selection (ARMS) approach by Yuan and Yang (2005) or the hetero-robust model screening (HRMS) approach by Xie (2017). Both methods shrink the number of potential models to an appropriate degree by applying model selection criteria before model averaging.
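To make the size of the unscreened candidate set concrete, the sketch below enumerates lag subsets for a small maximum lag and counts them for $l_p = 30$. The generator name is hypothetical, and it lists only non-empty subsets, so it yields $2^{l_p} - 1$ sets; the remaining one is the intercept-only model.

```python
from itertools import combinations

def candidate_lag_sets(max_lag):
    """Yield every non-empty subset of the lag indices 1, ..., max_lag."""
    lags = range(1, max_lag + 1)
    for size in range(1, max_lag + 1):
        for subset in combinations(lags, size):
            yield subset

small = list(candidate_lag_sets(4))   # 2**4 - 1 = 15 lag combinations
full_count = 2 ** 30                  # unscreened count for l_p = 30 (over one billion)
```

Fitting more than a billion regressions in every rolling window is impractical, which is the computational motivation for screening the set down to a handful of models first.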
The true model is presumed to be
$$y = \mu + e,$$
where $y = [y_1, \ldots, y_T]^\top$, $\mu = [\mu_1, \ldots, \mu_T]^\top$, and $e = [e_1, \ldots, e_T]^\top$. $\mu_t$ can be considered the conditional mean in period $t$, $\mu_t = E(y_t \mid y_{t-h}, y_{t-h-1}, \ldots)$, and the error term $e_t$ has zero conditional mean, $E(e_t \mid y_{t-h}, y_{t-h-1}, \ldots) = 0$. Note that the error term $e$ is assumed to be heteroskedastic such that $E(e_t^2 \mid x_t) = \sigma_t^2$, which reflects a more realistic characterization of the realized volatility for a wide class of financial assets. In addition, we also hypothesize that $e$ is not serially correlated and $E(e e^\top \mid X) = \Omega = \mathrm{diag}\{\sigma_1^2, \ldots, \sigma_T^2\}$.8 Let the $m$th candidate model be
$$y = X_m \beta_m + e_m,$$
where $X_m$ consists of a subset of the columns of $X$. With $X_m$ at hand, $\beta_m$ can be estimated by $\hat{\beta}_m = (X_m^\top X_m)^{-1} X_m^\top y$, and thus, $\mu$ is estimated by
$$\hat{\mu}_m = X_m \hat{\beta}_m = X_m (X_m^\top X_m)^{-1} X_m^\top y = P_m y,$$
where $P_m$ is the projection matrix for model $m$. Following Hansen (2008), the optimal mean-square $h$-period-ahead forecast is the conditional mean $\mu_{T+h}$. Therefore, the least-squares forecast of $y_{T+h}$ from the $m$th approximation model is $\hat{y}_{T+h}^m = \hat{\mu}_{T+h}^m = x_{T+h}^m \hat{\beta}_m$. Note that by the definition in Equation (10), $x_{T+h}^m$ is observable in period $T$.
We obtain the forecasts of $y_{T+h}$ from all approximation models and define the vector of forecasts $\hat{y}_{T+h}$ as
$$\hat{y}_{T+h} \equiv \left[\hat{y}_{T+h}^1, \hat{y}_{T+h}^2, \ldots, \hat{y}_{T+h}^M\right]^\top.$$
The model averaging forecast is simply the weighted average of $\hat{y}_{T+h}$ such that
$$\hat{y}_{T+h}(w) \equiv w^\top \hat{y}_{T+h} = \sum_{m=1}^{M} w_m \hat{y}_{T+h}^m,$$
where $w = [w_1, \ldots, w_M]^\top$ is a weight vector in the unit simplex in $\mathbb{R}^M$,
$$H \equiv \left\{ w \in [0, 1]^M : \sum_{m=1}^{M} w_m = 1 \right\}.$$
The performance of the model averaging forecast crucially depends on the weight vector $w$. The model averaging estimator of the conditional mean is then given by
$$\hat{\mu}(w) \equiv P(w) y,$$
where $P(w) \equiv \sum_{m=1}^{M} w_m P_m$ is the averaged projection matrix. The H-MAHAR method is the heteroskedasticity-robust version of the model averaging HAR (MAHAR) method proposed by Lehrer et al. (2018). The MAHAR criterion function is defined as follows:
$$\mathrm{MAHAR}(w) = \left(y - \hat{\mu}(w)\right)^\top \left(y - \hat{\mu}(w)\right) \frac{T + k(w)}{T - k(w)},$$
where $k(w) \equiv \sum_{m=1}^{M} w_m k_m$ is the effective number of parameters and $k_m$ is the number of regressors in model $m$. We obtain the MAHAR weight estimator by minimizing the MAHAR criterion function under the restriction $w \in H$.
Like most model selection and model averaging criteria, the H-MAHAR criterion balances the fit against the complexity of a model:
$$\mathrm{H\text{-}MAHAR}(w) = \left(y - \hat{\mu}(w)\right)^\top \left(y - \hat{\mu}(w)\right) + 2\, \mathrm{tr}\left(P(w) \hat{\Omega}(w)\right), \tag{15}$$
where $\hat{\Omega}(w) \equiv \mathrm{diag}\{\hat{e}_1^2(w), \ldots, \hat{e}_T^2(w)\}$ is the averaged estimate of the $\Omega$ matrix using the model averaging residuals $\hat{e}(w) = [\hat{e}_1(w), \ldots, \hat{e}_T(w)]^\top = y - \hat{\mu}(w)$.
The criterion in Equation (15) can be implemented to compute the empirical weight vector $\hat{w}$ through
$$\hat{w} = \arg\min_{w \in H} \mathrm{H\text{-}MAHAR}(w).$$
Therefore, we obtain the model averaging forecast of $y_{T+h}$ as $\hat{y}_{T+h}(\hat{w}) = \hat{w}^\top \hat{y}_{T+h}$. Note that the H-MAHAR estimator can be considered an extension of the model averaging with averaging covariance matrix (MAACM) estimator of Zhao et al. (2016) to the HAR framework, whereas the original MAACM estimator assumes no dynamic model structures.
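The MAHAR and H-MAHAR criteria can be sketched as follows for two nested candidate models. This is an illustration under stated assumptions, not the paper's implementation: it assumes NumPy, uses synthetic heteroskedastic data, and replaces the constrained optimizer with a coarse grid over the two-model simplex, which only works because $M = 2$. All function and variable names are hypothetical.

```python
import numpy as np

def projection(X):
    """P = X (X'X)^{-1} X' for a full-column-rank design matrix."""
    return X @ np.linalg.solve(X.T @ X, X.T)

def mahar(w, y, P_list, k_list):
    """Sum of squared residuals scaled by (T + k(w)) / (T - k(w))."""
    T = len(y)
    P_w = sum(wm * P for wm, P in zip(w, P_list))
    resid = y - P_w @ y
    k_w = float(np.dot(w, k_list))
    return float(resid @ resid) * (T + k_w) / (T - k_w)

def h_mahar(w, y, P_list):
    """Sum of squared residuals plus 2 * tr(P(w) * Omega_hat(w))."""
    P_w = sum(wm * P for wm, P in zip(w, P_list))
    resid = y - P_w @ y
    # tr(P(w) diag(e^2)) equals sum_t p_tt(w) * e_t(w)^2
    penalty = 2.0 * float(np.sum(np.diag(P_w) * resid ** 2))
    return float(resid @ resid) + penalty

rng = np.random.default_rng(1)
T = 120
x1, x2 = rng.normal(size=T), rng.normal(size=T)
y = 1.0 + 0.5 * x1 + rng.normal(size=T) * (1.0 + np.abs(x1))  # heteroskedastic noise
X_small = np.column_stack([np.ones(T), x1])
X_big = np.column_stack([np.ones(T), x1, x2])
P_list = [projection(X_small), projection(X_big)]

# Grid search over w = (a, 1 - a); a stand-in for a proper simplex-constrained solver
grid = np.linspace(0.0, 1.0, 101)
scores = [h_mahar(np.array([a, 1.0 - a]), y, P_list) for a in grid]
a_hat = float(grid[int(np.argmin(scores))])
w_hat = np.array([a_hat, 1.0 - a_hat])
```

With more than two candidate models, the grid would be replaced by a numerical optimizer that enforces non-negativity and the sum-to-one constraint.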
Another heteroskedasticity-robust model-averaging method is the JMA estimator by Hansen and Racine (2012). The original JMA deals with cross-sectional data. Zhang et al. (2013) proved the asymptotic optimality of the JMA estimator for dependent time-series data. The JMA estimator is also known as leave-one-out cross-validation model averaging. As its name indicates, the JMA requires the use of jackknife residuals for the averaging estimator. The jackknife residual vector for model $m$ can be conveniently expressed as $\hat{e}_m^J = D_m \hat{e}_m$, where $\hat{e}_m$ is the least squares residual vector and $D_m$ is the $n \times n$ diagonal matrix with the $i$th diagonal element equal to $(1 - h_i^m)^{-1}$. The term $h_i^m$ is the $i$th diagonal element of the projection matrix $P_m$. Define an $n \times M$ matrix collecting all the jackknife residuals, $\hat{E}^J = [\hat{e}_{(1)}^J, \ldots, \hat{e}_{(M)}^J]$. The least squares cross-validation criterion for the JMA is simply
$$\mathrm{JMA}_n(w) = \frac{1}{n} w^\top \hat{E}^{J\top} \hat{E}^J w$$
with model weights $w$ estimated through $\hat{w} = \arg\min_{w \in H} \mathrm{JMA}_n(w)$.
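The jackknife residuals do not require refitting each model $n$ times; the diagonal of the projection matrix is enough. The sketch below assumes NumPy and synthetic data, and the function names are hypothetical. It evaluates the criterion at one fixed weight vector rather than minimizing it.

```python
import numpy as np

def jackknife_residuals(X, y):
    """Leave-one-out residuals e_i / (1 - h_i), computed from a single fit."""
    P = X @ np.linalg.solve(X.T @ X, X.T)
    resid = y - P @ y
    return resid / (1.0 - np.diag(P))

def jma_criterion(w, E_jack):
    """(1/n) * w' E_J' E_J w, i.e. the mean squared averaged jackknife residual."""
    n = E_jack.shape[0]
    v = E_jack @ w
    return float(v @ v) / n

rng = np.random.default_rng(2)
n = 40
x = rng.normal(size=n)
y = 2.0 + x + rng.normal(size=n)      # synthetic data for illustration
X1 = np.ones((n, 1))                  # intercept-only candidate
X2 = np.column_stack([np.ones(n), x]) # intercept plus one regressor
E_jack = np.column_stack([jackknife_residuals(X1, y), jackknife_residuals(X2, y)])
value = jma_criterion(np.array([0.3, 0.7]), E_jack)
```

Since the criterion is a quadratic form in $w$, minimizing it over the simplex is a standard quadratic programming problem.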
Liu and Okui (2013) adopted the same model setup to propose the HRCP model averaging estimator for linear regression models with heteroskedastic errors. They demonstrated the asymptotic optimality of the HRCP estimator when the error term exhibits heteroskedasticity. They proposed estimating the model weights by the following feasible HRCP criterion:
$$\mathrm{HRCP}(w) = \left\| y - P(w) y \right\|^2 + 2 \sum_{i=1}^{n} \hat{e}_i^2\, p_{ii}(w), \tag{16}$$
with $\hat{w} = \arg\min_{w \in H} \mathrm{HRCP}(w)$, where $p_{ii}(w)$ is the $i$th diagonal element of $P(w)$. Obtaining $w$ by minimizing Equation (16) under the condition $w \in H$ is a quadratic optimization problem.
Equation (16) includes a preliminary estimate $\hat{e}_i$ that must be obtained prior to estimation. Liu and Okui (2013) discussed several ways to obtain $\hat{e}_i$ in practice. When the models are nested, Liu and Okui (2013) suggested using the residuals from the largest model. When the models are non-nested, they recommended building a model that contains all the regressors in the potential models and taking the corresponding predicted residuals. In addition, a degree-of-freedom correction on $\hat{e}_i$ is recommended to improve finite-sample properties. For example, when the $m$th model is chosen to obtain $\hat{e}_i$, we may use
$$\hat{e} = \sqrt{n / (n - k_m)}\, (I - P_m) y$$
instead of $(I - P_m) y$ to generate the preliminary estimate $\hat{e}_i$.
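A sketch of the feasible HRCP criterion with the degree-of-freedom corrected preliminary residuals, assuming NumPy, synthetic data, and two nested candidates so that the largest model supplies $\hat{e}_i$. Names are hypothetical. Because the preliminary residuals are fixed, the criterion is quadratic in the weights, which the helper `g` makes easy to check along the line $w = (a, 1-a)$.

```python
import numpy as np

def projection(X):
    """P = X (X'X)^{-1} X' for a full-column-rank design matrix."""
    return X @ np.linalg.solve(X.T @ X, X.T)

def prelim_residuals(X_largest, y):
    """sqrt(n / (n - k)) * (I - P) y from the largest candidate model."""
    n, k = X_largest.shape
    raw = y - projection(X_largest) @ y
    return np.sqrt(n / (n - k)) * raw

def hrcp(w, y, P_list, e_prelim):
    """||y - P(w) y||^2 + 2 * sum_i e_i^2 * p_ii(w)."""
    P_w = sum(wm * P for wm, P in zip(w, P_list))
    resid = y - P_w @ y
    penalty = 2.0 * float(np.sum(e_prelim ** 2 * np.diag(P_w)))
    return float(resid @ resid) + penalty

rng = np.random.default_rng(3)
n = 80
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 0.5 + x1 + 0.8 * x2 + rng.normal(size=n) * (0.5 + np.abs(x2))  # synthetic
X_a = np.column_stack([np.ones(n), x1])
X_b = np.column_stack([np.ones(n), x1, x2])   # nested: X_b is the largest model
P_list = [projection(X_a), projection(X_b)]
e_prelim = prelim_residuals(X_b, y)

def g(a):
    """HRCP along the two-model simplex, w = (a, 1 - a)."""
    return hrcp(np.array([a, 1.0 - a]), y, P_list, e_prelim)
```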

4. Data Description

Binance was founded in September 2017 and is now the largest crypto exchange in the world. Because the Bitcoin to U.S. dollar (BTC/USD) price data on Binance have only recently become available, we use the data from 1 January 2018 to 20 December 2018 for this exercise. The total number of daily observations is 352. We estimate the daily RV using Equation (1) at the 5-min interval.
The evolution of the RV data over this period is plotted by the solid line in the upper panel of Figure 1, where the horizontal axis represents the date and the vertical axis on the left-hand side stands for RV. Besides RV, the price of BTC/USD is depicted by the dashed line, with the vertical axis on the right-hand side representing the price. We also show the corresponding daily trading volume in the lower panel of Figure 1. As seen in Figure 1, the dynamics of the RV follow the movements of price and volume: the RV increases as the price changes dramatically, which is usually accompanied by a noticeable peak in the trading volume.
Table 1 presents summary statistics for the data and p-values of both the Jarque–Bera (JB) test for normality and of the Augmented Dickey–Fuller (ADF) tests for unit root. Note that, for the JB and ADF test statistics that are outside tabulated critical values, we report the maximum (0.999) or minimum (0.001) p-values. In Table 1, we consider the first half, the second half, and full samples in columns 2–4, respectively. Each of the series exhibits tremendous variability and a large range across the respective sample period. Furthermore, none of the series are normally distributed or nonstationary at the 5% level.

5. The Empirical Exercise

To investigate the relative prediction efficiency of the H-MAHAR estimator and its comparison methods, we conduct an $h$-step-ahead rolling window exercise of forecasting the BTC/USD RV for various forecasting horizons.9 Table 2 lists each estimator considered in the exercise. For all the HAR-type estimators in Panel A, except the HAR-Full model with all the lagged covariates from 1 to 30, we set $l = [1, 7, 30]$. For the model-averaging methods in Panel B, our general unrestricted model that includes all covariates is the HAR-Full model in which only $\mathrm{RV}_t^{(1)}$10 is replaced with the semi-variance components from the HAR-RS-I. The candidate model set is first pre-screened by the ARMS method of Yuan and Yang (2005), and we only pick the top 10 models. The tuning parameter in LASSO is estimated through 5-fold cross-validation.11 Throughout the experiment, the window length is fixed at 100 observations. We also tried other window lengths and reached similar conclusions. See Section 6.2 for additional details.
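The rolling window scheme can be sketched as an index generator. This is a simplification: it treats all 352 daily observations as usable regression rows and ignores the observations consumed when constructing lagged averages, so the number of forecasts it produces need not match the paper's. The function name is hypothetical.

```python
def rolling_windows(n_obs, window=100, h=1):
    """Yield (train_start, train_end, target_index), with train_end exclusive.

    Each window is estimated on observations train_start .. train_end - 1 and
    used to forecast the observation h steps after the last in-sample one.
    """
    for start in range(0, n_obs - window - h + 1):
        end = start + window
        yield start, end, end + h - 1

windows_h1 = list(rolling_windows(352, window=100, h=1))
windows_h4 = list(rolling_windows(352, window=100, h=4))
```

Each yielded triple would drive one re-estimation of every method followed by a single out-of-sample forecast, so the window advances one day at a time while its length stays fixed.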
We first consider the case of the one-day-ahead forecast ($h = 1$). The results of the prediction experiment are reported in Table 3. The estimation strategies are listed in the first column, and the remaining columns present alternative criteria to evaluate the forecast performance. The criteria include (i) the mean squared forecast error (MSFE), (ii) the mean absolute forecast error (MAFE), (iii) the standard deviation of the forecast error (SDFE), and (iv) the Mincer–Zarnowitz pseudo $R^2$.
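The four evaluation criteria can be computed as below. The exact conventions are assumptions, since the text does not spell them out: SDFE is taken as the sample standard deviation of the forecast errors (divisor $n - 1$), and the pseudo $R^2$ is taken as the $R^2$ from regressing the realized values on a constant and the forecasts, which equals their squared sample correlation.

```python
import math

def forecast_metrics(actual, forecast):
    """Return MSFE, MAFE, SDFE and the Mincer-Zarnowitz pseudo R^2."""
    n = len(actual)
    err = [a - f for a, f in zip(actual, forecast)]
    msfe = sum(e * e for e in err) / n
    mafe = sum(abs(e) for e in err) / n
    mean_err = sum(err) / n
    sdfe = math.sqrt(sum((e - mean_err) ** 2 for e in err) / (n - 1))
    # R^2 of actual on (1, forecast) equals the squared correlation
    ma, mf = sum(actual) / n, sum(forecast) / n
    cov = sum((a - ma) * (f - mf) for a, f in zip(actual, forecast))
    va = sum((a - ma) ** 2 for a in actual)
    vf = sum((f - mf) ** 2 for f in forecast)
    return {"MSFE": msfe, "MAFE": mafe, "SDFE": sdfe, "pseudo_R2": cov * cov / (va * vf)}

# Illustrative values: the forecast is biased by a constant 0.5
metrics = forecast_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

The example shows why the criteria are reported together: a constant bias leaves the pseudo $R^2$ at one and the SDFE at zero while still producing non-zero MSFE and MAFE.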
To ease interpretation, the result that identifies the estimator with the best performance in each column of Table 3 is marked in bold. The performance of autoregressive models, represented by the AR(1) and HAR-Full models, is weak. For each panel, the HAR-type methods demonstrate noticeably improved performance relative to the autoregressive models. In the case of Bitcoin volatility, there is not much gain from including the jump and/or semi-variance components in the standard HAR model. The above set of results suggests that the heterogeneity in modeling Bitcoin volatility cannot be fully accommodated by simply adding extra covariates to the linear model. The least squares model-averaging methods that acknowledge model uncertainty show superior forecasting accuracy under all the evaluation criteria. Among the averaging methods, H-MAHAR displays the best performance. On the other hand, the model-selection-based LASSO method has the worst performance in this situation.
To examine whether the improvement from the least squares model-averaging methods is statistically significant, we perform the modified Giacomini–White (GW) test (Giacomini and White 2006)12 of the null hypothesis that the column method performs equally as well as the row method in terms of MAFE. The corresponding p-values are presented in Table 4 for $h = 1$. We see that the gains in forecast accuracy from the model-averaging methods relative to other strategies are statistically significant at the 5% level.
By exploring weight estimates of the H-MAHAR estimator on the full dataset, we can shed light on both the relative importance of the candidate models and the inclusion of various HAR-type lagged components. The models that are assigned the five highest weights by the H-MAHAR estimator are described in Table 5 (the weights are presented in the 2nd row of Table 5 in descending order). The “x” sign indicates that the corresponding covariate (listed in the first column) is contained in the model. Certain variables, like $\mathrm{RV}_t^{-}$ and $\mathrm{RV}_t^{(30)}$, are included in every model, but variables like $\mathrm{RV}_t^{+}$ or $\mathrm{RV}_t^{(10)}$ are excluded from each of the top five dominant models.
Throughout our analysis, we find that the incorporation of negative semi-variances improves the prediction accuracy and explains a large fraction of the variation in RV, which is consistent with the finding of the literature (Patton and Sheppard 2015). The H-MAHAR method places large weights on models with HAR components of lag indices greater than 15, which may be in part due to the strong short-term performance of the $\mathrm{RV}_t^{-}$ variable. We also observe that HAR components with high lag indices (for example, $\mathrm{RV}_t^{(29)}$ and $\mathrm{RV}_t^{(30)}$), which mimic the long-term dynamics of RV, are intensively picked by the model averaging process. Most importantly, none of the top 5 models has the conventional lag index specification of $[1, 7, 30]$. The above exercise uncovers the existence of model uncertainty for Bitcoin volatility and supports the use of model-averaging methods.

6. Robustness Check

In this section, we perform three robustness checks on our results in Section 5. We first extend the exercises to relatively longer forecast horizons. Specifically, we consider $h = 2$, 3, and 4. In the second robustness check, we consider alternative window lengths. In the last robustness check, the H-MAHAR method is compared with Model 1 from Table 5, the one with the highest model weight among all candidate models.

6.1. Various Forecast Horizons

Table 6 presents the forecast performance of the considered estimators for $h = 2$, 3, and 4 periods ahead.13 Table 7 examines the statistical significance of the forecasting accuracy improvement. For all horizons, the forecasts by least squares model averaging estimators dominate those by other methods in general. Among all the model averaging estimators, the HRCP method performs best in most cases according to the criteria we used, although the improvement is not statistically significant according to the results in Table 7.

6.2. Alternative Window Lengths

In the main exercises, we set the window length at $L = 100$. In this section, we also try other window lengths, namely $L = 50$ and 200. We present the estimation results for $h = 1$. Although not reported here, we also tried other forecast horizons and the results remain robust.
Table 8 shows the forecast performance of all the methods for various window lengths. In all the cases, the H-MAHAR estimator yields the smallest MSFE, MAFE, and SDFE and the largest pseudo $R^2$. We examine the statistical significance of the forecast accuracy improvement in Table 9. The small p-values for the H-MAHAR method against the other methods, especially those without model averaging, indicate that the improvement is significant at the 5% level in most cases.

7. Conclusions

In this paper, we study the forecast performance of least squares model-averaging methods when predicting Bitcoin volatility. Our method allows for a more general lag structure under the HAR framework, instead of restricting it to daily, weekly, and monthly frequencies. Specifically, we estimate the semi-variance HAR models in Patton and Sheppard (2015) with the least squares model-averaging method and consider constructing the potential model set with a full permutation of all of the possible lags and a maximum lag order of 30. The H-MAHAR-embedded model is data-driven, as the empirical weights on potential models with different lag combinations vary with the underlying volatility series and forecast horizons.
In the out-of-sample application to high-frequency data of the realized variance of BTC/USD, we provide suggestive evidence that there exists excessive model uncertainty when modeling Bitcoin volatility by conventional regression methods. We further demonstrate that the model-averaging methods can generally outperform conventional regression methods under various forecast criteria as well as across all forecast horizons ($h = 1, 2, 3, 4$). Specifically, we apply the GW test to examine the statistical significance of the improvement made by the model-averaging method. We reveal that the model-averaging method, especially the one robust to heteroskedasticity (the H-MAHAR), performs significantly better than conventional regressions at the 5% significance level. Therefore, the least squares model-averaging methods adapt themselves remarkably well to a relatively short sample with evident model uncertainty.
This research also sheds some light on future work related to emerging asset classes such as cryptocurrency. When a new asset class is introduced, proper asset valuation theory tends to be developed with a lag, and institutional investors will hesitate to enter the market for risk control purposes. Regulations and technology developments are also likely to keep the market structure susceptible to shocks and to cause great price variations. Moreover, the lack of long histories of trading data is a particular concern compared with other well-established asset classes. In this situation, model averaging contributes to alleviating model specification uncertainty and even to controlling for heteroskedasticity. Some interesting questions are left to further research, for instance, the deep relationship between the crypto trading environment (e.g., the impact of sentiment) and the volatility data structure.

Funding

This research was funded by the National Natural Science Foundation of China grant number 71701175 and the Humanities and Social Science Fund of Ministry of Education of China grant number 17YJC790174.

Acknowledgments

I wish to thank Yue Qiu, Guanxi Yi, and Jun Yu, seminar participants at the SoFiE 2019 Conference in Shanghai from Xiamen University, Shanghai University of Finance and Economics, and Singapore Management University, respectively, for their helpful comments and suggestions. The usual caveat applies.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Andersen, Torben G., and Tim Bollerslev. 1998. Answering the Skeptics: Yes, Standard Volatility Models Do Provide Accurate Forecasts. International Economic Review 39: 885–905.
  2. Andersen, Torben G., Tim Bollerslev, and Francis X. Diebold. 2007. Roughing It Up: Including Jump Components in the Measurement, Modeling, and Forecasting of Return Volatility. The Review of Economics and Statistics 89: 701–20.
  3. Andersen, Torben G., Tim Bollerslev, Francis X. Diebold, and Heiko Ebens. 2001a. The distribution of realized stock return volatility. Journal of Financial Economics 61: 43–76.
  4. Andersen, Torben G., Tim Bollerslev, Francis X. Diebold, and Paul Labys. 2001b. The Distribution of Realized Exchange Rate Volatility. Journal of the American Statistical Association 96: 42–55.
  5. Audrino, Francesco, and Simon D. Knaus. 2016. Lassoing the HAR Model: A Model Selection Perspective on Realized Volatility Dynamics. Econometric Reviews 35: 1485–521.
  6. Audrino, Francesco, Huang Chen, and Okhrin Ostap. 2019. Flexible HAR Model for Realized Volatility. Studies in Nonlinear Dynamics & Econometrics 23: 1–22.
  7. Audrino, Francesco, Lorenzo Camponovo, and Constantin Roth. 2015. Testing the lag Structure of Assets’ Realized Volatility Dynamics. Economics Working Paper Series 1501; St. Gallen: University of St. Gallen, School of Economics and Political Science.
  8. Barnard, George A. 1963. New Methods of Quality Control. Journal of the Royal Statistical Society. Series A (General) 126: 255–58.
  9. Breiman, Leo. 2001. Random Forests. Machine Learning 45: 5–32.
  10. Buckland, Steven T., Kenneth P. Burnham, and Nicole H. Augustin. 1997. Model Selection: An Integral Part of Inference. Biometrics 53: 603–18.
  11. Corsi, Fulvio, Francesco Audrino, and Roberto Renò. 2012. HAR Modeling for Realized Volatility Forecasting. In Handbook of Volatility Models and Their Applications. Hoboken: John Wiley & Sons, Inc., pp. 363–82.
  12. Corsi, Fulvio, Stefan Mittnik, Christian Pigorsch, and Uta Pigorsch. 2008. The Volatility of Realized Volatility. Econometric Reviews 27: 46–78.
  13. Corsi, Fulvio. 2009. A Simple Approximate Long-Memory Model of Realized Volatility. Journal of Financial Econometrics 7: 174–96.
  14. Craioveanu, Mihaela, and Eric Hillebrand. 2012. Why It Is OK to Use the HAR-RV (1, 5, 21) Model. Technical Report. Missouri: University of Central Missouri.
  15. Dacorogna, Michael M., Ulrich A. Müller, Robert J. Nagler, Richard B. Olsen, and Olivier V. Pictet. 1993. A geographical model for the daily and weekly seasonal volatility in the foreign exchange market. Journal of International Money and Finance 12: 413–38.
  16. Giacomini, Raffaella, and Halbert White. 2006. Tests of Conditional Predictive Ability. Econometrica 74: 1545–78.
  17. Hansen, Bruce E. 2007. Least Squares Model Averaging. Econometrica 75: 1175–89.
  18. Hansen, Bruce E. 2008. Least-squares forecast averaging. Journal of Econometrics 146: 342–50.
  19. Hansen, Bruce E. 2009. Averaging Estimators for Regressions with A Possible Structural Break. Econometric Theory 25: 1498–514.
  20. Hansen, Bruce E. 2010. Averaging Estimators for Autoregressions with A Near Unit Root. Journal of Econometrics 158: 142–55.
  21. Hansen, Bruce E., and Jeffrey S. Racine. 2012. Jackknife model averaging. Journal of Econometrics 167: 38–46.
  22. Hoeting, Jennifer A., David Madigan, Adrian E. Raftery, and Chris T. Volinsky. 1999. Bayesian Model Averaging: A Tutorial. Statistical Science 14: 382–401.
  23. Huang, Xin, and George Tauchen. 2005. The Relative Contribution of Jumps to Total Price Variance. Journal of Financial Econometrics 3: 456–99.
  24. Lehrer, Steven F., Tian Xie, and Xinyu Zhang. 2018. Wits versus Tweets: Does Adding Social Media Wisdom Trump Admitting Ignorance when Forecasting the CBOE VIX? Working Paper A0167. Hong Kong, China: The City University of Hong Kong.
  25. Liu, Qingfeng, and Ryo Okui. 2013. Heteroskedasticity-robust Cp Model Averaging. The Econometrics Journal 16: 463–72.
  26. Magnus, Jan R., Owen Powell, and Patricia Prüfer. 2010. A comparison of two model averaging techniques with an application to growth empirics. Journal of Econometrics 154: 139–53.
  27. Müller, Ulrich A., Michel M. Dacorogna, Rakhal D. Davé, Olivier V. Pictet, Richard B. Olsen, and J. Robert Ward. 1993. Fractals and Intrinsic Time—A Challenge to Econometricians. Technical Report. Zürich: Olsen & Associates.
  28. Patton, Andrew J., and Kevin Sheppard. 2015. Good Volatility, Bad Volatility: Signed Jumps and The Persistence of Volatility. The Review of Economics and Statistics 97: 683–97.
  29. Qiu, Yue, and Tian Xie. 2018. Forecasting Foreign Exchange Realized Volatility: A Least Squares Model Averaging Approach. Journal of Systems Science and Mathematical Sciences 38: 725–44.
  30. Qiu, Yue, Xinyu Zhang, Tian Xie, and Shangwei Zhao. 2019. Versatile HAR model for realized volatility: A least square model averaging perspective. Journal of Management Science and Engineering 4: 55–73.
  31. Wan, Alan TK, Xinyu Zhang, and Guohua Zou. 2010. Least Squares Model Averaging by Mallows Criterion. Journal of Econometrics 156: 277–83.
  32. Xie, Tian. 2015. Prediction Model Averaging Estimator. Economics Letters 131: 5–8.
  33. Xie, Tian. 2017. Heteroscedasticity-robust Model Screening: A Useful Toolkit for Model Averaging in Big Data Analytics. Economics Letters 151: 119–22.
  34. Yuan, Zheng, and Yuhong Yang. 2005. Combining Linear Regression Models: When and How? Journal of the American Statistical Association 100: 1202–14.
  35. Zhang, Xinyu, Alan TK Wan, and Guohua Zou. 2013. Model averaging by jackknife criterion in models with dependent data. Journal of Econometrics 174: 82–94.
  36. Zhao, Shangwei, Xinyu Zhang, and Yichen Gao. 2016. Model Averaging with Averaging Covariance Matrix. Economics Letters 145: 214–17.
1.
The CME Group Inc. (Chicago Mercantile Exchange & Chicago Board of Trade) in December 2017 launched Bitcoin future (XBT), with Bitcoin as the underlying asset.
2.
This phenomenon has been documented by Dacorogna et al. (1993) and Andersen et al. (2001b) for the foreign exchange market and by Andersen et al. (2001a) for stock market returns.
3.
ARFIMA was originally designed to model time series with long memory. It is now a popular tool for modeling volatility, since volatility exhibits long memory.
4.
Corsi et al. (2012) provided a comprehensive review of the development of HAR-type models and their various extensions.
5.
Müller et al. (1993) referred to this interpretation as the Heterogeneous Market Hypothesis.
6.
The ratio statistic is defined as
$$
Z_t(\Delta) = \Delta^{-1/2} \times \frac{\left[ RV_t(\Delta) - BPV_t(\Delta) \right] RV_t(\Delta)^{-1}}{\left[ \left( \mu_1^{-4} + 2\mu_1^{-2} - 5 \right) \max\left\{ 1, \; TQ_t(\Delta)\, BPV_t(\Delta)^{-2} \right\} \right]^{1/2}},
$$
where $\Delta$ denotes the sampling interval of the intraday returns (a smaller $\Delta$ means more finely sampled returns), $\mu_1 = E(|Z|)$ denotes the mean of the absolute value of a standard normally distributed random variable $Z$, and $TQ_t$ is the standardized realized tripower quarticity measure:
$$
TQ_t(\Delta) = \Delta^{-1} \mu_{4/3}^{-3} \sum_{j=3}^{1/\Delta} \left| r_{t+j\cdot\Delta,\Delta} \right|^{4/3} \left| r_{t+(j-1)\cdot\Delta,\Delta} \right|^{4/3} \left| r_{t+(j-2)\cdot\Delta,\Delta} \right|^{4/3},
$$
with $\mu_{4/3} = E(|Z|^{4/3})$.
7.
Although all the elements in $x_t$ are $h$-period lags from period $t$, we follow the conventional notation in time series and denote $x_t$ as the explanatory variable corresponding to the period-$t$ dependent variable.
8.
Corsi et al. (2008) also demonstrated that the residuals of commonly used realized volatility models for the S&P 500 index exhibit non-Gaussianity and volatility clustering. They assessed its relevance for modeling and forecasting volatility in the proposed HAR-GARCH model.
9.
Additional results using both the GARCH$(1, 1)$ and the ARFIMA$(p, d, q)$ models are available upon request. These estimators performed poorly relative to the HAR model and are therefore omitted to save space.
10.
We have to exclude $RV_t^{(1)}$ because the semi-variance terms sum to $RV_t^{(1)}$, so including all of them would make the regressors perfectly collinear.
11.
We also tried the 10-fold cross-validation and fixed tuning parameter log n log ( k 1 ) n . The results remain qualitatively intact.
12.
Giacomini and White (2006) proposed a framework for out-of-sample predictive ability testing and forecast selection designed for use in the realistic situation in which the forecasting model is possibly misspecified due to unmodeled dynamics, unmodeled heterogeneity, incorrect functional form, or any combination of these. The null hypothesis of the GW test is that the two models being compared are equally accurate on average under a given criterion.
13.
Note that the forecasting horizons considered in this paper are all short. The HAR-type models on which our model-averaging methods build do not perform well at long forecasting horizons. One possible explanation is that the Bitcoin market is relatively small compared to conventional stock markets; therefore, it is more sensitive to various policy shocks, information impact, and even social media sentiment changes. Most of these shocks are short-lived, and it seems that the momentum effect does not last long in Bitcoin realized volatility. How to model Bitcoin volatility at long forecasting horizons is beyond the scope of this paper and warrants future research.
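The ratio statistic in footnote 6 can be computed from one day of intraday returns as in the minimal sketch below. Two points are assumptions rather than statements from the text: realized variance is taken as the sum of squared intraday returns, and bipower variation as $\mu_1^{-2}$ times the sum of products of adjacent absolute returns (the standard definitions, without finite-sample corrections). The function name and the input data are hypothetical.

```python
import math

# Absolute moments of a standard normal: E|Z|^p = 2^(p/2) * Gamma((p+1)/2) / sqrt(pi).
MU_1 = math.sqrt(2.0 / math.pi)
MU_43 = 2.0 ** (2.0 / 3.0) * math.gamma(7.0 / 6.0) / math.gamma(0.5)


def jump_ratio_statistic(returns):
    """Ratio jump statistic Z_t(Delta) for one day of intraday returns."""
    m = len(returns)
    if m < 3:
        raise ValueError("need at least three intraday returns")
    delta = 1.0 / m  # sampling interval, so that 1/Delta returns cover the day
    a = [abs(r) for r in returns]

    # Assumed standard definitions of RV and BPV (not spelled out in the footnote).
    rv = sum(r * r for r in returns)
    bpv = sum(a[j] * a[j - 1] for j in range(1, m)) / MU_1 ** 2
    if rv == 0.0 or bpv == 0.0:
        raise ValueError("returns carry no variation")

    # Standardized tripower quarticity, as defined in footnote 6.
    tq = sum((a[j] * a[j - 1] * a[j - 2]) ** (4.0 / 3.0) for j in range(2, m))
    tq /= delta * MU_43 ** 3

    scale = (MU_1 ** -4 + 2.0 * MU_1 ** -2 - 5.0) * max(1.0, tq / bpv ** 2)
    return (rv - bpv) / rv / math.sqrt(delta * scale)
```

Large positive values of the statistic point to a jump on that day, since a jump inflates the realized variance much more than the bipower variation.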
Figure 1. BTC/USD price, realized variance, and volume on Binance.
Table 1. Descriptive statistics of the BTC/USD RV.
| Statistics | First Half | Second Half | Full Sample |
|---|---|---|---|
| Mean | 32.4200 | 12.3319 | 22.3760 |
| Median | 21.9565 | 6.4271 | 11.8865 |
| Maximum | 197.6081 | 115.6538 | 197.6081 |
| Minimum | 1.8285 | 0.5241 | 0.5241 |
| Std. Dev. | 33.7164 | 17.2047 | 28.5575 |
| Skewness | 2.4792 | 3.2186 | 2.9249 |
| Kurtosis | 10.6082 | 15.9842 | 14.3301 |
| Jarque–Bera | 0.0010 | 0.0010 | 0.0010 |
| ADF Test | 0.0010 | 0.0010 | 0.0010 |
Table 1 reports the sample mean, median, minimum, maximum, standard deviation, skewness, and kurtosis for the realized variance series of the BTC/USD returns. The p-values of the Jarque–Bera (JB) and the Augmented Dickey–Fuller (ADF) tests for RV are reported in order to test normality and stationarity, respectively. Note that, for JB and ADF test statistics that fall outside the tabulated critical values, we report the maximum (0.999) or minimum (0.001) p-values.
Table 2. List of heterogeneous autoregressive (HAR)-type estimators.
Panel A: Conventional Regressions

| No. | Method | Description |
|---|---|---|
| (1) | AR(1) | a simple autoregressive model |
| (2) | HAR-Full | the HAR model proposed in Corsi (2009) with $l = [1, 2, \ldots, 30]$, equivalent to a restricted AR(30) |
| (3) | HAR | the conventional HAR model proposed in Corsi (2009) with $l = [1, 7, 30]$ |
| (4) | HAR-J | the HAR model with jump component proposed in Andersen et al. (2007) |
| (5) | HAR-CJ | the HAR model with continuous jump component proposed in Andersen et al. (2007) |
| (6) | HAR-RS-I | the HAR model with semi-variance components (Type I) proposed in Patton and Sheppard (2015) |
| (7) | HAR-RS-II | the HAR model with semi-variance components (Type II) proposed in Patton and Sheppard (2015) |
| (8) | HAR-SJ-I | the HAR model with semi-variance and jump components (Type I) proposed in Patton and Sheppard (2015) |
| (9) | HAR-SJ-II | the HAR model with semi-variance and jump components (Type II) proposed in Patton and Sheppard (2015) |

Panel B: Methods Acknowledging Model Uncertainty

| No. | Method | Description |
|---|---|---|
| (10) | LASSO | the LASSO HAR method proposed in Audrino and Knaus (2016) |
| (11) | MAHAR | the model averaging HAR method proposed in Lehrer et al. (2018) |
| (12) | HRCP | the hetero-robust model-averaging method proposed in Liu and Okui (2013) |
| (13) | JMA | the jackknife model-averaging method discussed in Zhang et al. (2013) |
| (14) | H-MAHAR | the hetero-robust model averaging HAR method proposed in Qiu and Xie (2018) |
Table 2 lists all the HAR-type estimators included in the empirical exercise. For all the conventional HAR specifications without considering model uncertainty in Panel A, except the HAR-Full model (all the lagged covariates from 1 to 30), we set $l = [1, 7, 30]$. To build the candidate models for the model-averaging methods in Panel B, we take a general unrestricted model that includes all covariates in the HAR-Full model and only replace $RV_t^{(1)}$ by the semi-variance components from HAR-RS-I.
Table 3. Out-of-sample forecast comparison for the BTC/USD RV.
Panel A: Conventional Regressions

| Method | MSFE | MAFE | SDFE | Pseudo $R^2$ |
|---|---|---|---|---|
| AR(1) | 239.1504 | 10.0717 | 15.4645 | 0.4106 |
| HAR-Full | 302.3662 | 10.8925 | 17.3887 | 0.2548 |
| HAR | 204.6532 | 8.3302 | 14.3057 | 0.4956 |
| HAR-J | 208.7348 | 8.5570 | 14.4477 | 0.4856 |
| HAR-CJ | 215.9540 | 8.3766 | 14.6954 | 0.4678 |
| HAR-RS-I | 193.2083 | 8.1705 | 13.8999 | 0.5238 |
| HAR-RS-II | 197.3354 | 8.2618 | 14.0476 | 0.5137 |
| HAR-SJ-I | 193.7362 | 8.2167 | 13.9189 | 0.5225 |
| HAR-SJ-II | 201.1249 | 8.3640 | 14.1819 | 0.5043 |

Panel B: Methods Acknowledging Model Uncertainty

| Method | MSFE | MAFE | SDFE | Pseudo $R^2$ |
|---|---|---|---|---|
| LASSO | 247.8799 | 8.2628 | 15.7442 | 0.3891 |
| MAHAR | 191.9673 | 7.1735 | 13.8552 | 0.5269 |
| HRCP | 196.8785 | 7.3539 | 14.0313 | 0.5148 |
| JMA | 191.9862 | 7.1772 | 13.8559 | 0.5269 |
| H-MAHAR | 191.3624 | 7.1621 | 13.8334 | 0.5284 |
Table 3 compares the out-of-sample performance of the H-MAHAR estimator relative to its comparison methods. The sample period for the Bitcoin RV spans from 1 January 2018 to 20 December 2018 (a total of 352 observations). We use a rolling window of 100 observations to estimate the coefficients of all the models and evaluate the out-of-sample forecast performance at $h = 1$. Bold numbers indicate the best performing model by each criterion.
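The rolling-window evaluation described in the note to Table 3 can be sketched as follows. The sketch uses an AR(1) fitted by OLS as a stand-in forecaster, and the metric definitions are assumptions rather than the paper's stated formulas: MSFE and MAFE as the mean squared and mean absolute forecast errors, SDFE as the sample standard deviation of the forecast errors, and the pseudo $R^2$ as one minus the ratio of the sum of squared forecast errors to the total sum of squares of the realized values. Function names and data are hypothetical.

```python
import math


def ar1_forecast(window):
    """Fit y_t = a + b * y_{t-1} by OLS on `window` and forecast the next value."""
    x, y = window[:-1], window[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0.0 else 0.0
    a = mean_y - b * mean_x
    return a + b * window[-1]


def rolling_forecasts(series, window_len):
    """One-step-ahead forecasts, re-estimating on the last `window_len` points."""
    actual, predicted = [], []
    for t in range(window_len, len(series)):
        predicted.append(ar1_forecast(series[t - window_len : t]))
        actual.append(series[t])
    return actual, predicted


def forecast_metrics(actual, predicted):
    """MSFE, MAFE, SDFE and pseudo R^2 under the assumed definitions."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    if n < 2:
        raise ValueError("need at least two forecasts")
    sse = sum(e * e for e in errors)
    mean_e = sum(errors) / n
    mean_a = sum(actual) / n
    return {
        "MSFE": sse / n,
        "MAFE": sum(abs(e) for e in errors) / n,
        "SDFE": math.sqrt(sum((e - mean_e) ** 2 for e in errors) / (n - 1)),
        "PseudoR2": 1.0 - sse / sum((a - mean_a) ** 2 for a in actual),
    }


# Toy series with a linear trend, which an AR(1) with intercept fits exactly.
series = [2.0 * t + 1.0 for t in range(20)]
actual, predicted = rolling_forecasts(series, window_len=5)
metrics = forecast_metrics(actual, predicted)
```

With the paper's sample, `window_len` would be 100 and the series would be the daily realized variance; the stand-in forecaster would be replaced by each method in Table 2.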
Table 4. Results of the Giacomini–White test for h = 1 .
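Panel A: Conventional Regressions

| Method | AR(1) | Full | HAR | J | CJ | RS-I | RS-II | SJ-I | SJ-II | LASSO | MAHAR | HRCP | JMA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AR(1) | - | - | - | - | - | - | - | - | - | - | - | - | - |
| HAR-Full | 0.1617 | - | - | - | - | - | - | - | - | - | - | - | - |
| HAR | 0.0000 | 0.0000 | - | - | - | - | - | - | - | - | - | - | - |
| HAR-J | 0.0000 | 0.0000 | 0.0888 | - | - | - | - | - | - | - | - | - | - |
| HAR-CJ | 0.0000 | 0.0000 | 0.8449 | 0.4070 | - | - | - | - | - | - | - | - | - |
| HAR-RS-I | 0.0000 | 0.0000 | 0.4376 | 0.1074 | 0.4895 | - | - | - | - | - | - | - | - |
| HAR-RS-II | 0.0000 | 0.0000 | 0.7418 | 0.2239 | 0.7047 | 0.1137 | - | - | - | - | - | - | - |
| HAR-SJ-I | 0.0000 | 0.0000 | 0.5839 | 0.1283 | 0.5811 | 0.3202 | 0.5546 | - | - | - | - | - | - |
| HAR-SJ-II | 0.0000 | 0.0000 | 0.8739 | 0.4253 | 0.9667 | 0.0969 | 0.4063 | 0.1413 | - | - | - | - | - |

Panel B: Methods Acknowledging Model Uncertainty

| Method | AR(1) | Full | HAR | J | CJ | RS-I | RS-II | SJ-I | SJ-II | LASSO | MAHAR | HRCP | JMA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LASSO | 0.0009 | 0.0000 | 0.8632 | 0.4805 | 0.7957 | 0.8245 | 0.9980 | 0.9113 | 0.8042 | - | - | - | - |
| MAHAR | 0.0000 | 0.0000 | 0.0001 | 0.0000 | 0.0007 | 0.0003 | 0.0001 | 0.0002 | 0.0001 | 0.0029 | - | - | - |
| HRCP | 0.0000 | 0.0000 | 0.0008 | 0.0001 | 0.0037 | 0.0035 | 0.0018 | 0.0023 | 0.0012 | 0.0150 | 0.0255 | - | - |
| JMA | 0.0000 | 0.0000 | 0.0001 | 0.0000 | 0.0007 | 0.0003 | 0.0001 | 0.0002 | 0.0001 | 0.0030 | 0.5774 | 0.0251 | - |
| H-MAHAR | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0006 | 0.0002 | 0.0001 | 0.0001 | 0.0001 | 0.0024 | 0.3481 | 0.0227 | 0.3119 |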
The modified Giacomini–White test (Giacomini and White 2006) is implemented to test the null hypothesis that the row method (in vertical headings) performs as well as the column method (in horizontal headings) in terms of the absolute forecast error. The corresponding p-values are reported in Panels A and B of Table 4.
Table 5. Top 5 models from the heteroskedasticity-robust model averaging HAR (H-MAHAR) estimator.
Model 1Model 2Model 3Model 4Model 5
Weight0.34410.33550.25460.04880.0170
Panel A: HAR-RS Components
RV t +
RV t xxxxx
Panel B: Selected HAR Covariates
RV t ( 15 ) x xx
RV t ( 16 ) x xx
RV t ( 18 ) xx
RV t ( 22 ) x x
RV t ( 23 ) x x
RV t ( 28 ) x xxx
RV t ( 29 ) xxxxx
RV t ( 30 ) xxxxx
Table 5 describes the models that are assigned the five highest weights by the H-MAHAR estimator. Note that x denotes that the explanatory variable is included in the specific model.
Table 6. Forecast performance comparison for various horizons.
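Panel A: Conventional Regressions

| Method | MSFE ($h=2$) | MAFE ($h=2$) | SDFE ($h=2$) | Pseudo $R^2$ ($h=2$) | MSFE ($h=3$) | MAFE ($h=3$) | SDFE ($h=3$) | Pseudo $R^2$ ($h=3$) | MSFE ($h=4$) | MAFE ($h=4$) | SDFE ($h=4$) | Pseudo $R^2$ ($h=4$) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AR(1) | 262.9289 | 10.3286 | 16.2151 | 0.3062 | 277.6988 | 10.6643 | 16.6643 | 0.2643 | 283.2303 | 10.9911 | 16.8294 | 0.2502 |
| HAR-Full | 334.5891 | 11.6691 | 18.2918 | 0.1171 | 346.2303 | 12.1207 | 18.6073 | 0.0828 | 346.7523 | 12.2453 | 18.6213 | 0.0821 |
| HAR | 224.2440 | 8.8998 | 14.9748 | 0.4083 | 234.6886 | 9.3141 | 15.3196 | 0.3783 | 241.8037 | 9.2817 | 15.5500 | 0.3599 |
| HAR-J | 226.2970 | 8.9210 | 15.0432 | 0.4028 | 235.0144 | 9.2912 | 15.3302 | 0.3774 | 245.8197 | 9.5108 | 15.6786 | 0.3493 |
| HAR-CJ | 221.2180 | 8.8997 | 14.8734 | 0.4162 | 223.1287 | 9.1104 | 14.9375 | 0.4089 | 244.7930 | 9.6451 | 15.6459 | 0.3520 |
| HAR-RS-I | 226.5678 | 8.9060 | 15.0522 | 0.4021 | 258.0224 | 9.5877 | 16.0631 | 0.3164 | 230.6331 | 9.0845 | 15.1866 | 0.3895 |
| HAR-RS-II | 231.1408 | 9.0014 | 15.2033 | 0.3901 | 262.3614 | 9.7318 | 16.1976 | 0.3050 | 242.0860 | 9.3867 | 15.5591 | 0.3591 |
| HAR-SJ-I | 228.7837 | 8.9241 | 15.1256 | 0.3963 | 260.7971 | 9.6530 | 16.1492 | 0.3091 | 232.6273 | 9.1173 | 15.2521 | 0.3842 |
| HAR-SJ-II | 233.0146 | 9.1070 | 15.2648 | 0.3851 | 290.9509 | 9.6781 | 17.0573 | 0.2292 | 239.4566 | 9.3162 | 15.4744 | 0.3661 |

Panel B: Methods Acknowledging Model Uncertainty

| Method | MSFE ($h=2$) | MAFE ($h=2$) | SDFE ($h=2$) | Pseudo $R^2$ ($h=2$) | MSFE ($h=3$) | MAFE ($h=3$) | SDFE ($h=3$) | Pseudo $R^2$ ($h=3$) | MSFE ($h=4$) | MAFE ($h=4$) | SDFE ($h=4$) | Pseudo $R^2$ ($h=4$) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LASSO | 265.2809 | 8.7407 | 16.2874 | 0.3000 | 270.9054 | 9.0213 | 16.4592 | 0.2823 | 270.9619 | 9.2020 | 16.4609 | 0.2827 |
| MAHAR | 216.7814 | 8.4566 | 14.7235 | 0.4280 | 228.6793 | 8.4375 | 15.1221 | 0.3942 | 225.0127 | 8.1343 | 15.0004 | 0.4043 |
| HRCP | 217.2918 | 8.4100 | 14.7408 | 0.4266 | 228.5923 | 8.3638 | 15.1193 | 0.3944 | 220.1249 | 7.9448 | 14.8366 | 0.4173 |
| JMA | 216.9337 | 8.4577 | 14.7287 | 0.4276 | 228.7295 | 8.4372 | 15.1238 | 0.3940 | 227.8901 | 8.1982 | 15.0960 | 0.3967 |
| H-MAHAR | 216.9262 | 8.4746 | 14.7284 | 0.4276 | 228.7536 | 8.4606 | 15.1246 | 0.3940 | 223.9671 | 8.1245 | 14.9655 | 0.4071 |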
Table 6 compares the out-of-sample performance of the H-MAHAR estimator relative to its comparison methods. The sample period for the Bitcoin RV spans from 1 January 2018 to 20 December 2018 (a total of 352 observations). We use a rolling window of 100 observations to estimate the coefficients of all the models and evaluate the out-of-sample forecast performance at $h = 2$, 3, and 4. The results for each $h$ are reported in the left, middle, and right blocks, respectively. Bold numbers indicate the best performing model by each criterion.
Table 7. Results of the Giacomini–White test for various forecast horizons.
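Panel A: $h = 2$

| Method | AR(1) | Full | HAR | J | CJ | RS-I | RS-II | SJ-I | SJ-II | LASSO | MAHAR | HRCP | JMA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AR(1) | - | - | - | - | - | - | - | - | - | - | - | - | - |
| HAR-Full | 0.0688 | - | - | - | - | - | - | - | - | - | - | - | - |
| HAR | 0.0008 | 0.0000 | - | - | - | - | - | - | - | - | - | - | - |
| HAR-J | 0.0013 | 0.0000 | 0.7804 | - | - | - | - | - | - | - | - | - | - |
| HAR-CJ | 0.0065 | 0.0000 | 0.9999 | 0.9478 | - | - | - | - | - | - | - | - | - |
| HAR-RS-I | 0.0011 | 0.0000 | 0.9548 | 0.9016 | 0.9848 | - | - | - | - | - | - | - | - |
| HAR-RS-II | 0.0033 | 0.0000 | 0.4109 | 0.5236 | 0.7577 | 0.1555 | - | - | - | - | - | - | - |
| HAR-SJ-I | 0.0011 | 0.0000 | 0.8446 | 0.9817 | 0.9418 | 0.6648 | 0.3559 | - | - | - | - | - | - |
| HAR-SJ-II | 0.0048 | 0.0000 | 0.2016 | 0.2599 | 0.5321 | 0.0556 | 0.4204 | 0.0425 | - | - | - | - | - |
| LASSO | 0.0149 | 0.0004 | 0.7522 | 0.7190 | 0.7794 | 0.7414 | 0.6079 | 0.7187 | 0.4705 | - | - | - | - |
| MAHAR | 0.0005 | 0.0000 | 0.2397 | 0.2196 | 0.2723 | 0.2467 | 0.1613 | 0.2397 | 0.1070 | 0.5903 | - | - | - |
| HRCP | 0.0003 | 0.0000 | 0.1952 | 0.1811 | 0.2413 | 0.2040 | 0.1304 | 0.1987 | 0.0868 | 0.5285 | 0.6363 | - | - |
| JMA | 0.0005 | 0.0000 | 0.2430 | 0.2231 | 0.2768 | 0.2500 | 0.1640 | 0.2428 | 0.1090 | 0.5930 | 0.9213 | 0.6168 | - |
| H-MAHAR | 0.0006 | 0.0000 | 0.2616 | 0.2404 | 0.2964 | 0.2685 | 0.1780 | 0.2606 | 0.1188 | 0.6154 | 0.3316 | 0.4998 | 0.2370 |

Panel B: $h = 3$

| Method | AR(1) | Full | HAR | J | CJ | RS-I | RS-II | SJ-I | SJ-II | LASSO | MAHAR | HRCP | JMA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AR(1) | - | - | - | - | - | - | - | - | - | - | - | - | - |
| HAR-Full | 0.0974 | - | - | - | - | - | - | - | - | - | - | - | - |
| HAR | 0.0121 | 0.0000 | - | - | - | - | - | - | - | - | - | - | - |
| HAR-J | 0.0149 | 0.0000 | 0.8366 | - | - | - | - | - | - | - | - | - | - |
| HAR-CJ | 0.0281 | 0.0001 | 0.6213 | 0.6402 | - | - | - | - | - | - | - | - | - |
| HAR-RS-I | 0.0477 | 0.0001 | 0.2135 | 0.3144 | 0.3559 | - | - | - | - | - | - | - | - |
| HAR-RS-II | 0.0979 | 0.0002 | 0.0511 | 0.1174 | 0.2237 | 0.1753 | - | - | - | - | - | - | - |
| HAR-SJ-I | 0.0580 | 0.0002 | 0.1491 | 0.2454 | 0.3068 | 0.0937 | 0.4849 | - | - | - | - | - | - |
| HAR-SJ-II | 0.0933 | 0.0005 | 0.2751 | 0.3349 | 0.3545 | 0.5870 | 0.7948 | 0.8733 | - | - | - | - | - |
| LASSO | 0.0410 | 0.0014 | 0.6278 | 0.6430 | 0.9034 | 0.3877 | 0.2914 | 0.3399 | 0.3513 | - | - | - | - |
| MAHAR | 0.0011 | 0.0000 | 0.0393 | 0.0431 | 0.1216 | 0.0218 | 0.0102 | 0.0174 | 0.0327 | 0.3106 | - | - | - |
| HRCP | 0.0007 | 0.0000 | 0.0236 | 0.0257 | 0.0872 | 0.0136 | 0.0059 | 0.0108 | 0.0230 | 0.2573 | 0.4987 | - | - |
| JMA | 0.0011 | 0.0000 | 0.0402 | 0.0443 | 0.1260 | 0.0223 | 0.0105 | 0.0178 | 0.0333 | 0.3115 | 0.9892 | 0.4993 | - |
| H-MAHAR | 0.0012 | 0.0000 | 0.0471 | 0.0515 | 0.1411 | 0.0258 | 0.0124 | 0.0206 | 0.0372 | 0.3333 | 0.4041 | 0.3999 | 0.2248 |

Panel C: $h = 4$

| Method | AR(1) | Full | HAR | J | CJ | RS-I | RS-II | SJ-I | SJ-II | LASSO | MAHAR | HRCP | JMA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AR(1) | - | - | - | - | - | - | - | - | - | - | - | - | - |
| HAR-Full | 0.1863 | - | - | - | - | - | - | - | - | - | - | - | - |
| HAR | 0.0063 | 0.0000 | - | - | - | - | - | - | - | - | - | - | - |
| HAR-J | 0.0150 | 0.0000 | 0.1822 | - | - | - | - | - | - | - | - | - | - |
| HAR-CJ | 0.0272 | 0.0006 | 0.3321 | 0.6941 | - | - | - | - | - | - | - | - | - |
| HAR-RS-I | 0.0022 | 0.0000 | 0.2372 | 0.0900 | 0.1312 | - | - | - | - | - | - | - | - |
| HAR-RS-II | 0.0109 | 0.0001 | 0.6134 | 0.6596 | 0.5091 | 0.0238 | - | - | - | - | - | - | - |
| HAR-SJ-I | 0.0021 | 0.0000 | 0.2823 | 0.0800 | 0.1454 | 0.4460 | 0.0413 | - | - | - | - | - | - |
| HAR-SJ-II | 0.0038 | 0.0000 | 0.8608 | 0.4391 | 0.3665 | 0.0606 | 0.5947 | 0.0570 | - | - | - | - | - |
| LASSO | 0.0723 | 0.0047 | 0.9119 | 0.6823 | 0.5731 | 0.8570 | 0.7716 | 0.8976 | 0.8607 | - | - | - | - |
| MAHAR | 0.0003 | 0.0000 | 0.0242 | 0.0067 | 0.0027 | 0.0370 | 0.0057 | 0.0317 | 0.0122 | 0.0649 | - | - | - |
| HRCP | 0.0004 | 0.0000 | 0.0118 | 0.0027 | 0.0009 | 0.0172 | 0.0046 | 0.0149 | 0.0061 | 0.0484 | 0.5237 | - | - |
| JMA | 0.0004 | 0.0000 | 0.0385 | 0.0124 | 0.0051 | 0.0593 | 0.0100 | 0.0516 | 0.0207 | 0.0866 | 0.1211 | 0.4208 | - |
| H-MAHAR | 0.0002 | 0.0000 | 0.0200 | 0.0049 | 0.0025 | 0.0331 | 0.0055 | 0.0281 | 0.0112 | 0.0673 | 0.8426 | 0.5390 | 0.3280 |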
The modified Giacomini–White test (Giacomini and White 2006) is implemented to test the null hypothesis that the row method (in vertical headings) performs as well as the column method (in horizontal headings) in terms of the absolute forecast error. The corresponding p-values for each $h$ are reported in Panels A to C of Table 7.
Table 8. Forecast performance comparison under different window lengths.
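Panel A: Conventional Regressions

| Method | MSFE ($L=50$) | MAFE ($L=50$) | SDFE ($L=50$) | Pseudo $R^2$ ($L=50$) | MSFE ($L=150$) | MAFE ($L=150$) | SDFE ($L=150$) | Pseudo $R^2$ ($L=150$) |
|---|---|---|---|---|---|---|---|---|
| AR(1) | 249.6477 | 10.0160 | 15.8002 | 0.5658 | 238.3546 | 9.9032 | 15.4387 | 0.4277 |
| HAR-Full | 1637.2399 | 22.4248 | 40.4628 | −1.8473 | 243.7945 | 9.6486 | 15.6139 | 0.4147 |
| HAR | 276.8757 | 10.5585 | 16.6396 | 0.5185 | 212.9744 | 8.2480 | 14.5936 | 0.4887 |
| HAR-J | 293.4600 | 10.7562 | 17.1307 | 0.4896 | 210.7548 | 8.2194 | 14.5174 | 0.4940 |
| HAR-CJ | 374.7990 | 11.8093 | 19.3597 | 0.3482 | 208.9187 | 8.2310 | 14.4540 | 0.4984 |
| HAR-RS-I | 283.6314 | 10.6281 | 16.8414 | 0.5067 | 201.2067 | 8.1321 | 14.1847 | 0.5169 |
| HAR-RS-II | 295.0338 | 10.7334 | 17.1765 | 0.4869 | 204.8401 | 8.1870 | 14.3122 | 0.5082 |
| HAR-SJ-I | 284.4146 | 10.6127 | 16.8646 | 0.5054 | 200.6347 | 8.1156 | 14.1646 | 0.5183 |
| HAR-SJ-II | 299.5199 | 10.9971 | 17.3066 | 0.4791 | 206.0011 | 8.2656 | 14.3527 | 0.5054 |

Panel B: Methods Acknowledging Model Uncertainty

| Method | MSFE ($L=50$) | MAFE ($L=50$) | SDFE ($L=50$) | Pseudo $R^2$ ($L=50$) | MSFE ($L=150$) | MAFE ($L=150$) | SDFE ($L=150$) | Pseudo $R^2$ ($L=150$) |
|---|---|---|---|---|---|---|---|---|
| LASSO | 806.0115 | 15.7427 | 28.3903 | −0.4017 | 252.3836 | 7.7437 | 15.8866 | 0.3940 |
| MAHAR | 255.5945 | 9.3474 | 15.9873 | 0.5555 | 200.1770 | 7.3587 | 14.1484 | 0.5194 |
| HRCP | 298.5371 | 10.1002 | 17.2782 | 0.4808 | 203.2551 | 7.4930 | 14.2568 | 0.5120 |
| JMA | 257.5357 | 9.4021 | 16.0479 | 0.5521 | 200.4459 | 7.3838 | 14.1579 | 0.5187 |
| H-MAHAR | 252.6259 | 9.2245 | 15.8942 | 0.5607 | 199.8410 | 7.3540 | 14.1365 | 0.5202 |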
Table 8 compares the out-of-sample performance of the H-MAHAR estimator relative to its comparison methods. The sample period for the Bitcoin RV spans from 1 January 2018 to 20 December 2018 (a total of 352 observations). We consider alternative rolling window lengths of 50 and 150 observations to estimate the coefficients of all the models and evaluate the out-of-sample forecast performance at $h = 1$. The results for each $L$ are reported in the left and right blocks, respectively. Bold numbers indicate the best performing model by each criterion.
Table 9. Results of the Giacomini–White test for different window lengths.
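Panel A: $L = 50$

| Method | AR(1) | Full | HAR | J | CJ | RS-I | RS-II | SJ-I | SJ-II | LASSO | MAHAR | HRCP | JMA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AR(1) | - | - | - | - | - | - | - | - | - | - | - | - | - |
| HAR-Full | 0.0000 | - | - | - | - | - | - | - | - | - | - | - | - |
| HAR | 0.1474 | 0.0000 | - | - | - | - | - | - | - | - | - | - | - |
| HAR-J | 0.0681 | 0.0000 | 0.4518 | - | - | - | - | - | - | - | - | - | - |
| HAR-CJ | 0.0034 | 0.0000 | 0.0321 | 0.0603 | - | - | - | - | - | - | - | - | - |
| HAR-RS-I | 0.1389 | 0.0000 | 0.8267 | 0.7392 | 0.0471 | - | - | - | - | - | - | - | - |
| HAR-RS-II | 0.0890 | 0.0000 | 0.6123 | 0.9521 | 0.0642 | 0.5173 | - | - | - | - | - | - | - |
| HAR-SJ-I | 0.1486 | 0.0000 | 0.8691 | 0.7025 | 0.0448 | 0.7674 | 0.4758 | - | - | - | - | - | - |
| HAR-SJ-II | 0.0244 | 0.0000 | 0.1869 | 0.4996 | 0.1604 | 0.1067 | 0.3281 | 0.1017 | - | - | - | - | - |
| LASSO | 0.0000 | 0.0015 | 0.0000 | 0.0000 | 0.0009 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | - | - | - | - |
| MAHAR | 0.1109 | 0.0000 | 0.0062 | 0.0016 | 0.0001 | 0.0043 | 0.0022 | 0.0051 | 0.0003 | 0.0000 | - | - | - |
| HRCP | 0.8760 | 0.0000 | 0.3767 | 0.2145 | 0.0160 | 0.3150 | 0.2315 | 0.3332 | 0.0954 | 0.0000 | 0.0045 | - | - |
| JMA | 0.1450 | 0.0000 | 0.0089 | 0.0024 | 0.0002 | 0.0062 | 0.0032 | 0.0073 | 0.0005 | 0.0000 | 0.0783 | 0.0056 | - |
| H-MAHAR | 0.0552 | 0.0000 | 0.0024 | 0.0005 | 0.0001 | 0.0015 | 0.0007 | 0.0019 | 0.0001 | 0.0000 | 0.1164 | 0.0059 | 0.0532 |

Panel B: $L = 150$

| Method | AR(1) | Full | HAR | J | CJ | RS-I | RS-II | SJ-I | SJ-II | LASSO | MAHAR | HRCP | JMA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AR(1) | - | - | - | - | - | - | - | - | - | - | - | - | - |
| HAR-Full | 0.6126 | - | - | - | - | - | - | - | - | - | - | - | - |
| HAR | 0.0000 | 0.0003 | - | - | - | - | - | - | - | - | - | - | - |
| HAR-J | 0.0000 | 0.0003 | 0.7912 | - | - | - | - | - | - | - | - | - | - |
| HAR-CJ | 0.0002 | 0.0013 | 0.9390 | 0.9513 | - | - | - | - | - | - | - | - | - |
| HAR-RS-I | 0.0001 | 0.0004 | 0.5688 | 0.6630 | 0.7045 | - | - | - | - | - | - | - | - |
| HAR-RS-II | 0.0001 | 0.0008 | 0.7840 | 0.8864 | 0.8770 | 0.4173 | - | - | - | - | - | - | - |
| HAR-SJ-I | 0.0001 | 0.0004 | 0.5361 | 0.5855 | 0.6460 | 0.7331 | 0.4473 | - | - | - | - | - | - |
| HAR-SJ-II | 0.0001 | 0.0012 | 0.9318 | 0.8167 | 0.8949 | 0.1200 | 0.4995 | 0.0689 | - | - | - | - | - |
| LASSO | 0.0000 | 0.0005 | 0.1911 | 0.2364 | 0.3101 | 0.3522 | 0.2891 | 0.3814 | 0.2050 | - | - | - | - |
| MAHAR | 0.0000 | 0.0000 | 0.0014 | 0.0015 | 0.0028 | 0.0033 | 0.0024 | 0.0043 | 0.0007 | 0.3147 | - | - | - |
| HRCP | 0.0000 | 0.0000 | 0.0068 | 0.0079 | 0.0114 | 0.0158 | 0.0118 | 0.0196 | 0.0038 | 0.5165 | 0.0173 | - | - |
| JMA | 0.0000 | 0.0000 | 0.0018 | 0.0021 | 0.0037 | 0.0045 | 0.0033 | 0.0057 | 0.0009 | 0.3481 | 0.0001 | 0.0463 | - |
| H-MAHAR | 0.0000 | 0.0000 | 0.0013 | 0.0015 | 0.0027 | 0.0031 | 0.0023 | 0.0040 | 0.0006 | 0.3081 | 0.6172 | 0.0204 | 0.0166 |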
The modified Giacomini–White test (Giacomini and White 2006) is implemented to test the null hypothesis that the row method (in vertical headings) performs as well as the column method (in horizontal headings) in terms of the absolute forecast error. The corresponding p-values for each $L$ are reported in Panels A and B of Table 9.

Share and Cite

MDPI and ACS Style

Xie, T. Forecast Bitcoin Volatility with Least Squares Model Averaging. Econometrics 2019, 7, 40. https://doi.org/10.3390/econometrics7030040
