1. Introduction
Estimating stock price volatility is difficult because of nonlinear patterns in the data. Early interpretation of stock price volatility helps traders make more profit. In finance, volatility is a statistic that measures the rate of change in the stock price over time, and it is calculated using the standard deviation. Volatility statistics help the investor to estimate the risk in a stock or stock index. When volatility is very high, investing is riskier. Identifying volatility is therefore essential as far as the stock market is concerned.
In the literature, stock market volatility is estimated using autoregressive conditional heteroscedasticity (ARCH) [1,2]. The generalized ARCH (GARCH) model has been proposed to capture the different families of conditional volatility [3]. The GARCH model is useful when the variance of the stock price is not constant [4]. Most of the work has used the GARCH model to estimate the volatility in the stock market [5,6,7]. However, the GARCH model does not capture the different variations of volatility periods, because the GARCH parameters alpha and beta are restricted to less than one [8,9,10,11,12]. When volatility is higher, these parameter values can go beyond one, so the model fails to capture the higher volatility. A solution to this problem is to allow the GARCH model’s parameters to vary over time according to a Markov process [13,14,15,16,17]. Therefore, in this work, we consider regime switching based on the Markov-switching GARCH (MSGARCH) and Self-Exciting Threshold Autoregressive (SETAR) models instead of the plain GARCH model. The contribution of this paper is to capture the dynamic volatility in time series data using the MSGARCH and SETAR models and to improve the forecasting results. To the best of our knowledge, this is the first empirical study on Indian stock market data based on regime-switching models.
2. Literature Review
The literature on volatility forecasting in finance is extensive, and most empirical studies have used ARCH and GARCH models for this purpose. The ARCH model was proposed in 1982 [1] and the GARCH model in 1986 [3]. The GARCH model has proved to be a helpful technique when financial data exhibit conditional variance [18]. The linear regression model is applied when the data are homoscedastic, that is, when the variance of the error is constant [19]. However, when the data are nonlinear, the variance of the error term is not constant over time. In such cases, heteroscedasticity is a crucial characteristic in developing a financial time series model.
Ariyo et al. [20] used the Autoregressive Integrated Moving Average (ARIMA) model to predict stock prices, with Nigeria and New York stock exchange data in the experiments. The ARIMA model predicts future values from the current and past values of the time series, and it is widely used to find linear relationships in time-series applications [21]. However, most researchers found that the ARIMA model cannot identify nonlinear patterns in data. Therefore, many methods used SVM and ANN instead of the ARIMA model [22]. Pai and Lin [22] proposed a hybrid ARIMA model for stock price prediction that combines the ARIMA and SVM models. The residuals are obtained using the ARIMA method and are then given as input to the SVM for stock price forecasting. Blair et al. [23] used the ARCH model to compare volatility estimates from daily returns and from the VIX. The study concludes that volatility forecasts based on the VIX outperform those based on daily returns.
Atoi [24] considered the error distribution in evaluating market volatility. This work used the GARCH, PGARCH, and EGARCH models to estimate the volatility, and Normal, Student’s t, and generalized normal distributions were used to analyze the error distribution in the models. The study showed that the error distribution is one of the essential factors in improving model performance. Nigerian stock data from 2008 to 2013 were used in the experiments. The performance of the models was evaluated using the RMSE metric. In this study, the PGARCH model performed better than the GARCH and EGARCH models.
Daily stock return data were used to evaluate the volatility of stock prices [25]. The data were collected from the Romanian stock index from 2001 to 2012. In this work, different members of the GARCH family were considered to forecast the stock returns. The TGARCH and PGARCH models performed better than GARCH and IGARCH. Model selection was carried out using the AIC and log-likelihood metrics. The forecast returns of the GARCH models were evaluated using RMSE, MAE, and MAPE.
Molnár [26] proposed the range-GARCH model instead of the GARCH model. Here, the intraday range of a stock, the difference between its highest and lowest prices, is used in the range-GARCH model. The experiments considered 30 stocks and six indices of Dow Jones stock market data, with around 4423 samples collected from 1992 to 2010. The range-GARCH model performed better than the standard GARCH model. Model selection was based on the AIC metric. The one-day-ahead price is forecasted, and the model’s performance is evaluated using the RMSE metric.
The GARCH and ARCH models were considered for estimating the variance in stock prices [27]. The residuals of each model were examined using a correlogram. The study found that ARCH(1) could not capture the ARCH effects in its residuals, because the residuals were generated using the mean equation ARIMA(1,1,0). In this work, ARIMA(1,1,0)-GARCH(1,1) captured the ARCH effect in its residuals, and the study concludes that GARCH performs better than the ARCH model. The AIC and BIC criteria were used to select the best model. The daily closing price of Muslim Commercial Bank was considered in the experiments. The RMSE, MAE, and MAPE metrics were used to evaluate the stock price forecasts.
The volatility of the S&P 500 index was estimated using three GARCH models, namely GARCH, EGARCH, and GJR-GARCH [28]. AIC and BIC metrics were considered to select the best model. The performance of the EGARCH(3,3) model was better than that of GARCH(1,1) and GJR-GARCH(1,1). To enhance the model performance, the EGARCH model residuals are then given as input to an ANN model, which is trained with the back-propagation method. Seventy percent of the data was used for training, 20% for validation, and 10% for testing. The experiments considered S&P 500 data from 1998 to 2009. The performance of the model was evaluated using mean forecast error, RMSE, MAE, and MAPE.
Sharma et al. [29] studied volatility forecasting for daily stock indices using seven GARCH models. Twenty-one global market indices from 2000 to 2013 were considered in the experiments. The AIC metric was used to select the best model. The model parameters were estimated by maximum likelihood. Forecast performance was evaluated using the MSE and MAE metrics. The study found that the standard GARCH model performs better than TGARCH, EGARCH, AVGARCH, NGARCH, APARCH, and GJR.
The volatility of the exchange rate between the Bangladeshi and U.S. currencies is estimated using GARCH models [30]. The experiments considered data from 2008 to 2015. Normal and Student’s t distribution assumptions were considered in the GARCH models. The AR(2)–GARCH(1,1) model performed better than the EGARCH and TGARCH models.
A hybrid model is proposed to estimate the volatility of the gold price [31]. The hybrid model combines the ANN and GARCH models. The residuals are captured using the GARCH model and given as input to the ANN model for forecasting the price. The forecast performance for gold prices was evaluated using the MSE, RMSE, and MAE metrics. The model was trained using the backpropagation method. The results show that the hybrid model performs better than the GARCH model. The volatility of the copper price was estimated using a hybrid deep learning method [32] that combines the GARCH and RNN models.
Much research has indicated that GARCH models perform exceptionally well in volatility forecasting. GARCH(1,1) [33] was tested against the Swiss Index; the outcomes showed good parameter optimization for the returns. Studies have explored the impact of outliers in forecasting volatility [34], as well as different GARCH model versions [35,36,37]. The results from References [38,39,40] indicate that asymmetric extensions of GARCH perform better than the symmetric extensions. Reference [38] explores different forecast methods for asymmetric models. Reference [41] compares the GARCH-SGED and GARCH-N models to test their performance; the results showed that GARCH-SGED performed better than GARCH-N. Reference [42] studied the Israeli stock markets and indicated that asymmetric models provide better performance in forecasting volatility. Tan et al. [43] proposed the MSGARCH model to estimate the volatility in bitcoin prices.
In statistics, a structural break in the datasets leads to a large difference in forecasting errors. Allaro et al. [44] used Chow tests to identify structural breaks in the datasets. The Chow test checks whether the coefficients of two linear regressions on the same dataset are equal. In this method, the time-series dataset is split into two equal parts, and the coefficients of the two linear fits are compared to determine whether a structural break is present.
Caporale and Zekokh [45] investigated the volatility of cryptocurrencies using Markov-switching GARCH models. Standard GARCH models might produce incorrect predictions because of the high volatility. Therefore, the authors proposed regime switching based on the MSGARCH method. The experiments considered Coindesk Bitcoin data from 2010 to 2018. Chen et al. [46] proposed GARCH models for estimating the volatility of wind data. Yancheng wind farm data were used in the experiments. The wind power data were captured every 5 min, and around 2016 data samples were collected for the work.
Fakhfekh and Jeribi [47] investigated different types of GARCH models for volatility estimation. The work considered Student’s t and normal error distributions in the GARCH models. The AIC and BIC information criteria were used to evaluate the models. The EGARCH model performed better than the other models. CoinMarketCap data from August 2017 to December 2018 were used to carry out the experiments.
Sun and Yu [48] used the threshold GARCH method to analyze the effects of positive and negative news. The S&P 500 data were considered in the experiments. Trinh et al. [49] investigated the price volatility of Vietnamese government bonds using the GARCH, TGARCH, and EGARCH models. The study concludes that the GARCH model performed better than the TGARCH and EGARCH models. The experiments considered Vietnamese government bond data from 2006 to 2019.
Emenogu et al. [7] investigated the volatility of stock prices using nine variants of GARCH models. The NGARCH model performed better than the other models. Nigeria Plc data from 2001 to 2017 were considered for the experiments. Cao et al. [50] investigated the volatility of VIX options data using the GARCH model. OptionMetric VIX and SPX options data were collected from 2008 to 2012 for the experiments. Sapuric et al. [51] studied bitcoin volatility using the EGARCH model.
In most of the literature, stock price volatility is estimated using the GARCH model, as described in Table 1. Most of the work ignored regime changes in volatility estimation. Therefore, we have applied the MSGARCH and SETAR models to estimate volatility in this work.
4. Methodology
4.1. Structural Break
Structural breaks in the data lead to forecasting errors, so in this work we account for structural breaks in the data while forecasting the volatility [44]. The Chow test is used to test for a structural break in the stock datasets. In this method, the time-series dataset is split into two equal parts, and the coefficients of the two linear fits are compared to determine whether a structural break is present. Consider the linear regression equations in (2) and (3).

$$Y = C + \beta X + \varepsilon \tag{2}$$

$$Y_1 = C_1 + \beta_1 X_1 + \varepsilon_1 \tag{3}$$

Here, $Y$ and $Y_1$ are the dependent variables, $X$ and $X_1$ are the independent variables, $C$ and $C_1$ are the constants, $\beta$ and $\beta_1$ are the slopes of the lines, and $\varepsilon$ and $\varepsilon_1$ are the error terms of the regression models. We have used Equations (2) and (3) to fit the datasets. To check whether the two linear regression fits are similar, we define the null and alternative hypotheses as follows.
$H_0$: $C = C_1$ and $\beta = \beta_1$;
$H_1$: $C \neq C_1$ or $\beta \neq \beta_1$.
A p-value below 0.05 rejects the null hypothesis and indicates a structural break in the stock datasets. The experimental results show that there is a structural break in the datasets, as described in Figure 7 and Table 2.
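The two-part comparison behind the Chow test can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the experiments: it assumes a single regressor, a known split index, and a hypothetical toy series; the helper names `ols_rss` and `chow_f_statistic` are ours. It computes only the F statistic; the p-value would follow from the F distribution with (k, n - 2k) degrees of freedom.

```python
def ols_rss(x, y):
    """Fit y = c + b*x by least squares and return the residual sum of squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    c = mean_y - b * mean_x
    return sum((yi - (c + b * xi)) ** 2 for xi, yi in zip(x, y))


def chow_f_statistic(x, y, split):
    """Chow F statistic for a break at index `split` (k = 2 parameters per fit)."""
    k = 2
    rss_pooled = ols_rss(x, y)                      # one fit on the whole sample
    rss_split = ols_rss(x[:split], y[:split]) + ols_rss(x[split:], y[split:])
    dof = len(x) - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / dof)


# Hypothetical toy data (not from the paper): small deterministic noise.
x = list(range(20))
noise = [0.1 * (-1) ** t for t in range(20)]
y_stable = [1.0 + 0.5 * t + noise[t] for t in range(20)]
y_break = [1.0 + 0.5 * t + noise[t] for t in range(10)] + \
          [10.0 - 0.5 * t + noise[t] for t in range(10, 20)]

f_stable = chow_f_statistic(x, y_stable, split=10)
f_break = chow_f_statistic(x, y_break, split=10)
print(f"F without break: {f_stable:.3f}, F with break: {f_break:.1f}")
```

A large F statistic is evidence against equal coefficients in the two halves, which is how the test flags a structural break.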
4.2. Non-Regime Switching
Stock price data are dynamic. The Autoregressive Moving-Average (ARMA) model is useful for linear time series data [55,56,57], but ARMA models are not able to capture the dynamic volatility in time series data. Therefore, most of the work considered the GARCH model for dynamic volatility estimation [58,59,60,61,62]. The GARCH model is useful when the variance of time series data is not constant. The GARCH model is defined in Equations (4) and (5).

$$\varepsilon_t = \sigma_t z_t \tag{4}$$

$$\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2 \tag{5}$$

where:
$z_t$ is a random variable with zero mean and unit variance;
$\sigma_t$ ← time-varying volatility;
$\varepsilon_t$ ← unexpected return;
$\sigma_t^2$ ← predictable variance changing over time;
$\omega$ ← constant;
$\alpha$ ← constant;
$\beta$ ← constant.
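As a rough illustration of how the conditional variance of a GARCH(1,1) model evolves, the sketch below iterates the standard variance recursion. The parameter names `omega`, `alpha`, and `beta`, the starting value (the long-run variance), and the input series are assumptions made for this example; the numbers are hypothetical and are not estimates from the paper.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Return conditional variances of a GARCH(1,1) model for each time step.

    variance[t] = omega + alpha * residuals[t-1]**2 + beta * variance[t-1]
    """
    if alpha + beta >= 1:
        # The long-run variance omega / (1 - alpha - beta) is finite only here.
        raise ValueError("alpha + beta must be below 1")
    variances = [omega / (1.0 - alpha - beta)]   # assumed starting value
    for eps in residuals[:-1]:
        variances.append(omega + alpha * eps ** 2 + beta * variances[-1])
    return variances


# Hypothetical demeaned daily returns in percent (illustrative only).
residuals = [0.2, -0.1, 1.5, -2.0, 0.3, 0.1]
sigma2 = garch11_variance(residuals, omega=0.05, alpha=0.10, beta=0.85)
for t, v in enumerate(sigma2):
    print(t, round(v, 4))
```

The variance rises after the two large shocks (1.5 and -2.0) and then decays slowly, which is the volatility clustering that the model is designed to capture.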
It is found from the literature that GARCH models do not capture the variations of the volatility periods. Most GARCH models considered in the literature have one Autoregressive Conditional Heteroscedasticity (ARCH) term and one GARCH term, i.e., GARCH(1,1). However, the GARCH model ignores regime switching in volatility estimation, because the alpha and beta parameters of the GARCH model in Equation (5) can exceed one when a structural break is present. Therefore, we have considered regime switching using the MSGARCH and SETAR models.
4.3. Regime Switching Based on MSGARCH
Structural changes in time series data are referred to as regime switching. The overall proposed work is depicted in Figure 8. Regime switching is essential when there is higher volatility in stock prices. Therefore, we have applied a Markov-switching GARCH (MSGARCH) model to estimate the stock price volatility. In this work, we consider two MSGARCH models. The first model uses homogeneous MSGARCH regime switching, in which the GARCH conditional variance is considered. The normal distribution is used to analyze the error distribution in this model.
The second model uses heterogeneous MSGARCH regime switching, in which the GARCH, EGARCH, and TGARCH conditional variances are considered. Normal and Student t distributions are used to analyze the error distribution in the models.
In the proposed work, stock return data are given as input to the MSGARCH model for estimating the volatility. The calculation of stock returns is discussed in Section 3. In the MSGARCH model, the stock return at time $t$ is denoted $y_t$. We assume that $y_t$ has zero mean and is not serially correlated. The MSGARCH model is defined in Equation (6).

$$y_t \mid (s_t = k, I_{t-1}) \sim D(0, h_{k,t}, \xi_k) \tag{6}$$

where:
$D(0, h_{k,t}, \xi_k)$ ← continuous distribution;
$h_{k,t}$ ← conditional variance;
$k$ ← regime;
$\xi_k$ ← parameter vector of regime $k$;
$I_{t-1}$ ← stock return information set up to $t-1$.

$D(\cdot)$ denotes a continuous distribution with zero mean. The conditional variance in regime $k$ is denoted by $h_{k,t}$. The vector $\xi_k$ holds the regime-dependent parameters of the Markov-switching process. Switching from one regime to another evolves according to a first-order homogeneous Markov chain with state $s_t$. Here, $s_t$ is an integer-valued variable taking values in the discrete state space $\{1, \dots, K\}$; $s_t$ represents the current state, and $s_{t-1}$ is the previous state of the Markov process.
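To make the regime-switching mechanism concrete, the following sketch simulates returns from a two-regime Markov-switching GARCH(1,1) process. It is an illustration under assumptions, not the estimated model of this work: the transition matrix, the regime parameters, and the choice to run a separate variance recursion for each regime are all hypothetical, and the function name `simulate_ms_garch` is ours.

```python
import random


def simulate_ms_garch(n, transition, params, seed=0):
    """Simulate a two-regime Markov-switching GARCH(1,1) return series.

    transition[i][j]: probability of moving from regime i to regime j.
    params[k] = (omega, alpha, beta): GARCH parameters of regime k.
    Each regime keeps its own variance recursion (an assumed specification).
    """
    rng = random.Random(seed)
    state = 0
    variances = [w / (1.0 - a - b) for (w, a, b) in params]
    states, returns = [], []
    for _ in range(n):
        # First-order Markov chain: the next regime depends only on the current one.
        state = 0 if rng.random() < transition[state][0] else 1
        ret = variances[state] ** 0.5 * rng.gauss(0.0, 1.0)
        states.append(state)
        returns.append(ret)
        # Update every regime's conditional variance with the realized return.
        variances = [w + a * ret ** 2 + b * v
                     for (w, a, b), v in zip(params, variances)]
    return states, returns


# Hypothetical values: regime 0 is calm and persistent, regime 1 is turbulent.
transition = [[0.98, 0.02], [0.05, 0.95]]
params = [(0.01, 0.05, 0.90), (0.20, 0.10, 0.85)]
states, returns = simulate_ms_garch(500, transition, params, seed=42)
print("share of time in turbulent regime:", sum(states) / len(states))
```

Because the regime path is hidden in real data, estimation has to infer it from the returns; the simulation only shows how the current state selects which conditional variance generates each return.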
We have considered the GARCH, EGARCH, and TGARCH conditional variance in the MSGARCH model to estimate stock price volatility.
Student t and Normal distributions are used to analyze the error distribution in the models. The performance of the models is evaluated using the AIC and BIC metrics. The steps are detailed in Algorithm 1.
Algorithm 1 Error distribution using Student t and Normal distributions.
1: Input: Input stock prices and stock indices.
2: Output: Ten days forecast prediction.
3: Stock price and stock indices returns are computed and given as input to the MSGARCH models.
4: MSGARCH specification for volatility estimation.
5: (a) GARCH conditional variance with homogeneous regime switching method. (b) GARCH conditional variance with heterogeneous regime switching method.
6: Estimate conditional distribution. (a) Normal distribution. (b) Student t.
7: Model estimation using AIC and BIC metrics.
8: Ten days forecast prediction.
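Two steps of Algorithm 1 lend themselves to a short sketch: computing returns from prices and comparing fitted models with AIC and BIC. The code below is a hedged illustration only. It assumes log returns (the return definition actually used is given in Section 3), and the log-likelihoods, parameter counts, and model labels are made-up placeholders rather than results from this study.

```python
import math


def log_returns(prices):
    """Log returns between consecutive prices (an assumed return definition)."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def aic(loglik, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_obs) - 2 * loglik


# Hypothetical fitted models: (log-likelihood, number of parameters).
candidates = {"model_a": (-1402.3, 8), "model_b": (-1395.1, 12)}
n_obs = 1000

best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
print("AIC selects:", best_aic, "| BIC selects:", best_bic)
```

In this toy comparison the two criteria disagree because BIC penalizes the extra parameters more heavily, which is one reason to report both.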
4.4. Regime Switching Based on Self-Exciting Threshold AutoRegressive (SETAR) Model
The SETAR model is a popular time series model for forecasting future trends in data. It is used when there is a structural break in the datasets. The notation SETAR(R, AR) has two parts: R represents the number of regimes, and AR represents the order of the autoregression.
Consider a simple autoregression AR($p$) for the stock price $y_t$, defined by the equation below.

$$y_t = \gamma_0 + \gamma_1 y_{t-1} + \dots + \gamma_p y_{t-p} + \sigma \varepsilon_t$$

where:
$\gamma_i$ ← AR($p$) coefficients;
$\sigma^2$ ← constant variance.

The TAR model allows the model parameters to change according to the value of a weakly exogenous threshold variable $z_t$ in order to capture nonlinear trends.

$$y_t = X_t \gamma^{(j)} + \sigma^{(j)} \varepsilon_t \quad \text{if } r_{j-1} < z_t \le r_j$$

where:
$X_t$ ← column vector of variables;
$-\infty = r_0 < r_1 < \dots < r_k = +\infty$ ← thresholds that divide the domain of the threshold variable $z_t$ into $k$ different regimes.

In each regime, the stock price $y_t$ follows a different AR($p$) model. When the threshold variable is $z_t = y_{t-d}$, with the delay parameter $d$ being a positive integer, the regime of $y_t$ is determined by its own lagged value $y_{t-d}$, and the TAR model is called a self-exciting TAR or SETAR model.
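The self-exciting mechanism can be sketched as a one-step-ahead forecast in which the lagged value of the series itself selects the autoregression to apply. The example assumes two regimes, a single threshold, and hypothetical coefficients; the function name `setar_forecast` and all numbers are illustrative and do not come from the fitted models.

```python
def setar_forecast(history, threshold, delay, regimes):
    """One-step-ahead forecast from a two-regime SETAR model.

    regimes[j] = (intercept, [coef_1, ..., coef_p]) holds the AR coefficients
    of regime j. The regime is selected by the lagged value history[-delay].
    """
    z = history[-delay]                        # threshold variable (own lag)
    intercept, coefs = regimes[0] if z <= threshold else regimes[1]
    lags = history[::-1][:len(coefs)]          # most recent value first
    return intercept + sum(c * y for c, y in zip(coefs, lags))


# Hypothetical SETAR(2, 1): a different AR(1) below and above the threshold 0.
regimes = [(0.5, [0.8]), (-0.2, [0.3])]

f_low = setar_forecast([0.4, -1.0], threshold=0.0, delay=1, regimes=regimes)
f_high = setar_forecast([-1.0, 2.0], threshold=0.0, delay=1, regimes=regimes)
print(round(f_low, 4), round(f_high, 4))
```

With the last observation below the threshold the lower-regime autoregression applies, and with it above the threshold the upper-regime one does, so the same model produces different dynamics depending on where the series currently sits.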