1. Introduction
Cryptocurrencies are decentralised currencies that are transacted without the regulation of a reserve bank or financial intermediaries. Blockchain technology is used to process transactions. BitCoin tops the list of cryptocurrencies in terms of traded volume.
Since BitCoin is not backed by any central bank or government, its users and traders are expected to be vulnerable to higher risk (volatility). As with the global trend, cryptocurrency trading, particularly in BitCoin, has gained a lot of traction in South Africa. There is a steady increase in movements of people’s savings and investments between the Rand and BitCoin.
Cryptocurrencies are said to be very risky [1]. Developing countries' currencies, including the South African Rand, are equally risky [2]. The purpose of this study is to use the wavelet-decomposed ARMA-GARCH-Extreme Value Theory (Generalised Pareto Distribution (GPD) and Generalised Extreme Value Distribution (GEVD))-based Value at Risk (VaR) to compare the riskiness of the two currencies. The GPD and GEVD, used to describe/characterise the extreme residuals from the return-series models, are preferred because these distributions are good at analysing extreme risk. The two distributions differ in the way they select extreme observations.
The VaR is a statistic that quantifies the riskiness of a financial portfolio of assets. It is the largest value or amount expected to be lost over a specified time horizon, e.g., one day, one week, or ten days, at a pre-defined statistical confidence level. Ref. [3] defined VaR as the value that "compresses all Greek letters for all the market variables underlying a portfolio into a single number". Investors and practitioners rely heavily on VaR as a risk measure, even though it is not globally sub-additive, and hence not a coherent risk measure. The VaR metric is popular because its practical advantages outweigh its theoretical disadvantages. According to [4], VaR is sub-additive in most practical situations, which is in line with the diversification concept of modern portfolio theory.
Ref. [5] states that financial time series exhibit volatility clustering, noise, leptokurtosis, and autocorrelation. Volatility clustering is when large changes in price tend to be followed by large changes, and small changes tend to be followed by small changes, whilst autocorrelation is the degree of correlation of the same variable between two time intervals. Hence the use of wavelet transforms to decompose the series into different time horizons based on their volatility regime or cluster. Generalised Auto-Regressive Conditional Heteroscedasticity (GARCH) models capture conditional heteroscedasticity, making the residuals independent and identically distributed [6]. Leptokurtosis describes the fat tails or excess peakedness prevalent in financial time series. Extreme value theory (EVT) models are recommended to correctly capture the extreme risk in financial assets.
A wavelet is a mathematical function that decomposes (breaks down) the original signal or time series into various components or sub-series [7]. Wavelets are very useful in capturing important features of the signal at each component (resolution) level, such as the long-memory volatility features of a financial time series, which, in turn, helps to improve the forecasting capability of a model.
More recently, the use of wavelets in estimating volatility has gained popularity in the fields of hydrology, geophysics, and financial time series analysis [7,8,9].
Ref. [10] gave traction to the concept of wavelet analysis of a time series by providing the theory behind extracting useful statistical information from a physical-science series, such as in hydrology. The wavelet methodology for the prediction of time-series data based on multi-scale decomposition was developed by [11]. Although a number of research papers have been published dealing with various theoretical aspects of wavelets, their application to data is still a difficult task. This is especially true of financial data, which are noisy, non-stationary, fat-tailed, and often auto-correlated.
Cryptocurrencies are said to be very risky, and so is the South African Rand. The purpose of this study is to provide a detailed comparison of the risks associated with each of the two currencies, so as to inform global investors, foreign currency traders in the Republic of South Africa, and member countries of the Rand Union. In this paper, an improved empirical technique for estimating Value at Risk (VaR) is proposed, combining wavelet decomposition (WD), an ARMA model, a Generalised Autoregressive Conditional Heteroscedasticity (GARCH) model, and Extreme Value Theory (EVT) models. Will this "hybrid" model add value to the computation of VaR? This question will be addressed by fitting EVT models to the standardised residuals extracted from the WD-ARMA-GARCH model of each currency's daily log-return series, and employing backtesting techniques on the resulting VaR.
The rest of the paper is organised as follows: Section 2 presents the literature review; Section 3 the methodology; Section 4 the results and discussion; and Section 5 concludes.
2. Literature Review
Value at Risk (VaR) is one of the most commonly used risk measures in finance because of its ability to compress all Greeks to a single value [3]. Typically, VaR is computed by first modelling the entire returns distribution of the asset or portfolio, and then calculating the statistic at the percentile corresponding to the desired confidence level.
Ref. [12] showed that the traditional normal-distribution-based VaR is not only incoherent but also fails to precisely estimate the risk of loss when the loss distributions have 'fat tails', unless EVT distributions are used. "This significantly discredits the accuracy of the traditional Normal distribution-based VaR risk measure", according to [13].
The EVT assumes independent and identically distributed (i.i.d.) observations. This i.i.d. assumption does not always hold for financial time series data. To correct this, [6] proposed a two-stage methodology in the form of a GARCH-EVT model, using five index returns in their illustrations of modelling volatility. The first step is to capture the heteroscedasticity (non-constant variation or fluctuations) by fitting a GARCH model. The second step is to apply EVT to the residuals extracted from the selected GARCH model, using the Generalised Pareto Distribution (GPD) or the Generalised Extreme Value Distribution (GEVD). This second part of the modelling process allows one to capture or describe the large fluctuations in prices and returns. The merits of the GARCH-EVT hybrid model lie in its ability to capture conditional heteroscedasticity (changing variation) in the data through the GARCH framework while, at the same time, modelling the extreme tail (large-fluctuation) behaviour through EVT methods.
While GARCH models have been largely successful in capturing most of the volatility stylised facts of financial time series [14], they still lag in detecting the many structural changes that are also prevalent in the signals [15]. Ref. [9] showed that the WD (Maximal Overlap Discrete Wavelet Transform)-GARCH captures these structural changes well and overall performs better than simple GARCH models. They used the daily stock price indices of four African countries' stock markets: Kenya's NSE20, Nigeria's All Share, South Africa's FTSE/JSE100, and Tunisia's TUNINDEX. Data from 2 January 2000 to 31 December 2014 were used.
According to [16], wavelet transforms perform better than traditional Fourier transforms in signal processing. Ref. [17] used wavelet decomposition transforms and an ARIMA model to forecast the volatility of daily prices from the Amman Stock Market (Jordan) from 1993 until 2009. Their findings showed that the approximated series under the wavelet transform was better than the original series, as it provided more stability in the variance and mean and smoothed out outliers. Furthermore, forecasts using the ARIMA(p,d,q) under the wavelet transformation gave more accurate results than using the original signal.
Ref. [8] compared the performance of a Wavelet-Decomposed–Generalised Auto-Regressive Conditional Heteroscedasticity (WD-GARCH) model with a simple GARCH model in forecasting climate anomalies using the Multivariate ENSO Index (MEI), a global climate data time series index, for the period January 1950 to February 2018. They used the Akaike information criterion, Schwarz criterion, and Hannan–Quinn criterion to perform model selection, and the residual mean square error to assess goodness of fit. Their results showed that both models fit the MEI data well. The forecast produced by the GARCH(1,2) model underestimated the observed score, while the newly proposed WD-GARCH(1,1) model generated more accurate forecasts for the given data. The authors recommend that the WD-GARCH(1,1) be applied to forecasting in the fields reflected by MEI variability.
Ref. [18] compared the performance of the Auto-Regressive Integrated Moving Average with eXplanatory variable(s) (ARIMAX), ARIMAX-GARCH, and ARIMAX-GARCH-WAVELET models in modelling the volatility of wheat yield in the Kanpur district of Uttar Pradesh, India, from 1972 to 2013. The findings showed that the ARIMAX-GARCH-WAVELET outperformed the other GARCH-family models in forecasting the volatility of the wheat yield.
Ref. [19] used a statistical test approach to compare the average returns and volatility of BitCoin against the Indonesian Composite Index and gold. BitCoin's average returns were significantly higher than those of the other financial assets studied, and BitCoin was expected to be riskier. This would be consistent with mean-variance portfolio theory, which suggests a higher yield for riskier assets [20].
There has been an increase in the amount of research seeking to ascertain whether the stylised facts of cryptocurrencies are similar to those of other financial assets. Ref. [1] showed that cryptocurrencies have distributional characteristics similar to those of Gold and the FTSE/JSE 40, though cryptocurrencies are more volatile. Ref. [21] noted the presence of heavy-tailedness and excess kurtosis in one-minute BitCoin return data. Ref. [22] observed high negative skewness and volatility in BitCoin in comparison to other stock returns. Refs. [23,24] came to the similar conclusion that cryptocurrencies' characteristics are nearly indistinguishable from those of the forex markets in well-established financial markets.
Ref. [25] argued that the shocks prevalent in the financial market do not affect BitCoin and gold returns; hence, they can be used for hedging. Conversely, Ref. [26] noted relative stability in BitCoin and Ethereum using asymmetric power-law statistical distributions.
Ref. [27] used RiskMetrics (a constrained Integrated GARCH(1,1)) with heavy-tailed error distributions to compare the riskiness of investing in BitCoin against keeping savings in a developing economy's currency, the South African Rand. Their findings showed that BitCoin is riskier than the Rand. However, their backtest results suggested that the RiskMetrics model is inadequate at the 10% level of significance. Ref. [6] empirically showed that the Generalised Auto-Regressive Conditional Heteroscedasticity–Extreme Value Theory (GARCH-EVT) model is superior (more accurate) to pure EVT models in the estimation of VaR. Ref. [28] emphasised the importance of the residuals in VaR estimation, showing that a GARCH model with normal innovations is inferior to a GARCH model with Student's t innovations when the data have fat tails (a common feature of financial series data).
In this research, the interest is in capturing distributional features such as volatility clustering, conditional heteroscedasticity, structural breaks, and fat tails of two exchange-rate return series, namely BitCoin (BTC) to United States Dollar (USD) and the South African Rand (ZAR) to the USD, using wavelet-decomposed ARMA-GARCH and EVT models. The ARMA component quantifies the behaviour of the mean return. The modelling results can then be used in the estimation of Value at Risk to compare the riskiness of the two currencies.
3. Methodology
The steps involved in the estimation of the wavelet-decomposed WD-ARMA-GARCH-EVT Value at Risk (VaR) can be summarised as follows:
1. Decompose and filter the log return series using the Maximal Overlap Discrete Wavelet Transform with two mother wavelets, namely the Haar and Daubechies (d4).
2. Fit the Auto-Regressive Moving Average–Generalised Auto-Regressive Conditional Heteroscedasticity (ARMA-GARCH) component model to the wavelet-transformed series.
3. Extract the residuals and fit the Extreme Value Theory models (Generalised Pareto Distribution, Generalised Extreme Value Distribution).
4. Estimate VaR and confirm model adequacy using Kupiec's backtest technique.
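The Kupiec proportion-of-failures test used in the final step compares the observed number of VaR violations with the number implied by the coverage level. A minimal illustrative sketch in Python follows (the study's own analysis was done in R; the function name and interface here are ours):

```python
import math

def kupiec_pof(num_violations: int, num_obs: int, p: float) -> float:
    """Kupiec proportion-of-failures likelihood-ratio statistic.

    Under a correct VaR model, violations occur with probability p and the
    statistic is asymptotically chi-squared with 1 degree of freedom.
    """
    x, T = num_violations, num_obs
    pi_hat = x / T  # observed violation rate
    # Guard the boundary cases where the MLE sits on 0 or 1.
    if x == 0:
        return -2.0 * T * math.log(1.0 - p)
    if x == T:
        return -2.0 * T * math.log(p)
    log_null = (T - x) * math.log(1.0 - p) + x * math.log(p)
    log_alt = (T - x) * math.log(1.0 - pi_hat) + x * math.log(pi_hat)
    return -2.0 * (log_null - log_alt)
```

When the observed violation rate equals the nominal coverage, the statistic is zero; the model is rejected at the 5% level when the statistic exceeds the chi-squared critical value 3.84.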
3.1. Wavelets
Wavelets are mathematical functions that decompose (break down) the original signal or time series into various components. They are very useful in capturing the long-memory volatility features of a time series, which, in turn, helps to improve the forecasting capability of a model. Their advantage is their ability to be localised in both the time and the frequency domain, thus enabling the researcher to observe and analyse data at different scales [29].
"This methodology involves recursively applying a succession of low-pass and high-pass filters to the signal (return series). This process allows the separation of the high-frequency component from the low-frequency one" [30]. This improves the model's signal-processing ability to capture structural changes, including mildly explosive bubbles.
Wavelet transformations have been successfully applied to non-stationary time series data and have yielded fairly good results in the fields of geo-sciences, remote sensing, engineering, hydrology, finance, medicine, ecology, renewable energy, chemistry, and history [31]. Their spectro-temporal abilities (the ability to decompose the signal into different temporal scales and to carry out a frequency analysis of those portions of the signal) mean that they can be applied to non-stationary time series with mildly explosive bubbles, such as BitCoin.
The decomposition of the series is obtained using a wavelet transform based on two filters: the "mother wavelet" $\psi(t)$ and the "father wavelet" $\phi(t)$. The wavelet decomposition of a time series can then be expressed as a linear combination of wavelet functions.

If $f(t)$ is a time series function (for $t = 1, \dots, n$), the wavelet-decomposed version is

$$f(t) = \sum_k s_{J,k}\,\phi_{J,k}(t) + \sum_k d_{J,k}\,\psi_{J,k}(t) + \dots + \sum_k d_{1,k}\,\psi_{1,k}(t), \quad (1)$$

where the orthogonal basis functions $\phi_{j,k}(t)$ and $\psi_{j,k}(t)$ are defined as

$$\phi_{j,k}(t) = 2^{-j/2}\,\phi\!\left(2^{-j}t - k\right), \quad (2)$$

$$\psi_{j,k}(t) = 2^{-j/2}\,\psi\!\left(2^{-j}t - k\right), \quad (3)$$

where $j = 1, \dots, J$ represents the multiresolution, or scale, level and $k$ indexes the coefficients in each scale level. $s_{J,k}$ and $d_{j,k}$ are the scaling (or smooth) and detail (or wavelet) coefficients, respectively, and are defined as

$$s_{J,k} = \int \phi_{J,k}(t)\,f(t)\,dt, \qquad d_{j,k} = \int \psi_{j,k}(t)\,f(t)\,dt, \quad j = 1, \dots, J.$$

The magnitude of these coefficients reflects a measure of the contribution of the corresponding wavelet function to the total signal.

The scale factor $2^{j}$ is also called the dilation factor and controls the length of the wavelet (window), whereas the translation parameter $k$ refers to the location and indicates the non-zero portion of each wavelet basis vector. Equation (2) presents the long-scale smooth components that are used to generate the scaling coefficients, whereas the differencing (detail) coefficients are generated by Equation (3). The resulting multi-scale decomposition of Equation (1) can be simplified as

$$f(t) = S_J(t) + D_J(t) + D_{J-1}(t) + \dots + D_1(t), \quad (4)$$

where $S_J(t) = \sum_k s_{J,k}\,\phi_{J,k}(t)$ is the $J$th-level smooth and $D_j(t) = \sum_k d_{j,k}\,\psi_{j,k}(t)$ represents the aggregated sum of variations at each detail scale. In Equations (1) and (4), the father wavelet reconstructs the smooth, low-frequency parts of the signal, whereas the mother wavelet describes the detailed, high-frequency parts of the signal.
Therefore, Equation (4) provides a complete reconstruction of the signal, decomposed into a set of $J$ frequency components, so that each component corresponds to a particular range of frequencies.
3.2. Types of Wavelet Transforms
The wavelet transforms break the original signal (currency log-returns time series) into projections of translated and scaled versions of the original mother wavelet. For the Continuous Wavelet Transform (CWT),

$$W(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\,\psi\!\left(\frac{t-b}{a}\right)\,dt,$$

where $f(t)$ is the original signal or time series, $\psi$ is the wavelet, $a > 0$ is the scale parameter, and $b$ translates the wavelet across the signal.

For the Discrete Wavelet Transform (DWT), the scale and translation are restricted to the dyadic values $a = 2^{j}$ and $b = k2^{j}$, and the signal is broken down as follows:

$$d_{j,k} = 2^{-j/2} \sum_{t} f(t)\,\psi\!\left(2^{-j}t - k\right),$$

where $f(t)$ is the original signal or time series, $\psi$ is the wavelet, $j$ is the scaling index, $k$ translates the wavelet across the signal, and $d_{j,k}$ is the wavelet-transformed signal.
The limitations of the above-mentioned transforms are well documented. The CWT maps the one-dimensional function $f(t)$ to a function $W(a,b)$ of two continuous real variables $a$ and $b$. The coefficients of $W(a,b)$ at a particular scale and translation measure how well the original function or signal $f(t)$ matches the scaled and translated mother wavelet. However, not all the coefficients of $W(a,b)$ are required to recover the function. As a result, the CWT gives a redundant way to represent the signal [32]. On the other hand, the DWT, $d_{j,k}$, takes translations only at times that are multiples of $2^{j}$, making it hard to compare with the original series or signal.
The Maximal Overlap Discrete Wavelet Transform (MODWT) is the preferred methodology for this research paper because its decomposition at different scales can easily be compared with the original time series, since it is not restricted to translations in multiples of $2^{j}$ and is less sensitive to the starting point. This is helpful in understanding the patterns at different frequencies, i.e., short-term, medium-term, or long-term. According to [10], the MODWT of a time series $X_t$, $t = 0, \dots, N-1$, to the $J$th level yields the wavelet and scaling coefficients

$$\widetilde{W}_{j,t} = \sum_{l=0}^{L_j - 1} \widetilde{h}_{j,l}\, X_{(t-l) \bmod N}, \qquad \widetilde{V}_{J,t} = \sum_{l=0}^{L_J - 1} \widetilde{g}_{J,l}\, X_{(t-l) \bmod N},$$

where $\widetilde{h}_{j,l}$ is the level-$j$ wavelet filter, constructed by convolving filters composed of the base wavelet filter $\widetilde{h}_l$ and scale filter $\widetilde{g}_l$. It satisfies the following conditions:

$$\sum_{l=0}^{L-1} \widetilde{h}_l = 0, \qquad \sum_{l=0}^{L-1} \widetilde{h}_l^{\,2} = \frac{1}{2}, \qquad \sum_{l=0}^{L-1} \widetilde{h}_l\,\widetilde{h}_{l+2n} = 0 \ \text{ for all non-zero integers } n.$$

$\widetilde{g}_{j,l}$ is the level-$j$ scale filter, constructed by convolving filters composed of $\widetilde{g}_l$. It satisfies the following conditions:

$$\sum_{l=0}^{L-1} \widetilde{g}_l = 1, \qquad \sum_{l=0}^{L-1} \widetilde{g}_l^{\,2} = \frac{1}{2}, \qquad \sum_{l=0}^{L-1} \widetilde{g}_l\,\widetilde{g}_{l+2n} = 0 \ \text{ for all non-zero integers } n,$$

together with the cross-orthogonality condition $\sum_{l=0}^{L-1} \widetilde{g}_l\,\widetilde{h}_{l+2n} = 0$ for all integers $n$.

$L$ is the width of the base-level filter, and the level-$j$ filter has width $L_j = (2^{j}-1)(L-1)+1$. The maximum number of levels depends on the available data points.
Although there are several mother wavelets, only some are suitable for financial time series analysis [33]. The Haar and Daubechies (d4) mother wavelets better capture the characteristics of economic and financial time series with non-stationarity and structural changes [34,35]. These wavelets are well localised in the time domain yet dispersed in the frequency domain. Therefore, they can be used for the analysis of time series with structural breaks and sharp jumps.
3.2.1. Haar Wavelet
Proposed by [36], the Haar mother wavelet function $\psi(t)$ can be described as follows for the unit scale:

$$\psi(t) = \begin{cases} 1, & 0 \le t < \tfrac{1}{2}, \\ -1, & \tfrac{1}{2} \le t < 1, \\ 0, & \text{otherwise}, \end{cases}$$

and the scaling function can be described as follows:

$$\phi(t) = \begin{cases} 1, & 0 \le t < 1, \\ 0, & \text{otherwise}. \end{cases}$$

The Haar wavelet extracts information about how much difference there is between the two unit-scale averages bordering on the time $t$.
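To illustrate, one MODWT level with the Haar filter (rescaled to $\widetilde{h} = (1/2, -1/2)$ and $\widetilde{g} = (1/2, 1/2)$) can be computed by circular filtering. The following toy sketch in Python (illustrative only; the paper's analysis used R packages) also demonstrates the exact-reconstruction and energy-preservation properties:

```python
def haar_modwt_level1(x):
    """One MODWT level with the Haar filter, using circular boundary handling.

    Returns (wavelet_coeffs, scaling_coeffs); each has the same length as x,
    unlike the decimated DWT.
    """
    n = len(x)
    h = (0.5, -0.5)  # MODWT Haar wavelet filter (DWT filter scaled by 1/sqrt(2))
    g = (0.5, 0.5)   # MODWT Haar scaling filter
    W = [sum(h[l] * x[(t - l) % n] for l in range(2)) for t in range(n)]
    V = [sum(g[l] * x[(t - l) % n] for l in range(2)) for t in range(n)]
    return W, V

def haar_imodwt_level1(W, V):
    """Inverse of one Haar MODWT level (exact reconstruction)."""
    n = len(W)
    return [(W[t] - W[(t + 1) % n]) / 2 + (V[t] + V[(t + 1) % n]) / 2
            for t in range(n)]
```

Each wavelet coefficient is half the difference between a value and its predecessor, and each scaling coefficient is half their sum, exactly the "difference between bordering averages" interpretation above.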
3.2.2. Daubechies 4 (d4) Wavelet
The Daubechies 4 (d4) wavelet filter is built from the scaling (father) filter

$$g = \left( \frac{1+\sqrt{3}}{4\sqrt{2}},\ \frac{3+\sqrt{3}}{4\sqrt{2}},\ \frac{3-\sqrt{3}}{4\sqrt{2}},\ \frac{1-\sqrt{3}}{4\sqrt{2}} \right),$$

with the wavelet (mother) filter coefficients obtained through the quadrature-mirror relation $h_l = (-1)^{l} g_{3-l}$, $l = 0, \dots, 3$.
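The standard d4 filter coefficients can be verified numerically against the defining filter conditions (scaling filter summing to $\sqrt{2}$ under the DWT normalisation; wavelet filter with zero sum and unit energy). A quick check, assuming the standard DWT normalisation:

```python
import math

s3 = math.sqrt(3.0)
# Daubechies d4 scaling (father) filter, standard DWT normalisation.
g = [(1 + s3) / (4 * math.sqrt(2)), (3 + s3) / (4 * math.sqrt(2)),
     (3 - s3) / (4 * math.sqrt(2)), (1 - s3) / (4 * math.sqrt(2))]
# Quadrature-mirror relation gives the wavelet (mother) filter.
h = [(-1) ** l * g[3 - l] for l in range(4)]

sum_g = sum(g)                     # should equal sqrt(2)
sum_h = sum(h)                     # should equal 0 (zero-mean condition)
energy_h = sum(c * c for c in h)   # should equal 1 (unit energy)
```

(For the MODWT, each coefficient is additionally divided by $\sqrt{2}$, which halves the energy, matching the $\sum \widetilde{h}_l^{\,2} = 1/2$ condition stated earlier.)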
3.3. Auto-Regressive Moving Average–Generalised Auto-Regressive Conditional Heteroscedasticity (ARMA-GARCH)
The second step uses the wavelet-transformed series obtained from the multiresolution decomposition to fit ARMA-GARCH-family models to the volatility dynamics of the BitCoin/USD and ZAR/USD exchange rates. Let $r_t$ be the log returns of the currencies. The ARMA($p,q$) model is mathematically defined as

$$r_t = \mu + \sum_{i=1}^{p} \phi_i\, r_{t-i} + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j} + \varepsilon_t.$$

The GARCH($p,q$) model is mathematically defined as

$$\varepsilon_t = \sigma_t z_t, \qquad \sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i\, \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j\, \sigma_{t-j}^2,$$

where the innovations $z_t$ are i.i.d. with zero mean and unit variance. The simplest form of the Generalised Auto-Regressive Conditional Heteroscedasticity model is the GARCH(1,1), with mean and variance equations of the form

$$r_t = \mu + \varepsilon_t, \qquad \sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2.$$

Assuming the residuals are conditionally normal and i.i.d., let $\theta = (\mu, \omega, \alpha_1, \beta_1)$; the quasi-Gaussian maximum likelihood function is

$$L(\theta) = \prod_{t=1}^{n} \frac{1}{\sqrt{2\pi \sigma_t^2}} \exp\!\left(-\frac{\varepsilon_t^2}{2\sigma_t^2}\right).$$

The log-likelihood of the above expression is

$$\ell(\theta) = -\frac{1}{2} \sum_{t=1}^{n} \left[ \log(2\pi) + \log \sigma_t^2 + \frac{\varepsilon_t^2}{\sigma_t^2} \right]; \quad (13)$$

the optimal parameters are obtained by maximising (13) with respect to $\theta$.
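The GARCH(1,1) quasi-Gaussian log-likelihood can be evaluated with a simple variance recursion. A minimal sketch in Python (illustrative only; the paper's estimation used R, and the initialisation at the sample variance is one common convention among several):

```python
import math

def garch11_nll(r, mu, omega, alpha, beta):
    """Negative quasi-Gaussian log-likelihood of a GARCH(1,1).

    The variance recursion is sigma2[t] = omega + alpha * eps[t-1]**2
    + beta * sigma2[t-1], initialised at the sample variance of the
    residuals eps[t] = r[t] - mu.
    """
    eps = [x - mu for x in r]
    n = len(eps)
    s2 = sum(e * e for e in eps) / n  # initial conditional variance
    nll = 0.0
    for e in eps:
        nll += 0.5 * (math.log(2 * math.pi) + math.log(s2) + e * e / s2)
        s2 = omega + alpha * e * e + beta * s2  # next conditional variance
    return nll
```

Minimising this function over $(\mu, \omega, \alpha_1, \beta_1)$, subject to $\omega > 0$, $\alpha_1, \beta_1 \ge 0$, is equivalent to maximising (13).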
While [37] found that when the i.i.d. innovations do not have fourth moments the estimators suffer slower convergence rates and unstable limits, the advantage of the Generalised Pareto Distribution and Generalised Extreme Value Distribution models lies in their ability to take a continuous range of possible distributional shapes, governed by the shape parameter, also known as the Extreme Value Index (EVI), which includes the bounded and unbounded innovation (tail) distributions as special cases. Each distribution is associated with three distributional forms, depending on the heaviness of the tails. When the EVI is zero, the distributions are light-tailed; when it is less than zero, the distributions are short-tailed or bounded; a positive EVI is associated with heavy-tailedness. The Generalised Pareto Distribution and Generalised Extreme Value Distribution allow one to "let the data decide" which of these forms is appropriate within each distribution, instead of having to select a particular form of the distribution function. The end result of this flexibility is a good model fit to the extremes (tails) of the data.
3.4. Extreme Value Theory
The peak over threshold (POT) method is used to select data for fitting the Generalised Pareto Distribution (GPD) and, in our approach, to model the standardised residuals emanating from the selected GARCH-family model.
The block maxima method is another EVT approach for identifying maxima (extremes) in a data set. The Generalised Extreme Value Distribution (GEVD) is then fitted to the set of block maxima chosen from a given data set. The data are initially arranged in time sequence and then grouped into non-overlapping blocks.
3.4.1. The Generalised Pareto Distribution (GPD)
The peak over threshold (POT) approach, used in fitting the Generalised Pareto Distribution, is used to model the standardised residuals from the WD-GARCH-family model.
Refs. [38,39] showed that, for a sufficiently large threshold $u$, the exceedances above the threshold can be approximated by the Generalised Pareto Distribution, defined as follows:

$$G_{\xi,\beta}(y) = \begin{cases} 1 - \left(1 + \dfrac{\xi y}{\beta}\right)^{-1/\xi}, & \xi \neq 0, \\[4pt] 1 - \exp\!\left(-\dfrac{y}{\beta}\right), & \xi = 0, \end{cases}$$

where $y = z - u \ge 0$ are the standardised residuals in excess of the threshold $u$, $\xi$ is the shape parameter or extreme value index (EVI), and $\beta > 0$ is the scale parameter.
The value of $\xi$ shows how heavy the tail is, with a larger positive value indicating a heavier tail. When $\xi$ is negative, the tail is short (bounded). $\xi = 0$ indicates a light tail.
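The three tail regimes correspond to the three branches of the distribution function. A direct transcription of $G_{\xi,\beta}$ in Python (a sketch for illustration; the paper's fitting used the R evir/ismev packages):

```python
import math

def gpd_cdf(y, xi, beta):
    """Generalised Pareto distribution function G(y) for an excess y >= 0.

    xi is the shape (EVI); the xi -> 0 limit is the exponential distribution,
    and xi < 0 gives a finite upper endpoint at beta/|xi|.
    """
    if y < 0:
        return 0.0
    if abs(xi) < 1e-12:              # light-tailed limiting case
        return 1.0 - math.exp(-y / beta)
    arg = 1.0 + xi * y / beta
    if arg <= 0.0:                   # beyond the finite endpoint when xi < 0
        return 1.0
    return 1.0 - arg ** (-1.0 / xi)
```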
3.4.2. Parameter Estimation of the Generalised Pareto Distribution
Let $u$ be a sufficiently high threshold, and suppose that among the $n$ observations $z_1, \dots, z_n$ there are $n_u$ exceedances $y_i = z_i - u > 0$. Assuming the subsample $y_1, \dots, y_{n_u}$ has an underlying Generalised Pareto Distribution with $1 + \xi y_i/\beta > 0$ for each $i$ and $\xi \neq 0$, the logarithm of the probability density function of $y_i$ is

$$\log g_{\xi,\beta}(y_i) = -\log \beta - \left(1 + \frac{1}{\xi}\right) \log\!\left(1 + \frac{\xi y_i}{\beta}\right).$$

Then, the log-likelihood $\ell(\xi, \beta)$ for the model is the logarithm of the joint density of the $n_u$ observations, i.e.,

$$\ell(\xi, \beta) = -n_u \log \beta - \left(1 + \frac{1}{\xi}\right) \sum_{i=1}^{n_u} \log\!\left(1 + \frac{\xi y_i}{\beta}\right).$$

We obtain the parameters by maximising the log-likelihood function of the sub-sample under a suitable threshold $u$.
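The objective passed to a numerical optimiser is the negative of this log-likelihood. A minimal Python sketch (illustrative; in practice the R ismev/eva packages perform this maximisation, and the $\xi = 0$ limit needs separate handling):

```python
import math

def gpd_nll(excesses, xi, beta):
    """Negative GPD log-likelihood for exceedances y_i = z_i - u > 0, xi != 0.

    Returns +inf outside the parameter/support region, which lets a generic
    minimiser treat the constraints as an infinite penalty.
    """
    if beta <= 0:
        return math.inf
    nll = 0.0
    for y in excesses:
        arg = 1.0 + xi * y / beta
        if arg <= 0:                 # observation outside the GPD support
            return math.inf
        nll += math.log(beta) + (1.0 + 1.0 / xi) * math.log(arg)
    return nll
```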
3.4.3. The Generalised Extreme Value Distribution (GEVD)
The Fisher–Tippett–Gnedenko theorem, first proposed by [40] and later revised by [41], is very important in extreme value theory. According to this theorem, the suitably normalised maxima (or minima) of a sample of observations converge to the Generalised Extreme Value Distribution (GEVD).
The GEVD is the limiting distribution of the normalised block maxima of a sequence of independent, identically distributed random variables. The GEVD is given as follows:

$$H_{\xi,\mu,\sigma}(x) = \begin{cases} \exp\!\left\{ -\left[1 + \xi\left(\dfrac{x-\mu}{\sigma}\right)\right]^{-1/\xi} \right\}, & \xi \neq 0, \\[6pt] \exp\!\left\{ -\exp\!\left(-\dfrac{x-\mu}{\sigma}\right) \right\}, & \xi = 0, \end{cases}$$

with $1 + \xi(x-\mu)/\sigma > 0$. The probability density function, obtained as the derivative of the above distribution function, is given by

$$h_{\xi,\mu,\sigma}(x) = \frac{1}{\sigma}\left[1 + \xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi - 1} \exp\!\left\{ -\left[1 + \xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi} \right\}, \quad \xi \neq 0,$$

where $\mu$ and $\sigma > 0$ are the location and scale parameters, respectively. The shape parameter $\xi$ is also known as the extreme value index (EVI).
The block maxima method is an EVT approach for identifying maxima (extremes) in a data set and for describing their behaviour. The Generalised Extreme Value Distribution is fitted to the set of maxima chosen from a given data set. This research uses a weekly block of size 7 days.
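Extracting the block maxima amounts to partitioning the time-ordered series into consecutive blocks and keeping each block's largest value. A minimal illustrative sketch in Python (the study itself used R; a block size of 7 corresponds to the weekly blocks described above):

```python
def block_maxima(series, block_size):
    """Split a time-ordered series into non-overlapping blocks and return
    the maximum of each complete block (a trailing partial block is
    discarded)."""
    n_blocks = len(series) // block_size
    return [max(series[i * block_size:(i + 1) * block_size])
            for i in range(n_blocks)]
```

The resulting maxima form the sample to which the GEVD is fitted.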
3.4.4. Parameter Estimation
Let $x_1, \dots, x_m$ be independent variables following the Generalised Extreme Value Distribution. The log-likelihood for the parameters, when $\xi \neq 0$, is

$$\ell(\mu, \sigma, \xi) = -m \log \sigma - \left(1 + \frac{1}{\xi}\right) \sum_{i=1}^{m} \log\!\left[1 + \xi\left(\frac{x_i - \mu}{\sigma}\right)\right] - \sum_{i=1}^{m} \left[1 + \xi\left(\frac{x_i - \mu}{\sigma}\right)\right]^{-1/\xi},$$

provided $1 + \xi(x_i - \mu)/\sigma > 0$ for each $i$. Maximisation of the above function with respect to the parameter vector $(\mu, \sigma, \xi)$ leads to the maximum likelihood estimates for the entire Generalised Extreme Value Distribution family [42].
3.5. Value at Risk
Value at Risk is computed as follows for a small tail probability $p$ and total sample size $n$.
For a Generalised Pareto Distribution with maximum likelihood estimates $(\hat{\xi}, \hat{\beta})$, threshold $u$, and $N_u$ the number of exceedances, it is given by

$$\mathrm{VaR}_p = u + \frac{\hat{\beta}}{\hat{\xi}}\left[\left(\frac{np}{N_u}\right)^{-\hat{\xi}} - 1\right];$$

for a Generalised Extreme Value Distribution with maximum likelihood estimates $(\hat{\mu}, \hat{\sigma}, \hat{\xi})$:

$$\mathrm{VaR}_p = \hat{\mu} - \frac{\hat{\sigma}}{\hat{\xi}}\left[1 - \left(-\log(1-p)\right)^{-\hat{\xi}}\right].$$

Finally, according to [6], the conditional $\mathrm{VaR}_t^{\,p}$ of the asset is computed using the following formula:

$$\mathrm{VaR}_t^{\,p} = \mu_t + \sigma_t z_p,$$

where $\mu_t$ is derived from the mean equation (Auto-Regressive Moving Average) and $\sigma_t$ is estimated from the volatility model (Generalised Auto-Regressive Conditional Heteroscedasticity). $z_p$ is the corresponding percentile of the standardised residuals. Often, $\mu_t$ does not vary much and is largely predictable. The riskiness of the asset is then expressed through $\sigma_t z_p$; hence, the modelling of the residuals. This is especially important when modelling extreme risk.
4. Data Analysis
The currency data used in this research were obtained from the finance sector website (www.investing.com/currencies, accessed on 1 July 2021). Analyses were conducted using R and RStudio with the WaveletGARCH, wavelets, evir, FinTS, PerformanceAnalytics, ismev, and eva statistical packages. The adjusted closing values of daily exchange rates from 1 January 2015 to 30 June 2021 were fitted to the WD-ARMA-GARCH-EVT model. BitCoin is traded every day; hence, there are 2372 observations. The Rand is not traded on weekends and South Africa's public holidays, resulting in 1694 observations. To align the data for analysis, missing return values in the Rand exchange rate were replaced with zero, since no profits or losses are realised by the holder of the local currency during weekends and/or public holidays. The daily log returns were calculated and used for modelling. The formula for log returns used is

$$r_t = \ln\!\left(\frac{P_t}{P_{t-1}}\right),$$

where $P_t$ and $P_{t-1}$ are today's and yesterday's closing values of the daily prices (exchange rates), respectively.
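The return calculation and the zero-fill alignment for non-trading days can be sketched as follows (illustrative Python; the actual processing was done in R, and the date keys here are placeholders):

```python
import math

def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def align_with_zero_fill(dates_full, returns_by_date):
    """Return a series defined on every calendar date, inserting 0.0 on
    non-trading days (no gain or loss is realised while markets are
    closed)."""
    return [returns_by_date.get(d, 0.0) for d in dates_full]
```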
In Figure 1 and Figure 2, the log returns look stationary around the zero mean, although volatility is non-constant and clustered, indicating heteroscedasticity, which is common with financial data. Isolated extreme returns are visible; hence, EVT models will be used to capture the risks associated with these extremes.
4.1. Descriptive Statistics
Table 1 below gives the descriptive statistics.
In Table 1, the null hypothesis of normality under the Jarque–Bera test is rejected at the 5% level of significance, meaning that symmetric models should not be considered when analysing the above-mentioned return series.
The p-value of the Ljung–Box test for ZAR/USD returns suggests a failure to reject the null hypothesis of no autocorrelation. This means the observations can be assumed to be independent and identically distributed (i.i.d.). However, for BitCoin/USD returns, this null hypothesis is rejected; hence, the two-stage approach of McNeil and Frey will be used to deal with the autocorrelation problem. The first stage of fitting the WD-ARMA-GARCH eliminates this autocorrelation.
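The Ljung–Box statistic underlying this test can be computed directly from the sample autocorrelations. A compact illustrative Python version (the paper used R's built-in test; under the null, $Q$ is compared with a chi-squared distribution with `max_lag` degrees of freedom):

```python
def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic for the null of no autocorrelation up to max_lag.

    Q = n(n+2) * sum_{k=1}^{m} rho_k^2 / (n - k), where rho_k is the lag-k
    sample autocorrelation.
    """
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    denom = sum(v * v for v in d)
    q = 0.0
    for k in range(1, max_lag + 1):
        rho_k = sum(d[t] * d[t + k] for t in range(n - k)) / denom
        q += rho_k * rho_k / (n - k)
    return n * (n + 2) * q
```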
The stationarity tests (ADF and PP) confirm that, at the 5% level of significance, the Null Hypothesis of a unit root is rejected, and it can be concluded that both exchange rate return series are stationary. The KPSS test results showed that all returns are stationary as well.
4.2. WAVELETS-ARMA-GARCH
Based on the literature and the financial characteristics presented above, using a hybrid of wavelet decomposition, ARMA, GARCH, and EVT models can lead to a better measure of equity risk. The return series are non-normal and heteroscedastic and, as shown in Figure 3, they have long-memory volatility. Using the Wavelet-ARMA-GARCH model can aid in capturing these features in the estimation of risk.
Figure 3 shows the wavelet coefficients for all eight levels and the scale coefficients for the eighth level. The Haar and Daubechies wavelets were used for computing the Maximal Overlap Discrete Wavelet Transform coefficients. The wavelet coefficients decompose the signal or series (WD) by level (the information they carry), i.e., high frequency (volatility) or low frequency (volatility). Then, at each level, we fit an Auto-Regressive Moving Average (ARMA)–Generalised Auto-Regressive Conditional Heteroscedasticity (GARCH) model and, finally, combine the values to estimate volatility.
On the left-hand side is the BitCoin/USD plot of the Maximal Overlap Discrete Wavelet Transform, and on the right-hand side is the ZAR/USD plot. The wavelet coefficients are smoother at a higher level, representing longer-term volatility. The scale coefficients at the highest level represent the volatility that is not explained by the wavelet coefficients. The coefficient series are shifted backwards so that all the series are aligned on the same timeline. The WD(Haar)-ARMA-GARCH(1,1) and the WD(d4)-ARMA-GARCH(1,1) were then fitted for both exchange rates. The GARCH parameters were estimated using the quasi-Gaussian maximum likelihood estimators.
Table 2 presents the WD-ARMA-GARCH optimal parameters for BitCoin/USD and ZAR/USD for the Haar-transformed and the Daubechies (d4)-transformed series. These are the models that were used to capture volatility clustering and conditional heteroscedasticity. The Haar-transformed series resulted in WD(Haar)-ARMA(2,0)-GARCH(1,1) for BitCoin/USD and WD(Haar)-ARMA(1,0)-GARCH(1,1) for the ZAR/USD. The Daubechies (d4)-transformed series resulted in WD(d4)-ARMA(2,3)-GARCH(1,1) for BitCoin/USD and WD(d4)-ARMA(1,3)-GARCH(1,1) for ZAR/USD. To capture fat tails, their residuals were extracted, standardised, and used to fit EVT models, which, in turn, were used to estimate VaR, and model adequacy confirmation was performed using backtesting techniques.
4.2.1. BitCoin Returns (BTC/USD)
To fit the Generalised Pareto Distribution model, a threshold $u$ must be selected. The mean excess plots determine a suitable threshold, which is necessary for fitting the Generalised Pareto Distribution model. A suitable threshold is indicated by an approximately linear increase in the mean excess plot above it.
Figure 4 and Figure 5 present the mean excess function (in blue) and Q-Q plots (in red) of BitCoin/USD residuals from WD(Haar)-ARMA(2,0)-GARCH(1,1) and WD(d4)-ARMA(2,3)-GARCH(1,1), respectively. By observing these mean excess functions, a threshold of between 0 and 1 seems to be a reasonable choice. The 80th percentile was selected for all series, and it provided a reasonable choice as it falls within the above range.
The parameters of the Generalised Pareto Distribution were estimated using the maximum likelihood method and are presented in Table 3 below.
The shape parameter in Table 3 is positive in all cases. Hence, the extreme returns exhibit heavy tails for all models. This can be loosely interpreted as a suggestion that BitCoin is riskier, as heavy tails imply a higher concentration of observations at the extremes.
Figure 6 shows graphical goodness-of-fit plots for BitCoin returns fitted with WD(Haar)-ARMA(2,0)-GARCH(1,1)-GPD. The Probability and Quantile plots are almost linear, confirming a good fit. The return levels are within the confidence bands, as expected. The density plot is also a good estimate for the histogram of the data. The model fits the BitCoin returns data well.
Figure 7 shows graphical goodness of fit plots for BitCoin returns fitted with WD(d4)-ARMA (2,3)-GARCH(1,1)-GPD. The Probability and Quantile plots are almost linear, confirming a good fit. The return levels are within the confidence bands as expected. The density plot is also a good estimate for the histogram of the data. The model fits the BitCoin returns data well.
The block maxima of the BitCoin/USD log returns have been fitted to the Generalised Extreme Value Distribution with a weekly block size.
Table 4 shows the maximum likelihood estimates of the parameters and their corresponding standard errors (SE). The shape parameters (ξ) are positive, implying that the extreme returns follow a heavy-tailed Fréchet class distribution [44]. These parameter estimates imply that the data sets are heavy-tailed.
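The block-maxima fit can be sketched with SciPy's `genextreme`. The return series below is simulated and its length is a hypothetical choice; only the weekly (7-day) block size follows the study:

```python
import numpy as np
from scipy.stats import genextreme

# Hypothetical daily log returns: 300 weeks of 7 days (illustrative only).
rng = np.random.default_rng(1)
returns = rng.standard_t(df=4, size=7 * 300)

# Weekly block maxima: split the series into 7-day blocks, take each maximum.
weekly_max = returns.reshape(-1, 7).max(axis=1)

# MLE fit of the GEV to the block maxima. Note SciPy parameterises the GEV
# with c = -xi, so c < 0 corresponds to the heavy-tailed Frechet class
# (xi > 0) reported for BitcCoin in Table 4.
c, loc, scale = genextreme.fit(weekly_max)
xi = -c
```

The sign of ξ then classifies the tail: Fréchet (ξ > 0, heavy), Gumbel (ξ = 0), or Weibull (ξ < 0, bounded).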
Figure 8 shows graphical goodness-of-fit plots for BitCoin returns fitted with WD(Haar)-ARMA(2,0)-GARCH(1,1)-GEVD. The Probability and Quantile plots are almost linear, confirming a good fit. The return levels are within the confidence bands, as expected. The density plot is also a good estimate for the histogram of the data. The model fits the BitCoin returns data well.
Figure 9 shows graphical goodness-of-fit plots for BitCoin returns fitted with WD(d4)-ARMA(2,3)-GARCH(1,1)-GEVD. The Probability and Quantile plots are almost linear, and the density plot is also a good estimate for the histogram of the data. The WD(d4)-ARMA(2,3)-GARCH(1,1)-GEVD model fits the BitCoin returns data well.
4.2.2. South African Rand Returns (ZAR/USD)
Figure 10 and
Figure 11 present the mean excess function (in blue) and the Q-Q plots (in red) of ZAR/USD residuals from the WD(Haar)-ARMA(1,0)-GARCH(1,1) and WD(d4)-ARMA(1,3)-GARCH(1,1) models, respectively. Based on these mean excess functions, a threshold between 0 and 1 appears reasonable. The 80th percentile was selected for all series, as it falls within this range.
The parameters of the Generalised Pareto Distribution were estimated using the Maximum Likelihood method and are presented in Table 5 below.
In Table 5, the shape parameter (ξ) is negative in all cases. Hence, the extreme returns follow a short-tailed Pareto type II family of the Generalised Pareto Distribution for all models.
Figure 12 shows graphical goodness-of-fit plots for Rand returns fitted with WD(Haar)-ARMA(1,0)-GARCH(1,1)-GPD. The Probability and Quantile plots are almost linear, confirming a good fit. The return levels are within the confidence bands, as expected. The density plot is also a good estimate for the histogram of the data. The model fits the Rand returns data well.
Figure 13 shows graphical goodness-of-fit plots for Rand returns fitted with WD(d4)-ARMA(1,3)-GARCH(1,1)-GPD. The Probability and Quantile plots are almost linear, suggesting a good fit, and the density plot is also a good estimate for the histogram of the data. The model fits the Rand returns data well.
The block maxima of the extreme returns have been fitted to the Generalised Extreme Value Distribution with a weekly block size.
Table 6 shows the maximum likelihood estimates of the parameters and their corresponding standard errors (SE). The shape parameter estimates (ξ) are negative in all cases. Hence, the extreme returns follow the negative Weibull class of the Generalised Extreme Value Distribution for all models, implying that the returns are short-tailed or bounded.
Figure 14 shows graphical goodness-of-fit plots for Rand returns fitted with WD(Haar)-ARMA(1,0)-GARCH(1,1)-GEVD. The Probability and Quantile plots are almost linear, confirming a good fit. The return levels are within the confidence bands, as expected. The density plot is also a good estimate for the histogram of the data. The model fits the Rand returns data well.
Figure 15 shows graphical goodness-of-fit plots for Rand returns fitted with WD(d4)-ARMA(1,3)-GARCH(1,1)-GEVD. The Probability and Quantile plots are almost linear, confirming a good fit. The return levels are within the confidence bands, as expected. The density plot is also a good estimate for the histogram of the data. The model fits the Rand returns data fairly well.
4.3. Value at Risk and Back Test Results
The computed VaR figures presented in Table 7 suggest that BitCoin/USD is riskier than ZAR/USD, since it has a higher value at risk per USD invested in each currency. At the 99% confidence level, BitCoin/USD has averages of (2.70784 + 2.705287)/2 ≈ 2.71% and (4.989638 + 4.983419)/2 ≈ 4.99% for the WD-ARMA-GARCH-GPD and WD-ARMA-GARCH-GEVD models, respectively; these exceed the corresponding 2.69% and 3.59% for ZAR/USD. In monetary terms, at the 99% confidence level, an investor holding BitCoin is likely to lose extremes of almost USD 5.00 per USD 100.00 invested, compared to the USD 3.60 likely to be lost by one holding the Rand, confirming the high risk associated with BitCoin [1]. The average returns presented in Table 1 show that BitCoin/USD returns are higher than ZAR/USD returns. These findings are consistent with mean-variance portfolio theory, which suggests a higher yield for riskier assets [20].
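The GPD-based VaR figures follow the standard peaks-over-threshold formula, VaR_p = u + (σ/ξ)[((n/N_u)(1 − p))^(−ξ) − 1]. A sketch with hypothetical parameter values (not the paper's estimates):

```python
def var_gpd(u, xi, sigma, n, n_u, p):
    """GPD-based Value at Risk at confidence level p (e.g. 0.99):
    VaR_p = u + (sigma/xi) * ((n/n_u * (1 - p))**(-xi) - 1),
    where u is the threshold, (xi, sigma) the fitted GPD shape and scale,
    n the sample size, and n_u the number of threshold exceedances."""
    return u + (sigma / xi) * ((n / n_u * (1.0 - p)) ** (-xi) - 1.0)

# Hypothetical values, for illustration only (not estimates from Tables 3-7):
u, xi, sigma = 1.2, 0.2, 0.6   # threshold, shape, scale
n, n_u = 2000, 400             # sample size, exceedance count

var99 = var_gpd(u, xi, sigma, n, n_u, 0.99)
var95 = var_gpd(u, xi, sigma, n, n_u, 0.95)  # smaller than var99
```

As expected, the 99% VaR exceeds the 95% VaR, and a positive ξ inflates the tail quantile, which is what drives the higher BitCoin/USD figures.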
Based on the p-values presented in Table 8, the Kupiec likelihood ratio tests confirm that the fitted models are well suited to the return series, since the observed p-values are greater than 0.05, except for the WD(Haar)-ARMA-GARCH-GEVD and WD(d4)-ARMA-GARCH-GEVD models at the 99% confidence level for both currencies; at the 95% level, both are adequate. Model adequacy is therefore largely accepted. The model with the highest p-value is considered the best-fit model, and hence it is recommended for use by financial risk analysts in estimating currency VaR.
On this basis, the Kupiec test suggests that the WD(Haar)-ARMA-GARCH-GPD models give the best fit for both the BitCoin/USD and ZAR/USD currencies.
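The Kupiec proportion-of-failures test compares the observed VaR violation rate with the expected rate via a binomial likelihood ratio. A sketch with hypothetical violation counts (not the paper's backtest figures):

```python
import numpy as np
from scipy.stats import chi2

def kupiec_pof(violations, n, p):
    """Kupiec proportion-of-failures likelihood ratio test.
    violations: days the loss exceeded VaR; n: total backtest days;
    p: expected violation rate (e.g. 0.01 for a 99% VaR).
    Assumes 0 < violations < n. LR ~ chi-square with 1 df under H0."""
    x = violations
    phat = x / n
    # -2 log of the binomial likelihood ratio: H0 rate p vs MLE rate phat.
    lr = -2 * ((n - x) * np.log(1 - p) + x * np.log(p)
               - (n - x) * np.log(1 - phat) - x * np.log(phat))
    pval = 1 - chi2.cdf(lr, df=1)
    return lr, pval

# 12 violations over 1000 days against a 99% VaR (expected about 10):
lr, pval = kupiec_pof(12, 1000, 0.01)   # p-value > 0.05: model not rejected
```

A p-value above 0.05 fails to reject model adequacy, which is the criterion applied to the models in Table 8.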
5. Discussion and Conclusions
In this study, the computation and performance of selected wavelet-decomposed (WD)–ARMA-GARCH-EVT-based Value at Risk (VaR) methodologies are explored using BitCoin/USD and ZAR/USD data. The time series are decomposed using the Maximal Overlap Discrete Wavelet Transform (MODWT) technique and filtered using the Haar and Daubechies (d4) wavelets.
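A level-1 MODWT with the Haar filter can be sketched directly, since the MODWT filters are the DWT filters rescaled by 1/√2. This is a minimal illustration on simulated data, not the study's pipeline; a full multi-level MODWT (or the d4 filter) would apply longer, circularly shifted filters at each level:

```python
import numpy as np

def modwt_haar_level1(x):
    """Level-1 MODWT with the Haar filter, using circular boundary conditions.
    MODWT Haar filters: wavelet h = (1/2, -1/2), scaling g = (1/2, 1/2)."""
    xs = np.roll(x, 1)          # X_{t-1}, wrapped circularly
    w = 0.5 * x - 0.5 * xs      # detail (wavelet) coefficients
    v = 0.5 * x + 0.5 * xs      # smooth (scaling) coefficients
    return w, v

# Hypothetical log-return series (illustrative stand-in for BTC/USD or ZAR/USD).
rng = np.random.default_rng(7)
returns = rng.standard_normal(512)
w1, v1 = modwt_haar_level1(returns)

# Unlike the DWT, the MODWT is undecimated (same length at every level)
# and preserves energy: ||x||^2 = ||w1||^2 + ||v1||^2.
```

The denoised/filtered series from such a decomposition is what the ARMA-GARCH models are then fitted to.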
Table 2 suggests that the Haar transformed series resulted in WD(Haar)-ARMA(2,0)-GARCH(1,1) for BitCoin/USD and WD(Haar)-ARMA(1,0)-GARCH(1,1) for ZAR/USD. The Daubechies (d4) transformed series resulted in WD(d4)-ARMA(2,3)-GARCH(1,1) for BitCoin/USD and WD(d4)-ARMA(1,3)-GARCH(1,1) for ZAR/USD. The shape parameters (ξ) for BitCoin/USD, as presented in Table 3 and Table 4, are positive, implying that the BitCoin returns follow heavy-tailed distributions of the Pareto and Fréchet type. Conversely, the shape parameters (ξ) for ZAR/USD, as presented in Table 5 and Table 6, are negative, signifying that the Rand returns follow a bounded distribution [44].
The EVT model provided a good fit to the tails of the distribution of the returns. The diagnostic plots showed that the Probability and Quantile plots do not deviate significantly from a straight line, signifying a good fit.
The daily VaR at two confidence levels, 95% and 99%, was estimated and is summarised in Table 7. Both confidence levels reveal that BitCoin/USD has a higher value at risk than ZAR/USD, leading to the conclusion that BitCoin is riskier than the Rand. Moreover, Table 1 shows that the average returns for BitCoin/USD are higher than those of ZAR/USD. These findings are consistent with mean-variance portfolio theory, which suggests a higher yield for riskier assets [20].
Kupiec’s likelihood ratio test values presented in Table 8 confirm model adequacy for both series, except for the WD(Haar)-ARMA-GARCH-GEVD and WD(d4)-ARMA-GARCH-GEVD models for both currencies at the 99% level, where the p-values are less than 0.05, rejecting model adequacy at the 5% level of significance. This implies that the adequacy of the models partly depends on the confidence level used.
The purpose of this study was to combine wavelet decomposition, ARMA, and GARCH with EVT to estimate the VaR of the daily returns of both BitCoin (BTC) and the South African Rand (ZAR) against the USD, and to compare their riskiness. For both currencies, the WD(Haar)-ARMA-GARCH-GPD model performs fairly well. This could greatly help global investors and forex market risk managers in South Africa understand the risk to which they are exposed when converting savings from Rand to BitCoin, particularly in choosing the model that gives better estimates when computing VaR, an important risk measure in the estimation of risk-adjusted capital requirements.
This information is useful to local foreign currency traders and investors, who need to fully appreciate the tail-related return levels and risk exposure involved in converting their savings or investments to BitCoin instead of the South African currency, the Rand. In particular, when the market enters a turbulent period, BitCoin is riskier than the South African Rand, a developing country’s currency. These results, however, do not imply that WD(Haar)-ARMA-GARCH(1,1)-GPD will always give the best fit among wavelet filters for every currency data set. As further research, we recommend considering other wavelet filters, such as Coiflets and Symlets, and comparing their performance.