Proceeding Paper

Pairs Trading Strategies in Cryptocurrency Markets: A Comparative Study between Statistical Methods and Evolutionary Algorithms †

1 Department of Intelligent Commerce, National Kaohsiung University of Science and Technology, Kaohsiung 82445, Taiwan
2 AI Fintech Center, National Kaohsiung University of Science and Technology, Kaohsiung 82445, Taiwan
3 Department of Finance and Information, National Kaohsiung University of Science and Technology, Kaohsiung 82445, Taiwan
* Author to whom correspondence should be addressed.
† Presented at the 3rd IEEE International Conference on Electronic Communications, Internet of Things and Big Data Conference 2023, Taichung, Taiwan, 14–16 April 2023.
Eng. Proc. 2023, 38(1), 74; https://doi.org/10.3390/engproc2023038074
Published: 6 July 2023

Abstract

Pairs trading is a popular quantitative trading strategy that takes advantage of similarities in the price movements of financial assets. Assuming that the price spreads of trading pairs are mean-reverting, this strategy exploits the disequilibrium in financial markets to find arbitrage investment opportunities. Pairs trading has been widely applied to stock, ETF, and commodity markets. However, the effectiveness of this method for cryptocurrency markets has yet to be properly explored. Therefore, we examine the profitability of pairs trading for 26 cryptocurrencies traded on the Binance exchange at high frequencies of 1, 5, and 60 min. In addition to the traditional statistical methods of distance, correlation, cointegration, and stochastic differential residual (SDR), we focus on two evolutionary algorithms: genetic algorithm (GA) and non-dominated sorting genetic algorithm II (NSGA-II). During the 79-trading-day period from 11 January to 31 March 2018, NSGA-II showed the best results at all frequencies, with an average return of 2.84%. Among the statistical models, SDR ranks first, whereas Correlation ranks last, with average returns of 1.63% and −0.48%, respectively. The z-test results show statistically significant differences between NSGA-II and most of the other models. We propose NSGA-II as the best candidate for use in pairs trading strategies in cryptocurrency markets.

1. Introduction

Pairs trading is a strategy in which an investor simultaneously buys undervalued assets and sells overvalued assets when a disequilibrium condition is detected in a specific financial market. This method has proven to be effective in stock markets, commodity markets, and ETF markets [1]. Recently, the cryptocurrency market has emerged as a profitable but risky investment channel for investors. Therefore, this study is carried out to suggest a suitable pairs trading strategy for the cryptocurrency market by answering two questions: (1) Is pairs trading an effective trading strategy in the cryptocurrency market, especially at high trading frequencies? (2) Among the currently most popular techniques for pairs selection, which ones perform best?
Most research on pairs trading applies traditional pairs selection methods such as distance, cointegration, or correlation. We additionally explore two evolutionary algorithms for selecting trading pairs: genetic algorithm (GA) and non-dominated sorting genetic algorithm II (NSGA-II). Moreover, we conduct rigorous tests to examine whether the methods differ significantly.
The rest of this article is organized as follows. Section 2 describes the pairs selection methods, the trading strategies, and the data and experimental design. Section 3 analyzes and discusses the trading results.

2. Methodology

The research flow chart is depicted in Figure 1 with five steps: processing the data obtained from the Binance Exchange API, selecting trading pairs based on four traditional statistical methods and two evolutionary algorithms, proposing trading strategies, analyzing trading results based on six different criteria, and finally conducting z-tests to confirm the significant differences between the methods.

2.1. Pairs Selection

We deploy six techniques to select trading pairs, including four statistical methods and two evolutionary algorithms.

2.1.1. Euclidean Distance Method

The Euclidean distance method is still prevalent in choosing trading pairs thanks to its simplicity. The squared Euclidean distance is computed between the normalized price series of every two assets, and the pairs with the smallest distance are selected.
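As a minimal sketch of this selection rule, the snippet below normalizes each series by its first price and ranks all pairs by the sum of squared differences. The normalization choice, the function names, and the toy prices are assumptions made for illustration; the study does not state which normalization it uses.

```python
from itertools import combinations

def normalize(prices):
    """Rescale a price series so that it starts at 1.0 (one common choice)."""
    first = prices[0]
    return [p / first for p in prices]

def squared_distance(a, b):
    """Sum of squared differences between two equally long series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def select_pairs_by_distance(price_table, top_n=5):
    """Return the top_n pairs with the smallest distance, closest first."""
    normed = {name: normalize(series) for name, series in price_table.items()}
    scored = [
        (squared_distance(normed[a], normed[b]), a, b)
        for a, b in combinations(sorted(normed), 2)
    ]
    scored.sort()
    return [(a, b) for _, a, b in scored[:top_n]]

# Hypothetical toy prices, not data from the study.
toy_prices = {
    "AAA": [10.0, 11.0, 12.0, 11.0],
    "BBB": [20.0, 22.0, 24.0, 22.0],  # moves exactly like AAA after rescaling
    "CCC": [5.0, 4.0, 6.0, 9.0],
}
closest = select_pairs_by_distance(toy_prices, top_n=1)
```

Here AAA and BBB have identical normalized paths, so they form the closest pair.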

2.1.2. Cointegration

Cointegration is a term coined by Granger [2] and Engle and Granger [3] to describe the long-run relationship between two assets. Granger was awarded the Nobel Prize in Economics in 2003 for his contributions to this concept. When two assets are cointegrated, their prices follow a similar pattern or, in financial terms, the two assets have similar risk exposure so that their prices move together. Cointegration techniques have been used to investigate the co-movement of a wide range of financial assets, including equity shares, commodities, exchange rates, and, more recently, cryptocurrencies. In this study, the augmented Dickey–Fuller (ADF) test [4] is applied to the residual series of the regression between the two price series to determine whether it is stationary. The null hypothesis of the ADF test is that the residual series has a unit root (a stochastic trend); the alternative is that it has none. If the null hypothesis is rejected, the two assets are considered cointegrated and are selected to form a pair.
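The residual-based test described above can be sketched in two steps: regress one price series on the other, then test the residuals for a unit root. The snippet below is a simplified illustration only. It uses a plain Dickey–Fuller regression without lag augmentation rather than the full ADF test, and the critical value of about −3.34 (an approximate 5% value for two variables with a constant) is an assumption; a real study would rely on a statistics library for the test and its p-values. All names and the simulated data are hypothetical.

```python
import math
import random

def ols_fit(x, y):
    """Fit y = a + b*x by ordinary least squares; return (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def df_statistic(resid):
    """t-statistic of gamma in d(e_t) = gamma * e_{t-1} + u_t (no lag terms)."""
    lagged = resid[:-1]
    diffs = [resid[i + 1] - resid[i] for i in range(len(resid) - 1)]
    sxx = sum(e * e for e in lagged)
    gamma = sum(e * d for e, d in zip(lagged, diffs)) / sxx
    errors = [d - gamma * e for e, d in zip(lagged, diffs)]
    sigma2 = sum(u * u for u in errors) / (len(diffs) - 1)
    return gamma / math.sqrt(sigma2 / sxx)

def is_cointegrated(x, y, critical_value=-3.34):
    """Reject the unit-root null when the statistic is below the critical value."""
    a, b = ols_fit(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    stat = df_statistic(resid)
    return stat < critical_value, stat

# Simulated data (assumption): x is a random walk, y tracks 2*x plus noise.
rng = random.Random(0)
x = [100.0]
for _ in range(499):
    x.append(x[-1] + rng.gauss(0.0, 1.0))
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]
flag, stat = is_cointegrated(x, y)
```

Because the residuals of this simulated pair are close to white noise, the statistic is strongly negative and the pair is flagged as cointegrated.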

2.1.3. Correlation

Correlation is quantified using the correlation coefficient ρ, with values ranging from −1 to +1. A value of +1 means that a perfect positive correlation exists between the two variables, i.e., when one moves in a certain direction, the other moves in the same direction and in fixed proportion. A value of −1 means that there is a perfect negative correlation, and 0 means there is no linear correlation at all. A high correlation represents a strong relationship between the two assets, and the trader can choose such a pair. Correlation does not have a well-defined relationship with cointegration: cointegrated series may have low correlation, and highly correlated series may not be cointegrated.
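For illustration, the Pearson coefficient can be computed directly from its definition and used to rank candidate pairs. The function names, the 0.9 threshold, and the toy series below are assumptions for this sketch; the study does not report which threshold or which input series (prices or returns) it uses.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation coefficient of two equally long series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def select_pairs_by_correlation(series_table, threshold=0.9):
    """Return (rho, a, b) for every pair whose correlation exceeds the threshold."""
    selected = []
    for a, b in combinations(sorted(series_table), 2):
        rho = pearson(series_table[a], series_table[b])
        if rho > threshold:
            selected.append((rho, a, b))
    return sorted(selected, reverse=True)

# Hypothetical toy series, not data from the study.
toy_series = {
    "AAA": [1.0, 2.0, 3.0, 4.0],
    "BBB": [2.0, 4.0, 6.0, 8.0],   # perfectly positively correlated with AAA
    "CCC": [4.0, 3.0, 2.0, 1.0],   # perfectly negatively correlated with AAA
}
picked = select_pairs_by_correlation(toy_series)
```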

2.1.4. Stochastic Differential Residual (SDR)

SDR was developed by Do, Faff, and Hamza [5]. It applies CAPM and APT theories to specify the residual spread function $G_t$ as follows.
$$G_t = G(P_t^A, P_t^B, U_t) = R_t^A - R_t^B - \Gamma r_t^m$$
where $R_t^A$ and $R_t^B$ are the expected returns on assets A and B, respectively, $\Gamma$ is the vector of exposure differentials, i.e., the sensitivity of the asset price to macroeconomic factors, and $r_t^m$ denotes the risk premiums.

2.1.5. Genetic Algorithm (GA)

Genetic algorithm (GA) is a computer science technique for solving combinatorial optimization problems. GA simulates the evolutionary adaptation of biological populations based on Darwin's theory. To apply GA, we need to determine the chromosome coding, the fitness function, the chromosome selection method, the crossover method, the mutation method, and the termination condition.
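The six design choices listed above can be made concrete with a small sketch. Every choice below is an assumption for illustration and not the configuration used in the study: chromosomes are bit strings over candidate pairs, the fitness is the sum of hypothetical per-pair scores with a penalty for selecting too many pairs, selection is a binary tournament, crossover is one-point, mutation flips bits, and the run stops after a fixed number of generations.

```python
import random

def run_ga(scores, max_pairs=5, pop_size=30, generations=40,
           mutation_rate=0.05, seed=0):
    """Select a subset of candidate pairs with a simple single-objective GA."""
    rng = random.Random(seed)
    n = len(scores)

    def fitness(chrom):
        chosen = sum(chrom)
        total = sum(s for s, bit in zip(scores, chrom) if bit)
        # Penalize chromosomes that select more pairs than allowed.
        return total - 10.0 * max(0, chosen - max_pairs)

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    # Coding: one bit per candidate pair (1 = pair is selected).
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):          # termination: fixed generation count
        elite = max(pop, key=fitness)     # elitism keeps the best chromosome
        children = [elite[:]]
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            cut = rng.randint(1, n - 1)   # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    return best, fitness(best)

# Hypothetical per-pair scores (e.g., in-sample profit); not data from the study.
toy_scores = [0.8, -0.3, 0.5, 0.1, -0.6, 0.9, 0.2, -0.1]
best_chrom, best_fit = run_ga(toy_scores, max_pairs=3)
```

Because the fitness is a single number, this kind of GA optimizes one objective only, which is the limitation that motivates a multi-objective variant.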

2.1.6. Non-Dominated Sorting Genetic Algorithm II (NSGA-II)

NSGA-II is a multi-objective genetic algorithm with three distinguishing features: a fast non-dominated sorting approach, a fast crowding distance estimation procedure, and a simple crowded comparison operator. NSGA-II follows the steps of population initialization, non-dominated sorting, crowding distance calculation, selection, genetic operators, recombination, and selection [6].
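The three components named above can be sketched compactly. The snippet assumes that every objective is maximized and uses hypothetical two-objective points (for example, return and negative risk); the objectives actually optimized in the study are not specified here.

```python
def dominates(p, q):
    """True if p is at least as good as q everywhere and better somewhere."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

def non_dominated_sort(points):
    """Return fronts as lists of indices; front 0 is the Pareto-optimal set."""
    beaten = [[] for _ in points]   # beaten[i]: indices that point i dominates
    counts = [0] * len(points)      # counts[i]: number of points dominating i
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            if dominates(p, q):
                beaten[i].append(j)
            elif dominates(q, p):
                counts[i] += 1
    fronts = [[i for i, c in enumerate(counts) if c == 0]]
    while fronts[-1]:
        nxt = []
        for i in fronts[-1]:
            for j in beaten[i]:
                counts[j] -= 1
                if counts[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
    return fronts[:-1]

def crowding_distance(points, front):
    """Crowding distance of each index in a front; boundary points get infinity."""
    dist = {i: 0.0 for i in front}
    for m in range(len(points[front[0]])):
        ordered = sorted(front, key=lambda i: points[i][m])
        lo, hi = points[ordered[0]][m], points[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue
        for k in range(1, len(ordered) - 1):
            gap = points[ordered[k + 1]][m] - points[ordered[k - 1]][m]
            dist[ordered[k]] += gap / (hi - lo)
    return dist

def crowded_better(rank_a, dist_a, rank_b, dist_b):
    """Crowded comparison: lower rank wins; ties go to the less crowded point."""
    return rank_a < rank_b or (rank_a == rank_b and dist_a > dist_b)

# Hypothetical (return, -risk) points, not results from the study.
toy_points = [(3.0, -1.0), (2.0, -0.5), (1.0, -0.2), (1.5, -0.8), (0.5, -0.9)]
fronts = non_dominated_sort(toy_points)
distances = crowding_distance(toy_points, fronts[0])
```

In this toy set, the first three points trade return against risk and form the first front, while the last two are dominated.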

2.2. Trading Strategies

After the pairs are generated by the algorithms mentioned above, we set the trading strategies. We apply Bollinger bands to the spread between security A and security B. The upper limit of the Bollinger bands at time t is SMA_t + 2STD_t, and the lower limit is SMA_t − 2STD_t, where SMA is the simple moving average and STD is the standard deviation. If the spread exceeds the upper limit, we short security A and long security B. Conversely, if the spread falls below the lower limit, we long security A and short security B. We close the position when the spread returns to the SMA value. Additionally, a position that is still open at the end of the trading period is forced to close.
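The entry and exit rules can be sketched as a small state machine over a spread series. Several details are assumptions made for this illustration: the spread is passed in precomputed, the bands use the population standard deviation of the five observations preceding the current one, and positions are coded as +1 (long A, short B), −1 (short A, long B), and 0 (flat).

```python
from statistics import mean, pstdev

def bollinger_positions(spread, window=5, width=2.0):
    """Return the position held after each observation of the spread."""
    positions = [0] * len(spread)
    current = 0
    for t in range(window, len(spread)):
        history = spread[t - window:t]       # observations preceding t (assumption)
        sma, std = mean(history), pstdev(history)
        upper, lower = sma + width * std, sma - width * std
        if current == 0:
            if spread[t] > upper:
                current = -1                 # spread too wide: short A, long B
            elif spread[t] < lower:
                current = 1                  # spread too narrow: long A, short B
        elif (current == -1 and spread[t] <= sma) or \
             (current == 1 and spread[t] >= sma):
            current = 0                      # spread reverted to the SMA: close
        positions[t] = current
    if positions:
        positions[-1] = 0                    # force-close at the end of the period
    return positions

# Hypothetical spread values, not data from the study.
toy_spread = [1.0, 1.1, 0.9, 1.0, 1.0, 2.0, 1.6, 1.0, 0.2, 0.9, 1.2]
toy_positions = bollinger_positions(toy_spread)
```

The bands are built from the preceding window because, with only five points, a value that is itself part of the window can never lie more than two population standard deviations from the window mean, so no signal would ever fire.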
The trading strategies are described in Figure 2.

2.3. Data and Experimental Design

2.3.1. Data

The closing prices are taken from the global cryptocurrency exchange Binance for three months, from 3 January 2018 to 31 March 2018, at three different frequencies: 1, 5, and 60 min (1 h). Following Ref. [7], we start with 183 cryptocurrencies on Binance and weed out those that were inactive in the market during the considered period. Next, we keep the 30% of cryptocurrencies with the highest trading volume to ensure liquidity. The remaining 26 cryptocurrencies continue to the next steps. The original raw data on Binance are quoted in BTC. To make the experimental results easier to read and track, we convert the prices into USDT (Tether, a cryptocurrency designed so that each token represents one US dollar).
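A rough sketch of the liquidity filter and the currency conversion follows. The rounding rule for the 30% cut, the per-timestamp multiplication by a BTC/USDT rate, and all names and numbers are assumptions for illustration; the text does not specify these details.

```python
def top_by_volume(volumes, keep_fraction=0.30):
    """Keep the most heavily traded symbols (fraction of the active universe)."""
    ranked = sorted(volumes, key=volumes.get, reverse=True)
    keep = max(1, round(len(ranked) * keep_fraction))
    return ranked[:keep]

def to_usdt(prices_in_btc, btc_usdt):
    """Convert BTC-quoted prices to USDT using the BTC/USDT rate at each time."""
    return [p * rate for p, rate in zip(prices_in_btc, btc_usdt)]

# Hypothetical volumes and prices, not data from the study.
toy_volumes = {"C%d" % i: float(100 - 7 * i) for i in range(10)}
liquid = top_by_volume(toy_volumes)
usdt_prices = to_usdt([0.5, 0.25], [8000.0, 10000.0])
```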

2.3.2. Experimental Design

The experimental setup is described as follows.
  • Initial principal for each trade: US $1000.
  • The SMA and the STD are calculated on a sliding window with a forming period of five time units and a trading period of one time unit.
  • We select five pairs for every sliding window, except for GA and NSGA-II, because these two methods may select fewer than five pairs.
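The sliding-window scheme in the setup above can be sketched as a generator of index ranges. The step size (advancing by one trading period) and the function name are assumptions for illustration.

```python
def sliding_windows(n_units, forming=5, trading=1):
    """Yield (forming_range, trading_range) pairs, advancing by one trading period."""
    start = 0
    while start + forming + trading <= n_units:
        yield (range(start, start + forming),
               range(start + forming, start + forming + trading))
        start += trading

# With 8 time units there are three windows: pairs are formed on units 0-4
# and traded on unit 5, then formed on 1-5 and traded on 6, and so on.
windows = list(sliding_windows(8))
```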

3. Result and Discussion

Table 1 reports the trading results of the six selection methods based on different criteria.
The experimental results show that NSGA-II has the highest return at each data frequency (1, 5, and 60 min), with an average return of 2.84% over the 79 days from 11 January 2018 to 31 March 2018. This corresponds to an average annualized rate of return of about 13.8%. Among the statistical models, SDR performs best and Correlation performs worst, with average returns of 1.63% and −0.48%, respectively. The average return of GA is 0.86%, which is positive but lower than we expected. A possible reason is that GA is a single-objective optimization technique and does not consider other risk factors. The minimum return of GA is also among the lowest. Table 2 shows that NSGA-II differs from Distance, Cointegration, and Correlation at the 1% significance level at all frequencies. The difference from SDR is not significant at any frequency, and the difference from GA is significant at the 1%, 5%, and 10% levels for the 1, 5, and 60 min frequencies, respectively.
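The annualized figure is consistent with compounding the 79-day return over a 365-day year, since cryptocurrency markets trade every day. The compounding convention is an assumption, as the text does not state how the figure was derived; the short check below shows the arithmetic.

```python
def annualize(period_return, days, days_per_year=365):
    """Compound a return earned over `days` up to a full year."""
    return (1.0 + period_return) ** (days_per_year / days) - 1.0

# 2.84% over 79 days compounds to roughly 13.8% per year.
annualized = annualize(0.0284, 79)
```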

Author Contributions

Conceptualization, P.-C.K., P.-C.L., H.-T.D. and W.-H.C.; methodology, P.-C.K., P.-C.L. and Y.-H.K.; software, Y.-F.H. and Y.-H.K.; validation, P.-C.K. and Y.-F.H.; formal analysis, H.-T.D. and Y.-H.K.; investigation, P.-C.K. and P.-C.L.; data curation, P.-C.L. and Y.-H.K.; writing—original draft preparation, H.-T.D.; writing—review and editing, P.-C.L. and H.-T.D.; visualization, H.-T.D. and Y.-H.K.; supervision, P.-C.K., P.-C.L. and W.-H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets for this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Krauss, C. Statistical arbitrage pairs trading strategies: Review and outlook. J. Econ. Surv. 2017, 31, 513–545.
  2. Granger, C.W.J. Some properties of time series data and their use in econometric model specification. J. Econom. 1981, 16, 121–130.
  3. Engle, R.F.; Granger, C.W.J. Co-integration and Error Correction: Representation, Estimation and Testing. Econometrica 1987, 55, 251–276.
  4. Dickey, D.A.; Fuller, W.A. Distribution of the estimators for autoregressive time series with a unit root. J. Am. Stat. Assoc. 1979, 74, 427–431.
  5. Do, B.; Faff, R.; Hamza, K. A new approach to modeling and estimation for pairs trading. In Proceedings of the 2006 Financial Management Association European Conference, Durham, UK, 20–22 April 2006.
  6. Yusoff, Y.; Ngadiman, M.S.; Zain, A.M. Overview of NSGA-II for optimizing machining process parameters. Procedia Eng. 2011, 15, 3978–3983.
  7. Fil, M.; Kristoufek, L. Pairs trading in cryptocurrency markets. IEEE Access 2020, 8, 172644–172651.
Figure 1. Research flow chart.
Figure 2. Trading strategies.
Table 1. Trading results.
Each cell lists the values for the 1 min / 5 min / 60 min frequencies.
Algorithm | Total Return | Trade Count | AVG Return | Max Return | Min Return | Max Drawdown
Distance | 1.66 / 1.51 / 1.10 | 234 / 215 / 172 | 0.01 / 0.01 / 0.01 | 31.31 / 31.31 / 19.68 | −10.21 / −10.24 / −14.86 | 0.08 / 0.05 / 0.32
Cointegration | 1.39 / 0.98 / 0.31 | 212 / 204 / 155 | 0.01 / 0.00 / 0.00 | 40.91 / 40.09 / 33.70 | −32.06 / −34.10 / −32.00 | 0.38 / 0.17 / 0.00
Correlation | −0.52 / −0.42 / −0.49 | 201 / 195 / 172 | 0.00 / 0.00 / 0.00 | 36.85 / 40.52 / 45.93 | −29.94 / −30.53 / −32.64 | 0.64 / 0.00 / 0.00
SDR | 2.11 / 1.40 / 1.39 | 215 / 208 / 183 | 0.01 / 0.01 / 0.01 | 61.71 / 67.28 / 65.41 | −64.18 / −64.48 / −57.61 | 0.52 / 0.72 / 0.51
GA | 0.48 / 0.99 / 1.12 | 183 / 171 / 119 | 0.00 / 0.01 / 0.01 | 60.88 / 66.37 / 56.38 | −60.91 / −64.48 / −57.59 | 0.01 / 0.46 / 0.00
NSGA-II | 3.59 / 2.68 / 2.24 | 196 / 178 / 113 | 0.02 / 0.02 / 0.02 | 31.04 / 27.86 / 26.08 | −28.17 / −55.66 / −33.31 | 1.40 / 1.51 / 0.50
Table 2. z-test results.
Algorithm | 1 min | 5 min | 60 min
NSGA-II | 1.0000 | 1.0000 | 1.0000
Distance | 0.0000 *** | 0.0031 *** | 0.0000 ***
Cointegration | 0.0000 *** | 0.0000 *** | 0.0000 ***
Correlation | 0.0000 *** | 0.0000 *** | 0.0000 ***
SDR | 0.1771 | 0.2173 | 0.1961
GA | 0.0000 *** | 0.0241 ** | 0.0561 *
* Significant at the 10% level; ** significant at the 5% level; *** significant at the 1% level.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Ko, P.-C.; Lin, P.-C.; Do, H.-T.; Kuo, Y.-H.; Huang, Y.-F.; Chen, W.-H. Pairs Trading Strategies in Cryptocurrency Markets: A Comparative Study between Statistical Methods and Evolutionary Algorithms. Eng. Proc. 2023, 38, 74. https://doi.org/10.3390/engproc2023038074