# Statistical Prediction of Future Sports Records Based on Record Values


## Abstract


## 1. Introduction

## 2. Prediction of Record Values

### 2.1. Point Prediction

### 2.2. Interval Prediction

#### 2.2.1. Lower Record Values from Power Function Distributions

#### 2.2.2. Upper Record Values from Pareto Distributions

#### 2.2.3. Expected Lengths of Prediction Intervals

## 3. Application to Athletics and American Football Data

## 4. Discussion

## 5. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

| Abbreviation | Definition |
|---|---|
| iid | independent and identically distributed |
| cdf | cumulative distribution function |
| pdf | probability density function |
| MLP | maximum likelihood predictor |
| MOLP | maximum observed likelihood predictor |
| MPSP | maximum product of spacings predictor |
| Q–Q | quantile–quantile |
| NFL | National Football League |
| PPR | points per reception |


**Figure 1.** Expected lengths of the $90\%$ prediction intervals ${I}_{1},\dots ,{I}_{4}$ for the next lower record value ($s=r+1$) based on the first $r$ lower record values from a $Pow(\lambda ,\beta )$ distribution with $\lambda =11.3$ and $\beta =70$.

**Figure 2.** Expected lengths of the $90\%$ prediction intervals ${I}_{1},\dots ,{I}_{4}$ for the next but one lower record value ($s=r+2$) based on the first $r$ lower record values from a $Pow(\lambda ,\beta )$ distribution with $\lambda =11.3$ and $\beta =70$.

**Figure 4.** Histogram of the women's 100 m results faster than 11.3 s and pdf of the power function distribution with $\lambda =11.3$ and $\beta =69.73$, where the latter is the maximum likelihood estimate computed from all times below the upper threshold of 11.3 s.

**Figure 5.** Power function Q–Q plot for the women's 100 m times faster than 11.3 s, where the five smallest times were omitted when calculating the regression line.
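The maximum likelihood estimate mentioned in the caption of Figure 4 can be illustrated with a minimal sketch. It assumes the common parametrization of the power function distribution with cdf $F(x)=(x/\lambda )^{\beta }$ on $(0,\lambda )$ and a known threshold $\lambda$, in which case the estimate is $\hat{\beta }=n/\sum_{i}\ln (\lambda /x_{i})$. The function names `pow_beta_mle` and `pow_pdf` are hypothetical, and the parametrization is an assumption rather than something restated in the captions.

```python
import math


def pow_beta_mle(times, lam):
    """MLE of beta for Pow(lam, beta), assuming cdf F(x) = (x / lam) ** beta
    on (0, lam) with lam known. Only observations below lam are used."""
    below = [x for x in times if 0 < x < lam]
    if not below:
        raise ValueError("no observations below the threshold")
    # Closed form: beta_hat = n / sum(log(lam / x_i))
    return len(below) / sum(math.log(lam / x) for x in below)


def pow_pdf(x, lam, beta):
    """Density of Pow(lam, beta) under the same assumed parametrization."""
    if not 0 < x < lam:
        return 0.0
    return beta / lam * (x / lam) ** (beta - 1)


if __name__ == "__main__":
    # Made-up times for illustration only, not the data behind Figure 4.
    sample = [11.25, 11.10, 11.28, 10.95, 11.19, 11.22]
    beta_hat = pow_beta_mle(sample, lam=11.3)
    print(f"beta_hat = {beta_hat:.2f}")
    print(f"pdf at 11.2 = {pow_pdf(11.2, 11.3, beta_hat):.3f}")
```

A large estimate of $\beta$ means that the density is concentrated just below the threshold, which fits sprint times that cluster close to 11.3 s.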

**Table 1.** Expected lengths of the $95\%$ and $90\%$ prediction intervals ${I}_{1},\dots ,{I}_{4}$ for the $s$th lower record value based on the first $r$ records from a $Pow(\lambda ,\beta )$ distribution.

| | ${I}_{1}$ ($\alpha =5\%$) | ${I}_{2}$ ($\alpha =5\%$) | ${I}_{3}$ ($\alpha =5\%$) | ${I}_{4}$ ($\alpha =5\%$) | ${I}_{1}$ ($\alpha =10\%$) | ${I}_{2}$ ($\alpha =10\%$) | ${I}_{3}$ ($\alpha =10\%$) | ${I}_{4}$ ($\alpha =10\%$) |
|---|---|---|---|---|---|---|---|---|
| $\lambda =11.3$, $\beta =70$ | | | | | | | | |
| $r=3$, $s=4$ | 1.45 | 1.03 | 0.54 | 1.13 | 0.98 | 0.74 | 0.44 | 0.95 |
| $r=8$, $s=9$ | 0.66 | 0.64 | 0.51 | 1.61 | 0.51 | 0.49 | 0.41 | 1.35 |
| $r=8$, $s=10$ | 1.01 | 0.97 | 0.72 | 1.67 | 0.79 | 0.77 | 0.60 | 1.40 |
| $r=25$, $s=26$ | 0.43 | 0.43 | 0.40 | 2.17 | 0.34 | 0.34 | 0.32 | 1.83 |
| $r=25$, $s=27$ | 0.63 | 0.63 | 0.57 | 2.18 | 0.51 | 0.51 | 0.47 | 1.83 |
| $r=25$, $s=28$ | 0.78 | 0.78 | 0.70 | 2.19 | 0.64 | 0.64 | 0.58 | 1.84 |
| $\lambda =45.5$, $\beta =90$ | | | | | | | | |
| $r=3$, $s=4$ | 4.72 | 3.32 | 1.72 | 3.61 | 3.15 | 2.37 | 1.39 | 3.02 |
| $r=8$, $s=9$ | 2.15 | 2.07 | 1.64 | 5.21 | 1.65 | 1.60 | 1.32 | 4.38 |
| $r=8$, $s=10$ | 3.29 | 3.16 | 2.35 | 5.43 | 2.59 | 2.50 | 1.95 | 4.56 |
| $r=25$, $s=26$ | 1.47 | 1.46 | 1.36 | 7.41 | 1.17 | 1.17 | 1.10 | 6.23 |
| $r=25$, $s=27$ | 2.16 | 2.15 | 1.96 | 7.46 | 1.76 | 1.75 | 1.62 | 6.28 |
| $r=25$, $s=28$ | 2.69 | 2.68 | 2.40 | 7.52 | 2.21 | 2.20 | 2.00 | 6.32 |
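To make the notion of an expected interval length concrete, the following sketch evaluates it in closed form for one simple interval. It is not a reproduction of ${I}_{1},\dots ,{I}_{4}$, whose definitions are not restated here. The assumptions are: cdf $F(x)=(x/\lambda )^{\beta }$ with known $\lambda$, prediction of the next record only ($s=r+1$), and an equal-tailed interval built from the pivot $W=(T_{r+1}-T_{r})/T_{r}$ with $T_{k}=-\beta \ln (L_{k}/\lambda )$, for which $P(W\le w)=1-(1+w)^{-r}$. All function names are hypothetical.

```python
def pivot_quantiles(r, alpha):
    """Equal-tailed quantiles of W = (T_{r+1} - T_r) / T_r.
    T_r is Gamma(r, 1) and T_{r+1} - T_r is Exp(1), independent of T_r,
    so P(W <= w) = 1 - (1 + w) ** (-r)."""
    w_lo = (1 - alpha / 2) ** (-1 / r) - 1
    w_hi = (alpha / 2) ** (-1 / r) - 1
    return w_lo, w_hi


def expected_length(r, alpha, lam, beta):
    """Expected length of the interval
    [lam * (L_r / lam) ** (1 + w_hi), lam * (L_r / lam) ** (1 + w_lo)].
    Uses L_r = lam * exp(-T_r / beta) and E[exp(-c * T_r)] = (1 + c) ** (-r)."""
    w_lo, w_hi = pivot_quantiles(r, alpha)
    mean_upper = (1 + (1 + w_lo) / beta) ** (-r)
    mean_lower = (1 + (1 + w_hi) / beta) ** (-r)
    return lam * (mean_upper - mean_lower)


if __name__ == "__main__":
    for r in (3, 8, 25):
        print(r, round(expected_length(r, alpha=0.10, lam=11.3, beta=70), 2))
```

For $r=8$, $\alpha =10\%$, $\lambda =11.3$ and $\beta =70$ this evaluates to roughly 0.49, the same order of magnitude as the entries for that setting in Table 1. Because the endpoints are $\lambda$ times a power of $L_{r}/\lambda$, the length scales linearly with $\lambda$.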

**Table 2.** Percentages of coverage of the $90\%$ prediction intervals ${I}_{1},\dots ,{I}_{4}$ for the $s$th lower record value based on $n$ = 10,000 sequences of the first $r$ lower record values from a $Pow(\lambda ,\beta )$ distribution.

| | ${I}_{1}$ | ${I}_{2}$ | ${I}_{3}$ | ${I}_{4}$ |
|---|---|---|---|---|
| $\lambda =11.3$, $\beta =70$ | | | | |
| $r=3$, $s=4$ | 0.9041 | 0.9001 | 0.8229 | 0.9404 |
| $r=8$, $s=9$ | 0.8965 | 0.8989 | 0.8716 | 0.9910 |
| $r=8$, $s=10$ | 0.8995 | 0.8994 | 0.8492 | 0.9785 |
| $r=25$, $s=26$ | 0.8995 | 0.8985 | 0.8890 | 0.9998 |
| $r=25$, $s=27$ | 0.8980 | 0.8990 | 0.8832 | 0.9988 |
| $r=25$, $s=28$ | 0.9032 | 0.9040 | 0.8808 | 0.9971 |
| $\lambda =45.5$, $\beta =90$ | | | | |
| $r=3$, $s=4$ | 0.9041 | 0.9001 | 0.8229 | 0.9404 |
| $r=8$, $s=9$ | 0.8965 | 0.8989 | 0.8716 | 0.9910 |
| $r=8$, $s=10$ | 0.8995 | 0.8994 | 0.8492 | 0.9785 |
| $r=25$, $s=26$ | 0.8995 | 0.8985 | 0.8890 | 0.9998 |
| $r=25$, $s=27$ | 0.8980 | 0.8990 | 0.8832 | 0.9988 |
| $r=25$, $s=28$ | 0.9032 | 0.9040 | 0.8808 | 0.9971 |
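A coverage study of the kind summarized in Table 2 could be set up along the following lines. This is a sketch under stated assumptions, not the simulation code behind the table: it assumes cdf $F(x)=(x/\lambda )^{\beta }$ with known $\lambda$, generates lower records via $L_{k}=\lambda \exp (-S_{k}/\beta )$ with $S_{k}$ a sum of $k$ iid standard exponential variables, and checks one simple equal-tailed interval for the next record that follows from $P\left(\ln (L_{r+1}/\lambda )/\ln (L_{r}/\lambda )-1\le w\right)=1-(1+w)^{-r}$. The interval is an illustrative choice and all function names are hypothetical.

```python
import math
import random


def simulate_lower_records(n_records, lam, beta, rng):
    """First n_records lower record values from Pow(lam, beta), assuming
    cdf F(x) = (x / lam) ** beta. Since -beta * log(X / lam) is Exp(1),
    records are lam * exp(-S_k / beta) with S_k a sum of k Exp(1) draws."""
    total = 0.0
    records = []
    for _ in range(n_records):
        total += rng.expovariate(1.0)
        records.append(lam * math.exp(-total / beta))
    return records


def next_record_interval(last_record, r, lam, alpha=0.10):
    """Equal-tailed (1 - alpha) interval for the (r + 1)th lower record,
    given the rth record and known lam. It does not depend on beta."""
    w_lo = (1 - alpha / 2) ** (-1 / r) - 1
    w_hi = (alpha / 2) ** (-1 / r) - 1
    ratio = last_record / lam  # lies in (0, 1)
    return lam * ratio ** (1 + w_hi), lam * ratio ** (1 + w_lo)


def coverage(r, lam, beta, n_sequences=10_000, alpha=0.10, seed=0):
    """Fraction of simulated sequences whose (r + 1)th record is covered."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sequences):
        recs = simulate_lower_records(r + 1, lam, beta, rng)
        lower, upper = next_record_interval(recs[r - 1], r, lam, alpha)
        hits += lower <= recs[r] <= upper
    return hits / n_sequences


if __name__ == "__main__":
    print(coverage(r=8, lam=11.3, beta=70))
    print(next_record_interval(10.49, r=10, lam=11.3))
```

Because this interval is exact under the assumed model, the simulated coverage should fluctuate around 0.90. With $\lambda =11.3$, $r=10$ and a last record of 10.49, `next_record_interval` returns approximately (10.22, 10.49). Since $\lambda$ only rescales the records and the interval does not involve $\beta$, coverage under these assumptions does not depend on either parameter.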

**Table 3.** World records, maximum product of spacings predictor and (approximate) $90\%$ prediction intervals ${I}_{1}$, ${I}_{2}$ and ${I}_{3}$ for the next record of the women's 100 m based on the previous records.

| $s$ | World Record | MPSP | ${I}_{1}$ | ${I}_{2}$ | ${I}_{3}$ |
|---|---|---|---|---|---|
| 1 | 11.20 | | | | |
| 2 | 11.08 | 11.10 | [9.46, 11.19] | [10.91, 11.19] | |
| 3 | 11.07 | 10.97 | [9.03, 11.07] | [10.35, 11.07] | [10.76, 11.07] |
| 4 | 11.04 | 10.99 | [10.63, 11.07] | [10.69, 11.07] | [10.84, 11.07] |
| 5 | 11.01 | 10.98 | [10.77, 11.04] | [10.76, 11.04] | [10.85, 11.04] |
| 6 | 10.88 | 10.95 | [10.80, 11.01] | [10.78, 11.01] | [10.84, 11.01] |
| 7 | 10.81 | 10.81 | [10.62, 10.88] | [10.62, 10.88] | [10.68, 10.88] |
| 8 | 10.79 | 10.74 | [10.56, 10.81] | [10.56, 10.81] | [10.61, 10.81] |
| 9 | 10.76 | 10.73 | [10.58, 10.79] | [10.57, 10.79] | [10.61, 10.79] |
| 10 | 10.49 | 10.70 | [10.57, 10.76] | [10.55, 10.76] | [10.59, 10.76] |
| 11 | | 10.41 | [10.22, 10.49] | [10.22, 10.49] | [10.26, 10.49] |

**Table 4.** World records (until 2022), maximum product of spacings predictor and $90\%$ prediction interval ${I}_{2}$/${I}_{2}^{\prime}$ for the next record based on the previous records for various athletic events.

| Event | Women: Record | Women: $r$ | Women: MPSP | Women: Prediction Interval | Men: Record | Men: $r$ | Men: MPSP | Men: Prediction Interval |
|---|---|---|---|---|---|---|---|---|
| 100 m | 10.49 | 10 | 10.41 | [10.22, 10.49] | 9.58 | 13 | 9.53 | [9.40, 9.58] |
| 100/110 m hurdles | 12.12 | 9 | 12.04 | [11.83, 12.12] | 12.80 | 9 | 12.71 | [12.50, 12.80] |
| 200 m | 21.34 | 9 | 21.15 | [20.68, 21.33] | 19.19 | 5 | 18.90 | [18.03, 19.18] |
| 400 m | 47.60 | 12 | 47.25 | [46.42, 47.58] | 43.03 | 4 | 42.46 | [40.53, 43.00] |
| 800 m | 1:53.28 | 2 | 1:50.06 | [1:32.74, 1:53.11] | 1:40.91 | 7 | 1:40.20 | [1:38.29, 1:40.87] |
| 1500 m | 3:50.07 | 3 | 3:45.61 | [3:28.01, 3:49.84] | 3:26.00 | 8 | 3:24.55 | [3:20.77, 3:25.92] |
| 10,000 m | 29:01.03 | 9 | 28:40.22 | [27:48.19, 28:59.95] | 26:11.00 | 13 | 26:02.20 | [25:41.55, 26:10.55] |
| Marathon | 2:14:04 | 2 | 2:06:45 | [1:30:47, 2:13:41] | 2:01:09 | 8 | 2:00:11 | [1:57:40, 2:01:06] |
| Shot put | 22.63 | 26 | 22.81 | [22.64, 23.19] | 23.37 | 16 | 23.56 | [23.38, 24.01] |
| Javelin throw | 72.28 | 3 | 78.70 | [72.60, 111.95] | 98.48 | 8 | 101.07 | [98.61, 108.23] |
| Discus throw | 76.80 | 17 | 77.56 | [76.84, 79.31] | 74.08 | 12 | 74.89 | [74.12, 76.88] |
| Long jump | 7.52 | 14 | 7.57 | [7.52, 7.70] | 8.95 | 9 | 9.06 | [8.96, 9.36] |
| High jump | 2.09 | 13 | 2.10 | [2.09, 2.13] | 2.45 | 22 | 2.46 | [2.45, 2.48] |

| Event | Score |
|---|---|
| Passing | 1 point per 25 yards |
| Passing touchdowns | 4 points |
| Interceptions thrown | −2 points |
| Rushing/receiving yards | 1 point per 10 yards |
| Receptions | 1 point |
| Touchdowns | 6 points |
| 2-Point conversions | 2 points |
| Fumbles lost | −2 points |
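The scoring rules in the table above translate directly into a small calculation. The sketch below is illustrative only. It assumes that yardage is scored fractionally (which would be consistent with non-integer record values such as 34.3) and that the 6-point "Touchdowns" row refers to non-passing touchdowns. The class `GameStats`, its field names, and the function `ppr_points` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GameStats:
    """Single-game stat line; all field names are hypothetical."""
    passing_yards: int = 0
    passing_touchdowns: int = 0
    interceptions: int = 0
    rushing_yards: int = 0
    receiving_yards: int = 0
    receptions: int = 0
    touchdowns: int = 0  # assumed: rushing/receiving touchdowns
    two_point_conversions: int = 0
    fumbles_lost: int = 0


def ppr_points(s: GameStats) -> float:
    """Fantasy points under the scoring table, with fractional yardage
    points (an assumption, not stated in the table)."""
    return (
        s.passing_yards / 25
        + 4 * s.passing_touchdowns
        - 2 * s.interceptions
        + (s.rushing_yards + s.receiving_yards) / 10
        + 1 * s.receptions
        + 6 * s.touchdowns
        + 2 * s.two_point_conversions
        - 2 * s.fumbles_lost
    )


if __name__ == "__main__":
    # Made-up stat line for illustration, not taken from the data set.
    line = GameStats(passing_yards=300, passing_touchdowns=3,
                     interceptions=1, rushing_yards=20)
    print(ppr_points(line))  # 12 + 12 - 2 + 2 = 24.0
```

Keeping the weights in one function makes it easy to switch to another scoring variant, for example by dropping the reception term.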

**Table 6.** World records (2000–2019), maximum product of spacings predictor and (approximate) $90\%$ prediction intervals ${I}_{1}^{\prime}$, ${I}_{2}^{\prime}$ and ${I}_{3}^{\prime}$ for the next record based on the previous records for the fantasy points of quarterbacks.

| $s$ | Player | Team | Records | MPSP | ${I}_{1}^{\prime}$ | ${I}_{2}^{\prime}$ | ${I}_{3}^{\prime}$ |
|---|---|---|---|---|---|---|---|
| 1 | Cade McNown | CHI | 34.3 | | | | |
| 2 | Trent Green | STL | 36.3 | 42.0 | [34.7, 1621.4] | [34.7, 63.0] | |
| 3 | Peyton Manning | IND | 37.4 | 41.3 | [36.4, 106.5] | [36.5, 89.4] | [36.5, 53.6] |
| 4 | Trent Green | KAN | 37.9 | 41.2 | [37.5, 50.5] | [37.6, 61.4] | [37.6, 49.9] |
| 5 | Michael Vick | ATL | 38.2 | 40.9 | [38.0, 45.0] | [38.0, 53.1] | [38.0, 47.5] |
| 6 | Daunte Culpepper | MIN | 41.8 | 40.6 | [38.3, 43.1] | [38.3, 49.3] | [38.3, 46.0] |
| 7 | Michael Vick | PHI | 49.3 | 44.7 | [41.9, 49.2] | [41.9, 54.2] | [41.9, 51.1] |
| 8 | | | | 53.4 | [49.5, 62.4] | [49.5, 66.7] | [49.5, 62.8] |

**Table 7.** World records (2000–2019), maximum product of spacings predictor and (approximate) $90\%$ prediction intervals ${I}_{1}^{\prime}$, ${I}_{2}^{\prime}$ and ${I}_{3}^{\prime}$ for the next record based on the previous records for the fantasy points of running backs.

| $s$ | Player | Team | Records | MPSP | ${I}_{1}^{\prime}$ | ${I}_{2}^{\prime}$ | ${I}_{3}^{\prime}$ |
|---|---|---|---|---|---|---|---|
| 1 | Duce Staley | PHI | 36.2 | | | | |
| 2 | Marshall Faulk | STL | 44.9 | 43.7 | [36.6, 1284.9] | [36.6, 63.6] | |
| 3 | Marshall Faulk | STL | 45.6 | 54.9 | [45.4, 2688.2] | [45.4, 182.1] | [45.4, 82.1] |
| 4 | Fred Taylor | JAX | 51.8 | 52.4 | [45.9, 101.6] | [45.9, 93.5] | [45.9, 69.3] |
| 5 | Shaun Alexander | SEA | 56.1 | 59.4 | [52.1, 95.7] | [52.2, 95.2] | [52.2, 78.0] |
| 6 | Clinton Portis | DEN | 57.4 | 63.6 | [56.4, 91.4] | [56.5, 93.8] | [56.5, 81.6] |
| 7 | Jamaal Charles | KAN | 59.5 | 64.0 | [57.7, 83.8] | [57.7, 87.4] | [57.7, 79.4] |
| 8 | | | | 65.6 | [59.8, 82.1] | [59.8, 85.8] | [59.8, 79.8] |


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Empacher, C.; Kamps, U.; Volovskiy, G.
Statistical Prediction of Future Sports Records Based on Record Values. *Stats* **2023**, *6*, 131-147.
https://doi.org/10.3390/stats6010008
