# Investigating Causes of Model Instability: Properties of the Prediction Accuracy Index

## Abstract


## 1. Introduction

- (a) explore the properties of the (multivariate) PAI;
- (b) reflect on its use in practice;
- (c) make further recommendations on how to investigate the cause of instability when a high value of the PAI indicates a lack of model stability.

#### 1.1. The Prediction Accuracy Index (PAI)

- ${r}_{j}$ is the vector of explanatory variables for the $j$th observation of the review data ($j = 1$ to $N$);
- ${x}_{i}$ is the vector of explanatory variables for the $i$th observation of the development data ($i = 1$ to $n$);
- $V=\mathrm{MSE}\times {\left({\mathrm{X}}^{\mathrm{T}}\mathrm{X}\right)}^{-1}$ is the variance–covariance matrix of the estimated regression coefficients;
- $\mathrm{X}$ is the design matrix with rows defined by the ${x}_{i}$ (the explanatory variables at development).

#### 1.2. Notation and Illustrative Data Examples

For the review data R1, which consist of the development data in Table 1 plus one additional observation, the $PAI$ equals 1.31, which Taplin and Hunt (2019) interpret as a deterioration requiring further investigation. The review data R2 are described in Table 2; for these data, $PAI=1.58$, which Taplin and Hunt (2019) interpret as a significant deterioration in the predictive accuracy of the model.

## 2. The PAI as a Function of the Squared Mahalanobis Distances

#### An Expression for the PAI as a Function of the Squared Mahalanobis Distance ${M}_{v}$

The PAI can be expressed in terms of average squared Mahalanobis distances as
$$PAI=\frac{1+{\overline{M}}_{r}}{1+{\overline{M}}_{d}}$$
where:

- ${\overline{M}}_{r}$ equals the average of the squared Mahalanobis distances of the review data ${\rho}_{j}$,
- ${\overline{M}}_{d}$ equals the average of the squared Mahalanobis distances of the development data ${\epsilon}_{i}$,
- and all Mahalanobis distances are relative to the mean and variance–covariance matrix of the development data. That is, the squared Mahalanobis distance of observation $v$ (either ${\rho}_{j}$ for review data or ${\epsilon}_{i}$ for development data) is
$${M}_{v}={(v-\overline{\epsilon})}^{T}{V}_{d}^{-1}\left(v-\overline{\epsilon}\right)$$
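As a numerical illustration of this expression (a sketch assuming `numpy`; the helper names and simulated data are illustrative, not from the paper), the PAI can be computed directly from squared Mahalanobis distances taken relative to the development data; when the review data equal the development data, the PAI is exactly 1.

```python
import numpy as np

def sq_mahalanobis(obs, dev):
    """Squared Mahalanobis distances of the rows of `obs`, measured
    relative to the mean and variance-covariance matrix of `dev`."""
    diff = obs - dev.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(dev, rowvar=False))
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

def pai(dev, rev):
    """PAI as a function of average squared Mahalanobis distances,
    all taken relative to the development data (Section 2)."""
    m_d = sq_mahalanobis(dev, dev).mean()
    m_r = sq_mahalanobis(rev, dev).mean()
    return (1.0 + m_r) / (1.0 + m_d)

rng = np.random.default_rng(1)
dev = rng.normal(size=(50, 3))   # stand-in development data
print(pai(dev, dev))             # review == development -> 1.0
```

A mean shift in the review data increases `pai` above 1, as discussed in Section 4.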

## 3. The Contribution of Individual Observations to the PAI

- (a) ${I}_{k}$ can be either positive or negative;
- (b) The average of the ${I}_{k}$ is 0;
- (c) A useful reference point for a small ${I}_{k}$ is 0, since, while ${I}_{k}$ can be negative, the PAI does not change if an observation with ${I}_{k}=0$ is removed;
- (d) A useful reference point for a large ${I}_{k}$ is $PA{I}_{0}-1$, since the $PAI$ is reduced to 1 when an observation with ${I}_{k}=PA{I}_{0}-1$ is removed;
- (e) ${I}_{k}$ can be written in terms of the squared Mahalanobis distance ${m}_{k}$ of observation ${\rho}_{k}$:
$${I}_{k}=PA{I}_{0}-PA{I}_{\left(-k\right)}=\frac{{m}_{k}-{\overline{M}}_{r}}{N\left(1+{\overline{M}}_{d}\right)}$$
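Property (e) gives a direct way to compute all of the contributions at once, without refitting or recomputing leave-one-out PAIs. A sketch assuming `numpy` (helper names and simulated data are illustrative); property (b), that the contributions average to zero, follows immediately from the closed form:

```python
import numpy as np

def sq_mahalanobis(obs, dev):
    """Squared Mahalanobis distances relative to the development data."""
    diff = obs - dev.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(dev, rowvar=False))
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

def contributions(dev, rev):
    """I_k = (m_k - mean(m)) / (N * (1 + mean M_d)), property (e)."""
    m = sq_mahalanobis(rev, dev)
    m_d_bar = sq_mahalanobis(dev, dev).mean()
    return (m - m.mean()) / (len(rev) * (1.0 + m_d_bar))

rng = np.random.default_rng(2)
dev = rng.normal(size=(50, 3))
rev = rng.normal(size=(20, 3)) + 0.5   # mildly shifted review data
I = contributions(dev, rev)
print(I.mean())                        # property (b): averages to 0
```

Large positive entries of `I` flag the review observations whose removal would most reduce the PAI.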

## 4. The Proportion of the PAI Due to a Shift in Distributions

- (1) $\frac{1}{N}{\sum}_{j=1}^{N}{({\rho}_{j}-\overline{\rho})}^{T}{V}_{d}^{-1}\left({\rho}_{j}-\overline{\rho}\right)$, the mean squared Mahalanobis distance of the review data from the mean of the review data;
- (2) ${(\overline{\rho}-\overline{\epsilon})}^{T}{V}_{d}^{-1}\left(\overline{\rho}-\overline{\epsilon}\right)$, the squared Mahalanobis distance between the mean of the review data and the mean of the development data.
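These two components sum exactly to the mean squared Mahalanobis distance of the review data from the development mean, which can be checked numerically (a sketch assuming `numpy`; the simulated data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
dev = rng.normal(size=(50, 3))                               # development data
rev = rng.normal(size=(30, 3)) + np.array([1.0, 0.5, 0.0])   # shifted review data

inv_cov = np.linalg.inv(np.cov(dev, rowvar=False))   # V_d^{-1}
mu_d, mu_r = dev.mean(axis=0), rev.mean(axis=0)

# Component (1): spread of the review data about its own mean.
d1 = rev - mu_r
within = np.einsum("ij,jk,ik->i", d1, inv_cov, d1).mean()

# Component (2): squared Mahalanobis distance between the two means.
shift = (mu_r - mu_d) @ inv_cov @ (mu_r - mu_d)

# Total: mean squared Mahalanobis distance of the review data from the dev mean.
d2 = rev - mu_d
total = np.einsum("ij,jk,ik->i", d2, inv_cov, d2).mean()

print(np.isclose(within + shift, total))   # the decomposition is exact
```

The decomposition is the usual within/between split of a quadratic form about two different centres, so it holds for any positive-definite ${V}_{d}^{-1}$.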

For the review data R1 (the development data plus one additional outlying observation), $PAI=1.31$ and $S=0.02$. This extra observation changes the mean of the explanatory variables from $\overline{\epsilon}={\left(3.12,2.08,1.44\right)}^{T}$ to $\overline{\rho}={\left(3.18,2.06,1.43\right)}^{T}$; however, transforming the review data to have the same mean as the development data reduces the PAI by only 0.006. Hence, the change in the mean of the variables contributes only $S=0.006/0.31\approx$ 2% of the 0.31 by which the PAI exceeds the baseline value of 1. As expected from the discussion in Section 3, the excess PAI in this example is not due to a shift in the means of the explanatory variables but to the single outlier in the review data.

The review data R2 can be transformed to have the same mean as the development data (for example, an observation at $\rho={\left(1,1,1\right)}^{T}$ is transformed to ${\rho}^{\prime}={\left(0.3,1.06,1\right)}^{T}$), which produces a value of $PAI=1.13$ for the transformed review data. Hence, from Equation (6), the proportion of the excess PAI due exclusively to a change in the means of the explanatory variables is $(1.58-1.13)/(1.58-1)=0.77$. That is, 77% of the amount by which the PAI of 1.58 exceeds the value of 1 (the value when the model performs equally accurately on review and development data) is due to the shift in the mean of the review data relative to the development data.
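The proportion $S$ can be computed mechanically: translate the review data so that its mean matches the development mean, recompute the PAI, and form $S=(PA{I}_{0}-PA{I}^{\prime})/(PA{I}_{0}-1)$. A sketch assuming `numpy` (names and simulated data are illustrative); when the review data are an exact mean-shift of the development data, all of the excess is due to the shift and $S=1$:

```python
import numpy as np

def sq_mahalanobis(obs, dev):
    diff = obs - dev.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(dev, rowvar=False))
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

def pai(dev, rev):
    return (1.0 + sq_mahalanobis(rev, dev).mean()) / \
           (1.0 + sq_mahalanobis(dev, dev).mean())

def prop_due_to_shift(dev, rev):
    """S = (PAI_0 - PAI') / (PAI_0 - 1), where PAI' is the PAI after
    translating the review data to share the development mean."""
    pai_0 = pai(dev, rev)
    rev_centred = rev - (rev.mean(axis=0) - dev.mean(axis=0))
    return (pai_0 - pai(dev, rev_centred)) / (pai_0 - 1.0)

rng = np.random.default_rng(4)
dev = rng.normal(size=(50, 3))
rev = dev + np.array([0.8, 0.0, 0.0])  # review data = shifted development data
print(prop_due_to_shift(dev, rev))     # S is (numerically) 1: pure mean shift
```

For data such as R1, where the excess PAI is driven by an outlier rather than a shift, the same function returns a value near 0.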

## 5. The Contributions of Explanatory Variables to the PAI

When the entire model development is considered (rather than just the variables included in the final model), there is a compelling argument that the PAI should be calculated using more variables than just those in the final model. Thus, we suggest that the PAI should also be calculated using all variables considered for inclusion in a model. This is important because the accuracy of predictions from the final model depends not only on the coefficients of the variables in the model but also on the choice of which variables are included in the final model. Model stability is arguably relevant for the entire modelling process, not just the final model.
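One way to act on this suggestion is to compute the PAI twice, once on the final-model variables and once on all candidate variables; a candidate variable that shifts at review but was excluded from the final model then still registers. A sketch assuming `numpy` (column indices and data are illustrative):

```python
import numpy as np

def pai(dev, rev):
    """PAI from average squared Mahalanobis distances (Section 2)."""
    mu = dev.mean(axis=0)
    inv_cov = np.linalg.inv(np.atleast_2d(np.cov(dev, rowvar=False)))
    m = lambda z: np.einsum("ij,jk,ik->i", z - mu, inv_cov, z - mu).mean()
    return (1.0 + m(rev)) / (1.0 + m(dev))

rng = np.random.default_rng(5)
dev = rng.normal(size=(50, 2))
rev = dev.copy()
rev[:, 1] += 2.0       # the candidate variable NOT in the final model shifts

final_cols = [0]       # only column 0 made it into the final model
print(pai(dev[:, final_cols], rev[:, final_cols]))  # 1.0: shift invisible
print(pai(dev, rev))                                # > 1: shift detected
```

The `np.atleast_2d` wrapper keeps the covariance matrix two-dimensional when only a single variable is selected.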

## 6. Discussion

#### 6.1. Categorical Variables

#### 6.2. Econometric Variables

Consider review data consisting of a single observation. The values of the explanatory variables with the smallest squared Mahalanobis distances are $\rho={\left(3,2,1\right)}^{T}$, ${\left(3,2,2\right)}^{T}$, and ${\left(5,3,1\right)}^{T}$, where the squared Mahalanobis distances are 1.0, 1.4, and 1.8, respectively (Table 3). These occur a total of 16 out of 50 times in the development data; therefore, there is only a 32% chance that the PAI will be green (<1.1) even if the review data are selected from the distribution of the development data. All other values for the explanatory variables at review result in a PAI greater than 1.1. For the PAI to be less than 1.5, the squared Mahalanobis distance must be less than 3.5, which adds the observations $\rho={\left(4,3,1\right)}^{T}$ and ${\left(1,1,2\right)}^{T}$, which occur six times in the development data. Hence, only 44% of the observations in the review data result in a PAI less than 1.5. Thus, even if the value of the explanatory variables at review were randomly selected from the distribution in the development data, there is a 56% chance the PAI is red (>1.5) and only a 32% chance it is green (<1.1).
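The mechanics of this calculation can be sketched for any discrete development sample: treat each development observation as a candidate single-observation review set, compute its PAI, and weight by its empirical frequency. The sketch below assumes `numpy`, uses the green (<1.1) and red (>1.5) bands from above, and runs on synthetic stand-in data rather than the values in Tables 1 and 3:

```python
import numpy as np

def sq_mahalanobis(obs, dev):
    diff = obs - dev.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(dev, rowvar=False))
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

def single_obs_traffic_light(dev, green=1.1, red=1.5):
    """P(PAI green) and P(PAI red) when the review 'sample' is a single
    observation drawn from the empirical development distribution."""
    m = sq_mahalanobis(dev, dev)             # m for each candidate review obs
    pai_vals = (1.0 + m) / (1.0 + m.mean())  # PAI for a single-obs review
    return (pai_vals < green).mean(), (pai_vals > red).mean()

rng = np.random.default_rng(6)
dev = rng.integers(1, 4, size=(50, 3)).astype(float)  # discrete dev data
p_green, p_red = single_obs_traffic_light(dev)
print(p_green, p_red)   # traffic-light probabilities under the dev distribution
```

This makes concrete the warning of Section 6.2: with very small review samples, a red PAI can have substantial probability even with no genuine instability.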

## 7. Conclusions

## Funding

## Data Availability Statement

## Conflicts of Interest

## Appendix A

- Top left entry ($A$) is $n$, the sample size at development;
- The off-diagonal entries ($B$ and $C$) are row and column vectors that contain zeroes;
- Bottom-right entries ($D$) equal $n$ times the variance–covariance matrix of the explanatory variables ($ij$th element equal to ${\sum}_{k=1}^{n}{X}_{ik}{X}_{jk}$).
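This block structure can be confirmed numerically when the explanatory variables are centred, which the zero off-diagonal blocks require. A sketch assuming `numpy` (simulated data are illustrative):

```python
import numpy as np

n = 50
rng = np.random.default_rng(7)
Z = rng.normal(size=(n, 3))
Z -= Z.mean(axis=0)                  # centre the explanatory variables
X = np.hstack([np.ones((n, 1)), Z])  # design matrix: intercept + variables

G = X.T @ X                          # the partitioned matrix of Appendix A
print(G[0, 0])                       # A = n, the development sample size
print(np.allclose(G[0, 1:], 0.0))    # B (and C): zero vectors
print(np.allclose(G[1:, 1:], n * np.cov(Z, rowvar=False, bias=True)))  # D
```

The `bias=True` option makes `np.cov` divide by $n$, matching the statement that $D$ equals $n$ times the variance–covariance matrix.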

## Appendix B

- (a) ${I}_{k}$ can be either positive or negative because the squared Mahalanobis distance ${m}_{k}$ of a review observation can be either higher or lower than the review average ${\overline{M}}_{r}$;
- (b) The average of the ${I}_{k}$ is 0 because the average of the $PA{I}_{\left(-k\right)}$ equals $PA{I}_{0}$ (to see this, note that each $PA{I}_{\left(-k\right)}$ is a mean of the same set of numbers after leaving out one value, and each value in the set is omitted exactly once across the $N$ leave-one-out means);
- (c) By definition of ${I}_{k}$, when ${I}_{k}=0$, we have $PA{I}_{0}=PA{I}_{\left(-k\right)}$;
- (d) By definition of ${I}_{k}$, when ${I}_{k}=PA{I}_{0}-1$, we have $PA{I}_{0}-PA{I}_{\left(-k\right)}=PA{I}_{0}-1$ and therefore $PA{I}_{\left(-k\right)}=1$;
- (e) The identity ${I}_{k}=PA{I}_{0}-PA{I}_{\left(-k\right)}=\frac{{m}_{k}-{\overline{M}}_{r}}{N\left(1+{\overline{M}}_{d}\right)}$ follows from Equation (2), which defines the PAI in terms of the average of the squared Mahalanobis distances of the review data, together with the identity relating the mean $\overline{y}$ of $N$ observations to the $k$th observation ${y}_{k}$ and the mean ${\overline{y}}_{\left(-k\right)}$ of the other $N-1$ observations:
$$\overline{y}=\frac{{y}_{k}+\left(N-1\right){\overline{y}}_{\left(-k\right)}}{N}$$
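The leave-one-out identity used in (e) is easy to confirm numerically (a sketch assuming `numpy`; the data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
y = rng.normal(size=20)
N = len(y)

# For every k, the overall mean equals (y_k + (N-1) * mean-without-k) / N.
for k in range(N):
    loo_mean = np.delete(y, k).mean()   # mean of the other N-1 values
    recombined = (y[k] + (N - 1) * loo_mean) / N
    assert np.isclose(recombined, y.mean())

print("identity holds for all k")
```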

## Appendix C

## References

- Basel Committee on Banking Supervision. 2006. Basel II: International Convergence of Capital Measurement and Capital Standards, A Revised Framework—Comprehensive Version. Bank for International Settlements. Available online: https://www.bis.org/publ/bcbs128.htm (accessed on 4 February 2018).
- Becker, Aneta, and Jarosław Becker. 2021. Dataset shift assessment measures in monitoring predictive models. Procedia Computer Science 192: 3391–402.
- Giudici, Paolo, and Emanuela Raffinetti. 2021. Shapley-Lorenz eXplainable Artificial Intelligence. Expert Systems with Applications 167: 114104.
- Hurlbert, Stuart H. 1984. Pseudoreplication and the design of ecological field experiments. Ecological Monographs 54: 187–211.
- International Accounting Standards Board. 2014. IFRS 9—Financial Instruments. Available online: http://www.aasb.gov.au/admin/file/content105/c9/AASB9_12-14.pdf (accessed on 4 February 2018).
- Kruger, Chamay, Willem Daniel Schutte, and Tanja Verster. 2021. Using Model Performance to Assess the Representativeness of Data for Model Development and Calibration in Financial Institutions. Risks 9: 204.
- Lundberg, Scott M., and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. Paper presented at the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, December 4–9.
- Mahalanobis, Prasanta Chandra. 1936. On the Generalized Distance in Statistics. Proceedings of the National Institute of Science of India 2: 49–55.
- Petersen, Mitchell A. 2009. Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches. Review of Financial Studies 22: 435–80.
- Taplin, Ross, and Clive Hunt. 2019. The Population Accuracy Index: A New Measure of Population Stability for Model Monitoring. Risks 7: 53.

**Table 1.** Number of observations in the illustrative development data for combinations of the three explanatory variables ${\epsilon}_{i}={\left(a,b,c\right)}^{T}$.

| $c$ | $b$ | $a=1$ | $a=2$ | $a=3$ | $a=4$ | $a=5$ | $a=6$ |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 2 | 1 | 0 | 0 | 0 | 0 |
| 1 | 2 | 0 | 2 | 7 | 4 | 0 | 0 |
| 1 | 3 | 0 | 0 | 0 | 6 | 5 | 1 |
| 2 | 1 | 9 | 1 | 0 | 0 | 0 | 0 |
| 2 | 2 | 0 | 2 | 4 | 1 | 0 | 0 |
| 2 | 3 | 0 | 0 | 0 | 1 | 3 | 1 |

**Table 2.** Number of observations in the illustrative review data R2 for combinations of the three explanatory variables ${\rho}_{j}={\left(a,b,c\right)}^{T}$.

| $c$ | $b$ | $a=1$ | $a=2$ | $a=3$ | $a=4$ | $a=5$ | $a=6$ |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 2 | 1 | 0 | 0 | 0 |
| 1 | 2 | 0 | 1 | 3 | 6 | 4 | 0 |
| 1 | 3 | 0 | 0 | 0 | 1 | 5 | 4 |
| 2 | 1 | 1 | 8 | 1 | 0 | 0 | 0 |
| 2 | 2 | 0 | 0 | 2 | 3 | 2 | 0 |
| 2 | 3 | 0 | 0 | 0 | 1 | 1 | 3 |

**Table 3.** Squared Mahalanobis distance ${M}_{v}$ for values of ${\epsilon}_{i}={\left(a,b,c\right)}^{T}$ using the illustrative development data in Table 1.

| $c$ | $b$ | $a=1$ | $a=2$ | $a=3$ | $a=4$ | $a=5$ | $a=6$ |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 4.3 | 5.7 | 12.5 | 24.8 | 42.5 | 65.7 |
| 1 | 2 | 12.0 | 3.8 | 1.0 | 3.7 | 11.8 | 25.5 |
| 1 | 3 | 40.3 | 22.5 | 10.2 | 3.3 | 1.8 | 5.9 |
| 2 | 1 | 2.5 | 4.2 | 11.4 | 24.0 | 42.1 | 65.7 |
| 2 | 2 | 11.7 | 3.8 | 1.4 | 4.4 | 12.9 | 26.9 |
| 2 | 3 | 41.6 | 24.1 | 12.1 | 5.5 | 4.4 | 8.8 |

**Table 4.** Average squared Mahalanobis distances and PAI for males and females, illustrating a model that predicts females slightly less accurately than males at development but considerably less accurately at review.

| Subset | Development Data | Review Data | PAI |
|---|---|---|---|
| Male applicants | 4.3 | 4.4 | 1.02 |
| Female applicants | 4.5 | 6.2 | 1.31 |


© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

Taplin, R. Investigating Causes of Model Instability: Properties of the Prediction Accuracy Index. *Risks* **2023**, *11*, 110. https://doi.org/10.3390/risks11060110