# Discrete-Time Survival Models with Neural Networks for Age–Period–Cohort Analysis of Credit Risk


## Abstract


## 1. Introduction

## 2. Background and Literature Review

## 3. Research Methodology

#### 3.1. Discrete-Time Survival Model

- The variable $t$ is the primary time-to-event variable and denotes the loan age of a credit account, i.e., the time elapsed since the loan account was created. It is also called loan maturity. In this study, data were provided monthly, so $t$ is measured in months, but a different period (e.g., quarterly) could be used;
- The variable $m$ represents the number of loan accounts in the dataset, and accounts are indexed by $i\in \left\{1,\cdots ,m\right\}$;
- We let ${t}_{i}^{*}$ be the loan age time index of the last observation recorded for account $i$, so that for each account $i$, records only exist for loan ages $t\in \left\{1,\cdots ,{t}_{i}^{*}\right\}$;
- The binary variable ${d}_{it}$ represents whether account $i$ defaults at loan age $t$ (1 denotes default and 0 denotes non-default). The precise definition of default can vary by application; in this study, three consecutive months of missed payments were used, which is an industry standard following the Basel II convention of 90 days of missed payments (BCBS 2006);
- Notably, in the survival analysis context, default must be the last event in a series. Hence, ${d}_{it}=0$ for all $t<{t}_{i}^{*}$. At $t={t}_{i}^{*}$, ${d}_{it}=0$ indicates a censored account (i.e., the event time, such as death time in medical studies or default time in credit risk, is not observed during the whole observation period), whereas ${d}_{it}=1$ indicates the default event;
- The variable ${\mathbf{w}}_{i}$ is a vector of static application variables collected at the time when the customer applies for a loan (e.g., credit score, interest rate, debt-to-income ratio, and loan-to-value);
- We let ${v}_{i}$ be the origination period, or vintage, of account $i$, i.e., the period in which the customer opened the account. Normally, this period is the quarter or year of origination. Note that ${v}_{i}$ is just one of the features in the vector ${\mathbf{w}}_{i}$. We let ${N}_{v}$ be the total number of vintages in the dataset;
- We denote time-varying variables (e.g., behavioral, repayment history, and macroeconomic data) by the vector ${\mathbf{x}}_{it}$, which is collected across the lifetime of the account;
- We let ${c}_{it}$ be the calendar time of account $i$ at loan age $t$, with ${N}_{c}$ being the total number of calendar time periods. Calendar time is typically measured monthly, quarterly, or annually. Note that ${c}_{it}$ is just one of the features in the vector ${\mathbf{x}}_{it}$.
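The notation above maps naturally onto a "person-period" layout with one row per account per loan age. The following is a minimal sketch under that assumption; the function name `person_period_rows` and the input tuple layout are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def person_period_rows(
    accounts: List[Tuple[int, int, bool]]
) -> List[Tuple[int, int, int]]:
    """Expand (account_id, t_star, defaulted) into rows (i, t, d_it).

    d_it is 0 for every t < t_star. At t == t_star it is 1 if the account
    defaulted and 0 if the account was censored.
    """
    rows = []
    for account_id, t_star, defaulted in accounts:
        for t in range(1, t_star + 1):
            d_it = 1 if (defaulted and t == t_star) else 0
            rows.append((account_id, t, d_it))
    return rows


# Hypothetical example: account 1 defaults at loan age 3,
# account 2 is censored after loan age 2.
rows = person_period_rows([(1, 3, True), (2, 2, False)])
for row in rows:
    print(row)
```

Each row can then be joined with the static vector for the account and the time-varying vector for that loan age before model fitting.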

#### 3.2. Vintage Model

#### 3.3. Neural Network with DTSM (NN-DTSM) for Credit Risk

#### 3.4. Age–Period–Cohort Effects and Lexis Graph

- **The age effect** reflects effects relating to the aging and developmental changes to individuals across their lifecycle;
- **The period effect** represents an equal environmental effect on all individuals over a specific calendar time period simultaneously, since systematic changes in social events, such as a financial crisis or COVID-19, may cause similar effects on individuals across all ages at the same time;
- **The cohort effect** is the influence on groups of observations that originate at the same time, depending on the context of the problem. For example, it could be people born at the same time or cars manufactured in the same batch.
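To make the three time dimensions concrete, the sketch below aggregates account-period records into default rates per (vintage, age) cell, which is the kind of grid a Lexis graph displays. It is an illustration only: the function names are hypothetical, and the calendar convention (all indices in the same time unit, with age 1 falling in the origination period) is an assumption rather than something stated in the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def lexis_default_rates(
    records: Iterable[Tuple[int, int, int]]
) -> Dict[Tuple[int, int], float]:
    """Aggregate (vintage, age, d_it) records into a default rate per cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [defaults, observations]
    for vintage, age, d_it in records:
        counts[(vintage, age)][0] += d_it
        counts[(vintage, age)][1] += 1
    return {cell: defaults / n for cell, (defaults, n) in counts.items()}


def calendar_period(vintage: int, age: int) -> int:
    """Assumed convention: age 1 is observed in the origination period."""
    return vintage + age - 1


# Hypothetical records: (vintage, age, default indicator).
records = [(1, 1, 0), (1, 1, 0), (1, 2, 1), (1, 2, 0), (2, 1, 0)]
rates = lexis_default_rates(records)
print(rates)
# Cells (vintage=1, age=2) and (vintage=2, age=1) share one calendar period.
print(calendar_period(1, 2), calendar_period(2, 1))
```

Under this convention, cells on the same diagonal of the grid share a calendar period, which is why a period effect appears as a diagonal pattern on a vintage-by-age plot.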

#### 3.5. Age Period Cohort Model

- For all $t$ such that $1\le t\le {N}_{T}$, where ${N}_{T}=\max_{i}\left({t}_{i}^{*}\right)$: $${\delta}_{t}^{\left[T\right]}(x)=\begin{cases}1, & \text{if } x=t\\ 0, & \text{otherwise}\end{cases}$$
- For all $v$ such that $1\le v\le {N}_{v}$: $${\delta}_{v}^{\left[V\right]}(x)=\begin{cases}1, & \text{if } x=v\\ 0, & \text{otherwise}\end{cases}$$
- For all $c$ such that $1\le c\le {N}_{c}$: $${\delta}_{c}^{\left[C\right]}(x)=\begin{cases}1, & \text{if } x=c\\ 0, & \text{otherwise}\end{cases}$$
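The three families of indicator functions above amount to one-hot encodings of loan age, vintage, and calendar time. A minimal sketch follows; the helper names `one_hot` and `apc_features` are hypothetical, and concatenating the three encodings into one feature vector is an assumption about how they might be used, not a statement of the authors' implementation.

```python
from typing import List


def one_hot(x: int, n: int) -> List[int]:
    """Return [delta_1(x), ..., delta_n(x)], with delta_k(x) = 1 if x == k else 0."""
    return [1 if x == k else 0 for k in range(1, n + 1)]


def apc_features(t: int, v: int, c: int, n_t: int, n_v: int, n_c: int) -> List[int]:
    """Concatenate the age, vintage, and calendar-time indicator values."""
    return one_hot(t, n_t) + one_hot(v, n_v) + one_hot(c, n_c)


# Hypothetical sizes: 3 loan ages, 2 vintages, 4 calendar periods.
features = apc_features(t=2, v=1, c=3, n_t=3, n_v=2, n_c=4)
print(features)
```

Exactly one entry is set in each of the three blocks, so every valid observation has three non-zero indicators.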

#### 3.6. Linear Regression and Fitting Macroeconomic Variables

#### 3.7. Lagged Macroeconomic Model

#### 3.8. Overall Framework of the Proposed Method

## 4. Data and Experimental Design

#### 4.1. Mortgage Data

#### 4.2. Macroeconomic Data

- We used APC to capture the whole calendar-time effect. If MEVs were included directly in the model, the most important part of the calendar-time effect would be missing from the APC decomposition;
- We did not assume that MEVs represent all calendar-time effects: other factors, such as legislation and environmental or social changes, also influence the calendar-time function and should be included as part of the calendar-time effect;
- Some previous papers sought to build explanatory models, whereas in this study we developed predictive models. MEVs are used later in our study as criteria to assess the accuracy of the model, and including them directly in the model would reduce the reliability of this testing process.

#### 4.3. Evaluation Methods

## 5. Results

#### 5.1. Neural Network versus Linear DTSM

#### 5.1.1. Experimental Setup

#### 5.1.2. Hyperparameter Selection Using Grid Search

`Model.fit` method. The grid search yields values of the negative log-likelihood (the target loss function) ranging from 0.07681 to 0.08257. The best value, 0.07681, is obtained with the following neural network hyperparameters: no dropout, 4 hidden layers, 8 neurons in each layer, and 20 training epochs.
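For reference, the sketch below shows how a negative log-likelihood of this kind could be computed for binary default indicators. It assumes a Bernoulli likelihood averaged over account-period observations; the averaging, the clipping constant `eps`, and the function name are assumptions made for illustration and are not confirmed by the text.

```python
import math
from typing import Sequence


def mean_negative_log_likelihood(
    d: Sequence[int], p: Sequence[float], eps: float = 1e-12
) -> float:
    """Average Bernoulli negative log-likelihood.

    d holds default indicators d_it (0 or 1); p holds the predicted
    default probabilities for the same account-period observations.
    """
    if len(d) != len(p) or len(d) == 0:
        raise ValueError("d and p must be non-empty and of equal length")
    total = 0.0
    for d_it, p_it in zip(d, p):
        p_it = min(max(p_it, eps), 1.0 - eps)  # avoid log(0)
        total += d_it * math.log(p_it) + (1 - d_it) * math.log(1.0 - p_it)
    return -total / len(d)


# Hypothetical example: three observations, uninformative predictions.
loss = mean_negative_log_likelihood([0, 0, 1], [0.5, 0.5, 0.5])
print(round(loss, 4))
```

Lower values indicate predicted probabilities that agree more closely with the observed outcomes.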

#### 5.1.3. Comparison between NN-DTSM and Linear DTSM

#### 5.2. Lexis Graphs

#### 5.3. APC Model

#### 5.4. Macroeconomic Data Fitting

#### 5.4.1. Choose Time Lag for Macroeconomic Data

#### 5.4.2. Multivariate Fit of MEVs with a Calendar Time Effect Component

## 6. Discussion

## 7. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## Note

1. www.freddiemac.com/research/datasets/sf-loanlevel-dataset (accessed on 22 December 2023).
2. See note 1 above.

## References

- Alfonso Perez, Gerardo, and Raquel Castillo. 2023. Nonlinear Techniques and Ridge Regression as a Combined Approach: Carcinoma Identification Case Study. Mathematics 11: 1795.
- Allison, Paul. 1982. Discrete-time methods for the analysis of event histories. Sociological Methodology 13: 61–98.
- Altman, Edward. 1968. Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy. Journal of Finance 23: 589–609.
- Arya, Shweta, Catherine Eckel, and Colin Wichman. 2013. Anatomy of the credit score. Journal of Economic Behavior & Organization 95: 175–85.
- Banasik, John, Jonathan Crook, and Lynn Thomas. 1999. Not if but when will borrowers default. Journal of the Operational Research Society 50: 1185–90.
- Basel Committee on Banking Supervision (BCBS). 2006. Basel II: International Convergence of Capital Measurement and Capital Standards. Available online: www.bis.org/publ/bcbsca.htm (accessed on 22 December 2023).
- Bell, Stephen. 2020. ANPC member profile for APC. Australasian Plant Conservation. Journal of the Australian Network for Plant Conservation 29: 38–39.
- Bellotti, Anthony, and Jonathan Crook. 2009. Credit scoring with macroeconomic variables using survival analysis. Journal of the Operational Research Society 60: 1699–707.
- Bellotti, Anthony, and Jonathan Crook. 2013. Forecasting and stress testing credit card default using dynamic models. International Journal of Forecasting 29: 563–74.
- Bellotti, Anthony, and Jonathan Crook. 2014. Retail credit stress testing using a discrete hazard model with macroeconomic factors. Journal of the Operational Research Society 65: 340–50.
- Blumenstock, Gabrial, Stefan Lessmann, and Hsin-Vonn Seow. 2022. Deep learning for survival and competing risk modelling. Journal of the Operational Research Society 73: 26–38.
- Breeden, Joseph. 2016. Incorporating lifecycle and environment in loan-level forecasts and stress tests. European Journal of Operational Research 255: 649–58.
- Breeden, Joseph. 2021. A survey of machine learning in credit risk. Journal of Credit Risk 17: 1–62.
- Breeden, Joseph, and Jonathan Crook. 2022. Multihorizon discrete time survival models. Journal of the Operational Research Society 73: 56–69.
- Correa, Alehandro, Andres Gonzalez, and Camilo Ladino. 2011. Genetic Algorithm Optimization for Selecting the Best Architecture of a Multi-Layer Perceptron Neural Network: A Credit Scoring Case. SAS Global Forum. Available online: https://support.sas.com/resources/papers/proceedings11/149–2011.pdf (accessed on 22 December 2023).
- Cox, David Roxbee. 1972. Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological) 34: 187–202.
- Dahl, George, Tara Sainath, and Geoffrey Everest Hinton. 2013. Improving deep neural networks for LVCSR using rectified linear units and dropout. Paper presented at the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, May 26–31.
- De Leonardis, Daniele, and Roberto Rocci. 2008. Assessing the default risk by means of a discrete-time survival analysis approach. Applied Stochastic Models in Business and Industry 24: 291–306.
- Dendramis, Yiannis, Elias Tzavalis, and Aikaterini Cheimarioti. 2020. Measuring the Default Risk of Small Business Loans: Improved Credit Risk Prediction using Deep Learning. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3729918 (accessed on 22 December 2023).
- Dirick, Lore, Gerda Claeskens, and Bart Baesens. 2017. Time to default in credit scoring using survival analysis: A benchmark study. Journal of the Operational Research Society 68: 652–65.
- Faraggi, David, and Richard Simon. 1995. A neural network model for survival data. Statistics in Medicine 14: 73–82.
- Fosse, Ethan, and Christopher Winship. 2019. Analyzing age-period-cohort data: A review and critique. Annual Review of Sociology 45: 467–92.
- Frame, W. Scott, Andreas Fuster, Joseph Tracy, and James Vickery. 2015. The rescue of Fannie Mae and Freddie Mac. Journal of Economic Perspectives 29: 25–52.
- Gensheimer, Michael, and Balasubramanian Narasimhan. 2019. A scalable discrete-time survival model for neural networks. PeerJ 7: e6257.
- Glenn, Norval. 2005. Cohort Analysis. Newcastle upon Tyne: Sage, vol. 5.
- Gourieroux, Christian, Alain Monfort, and Vassilis Polimenis. 2006. Affine models for credit risk analysis. Journal of Financial Econometrics 4: 494–530.
- Hemmert, Giselmar, Laura Schons, Jan Wieseke, and Heiko Schimmelpfennig. 2018. Log-likelihood-based pseudo-R2 in logistic regression: Deriving sample-sensitive benchmarks. Sociological Methods & Research 47: 507–31.
- Huang, Qiujun, Jingli Mao, and Yong Liu. 2012. An improved grid search algorithm of SVR parameters optimization. Paper presented at the 2012 IEEE 14th International Conference on Communication Technology, Chengdu, China, November 9–11.
- Hussin Adam Khatir, Ahmed Almustfa, and Marco Bee. 2022. Machine Learning Models and Data-Balancing Techniques for Credit Scoring: What Is the Best Combination? Risks 10: 169.
- Jha, Paritosh Navinchandra, and Marco Cucculelli. 2021. A New Model Averaging Approach in Predicting Credit Risk Default. Risks 9: 114.
- Khemais, Zaghdoudi, Djebali Nesrine, and Mezni Mohamed. 2016. Credit scoring and default risk prediction: A comparative study between discriminant analysis & logistic regression. International Journal of Economics and Finance 8: 39.
- Kielstra, Paul. 2023. Finding Value in Generative AI for Financial Services. Edited by KweeChuan Yeo. Cambridge, MA: MIT Technology Review Insights. Available online: https://www.technologyreview.com/2023/11/26/1083841/finding-value-in-generative-ai-for-financial-services/ (accessed on 22 December 2023).
- Kupper, Lawrence, Joseph Janis, Azza Karmous, and Bernard Greenberg. 1985. Statistical age-period-cohort analysis: A review and critique. Journal of Chronic Diseases 38: 811–30.
- Lee, Changhee, William Zame, Jinsung Yoon, and Mihaela van der Schaar. 2018. DeepHit: A Deep Learning Approach to Survival Analysis With Competing Risks. Proceedings of the AAAI Conference on Artificial Intelligence 32: 2314–21. Available online: https://ojs.aaai.org/index.php/AAAI/article/view/11842 (accessed on 22 December 2023).
- Lu, Hongtao, and Qinchuan Zhang. 2016. Applications of deep convolutional neural network in computer vision. Journal of Data Acquisition and Processing 31: 1–17.
- Ohno-Machado, Lucila. 1996. Medical Applications of Artificial Neural Networks: Connectionist Models of Survival. Ph.D. dissertation, Stanford University, Stanford, CA, USA.
- Pang, Hong-xia, Wen-de Dong, Zhi-hai Xu, Hua-jun Feng, Qi Li, and Yue-ting Chen. 2011. Novel linear search for support vector machine parameter selection. Journal of Zhejiang University Science C 12: 885–96.
- Ptak-Chmielewska, Aneta, and Anna Matuszyk. 2020. Application of the random survival forests method in the bankruptcy prediction for small and medium enterprises. Argumenta Oeconomica 44: 127–42.
- Quell, Peter, Bellotti Anthony, Breeden Joseph, and Javier Calvo Martin. 2021. Machine learning and model risk management. Model Risk Manager’s International Association. (mrmia.org).
- Radzi, Siti Fairuz Mat, Muhammad Khalis Abdul Karim, M Iqbal Saripan, Mohd Amiruddin Abd Rahman, Iza Nurzawani Che Isa, and Mohammad Johari Ibahim. 2021. Hyperparameter tuning and pipeline optimization via grid search method and tree-based autoML in breast cancer prediction. Journal of Personalized Medicine 11: 978.
- Ryu, Jae Yong, Mi Young Lee, Jeong Hyun Lee, Byong Ho Lee, and Kwang-Seok Oh. 2020. DeepHIT: A deep learning framework for prediction of hERG-induced cardiotoxicity. Bioinformatics 36: 3049–55.
- Siarka, Pawel. 2011. Vintage analysis as a basic tool for monitoring credit risk. Mathematical Economics 7: 213–28.
- Sohn, So Young, Dong Ha Kim, and Jin Hee Yoon. 2016. Technology credit scoring model with fuzzy logistic regression. Applied Soft Computing 43: 150–58.
- Stepanova, Maria, and Thomas Lynn. 2001. PHAB scores: Proportional hazards analysis behavioural scores. Journal of the Operational Research Society 52: 1007–16.
- Thomas, Lynn. 2000. A survey of credit and behavioural scoring: Forecasting financial risk of lending to consumers. International Journal of Forecasting 16: 149–72.
- Thomas, Lynn, Jonathan Crook, and David Edelman. 2017. Credit Scoring and Its Applications. Philadelphia: SIAM.
- Yang, Yang, and Kenneth Land. 2013. Age-period-cohort analysis: New models, methods, and empirical applications. Abingdon: Taylor & Francis.
- Yang, Yang, Sam Schulhofer-Wohl, Wenjiang Fu, and Kenneth Land. 2008. The intrinsic estimator for age-period-cohort analysis: What it is and how to use it. American Journal of Sociology 113: 1697–736.

Symbol | Hyperparameter | Values |
---|---|---|
$d$ | Percentage of dropout (regularization) | 0, 0.1, 0.2, 0.3, 0.4, 0.5 |
$nn$ | Number of hidden layers | 2, 4, 6, 8 |
$nl$ | Number of neurons in each layer | 2, 4, 6, 8 |
$ti$ | Training iteration for the network | 5, 10, 15, 20, 25, 30 |
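The grid defined by these values can be enumerated exhaustively. The sketch below does so with the standard library only; `evaluate` stands for whatever routine trains a network and returns its validation loss, and `toy_loss` is a purely hypothetical stand-in so that the example runs without a deep learning framework. It does not reproduce the study's results.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# Candidate values copied from the hyperparameter table.
GRID: Dict[str, List[float]] = {
    "d": [0, 0.1, 0.2, 0.3, 0.4, 0.5],   # dropout
    "nn": [2, 4, 6, 8],                  # hidden layers
    "nl": [2, 4, 6, 8],                  # neurons per layer
    "ti": [5, 10, 15, 20, 25, 30],       # training iterations
}


def grid_search(
    evaluate: Callable[[Dict[str, float]], float],
    grid: Dict[str, List[float]] = GRID,
) -> Tuple[Dict[str, float], float, int]:
    """Evaluate every combination; return (best params, best loss, count)."""
    names = list(grid)
    best_params, best_loss, n_evaluated = {}, float("inf"), 0
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        loss = evaluate(params)
        n_evaluated += 1
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss, n_evaluated


def toy_loss(params: Dict[str, float]) -> float:
    """Hypothetical stand-in for model training; not a real validation loss."""
    return params["d"] + abs(params["nn"] - 4) + abs(params["nl"] - 8) + abs(params["ti"] - 20)


best_params, best_loss, n_evaluated = grid_search(toy_loss)
print(n_evaluated, best_params, best_loss)
```

With four hyperparameters taking 6, 4, 4, and 6 values, the search covers 576 combinations, so each evaluation needs to be reasonably cheap.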

Variable | Coefficient Estimate | p-Value |
---|---|---|
X1 (coefficient of unemployment rate, lag 4 months) | $+4.000\times {10}^{-4}$ | <0.0001 |
X2 (coefficient of HPI, lag 1 month) | $-3.118\times {10}^{-5}$ | <0.0001 |
X3 (coefficient of the time trend) | $-5.309\times {10}^{-6}$ | 0.522 |
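As an illustration of how lagged regressors with these coefficients could be combined, the sketch below shifts monthly series by the stated lags and evaluates the linear combination. Several things here are assumptions: the intercept is not reported in the table and defaults to zero, the unit of the time trend is unknown, the input series are made-up numbers, and the function names are hypothetical.

```python
from typing import List, Optional, Sequence

# Coefficient estimates copied from the table above.
BETA_UNEMPLOYMENT = 4.000e-4   # unemployment rate, lag 4 months
BETA_HPI = -3.118e-5           # HPI, lag 1 month
BETA_TREND = -5.309e-6         # time trend


def lagged(series: Sequence[float], lag: int) -> List[Optional[float]]:
    """Shift a monthly series so position k holds the value from month k - lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n = len(series)
    lag = min(lag, n)
    return [None] * lag + list(series[: n - lag])


def linear_component(
    unemployment: float, hpi: float, trend: float, intercept: float = 0.0
) -> float:
    """Linear combination of the regressors (intercept is an assumption)."""
    return (
        intercept
        + BETA_UNEMPLOYMENT * unemployment
        + BETA_HPI * hpi
        + BETA_TREND * trend
    )


# Made-up monthly unemployment rates, for illustration only.
unemployment = [5.0, 5.1, 5.3, 5.6, 6.0, 6.2]
unemployment_lag4 = lagged(unemployment, 4)
print(unemployment_lag4)
print(linear_component(unemployment=5.0, hpi=200.0, trend=10.0))
```

The first few positions of a lagged series are undefined, so those months would have to be dropped before fitting.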


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Wang, H.; Bellotti, A.; Qu, R.; Bai, R.
Discrete-Time Survival Models with Neural Networks for Age–Period–Cohort Analysis of Credit Risk. *Risks* **2024**, *12*, 31.
https://doi.org/10.3390/risks12020031
