Article

Association of Preterm Birth with Depression and Particulate Matter: Machine Learning Analysis Using National Health Insurance Data

1 AI Center, Korea University Anam Hospital, Seoul 02841, Korea
2 School of Industrial Management Engineering, Korea University, Seoul 02841, Korea
3 Department of Obstetrics & Gynecology, Korea University Anam Hospital, Seoul 02841, Korea
4 Department of Obstetrics & Gynecology, Korea University Ansan Hospital, Ansan 15355, Korea
5 Department of Obstetrics & Gynecology, Korea University Guro Hospital, Seoul 08308, Korea
* Author to whom correspondence should be addressed.
Diagnostics 2021, 11(3), 555; https://doi.org/10.3390/diagnostics11030555
Submission received: 8 January 2021 / Revised: 15 March 2021 / Accepted: 18 March 2021 / Published: 19 March 2021
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

Abstract

This study uses machine learning and population data to analyze major determinants of preterm birth, including depression and particulate matter. Retrospective cohort data came from Korea National Health Insurance Service claims data for 405,586 women who were aged 25–40 years and gave birth for the first time after a singleton pregnancy during 2015–2017. The dependent variable was preterm birth during 2015–2017, and 90 independent variables were included (demographic/socioeconomic information, particulate matter, disease information, medication history, obstetric information). Random forest variable importance was used to identify major determinants of preterm birth. On this measure, the top 40 determinants of preterm birth during 2015–2017 included socioeconomic status, age, proton pump inhibitor, benzodiazepine, tricyclic antidepressant, sleeping pills, progesterone, gastroesophageal reflux disease (GERD) for the years 2002–2014, particulate matter for the months January–December 2014, region, myoma uteri, diabetes for the years 2013–2014 and depression for the years 2011–2014. In conclusion, preterm birth has strong associations with depression and particulate matter. Effective prenatal care requires strong intervention on particulate matter together with active counseling and medication for common depressive symptoms (neglected by pregnant women).

1. Introduction

Preterm birth is a major part of the disease burden for newborns and children worldwide [1,2,3,4]. Every year 15 million babies are born preterm, and preterm birth is a main contributor to global neonatal and childhood mortality, i.e., 1 million deaths among those aged 0–4 years [1,2]. For example, more than one out of every 10 babies was preterm in the United States during 2003–2012, that is, 5,042,982 (12.2%) of 41,206,315 newborns [3]. Cost-effective interventions are expected to prevent three quarters of mortality from preterm birth [4]. A recent review reports that the following maternal variables are important predictors of preterm birth: demographic/socioeconomic determinants (age, below high school graduation, urban region, insurance, marriage, religion), disease information (delivery/pregestational body mass index, predelivery systolic/diastolic blood pressure, upper gastrointestinal tract symptom, gastroesophageal reflux disease, Helicobacter pylori, gestational diabetes mellitus, systemic lupus erythematosus, increased cerebrospinal fluid and reduced cortical folding due to impaired brain growth), medication history (progesterone, calcium channel blocker, hydroxychloroquine sulfate) and obstetric information (parity, twins, infant sex, prior preterm birth, prior cone biopsy, cervical length, myomas and adenomyosis) [5].
Moreover, emerging literature calls for attention to the significant effects of depression and air pollution on preterm birth [6,7,8,9,10,11,12,13,14,15]. Two systematic reviews reported that prenatal or gestational depression is an important risk factor for preterm birth [6,7]. In addition, two systematic reviews [8,9] and several population-based cohort studies [10,11,12,13,14,15] confirmed a positive association between air pollution and preterm birth. These population-based cohort studies covered various areas and periods including the San Joaquin Valley (the United States, 2000–2006) [10], Ohio (the United States, 2007–2010) [11], Utah (the United States, 2002–2010) [12], Ontario (Canada, 2005–2012) [13], Wuhan (China, 2011–2013) [14] and Korea (2010–2013) [15]. However, the number of predictors in the existing literature has been limited to 14, and no study in this line of research has applied machine learning. In this context, this study uses machine learning and population data to analyze major determinants of preterm birth, including depression and particulate matter. This study includes a population-based cohort of 405,586 participants and the most comprehensive set of 90 predictors, covering demographic/socioeconomic determinants, particulate matter, disease information, medication history and obstetric information.

2. Materials and Methods

2.1. Participants

Retrospective cohort data for this study came from Korea National Health Insurance Service claims data for 405,586 women aged 25–40 years who gave birth for the first time after a singleton pregnancy during 2015–2017. South Korea runs a compulsory, universal health insurance service program, and Korea National Health Insurance Service claims data cover most health events of all citizens residing in Korea (for more details, visit https://www.nhis.or.kr/static/html/wbd/g/a/wbdga0401.html, accessed on 15 March 2021). This retrospective study was approved by the Institutional Review Board (IRB) of Korea University Anam Hospital on 5 November 2018 (2018AN0365). Informed consent was waived by the IRB given that data were deidentified.

2.2. Variables

The dependent variable was preterm labor and birth during 2015–2017 (birth between 20 weeks 0 days and 36 weeks 6 days of gestation). Four categories of preterm labor and birth were defined based on ICD-10 codes: (1) PTB 1: preterm birth with premature rupture of membranes (PROM) only; (2) PTB 2: preterm labor and birth without PROM; (3) PTB 3: PTB 1, PTB 2 or both; (4) PTB 4: PTB 3 or other indicated preterm birth (Supplementary Table S1). Each category was coded as “no” vs. “yes”. The following 90 independent variables were included: (1) demographic/socioeconomic determinants in 2014 such as age (years), socioeconomic status measured by an insurance fee with the range of 1 (the highest group) to 20 (the lowest group), and region (city) (no vs. yes); (2) particulate matter (PM10) for each of the months January–December 2014; (3) disease information (no vs. yes) for each of the years 2002–2014, i.e., depression, diabetes, gastroesophageal reflux disease (GERD), hypertension and periodontitis; (4) medication history (no vs. yes) in 2014, i.e., benzodiazepine, calcium channel blocker, nitrate, progesterone, proton pump inhibitor, sleeping pills and tricyclic antidepressant; (5) obstetric information (no vs. yes) in 2014 such as in vitro fertilization, myoma uteri and prior cone. The 65 disease variables were denoted as Depression_2002, …, Depression_2014, Diabetes_2002, …, Diabetes_2014, GERD_2002, …, GERD_2014, Hypertension_2002, …, Hypertension_2014, and Periodontitis_2002, …, Periodontitis_2014. The disease information and the medication history were screened from ICD-10 and ATC codes, respectively (Supplementary Tables S1 and S2). Diabetes was defined as fasting glucose equal to or higher than 126 mg/dL or antidiabetic medication. Likewise, hypertension was defined as systolic/diastolic blood pressure equal to or higher than 140/90 mmHg or antihypertensive medication [16].
Finally, particulate matter was denoted as PM_2014_01 (January 2014), …, PM_2014_12 (December 2014), and its monthly average at the district level was obtained from [17]. Introducing the disease and particulate matter variables in this way (as so-called “distributed lag variables”) is one efficient way to analyze the effects of important independent variables in past periods on the dependent variable in the current period.
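The naming scheme for the lagged variables can be sketched in a few lines. This is a minimal illustration in Python (the study itself reports using R-Studio). The lagged names follow the pattern given in the text, while the labels for the non-lagged variables (for example `Socioeconomic_Status`) are hypothetical and chosen here only for illustration.

```python
# Sketch: enumerate the 90 predictor names described in Section 2.2.
# Lagged names follow the paper's pattern (Depression_2002, PM_2014_01, ...);
# all other labels are hypothetical placeholders, not identifiers from the source.

DISEASES = ["Depression", "Diabetes", "GERD", "Hypertension", "Periodontitis"]
YEARS = range(2002, 2015)   # 2002..2014 inclusive, i.e., 13 yearly lags
MONTHS = range(1, 13)       # January..December 2014

def build_predictor_names():
    demographic = ["Age", "Socioeconomic_Status", "Region_City"]
    particulate = [f"PM_2014_{month:02d}" for month in MONTHS]
    disease = [f"{name}_{year}" for name in DISEASES for year in YEARS]
    medication = [
        "Benzodiazepine", "Calcium_Channel_Blocker", "Nitrate", "Progesterone",
        "Proton_Pump_Inhibitor", "Sleeping_Pills", "Tricyclic_Antidepressant",
    ]
    obstetric = ["In_Vitro_Fertilization", "Myoma_Uteri", "Prior_Cone"]
    return demographic + particulate + disease + medication + obstetric

names = build_predictor_names()
disease_names = [n for n in names if n.split("_")[0] in DISEASES]
print(len(disease_names), len(names))  # 65 90
```

Under this reading, 3 demographic/socioeconomic, 12 particulate matter, 65 disease, 7 medication and 3 obstetric variables add up to the 90 predictors stated in the text.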

2.3. Analysis

Logistic regression, the random forest and the artificial neural network were applied and compared for the prediction of preterm birth [18]. Data on 402,092 observations with full information were divided into training and validation sets with a 70:30 ratio (281,464 vs. 120,628 observations). Accuracy, the share of correct predictions among the 120,628 validation observations, was used as the criterion for validating the trained models. Random forest variable importance, which measures the contribution of a variable to the performance of the model, was used to identify major determinants of preterm birth and to test its associations with depression, particulate matter and other predictors. R-Studio 1.3.959 (R-Studio Inc., Boston, MA, USA) was employed for the analysis during 1 August 2020–31 December 2020.
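The 70:30 split and the accuracy criterion can be illustrated as follows. This is a minimal sketch in Python, not the authors' R code. The random shuffle and the seed are assumptions, since the text does not say how observations were assigned to the two sets.

```python
import random

def split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle row indices 0..n-1 and split them into training and validation lists.

    The shuffle and the seed are assumptions made for this sketch.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(n * train_fraction)
    return indices[:n_train], indices[n_train:]

def accuracy(y_true, y_pred):
    """Share of predictions that match the observed labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

train_idx, valid_idx = split_indices(402_092)
print(len(train_idx), len(valid_idx))        # 281464 120628
print(accuracy([0, 0, 1, 0], [0, 0, 0, 0]))  # 0.75
```

The split sizes reproduce the 281,464 and 120,628 observations reported above.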

3. Results

Descriptive statistics for participants’ preterm birth and its determinants are shown in Table 1. Among 405,586 participants, 21,732 (5.40%), 8927 (2.22%), 27,752 (6.90%) and 28,845 (7.17%) belonged to PTB 1, 2, 3 and 4, respectively. The median age and socioeconomic status of the participants were 29 and 12, respectively. Among the participants, 126,008 (31.34%) and 63,066 (15.68%) had proton pump inhibitor and tricyclic antidepressant medications in 2014, respectively. The share of those with depression grew steadily from 0.18% in 2002 to 1.36% in 2014. The monthly averages of PM10 in Korea’s seven metropolitan areas for the year 2014 were 56 (January), 50 (February), 52 (March), 52 (April), 64 (May), 44 (June), 39 (July), 30 (August), 33 (September), 35 (October), 44 (November) and 42 (December), in units of 10⁻⁶ g/m³. In terms of accuracy, the random forest was similar to logistic regression and the artificial neural network (94.50%, 97.66%, 93.08% and 92.83% for PTB 1, PTB 2, PTB 3 and PTB 4 in Table 2, respectively). The results of undersampling are shown in Table 3. Undersampling is an approach that matches the sizes of two groups (participants with and without preterm birth) so that the training of machine learning can be balanced between the two groups. Undersampling led to a slight improvement in the performance (the area under the receiver-operating-characteristic curve) of the random forest, e.g., from 0.5585 to 0.5803 in the case of PTB 2.
Based on random forest variable importance, the top 40 determinants of preterm birth during 2015–2017 included socioeconomic status, age, proton pump inhibitor, benzodiazepine, tricyclic antidepressant, sleeping pills, progesterone, GERD for the years 2002–2014, particulate matter for the months January–December 2014, region, myoma uteri, diabetes for the years 2013–2014 and depression for the years 2011–2014. These values were the averages for PTB 1, PTB 2, PTB 3 and PTB 4 (see Supplementary Figure S1 for each of PTB 1, PTB 2, PTB 3 and PTB 4). The importance rankings of particulate matter were particularly high for PTB 2: PM_2014_08 (5th), PM_2014_12 (6th), PM_2014_02 (7th), PM_2014_11 (8th), PM_2014_09 (10th), PM_2014_06 (11th), PM_2014_10 (12th), PM_2014_01 (13th), PM_2014_07 (14th), PM_2014_05 (15th), PM_2014_03 (17th), PM_2014_04 (18th). These findings were similar to those of undersampling in Supplementary Figure S2. The results of logistic regression (Table 4 and Table 5) provide useful information about the sign and magnitude of the effect of a major determinant on preterm birth. For example, in Table 4 the odds of PTB 4 increase by 12.6% if socioeconomic status decreases by 10 levels (i.e., the insurance-fee group number rises by 10), e.g., from 2 to 12 (median). The odds of PTB 4 increase by 24.1% if particulate matter in August 2014 (PM_2014_08) increases by 1 × 10⁻⁶ g/m³. In a similar vein, the odds of PTB 4 are greater by 12.2% for those with depression in 2010 than for those without it.
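The undersampling idea described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the majority group is reduced by simple random sampling with a fixed seed, and the labels are toy values. The text does not specify the sampling procedure or whether it was applied only to the training set.

```python
import random

def undersample(labels, seed=0):
    """Return sorted row indices with the larger class cut down to the smaller class size.

    Random sampling without replacement is an assumption of this sketch.
    """
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = sorted((positives, negatives), key=len)
    kept_majority = rng.sample(majority, len(minority))
    return sorted(minority + kept_majority)

# Toy labels (illustrative only): 5 preterm births among 100 deliveries.
labels = [1] * 5 + [0] * 95
balanced = undersample(labels)
print(len(balanced), sum(labels[i] for i in balanced))  # 10 5
```

After undersampling, both groups contribute the same number of rows, so a model can no longer score well simply by predicting the larger group.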
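The percentage statements above can be reproduced from the Table 4 entries if those entries are read as odds ratios, which is an assumption of this sketch (the table caption calls them coefficients). The helper name `percent_change_in_odds` is hypothetical.

```python
def percent_change_in_odds(odds_ratio, units=1):
    """Percent change in the odds for a change of `units` steps.

    Odds ratios compound multiplicatively, so a k-step change
    multiplies the odds by odds_ratio ** k.
    """
    return (odds_ratio ** units - 1) * 100

# PTB 4 column of Table 4, read as odds ratios (assumption).
print(round(percent_change_in_odds(1.2412), 1))      # PM_2014_08: 24.1
print(round(percent_change_in_odds(1.1220), 1))      # Depression_2010: 12.2
print(round(percent_change_in_odds(1.0126), 2))      # Socioeconomic status, one level: 1.26
print(round(percent_change_in_odds(1.0126, 10), 1))  # Ten levels, compounded: 13.3
```

The 12.6% quoted for a 10-level change in socioeconomic status matches the linear approximation 10 × 1.26%. Compounding the same odds ratio over ten levels gives about 13.3%. The text does not say which calculation was intended.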

4. Discussion

4.1. Findings of This Study

Based on random forest variable importance, top-40 determinants of preterm birth during 2015–2017 included socioeconomic status, age, proton pump inhibitor, benzodiazepine, tricyclic antidepressant, sleeping pills, progesterone, GERD for the years 2002–2014, particulate matter for the months January–December 2014, region, myoma uteri, diabetes for the years 2013–2014 and depression for the years 2011–2014.
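One common way to define variable importance is the drop in accuracy after a variable's values are randomly permuted. The sketch below illustrates that idea in Python on toy data with a stand-in model. It is not the authors' procedure: the text does not state which importance measure was computed, and all names and data here are hypothetical.

```python
import random

def accuracy(model, rows, labels):
    """Share of rows for which the model's prediction equals the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)

def permutation_importance(model, rows, labels, column, seed=0):
    """Accuracy drop after shuffling one column, which breaks its link to the label."""
    rng = random.Random(seed)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return accuracy(model, rows, labels) - accuracy(model, permuted, labels)

# Toy data: the label depends on column 0 only; column 1 is pure noise.
rng = random.Random(1)
rows = [[rng.random(), rng.random()] for _ in range(1000)]
labels = [int(r[0] > 0.5) for r in rows]

def model(row):
    """Stand-in for a fitted classifier that uses column 0 only."""
    return int(row[0] > 0.5)

imp0 = permutation_importance(model, rows, labels, column=0)
imp1 = permutation_importance(model, rows, labels, column=1)
print(imp0 > imp1, imp1)  # True 0.0
```

Ranking variables by such a score and averaging the scores over PTB 1 to PTB 4 would give a list comparable in form to the top-40 list above.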

4.2. Summary of Existing Literature

A recent systematic review reported a positive association between gestational depression and spontaneous preterm labor and birth [6]. This review selected 39 cohort studies with 134,488 participants in total, published in English during 1980–2003. Most of these studies came from high-income countries such as the United States (27), Denmark (2), France (2), Sweden (2), Canada (1), Norway (1) and the United Kingdom (1). A subsequent systematic review reported that prenatal depression is an important risk factor for preterm birth [7]. This review selected 64 observational studies published in English between 2007 and 2017; 49 (77%) and 15 (23%) of these studies were done in middle-income and low-income countries, respectively. Likewise, two systematic reviews [8,9] reported a positive relationship between air pollution and preterm birth. These reviews selected 15 articles during 1966–2009 and 14 articles during 1995–2012, respectively. These 27 observational or cohort studies were characterized by varying numbers of participants (3853–3,545,777) and diverse origins, i.e., Australia (1), China (2), Canada (4), the Czech Republic (1), Korea (2), Spain (1), the United Kingdom (2) and the United States (14). Their odds-ratio range was 1.05–1.15 regarding PM2.5. Several additional population-based cohort studies [10,11,12,13,14,15] on a positive association between air pollution and preterm birth are also worth reviewing. These studies employed 50,005–1,742,183 participants, covering various areas and periods including the San Joaquin Valley (the United States, 2000–2006) [10], Ohio (the United States, 2007–2010) [11], Utah (the United States, 2002–2010) [12], Ontario (Canada, 2005–2012) [13], Wuhan (China, 2011–2013) [14] and Korea (2010–2013) [15]. Their odds-ratio ranges were 1.01–1.57 and 1.04–1.19 regarding PM10 and PM2.5, respectively. However, the number of predictors in the existing literature above has been limited to 14.
Moreover, no study in this line of research has applied machine learning.

4.3. Contributions of This Study

This study presents the most comprehensive analysis of the determinants of preterm birth, using a population-based cohort of 405,586 participants and the richest collection of 90 predictors, covering demographic/socioeconomic determinants, particulate matter, disease information, medication history and obstetric information. Firstly, this study confirms that depression and particulate matter are major predictors of preterm birth (they were among the top 40 determinants of preterm birth in this study). Several researchers focus on behavioral, infectious, neuroendocrine and neuroinflammatory mechanisms between depression and preterm birth [19]. Other researchers have developed a hypothesis that air pollution causes systemic inflammation, which in turn leads to preterm birth [20]. Little research has been undertaken, and more investigation is needed to explore and evaluate various pathways among depression, particulate matter and preterm birth. The findings of this study suggest that effective prenatal care requires strong intervention on particulate matter together with active counseling and medication for common depressive symptoms (neglected by pregnant women). Secondly, the results of this study agree with those of a previous study with 731 participants on gastroesophageal reflux disease, medication history and preterm birth [18]: the findings of this previous study highlighted the significance of age, socioeconomic status (below high school graduation), progesterone medication history, gastroesophageal reflux disease, region (city) and gestational diabetes mellitus. Above all, to the best of our knowledge, this study is the first attempt to use machine learning and population data to find the main predictors of preterm birth and evaluate its association with depression and particulate matter.
This study will be a good starting point for finding the main predictors of preterm birth and drawing effective implications for its prevention and management.

4.4. Limitations of This Study

Firstly, this study did not examine possible mediating effects among variables. Secondly, this study adopted a binary category of preterm birth as no vs. yes (birth between 20 weeks 0 days and 36 weeks 6 days of gestation). However, preterm birth can have multiple categories, and it will be a good topic for future study to compare different predictors for various categories of preterm birth, e.g., extremely preterm (less than 28 (or 24) weeks), very preterm (28–32 (or 24–32) weeks), moderate to late preterm (32–37 weeks) [2]. Thirdly, the four categories of preterm birth were defined based on ICD-10 codes, and this could be a source of potential bias. Fourthly, it was beyond the scope of this study to explore and evaluate various pathways among depression, particulate matter and preterm birth. Little research has been undertaken and more investigations are needed on this topic. Fifthly, combining various deep learning approaches with various kinds of preterm birth data would bring new innovations and deeper insights to this line of research. Finally, further investigations of single vs. multiple gestation would deliver more insights and more detailed clinical implications.

4.5. Conclusions

Preterm birth has strong associations with depression and particulate matter. Effective prenatal care requires strong intervention on particulate matter together with active counseling and medication for common depressive symptoms (neglected by pregnant women).

Supplementary Materials

The following are available online at https://www.mdpi.com/2075-4418/11/3/555/s1, Figure S1: Random forest variable importance, Figure S2: Random forest variable importance—undersampling, Table S1: ICD-10 code for preterm birth, depression, gastroesophageal reflux disease and periodontitis, Table S2: ATC code for medication.

Author Contributions

Conceptualization, K.-S.L., H.Y.K., G.J.C., S.C.H., M.J.O., H.J.K. and K.H.A.; methodology, K.-S.L., H.-I.K. and K.H.A.; software, K.-S.L., H.-I.K. and K.H.A.; validation, K.-S.L., H.-I.K. and K.H.A.; formal analysis, K.-S.L., H.-I.K. and K.H.A.; investigation, K.-S.L., H.-I.K. and K.H.A.; resources, K.-S.L., H.-I.K. and K.H.A.; data curation, K.-S.L., H.-I.K. and K.H.A.; writing—original draft preparation, K.-S.L., H.Y.K., G.J.C., S.C.H., M.J.O., H.J.K. and K.H.A.; writing—review and editing, K.-S.L., H.Y.K., G.J.C., S.C.H., M.J.O., H.J.K. and K.H.A.; visualization, K.-S.L., H.-I.K. and K.H.A.; supervision, K.-S.L. and K.H.A.; project administration, K.-S.L. and K.H.A.; funding acquisition, K.-S.L. and K.H.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Korea University Medical Center (No. K1925051) and the Ministry of Science and ICT of South Korea under the Information Technology Research Center support program supervised by the IITP (Institute for Information and Communications Technology Planning & Evaluation) (No. IITP-2018-0-01405).

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board (IRB) of Korea University Anam Hospital on 5 November 2018 (2018AN0365).

Informed Consent Statement

Informed consent was waived by the IRB given that data were deidentified.

Data Availability Statement

The data presented in this study are not publicly available. However, the data are available from the corresponding author upon reasonable request and with the permission of the Korea National Health Insurance Service.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Liu, L.; Oza, S.; Hogan, D.; Chu, Y.; Perin, J.; Zhu, J.; Lawn, J.E.; Cousens, S.; Mathers, C.; Black, R.E. Global, regional, and national causes of under-5 mortality in 2000–15: An updated systematic analysis with implications for the Sustainable Development Goals. Lancet 2016, 388, 3027–3035.
  2. World Health Organization. News: Preterm Birth. Available online: http://www.who.int/news-room/fact-sheets/detail/preterm-birth (accessed on 1 December 2020).
  3. Magro Malosso, E.R.; Saccone, G.; Simonetti, B.; Squillante, M.; Berghella, V. US trends in abortion and preterm birth. J. Matern. Fetal Neonatal Med. 2018, 31, 2463–2467.
  4. Harrison, M.S.; Goldenberg, R.L. Global burden of prematurity. Semin. Fetal Neonatal Med. 2016, 21, 74–79.
  5. Lee, K.S.; Ahn, K.H. Application of artificial intelligence in early diagnosis of spontaneous preterm labor and birth. Diagnostics 2020, 10, 733.
  6. Staneva, A.; Bogossian, F.; Pritchard, M.; Wittkowski, A. The effects of maternal depression, anxiety, and perceived stress during pregnancy on preterm birth: A systematic review. Women Birth 2015, 28, 179–193.
  7. Fekadu Dadi, A.; Miller, E.R.; Mwanri, L. Antenatal depression and its association with adverse birth outcomes in low and middle-income countries: A systematic review and meta-analysis. PLoS ONE 2020, 15, e0227323.
  8. Bosetti, C.; Nieuwenhuijsen, M.J.; Gallus, S.; Cipriani, S.; La Vecchia, C.; Parazzini, F. Ambient particulate matter and preterm birth or birth weight: A review of the literature. Arch. Toxicol. 2010, 84, 447–460.
  9. Li, X.; Huang, S.; Jiao, A.; Yang, X.; Yun, J.; Wang, Y.; Xue, X.; Chu, Y.; Liu, F.; Liu, Y.; et al. Association between ambient fine particulate matter and preterm birth or term low birth weight: An updated systematic review and meta-analysis. Environ. Pollut. 2017, 227, 596–605.
  10. Weber, K.A.; Yang, W.; Lurmann, F.; Hammond, S.K.; Shaw, G.M.; Padula, A.M. Air pollution, maternal hypertensive disorders, and preterm birth. Environ. Epidemiol. 2019, 3, e062.
  11. DeFranco, E.; Moravec, W.; Xu, F.; Hall, E.; Hossain, M.; Haynes, E.N.; Muglia, L.; Chen, A. Exposure to airborne particulate matter during pregnancy is associated with preterm birth: A population-based cohort study. Environ. Health 2016, 15, 6.
  12. Mendola, P.; Nobles, C.; Williams, A.; Sherman, S.; Kanner, J.; Seeni, I.; Grantz, K. Air pollution and preterm birth: Do air pollution changes over time influence risk in consecutive pregnancies among low-risk women? Int. J. Environ. Res. Public Health 2019, 16, 3365.
  13. Lavigne, E.; Yasseen, A.S.; Stieb, D.M.; Hystad, P.; van Donkelaar, A.; Martin, R.V.; Brook, J.R.; Crouse, D.L.; Burnett, R.T.; Chen, H.; et al. Ambient air pollution and adverse birth outcomes: Differences by maternal comorbidities. Environ. Res. 2016, 148, 457–466.
  14. Qian, Z.; Liang, S.; Yang, S.; Trevathan, E.; Huang, Z.; Yang, R.; Wang, J.; Hu, K.; Zhang, Y.; Vaughn, M.; et al. Ambient air pollution and preterm birth: A prospective birth cohort study in Wuhan, China. Int. J. Hyg. Environ. Health 2016, 219, 195–203.
  15. Kim, Y.J.; Song, I.G.; Kim, K.N.; Kim, M.S.; Chung, S.H.; Choi, Y.S.; Bae, C.W. Maternal exposure to particulate matter during pregnancy and adverse birth outcomes in the Republic of Korea. Int. J. Environ. Res. Public Health 2019, 16, 633.
  16. Brown, M.A.; Magee, L.A.; Kenny, L.C.; Karumanchi, S.A.; McCarthy, F.P.; Saito, S.; Hall, D.R.; Warren, C.E.; Adoyi, G.; Ishaku, S.; et al. Hypertensive disorders of pregnancy: ISSHP classification, diagnosis, and management recommendations for international practice. Hypertension 2018, 72, 24–43.
  17. Air Korea. Air Quality Information. Available online: https://www.airkorea.or.kr/index (accessed on 1 December 2020).
  18. Lee, K.S.; Song, I.S.; Kim, E.S.; Ahn, K.H. Determinants of spontaneous preterm labor and birth including gastroesophageal reflux disease and periodontitis. J. Korean Med. Sci. 2020, 35, e105.
  19. Shapiro, G.D.; Fraser, W.D.; Frasch, M.G.; Séguin, J.R. Psychosocial stress in pregnancy and preterm birth: Associations and mechanisms. J. Perinat. Med. 2013, 41, 631–645.
  20. Vadillo-Ortega, F.; Osornio-Vargas, A.; Buxton, M.A.; Sánchez, B.N.; Rojas-Bracho, L.; Viveros-Alcaráz, M.; Castillo-Castrejón, M.; Beltrán-Montoya, J.; Brown, D.G.; O’Neill, M.S. Air pollution, inflammation and preterm birth: A potential mechanistic link. Med. Hypotheses 2014, 82, 219–224.

Table 1. Descriptive statistics on preterm birth and its determinants.

| Variable | No | Yes | Yes (%) |
|---|---|---|---|
| PTB 1 a | 380,360 | 21,732 | 5.40 |
| PTB 2 | 393,165 | 8927 | 2.22 |
| PTB 3 | 374,340 | 27,752 | 6.90 |
| PTB 4 | 373,247 | 28,845 | 7.17 |
| Benzodiazepine | 165,773 | 236,319 | 58.77 |
| Calcium Channel Blocker | 398,352 | 3740 | 0.93 |
| Diabetes_2002 | 401,226 | 866 | 0.22 |
| Diabetes_2003 | 401,079 | 1013 | 0.25 |
| Diabetes_2004 | 400,833 | 1259 | 0.31 |
| Diabetes_2005 | 400,306 | 1786 | 0.44 |
| Diabetes_2006 | 400,348 | 1744 | 0.43 |
| Diabetes_2007 | 400,302 | 1790 | 0.45 |
| Diabetes_2008 | 400,211 | 1881 | 0.47 |
| Diabetes_2009 | 400,062 | 2030 | 0.50 |
| Diabetes_2010 | 399,833 | 2259 | 0.56 |
| Diabetes_2011 | 399,491 | 2601 | 0.65 |
| Diabetes_2012 | 399,027 | 3065 | 0.76 |
| Diabetes_2013 | 398,048 | 4044 | 1.01 |
| Diabetes_2014 | 395,699 | 6393 | 1.59 |
| Depression_2002 | 400,551 | 727 | 0.18 |
| Depression_2003 | 400,328 | 950 | 0.24 |
| Depression_2004 | 400,068 | 1210 | 0.30 |
| Depression_2005 | 399,467 | 1811 | 0.45 |
| Depression_2006 | 399,112 | 2166 | 0.54 |
| Depression_2007 | 398,494 | 2784 | 0.69 |
| Depression_2008 | 398,277 | 3001 | 0.75 |
| Depression_2009 | 397,877 | 3401 | 0.85 |
| Depression_2010 | 397,422 | 3856 | 0.96 |
| Depression_2011 | 396,951 | 4327 | 1.08 |
| Depression_2012 | 395,929 | 5349 | 1.33 |
| Depression_2013 | 395,971 | 5307 | 1.32 |
| Depression_2014 | 395,837 | 5441 | 1.36 |
| GERD_2002 b | 399,076 | 3016 | 0.75 |
| GERD_2003 | 398,129 | 3963 | 0.99 |
| GERD_2004 | 396,932 | 5160 | 1.28 |
| GERD_2005 | 395,351 | 6741 | 1.68 |
| GERD_2006 | 393,244 | 8848 | 2.20 |
| GERD_2007 | 389,177 | 12,915 | 3.21 |
| GERD_2008 | 386,219 | 15,873 | 3.95 |
| GERD_2009 | 380,452 | 21,640 | 5.38 |
| GERD_2010 | 376,619 | 25,473 | 6.34 |
| GERD_2011 | 372,819 | 29,273 | 7.28 |
| GERD_2012 | 368,833 | 33,259 | 8.27 |
| GERD_2013 | 367,240 | 34,852 | 8.67 |
| GERD_2014 | 363,411 | 38,681 | 9.62 |
| Hypertension_2002 | 401,492 | 600 | 0.15 |
| Hypertension_2003 | 401,464 | 628 | 0.16 |
| Hypertension_2004 | 401,360 | 732 | 0.18 |
| Hypertension_2005 | 401,196 | 896 | 0.22 |
| Hypertension_2006 | 401,088 | 1004 | 0.25 |
| Hypertension_2007 | 400,968 | 1124 | 0.28 |
| Hypertension_2008 | 400,844 | 1248 | 0.31 |
| Hypertension_2009 | 400,718 | 1374 | 0.34 |
| Hypertension_2010 | 400,714 | 1378 | 0.34 |
| Hypertension_2011 | 400,738 | 1354 | 0.34 |
| Hypertension_2012 | 400,406 | 1686 | 0.42 |
| Hypertension_2013 | 400,187 | 1905 | 0.47 |
| Hypertension_2014 | 399,850 | 2242 | 0.56 |
| In Vitro Fertilization | 401,965 | 127 | 0.03 |
| Myoma Uteri | 385,015 | 17,077 | 4.25 |
| Nitrate | 400,776 | 1316 | 0.33 |
| Periodontitis_2002 | 401,895 | 197 | 0.05 |
| Periodontitis_2003 | 401,830 | 262 | 0.07 |
| Periodontitis_2004 | 401,688 | 404 | 0.10 |
| Periodontitis_2005 | 401,665 | 427 | 0.11 |
| Periodontitis_2006 | 401,502 | 590 | 0.15 |
| Periodontitis_2007 | 401,783 | 309 | 0.08 |
| Periodontitis_2008 | 401,795 | 297 | 0.07 |
| Periodontitis_2009 | 401,742 | 350 | 0.09 |
| Periodontitis_2010 | 401,753 | 339 | 0.08 |
| Periodontitis_2011 | 401,797 | 295 | 0.07 |
| Periodontitis_2012 | 401,837 | 255 | 0.06 |
| Periodontitis_2013 | 401,824 | 268 | 0.07 |
| Periodontitis_2014 | 401,854 | 238 | 0.06 |
| Prior Cone | 401,911 | 181 | 0.05 |
| Progesterone | 307,684 | 94,408 | 23.48 |
| Proton Pump Inhibitor | 276,084 | 126,008 | 31.34 |
| Region (City) | 28,615 | 373,477 | 92.88 |
| Sleeping Pills | 370,303 | 31,789 | 7.91 |
| Tricyclic Antidepressant | 339,026 | 63,066 | 15.68 |

a PTB, preterm birth during 2015–2017; b GERD, gastroesophageal reflux disease.

Table 2. Model performance.

| Model | Accuracy: PTB 1 b | Accuracy: PTB 2 | Accuracy: PTB 3 | Accuracy: PTB 4 | AUC a: PTB 1 | AUC a: PTB 2 | AUC a: PTB 3 | AUC a: PTB 4 |
|---|---|---|---|---|---|---|---|---|
| Logistic Regression | 0.9450 | 0.9766 | 0.9308 | 0.9283 | 0.5536 | 0.5916 | 0.5599 | 0.5610 |
| Artificial Neural Network | 0.9450 | 0.9766 | 0.9308 | 0.9283 | 0.5000 | 0.5000 | 0.5000 | 0.5000 |
| Random Forest | 0.9450 | 0.9766 | 0.9308 | 0.9283 | 0.5275 | 0.5585 | 0.5407 | 0.5407 |

a Area under the receiver-operating-characteristic curve; b PTB, preterm birth during 2015–2017.
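
For readers less familiar with the AUC column, the measure can be computed from ranks: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below is a minimal Python illustration on toy scores, not the software routine used in the study. It also shows that a model giving every case the same score obtains exactly 0.5, which is one way a value of 0.5000 can arise; the source does not state why this value appears for the artificial neural network.

```python
def auc(scores, labels):
    """Rank-based AUC: share of positive/negative pairs where the positive
    case has the higher score, counting ties as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (illustrative only): three negatives and two positives.
labels = [0, 0, 0, 1, 1]
print(auc([0.1, 0.2, 0.3, 0.8, 0.9], labels))  # 1.0 (perfect ranking)
print(auc([0.5, 0.5, 0.5, 0.5, 0.5], labels))  # 0.5 (constant score)
```

Because the AUC depends only on how cases are ranked, it can differ sharply from accuracy when one outcome group is much smaller than the other.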

Table 3. Model performance with undersampling.

| Model | Accuracy: PTB 1 b | Accuracy: PTB 2 | Accuracy: PTB 3 | Accuracy: PTB 4 | AUC a: PTB 1 | AUC a: PTB 2 | AUC a: PTB 3 | AUC a: PTB 4 |
|---|---|---|---|---|---|---|---|---|
| Logistic Regression | 0.9448 | 0.9691 | 0.9302 | 0.9276 | 0.5550 | 0.5872 | 0.5567 | 0.5621 |
| Artificial Neural Network | 0.9450 | 0.9766 | 0.9308 | 0.9283 | 0.5000 | 0.5000 | 0.5000 | 0.5000 |
| Random Forest | 0.9399 | 0.9550 | 0.9251 | 0.9218 | 0.5535 | 0.5803 | 0.5517 | 0.5601 |

a Area under the receiver-operating-characteristic curve; b PTB, preterm birth during 2015–2017.

Table 4. Coefficients of determinants from logistic regression for each type of preterm birth.

| Determinant | PTB 1 a | PTB 2 | PTB 3 | PTB 4 |
|---|---|---|---|---|
| Age | ** 1.0000 | ** 1.0000 | 1.0000 | ** 1.0000 |
| Benzodiazepine | ** 1.0004 | ** 1.6725 | ** 1.0034 | ** 1.0017 |
| Calcium Channel Blocker | * 1.6383 | 1.5038 | * 1.1644 | 1.0681 |
| Diabetes_2002 | 1.8353 | 2.5656 | 2.3002 | 1.5692 |
| Diabetes_2003 | 1.9481 | 2.0809 | 1.2222 | 1.1303 |
| Diabetes_2004 | 2.4480 | 1.6662 | 1.9147 | 1.7558 |
| Diabetes_2005 | 1.0359 | ** 1.5251 | 1.0404 | ** 1.683 |
| Diabetes_2006 | 1.5526 | 1.5065 | 1.2636 | 2.0387 |
| Diabetes_2007 | 1.7720 | 1.1203 | 2.4172 | 1.3727 |
| Diabetes_2008 | 1.9223 | 2.3487 | 1.1073 | 2.5957 |
| Diabetes_2009 | 1.3161 | 2.0164 | 1.0789 | * 1.2377 |
| Diabetes_2010 | 1.8100 | 1.2418 | 1.7370 | 1.4228 |
| Diabetes_2011 | 1.2535 | 1.2649 | 2.3813 | 1.4582 |
| Diabetes_2012 | 1.5008 | 1.8574 | 1.4972 | 1.1276 |
| Diabetes_2013 | 2.5368 | 1.5377 | 1.7506 | 2.0177 |
| Diabetes_2014 | ** 1.0077 | ** 1.0000 | 1.0000 | ** 1.0000 |
| Depression_2002 | 1.5841 | 1.0117 | ** 2.1808 | 2.2863 |
| Depression_2003 | 1.0729 | * 1.1167 | 2.0087 | 1.8181 |
| Depression_2004 | 1.8705 | 1.0978 | * 1.8549 | 2.0803 |
| Depression_2005 | 1.1511 | 1.3906 | 2.4433 | 1.5790 |
| Depression_2006 | 1.0631 | * 1.8224 | 1.1180 | 1.4364 |
| Depression_2007 | 1.0956 | * 1.7598 | 1.3947 | 1.8300 |
| Depression_2008 | 2.4007 | 1.3747 | 1.5346 | 2.4277 |
| Depression_2009 | 2.1026 | 2.1431 | 2.5796 | 2.0482 |
| Depression_2010 | 1.0200 | ** 2.6590 | 1.0441 | ** 1.1220 |
| Depression_2011 | 2.4402 | 2.0030 | 1.9169 | 2.1260 |
| Depression_2012 | 1.7774 | 2.6371 | 1.9263 | 1.3044 |
| Depression_2013 | ** 1.6356 | 2.0957 | 1.0422 | 1.0106 |
| Depression_2014 | 1.2603 | 1.6195 | ** 1.4356 | 1.2745 |
| GERD_2002 | 1.3563 | 1.9083 | 1.2087 | 1.1254 |
| GERD_2003 | 1.1052 | 2.0605 | 1.0871 | * 1.1697 |
| GERD_2004 | 2.2782 | 1.5084 | 1.2115 | 1.2032 |
| GERD_2005 | ** 1.4589 | 1.1591 | 1.0435 | ** 1.0084 |
| GERD_2006 | 2.3188 | 1.7606 | 2.4122 | 1.6032 |
| GERD_2007 | 1.1257 | 1.9426 | 1.5393 | 1.3863 |
| GERD_2008 | 2.0329 | 2.0811 | 1.2242 | 1.2627 |
| GERD_2009 | 1.0689 | * 1.1254 | 1.5055 | 1.9012 |
| GERD_2010 | 1.1983 | 1.5844 | 1.8026 | 2.5059 |
| GERD_2011 | ** 1.1285 | 2.4433 | 1.1078 | 1.0503 |
| GERD_2012 | 1.2868 | 1.0417 | ** 1.311 | 1.1167 |
| GERD_2013 | 2.1698 | 1.7186 | 2.5039 | 2.2581 |
| GERD_2014 | 2.0394 | 1.4160 | 1.3279 | 1.5451 |
| Hypertension_2002 | 1.0275 | ** 1.5738 | 1.0272 | ** 1.1747 |
| Hypertension_2003 | 1.1702 | 1.5249 | 2.4853 | 2.5978 |
| Hypertension_2004 | 1.0521 | * 2.4381 | 1.7597 | 1.1924 |
| Hypertension_2005 | 1.7869 | 1.5763 | 1.503 | 2.2755 |
| Hypertension_2006 | ** 1.0203 | ** 2.1423 | 1.0638 | * 1.0457 |
| Hypertension_2007 | 2.4269 | 1.201 | 2.2096 | 2.3402 |
| Hypertension_2008 | 1.3228 | 1.2716 | 1.2146 | 1.2731 |
| Hypertension_2009 | 1.2009 | 1.6697 | 2.1321 | 1.6487 |
| Hypertension_2010 | 1.0225 | ** 1.0764 | * 1.0271 | ** 1.0418 |
| Hypertension_2011 | 1.9046 | 1.2264 | 1.0831 | * 2.3515 |
| Hypertension_2012 | 1.2585 | 2.5826 | 2.4341 | 2.2499 |
| Hypertension_2013 | 2.4772 | 1.0218 | ** 1.7588 | 1.8997 |
| Hypertension_2014 | ** 1.3136 | 1.0142 | ** 1.0008 | ** 1.0227 |
| In Vitro Fertilization | ** 1.3427 | 1.0005 | ** 1.0120 | ** 1.0002 |
| Myoma Uteri | ** 1.0000 | ** 1.0000 | ** 1.0000 | ** 1.0000 |
| Nitrate | 1.9893 | 1.7917 | 1.9809 | 1.4776 |
| Periodontitis_2002 | 1.9718 | 2.0034 | 1.3465 | 2.0526 |
| Periodontitis_2003 | 1.4198 | 1.7779 | 1.4892 | 1.9032 |
| Periodontitis_2004 | ** 1.1103 | 1.8267 | 1.2493 | 1.0187 |
| Periodontitis_2005 | 1.2443 | 1.2035 | 1.6319 | 2.5775 |
| Periodontitis_2006 | ** 1.3461 | 1.0005 | ** 1.4314 | 1.0181 |
| Periodontitis_2007 | 1.3134 | 1.7419 | 1.6292 | 1.7522 |
| Periodontitis_2008 | 1.8989 | 2.0237 | 2.1047 | 1.7572 |
| Periodontitis_2009 | 1.3065 | 1.6335 | 2.2664 | 2.4738 |
| Periodontitis_2010 | 2.0266 | 2.3908 | 1.0987 | * 2.1828 |
| Periodontitis_2011 | 1.3008 | 1.7355 | 1.6395 | 2.6100 |
| Periodontitis_2012 | 2.4308 | 1.3264 | 1.1636 | 1.7730 |
| Periodontitis_2013 | 1.1598 | 2.5870 | 1.2379 | 1.2664 |
| Periodontitis_2014 | 1.3858 | 1.5461 | 1.5541 | 1.2876 |
| PM_2014_01 | ** 1.0000 | ** 1.0639 | ** 1.0001 | ** 1.0002 |
| PM_2014_02 | ** 1.0000 | ** 1.0483 | ** 1.0000 | ** 1.0000 |
| PM_2014_03 | ** 1.0214 | ** 1.0028 | ** 1.0000 | ** 1.0003 |
| PM_2014_04 | 1.9584 | 1.0355 | ** 1.0887 | * 1.1566 |
| PM_2014_05 | ** 1.0032 | ** 1.0006 | ** 1.0005 | ** 1.0004 |
| PM_2014_06 | ** 1.0103 | ** 1.0000 | 1.0008 | ** 1.0000 |
| PM_2014_07 | ** 1.0000 | ** 2.0301 | 1.0000 | ** 1.0000 |
| PM_2014_08 | 2.5322 | 1.2738 | 1.1038 | * 1.2412 |
| PM_2014_09 | ** 1.0001 | ** 1.1734 | 1.0059 | ** 1.0001 |
| PM_2014_10 | ** 1.0020 | ** 2.6945 | 1.0041 | ** 1.0007 |
| PM_2014_11 | 1.7164 | 1.4224 | 1.7286 | 1.2650 |
| PM_2014_12 | ** 1.0631 | * 1.5461 | 1.0958 | * 1.0325 |
| Prior Cone | 1.1881 | 1.7899 | 2.5560 | 2.1212 |
| Progesterone | ** 1.0000 | ** 1.0000 | ** 1.0000 | ** 1.0000 |
| Proton Pump Inhibitor | * 1.0314 | ** 1.8051 | 1.1315 | 1.0870 |
| Region (City) | * 1.0000 | ** 1.0068 | ** 1.3983 | 1.0564 |
| Sleeping Pills | 1.1783 | 1.4780 | 1.6950 | 1.9664 |
| Socioeconomic Status | ** 1.6547 | 1.5079 | 1.0856 | * 1.0126 |
| Tricyclic Antidepressant | ** 1.0065 | ** 1.3169 | 1.0613 | * 1.0223 |

a PTB, preterm birth during 2015–2017; * p < 0.10, ** p < 0.05.
Table 5. Coefficients of determinants from logistic regression with undersampling.
| Determinant | PTB 1 a | PTB 2 | PTB 3 | PTB 4 |
| --- | --- | --- | --- | --- |
| Age | ** 1.0000 | ** 1.0000 | 1.0000 | ** 1.0000 |
| Benzodiazepine | ** 1.0317 | 1.5222 | ** 1.0198 | 1.3658 |
| Calcium Channel Blocker | 2.0962 | 1.1379 | 1.6327 | ** 1.0065 |
| Diabetes_2002 | 2.4011 | 1.5789 | 1.4489 | 1.7523 |
| Diabetes_2003 | 1.1577 | ** 1.0046 | 1.6649 | 2.2509 |
| Diabetes_2004 | 1.2701 | 1.4161 | 1.2806 | 2.0063 |
| Diabetes_2005 | ** 1.0306 | 1.6731 | 1.6038 | ** 1.0338 |
| Diabetes_2006 | 1.7435 | 2.5349 | 1.8879 | 1.9066 |
| Diabetes_2007 | 2.0364 | 1.5802 | 1.4317 | 2.2514 |
| Diabetes_2008 | 2.5007 | 1.8168 | 2.1137 | 1.3937 |
| Diabetes_2009 | 2.0164 | ** 1.0272 | 2.3441 | * 1.0568 |
| Diabetes_2010 | * 1.0748 | 2.0294 | * 1.0843 | 1.9013 |
| Diabetes_2011 | 1.4167 | 2.6692 | 1.3705 | 1.6768 |
| Diabetes_2012 | 2.0074 | 1.4708 | * 1.0748 | 1.4180 |
| Diabetes_2013 | 2.6481 | 1.4699 | 1.3525 | 1.5072 |
| Diabetes_2014 | ** 1.0004 | ** 1.0000 | ** 1.0000 | ** 1.0000 |
| Depression_2002 | 1.9737 | 2.2064 | 1.9423 | 2.0489 |
| Depression_2003 | 1.6566 | 1.2687 | 1.506 | 1.2128 |
| Depression_2004 | 1.5461 | 1.4561 | 1.1245 | 2.2313 |
| Depression_2005 | 2.1117 | 2.2828 | 1.3139 | * 1.0695 |
| Depression_2006 | 1.4251 | 1.6441 | 1.6534 | ** 1.0422 |
| Depression_2007 | 1.4740 | 2.0573 | ** 1.0505 | 1.9812 |
| Depression_2008 | 1.3009 | 1.7328 | 2.2586 | ** 1.0480 |
| Depression_2009 | 2.5888 | 2.5528 | 1.3725 | 2.2751 |
| Depression_2010 | 1.2970 | 1.6665 | 1.4457 | * 1.0910 |
| Depression_2011 | 2.5129 | 1.7514 | 1.4860 | 1.5786 |
| Depression_2012 | 1.4842 | 2.0179 | 1.5231 | 1.3482 |
| Depression_2013 | 1.1655 | 1.4877 | ** 1.0437 | ** 1.0402 |
| Depression_2014 | 1.2937 | 1.1992 | 1.8507 | 1.6490 |
| GERD_2002 | 2.7067 | 2.5489 | 1.2727 | 2.2506 |
| GERD_2003 | 2.0535 | * 1.0868 | ** 1.0511 | 1.2279 |
| GERD_2004 | 1.7188 | 1.1785 | 1.6052 | 1.2509 |
| GERD_2005 | 1.4594 | 2.0748 | 1.1597 | ** 1.0127 |
| GERD_2006 | 2.3703 | 2.2252 | 1.7161 | 2.4705 |
| GERD_2007 | 1.7442 | 1.3426 | 2.0336 | 2.5701 |
| GERD_2008 | 1.9237 | 1.7541 | 1.6918 | 1.3901 |
| GERD_2009 | 1.2527 | 1.3282 | 1.5109 | 2.5045 |
| GERD_2010 | 2.0095 | 1.7975 | 1.7073 | 1.2688 |
| GERD_2011 | ** 1.0214 | 1.3559 | ** 1.0440 | 1.4040 |
| GERD_2012 | 2.3677 | 1.4217 | 1.2529 | 1.9972 |
| GERD_2013 | 2.0424 | 1.4997 | 1.5437 | 1.6973 |
| GERD_2014 | 1.2615 | 1.7767 | * 1.0782 | 1.2339 |
| Hypertension_2002 | 1.1105 | 1.5397 | * 1.0819 | * 1.0678 |
| Hypertension_2003 | 1.2966 | 1.2001 | 2.7089 | 1.7429 |
| Hypertension_2004 | ** 1.0265 | 1.3399 | 2.4372 | 1.1225 |
| Hypertension_2005 | 1.8251 | 1.3401 | 1.5551 | 2.3208 |
| Hypertension_2006 | * 1.0748 | 1.9067 | 1.3699 | 1.1140 |
| Hypertension_2007 | 2.3224 | 2.6096 | 2.0325 | 1.1384 |
| Hypertension_2008 | 1.1152 | 1.9666 | 1.3094 | 1.6447 |
| Hypertension_2009 | 2.0016 | 2.6371 | 1.3935 | 1.4603 |
| Hypertension_2010 | 1.5264 | 1.7342 | * 1.0928 | ** 1.0022 |
| Hypertension_2011 | 1.3179 | 2.2948 | 1.3472 | 2.2883 |
| Hypertension_2012 | 1.4354 | 1.2743 | 1.6905 | 1.6833 |
| Hypertension_2013 | 1.7502 | 1.2018 | * 1.068 | 1.3578 |
| Hypertension_2014 | 1.8333 | ** 1.0064 | 1.3089 | 1.7927 |
| In Vitro Fertilization | 1.2642 | 1.1235 | * 1.0986 | 1.7811 |
| Myoma Uteri | ** 1.0000 | ** 1.0000 | ** 1.0000 | ** 1.0000 |
| Nitrate | 2.3581 | * 1.0525 | 2.1427 | 2.2613 |
| Periodontitis_2002 | 2.2901 | 1.2350 | 1.1919 | 1.5518 |
| Periodontitis_2003 | 2.3085 | 1.1780 | 1.7492 | 1.7112 |
| Periodontitis_2004 | * 1.0720 | 1.1799 | 1.4031 | * 1.0831 |
| Periodontitis_2005 | 2.5496 | ** 1.0224 | 1.8663 | 1.3206 |
| Periodontitis_2006 | 1.1120 | * 1.0665 | 1.1650 | 1.2683 |
| Periodontitis_2007 | 2.1555 | 1.3009 | 2.3910 | 2.2307 |
| Periodontitis_2008 | 1.7323 | 2.1782 | 1.2830 | 1.4805 |
| Periodontitis_2009 | * 1.0585 | 1.9502 | 1.2883 | 1.1346 |
| Periodontitis_2010 | 2.2747 | 2.5854 | * 1.0989 | 1.4304 |
| Periodontitis_2011 | 2.1601 | 1.8406 | 1.6396 | 2.6411 |
| Periodontitis_2012 | 2.4818 | 1.6958 | 2.2394 | 1.3557 |
| Periodontitis_2013 | 1.6737 | 1.5204 | 1.8702 | 1.1657 |
| Periodontitis_2014 | 1.9485 | 2.1991 | 1.2673 | * 1.0691 |
| PM_2014_01 | ** 1.0001 | 1.7761 | ** 1.0000 | ** 1.0052 |
| PM_2014_02 | ** 1.0000 | ** 1.0192 | ** 1.0000 | ** 1.0000 |
| PM_2014_03 | 1.1123 | ** 1.0092 | ** 1.0002 | ** 1.0000 |
| PM_2014_04 | 1.5234 | 2.4338 | 1.1769 | 1.3751 |
| PM_2014_05 | ** 1.0005 | ** 1.0099 | ** 1.0047 | ** 1.0011 |
| PM_2014_06 | ** 1.0002 | ** 1.0028 | ** 1.0059 | ** 1.0002 |
| PM_2014_07 | ** 1.0006 | 2.6328 | ** 1.0013 | ** 1.0003 |
| PM_2014_08 | 1.1148 | * 1.0980 | 2.6242 | 1.3607 |
| PM_2014_09 | ** 1.0000 | ** 1.0053 | ** 1.0075 | ** 1.0340 |
| PM_2014_10 | ** 1.0004 | * 1.0860 | ** 1.0042 | * 1.0966 |
| PM_2014_11 | 1.1574 | 1.8433 | 1.7954 | 1.5836 |
| PM_2014_12 | ** 1.0367 | 1.4980 | 1.3242 | 1.2909 |
| Prior Cone | 1.3822 | 2.1422 | 1.3633 | 1.1997 |
| Progesterone | ** 1.0000 | ** 1.0000 | ** 1.0000 | ** 1.0000 |
| Proton Pump Inhibitor | * 1.0945 | 2.3852 | 1.1117 | 1.1141 |
| Region (City) | ** 1.0000 | 1.3816 | 1.2491 | 1.1881 |
| Sleeping Pills | 1.1788 | 1.1526 | 2.6498 | 1.9911 |
| Socioeconomic Status | ** 1.0314 | 1.2623 | 1.2843 | ** 1.0041 |
| Tricyclic Antidepressant | ** 1.0003 | 2.5491 | 1.3516 | ** 1.0188 |
a PTB, preterm birth during 2015–2017; * p < 0.10, ** p < 0.05.
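
Table 5 reports logistic regression results obtained with undersampling. The sketch below illustrates random undersampling in general terms and is not the authors' exact procedure. It assumes that records are dictionaries carrying a hypothetical binary `ptb` label; the helper name `undersample` and the toy data are illustrative only.

```python
import random


def undersample(rows, label_key="ptb", seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    `rows` is a list of dicts; `label_key` names a 0/1 outcome field.
    Both names are assumptions made for this illustration.
    """
    rng = random.Random(seed)
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]
    # Identify which class is smaller; the other one is subsampled.
    minority, majority = sorted((positives, negatives), key=len)
    sampled = rng.sample(majority, len(minority))
    balanced = minority + sampled
    rng.shuffle(balanced)
    return balanced


# Toy data: 5 preterm births among 100 deliveries (made-up numbers).
rows = [{"id": i, "ptb": 1 if i < 5 else 0} for i in range(100)]
balanced = undersample(rows)
```

The balanced sample would then be passed to a logistic regression routine; that step is omitted here because the source does not specify the software used.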
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Lee, K.-S.; Kim, H.-I.; Kim, H.Y.; Cho, G.J.; Hong, S.C.; Oh, M.J.; Kim, H.J.; Ahn, K.H. Association of Preterm Birth with Depression and Particulate Matter: Machine Learning Analysis Using National Health Insurance Data. Diagnostics 2021, 11, 555. https://doi.org/10.3390/diagnostics11030555
