Development and Validation of Risk Prediction Models for Gestational Diabetes Mellitus Using Four Different Methods

Wang, Ning; Guo, Haonan; Jing, Yingyu; Song, Lin; Chen, Huan; Wang, Mengjun; Gao, Lei; Huang, Lili; Song, Yanan; Sun, Bo; Cui, Wei; Xu, Jing

doi:10.3390/metabo12111040

by Ning Wang ¹,²,†, Haonan Guo ³,†, Yingyu Jing ³,†, Lin Song ⁴, Huan Chen ³, Mengjun Wang ⁴,⁵, Lei Gao ¹, Lili Huang ⁶, Yanan Song ¹, Bo Sun ⁴,*, Wei Cui ²,³,* and Jing Xu ¹,²,*

¹ Department of Endocrinology, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710004, China

² International Center for Obesity and Metabolic Disease Research of Xi’an Jiaotong University, Xi’an 710061, China

³ Department of Endocrinology and Second Department of Geriatrics, The First Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710061, China

⁴ Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Xi’an Jiaotong University Health Science Center, Xi’an 710061, China

⁵ Department of Endocrinology, 521 Hospital of Norinco Group, Xi’an 710065, China

⁶ Department of Medical Ultrasound, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an 710004, China

* Authors to whom correspondence should be addressed.

† Ning Wang, Haonan Guo, and Yingyu Jing contributed equally to this work.

Metabolites 2022, 12(11), 1040; https://doi.org/10.3390/metabo12111040

Submission received: 20 August 2022 / Revised: 26 September 2022 / Accepted: 25 October 2022 / Published: 29 October 2022

(This article belongs to the Special Issue Metabolomics in the Interventions of Metabolic Diseases and Physical Activity)

Abstract

Gestational diabetes mellitus (GDM), a common perinatal disease, is related to increased risks of maternal and neonatal adverse perinatal outcomes. We aimed to establish GDM risk prediction models that can be widely used in the first trimester using four different methods, including a score-scaled model derived from a meta-analysis using 42 studies, a logistic regression model, and two machine learning models (decision tree and random forest algorithms). The score-scaled model (seven variables) was established via a meta-analysis and a stratified cohort of 1075 Chinese pregnant women from the Northwest Women’s and Children’s Hospital (NWCH) and showed an area under the curve (AUC) of 0.772. The logistic regression model (seven variables) was established and validated using the above cohort and showed AUCs of 0.799 and 0.834 for the training and validation sets, respectively. Another two models were established using the decision tree (DT) and random forest (RF) algorithms and showed corresponding AUCs of 0.825 and 0.823 for the training set, and 0.816 and 0.827 for the validation set. The validation of the developed models suggested good performance in a cohort derived from another period. The score-scaled GDM prediction model, the logistic regression GDM prediction model, and the two machine learning GDM prediction models could be employed to identify pregnant women with a high risk of GDM using common clinical indicators, and interventions can be sought promptly.

Keywords:

gestational diabetes mellitus; prediction models; risk factors; early pregnancy

1. Introduction

Gestational diabetes mellitus (GDM) is usually diagnosed in the second or third trimester in pregnant women with no history of diabetes before pregnancy. GDM is characterized by hyperglycemia during pregnancy [1], which increases the mother's risk of developing postpartum type 2 diabetes mellitus (T2DM) [2], as well as the risks of premature delivery and clinical neonatal hypoglycemia [3]. Moreover, the fetuses of women with GDM are exposed to an intrauterine hyperglycemic environment from the beginning of the second trimester, which may lead to smaller fetuses [4] and abnormal abdominal circumference growth rates relative to those of pregnant women with normal glucose tolerance (NGT) [5].

The 75 g oral glucose tolerance test (OGTT) during the 24th–28th week of pregnancy is performed to diagnose GDM [1]. However, the fasting status and strict interval limitation for gestational age required for screening are unrealistic in some low-income areas in western China.

Some studies have employed the levels of angiopoietin-like protein 8, plasma fatty acid binding protein 4, and various adipokines [6,7,8] to predict GDM in the early stages of pregnancy, but these markers have been difficult to adopt widely in clinical settings. Other common clinical indicators, including excessive weight gain during pregnancy, increased pre-pregnancy body mass index (pre-BMI) [9,10,11], advanced maternal age [12], assisted reproductive technology (ART) treatment [13], family history of T2DM [14], and early age at menarche [15], are risk factors for GDM; they can be used as potential predictors of GDM development, but each showed poor predictive power when validated with our data. We screened several published GDM prediction models by literature review, and some of these have limited clinical utility because the included variables are not routinely measured. Kang and colleagues [16] developed a model comprising several clinically uncommon variables, including the levels of HbA1c, IgA, triglycerides, and the percentage of B lymphocytes at the early stages of pregnancy. The variables from the model of Schoenaker and colleagues [17], including diet and exercise, are difficult to measure in clinical settings. Finally, the GDM prediction model of Sweeting and colleagues [18], which targets the Asian population in different regions, has not been validated in China. Additionally, we validated three published GDM prediction models [19,20,21] using data from a cohort obtained in May 2021 from the Northwest Women’s and Children’s Hospital (NWCH), and these showed limited performance (Supplementary Table S10).

Therefore, we combined these risk factors to establish individualized GDM prediction models to screen women at a high risk of developing GDM during the early stages of pregnancy and guide the implementation of prevention strategies.

2. Materials and Methods

2.1. Data Sources

We first screened data of 1075 pregnant women from a retrospective cohort at the Department of Obstetrics at NWCH between November 2019 and March 2020. To validate the performance of the established models, we collected data from the NWCH cohort in May 2021. We obtained data from the routine gestational care visits. Women who carried a fetus to full term and had relatively complete pregnancy data were included. Women with diabetes before pregnancy who met the criterion for OGTT and were diagnosed with T2DM or other metabolic diseases were excluded. Finally, 1075 pregnant women were included for further analyses.

2.2. Outcomes

GDM or NGT was diagnosed according to the IADPSG criteria [22] between the 24th and 28th gestational week. None of the pregnant women included in this study underwent an OGTT in the 1st trimester, and all showed a fasting blood glucose (FBG) < 5.1 mmol/L in that trimester.

2.3. Clinical Measurements and Definitions

We calculated pre-BMI at the 1st gestational care visit as the recorded pre-pregnancy weight (kg) divided by the height squared (m²); maternal weights before and during pregnancy were recorded in the electronic medical records. Women were stratified by pre-BMI using the criteria specific for Chinese adults [23] as follows: ≤23.9 kg/m² as normal, 24–27.9 kg/m² as overweight, and ≥28 kg/m² as obese. We stratified maternal age into <30 years (yr), 30–34 yr, 35–39 yr, and ≥40 yr categories. The 1st-trimester gestational weight gain (GWG) was calculated as the maternal weight at the beginning of the 2nd trimester minus the pre-pregnancy weight, and “above the recommended GWG” was defined based on the Institute of Medicine (IOM) guideline recommendations [24]. A weight gain of more than 2 kg in the 1st trimester was considered excessive for all the pre-BMI stratification groups. Age at menarche was grouped into <11 yr and ≥11 yr categories, as existing literature suggests that menarche age < 11 yr is a risk factor for metabolic diseases and fetal overgrowth during pregnancy [25,26,27]. FBG in the 1st trimester was stratified into two groups using a 5.0 mmol/L cut-off value as the recommended FBG level during early pregnancy [28]. These stratifications were based on clinical knowledge and practice guidelines. Positive thyroid-related antibody status in this study was defined as positive TPO-Ab/Tg-Ab in the 1st trimester.
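The stratification rules above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and treating 24.0 kg/m² as the exact boundary between "normal" and "overweight" is an assumption, since the text leaves the interval between 23.9 and 24 unspecified.

```python
def pre_bmi(weight_kg: float, height_m: float) -> float:
    """Pre-pregnancy BMI in kg/m^2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(bmi: float) -> str:
    """Chinese adult cut-offs from the text (assumed boundary at 24.0)."""
    if bmi < 24.0:
        return "normal"
    if bmi < 28.0:
        return "overweight"
    return "obese"


def age_category(age_years: float) -> str:
    """Maternal age bands used in the study."""
    if age_years < 30:
        return "<30"
    if age_years < 35:
        return "30-34"
    if age_years < 40:
        return "35-39"
    return ">=40"


def excessive_first_trimester_gwg(pre_pregnancy_kg: float,
                                  start_second_trimester_kg: float) -> bool:
    """True when 1st-trimester weight gain exceeds 2 kg."""
    return (start_second_trimester_kg - pre_pregnancy_kg) > 2.0
```

For example, a woman weighing 60 kg at 1.60 m has a pre-BMI of about 23.4 kg/m² and falls in the normal category.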

2.4. Data Collection and Detection of Plasma Biochemical Parameters

The questionnaire survey for participants included general information on medical history, T2DM family history, reproductive history, and age at menarche. All laboratory tests were performed by standard methods at a certified laboratory. The glucose oxidase method was used to test FBG levels, with intra- and inter-assay coefficients of variation of 2.1% and 2.6%, respectively. The enzyme-catalyzed method was employed to obtain the plasma lipid profiles (triacylglycerol (TG), total cholesterol (CHO), low-density lipoprotein cholesterol (LDL-C), and high-density lipoprotein cholesterol (HDL-C)). Plasma thyroid function was tested using commercially available kits (FT3 (R-A-03-01, 1–81 pmol/L), FT4 (R-A-04-01, 3–200 pmol/L), TSH (R-A-05-01, 0–50 μIU/mL), Tg-Ab (R-A-07-01, 30–3000 IU/mL), and TPO-Ab (R-A-08-01, 10–1000 IU/mL); 3V Bioengineering, Shandong, China). The indexes of liver function (aspartate aminotransferase (AST), alanine aminotransferase (ALT), total protein, albumin, and globulin), vitamin B12, and ferritin levels were tested using the reagents manufactured by Shandong 3V Bioengineering Company, Weifang, China, on the Hitachi 7600 automatic biochemical analyzer platform.

2.5. Derivation Cohort for the Score-Scaled GDM Risk Prediction Model

The meta-analysis for the derivation cohort, comprising cohorts from 42 studies (12 prospective and 30 retrospective cohorts), was registered in PROSPERO (ID CRD42022302930). We searched electronic databases, including MEDLINE, Embase, and Web of Science, for articles published up to 10 July 2021, using a search strategy that combined MeSH headings with the following terms: “Diabetes, Gestational”, “Risk Factors”, and “Cohort Studies.” The pregnant women were from Asia (China, Iran, Israel, Japan, Korea, and Malaysia), Europe (Finland, Italy, Spain, Sweden, and the UK), the Americas (U.S. and Peru), and Australia; three international multi-center cohorts were also included. Odds ratios (ORs) and corresponding 95% confidence intervals (CIs) were estimated for the risk factors extracted from these articles. We used the Newcastle-Ottawa Scale to evaluate the quality of the included studies. We used Endnote to screen the titles and abstracts after removing duplicate reports; WN and GHN screened independently. In cases of disagreement, a third investigator (JYY) was involved for discussion and resolution. Full texts of the screened studies were evaluated by WN and GHN according to the set criteria, and any disagreement was resolved by consensus with the participation of the third investigator (JYY). Further, the data were extracted and checked by WN, GHN, and JYY. Figure 1 presents the flow chart of the study selection process. Supplementary Tables S1 and S2 list the keywords, derivative words, and retrieval strategy.

2.6. Statistical Analysis

Continuous variables showing normal distribution were presented as means ± standard deviation (SD). Differences between the NGT and GDM groups were compared using the t-test; the Chi-squared test was utilized for categorical variables.

2.6.1. Meta-Analysis

ORs with 95% CIs of all risk factors for GDM were extracted and analyzed using random-effects or fixed-effects models based on their heterogeneity. Each study was weighted by the inverse of the variance of its OR estimate [29]. Heterogeneity among studies was measured using I²; when I² was >50%, the random-effects model was used; otherwise, the fixed-effects model was chosen [30]. If risk factors showed significant heterogeneity (I² > 50%), sensitivity analyses were conducted by omitting a single study at a time to assess the robustness of the results [31]. Funnel plots were drawn to assess publication bias. A p-value < 0.05 in a two-sided test suggested statistical significance. We used the Review Manager software version 5.0 for the meta-analysis.
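As an illustration of inverse-variance weighting and the I² rule described above, the following minimal sketch pools ORs on the log scale. It is not the Review Manager implementation: the function names are hypothetical, and recovering each standard error from the 95% CI width (dividing by 2 × 1.96) is an assumption about how the inputs are available.

```python
import math


def pool_fixed_effect(odds_ratios, ci_lowers, ci_uppers):
    """Inverse-variance fixed-effect pooling of ORs; returns (pooled OR, I^2 in %)."""
    log_ors = [math.log(o) for o in odds_ratios]
    # Assumption: SE of log(OR) recovered from the reported 95% CI width.
    ses = [(math.log(u) - math.log(l)) / (2 * 1.96)
           for l, u in zip(ci_lowers, ci_uppers)]
    weights = [1.0 / se ** 2 for se in ses]
    pooled_log = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    # Cochran's Q and I^2 = max(0, (Q - df) / Q) * 100
    q = sum(w * (y - pooled_log) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return math.exp(pooled_log), i2


def choose_model(i2_percent: float) -> str:
    """Model choice rule stated in the text: random effects when I^2 > 50%."""
    return "random-effects" if i2_percent > 50.0 else "fixed-effects"
```

With two hypothetical studies reporting ORs of 1.0 (0.8–1.25) and 4.0 (3.2–5.0), I² is close to 99%, so the rule would select the random-effects model.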

2.6.2. Multiple Imputations

Multiple imputation by chained equations was employed to replace the missing values for vitamin B12, ferritin, CHO, TG, LDL-C, and HDL-C. The number of all missing values was within 20% of the total. Five estimation models were used based on the sample size and the capacity of the software (Jupyter Notebook 6.4.5, Python 3.9.7).

2.6.3. The Logistic Regression Modeling Strategy

We randomly divided the pregnant women into training (n = 765) and validation (n = 310) sets. Candidate variables were screened by Lasso regression, and the variables retained in the model were determined by cross-validated Lasso logistic regression (3 folds, seed 123), selecting the largest lambda for which the mean squared prediction error (MSPE) was within one standard error of the minimum (STATA 15.0). The model was then fitted by multivariate logistic regression (SPSS 22.0). The nomogram was constructed and AUCs were measured using STATA (version 15.0).

2.6.4. The Machine Learning (ML) Algorithms

The original data consisted of 722 NGT and 352 GDM women. To improve the performance of the ML models, a combination of over-sampling by SMOTE (synthetic minority oversampling technique) for the minority class and random under-sampling for the majority class was used to balance the dataset. The DT and RF algorithms were used to establish GDM prediction models using Jupyter Notebook (Anaconda) 6.4.5. All graphs for ML models were plotted using Graphviz 2.38.
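The balancing step can be sketched with the standard library alone. This is a simplified illustration of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours) plus random under-sampling, not the authors' code: the function names, the default k = 5, and the seeds are assumptions, and the sampling ratios used in the study are not stated in the text.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by linear interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


def random_undersample(majority, n_keep, seed=0):
    """Keep a random subset of the majority class without replacement."""
    return random.Random(seed).sample(majority, n_keep)
```

Every synthetic point lies on a segment between two real minority samples, so it stays inside the range of the observed minority data.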

2.7. Development and Validation of the Models

2.7.1. The Score-Scaled GDM Risk Prediction Model

Considering the practicality and clinical utility of the prediction model, we only extracted the pooled ORs and their 95% CIs for risk factors with statistically significant differences and selected the appropriate ORs for sensitivity analyses, based on the criteria proposed by Sullivan and colleagues [32]. The score of each risk factor was calculated by multiplying its β-coefficient by 10 and rounding the result. The total score was the sum of the scores of all risk factors. The data of 1075 pregnant women from November 2019 to March 2020 were used for the risk stratification, and validation was performed using the data of 210 pregnant women from the May 2021 cohort. The total score was calculated for each pregnant woman based on the variables in the model. The occurrence of GDM in the 2nd trimester was the outcome used to assess the AUC. According to the optimal cut-off value and the median score of the two intervals [33], pregnant women were divided into 4 groups, namely relatively low, moderate, high, and very high risk, to analyze the differences in the proportion of outcomes.
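The scoring rule can be illustrated as follows. This sketch assumes that the β-coefficient of a risk factor is the natural logarithm of its pooled OR, which the text does not state explicitly; the function name and the OR values in the example are hypothetical, not results from the study.

```python
import math


def points_from_or(pooled_or: float, scale: int = 10) -> int:
    """Convert a pooled OR to integer points: round(scale * ln(OR))."""
    if pooled_or <= 0:
        raise ValueError("an odds ratio must be positive")
    beta = math.log(pooled_or)  # assumption: beta = ln(pooled OR)
    return round(scale * beta)
```

Under this assumption, a hypothetical pooled OR of 2.0 gives β ≈ 0.693 and therefore 7 points, while an OR of 1.0 contributes 0 points.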

2.7.2. Logistic Regression Analysis for GDM Risk Prediction Model

Variables were screened by Lasso regression analysis, and cross-validation was performed. Multivariate logistic regression (backward variable selection) was conducted using the training set. An equation and a nomogram were provided to facilitate the use of the prediction model. The validation set comprised 310 women, and the AUC was used to estimate the model’s predictive accuracy, that is, how well the predicted probabilities separated women who developed GDM from those who did not. The calibration curve was plotted to evaluate the association between the probability estimated using the logistic regression model and the observed GDM rate in the training and validation sets. The Hosmer–Lemeshow (H-L) test was conducted to determine the differences between the predicted and the observed values. The decision curve showed the net benefit from the model. The model was further validated using data from 210 pregnant women from the May 2021 cohort.

2.7.3. ML Prediction Models

Two ML algorithms, namely DT and RF, were employed. A DT is a tree structure consisting of a single root node, several internal nodes, and leaf nodes. The starting node is the root, and the path from the root to a leaf node represents the classification process of the DT. The CART algorithm based on the Gini index was used. RF is an ensemble algorithm built on multiple DT classifiers. Each DT in the RF is constructed randomly from a subset of the feature variables. The learning process includes bagging and random feature selection.
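To make the Gini-based splitting criterion concrete, the sketch below computes the Gini impurity of a node and the weighted impurity of a candidate split, which CART minimizes when choosing a split. The function names are illustrative and the labels in the example are made up.

```python
from collections import Counter


def gini(labels) -> float:
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(left_labels, right_labels) -> float:
    """Size-weighted Gini impurity of the two child nodes of a split."""
    n = len(left_labels) + len(right_labels)
    return (len(left_labels) / n) * gini(left_labels) \
        + (len(right_labels) / n) * gini(right_labels)
```

A node with equal numbers of GDM (1) and NGT (0) women has impurity 0.5; a split that separates the two classes perfectly reduces the weighted impurity to 0.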

To achieve a better prediction performance by ML algorithms, we used the Random Under-Sampling and Synthetic Minority Over-Sampling Technique (SMOTE) to balance the data. Data were randomly divided into training (70%) and validation (30%) sets for DT and RF algorithms, respectively. The final classification results mostly contained DTs. Indicators such as the AUC-ROC curve, precision, recall, F1-score, accuracy, and specificity were employed to measure the performance of these ML prediction models.
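The threshold-based indicators listed above all follow from the four cells of a confusion matrix. The sketch below shows the standard definitions; the function name and the counts in the example are hypothetical, not results from the study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard metrics from confusion-matrix counts (GDM = positive class)."""
    def ratio(num, den):
        return num / den if den else 0.0  # avoid division by zero

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)           # also called sensitivity
    return {
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "specificity": ratio(tn, tn + fp),
    }
```

For instance, with 8 true positives, 2 false positives, 18 true negatives, and 2 false negatives, precision, recall, and F1-score are all 0.8, and specificity is 0.9.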

3. Results

3.1. The Score-Scaled GDM Risk Prediction Model

3.1.1. Derivation Cohort

The patients in the derivation cohort were from 42 different studies, and the period ranged from 1 to 19 years. Supplementary Table S3 lists the characteristics of these studies. The qualities of these studies were measured using the Newcastle-Ottawa scales by SL and CH independently, and the scores ranged from 6 to 9 (Supplementary Table S4).

Seven reasonable risk factors were selected following the meta-analysis. Pooled ORs and their corresponding 95% CIs are shown in Figure 2A (the detailed data, forest plots, sensitivity analyses, and funnel plots of these factors are provided in Supplementary Table S5 and Supplementary Figures S1–S9). Finally, the GDM risk prediction model was established with age (<30 yr scores 0, 30–34 yr scores 5.0, 35–39 yr scores 8.0, and ≥40 yr scores 9.0), pre-BMI (<24.0 kg/m² scores 0, 24–27.9 kg/m² scores 4.0, and ≥28 kg/m² scores 5.0), T2DM family history (0 if no and 6.0 if yes), age at menarche (0 if age at menarche > 11 yr and 3.0 if ≤11 yr), acceptance of ART treatment (0 if no and 2.0 if yes), the positivity of thyroid-related antibodies (0 if no and 5.0 if yes), and GWG above the IOM recommendation in the first trimester (0 if no and 2.0 if yes) (shown in Table 1).
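The point values listed above translate directly into a small calculator. The sketch below is only an illustration: the function and parameter names are hypothetical, and it follows the ≤11 yr menarche cut-off as written in this paragraph.

```python
def gdm_risk_score(age_years: float, pre_bmi: float, t2dm_family_history: bool,
                   menarche_age_years: float, art: bool,
                   thyroid_ab_positive: bool, gwg_above_iom: bool) -> int:
    """Total score of the score-scaled model, using the points in Table 1."""
    score = 0
    if age_years >= 40:
        score += 9
    elif age_years >= 35:
        score += 8
    elif age_years >= 30:
        score += 5
    if pre_bmi >= 28.0:
        score += 5
    elif pre_bmi >= 24.0:
        score += 4
    score += 6 if t2dm_family_history else 0
    score += 3 if menarche_age_years <= 11 else 0
    score += 2 if art else 0
    score += 5 if thyroid_ab_positive else 0
    score += 2 if gwg_above_iom else 0
    return score
```

As a worked example, a 36-year-old woman with a pre-BMI of 25 kg/m², a T2DM family history, and positive thyroid antibodies, but no other risk factors, scores 8 + 4 + 6 + 5 = 23.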

3.1.2. Validation Cohort

The clinical and serological indicators in the first trimester in the GDM and NGT groups are shown in Supplementary Table S6. Compared to the NGT group, women in the GDM group were older. The GDM group also had higher pre-BMI and FBG levels and larger proportions of women with a T2DM family history, age at menarche ≤ 11 yr, ART treatment, positive thyroid-related antibodies, GWG above the IOM recommendation in the first trimester, and a history of macrosomia than the NGT group (all p < 0.05). The model showed an AUC of 0.772 (95% CI 0.742–0.803) (Figure 2B). The optimal cut-off score was 9.5, which corresponded to the maximum Youden index (Supplementary Table S7). Based on this model, the 1075 pregnant women were further classified into four groups based on risk scores as follows: <5.5 as low (n = 580), 6–9.5 as moderate (n = 203), 10–18.5 as high (n = 259), and ≥19 as very high (n = 33) risk levels. The number of included pregnant women who developed GDM in the second trimester was 93 (15.7%) in the low group, 53 (28.5%) in the moderate group, 178 (67.2%) in the high group, and 28 (88.6%) in the very high group (Figure 2C). Statistically significant differences were found in the pairwise comparisons among the four groups (except between the high and the very high-risk groups).
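The four risk bands can be expressed as a simple lookup. In this sketch the function name is hypothetical, and the thresholds 5.5, 9.5, and 18.5 are used as inclusive upper bounds, an assumption that closes the small gaps between the stated intervals (for example between 5.5 and 6).

```python
def risk_band(total_score: float) -> str:
    """Map a total score to the risk level described in the text."""
    if total_score <= 5.5:
        return "low"
    if total_score <= 9.5:
        return "moderate"
    if total_score <= 18.5:
        return "high"
    return "very high"
```

A score of 23, for example, falls in the very high band, while a score of 9 is still moderate because it lies below the optimal cut-off of 9.5.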

3.2. Logistic Regression Analysis for the GDM Risk Prediction Model

Table 2 shows the comparison of clinical characteristics between the training cohort (n = 765) and validation cohort 1 (n = 310) and validation cohort 2 (n = 210).

3.2.1. Training Set

Seven variables were included in the model by cross-validated lasso-logistic regression (Figure 3A,B). Multivariate logistic regression was used for the calculation and these results are shown in Table 3. The nomogram was used to predict the incidence of GDM in pregnant women in the first trimester (Figure 3C). The predicted GDM risk was estimated using the following equation:

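$$P = \frac{1}{1 + \exp(-X)}$$

$$
\begin{aligned}
X = {} & -3.417 \\
& + 0.492\,[\text{maternal age 30–34 yr}] \;/\; 0.984\,[\text{maternal age 35–39 yr}] \;/\; 1.492\,[\text{maternal age} \geq 40\ \text{yr}] \\
& + 0.976\,[\text{T2DM family history}] \\
& + 0.691\,[\text{pre-BMI 24–27.9 kg/m}^{2}] \;/\; 1.382\,[\text{pre-BMI} \geq 28\ \text{kg/m}^{2}] \\
& + 0.776\,[\text{ART}] \\
& + 1.381\,[\text{thyroid antibodies positive}] \\
& + 1.273\,[\text{above IOM recommended GWG at the 1st trimester}] \\
& + 0.753\,[\text{FBG} \geq 5.0\ \text{mmol/L}]
\end{aligned}
$$

Here each bracketed term equals 1 if the condition holds and 0 otherwise, and the alternatives separated by "/" are mutually exclusive categories of the same variable.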
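The equation above can be evaluated with a short function. The coefficients are those given in the equation; the function name, the string labels for the age and pre-BMI categories, and the reading of "/" as mutually exclusive categories are assumptions made for this sketch.

```python
import math

INTERCEPT = -3.417
AGE_COEF = {"<30": 0.0, "30-34": 0.492, "35-39": 0.984, ">=40": 1.492}
BMI_COEF = {"<24": 0.0, "24-27.9": 0.691, ">=28": 1.382}


def gdm_probability(age_band: str, bmi_band: str, t2dm_family_history: bool,
                    art: bool, thyroid_ab_positive: bool,
                    gwg_above_iom: bool, fbg_ge_5: bool) -> float:
    """Predicted GDM probability P = 1 / (1 + exp(-X))."""
    x = INTERCEPT + AGE_COEF[age_band] + BMI_COEF[bmi_band]
    x += 0.976 if t2dm_family_history else 0.0
    x += 0.776 if art else 0.0
    x += 1.381 if thyroid_ab_positive else 0.0
    x += 1.273 if gwg_above_iom else 0.0
    x += 0.753 if fbg_ge_5 else 0.0
    return 1.0 / (1.0 + math.exp(-x))
```

For a woman with no risk factors, X equals the intercept and the predicted probability is about 3%; for a 35–39-year-old woman whose only other risk factor is FBG ≥ 5.0 mmol/L, X = −1.68 and the predicted probability is about 16%.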

3.2.2. Discriminant Analysis

The AUC of the ROC curve was 0.799 (95% CI 0.763–0.836) in the training set and 0.834 (95% CI 0.785–0.882) in the validation set. The calibration curve of the training and validation sets, showing the association between the probability of GDM predicted using the model and the observed GDM rate, suggested consistency between the two. The decision curves for the training and validation sets suggested a net benefit without increasing the number of false positives (Figure 4). The p-value for the H-L tests was 0.374 in the training set and 0.530 in the validation set, indicating that the logistic regression analysis prediction model had high consistency.
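For readers who want to reproduce an AUC from predicted probabilities, the sketch below uses the rank-based (Mann-Whitney) definition: the fraction of GDM/NGT pairs in which the woman with GDM received the higher predicted probability, counting ties as one half. The function name and the example data are illustrative, and confidence intervals are not computed here.

```python
def auc(labels, scores) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

With labels [0, 0, 1, 1] and predicted probabilities [0.1, 0.4, 0.35, 0.8], three of the four positive/negative pairs are ordered correctly, giving an AUC of 0.75.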

3.3. Comparison of the Two Prediction Models

The Net Reclassification Index (NRI) and Integrated Discrimination Improvement (IDI) were used to compare the two prediction models at a given cut-off value. In this study, we compared the score-scaled GDM risk prediction model (model 1) with the logistic regression GDM risk prediction model (model 2). The results showed positive improvements in model 2, with an NRI of 0.208 and IDI of 0.045 (Supplementary Table S8), implying that model 2 had better predictive power than model 1.
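The two comparison measures can be sketched as follows, using their standard definitions: the two-category NRI at a single cut-off and the IDI as the change in mean predicted probability among events minus the change among non-events. The function names, the 0.5 cut-off, and the example probabilities are hypothetical; they are not the values or the software used in the study.

```python
def nri(labels, p_old, p_new, cutoff: float) -> float:
    """Two-category net reclassification index at one cut-off."""
    up_e = down_e = up_n = down_n = n_e = n_n = 0
    for y, old, new in zip(labels, p_old, p_new):
        moved_up = old < cutoff <= new
        moved_down = new < cutoff <= old
        if y == 1:
            n_e += 1
            up_e += moved_up
            down_e += moved_down
        else:
            n_n += 1
            up_n += moved_up
            down_n += moved_down
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


def idi(labels, p_old, p_new) -> float:
    """Integrated discrimination improvement of the new model over the old one."""
    def mean(values):
        return sum(values) / len(values)

    ev_old = [o for y, o in zip(labels, p_old) if y == 1]
    ev_new = [n for y, n in zip(labels, p_new) if y == 1]
    ne_old = [o for y, o in zip(labels, p_old) if y == 0]
    ne_new = [n for y, n in zip(labels, p_new) if y == 0]
    return (mean(ev_new) - mean(ev_old)) - (mean(ne_new) - mean(ne_old))
```

Positive values of either measure indicate that the new model classifies or separates events and non-events better than the old one.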

3.4. ML Models for GDM Prediction

The AUC, precision, recall, F1-score, accuracy, and specificity of the training and validation sets for the DT and RF models are shown in Supplementary Table S9. The ROC curves, tree structures, and feature importance curves for DT and RF models are illustrated in Supplementary Figures S10–S13.

3.5. Validation of the Established Models

We used data from the NWCH cohort of May 2021 to further validate the performance of the developed models, and the results suggested a good performance. The AUC was 0.769 (0.681–0.858) for the score-scaled model, 0.841 (0.736–0.891) for the logistic regression model, 0.777 (0.726–0.829) for the DT model, and 0.740 (0.684–0.795) for the RF model (Supplementary Table S10).

4. Discussion

We established GDM prediction models using meta-analysis, logistic regression, and two ML algorithms. For the score-scaled model, the following risk factors were selected through the meta-analysis: maternal age, pre-BMI, GWG in the first trimester, age at menarche, ART, T2DM family history, and positivity for thyroid-related antibodies (TPOAb and TgAb). Data for risk stratification were collected from 1075 pregnant women from NWCH who carried the fetus to full term and had relatively complete clinical data; the AUC was 0.772. Statistically significant differences were present among the four risk groups divided at the cut-off point of 9.5 (except between the high and very high-risk groups). For the logistic regression analysis model, the variables included maternal age, pre-BMI, GWG in the first trimester, acceptance of ART treatment, T2DM family history, the positivity of thyroid-related antibodies, and FBG. The training and validation sets showed AUCs of 0.799 (95% CI 0.763–0.836) and 0.834 (95% CI 0.785–0.882), respectively. The prediction models developed using two ML algorithms (DT and RF) comprised all the above-mentioned risk factors and showed relatively reasonable prediction performances in both the training and validation sets. All the models were validated using data from the cohort collected in May 2021 at NWCH and showed good performance.

For the correlation of the risk factors and GDM, a dose–response association from a meta-analysis [12] showed a linear relationship between the risk of developing GDM and advanced maternal age, with the same age stratification as in our study. Overweight/obesity before pregnancy, excess GWG [9,10,11], and younger menarche age [14] are risk factors for GDM and may be mediated by the status of insulin resistance. T2DM is a polygenic inherited disease with overlapping susceptibility genes as in GDM [15]. Elevated FBG levels in the first trimester were associated with GDM [34]; however, the majority of studies show inconsistent cut-off values [35,36]. Some particular ART procedures are related to the increased incidences of GDM [13], which may be caused by advanced maternal age and underlying maternal subfertility-related diseases. The positivity of thyroid-related antibodies (TPOAb or TgAb) was associated with the risk of developing GDM [37]. Increased TPOAb/TgAb in the peripheral blood of patients with Hashimoto thyroiditis can interfere with the homeostasis of β-cells by reducing the number of CD19⁺CD24^hiCD38^hi Breg cells and releasing inflammatory factors in peripheral blood, leading to insulin resistance [38,39]. However, any single clinical indicator is insufficient to predict the incidence of developing GDM accurately. The integration of published risk factors is more reliable for the early diagnosis of GDM.

Existing published predictors of GDM including angiopoietin-like protein 8, plasma fatty acid binding protein 4, and various adipokines are not popular in clinical practice [6,7,8]. Additionally, the data of some GDM prediction models [40,41] were from small cross-sectional and case-control studies, which is not conducive to application in the cohort of pregnant women. For a GDM prediction model that was developed in a retrospective study comprising 580,000 electronic medical records of Israeli pregnant women, the AUC was 0.85 after including all GDM-related variables, while the AUC of the questionnaire prediction model using only nine GDM-related variables was 0.80 [42]. However, the variables in these models were obtained in the 20th gestational week, and thus, GDM could not be predicted and intervened in the early stages of pregnancy. Our score-scaled GDM prediction model was developed using a meta-analysis of 42 high-quality cohort studies, which significantly improved the statistical performance and ease of calculation. The risk factors in our GDM prediction models are easily accessible from the clinical work in the first trimester. These advantages make our model more convenient and easier to use in clinical settings.

To our knowledge, few published score-scaled GDM risk prediction models that can be used in the first trimester in clinical settings have been derived from a meta-analysis. Several previously published GDM-related risk factors were not included in our meta-analysis because of differing diagnostic criteria for GDM.

The score-scaled GDM risk prediction model has numerous advantages, including its applicability across clinical settings and ease of generalization in underdeveloped areas. Notably, the model was established based on pregnant women who were >18 years old and is well-suited to this group. Moreover, two different criteria for the classification of women’s pre-BMI were employed in the included studies: (1) among White women, the criteria were 18.5–24.9 kg/m² for normal, 25.0–29.9 kg/m² for overweight, and ≥30.0 kg/m² for obese according to World Health Organization guidelines [43] and (2) among Chinese women, the criteria were 18.5–23.9 kg/m² for normal, 24.0–27.9 kg/m² for overweight, and ≥28.0 kg/m² for obese [44] according to the criteria recommended by the Working Group on Obesity in China, 2003.

The score-scaled GDM risk prediction model did not include FBG in the first trimester, owing to the inconsistency in the gestational week for FBG detection and the different cut-off values. Thus, the results could not be merged in the meta-analysis. However, in the logistic regression analysis model, the FBG level in the first trimester was a continuous variable, which enabled its stratification according to a published study [28].

For the logistic regression analysis model, the probability of developing GDM is calculated from a logistic equation, which is a relatively tedious process. The corresponding nomogram can roughly estimate the probability of developing GDM; however, these estimates have limited accuracy. Moreover, the model may be impractical in some low-income areas, such as in western China, as it requires pregnant women to visit the hospital in the fasting state in the first trimester for FBG assessment.

Because of their embedded feature selection methods, the published ML prediction models and the ML models in this study inevitably have limited large-scale generalizability [42,45]. Therefore, the ML algorithms in our study, as a method to establish the prediction models, are not recommended for clinical use. The published ML models comprise serological indicators, including HbA1c and TG, which are relatively hard to obtain in the early stages of pregnancy [42,45].

It is necessary to identify modifiable risk factors in prediction models and to assess whether their relationship with GDM is causal, which is particularly important for prevention. In this study, GWG in the first trimester was identified as the modifiable factor, and thus, our prediction models could be used to evaluate the reduction in GDM incidence after an early intervention; however, this possibility needs to be tested further.

Our study has some limitations. The data were extracted from a large tertiary institution, with a relatively high GDM incidence as compared to that in community hospitals or remote mountainous areas. As this was a single-center study, the cohort may not represent the diverse ethnic populations in China. As it was a retrospective study, we could not collect detailed information on diet and exercise, which may affect basal metabolic rates and influence GWG, potentially mediating the association between GWG and GDM. Moreover, the first antenatal care visit of pregnant women is usually in the 12th gestational week, which introduces potential recall bias in the pre-pregnancy weight used for calculating the pre-BMI. Finally, the prediction models included thyroid function indicators during the first trimester, a characteristic highlight of our study. Although the thyroid function test is becoming increasingly common in China during a regular prenatal visit, extensive testing of thyroid function in pregnant women remains difficult to implement in some remote areas, especially in western China, which is a major limiting factor for the application of our prediction models in clinical settings.

5. Conclusions

We developed and validated GDM prediction models based on a meta-analysis, logistic regression, and two ML algorithms, which can be used to predict the incidence of GDM in the first trimester. The calculation method for the score-scaled GDM prediction model was relatively simple but showed lower prediction accuracy as compared to other models. The computational formula of the logistic regression GDM model was relatively complex but showed higher accuracy. The ML models exhibited the highest accuracy, but these are hard to implement in clinical practice. Clinicians can choose an appropriate prediction model based on the variables according to local settings. The predictive tools make it easy to identify women at high risk of developing GDM.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/metabo12111040/s1. Table S1. Keywords and derivative words; Table S2. Retrieval strategy; Table S3. Baseline characteristics of the included studies [46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82]; Table S4. Newcastle-Ottawa Quality Assessment Scale for the 42 cohort studies; Table S5. Seven risk factors included in the systematic review and meta-analysis; Table S6. The clinical characteristics of pregnant women in the GDM and NGT groups in the 1st trimester; Table S7. Performance of the GDM risk prediction model at different cut-off values; Table S8. Comparison of the score-scaled GDM prediction model and the logistic regression GDM prediction model; Table S9. Comparison of the two ML models; Table S10. Validation of the published GDM models and the models established in this study. Figures S1–S8: Forest plots of the variables included in the score-scaled GDM prediction model; Figure S9: Funnel plot of the variables included in the score-scaled GDM prediction model; Figure S10: ROC curves for the two ML models; Figure S11: Tree structure of the decision tree model; Figure S12: Tree structure of the random forest model; Figure S13: Feature importance curves derived from the decision tree and random forest models.

Author Contributions

Conceptualization, N.W., B.S., W.C. and J.X.; data curation, N.W., H.G., Y.J., H.C., M.W., L.G. and L.H.; formal analysis, H.G., Y.J., H.C. and Y.S.; funding acquisition, L.S., B.S. and W.C.; investigation, N.W., Y.J., L.S., M.W., L.G., L.H., Y.S. and J.X.; methodology, N.W., H.G., Y.J., L.S., H.C., M.W., L.G., L.H., Y.S. and B.S.; project administration, B.S.; software, H.C. and M.W.; supervision, L.S., L.G., B.S., W.C. and J.X.; validation, H.G., B.S. and J.X.; visualization, L.S., B.S. and J.X.; writing—original draft, N.W. and L.H.; writing—review and editing, N.W., L.S., H.C., M.W., B.S., W.C. and J.X. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by grants from the Natural Science Foundation of Shaanxi Province (No. 2020GXLH-Y-029, 2019JQ069, 2019JM262), the Clinical Research Award of the First Affiliated Hospital of Xi’an Jiaotong University, China (No. XJTU1AF-CRF-2019-007), and the Natural Science Foundation of China (No. 81801459; No. 82071732).

Institutional Review Board Statement

This study, which involved human participants, was reviewed and approved by the Ethics Committee of the First Affiliated Hospital of Xi’an Jiaotong University (XJTU1AF2019LSL-007), Xi’an, China. The study was registered with the Chinese Clinical Trial Registry (ChiCTR; registration number 1900026735). This study was conducted in accordance with the guidelines of the Declaration of Helsinki.

Informed Consent Statement

Written informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Data supporting the reported results are stored at the Second Affiliated Hospital of Xi’an Jiaotong University.

Acknowledgments

The authors thank all obstetricians of the Northwest Women’s and Children’s Hospital who participated in the study and contributed to data collection.

Conflicts of Interest

The authors declare no potential conflicts of interest.

Abbreviations

GDM	gestational diabetes mellitus
ML	machine learning
AUC	area under the curve
NGT	normal glucose tolerance
OGTT	oral glucose tolerance test
pre-BMI	pre-pregnancy body mass index
GWG	gestational weight gain
TPO-Ab	thyroid peroxidase antibody
Tg-Ab	thyroglobulin antibody
ART	assisted reproductive technology
ALT	glutamic-pyruvic transaminase
AST	glutamic oxalacetic transaminase
CHO	total cholesterol
TG	triglyceride
HDL-C	high-density lipoprotein cholesterol
LDL-C	low-density lipoprotein cholesterol

References

American Diabetes Association Professional Practice Committee. 2. Classification and Diagnosis of Diabetes: Standards of Medical Care in Diabetes-2022. Diabetes Care 2022, 45, S17–S38. [Google Scholar] [CrossRef] [PubMed]
Kim, C.; Newton, K.M.; Knopp, R.H. Gestational diabetes and the incidence of type 2 diabetes: A systematic review. Diabetes Care 2002, 25, 1862–1868. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Gabbe, S.G. Gestational diabetes mellitus. N. Engl. J. Med. 1986, 315, 1025–1026. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Sletner, L.; Jenum, A.K.; Yajnik, C.S.; Morkrid, K.; Nakstad, B.; Rognerud-Jensen, O.H.; Birkeland, K.I.; Vangen, S. Fetal growth trajectories in pregnancies of European and South Asian mothers with and without gestational diabetes, a population-based cohort study. PLoS ONE 2017, 12, e0172946. [Google Scholar] [CrossRef] [Green Version]
Sovio, U.; Murphy, H.R.; Smith, G.C. Accelerated Fetal Growth Prior to Diagnosis of Gestational Diabetes Mellitus: A Prospective Cohort Study of Nulliparous Women. Diabetes Care 2016, 39, 982–987. [Google Scholar] [CrossRef] [Green Version]
Leong, I. Diabetes: ANGPTL8 as an early predictor of gestational diabetes mellitus. Nat. Rev. Endocrinol. 2018, 14, 64. [Google Scholar] [CrossRef]
Ning, H.; Tao, H.; Weng, Z.; Zhao, X. Plasma fatty acid-binding protein 4 (FABP4) as a novel biomarker to predict gestational diabetes mellitus. Acta Diabetol. 2016, 53, 891–898. [Google Scholar] [CrossRef]
Bao, W.; Baecker, A.; Song, Y.; Kiely, M.; Liu, S.; Zhang, C. Adipokine levels during the first or early second trimester of pregnancy and subsequent risk of gestational diabetes mellitus: A systematic review. Metabolism 2015, 64, 756–764. [Google Scholar] [CrossRef] [Green Version]
Santos, S.; Voerman, E.; Amiano, P.; Barros, H.; Beilin, L.J.; Bergstrom, A.; Charles, M.A.; Chatzi, L.; Chevrier, C.; Chrousos, G.P.; et al. Impact of maternal body mass index and gestational weight gain on pregnancy complications: An individual participant data meta-analysis of European, North American and Australian cohorts. BJOG 2019, 126, 984–995. [Google Scholar] [CrossRef]
Kim, S.Y.; England, L.; Wilson, H.G.; Bish, C.; Satten, G.A.; Dietz, P. Percentage of gestational diabetes mellitus attributable to overweight and obesity. Am. J. Public Health 2010, 100, 1047–1052. [Google Scholar] [CrossRef]
Yen, I.W.; Lee, C.N.; Lin, M.W.; Fan, K.C.; Wei, J.N.; Chen, K.Y.; Chen, S.C.; Tai, Y.Y.; Kuo, C.H.; Lin, C.H.; et al. Overweight and obesity are associated with clustering of metabolic risk factors in early pregnancy and the risk of GDM. PLoS ONE 2019, 14, e0225978. [Google Scholar] [CrossRef] [PubMed]
Li, Y.; Ren, X.; He, L.; Li, J.; Zhang, S.; Chen, W. Maternal age and the risk of gestational diabetes mellitus: A systematic review and meta-analysis of over 120 million participants. Diabetes Res. Clin. Prac. 2020, 162, 108044. [Google Scholar] [CrossRef] [PubMed]
Wang, Y.A.; Nikravan, R.; Smith, H.C.; Sullivan, E.A. Higher prevalence of gestational diabetes mellitus following assisted reproduction technology treatment. Hum. Reprod. 2013, 28, 2554–2561. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Petry, C.J.; Ong, K.K.; Hughes, I.A.; Acerini, C.L.; Dunger, D.B. The association between age at menarche and later risk of gestational diabetes is mediated by insulin resistance. Acta Diabetol. 2018, 55, 853–859. [Google Scholar] [CrossRef] [Green Version]
Dereke, J.; Palmqvist, S.; Nilsson, C.; Landin-Olsson, M.; Hillman, M. The prevalence and predictive value of the SLC30A8 R325W polymorphism and zinc transporter 8 autoantibodies in the development of GDM and postpartum type 1 diabetes. Endocrine 2016, 53, 740–746. [Google Scholar] [CrossRef]
Kang, M.; Zhang, H.; Zhang, J.; Huang, K.; Zhao, J.; Hu, J.; Lu, C.; Shao, J.; Weng, J.; Yang, Y.; et al. A Novel Nomogram for Predicting Gestational Diabetes Mellitus During Early Pregnancy. Front. Endocrinol. 2021, 12, 779210. [Google Scholar] [CrossRef]
Schoenaker, D.; Vergouwe, Y.; Soedamah-Muthu, S.S.; Callaway, L.K.; Mishra, G.D. Preconception risk of gestational diabetes: Development of a prediction model in nulliparous Australian women. Diabetes Res. Clin. Prac. 2018, 146, 48–57. [Google Scholar] [CrossRef]
Sweeting, A.N.; Appelblom, H.; Ross, G.P.; Wong, J.; Kouru, H.; Williams, P.F.; Sairanen, M.; Hyett, J.A. First trimester prediction of gestational diabetes mellitus: A clinical model based on maternal demographic parameters. Diabetes Res. Clin. Prac. 2017, 127, 44–50. [Google Scholar] [CrossRef]
Zheng, T.; Ye, W.; Wang, X.; Li, X.; Zhang, J.; Little, J.; Zhou, L.; Zhang, L. A simple model to predict risk of gestational diabetes mellitus from 8 to 20 weeks of gestation in Chinese women. BMC Pregnancy Childbirth 2019, 19, 252. [Google Scholar] [CrossRef] [Green Version]
Guo, F.; Yang, S.; Zhang, Y.; Yang, X.; Zhang, C.; Fan, J. Nomogram for prediction of gestational diabetes mellitus in urban, Chinese, pregnant women. BMC Pregnancy Childbirth 2020, 20, 43. [Google Scholar] [CrossRef]
Zhang, X.; Zhao, X.; Huo, L.; Yuan, N.; Sun, J.; Du, J.; Nan, M.; Ji, L. Risk prediction model of gestational diabetes mellitus based on nomogram in a Chinese population cohort study. Sci. Rep. 2020, 10, 21223. [Google Scholar] [CrossRef] [PubMed]
American Diabetes Association. Diagnosis and classification of diabetes mellitus. Diabetes Care 2011, 34 (Suppl. 1), S62–S69. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Chen, C.; Lu, F.C.; Department of Disease Control Ministry of Health, PR China. The guidelines for prevention and control of overweight and obesity in Chinese adults. Biomed. Environ. Sci. 2004, 17, 1–36. [Google Scholar] [PubMed]
Institute of Medicine (US); National Research Council (US). Weight Gain During Pregnancy: Reexamining the Guidelines; Rasmussen, K.M., Yaktine, A.L., Eds.; The National Academies Collection: Reports funded by National Institutes of Health; National Academies Press (US): Washington, DC, USA, 2009. [Google Scholar]
Wang, L.; Yan, B.; Shi, X.; Song, H.; Su, W.; Huang, B.; Zhang, Y.; Wang, S.; Lv, F.; Lin, M.; et al. Age at menarche and risk of gestational diabetes mellitus: A population-based study in Xiamen, China. BMC Pregnancy Childbirth 2019, 19, 138. [Google Scholar] [CrossRef] [PubMed]
Chen, L.; Li, S.; He, C.; Zhu, Y.; Buck Louis, G.M.; Yeung, E.; Hu, F.B.; Zhang, C. Age at Menarche and Risk of Gestational Diabetes Mellitus: A Prospective Cohort Study Among 27,482 Women. Diabetes Care 2016, 39, 469–471. [Google Scholar] [CrossRef] [Green Version]
Li, H.; Shen, L.; Song, L.; Liu, B.; Zheng, X.; Xu, S.; Wang, Y. Early age at menarche and gestational diabetes mellitus risk: Results from the Healthy Baby Cohort study. Diabetes Metab. 2017, 43, 248–252. [Google Scholar] [CrossRef]
Coustan, D.R.; Berkowitz, R.L.; Hobbins, J.C. Tight metabolic control of overt diabetes in pregnancy. Am. J. Med. 1980, 68, 845–852. [Google Scholar] [CrossRef]
Woodward, M. Epidemiology: Study Design and Data Analysis; Taylor & Francis: Oxfordshire, UK, 2000. [Google Scholar]
Deeks, J.J.; Higgins, J.; Altman, D. Cochrane Handbook: General Methods for Cochrane Reviews: Ch 9: Analysing Data and Undertaking Meta-Analyses; John Wiley & Sons: Hoboken, NJ, USA, 2011. [Google Scholar]
Greenland, S. Sensitivity Analysis and Bias Analysis; Springer: New York, NY, USA, 2014. [Google Scholar]
Sullivan, L.M.; Massaro, J.M.; D’Agostino, R.B., Sr. Presentation of multivariate data for clinical use: The Framingham Study risk score functions. Stat. Med. 2004, 23, 1631–1660. [Google Scholar] [CrossRef]
Cook, N.R. Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation 2007, 115, 928–935. [Google Scholar] [CrossRef] [Green Version]
Powe, C.E. Early Pregnancy Biochemical Predictors of Gestational Diabetes Mellitus. Curr. Diabetes Rep. 2017, 17, 12. [Google Scholar] [CrossRef]
Saeedi, M.; Hanson, U.; Simmons, D.; Fadl, H. Characteristics of different risk factors and fasting plasma glucose for identifying GDM when using IADPSG criteria: A cross-sectional study. BMC Pregnancy Childbirth 2018, 18, 225. [Google Scholar] [CrossRef] [PubMed]
Benhalima, K.; Van Crombrugge, P.; Moyson, C.; Verhaeghe, J.; Vandeginste, S.; Verlaenen, H.; Vercammen, C.; Maes, T.; Dufraimont, E.; De Block, C.; et al. Estimating the risk of gestational diabetes mellitus based on the 2013 WHO criteria: A prediction model based on clinical and biochemical variables in early pregnancy. Acta Diabetol. 2020, 57, 661–671. [Google Scholar] [CrossRef] [PubMed]
Luo, J.; Wang, X.; Yuan, L.; Guo, L. Association of thyroid disorders with gestational diabetes mellitus: A meta-analysis. Endocrine 2021, 73, 550–560. [Google Scholar] [CrossRef] [PubMed]
Montaner, P.; Juan, L.; Campos, R.; Gil, L.; Corcoy, R. Is thyroid autoimmunity associated with gestational diabetes mellitus? Metabolism 2008, 57, 522–525. [Google Scholar] [CrossRef]
Yang, M.; Du, C.; Wang, Y.; Liu, J. CD19(+)CD24(hi)CD38(hi) regulatory B cells are associated with insulin resistance in type I Hashimoto’s thyroiditis in Chinese females. Exp. Ther. Med. 2017, 14, 3887–3893. [Google Scholar] [CrossRef] [Green Version]
Correa, P.J.; Venegas, P.; Palmeiro, Y.; Albers, D.; Rice, G.; Roa, J.; Cortez, J.; Monckeberg, M.; Schepeler, M.; Osorio, E.; et al. First trimester prediction of gestational diabetes mellitus using plasma biomarkers: A case-control study. J. Perinat. Med. 2019, 47, 161–168. [Google Scholar] [CrossRef]
Nombo, A.P.; Mwanri, A.W.; Brouwer-Brolsma, E.M.; Ramaiya, K.L.; Feskens, E.J.M. Gestational diabetes mellitus risk score: A practical tool to predict gestational diabetes mellitus risk in Tanzania. Diabetes Res. Clin. Prac. 2018, 145, 130–137. [Google Scholar] [CrossRef]
Artzi, N.S.; Shilo, S.; Hadar, E.; Rossman, H.; Barbash-Hazan, S.; Ben-Haroush, A.; Balicer, R.D.; Feldman, B.; Wiznitzer, A.; Segal, E. Prediction of gestational diabetes based on nationwide electronic health records. Nat. Med. 2020, 26, 71–76. [Google Scholar] [CrossRef]
World Health Organization. Obesity: Preventing and Managing the Global Epidemic; Publications of World Health Organization: Geneva, Switzerland, 1999. [Google Scholar]
Zhou, B.-f. Effect of Body Mass Index on All-cause Mortality and Incidence of Cardiovascular Diseases—Report for Meta-Analysis of Prospective Studies on Optimal Cut-off Points of Body Mass Index in Chinese Adults. Biomed. Environ. Sci. 2002, 03, 60–67. [Google Scholar]
Wu, Y.T.; Zhang, C.J.; Mol, B.W.; Kawai, A.; Li, C.; Chen, L.; Wang, Y.; Sheng, J.Z.; Fan, J.X.; Shi, Y.; et al. Early Prediction of Gestational Diabetes Mellitus in the Chinese Population via Advanced Machine Learning. J. Clin. Endocrinol. Metab. 2021, 106, e1191–e1205. [Google Scholar] [CrossRef]
Qi, Y.; Sun, X.; Tan, J.; Zhang, G.; Chen, M.; Xiong, Y.; Chen, P.; Liu, C.; Zou, K.; Liu, X. Excessive gestational weight gain in the first and second trimester is a risk factor for gestational diabetes mellitus among women pregnant with singletons: A repeated measures analysis. J. Diabetes Investig. 2020, 11, 1651–1660. [Google Scholar] [CrossRef] [PubMed]
Hedderson, M.M.; Gunderson, E.P.; Ferrara, A. Gestational Weight Gain and Risk of Gestational Diabetes Mellitus. Obstet. Gynecol. 2010, 115, 597–604. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Zhong, C.; Li, X.; Chen, R.; Zhou, X.; Liu, C.; Wu, J.; Xu, S.; Wang, W.; Xiao, M.; Xiong, G.; et al. Greater early and mid-pregnancy gestational weight gain are associated with increased risk of gestational diabetes mellitus: A prospective cohort study. Clin. Nutr. ESPEN 2017, 22, 48–53. [Google Scholar] [CrossRef] [PubMed]
Dishi, M.; Enquobahrie, D.A.; Abetew, D.F.; Qiu, C.; Rudra, C.B.; Williams, M.A. Age at menarche, menstrual cycle characteristics and risk of gestational diabetes. Diabetes Res. Clin. Prac. 2011, 93, 437–442. [Google Scholar] [CrossRef] [PubMed]
Shen, Y.; Hu, H.; Taylor, B.D.; Kan, H.; Xu, X. Early Menarche and Gestational Diabetes Mellitus at First Live Birth. Matern. Child Heal. J. 2016, 21, 593–598. [Google Scholar] [CrossRef]
Ying, H.; Tang, Y.-P.; Bao, Y.-R.; Su, X.-J.; Cai, X.; Li, Y.-H.; Wang, D.-F. Maternal TSH level and TPOAb status in early pregnancy and their relationship to the risk of gestational diabetes mellitus. Endocrine 2016, 54, 742–750. [Google Scholar] [CrossRef]
Li, G.; Wei, T.; Ni, W.; Zhang, A.; Zhang, J.; Xing, Y.; Xing, Q. Incidence and Risk Factors of Gestational Diabetes Mellitus: A Prospective Cohort Study in Qingdao, China. Front. Endocrinol. 2020, 11. [Google Scholar] [CrossRef]
Männistö, T.; Mendola, P.; Grewal, J.; Xie, Y.; Chen, Z.; Laughon, S.K. Thyroid Diseases and Adverse Pregnancy Outcomes in a Contemporary US Cohort. J. Clin. Endocrinol. Metab. 2013, 98, 2725–2733. [Google Scholar] [CrossRef]
Yang, S.; Shi, F.-T.; Leung, P.; Huang, H.-F.; Fan, J. Low Thyroid Hormone in Early Pregnancy Is Associated With an Increased Risk of Gestational Diabetes Mellitus. J. Clin. Endocrinol. Metab. 2016, 101, 4237–4243. [Google Scholar] [CrossRef]
Lei, L.L.; Lan, Y.L.; Wang, S.Y.; Feng, W.; Zhai, Z.J. Perinatal complications and live-birth outcomes following assisted reproductive technology: A retrospective cohort study. Chin. Med. J.-Peking 2019, 132, 2408–2416. [Google Scholar] [CrossRef]
Barua, S.; Hng, T.-M.; Smith, H.; Bradford, J.; McLean, M. Ovulatory disorders are an independent risk factor for pregnancy complications in women receiving assisted reproduction treatments. Aust. N. Z. J. Obstet. Gynaecol. 2016, 57, 286–293. [Google Scholar] [CrossRef] [PubMed]
Hu, S.Q.; Xu, B.; Zhang, Y.N.; Jin, L. Risk factors of gestational diabetes mellitus during assisted reproductive technology procedures. Gynecol. Endocrinol. 2020, 36, 318–321. [Google Scholar]
Nagata, C.; Yang, L.M.; Yamamoto-Hanada, K.; Mezawa, H.; Ayabe, T.; Ishizuka, K.; Konishi, M.; Ohya, Y.; Saito, H.; Sago, H.; et al. Complications and adverse outcomes in pregnancy and childbirth among women who conceived by assisted reproductive technologies: A nationwide birth cohort study of Japan environment and children’s study. BMC Pregnancy Childbirth 2019, 19. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Shevell, T.; Malone, F.D.; Vidaver, J.; Porter, T.; A Luthy, D.; Comstock, C.H.; Hankins, G.D.; Eddleman, K.; Dolan, S.; Dugoff, L.; et al. Assisted reproductive technology and pregnancy outcome–a population based screening study (the faster trial). Am. J. Obstet. Gynecol. 2003, 189, S175. [Google Scholar] [CrossRef]
Silberstein, T.; Sheiner, E.; Levy, A.; Harlev, A.; Saphier, O. 520: Perinatal outcome of pregnancies following in vitro fertilization and ovulation induction. Am. J. Obstet. Gynecol. 2014, 210, S257. [Google Scholar] [CrossRef]
Stern, J.E.; Luke, B.; Tobias, M.; Gopal, D.; Hornstein, M.D.; Diop, H. Adverse pregnancy and birth outcomes associated with underlying diagnosis with and without assisted reproductive technology treatment. Fertil. Steril. 2015, 103, 1438–1445. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Cosson, E.; Cussac-Pillegand, C.; Benbara, A.; Pharisien, I.; Jaber, Y.; Banu, I.; Nguyen, M.T.; Valensi, P.; Carbillon, L. The Diagnostic and Prognostic Performance of a Selective Screening Strategy for Gestational Diabetes Mellitus According to Ethnicity in Europe. J. Clin. Endocrinol. Metab. 2014, 99, 996–1005. [Google Scholar] [CrossRef] [Green Version]
Hosseini, E.; Janghorbani, M.; Shahshahan, Z. Comparison of risk factors and pregnancy outcomes of gestational diabetes mellitus diagnosed during early and late pregnancy. Midwifery 2018, 66, 64–69. [Google Scholar] [CrossRef]
Larrabure-Torrealva, G.T.; Martinez, S.; Luque-Fernandez, M.A.; Sanchez, S.E.; Mascaro, P.A.; Ingar, H.; Castillo, W.; Zumaeta, R.; Grande, M.; Motta, V.; et al. Prevalence and risk factors of gestational diabetes mellitus: Findings from a universal screening feasibility program in Lima, Peru. BMC Pregnancy Childbirth 2018, 18, 303. [Google Scholar] [CrossRef] [Green Version]
Leng, J.; Shao, P.; Zhang, C.; Tian, H.; Zhang, F.; Zhang, S.; Dong, L.; Li, L.; Yu, Z.; Chan, J.; et al. Prevalence of Gestational Diabetes Mellitus and Its Risk Factors in Chinese Pregnant Women: A Prospective Population-Based Study in Tianjin, China. PLoS ONE 2015, 10, e0121029. [Google Scholar] [CrossRef]
Pirjani, R.; Shirzad, N.; Qorbani, M.; Phelpheli, M.; Nasli-Esfahani, E.; Bandarian, F.; Hemmatabadi, M. Gestational diabetes mellitus its association with obesity: A prospective cohort study. Eat. Weight Disord.-Stud. Anorexia, Bulim. Obes. 2016, 22, 445–450. [Google Scholar] [CrossRef]
Schaefer, K.K.; Xiao, W.; Chen, Q.; He, J.; Lu, J.; Chan, F.; Chen, N.; Yuan, M.; Xia, H.; Lam, K.B.H.; et al. Prediction of gestational diabetes mellitus in the Born in Guangzhou Cohort Study, China. Int. J. Gynecol. Obstet. 2018, 143, 164–171. [Google Scholar] [CrossRef]
Shahbazian, H.; Nouhjah, S.; Shahbazian, N.; Jahanfar, S.; Latifi, S.M.; Aleali, A.; Shahbazian, N.; Saadati, N. Gestational diabetes mellitus in an Iranian pregnant population using IADPSG criteria: Incidence, contributing factors and outcomes. Diabetes Metab. Syndr. Clin. Res. Rev. 2016, 10, 242–246. [Google Scholar] [CrossRef] [PubMed]
Wang, Y.; Luo, B. Risk factors analysis of gestational diabetes mellitus based on International Association of Diabetes Pregnancy Study Groups criteria. Nan fang yi ke da xue xue bao = J. Southern Med. Univ. 2019, 39, 572–578. [Google Scholar]
Yan, B.; Yu, Y.X.; Lin, M.Z.; Li, Z.B.; Wang, L.Y.; Huang, P.Y.; Song, H.Q.; Shi, X.L.; Yang, S.Y.; Li, X.Y.; et al. High, but stable, trend in the prevalence of gestational diabetes mellitus: A population-based study in Xiamen, China. J. Diabetes Investig. 2019, 10, 1358–1364. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Khalil, A.; Syngelaki, A.; Maiz, N.; Zinevich, Y.; Nicolaides, K.H. Maternal Age and Adverse Pregnancy Outcomes: A Cohort Study EDITORIAL COMMENT. Obstet. Gynecol. Surv. 2013, 68, 779–781. [Google Scholar] [CrossRef]
Londero, A.P.; Rossetti, E.; Pittini, C.; Cagnacci, A.; Driul, L. Maternal age and the risk of adverse pregnancy outcomes: A retrospective cohort study. BMC Pregnancy Childbirth 2019, 19, 1–10. [Google Scholar] [CrossRef] [PubMed]
Wang, C.; Wang, X.Y.; Yang, H.X. [Effect of maternal age on pregnancy outcomes in Beijing]. Zhonghua fu chan ke za zhi 2017, 52, 514–520. [Google Scholar] [PubMed]
Koo, Y.-J.; Ryu, H.-M.; Yang, J.-H.; Lim, J.-H.; Lee, J.-E.; Kim, M.-Y.; Chung, J.-H. Pregnancy outcomes according to increasing maternal age. Taiwan. J. Obstet. Gynecol. 2012, 51, 60–65. [Google Scholar] [CrossRef] [Green Version]
Sun, Y.; Shen, Z.; Zhan, Y.; Wang, Y.; Ma, S.; Zhang, S.; Liu, J.; Wu, S.; Feng, Y.; Chen, Y.; et al. Effects of pre-pregnancy body mass index and gestational weight gain on maternal and infant complications. BMC Pregnancy Childbirth 2020, 20, 1–13. [Google Scholar] [CrossRef]
Rodríguez-Mesa, N.; Robles-Benayas, P.; Rodríguez-López, Y.; Pérez-Fernández, E.M.; Cobo-Cuenca, A.I. Influence of Body Mass Index on Gestation and Delivery in Nulliparous Women: A Cohort Study. Int. J. Environ. Res. Public Health 2019, 16, 2015. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Shaukat, S.; Nur, U. Effect of prepregnancy maternal BMI on adverse pregnancy and neonatal outcomes: Results from a retrospective cohort study of a multiethnic population in Qatar. BMJ Open 2019, 9, e029757. [Google Scholar] [CrossRef] [Green Version]
Gao, X.; Yan, Y.; Xiang, S.; Zeng, G.; Liu, S.; Sha, T.; He, Q.; Li, H.; Tan, S.; Chen, C.; et al. The mutual effect of pre-pregnancy body mass index, waist circumference and gestational weight gain on obesity-related adverse pregnancy outcomes: A birth cohort study. PLoS ONE 2017, 12, e0177418. [Google Scholar] [CrossRef] [PubMed]
Laine, M.K.; Kautiainen, H.; Gissler, M.; Raina, M.; Aahos, I.; Järvinen, K.; Pennanen, P.; Eriksson, J.G. Gestational diabetes in primiparous women-impact of age and adiposity: A register-based cohort study. Acta Obstet. et Gynecol. Scand. 2017, 97, 187–194. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Yong, H.Y.; Shariff, Z.M.; Yusof, B.N.M.; Rejali, Z.; Tee, Y.Y.S.; Bindels, J.; Van Der Beek, E.M. Independent and combined effects of age, body mass index and gestational weight gain on the risk of gestational diabetes mellitus. Sci. Rep. 2020, 10, 1–8. [Google Scholar] [CrossRef]
Hashemi-Nazari, S.-S.; Najafi, F.; Rahimi, M.-A.; Izadi, N.; Heydarpour, F.; Forooghirad, H. Estimation of gestational diabetes mellitus and dose–response association of BMI with the occurrence of diabetes mellitus in pregnant women of the west of Iran. Health Care Women Int. 2018, 41, 121–130. [Google Scholar] [CrossRef] [PubMed]
Shao, B.; Mo, M.; Xin, X.; Jiang, W.; Wu, J.; Huang, M.; Wang, S.; Muyiduli, X.; Si, S.; Shen, Y.; et al. The interaction between prepregnancy BMI and gestational vitamin D deficiency on the risk of gestational diabetes mellitus subtypes with elevated fasting blood glucose. Clin. Nutr. 2020, 39, 2265–2273. [Google Scholar] [CrossRef]

Figure 1. Flow diagram outlining the literature search and selection based on risk factors of GDM development in pregnant women at early pregnancy.

Figure 2. The score-scaled GDM prediction model. (A) Pooled ORs and their corresponding 95% CIs for risk factors in the score-scaled GDM risk prediction model. Estimates were derived from the fully adjusted models in each included analysis. Red squares and horizontal bars represent the overall estimates and 95% CIs, respectively. (B) ROC curve for the validation cohort based on the score-scaled risk prediction model. (C) Prevalence of GDM in the four groups of the validation cohort. Pairwise comparisons were adjusted using the Bonferroni correction method. ^a different relative to the low-risk group by the Chi-squared test. ^b different as compared to the moderate risk group by the Chi-squared test.

Figure 3. The logistic regression analysis for the GDM risk prediction model. (A) Lasso-logistic graph. (B) Cross-validation of lasso-logistic regression results. (C) Nomogram to predict the probability of developing GDM in the first trimester among pregnant women. The total score is calculated by summing the scores for FBG stratification, GWG in the first trimester, positive thyroid antibody status, ART treatment, pre-BMI stratification, T2DM family history, and age stratification.

Figure 4. Discriminant results. ROC curve graphs for the training set (A) and the validation set (D). The calibration curve for the training set (B) and the validation set (E). The decision curve analysis for the training set (C) and the validation set (F).

Table 1. The score-scaled GDM risk prediction model.

Risk Factors for GDM	Category	Scores
Maternal age (years) *	<30	0
	30–34	5
	35–39	8
	≥40	9
T2DM family history	no	0
	yes	6
pre-BMI (kg/m²) **	<24	0
	24–27.9	4
	≥28	5
Age at menarche (years)	>11	0
	≤11	3
ART	no	0
	yes	2
Positive thyroid antibodies ***	no	0
	yes	5
Above IOM recommended GWG at the 1st trimester	no	0
	yes	2

GDM, gestational diabetes mellitus; T2DM, type 2 diabetes mellitus; pre-BMI, pre-pregnancy body mass index; ART, assisted reproductive technology; GWG, gestational weight gain. * Pregnant women in the derivation and validation cohorts were aged 18–45 years. ** pre-BMI was categorized as <25.0, 25.0–29.9, and ≥30.0 kg/m² in white patients and <24.0, 24.0–27.9, and ≥28.0 kg/m² in Asians. *** Thyroid antibodies refer to the thyroid peroxidase antibody (TPOAb) and the thyroglobulin antibody (TgAb). p value < 0.05 is considered statistically significant.
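The scoring in Table 1 can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: the function name `gdm_score` and its parameter names are hypothetical, the pre-BMI thresholds are the Asian cut-offs listed in the table, and no risk-group cut-off is applied because the cut-off values are reported in the Supplementary Materials rather than here.

```python
def gdm_score(age_years, t2dm_family_history, pre_bmi, age_at_menarche,
              art, thyroid_antibody_positive, excess_first_trimester_gwg):
    """Sum the Table 1 points for one woman (hypothetical helper).

    Boolean arguments are True when the risk factor is present.
    """
    score = 0

    # Maternal age (years): <30 -> 0, 30-34 -> 5, 35-39 -> 8, >=40 -> 9
    if age_years >= 40:
        score += 9
    elif age_years >= 35:
        score += 8
    elif age_years >= 30:
        score += 5

    # T2DM family history: yes -> 6
    if t2dm_family_history:
        score += 6

    # pre-BMI (kg/m^2), Asian cut-offs: <24 -> 0, 24-27.9 -> 4, >=28 -> 5
    if pre_bmi >= 28:
        score += 5
    elif pre_bmi >= 24:
        score += 4

    # Age at menarche (years): <=11 -> 3
    if age_at_menarche <= 11:
        score += 3

    # ART: yes -> 2
    if art:
        score += 2

    # Positive thyroid antibodies (TPOAb/TgAb): yes -> 5
    if thyroid_antibody_positive:
        score += 5

    # First-trimester GWG above the IOM recommendation: yes -> 2
    if excess_first_trimester_gwg:
        score += 2

    return score


# Example: 36 years old, T2DM family history, pre-BMI 25.0, menarche at 13,
# no ART, thyroid antibody positive, GWG within the recommendation.
example = gdm_score(36, True, 25.0, 13, False, True, False)
```

Under this reading of the table, the total ranges from 0 (no risk factor present) to 32 (highest category of every factor).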

Table 2. The clinical characteristics of training and validation cohorts.

Variables	Training Cohort (n = 765)	Validation Cohort 1 (n = 310)	Validation Cohort 2 (n = 210)
GDM	246 (32.2)	106 (34.2)	39 (18.5)
Maternal age	31.77 ± 4.14	31.5 ± 4.03	31.24 ± 4.17
T2DM family history	70 (9.2)	33 (10.6)	15 (7.1)
pre-BMI	21.97 ± 3.34	22.07 ± 2.97	21.4 ± 3.12
Age at menarche ≤ 11 yr	66 (8.6)	22 (7.1)	4 (1.9)
ART	53 (6.9)	31 (10)	4 (1.9)
Thyroid antibodies + (TPOAb/TgAb)	115 (15.0)	53 (17.1)	16 (7.6)
Above IOM recommended GWG at the 1st trimester	134 (17.5)	57 (18.4)	34 (16.1)
History of macrosomia	28 (3.7)	11 (3.5)	7 (3.3)
Parity	1.50 ± 0.60	1.49 ± 0.65	1.22 ± 0.71
Vitamin B12 (pg/mL)	64.33 ± 7.34	65.36 ± 7.72	61.35 ± 6.51
Ferritin (ng/mL)	46.80 ± 5.57	47.03 ± 5.79	42.24 ± 4.63
Total protein (g/L)	69.05 ± 4.13	70.18 ± 3.88	65.43 ± 3.47
Albumin (g/L)	40.15 ± 2.31	40.0 ± 2.56	44.34 ± 3.27
Globulin (g/L)	29.90 ± 3.31	30.18 ± 3.25	31.58 ± 2.64
ALT (U/L)	17.11 ± 10.78	18.81 ± 11.83	17.31 ± 10.81
AST (U/L)	19.40 ± 9.38	18.66 ± 6.62	19.72 ± 7.24
CHO (mmol/L)	4.13 ± 0.73	4.16 ± 0.89	4.61 ± 0.63
TG (mmol/L)	1.51 ± 0.66	1.50 ± 0.75	1.60 ± 0.69
HDL-C (mmol/L)	1.67 ± 0.29	1.62 ± 0.27	1.80 ± 0.31
LDL-C (mmol/L)	2.31 ± 0.60	2.35 ± 0.58	2.43 ± 0.52
FBG (mmol/L)	5.04 ± 0.44	5.01 ± 0.41	4.88 ± 0.49

ALT, glutamic-pyruvic transaminase; AST, glutamic oxalacetic transaminase; CHO, total cholesterol; TG, triglyceride; HDL-C, high-density lipoprotein cholesterol; LDL-C, low-density lipoprotein cholesterol; FBG, fasting blood glucose. p value < 0.05 is considered statistically significant. Validation cohort 1: a cohort comprising pregnant women at NWCH enrolled from Nov. 2019 to Mar. 2020. Validation cohort 2: a cohort of pregnant women at NWCH enrolled in May 2021.

Table 3. The multivariable logistic regression analysis in the training cohort.

	B	S.E.	Wald	P	OR (95%CI)
Age stratification	0.492	0.131	14.202	<0.001	1.636 (1.266–2.113)
T2DM family history	0.976	0.307	10.120	0.001	2.653 (1.454–4.838)
pre-BMI stratification	0.691	0.167	17.153	<0.001	1.996 (1.439–2.769)
ART	0.776	0.381	4.154	0.042	2.173 (1.030–4.585)
Thyroid antibodies + (TPOAb/TgAb)	1.381	0.269	26.423	<0.001	3.979 (2.350–6.737)
Above IOM recommended GWG at the 1st trimester	1.273	0.239	28.470	<0.001	3.573 (2.238–5.703)
FBG stratification	0.753	0.204	13.625	<0.001	2.124 (1.424–3.169)
Constant	−3.417	0.356	92.142	<0.001	0.033

p value < 0.05 is considered statistically significant.
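The coefficients in Table 3 can be turned into a predicted probability with the standard logistic function, p = 1 / (1 + exp(−(B₀ + Σ Bᵢxᵢ))). The sketch below is illustrative only. The dictionary keys and function names are hypothetical, and the numeric coding of the stratified predictors (age, pre-BMI, FBG) is not given in this section, so it is assumed here that each stratum is passed as the integer level used when the model was fitted and that binary predictors are coded 0/1. The second helper shows how the OR and 95% CI columns follow from B and S.E. via exp(B ± 1.96 × S.E.).

```python
import math

# Regression coefficients (column B of Table 3). Key names are hypothetical.
COEFFICIENTS = {
    "age_stratum": 0.492,
    "t2dm_family_history": 0.976,
    "pre_bmi_stratum": 0.691,
    "art": 0.776,
    "thyroid_antibody_positive": 1.381,
    "excess_first_trimester_gwg": 1.273,
    "fbg_stratum": 0.753,
}
INTERCEPT = -3.417  # "Constant" row of Table 3


def gdm_probability(predictors):
    """Predicted GDM probability from a dict of predictor values.

    Predictors that are omitted are treated as 0 (reference level).
    The coding of the *_stratum predictors is an assumption (see lead-in).
    """
    logit = INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in predictors.items()
    )
    return 1.0 / (1.0 + math.exp(-logit))


def odds_ratio_ci(b, se, z=1.96):
    """Odds ratio and Wald 95% CI from a coefficient and its standard error."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


# All predictors at their reference level.
baseline = gdm_probability({})
# Only thyroid antibodies positive.
thyroid_only = gdm_probability({"thyroid_antibody_positive": 1})
# Reproduce the "Age stratification" row: OR 1.636 (1.266-2.113).
age_or, age_lo, age_hi = odds_ratio_ci(0.492, 0.131)
```

Recomputing the age row this way agrees with the tabulated OR and confidence limits to within rounding, which is a quick consistency check on the table.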

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, N.; Guo, H.; Jing, Y.; Song, L.; Chen, H.; Wang, M.; Gao, L.; Huang, L.; Song, Y.; Sun, B.; et al. Development and Validation of Risk Prediction Models for Gestational Diabetes Mellitus Using Four Different Methods. Metabolites 2022, 12, 1040. https://doi.org/10.3390/metabo12111040
