Next Article in Journal
PanelDesign: Integrating Epidemiological Information into the Design of Diagnostic NGS Gene Panels
Next Article in Special Issue
Identification of Cdk8 and Cdkn2d as New Prame-Target Genes in 2C-like Embryonic Stem Cells
Previous Article in Journal
Evaluation of a Four-Gene Panel for Hereditary Cancer Risk Assessment
Previous Article in Special Issue
MicroRNA-17-92a-1 Host Gene (MIR17HG) Expression Signature and rs4284505 Variant Association with Alopecia Areata: A Case–Control Study
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Additive and Interactive Genetically Contextual Effects of HbA1c on cg19693031 Methylation in Type 2 Diabetes

1
Department of Psychiatry, University of Iowa, Iowa City, IA 52242, USA
2
Department of Pediatrics, University of Iowa, Iowa City, IA 52242, USA
3
Department of Sociology, University of Georgia, Athens, GA 30602, USA
4
Behavioral Diagnostics LLC, Coralville, IA 52246, USA
5
Cardio Diagnostics Inc., Coralville, IA 52246, USA
*
Author to whom correspondence should be addressed.
Genes 2022, 13(4), 683; https://doi.org/10.3390/genes13040683
Submission received: 19 March 2022 / Revised: 5 April 2022 / Accepted: 11 April 2022 / Published: 13 April 2022
(This article belongs to the Special Issue Deciphering Epigenetic Signature in Human Health and Disease)

Abstract

:
Type 2 diabetes mellitus (T2D) has a complex genetic and environmental architecture that underlies its development and clinical presentation. Despite the identification of well over a hundred genetic variants and CpG sites that associate with T2D, a robust biosignature that could be used to prevent or forestall clinical disease has not been developed. Based on the premise that underlying genetic variation influences DNA methylation (DNAm) independently of or in combination with environmental exposures, we assessed the ability of local and distal gene x methylation (GxMeth) interactive effects to improve cg19693031 models for predicting T2D status in an African American cohort. Using genome-wide genetic data from 506 subjects, we identified a total of 1476 GxMeth terms associated with HbA1c values. The GxMeth SNPs map to biological pathways associated with the development and complications of T2D, with genetically contextual differences in methylation observed only in diabetic subjects for two GxMeth SNPs (rs2390998 AG vs. GG, p = 4.63 × 10−11, Δβ = 13%, effect size = 0.16 [95% CI = 0.05, 0.32]; rs1074390 AA vs. GG, p = 3.93 × 10−4, Δβ = 9%, effect size = 0.38 [95% CI = 0.12, 0.56]. Using a repeated stratified k-fold cross-validation approach, a series of balanced random forest classifiers with random under-sampling were built to evaluate the addition of GxMeth terms to cg19693031 models to discriminate between normoglycemic controls versus T2D subjects. The results were compared to those obtained from models incorporating only the covariates (age, sex and BMI) and the addition of cg19693031. We found a post-pruned classifier incorporating 10 GxMeth SNPs and cg19693031 adjusted for covariates predicted the T2D status, with the AUC, sensitivity, specificity and precision of the positive target class being 0.76, 0.81, 0.70 and 0.63, respectively. Comparatively, the AUC, sensitivity, specificity and precision using the covariates and cg19693031 were only 0.71, 0.74, 0.67 and 0.59, respectively. Collectively, we demonstrate correcting for genetic confounding of cg19693031 improves its ability to detect type 2 diabetes. We conclude that an integrated genetic–epigenetic approach could inform personalized medicine programming for more effective prevention and treatment of T2D.

1. Introduction

Type 2 diabetes mellitus (T2D) is the seventh leading cause of morbidity and mortality in the US and is responsible for $327 billion annually in economic damage [1]. T2D is diagnosed in the presence of sustained hyperglycemia that is above the threshold that predisposes to microvascular complications [2]. However, hyperglycemia is the end product of several pathophysiological processes that eventually converge on the inability of the pancreatic β cells to produce enough insulin to meet the demands of target tissues. Our inability to identify and adequately understand this etiological heterogeneity underpins the diagnostic and prognostic challenges faced in clinical practice. Thus, novel precision diagnostic methods that refine the characterization of diabetes to optimize prognostication and therapies are highly desired to improve clinical decisions and patient outcomes.
T2D has a complex genetic and environmental architecture that underlies its development and clinical presentation [3]. Genome-wide association studies (GWAS) have identified more than 500 common genetic variants associated with the disease. However, in aggregate, these explain less than 20% of the heritability [4]. Other attempts to elucidate the disease etiology include identifying epigenetic alterations that mediate environmental influences on aberrant gene expression. DNA methylation (DNAm) is the most widely studied and characterized epigenetic modification due to its scalability and its potential for clinical translation. Several investigators have demonstrated that the DNAm status at cg19693031 is highly correlated with both HbA1c values and T2D [5,6,7]. However, these prior studies failed to account for potential confounding genetic influences. Given the dynamic cross-talk between genetic variation, DNA methylation and environmental exposures, these factors should not be considered as self-standing layers of pathophysiological regulation but should only be interpreted in light of each other [8]. For example, several genetic variants for common complex diseases are not detected in GWAS analyses when the environmental influences are not included. Thus, this partially informed approach of traditional GWAS/EWAS analyses fails to adequately explain the interconnected network that underlies the development and progression of diabetes, which limits our ability to prevent and treat the disease.
In this communication, we use an omni-genic approach in an attempt to explain how the interactions between cg19693031 methylation and common variants in the cellular context of hyperglycemia can increase disease susceptibility. We then implement functional analyses to investigate possible mechanisms explaining the association between the methylation status of TXNIP’s 3′UTR and prevalent T2D. Finally, we examine if correcting for genetic confounding improves the ability of cg19693031 to detect T2D.

2. Materials and Methods

2.1. Subjects and Recruitment

This study uses genomic and biological data from Wave 5 of primary caregivers (PC) of the Family and Community Health Study (FACHS). The overall design and methods used in the FACHS study have been described previously [9]. Each of these procedures was approved by the University of Iowa Institutional Review Board (IRB 200802719). The data from a total of 506 subjects were retained for further analyses.
DNA was prepared using standard cold protein precipitation [10]. Glycosylated hemoglobin A1c (HbA1c) levels were determined by the University of Iowa Diagnostic Laboratories (https://medicine.uiowa.edu/uidl/, accessed on 4 February 2018) using turbidimetric immune inhibition [11] in compliance with standard Clinical Laboratory Improvement Amendments (CLIA) procedures [12].

2.2. Genome-Wide DNA Methylation

Genome-wide methylation was determined by the University of Minnesota Genome Center (http://genomics.umn.edu/, accessed on 10 March 2018) using the Infinium MethylationEpic Beadchip (Illumina, San Diego, CA, USA) according to the manufacturer’s protocol. Standard sample and probe level quality control of the resulting data were conducted as previously described [13].

2.3. Genome-Wide Genotypes

Genome-wide genotype information also was determined by the University of Minnesota Genome Center using the Multi-Ethnic Global-8 Beadchip (Illumina, San Diego, CA, USA) according to the manufacturer’s suggested protocol. Quality control was performed at both the sample and single nucleotide polymorphism (SNP) probe levels in PLINK [14]. SNP probes with a minor allele frequency >5% were retained. Linkage disequilibrium-based SNP pruning was performed with a window size of 50 SNPs, window shift of 5 SNPs and a pairwise SNP-SNP LD threshold of 0.5. A total of 516908 variants for 506 subjects passed filters and QC. For the purposes of these analyses, genotypes were recoded as 0, 1 or 2 in PLINK.
Genotypes at rs7211, rs7212 and rs9245 were determined by our standard primer probe genotyping procedures using fluorescent hydrolysable probe assays and reagents from ThermoFisher (Waltham, MA, USA) according to manufacturer’s instructions [15].

2.4. Phenotypes

Phenotypes that were considered in the analyses include age, gender, body mass index (BMI) and type 2 diabetes mellitus (T2D) status. The age used was the age of the subject at the time of the Wave 5 interview. Gender was coded as a binary variable representing self-reported biological sex. BMI was calculated using the measured height and weight of the subject (kg/m2) and used as a continuous variable.
T2D status was determined for each subject by binning the HbA1c value at the time of the W5 interview. The binning criteria were adapted from the clinical diagnostic guidelines set by the American Diabetes Association; i.e., those with an HbA1c ≤ 5.7 were binned as normoglycemic controls, HbA1c between 5.7–6.5 were binned as pre-diabetic and HbA1c ≥ 6.5 as diabetic [2]. Normoglycemic control subjects self-reporting a T2D diagnosis and any non-diabetic subject self-reporting the use of anti-diabetic pharmacotherapy were removed from the final dataset.

2.5. Identification of HbA1c-Associated DNAm Loci

The identification of DNAm loci associated with HbA1c values was determined by fitting a linear regression model in R v.3.1.2 using the following equation:
H b A 1 c   ~   M e t h i + A g e + S e x + B M I
A total of 861,916 independent analyses were conducted. The DNAm loci with a genome-wide significance after FDR correction (α = 0.05) were retained for further analyses.

2.6. Identification of Stochastic Epimutations

The distribution and variability of methylation values were evaluated using boxplots for each of the identified HbA1c associated DNAm loci. Stochastic epimutations (SEMs) were defined as methylation outliers exceeding three times the interquartile range (IQR), consistent with the definition of Gentilini and colleagues [16]. SEMs were further classified based on the methylation value compared to the average of the population, as previously described by Wang and associates [17]. High methylation outliers (HMO) were defined using the following equation:
HMO = Q 3 + ( 3 × IQR )
Similarly, low methylation outliers (LMO) were defined using the following equation:
LMO = Q 1 ( 3 × IQR )
HbA1c-associated DNAm probes containing SEMs were removed from further analyses.

2.7. Identification of HbA1c-Associated Polymorphic CpGs

DNAm probes potentially affected by polymorphisms at the target site were identified using methods previously described. In brief, the Infinium MethylationEPIC BeadChip manifest file was used to identify SNPs found at the target site for both Infinium Type I and Type II probes. To identify those associated with HbA1c values, a linear regression model was fit in R using the following equation:
H b A 1 c   ~   M e t h i + A g e + S e x + B M I
A total of 302,943 independent analyses were conducted. The DNAm loci with a genome-wide significance after FDR correction (α = 0.05) were retained for further analyses.

2.8. Identification of meQTL Effects of cg19693031 Methylation

To understand the genetically contextual effects on cg19693031 methylation, meQTL analyses in both cis and trans were conducted. Linear regression models were fit in R using the following equation:
M e t h   ~   S N P i + A g e
A total of 117 independent cis analyses of potential interaction effects with cg19693031 were conducted. Further, meQTLs acting in cis were defined as SNPs located on chromosome 1 and within 1Mb of cg19693031. A total of 38750 independent long-range cis analyses were conducted, defined as SNPs located on chromosome 1 and located > 1 Mb of cg19693031. A total of 459464 independent trans analyses were conducted, defined as SNPs not residing on chromosome 1. SNPs with nominally significant effects were retained for further analyses (α = 0.05).

2.9. Genetic + Environmental (G + E) Analyses

The additive genetic and environmental effects on cg19693031 methylation were assessed using the following equation in R:
M e t h   ~   S N P i + S N P j + H b A 1 c + A g e
SNPs with nominally significant effects were retained for further analyses (α = 0.05).

2.10. Genetic x Environment (GxE) Analyses

The genetically contextual effects of HbA1c on cg19693031 methylation were assessed using the following equation in R:
M e t h   ~   S N P i × H b A 1 c + S N P i + H b A 1 c + A g e
In particular, the interaction term p-value was used for the selection and ranking of SNPs with nominally significant GxE effects. A total of 117, 38,750 and 459,464 independent cis, long-range cis and trans analyses were conducted, respectively.

2.11. Local TXNIP GxMeth Logistic Regression Classifiers for T2D Status

Training and test datasets were prepared to develop and evaluate each classifier for T2D status in Python v.3.8.10 [18]. A total of 137, 260 and 90 normoglycemic controls, prediabetic and diabetic subjects were retained for logistic modeling. These subjects were used to generate the training (70%) and testing (30%) datasets. T2D status was coded as binary variable, with the more severe dysglycemic status reflecting the positive target class. The target class was stratified between the training and testing datasets to reflect the original class ratio. The performance of each model was evaluated using the receiver operating characteristic (ROC) area under the curve (AUC), sensitivity, specificity and precision of the more severe dysglycemic target class.

2.12. Identification of Distal GxMeth Interactions

The identification of HbA1c-associated distal significant Genetic x Methylation (GxMeth) interactions was determined by fitting a linear regression using the following equation in R:
H b A 1 C   ~   c g 19693031 × S N P i + S N P i + c g 19693031 + A g e + S e x + B M I
Additive and interactive GxMeth terms were evaluated for each SNP that survived data cleaning and filtering, with a total of 614,729 independent analyses conducted. The SNPs of each GxMeth interactive term with genome-wide significance after Bonferroni correction (α = 0.05) were retained for further analyses. The R package CMplot was used to construct the Manhattan plot [19].

2.13. Distal GxMeth Functional SNP Mapping

Each SNP with a significant GxMeth interaction term was annotated using the Multi-Ethnic Global-8 Annotation File. The 560 mapped genes were used to generate a protein–protein interaction (PPI) network using the STRING database [20]. To investigate the functions of the GxMeth genes, functional enrichment analysis was performed using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) [21].

2.14. Integrated Genetic–Epigenetic Balanced Random Forest Classifier for T2D Status

A series of balanced random forest classifiers (BRF) were built using scikit-learn in Python v.3.8.10 to evaluate the predictive ability of integrated GxMeth models for discriminating normoglycemic controls versus diabetic subjects. Each BRF classifier was built using a repeated stratified cross-validation (10 splits and 3 repeats) approach. The dimensionality of the dataset was reduced by taking advantage of the implicit feature selection of the random forest using the SelectFromModel function in sklearn. A BRF classifier was built as previously described to evaluate the selected features for predicting T2D status (normoglycemic controls versus diabetic subjects). The selected features were then ranked according to their Gini importance, with the top 13 features retained.

3. Results

3.1. Clinical and Demographic Characteristics

The key demographic and clinical characteristics of the 506 subjects are given in Table 1. The subjects in each group were almost exclusively African American, and predominantly female. Congruent with the clinical observations, the diabetic subjects were significantly older and more obese than the normoglycemic and prediabetic subjects (p = 1.29 × 10−9 and p = 1.12 × 10−6, respectively). While the systolic and diastolic blood pressures were not significantly different between the groups, there was a higher prevalence of self-reported hypertension and anti-hypertensive therapy in the diabetic subjects. A total of 73 subjects self-reported a history of diabetes, with 70% of these subjects having an HbA1c ≥ 6.5% despite only 4% reporting anti-diabetic pharmacological treatment.

3.2. Identification of HbA1c-Associated DNA Methylation Loci

A total of 514,292 methylation probes survived the data cleaning and quality control measures. To identify HbA1c-associated DNAm loci, each probe was regressed against HbA1c values, controlling for age, sex and BMI. A total of 23 DNAm probes were associated with HbA1c at a false-discovery rate (FDR) of <5% (Table 2). Consistent with prior studies, cg19693031 was the most highly associated probe (R2 = 0.1976, p = 4.43 × 10−12). As illustrated in Figure 1, cg19693031 was significantly demethylated in diabetic subjects compared to both normoglycemic controls (62% vs. 66%; t = 5.089, p = 1.06 × 10−6) and prediabetic subjects (62% vs. 66%; t = −4.4165, p = 2.12 × 10−5).
The distribution and variability of the methylation values were evaluated for each of the identified HbA1c-associated DNAm loci. With the exception of cg19693031 and cg26823705 (Figure 2), the associations to HbA1c were driven by outliers. Consistent with the definition by Gentilini and colleagues, a total of 49 stochastic epimutations (SEMs) were identified in 27 subjects across the 23 DNAm loci. With the exception of cg1963031, SEMs were observed in each probe. The SEMs were significantly enriched in diabetic subjects (OR = 0.135, 95% CI = 0.072–0.248, p-value = 1.40 × 10−11, two-tailed Fischer’s exact test), with 30 of the 49 total number of SEMs found in diabetic subjects. Further, six subjects had more than one SEM across the twenty-three probes, four of whom were diabetic.
The SEMs were further classified into two groups based on the methylation value compared to the population. A total of 26 SEMs were hypermethylated and classified as high methylation outliers (HMO), and a total of 23 SEMs were hypomethylated and classified as low methylation outliers (LMO). Each of the 22 probes contained either HMOs (n = 11) or LMOs (n = 11); however, none contained both. Notably, 14 (54%) and 23 (70%) of the identified HMOs and LMOs were observed in diabetic subjects.
To further characterize the HMOs and LMOs, we examined their locations in relation to CpG island regions. The majority of DNAm loci with HMOs resided in CpG islands (n = 7), with the remainder in open seas (n = 3). Notably, the Hidden Markov Model predicts an additional three of these loci are located within an island. Comparatively, the majority of DNAm loci with LMOs resided in the open sea (n = 8), with the remainder in subsequent shores (n = 1), shelves (n = 1) and islands (n = 1). These findings are consistent with those previously reported by Wang et al. [17]. All the DNAm loci with SEMs were excluded from further analyses.
A total of 37009 SEMs were detected in 64% (n = 13,528) of the 21269 HbA1c-associated DNAm probes with nominal significance. The total number of SEMs per probe ranged from 0 (n = 7740) to 102 across the 506 subjects. Of the 13,528 probes with SEMs detected, a majority (84%) contained less than five SEMs across the cohort. Additionally, a majority of the SEMs detected were HMOs (n = 25,620). The total number of SEMs per subject ranged from 3 to 7271 across the 21,269 probes, with a median of 20 ± 410 SEMs per subject. No relationship was observed in the number of SEMs with T2D status, age, gender or BMI. With the exception of cg19693031 and cg26823705, the association of each probe to HbA1c diminished after the removal of SEM observations.

3.3. Identification of HbA1c-Associated Polymorphic DNAm Loci

A total of 302,943 polymorphic probes were identified using the EPIC manifest file. Each probe was regressed against HbA1c values adjusted for age, sex and BMI. A total of 54 polymorphic probes were associated with HbA1c at an FDR < 5%. The top 30 ranked HbA1c-associated polymorphic probes are described in Table 3.
Of the 54 polymorphic DNAm probes, the island status of 11 were identified. The majority resided in shores (n = 7), with the remainder in subsequent islands (n = 3) and shelves (n = 1). The 54 probes mapped to 37 unique genes, with functional enrichment using UniProt for alternative splicing (FDR p-value = 0.0016). These genes also mapped to several relevant KEGG pathways, including type 1 diabetes mellitus, insulin secretion, PI3K-Akt signaling and diabetic cardiomyopathy.

3.4. Genome-Wide cg19693031 meQTL Analyses

Methylation quantitative trait loci (meQTL) analyses were conducted in both cis and trans to identify genetic factors that confound the methylation status of cg19693031. As underlying genomic mechanisms may differ according to proximity, our analyses were separated into: (1) cis meQTL (SNP-CpG distance < 1 Mb, n = 117), (2) long-range cis meQTL (>1 Mb intra-chromosomal, n = 38,750) and (3) trans meQTL (inter-chromosomal, n = 459,464). A total of 7 and 2052 meQTLs acting in cis and long-range cis were detected with nominal significance, with the degree of explained methylation variance ranging from 0.45% to 1.35% and 0.41% to 4.16%, respectively. While the cis long-range meQTLs explained more variance than those acting in cis, the magnitude of explained variance did not correlate with the SNP’s proximity to cg19693031. The meQTLs acting in cis are described in Table 4, while the top 30 long-range cis-meQTLs are described in Table 5. A total of 23,734 meQTLs acting in trans were also detected with nominal significance, with the degree of explained methylation variance ranging from 0.39% to 4.35%. Similar to those acting in cis, there were no observable differences between the chromosomes. The top 30 trans meQTLs are described in Table 6.
To determine the magnitude of meQTL effects on cg19693031 methylation, a multivariate linear regression model was built using meQTLs that demonstrated an R2 ≥ 0.03 (long-range cis (n = 5) and trans (n = 18)). Using an additive step-wise approach, genetic factors were ranked and fed into the model based on their adj R2. The associations between two meQTLs and cg19693031 methylation were no longer observed when incorporated into the model and were dropped from further analyses. The final meQTL model captured 43.23% of the variance of cg19693031 methylation (p < 2.2 × 10−16).
It has been previously suggested that some of the meQTL effects may in fact be occurrences of genetic and environmental interactions (GxE). To better understand the genetic and environmental confounding architecture influencing the methylation of cg19693031, we assessed whether additive or interactive effects explain the variability of cg19693031 methylation better than genetic factors alone. As a means of comparison, HbA1c was first regressed against cg19693031 methylation adjusted for age (R2 = 0.1314, p < 2.2 × 10−16). With the incorporation of HbA1c values to the final meQTL model (G + E), the amount of variance captured increased by 6% (R2 = 0.4968, p < 2.2 × 10−16). Using ANOVA, the additive G + E model significantly outperformed the genetic factors alone (p = 2.62 × 10−14).
Next, GxE analyses were conducted and compared to the G + E effects for each of the 21 retained meQTLs. Only one meQTL, rs74439, demonstrated a significant GxE term (p = 0.0146). The GxE effect explained ~ 1% more variance than the G + E model (R2 = 0.1712 vs. R2 = 0.1630) and demonstrated a significantly better fit (ANOVA: p = 0.0156). Incorporating the rs74439 GxE term to the final G + E meQTL model captured an additional 1% of the variance (Table 7: R2 = 0.5077, p < 2.2 × 10−16), and explained the variability of cg19693031 methylation better than genetic or additive G + E factors alone (ANOVA: p = 8.896 × 10−16 and p = 7.49 × 10−4, respectively).

3.5. Functional Mapping of meQTL SNPs

Next, we performed secondary analyses to better characterize the functional effects of genetic variation on cg19693031 methylation. The leading hypothesis to explain cis-meQTL effects is that, if a meQTL is located in a TFBS and hinders the TF from binding, the methylation of surrounding CpGs can be altered directly (recruitment of DNMT or TET enzymes) or indirectly.
Of the 25,792 meQTLs, a total of 21 had confirmed associations with T2D via genome-wide association studies. Additionally, a total of 986 are located within exons and 335 are missense mutations. The meQTL SNPs mapped to 4687 genes, with enrichment of two or more SNPs found within 2075 genes. These genes functionally map to several processes, including: alternative splicing (p = 2.09 × 10−25), ABC transporters (p = 1.7 × 10−4), insulin secretion (p = 8.1 × 10−4), lymphocyte mediated immunity (p = 2.61 × 10−4), immunoglobulin production (p = 1.19 × 10−4), complement activation (p = 2.50 × 10−6), immune system process (p = 2.32 × 10−7) and calcium ion transmembrane transporter activity (p = 6.23 × 10−4).
To gain insights into the molecular networks involved in nuclear regulation underlying the relationship between genetic variation and cg19693031 methylation, we assessed whether the proximal candidate gene at a trans-acting locus showed covariation with cg19693031 methylation. Using data from the eQTLGen Consortium (n = 31,684 samples), a total of 14,826 cis-eQTL associations with 5955 unique trans-meQTLs were identified with Bonferroni-corrected significance, with 3249 meQTL SNPs influencing the expression of 2+ genes. The identified cis-eQTLs mapped to 6861 unique genes, with enrichment of two or more cis-eQTLs influencing 3542 genes.
While the biological mechanisms underlying cis-meQTLs are easily understood and experimentally demonstrated, few clear examples for trans have been uncovered. The simplest hypothesis is that SNPs that act as eQTLs of global methylation regulators have downstream effects as trans-acting meQTLs. A similar potential mechanism is that an SNP residing in the coding region of methyl-specific-binding-proteins could alter their specificity and function and, therefore, modify the DNAm of their binding sites. To that end, several of the identified eQTL genes are known methylomic regulators, including: DNAm writers (DNMT1 and DNMT3a), readers (MBD2, MBD5, UHRF1, UHRF2 and BAZ2B) and erasers (TDF and SMUG1). Further, several of the eQTL genes are known regulators of TXNIP, including: transcription factors (FOXO1, HSF2, RXRA and KLF6), chromatin modifications (HDACs and NF-YA) and factors regulating mRNA turnover (mir-17, ZFP36, NEDD1, NEDD4, NEDD9, WWP1, SMURF2 and HECW2). Provided that distal residues can be brought into physical proximity by 3D structures, an SNP could affect DNAm levels at CpG sites in trans by acting either through cis-meQTL mechanisms or by disrupting the formation of structural loops. To that end, two of the three cohesion subunits were identified as an eQTL gene (SMC3 and RAD21).
To gain insights into the potential role of trans meQTL effects on cg19693031 in the pleiotropic functions of TXNIP, functional gene mapping was conducted utilizing KEGG and GO. The eQTL genes mapped to key pathways that are associated with TXNIP’s pathophysiological role in diabetes and coronary heart disease, including: metabolic pathways (n = 504), PI3K-Akt signaling (n = 95), MAPK signaling (n = 81), cytokine-cytokine receptor interaction (n = 72), Rap1 signaling (n = 66), lipid and atherosclerosis (n = 65), ras signaling (n = 59), calcium signaling (n = 56), mTOR (n = 55), cAMP signaling (n = 52), diabetic cardiomyopathy (n = 49), insulin signaling pathway (n = 48), apoptosis (n = 46), Wnt signaling (n = 45), insulin resistance (n = 45), AMPK signaling (n = 45), FoxO signaling (n = 44), natural killer cell mediated cytotoxicity (n = 44), non-alcoholic fatty liver disease (n = 43), TNF signaling (n = 39), leukocyte transendothelial migration (n = 37), NK-kappa B signaling (n = 35), AGE-RAGE signaling pathway in diabetic complications (n = 34), TGF-β signaling (n = 33), b cell receptor signaling (n = 33), glucagon signaling (n = 27), type 1 diabetes (n = 23), PPAR signaling (n = 21), hypertrophic cardiomyopathy (n = 21), regulation of lipolysis in adipocytes (n = 18), glycolysis and gluconeogenesis (n = 18), insulin secretion (n = 18), ABC transporters (n = 17) and type 2 diabetes (n = 12). The eQTLs identified in type I diabetes and AGE-RAGE signaling in diabetic complications are illustrated in Figure 3 and Figure 4, respectively. TXNIP exerts its effects, in part, through the negative regulation of thioredoxin, which both TXN and TXN2 were identified as eQTL genes. Further, the eQTL genes also mapped to pathways associated with TXNIP’s pathophysiological role in cancer and neurodegenerative diseases, including: pathways in cancer (n = 159), pathways in neurodegeneration (n = 135), Alzheimer’s disease (n = 105), Huntington disease (n = 81), Parkinson’s disease (n = 68), microRNAs in cancer (n = 66), proteoglycans in cancer (n = 65), chemical carcinogenesis—ROS (n = 60), transcriptional regulation in cancer (n = 59), hepatocellular carcinoma (n = 52), Hippo signaling pathway (n = 50), cellular senescence (n = 49), breast cancer (n = 34) and p53 pathway (n = 29).

3.6. Genome-Wide cg19693031 GxE Analysis

To gain a better understanding of the genetically contextual effect of HbA1c levels on cg19693031 methylation, genome-wide GxE analyses were conducted. A total of seven and 1355 GxE terms acting in cis and long-range cis were detected with nominal significance, with the degree of explained methylation variance ranging from 13.31% to 14.05% and 12.32% to 16.67%, respectively. A total of 16,760 GxE terms acting in trans were detected with nominal significance, with the degree of explained variance ranging from 10.98% to 17.80%. Of the 18,797 GxE terms, a total of 964 were identified in our meQTL analyses (long-range cis: n = 81; trans: n = 883).
To determine the magnitude of GxE effects on cg19693031 methylation, an additive step-wise linear regression was fit using GxE terms that demonstrated an adj R2 ≥ 0.1550 and did not contain missing values (n = 44). Notably, a majority of the retained GxE terms were previously identified as meQTLs (n = 32). The features were ranked and fed into the model based on their adj R2, with retainment dependent on each feature remaining significant and a significant increase in the captured variance. A total of six GxE terms were retained for the final GxE model. The final model explained 27.78% of the variance of cg19693031 methylation (Table 8: p < 2.2 × 10−16), and significantly improved the fit compared to HbA1c alone (ANOVA: p < 2.2 × 10−16).
Next, we performed secondary analyses to better characterize the functional effects of the genetic variation of the GxE loci on cg19693031 methylation. Of the 18,797 SNPs, a total of nine had confirmed associations with T2D via genome-wide association studies. Additionally, a total of 681 are located within exons and 261 are missense mutations. The 18,797 GxE SNPs mapped to 3808 genes, with enrichment of two or more SNPs found within 1533 genes. Interestingly, these genes functionally mapped to alternative splicing (p = 1.183 × 10−36, the same as the HbA1c-associated polymorphic DNAm probes and meQTLs. Similar to the meQTLs, the GxE network mapped to several key TXNIP mechanistic pathways, including the cell adhesion (p = 1.74 × 10−11), glutamatergic synapse (p = 1.89 × 10−5), calcium signaling pathway (p = 1.7 × 10−4), insulin secretion (p = 1.6 × 10−3) and PI3K-Akt signaling pathway (p = 1.6 × 10−3).

3.7. Integrated Model of Genetic and Environmental Effects for cg19693031

Next, we assessed whether an integrated genetic and environmental model accounting for additive and interactive effects explains cg19693031 methylation variability better than either alone. Using a similar step-wise approach, a total of 20 G + E and 6 GxE terms were used to build the integrated model. The final integrated model captured 55.15% of the methylation variance (Table 9: p < 2.2 × 10−16) and performed significantly better than the genetic factors alone (ANOVA: 3.79 × 10−8).

3.8. cg19693031 Methylation Classifier for Predicting T2D Status

To determine the ability of cg19693031 to predict T2D status, a series of logistic regression classifiers were built, adjusting for age, sex and BMI. The covariate and cg19693031 models performed equally well in distinguishing diabetic from normoglycemic subjects (Table 10A: AUC = 0.72 and AUC = 0.73, respectively), and prediabetic from normoglycemic subjects (Table 10B: AUC = 0.52 and AUC = 0.53, respectively). However, the model incorporating cg1963031 performed better than the covariate model alone in distinguishing diabetic from prediabetic subjects (Table 10C: AUC = 0.68 and AUC = 0.72, respectively).
To determine if correcting for confounding local TXNIP genetic variation improves the predictive ability of the cg19693031 models for T2D status, a series of logistic regression classifiers were built, adjusting for age, sex and BMI. Three SNPS located within TXNIP that were previously identified as being associated with diabetes and cardiovascular disease were assessed. Of these, two were located near cg19693031 in the 3′UTR (rs7211 and rs7212), and one SNP was located in the 5′UTR. The incorporation of any local SNP or the addition of all three did not significantly improve the cg19693031 models for T2D status (Table 10).

3.9. Genome-Wide cg19693031 GxMeth Analyses for HbA1c

Based on the understanding that trans genetic variation can confound DNA methylation status, we conducted a genome-wide analysis of SNP interactions with cg19693031 using the Multi-Ethnic Beadchip. The HbA1c values were regressed against additive and interactive GxMeth terms for each SNP, controlling for age, sex and BMI, with a total of 614,729 independent analyses conducted. After correcting for multiple comparisons, a total of 1476 and 1987 SNPs had significant GxMeth and additive effects, respectively (Figure 5). The top-25-ranked GxMeth SNPs are provided in Table 11.
Next, we performed secondary analyses to better characterize the functional effects of genetic variation in the GxMeth loci on cg19693031 methylation. First, we conducted methylation quantitative trait loci (meQTL) analysis to determine the extent to which genetic variation in the GxMeth loci influences cg19693031 methylation. After Bonferroni correction, rs2390998 was identified as a meQTL (p = 0.045) with a small effect size (0.04, 95% CI = [0.01, 0.08]). As illustrated in Figure 6A, the average methylation across the 506 subjects was significantly lower in rs2390998 heterozygotes (AG) than major allele homozygotes (GG) (pHolm-corrected = 0.005, Δβ = 2%). Notably, the genetic influence of rs2390998 on methylation was only prominent in diabetic subjects, with the most significant difference in methylation seen between the homozygote genotypes (Figure 6B; pHolm-corrected = 4.63 × 10−11, Δβ = 13%; effect size = 0.16, 95% CI = [0.05, 0.32]). Next, we conducted a genetic x environment analysis (GxE) to determine the genetically contextual effect of HbA1c levels on cg19693031 methylation. After Bonferroni correction, rs1074390 had a significant GxE effect (p = 0.048; effect size = 0.165, 95% CI [0.11, 0.22]. While the average cg19693031 methylation did not significantly differ as a function of rs1074390 genotype across the cohort (Δβ = 1%), a significant difference was seen when stratified by T2D status (Figure 7). Similar to rs2390998, differences in methylation across the rs1074390 genotype were observed only in diabetic subjects, with the most significant difference in methylation observed between the homozygote genotypes (pHolm-corrected = 3.92 × 10−4; Δβ = 9%; effect size = 0.38, 95% CI [0.12, 0.56].
The significant GxMeth SNPs mapped to 560 unique genes, with enrichment of two or more SNPs found within 103 genes. Of those 560 genes, 167, 144 and 22 genes had confirmed associations to metabolic and cardiovascular disorders and type 2 diabetes, respectively [4,22]. To gain functional insights into these significant GxMeth interactions, a protein–protein interaction network (PPI) was generated using data from the STRING database. A total of 487 genes were correctly matched to the database. Figure 8 illustrates this network, with nodes only shown if they have an edge with a minimum interaction score of 0.9. This network consisted of 477 nodes with 158 edges (PPI enrichment p = 4.1 × 10−9), with an average node degree of 0.662. This network mapped to several biological process GO pathways that are highly associated with the development and complications of T2D, including regulation of voltage-gated sodium channel activity (p = 1.33 × 10−2), protein kinase C-activating G protein-coupled receptor signaling pathway (p = 1.26 × 10−2), cell communication involved in cardiac conduction (p = 1.41 × 10−2), cardiac muscle cell action potential involved in contraction (p = 3.65 × 10−2), regulation of heart rate (p = 6.63 × 10−3), calcium-mediated signaling (p = 2.30 × 10−2) and angiogenesis (p = 1.13 × 10−2). This network also mapped to several KEGG pathways, including glutamatergic synapse (p = 3.8 × 10−3), dilated cardiomyopathy (p = 7.5 × 10−3), Rap1 signaling pathway (p = 8.1 × 10−3), PI3K-Akt signaling pathway (p = 1.56 × 10−2), phospholipase D signaling pathway (p = 2.2 × 10−2), hypertrophic cardiomyopathy (p = 2.27 × 10−2) and Wnt signaling pathway (p = 2.42 × 10−2).

3.10. Integrated Genetic–Epigenetic Classifier for Predicting T2D Status

To determine if correcting for genetic factors with significant GxMeth effects improves the predictive ability of cg19693031 for T2D status, a series of balanced random forest (BRF) classifiers were built, adjusting for age, sex and BMI (Table 12). Using a repeated stratified k-fold cross-validation approach, each BRF classifier with random under-sampling was built to evaluate the features’ ability to discriminate between control and diabetic subjects. Similar to the logistic regression modeling, two base classifiers were built using the covariate variables solely and another incorporating cg19693031 (AUC = 0.68 and AUC = 0.71, respectively). Next, the 1476 GxMeth genetic factors were incorporated into the model, with a decrease in performance observed in comparison to the base models (AUC = 0.67). This decrease in performance is reflective of the random forest’s known vulnerability to overfitting due to correlated features with little discriminatory value.
To resolve this, we performed multivariate ensemble feature selection using random forest with a subsequent BRF classifier to reduce the dimensionality of the dataset. Multivariate approaches consider conditional higher-order interactions between several or all features simultaneously when measuring each feature’s relevance to the overall classification task. A total of 554 features were selected by the random forest learner, including the covariates, cg19693031 and 550 non-local GxMeth SNPs. Using a repeated stratified k-fold cross-validation approach with random under-samplings of the minority class to counteract class imbalance, a BRF model was built to evaluate the selected features for predicting control versus diabetic subjects. The performance of the model improved, with an AUC of 0.74. To further reduce the dimensionality of the dataset, we used the Gini importance to select the top-ranked relevant features. The selected features were ranked according to their Gini Importance, and the top 13 features were retained (Table 13). Using these features, a final BRF classifier was built. The classifier performed better than the previous models, with an AUC of 0.76 (81% sensitivity, 70% specificity). The confusion matrix and ROC curve are illustrated in Figure 9.

4. Discussion

Using data from a group of African American adults, we confirm and extend the prior findings, showing that the methylation of cg19693031 has a significant dose-dependent relationship with HbA1c. We identify confounding genetic variations influencing the methylation status of cg19693031 both independent of and in combination with HbA1c. The pathway analyses identified the regulatory mechanisms related to the 3D genomic structure and TXNIP’s role in disease susceptibility. Finally, we demonstrate that correcting for genetic confounding improves the ability of cg19693031 to detect type 2 diabetes. Collectively, our analyses suggest that the demethylation response of cg19693031 to sustained hyperglycemia in white blood cells may play a pivotal role in the regulatory network effects that underlie several clinical phenotypes, with insights into TXNIP’s pleiotropic functions for disease onset and progression.
To the best of our knowledge, our study is the first full integration of genetic and hyperglycemia effects on cg19693031 methylation, laying the groundwork for a more comprehensive understanding of the molecular mechanisms linking the methylation status of TXNIP’s 3′UTR to disease. Collectively, we identify 11,240 genes that participated in significant interactions. The enrichment and functional genomic analyses suggest the crosstalk between genetic variants and the demethylation response of cg19693031 may contribute to the dysregulation of gene expression seen in diabetes. For example, TXNIP has been shown to be induced by glucose and overexpressed in diabetic patients [23]. In addition, the TXNIP locus is complexly regulated by multilayered mechanisms, including transcriptional regulation, microRNA, mRNA stabilization and protein degradation [24]. Our analyses identified several of these regulators, including transcription factors (FOX01, HSF2, RXRA and KLF6) and modulators of mRNA stability (mir-17, ZFP36, NEDD1, NEDD4, NEDD9 and WWP1). In addition, several factors known to influence genome and epigenome plasticity were identified, including methylome regulators, histone modifiers and subunits of the cohesion complex. While our analyses suggest that the interplay between cg1963031 methylation and genetic variation may influence the transcriptional program of leukocytes in the context of hyperglycemia, external validation with expression data is needed.
Prior studies have suggested the utility of TXNIP as a biomarker for diabetes due to its hyperglycemia-induced overexpression and the strong association of cg19693031 with HbA1c [7]. While the demethylation of cg19693031 correlates to an increase in TXNIP transcription [5], this upregulation cannot be adequately explained by the demethylation of a 3′UTR CpG site as others have suggested. Whereas methylation within the CpG islands and shores near promoter regions correlates to transcriptional repression [25], the literature to support the same conclusion within gene bodies is lacking. Indeed, the recent work by Albao and colleagues suggests that the demethylation of cg19693031 promotes spurious transcription of the 3′UTR [26]. Prior studies have demonstrated the regulation of TXNIP via miRNA directed degradation of the mRNA, notably miR-17 [27]. Therefore, the spurious transcription of the 3′UTR intragenic region could potentially sequester the miRNA degradation by acting as a competitive endogenous RNA, resulting in an increase in TXNIP mRNA stability. Our findings further support the hypothesis that the demethylation response of cg19693031 may increase the mRNA stability of TXNIP. Further research is needed to validate and extend upon the role of the methylome in the regulation of TXNIP.
Our findings will be particularly useful for those seeking to better understand the pathophysiological function of TXNIP in order to optimize diabetes medicine. For example, several cg19693031 interacting genes were enriched in both type 1 (n = 27) and type 2 diabetes (n = 26). In addition, biological pathways related to diabetes were identified, including: insulin section, downstream signaling of insulin and glucagon and sensory perception of chemical stimulus. TXNIP emerged as a viable therapeutic target for diabetes due to its role in pancreatic β-cell apoptosis via activation of the mitochondrial death pathway [28]. The activation of the NLRP3 inflammasome results in the release of proinflammatory cytokines and apoptosis [29]. Critical to this report, TXNIP-induced inflammasome activation and IL-1β production are primarily observed in resident innate immune cells and not the islet β cells themselves [30,31,32]. Pathway analyses identified several mechanisms related to β-cell destruction, such as T-cell signaling and cytotoxic effects and ROS-induced apoptosis. Remarkably, strong interactions were observed with five major islet autoantigens (INS, GAD, IA-2, CPE and ICA). A total of 88 and 66 confounding genetic variants mapped to the NOD-like receptor signaling pathway and apoptosis. In particular, cg19693031 interacted with key loci for inflammasome activation (TRX, regulators of NLRP3 expression and NLRP3 itself), proinflammatory effects (IL-1β, IL-18, IL-6 and TNFα), death receptors (TRAIL-R, Fas and FADD) and other pro-apoptotic genes (BCL-2, BAD, JNK, BAK, ATM and PUMA). Further, TXNIP has been implicated in the development of diabetes-induced complications, including coronary heart disease. In this report, we identified loci mapping to lipids and atherosclerosis (n = 102), diabetic cardiomyopathy (n = 73), insulin resistance (n = 66) and AGE-RAGE diabetic complications (n = 59). This suggests that the pathophysiological role of the demethylation response of cg19693031 may generalize across the diabetes subtypes. As the diagnosis of diabetes mellitus is based on clinical presentation rather than specific molecular defects, overlaps in their etiologies have been documented. Indeed, prior studies have demonstrated the association of cg19693031 methylation and T1D. Further, current clinical trials are investigating the use of Verapamil for treatment of T1D, which has been shown to decrease the expression of TXNIP.
Altered TXNIP activity has been implicated in the development of several diseases aside from diabetes, including atherosclerosis, cancer and neurodegenerative diseases [33,34,35,36]. Our functional analyses identified enrichment in several of these diseases, including pathways in cancer (n = 274), pathways in neurodegeneration (n = 205), Alzheimer’s disease (n = 155), microRNAs in cancer (n = 106), transcriptional misregulation in cancer (n = 100) and the Hippo signaling pathway (n = 84). Moreover, prior studies have demonstrated an association between cg19693031 methylation and chylomicrons (Class A), triglycerides, hexose, α-hydrobutyrate, systolic hypertension, waist circumference, gender, metabolic syndrome, type 1 diabetes, coronary heart disease, breast cancer and colon cancer [7,37,38,39,40]. Taken together, this suggests that the demethylation response of cg19693031 in white blood cells is not specific to diabetes, let alone T2D. Therefore, the clinical use of cg19693031 methylation status in its current form as a biomarker for T2D diabetes may lead to inappropriate clinical decisions and poor patient outcomes. Extensive follow-up studies will be required to demonstrate the ability of cg19693031 methylation to accurately stratify diabetes risk in spite of the non-specific demethylation response.
We also note that, among those with HbA1c confirmed diabetes, only 57% had come to clinical attention, and less than 1% were on diabetic medications. This observation is reflective of the long asymptomatic phase of the disease, and health disparities seen in African American communities, resulting in a high rate of undiagnosed T2D. Additionally, control and pre-diabetic subjects reported a DM diagnosis, 3% and 9%, respectively, but they did not report the use of anti-diabetic pharmacological therapy. This discordance may be reflective of subjects reporting a pre-diabetic diagnosis, or lowered glucose levels due to lifestyle changes or bariatric surgery.
The findings of this study should be considered in the context of its limitations. First, it is important to note that the FACHS cohort is almost exclusively mature adult African American women. Further examinations using a larger number of subjects of other ethnicities, age groups and equal proportions of biological sexes will be required to demonstrate generalizability. Second, although the HbA1c test is the clinical gold standard, it is not sensitive to all pre-diabetes phenotypes [41]. Therefore, replicating our findings with other glycemic indices is needed. Third, the SNPs identified with having significant interactions with cg19693031 DNAm may not be the polymorphism driving the interaction; instead, it may be a tag SNP in full or partial equilibrium with another SNP that is driving the interaction. Additionally, there are known limitations to the Illumina EPIC arrays, such as cross-reactive and polymorphic probes, and β-value skewing due to the normalization process [42,43]. Hence, a more precise method for measuring cg19693031 DNAm, such as digital PCR, should be used to validate and extend our findings. Finally, cg19693931 methylation is confounded by multiple disease processes and cellular contexts. Therefore, while cg19693031 methylation may be sensitive to sustained hyperglycemia, it is not specific to diabetes.
In summary, we confirm and extend the prior findings of the demethylation response of cg19693031 to increasing HbA1c values in an African American cohort. This study is the first to extensively examine the genetic and hyperglycemia contextual effects on cg19693031 methylation, laying the groundwork for a better understanding of TXNIP’s pleiotropic pathophysiological effects in the development of disease. Future studies examining the extent of non-diabetes cellular-context-confounding cg19693031 methylation are needed to assess the feasibility for translation and potential clinical utility.

Author Contributions

R.P. and R.L.S. obtained the funding for the project. R.P. devised the clinical interview procedures, collected the clinical data, provided initial interpretation of clinical findings and provided a summary of clinical variables. K.D. analyzed the data, composed the figures and wrote the initial manuscript draft, with R.P., W.P. and B.D. providing additional material. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by a grant to the NIH (R01AG055393, Simons PI). The funding agency played no role in the analysis and interpretation of the data, preparation of the manuscript and decision to submit the report for publication. The content is solely the responsibility of the authors.

Institutional Review Board Statement

The study was conducted according to the guidelines of Helsinki and approved by the Institutional Review Board of the University of Iowa (IRB 200802719).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

All data used in this study are available from the corresponding author on reasonable request.

Conflicts of Interest

R.P. is the Chief Medical Officer of Cardio Diagnostics Inc. Cardio Diagnostics has filed intellectual property claims on the use of DNA methylation and GxMethylation effects to assess diabetes status. The remaining authors do not report any significant conflict.

References

  1. Association, A.D. Economic costs of diabetes in the US in 2007. Diabetes Care 2008, 31, 596–615. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  2. Association, A.D. Diagnosis and classification of diabetes mellitus. Diabetes Care 2014, 37, S81–S90. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  3. Raciti, G.A.; Longo, M.; Parrillo, L.; Ciccarelli, M.; Mirra, P.; Ungaro, P.; Formisano, P.; Miele, C.; Béguinot, F. Understanding type 2 diabetes: From genetics to epigenetics. Acta Diabetol. 2015, 52, 821–827. [Google Scholar] [CrossRef]
  4. Fuchsberger, C.; Flannick, J.; Teslovich, T.M.; Mahajan, A.; Agarwala, V.; Gaulton, K.J.; Ma, C.; Fontanillas, P.; Moutsianas, L.; McCarthy, D.J. The genetic architecture of type 2 diabetes. Nature 2016, 536, 41–47. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  5. Soriano-Tárraga, C.; Jiménez-Conde, J.; Giralt-Steinhauer, E.; Mola-Caminal, M.; Vivanco-Hidalgo, R.M.; Ois, A.; Rodríguez-Campello, A.; Cuadrado-Godia, E.; Sayols-Baixeras, S.; Elosua, R. Epigenome-wide association study identifies TXNIP gene associated with type 2 diabetes mellitus and sustained hyperglycemia. Hum. Mol. Genet. 2016, 25, 609–619. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  6. Chambers, J.C.; Loh, M.; Lehne, B.; Drong, A.; Kriebel, J.; Motta, V.; Wahl, S.; Elliott, H.R.; Rota, F.; Scott, W.R. Epigenome-wide association of DNA methylation markers in peripheral blood from Indian Asians and Europeans with incident type 2 diabetes: A nested case-control study. Lancet Diabetes Endocrinol. 2015, 3, 526–534. [Google Scholar] [CrossRef] [Green Version]
  7. Juvinao-Quintero, D.L.; Marioni, R.E.; Ochoa-Rosales, C.; Russ, T.C.; Deary, I.J.; Van Meurs, J.B.; Voortman, T.; Hivert, M.-F.; Sharp, G.C.; Relton, C.L. DNA methylation of blood cells is associated with prevalent type 2 diabetes in a meta-analysis of four European cohorts. Clin. Epigenetics 2021, 13, 40. [Google Scholar] [CrossRef]
  8. Villicaña, S.; Bell, J.T. Genetic impacts on DNA methylation: Research findings and future perspectives. Genome Biol. 2021, 22, 127. [Google Scholar]
  9. Simons, R.L.; Murry, V.; McLoyd, V.; Lin, K.-H.; Cutrona, C.; Conger, R.D. Discrimination, crime, ethnic identity, and parenting as correlates of depressive symptoms among African American children: A multilevel analysis. Dev. Psychopathol. 2002, 14, 371–393. [Google Scholar] [CrossRef] [Green Version]
  10. Lahiri, D.K.; Nurnberger, J.I., Jr. A rapid non-enzymatic method for the preparation of HMW DNA from blood for RFLP studies. Nucleic Acids Res. 1991, 19, 5444. [Google Scholar] [CrossRef]
  11. Metus, P.; Ruzzante, N.; Bonvicini, P.; Meneghetti, M.; Zaninotto, M.; Plebani, M. Immunoturbidimetric assay of glycated hemoglobin. J. Clin. Lab. Anal. 1999, 13, 5–8. [Google Scholar] [CrossRef]
  12. Rivers, P.A.; Dobalian, A.; Germinario, F.A. A review and analysis of the clinical laboratory improvement amendment of 1988: Compliance plans and enforcement policy. Health Care Manag. Rev. 2005, 30, 93–102. [Google Scholar] [CrossRef] [PubMed]
  13. Dogan, M.V.; Xiang, J.; Beach, S.R.; Cutrona, C.; Gibbons, F.X.; Simons, R.L.; Brody, G.H.; Stapleton, J.T.; Philibert, R.A. Ethnicity and smoking-associated DNA methylation changes at HIV co-receptor GPR15. Front. Psychiatry 2015, 6, 132. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  14. Purcell, S.; Neale, B.; Todd-Brown, K.; Thomas, L.; Ferreira, M.A.; Bender, D.; Maller, J.; Sklar, P.; De Bakker, P.I.; Daly, M.J. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007, 81, 559–575. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  15. Philibert, R.A.; Zadorozhnyaya, O.; Beach, S.R.; Brody, G.H. Comparison of the genotyping results using DNA obtained from blood and saliva. Psychiatr. Genet. 2008, 18, 275–281. [Google Scholar] [CrossRef] [Green Version]
  16. Gentilini, D.; Garagnani, P.; Pisoni, S.; Bacalini, M.G.; Calzari, L.; Mari, D.; Vitale, G.; Franceschi, C.; Di Blasio, A.M. Stochastic epigenetic mutations (DNA methylation) increase exponentially in human aging and correlate with X chromosome inactivation skewing in females. Aging 2015, 7, 568. [Google Scholar] [CrossRef] [Green Version]
  17. Wang, Y.; Karlsson, R.; Jylhävä, J.; Hedman, Å.K.; Almqvist, C.; Karlsson, I.K.; Pedersen, N.L.; Almgren, M.; Hägg, S. Comprehensive longitudinal study of epigenetic mutations in aging. Clin. Epigenetics 2019, 11, 187. [Google Scholar] [CrossRef] [Green Version]
  18. Van Rossum, G. Python Programming Language. In Proceedings of the USENIX Annual Technical Conference, Santa Clara, CA, USA, 17–22 June 2007; p. 36. [Google Scholar]
  19. Yin, L.; Zhang, H.; Tang, Z.; Xu, J.; Yin, D.; Zhang, Z.; Yuan, X.; Zhu, M.; Zhao, S.; Li, X. rmvp: A memory-efficient, visualization-enhanced, and parallel-accelerated tool for genome-wide association study. Genom. Proteom. Bioinform. 2021, 19, 619–628. [Google Scholar] [CrossRef]
  20. Mering, C.v.; Huynen, M.; Jaeggi, D.; Schmidt, S.; Bork, P.; Snel, B. STRING: A database of predicted functional associations between proteins. Nucleic Acids Res. 2003, 31, 258–261. [Google Scholar] [CrossRef]
  21. Kanehisa, M.; Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000, 28, 27–30. [Google Scholar] [CrossRef]
  22. Becker, K.G.; Barnes, K.C.; Bright, T.J.; Wang, S.A. The genetic association database. Nat. Genet. 2004, 36, 431–432. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  23. Parikh, H.; Carlsson, E.; Chutkow, W.A.; Johansson, L.E.; Storgaard, H.; Poulsen, P.; Saxena, R.; Ladd, C.; Schulze, P.C.; Mazzini, M.J. TXNIP regulates peripheral glucose metabolism in humans. PLoS Med 2007, 4, e158. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  24. Leibowitz, G.; Ktorza, A.; Cerasi, E. The role of txnip in the pathophysiology of diabetes and its vascular complications: A concise review. Medicographia 2014, 36, 391–397. [Google Scholar]
  25. Jones, P.A. Functions of DNA methylation: Islands, start sites, gene bodies and beyond. Nat. Rev. Genet. 2012, 13, 484–492. [Google Scholar] [CrossRef]
  26. Albao, D.S.; Cutiongco-de la Paz, E.M.; Mercado, M.E.; Lirio, A.; Mariano, M.; Kim, S.; Yangco, A.; Melegrito, J.; Wad-asen, K.; Gauran, I.I. Methylation changes in the peripheral blood of Filipinos with type 2 diabetes suggest spurious transcription initiation at TXNIP. Hum. Mol. Genet. 2019, 28, 4208–4218. [Google Scholar] [CrossRef]
  27. Dong, D.; Fu, N.; Yang, P. MiR-17 downregulation by high glucose stabilizes thioredoxin-interacting protein and removes thioredoxin inhibition on ASK1 leading to apoptosis. Toxicol. Sci. 2016, 150, 84–96. [Google Scholar] [CrossRef] [Green Version]
  28. Thielen, L.; Shalev, A. Diabetes pathogenic mechanisms and potential new therapies based upon a novel target called TXNIP. Curr. Opin. Endocrinol. Diabetes Obes. 2018, 25, 75. [Google Scholar] [CrossRef]
  29. Oslowski, C.M.; Hara, T.; O’Sullivan-Murphy, B.; Kanekura, K.; Lu, S.; Hara, M.; Ishigaki, S.; Zhu, L.J.; Hayashi, E.; Hui, S.T. Thioredoxin-interacting protein mediates ER stress-induced β cell death through initiation of the inflammasome. Cell Metab. 2012, 16, 265–273. [Google Scholar] [CrossRef] [Green Version]
  30. Hoffman, H.M.; Mueller, J.L.; Broide, D.H.; Wanderer, A.A.; Kolodner, R.D. Mutation of a new gene encoding a putative pyrin-like protein causes familial cold autoinflammatory syndrome and Muckle–Wells syndrome. Nat. Genet. 2001, 29, 301–305. [Google Scholar] [CrossRef] [Green Version]
  31. March, C.J.; Mosley, B.; Larsen, A.; Cerretti, D.P.; Braedt, G.; Price, V.; Gillis, S.; Henney, C.S.; Kronheim, S.R.; Grabstein, K. Cloning, sequence and expression of two distinct human interleukin-1 complementary DNAs. Nature 1985, 315, 641–647. [Google Scholar] [CrossRef]
  32. Westwell-Roper, C.Y.; Ehses, J.A.; Verchere, C.B. Resident macrophages mediate islet amyloid polypeptide–induced islet IL-1β production and β-cell dysfunction. Diabetes 2014, 63, 1698–1711. [Google Scholar] [CrossRef] [Green Version]
  33. Alvim, R.; Santos, P.; Ferreira, N.; Mill, J.; Krieger, J.; Pereira, A. Thioredoxin interacting protein (TXNIP) rs7212 polymorphism is associated with arterial stiffness in the Brazilian general population. J. Hum. Hypertens. 2012, 26, 340–342. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  34. Wang, X.-b.; Han, Y.-d.; Zhang, S.; Cui, N.-h.; Liu, Z.-j.; Huang, Z.-l.; Li, C.; Zheng, F. Associations of polymorphisms in TXNIP and gene–environment interactions with the risk of coronary artery disease in a Chinese Han population. J. Cell. Mol. Med. 2016, 20, 2362–2373. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  35. Tsubaki, H.; Tooyama, I.; Walker, D.G. Thioredoxin-Interacting Protein (TXNIP) with Focus on Brain and Neurodegenerative Diseases. Int. J. Mol. Sci. 2020, 21, 9357. [Google Scholar] [CrossRef] [PubMed]
  36. Masutani, H. Thioredoxin-Interacting Protein in Cancer and Diabetes. Antioxid. Redox Signal. 2021. [Google Scholar] [CrossRef]
  37. Petersen, A.-K.; Zeilinger, S.; Kastenmüller, G.; Römisch-Margl, W.; Brugger, M.; Peters, A.; Meisinger, C.; Strauch, K.; Hengstenberg, C.; Pagel, P. Epigenetics meets metabolomics: An epigenome-wide association study with blood serum metabolic traits. Hum. Mol. Genet. 2014, 23, 534–545. [Google Scholar] [CrossRef] [Green Version]
  38. Nuotio, M.-L.; Pervjakova, N.; Joensuu, A.; Karhunen, V.; Hiekkalinna, T.; Milani, L.; Kettunen, J.; Järvelin, M.-R.; Jousilahti, P.; Metspalu, A. An epigenome-wide association study of metabolic syndrome and its components. Sci. Rep. 2020, 10, 20567. [Google Scholar] [CrossRef]
  39. Chen, Z.; Miao, F.; Braffett, B.H.; Lachin, J.M.; Zhang, L.; Wu, X.; Roshandel, D.; Carless, M.; Li, X.A.; Tompkins, J.D. DNA methylation mediates development of HbA1c-associated complications in type 1 diabetes. Nat. Metab. 2020, 2, 744–762. [Google Scholar] [CrossRef]
  40. Ren, J.; Zhang, B.; Wei, D.; Zhang, Z. Identification of methylated gene biomarkers in patients with Alzheimer’s disease based on machine learning. BioMed Res. Int. 2020, 2020, 8348147. [Google Scholar] [CrossRef] [Green Version]
  41. Hellman, R. When are HbA1C values misleading? AACE Clin. Case Rep. 2016, 2, e377–e379. [Google Scholar] [CrossRef] [Green Version]
  42. Pidsley, R.; Zotenko, E.; Peters, T.J.; Lawrence, M.G.; Risbridger, G.P.; Molloy, P.; Van Djik, S.; Muhlhausler, B.; Stirzaker, C.; Clark, S.J. Critical evaluation of the Illumina MethylationEPIC BeadChip microarray for whole-genome DNA methylation profiling. Genome Biol. 2016, 17, 208. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  43. McCartney, D.L.; Walker, R.M.; Morris, S.W.; McIntosh, A.M.; Porteous, D.J.; Evans, K.L. Identification of polymorphic and off-target probe binding sites on the Illumina Infinium MethylationEPIC BeadChip. Genom. Data 2016, 9, 22–24. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Figure 1. Distribution of cg19693031 methylation across T2D status.
Figure 1. Distribution of cg19693031 methylation across T2D status.
Genes 13 00683 g001
Figure 2. Distribution of cg26823705 across T2D status.
Figure 2. Distribution of cg26823705 across T2D status.
Genes 13 00683 g002
Figure 3. Cis e-QTL genes with trans-meQTL effects on cg19693031 identified in type 1 diabetes pathology. Cis e-QTL genes with meQTL effects on cg19693031 methylation are highlighted in red (n = 23).
Figure 3. Cis e-QTL genes with trans-meQTL effects on cg19693031 identified in type 1 diabetes pathology. Cis e-QTL genes with meQTL effects on cg19693031 methylation are highlighted in red (n = 23).
Genes 13 00683 g003
Figure 4. Cis e-QTL genes with trans-meQTL effects on cg19693031 identified in AGE-RAGE signaling in diabetic complications. Cis e-QTL genes with meQTL effects on cg19693031 methylation are highlighted in red (n = 34).
Figure 4. Cis e-QTL genes with trans-meQTL effects on cg19693031 identified in AGE-RAGE signaling in diabetic complications. Cis e-QTL genes with meQTL effects on cg19693031 methylation are highlighted in red (n = 34).
Genes 13 00683 g004
Figure 5. Manhattan plot illustrating the association of GxMeth and SNP terms with HbA1c. Results were adjusted for age, sex and BMI, and corrected for multiple comparisons. Associations at p < 1.33 × 10−7 are above the horizontal black line. The chromosome density, number of significant SNPs residing within a 1 Mb window, is located at the bottom of the plot and the color-coded legend is to the right.
Figure 5. Manhattan plot illustrating the association of GxMeth and SNP terms with HbA1c. Results were adjusted for age, sex and BMI, and corrected for multiple comparisons. Associations at p < 1.33 × 10−7 are above the horizontal black line. The chromosome density, number of significant SNPs residing within a 1 Mb window, is located at the bottom of the plot and the color-coded legend is to the right.
Genes 13 00683 g005
Figure 6. cg19693031 distribution as a function of rs2390998 genotype across all 506 subjects (A) and stratified by T2D status (B).
Figure 6. cg19693031 distribution as a function of rs2390998 genotype across all 506 subjects (A) and stratified by T2D status (B).
Genes 13 00683 g006aGenes 13 00683 g006b
Figure 7. cg19693031 distribution as a function of rs1074390 genotype across all 506 subjects (A) and stratified by T2D status (B).
Figure 7. cg19693031 distribution as a function of rs1074390 genotype across all 506 subjects (A) and stratified by T2D status (B).
Genes 13 00683 g007aGenes 13 00683 g007b
Figure 8. Protein–protein interactome (PPI) generated from regression analyses of significant cg19693031 GxMeth effects against HbA1c.
Figure 8. Protein–protein interactome (PPI) generated from regression analyses of significant cg19693031 GxMeth effects against HbA1c.
Genes 13 00683 g008
Figure 9. Testing of the GxMeth model using the top 13 ranked features for distinguishing normoglycemic controls and diabetic subjects.
Figure 9. Testing of the GxMeth model using the top 13 ranked features for distinguishing normoglycemic controls and diabetic subjects.
Genes 13 00683 g009
Table 1. Clinical and demographic characteristics of subjects.
Table 1. Clinical and demographic characteristics of subjects.
ControlPre-DiabeticDiabetic
N = 142N = 274N = 90
Age45.3 ± 8.446.7 ± 7.352.1 ± 9.7
Gender
Male36 (25)69 (25)20 (22)
Female106 (75)205 (75)70 (78)
Ethnicity
African American140 (99)273 (99)90 (100)
White2 (1)--
Hispanic-1 (1)-
HbA1c (%)5.34 ± 0.265.96 ± 0.218.02 ± 1.84
BMI31.91 ± 8.4533.73 ± 7.9137.85 ± 9.53
Systolic BP 133.98 ± 21.92134.15 ± 19.59139.67 ± 23.65
Diastolic BP 83.46 ± 13.2483.59 ± 13.2584.06 ± 15.46
Self-Report Dx
Diabetes-22 (8)51 (57)
Hypertension12 (8)43 (16)17 (19)
Cardiac Event1 (1)2 (1)1 (1)
Self-Report Rx *
Diabetes--3 (3)
Blood Pressure12 (8)39 (14)17 (19)
Cholesterol-2 (1)1(1)
Mean ± standard deviation for continuous variables. N (%) for categorical variables. BP represents blood-pressure (mm Hg). Dx represents diagnosis. * Represents treatment.
Table 2. Top 21 ranked HbA1c-associated DNAm loci.
Table 2. Top 21 ranked HbA1c-associated DNAm loci.
RankCpGGenePositionIsland StatusSEMp-Value
1cg19693031TXNIP3′UTROpen Sea-4.43 × 10−12
2cg02458882OPTN5′UTRIslandHMO3.78 × 10−7
3cg21804949 Open SeaLMO5.28 × 10−7
4cg14955495CASKINTSS1500IslandHMO2.16 × 10−6
5cg18890830GPX6BodyOpen SeaLMO2.86 × 10−6
6cg22544867NELFCDTSS1500N ShoreLMO4.35 × 10−6
7cg03577153ZNF350TS200Open SeaHMO4.95 × 10−6
8cg14277924ATP10DBodyOpen SeaLMO8.16 × 10−6
9cg02184744ANKRD11BodyOpen SeaLMO3.35 × 10−4
10cg14471895FAM120AOSBodyIslandHMO5.30 × 10−4
11cg06655623RASGEF1ATSS1500Open SeaHMO8.09 × 10−4
12cg14691530 S ShelfLMO9.90 × 10−4
13cg13990141FAMQ20A0SBodyIslandHMO8.62 × 10−3
14cg22942097 IslandHMO0.012
15cg01105172 Open SeaLMO0.018
16cg26823705NBPF20BodyOpen SeaHMO0.020
17cg24623376IL2RABodyOpen SeaLMO0.020
18cg10536901SLC25A245′UTRIslandLMO0.020
19cg05922331 Open SeaLMO0.030
20cg03916864UNGTSS1500IslandHMO0.037
21cg12936627PTTG1IPTSS200IslandHMO0.037
22cg25962218LRGUKBodyOpen SeaHMO0.039
23cg05055927 Open SeaLMO0.043
Top 23 significant DNAm loci associated with HbA1c after FDR correction for multiple comparisons.
Table 3. Top 30 polymorphic CpGs associated with HbA1c values.
Table 3. Top 30 polymorphic CpGs associated with HbA1c values.
RankCpGGeneSNPp-Value
1cg03422185BCKDHArs1451731400.0003
2cg18909131KIAA0427rs773895860.0003
3cg04608829CYP4F12rs739268670.0003
4cg22228131BCAR3rs5277942380.0004
5cg27350255 rs5415330800.0009
6cg04291024TRAPPC9rs1148932010.0009
7cg00539624 rs534060211, rs1160036790.0009
8cg04272044RAD50rs732577570.0013
9cg21808468MCCrs18224870.0014
10cg08491197 rs764976000.0020
11cg07873290 rs1144341590.0020
12cg26502489SCN2Brs78096838, rs5699926260.0020
13cg00888992BRF1rs1161016360.0022
14cg09208162NDUFA10rs1388993260.0024
15cg20494686 rs148229470, rs1411392680.0025
16cg22098375GABBR1rs556388914, rs5642277380.0025
17cg05614028 rs5593291560.0033
18cg19804132CCL8rs1813025240.0045
19cg15166058ACTG2rs1509406640.0047
20cg23511885 rs622193390.0051
21cg14264512AKAP9rs5409812230.0054
22cg18049287ZBP1rs5329360930.0062
23cg16018596 rs789992820.0068
24cg23879504PRLRrs538603339, rs798237710.0077
25cg01704651C1ORF167rs1897378700.0083
26cg25921170ABL2rs5534158410.0097
27cg23576118RGS6rs363220.0137
28cg21442599SMIM8rs181909564, rs1872489400.0137
29cg19532212TEX14rs1858427720.0142
30cg21606577RSPRY1rs1413501180.0142
Top 30 ranked significant polymorphic CpG loci associated with HbA1c after FDR correction for multiple comparisons.
Table 4. Cis-meQTLs nominally associated with cg19693031.
Table 4. Cis-meQTLs nominally associated with cg19693031.
RankSNPGeneDistanceAdj R2p-Value
1rs587743199NOTCH2NLA2.26 × 1050.013530.003
2rs16827006ITGA109.83 × 1040.010280.008
3rs16827043 4.61 × 1040.0091990.011
4rs61741868ITGA109.99 × 1040.0058410.030
5rs10910843RNF1152.50 × 1050.0052170.036
6rs112006139PDE4DIP4.87 × 1050.0048550.037
7rs5617144ANKRD351.14 × 1040.0044910.045
Table 5. Top 30 long-range cis-meQTLs nominally associated with cg19693031.
Table 5. Top 30 long-range cis-meQTLs nominally associated with cg19693031.
RankSNPGeneDistanceAdj R2p-Value
1rs17023177PTPN146.93 × 1070.04161.52 × 10−6
2rs12718444SLC2A11.02 × 1080.03241.86 × 10−5
3rs2086856SLC2A11.02 × 1080.03092.80 × 10−5
4rs6702764 9.92 × 1070.03082.87 × 10−5
5JHU_1.230641162 8.52 × 1070.033.57 × 10−5
6rs34247575PTGER37.41 × 1070.02688.58 × 10−5
7JHU_1.19366997 1.26 × 1080.02678.80 × 10−5
8rs74121148 1.19 × 1070.02669.06 × 10−5
9rs2792599 1.02 × 1080.02531.30 × 10−4
10rs9326132 1.02 × 1080.02531.31 × 10−4
11JHU_1.78608038 6.68 × 1070.0251.43 × 10−4
12rs57013566 9.97 × 1070.02531.61 × 10−4
13rs12091692 5.29 × 1070.02342.22 × 10−4
14rs72895742TTC229.02 × 1070.02292.52 × 10−4
15rs6699702PRDX19.95 × 1070.02292.52 × 10−4
16rs2422139C1orf1052.69 × 1070.02272.63 × 10−4
17rs10911734 3.40 × 1070.02272.67 × 10−4
18JHU_1.45997600 9.94 × 1070.02252.85 × 10−4
19JHU_1.992041 1.40 × 1080.02273.47 × 10−4
20JHU_1.46359270MAST29.91 × 1070.02094.35 × 10−4
21rs2280511GLIS19.15 × 1070.02084.50 × 10−4
22JHU_1.90440644 5.50 × 1070.02035.04 × 10−4
23JHU_1.49307071AGBL49.61 × 1070.02035.04 × 10−4
24JHU_1.26626016UBXN111.19 × 1080.02035.16 × 10−4
25rs78580993 3.99 × 1070.02025.18 × 10−4
26exm111496FCRL51.21 × 1070.02025.29 × 10−4
27rs10864087 6.89 × 1070.02015.39 × 10−4
28rs12033260LOC1019272447.42 × 1070.02015.43 × 10−4
29rs3935008PTPRVP5.67 × 1070.025.47 × 10−4
30rs11265240 1.41 × 1070.01995.69 × 10−4
Table 6. Top 30 distal meQTLs nominally associated with cg19693031.
Table 6. Top 30 distal meQTLs nominally associated with cg19693031.
RankSNPGeneChromosomeAdj R2p-Value
1rs181228225 150.04251.30 × 10−6
2rs7447064830.04181.44 × 10−6
3rs29620050.03764.46 × 10−6
4rs2390998130.03665.86 × 10−6
5rs744439140.03646.33 × 10−6
6JHU_5.10609303650.03588.38 × 10−6
7rs1054013MEG3140.03461.02 × 10−5
8exm-rs11177669 120.03441.09 × 10−5
9rs7330858 130.03381.30 × 10−5
10rs1451452 20.03371.30 × 10−5
11JHU_18.63134356 180.03361.38 × 10−5
12JHU_16.9476218 160.03281.69 × 10−5
13rs117580705 110.03261.74 × 10−5
14exm-rs11221332ETS1110.0322.09 × 10−5
15rs1380989FGF1230.03152.43 × 10−5
16rs331581 50.03043.24 × 10−5
17rs3794207CAMKK2120.03023.38 × 10−5
18rs568941 160.02973.93 × 10−5
19rs7142470LINC00520140.02963.97 × 10−5
20exm-rs2523608HLA-B60.02964.01 × 10−5
21rs266273 20.02944.24 × 10−5
22JHU_15.87493750AGBL1150.02954.28 × 10−5
23rs73392704 140.02924.50 × 10−5
24exm523653POM121L260.02914.57 × 10−5
25rs6887332CDX150.0294.75 × 10−5
26JHU_14.23027963 140.02894.85 × 10−5
27JHU_3.16412493 30.03124.98 × 10−5
28rs11793972FAM155A90.02885.94 × 10−5
29rs4442648 130.02855.43 × 10−5
30rs73986979FIBCD1170.02825.89 × 10−5
Table 7. Association estimates for features in final multivariate G + E meQTL regression model with one GxE term for cg19693031.
Table 7. Association estimates for features in final multivariate G + E meQTL regression model with one GxE term for cg19693031.
Featureβp-Value
A1C−0.021***
rs1812282250.024***
rs11221332−0.021***
JHU_18.631343560.017***
rs170231770.012***
rs23909980.013***
rs744439:A1C0.008***
rs6702764−0.014***
rs744706480.011**
rs3794207−0.012**
rs331581−0.012**
JHU_1.230641162−0.009**
rs1451452−0.011**
rs73308580.018**
rs1175807050.018**
JHU_5.106093036−0.010**
rs744439−0.044**
rs13809890.009**
rs2962000.009**
rs12718444−0.012**
JHU_16.94762180.012*
rs11177669−0.007*
rs1054013−0.013*
p-value significant codes: <0 (***), <0.001 (**), <0.01 (*). Overall model performance: adj R2 = 0.5077 and p-value < 2.2 × 10−16.
Table 8. Association estimates for features in final multivariate GxE regression model for cg19693031 methylation.
Table 8. Association estimates for features in final multivariate GxE regression model for cg19693031 methylation.
Featureβp-Value
rs22353380.059**
rs72914579−0.091**
A1C:rs2235338−0.008**
JHU_1.520626550.082**
rs112652400.058**
A1C:rs729145790.011*
JHU_2.2348249010.054*
A1C:JHU_1.52062655−0.011*
rs744439:A1C0.007*
A1C:JHU_2.234824901−0.008*
A1C:rs11265240−0.007*
rs744439−0.027
A1C0.007
p-value significiant codes: < 0.001 (**), < 0.01 (*). Overall model performance: adj R2 = 0.2778 and p-value < 2.2 × 10−16.
Table 9. Association estimates for features in final multivariate integrated additive and interactive genetic and HbA1c regression model for cg19693031 methylation.
Table 9. Association estimates for features in final multivariate integrated additive and interactive genetic and HbA1c regression model for cg19693031 methylation.
Featureβp-Value
rs1812282250.023***
rs11221332−0.019***
JHU_18.631343560.017***
rs744439:A1C0.009***
rs744706480.010**
rs744439−0.050**
rs11177669−0.009**
rs1451452−0.011**
JHU_1.230641162−0.008**
rs112652400.052**
rs170231770.011**
rs1175807050.018**
rs6702764−0.012**
JHU_1.520626550.072**
rs1054013−0.014**
rs70219110.138**
A1C:rs7021911−0.023**
rs12718444−0.012**
rs3794207−0.010**
rs2962000.009**
rs73308580.015*
rs331581−0.009*
rs13809890.008*
A1C:rs11265240−0.007*
A1C:JHU_1.52062655−0.010*
rs62437200−0.063*
rs23909980.009*
JHU_5.106093036−0.008*
rs7789476−0.007*
A1C:rs624372000.009*
JHU_16.94762180.009*
A1C0.036
p-value significant codes: <0 (***), <0.001 (**), <0.01 (*). Overall model performance: adj R2 = 0.5515 and p-value < 2.2 × 10−22.
Table 10. Performance metrics of logistic regression models for predicting normoglycemic vs. diabetic subjects (A), normoglycemic vs. prediabetic subjects (B) and prediabetic vs. diabetic subjects (C).
Table 10. Performance metrics of logistic regression models for predicting normoglycemic vs. diabetic subjects (A), normoglycemic vs. prediabetic subjects (B) and prediabetic vs. diabetic subjects (C).
Model PredictorsPrecisionSensitivitySpecificityAUC
1Age + Sex + BMI0.6370%74%0.72
2Model 1 + cg196930310.6670%76%0.73
3Model 2 + rs72110.6574%74%0.74
4Model 2 + rs72120.6670%76%0.73
5Model 2 + rs92450.6163%74%0.68
6Model 2 + rs7211 + rs7212 + rs92450.6670%76%0.73
Missing values for predictors result in analysis sample size of 227 subjects for all models.
(A)
Model PredictorsPrecisionSensitivitySpecificityAUC
1Age + Sex + BMI0.6858%46%0.52
2Model 1 + cg196930310.6859%46%0.53
3Model 2 + rs72110.757%54%0.55
4Model 2 + rs72120.6859%46%0.53
5Model 2 + rs92450.6861%44%0.52
6Model 2 + rs7211 + rs7212 + rs92450.6861%44%0.52
Missing values for predictors result in analysis sample size of 397 subjects for all models.
(B)
Model PredictorsPrecisionSensitivitySpecificityAUC
1Age + Sex + BMI0.4367%69%0.68
2Model 1 + cg196930310.5167%78%0.72
3Model 2 + rs72110.5167%78%0.72
4Model 2 + rs72120.563%78%0.71
5Model 2 + rs92450.5367%79%0.73
6Model 2 + rs7211 + rs7212 + rs92450.4659%76%0.67
Missing values for predictors result in analysis sample size of 350 subjects for all models.
(C)
Table 11. Top 25 ranked SNPs with GxMeth effects after Bonferroni correction.
Table 11. Top 25 ranked SNPs with GxMeth effects after Bonferroni correction.
RankSNPChrGeneMAFGxMeth p-ValueSNP p-Value
1rs19321896 0.0278.85 × 10−171.98 × 10−18
2rs107977451DNM30.0341.04 × 10−118.91 × 10−14
3rs611033320MACROD20.0623.44 × 10−118.52 × 10−12
4rs112443769 0.0256.64 × 10−112.57 × 10−12
5rs436963415 0.3167.32 × 10−111.19 × 10−11
6rs2890417SPACA30.0842.56 × 10−106.76 × 10−11
7rs28164908 0.1582.65 × 10−101.17 × 10−11
8rs788455671XPR10.0154.78 × 10−106.70 × 10−11
9rs7799870114 0.0366.45 × 10−105.23 × 10−11
10rs176499520 0.4777.13 × 10−107.68 × 10−11
11rs6206826217 0.0598.54 × 10−101.43 × 10−10
12rs11721919611 0.0271.20 × 10−93.79 × 10−11
13rs122049786 0.0591.24 × 10−91.28 × 10−10
14rs746358541 0.0121.86 × 10−91.53 × 10−10
15rs727114884 0.152.02 × 10−96.88 × 10−10
16rs13841872519MAP3K100.0142.18 × 10−91.98 × 10−11
17rs168408871 0.0233.85 × 10−91.46 × 10−9
18rs622906933 0.2694.84 × 10−91.65 × 10−9
19rs129672016CREBBP0.0326.70 × 10−95.26 × 10−10
20rs177031862 0.0959.96 × 10−91.72 × 10−9
21rs664627043 0.0791.60 × 10−87.75 × 10−10
22rs14251871813 0.0131.61 × 10−83.10 × 10−9
23rs1165194654 0.0132.00 × 10−89.16 × 10−10
24rs168595172NHEJ10.0132.02 × 10−81.59 × 10−9
25rs126766388 0.0152.45 × 10−84.84 × 10−10
Abbreviations: Chr, chromosome; MAF, minor allele frequency. Top 25 SNPs based on the Bonferroni corrected p-value of the Gxmeth term in regression analyses against HbA1c values adjusted for age, sex and BMI.
Table 12. Balanced random forest models performance metrics for predicting normoglycemic vs. diabetic subjects.
Table 12. Balanced random forest models performance metrics for predicting normoglycemic vs. diabetic subjects.
ModelPredictorsPrecisionSensitivitySpecificityAUC
1Age + Sex + BMI0.5674%63%0.68
2Model 1 + cg196930310.5974%67%0.71
3Model 2 + 1476 GxMeth SNPs0.5470%63%0.67
4Model 2 + 551 selected GxMeth SNPs0.6967%81%0.74
5Model 2 + Top 10 GxMeth SNPs0.6381%70%0.76
Table 13. Top ranked important features selected by a BRF for predicting normoglycemic vs. diabetic subjects.
Table 13. Top ranked important features selected by a BRF for predicting normoglycemic vs. diabetic subjects.
RankFeature
1Age
2cg19693031
3BMI
4rs78125109
5rs9311874
6rs79244502
7rs10496366
8rs10496731
9rs10032200
10rs77173725
11rs2227818
12rs440617
13rs3218194
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Dawes, K.; Philibert, W.; Darbro, B.; Simons, R.L.; Philibert, R. Additive and Interactive Genetically Contextual Effects of HbA1c on cg19693031 Methylation in Type 2 Diabetes. Genes 2022, 13, 683. https://doi.org/10.3390/genes13040683

AMA Style

Dawes K, Philibert W, Darbro B, Simons RL, Philibert R. Additive and Interactive Genetically Contextual Effects of HbA1c on cg19693031 Methylation in Type 2 Diabetes. Genes. 2022; 13(4):683. https://doi.org/10.3390/genes13040683

Chicago/Turabian Style

Dawes, Kelsey, Willem Philibert, Benjamin Darbro, Ronald L. Simons, and Robert Philibert. 2022. "Additive and Interactive Genetically Contextual Effects of HbA1c on cg19693031 Methylation in Type 2 Diabetes" Genes 13, no. 4: 683. https://doi.org/10.3390/genes13040683

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop