Omics Data Analysis and Integration in Complex Diseases

A special issue of Biomedicines (ISSN 2227-9059). This special issue belongs to the section "Molecular and Translational Medicine".

Deadline for manuscript submissions: closed (31 May 2022) | Viewed by 27471

Special Issue Editors


Guest Editor
1. Department of Statistics, Faculty of Medicine, University of Granada, 18071 Granada, Spain
2. Centre for Genomics and Oncological Research, Pfizer-University of Granada-Andalusian Regional Government, 18016 Granada, Spain
Interests: bioinformatics and biostatistics; computational biomedicine; omics data analysis

Guest Editor
Centre for Genomics and Oncological Research, Pfizer-University of Granada-Andalusian Regional Government, 18016 Granada, Spain
Interests: bioinformatics; autoimmune diseases; computational biology

Special Issue Information

Dear Colleagues,

The emergence of omics technologies has revolutionized biomedical research, allowing us to analyze the molecular mechanisms of complex diseases at an unprecedented scale. The analysis of omics data offers enormous possibilities for biomarker discovery, patient stratification, disease classification, and drug discovery, and it is fueling precision medicine strategies.

In this context, the development of statistical and computational methods to properly analyze and extract knowledge from large and heterogeneous omics datasets has become a major focus of research. Additionally, the availability of studies that generate multi-omics data from the same cohort of patients has opened new challenges in the field, as the integration of multi-omics data can provide more accurate and robust results than the analysis of a single type of omics data.

In this Special Issue, we will focus on new statistical and computational methods for omics data analysis and integration in complex diseases; new software and bioinformatics pipelines to analyze omics data; and applications in biomarker discovery, disease classification, patient stratification, drug repurposing, and drug discovery. Validation experiments are required.

Dr. Pedro Carmona-Sáez
Dr. Daniel Toro-Domínguez
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Biomedicines is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Omics Data Analysis
  • Omics Data Integration
  • Biomarker Discovery
  • Disease Classification
  • Precision Medicine
  • Machine Learning
  • Bioinformatics
  • Biostatistics

Published Papers (8 papers)

Research

40 pages, 5181 KiB  
Article
Progeria and Aging—Omics Based Comparative Analysis
by Aylin Caliskan, Samantha A. W. Crouch, Sara Giddins, Thomas Dandekar and Seema Dangwal
Biomedicines 2022, 10(10), 2440; https://doi.org/10.3390/biomedicines10102440 - 29 Sep 2022
Cited by 2 | Viewed by 2948
Abstract
Since ancient times aging has also been regarded as a disease, and humankind has always strived to extend the natural lifespan. Analyzing the genes involved in aging and disease allows for finding important indicators and biological markers for pathologies and possible therapeutic targets. An example of the use of omics technologies is the research regarding aging and the rare and fatal premature aging syndrome progeria (Hutchinson-Gilford progeria syndrome, HGPS). In our study, we focused on the in silico analysis of differentially expressed genes (DEGs) in progeria and aging, using a publicly available RNA-Seq dataset (GEO dataset GSE113957) and a variety of bioinformatics tools. Despite the GSE113957 RNA-Seq dataset being well-known and frequently analyzed, the RNA-Seq data shared by Fleischer et al. is far from exhausted and reusing and repurposing the data still reveals new insights. By analyzing the literature citing the use of the dataset and subsequently conducting a comparative analysis comparing the RNA-Seq data analyses of different subsets of the dataset (healthy children, nonagenarians and progeria patients), we identified several genes involved in both natural aging and progeria (KRT8, KRT18, ACKR4, CCL2, UCP2, ADAMTS15, ACTN4P1, WNT16, IGFBP2). Further analyzing these genes and the pathways involved indicated their possible roles in aging, suggesting the need for further in vitro and in vivo research. In this paper, we (1) compare “normal aging” (nonagenarians vs. healthy children) and progeria (HGPS patients vs. healthy children), (2) list genes possibly involved in both the natural aging process and progeria, including the first mention of IGFBP2 in progeria, (3) predict miRNAs and interactomes for WNT16 (hsa-mir-181a-5p), UCP2 (hsa-mir-26a-5p and hsa-mir-124-3p), and IGFBP2 (hsa-mir-124-3p, hsa-mir-126-3p, and hsa-mir-27b-3p), (4) demonstrate the compatibility of well-established R packages for RNA-Seq analysis for researchers interested but not yet familiar with this kind of analysis, and (5) present comparative proteomics analyses to show an association between our RNA-Seq data analyses and corresponding changes in protein expression.
(This article belongs to the Special Issue Omics Data Analysis and Integration in Complex Diseases)

21 pages, 12555 KiB  
Article
Integrated Bioinformatics Analysis of the Hub Genes Involved in Irinotecan Resistance in Colorectal Cancer
by Jakub Kryczka and Joanna Boncela
Biomedicines 2022, 10(7), 1720; https://doi.org/10.3390/biomedicines10071720 - 16 Jul 2022
Cited by 2 | Viewed by 2124
Abstract
Different drug combinations including irinotecan remain some of the most important therapeutic modalities in treating colorectal cancer (CRC). However, chemotherapy often leads to the acquisition of cancer drug resistance. To bridge the gap between in vitro and in vivo models, we compared the mRNA expression profiles of CRC cell lines (HT29, HCT116, and LoVo and their respective irinotecan-resistant variants) with patient samples to select new candidate genes for the validation of irinotecan resistance. Data were downloaded from the Gene Expression Omnibus (GEO) (GSE42387, GSE62080, and GSE18105) and the Human Protein Atlas databases and were subjected to an integrated bioinformatics analysis. The protein–protein interaction (PPI) network of differentially expressed genes (DEGs) between FOLFIRI-resistant and -sensitive CRC patients delivered several potential irinotecan resistance markers: NDUFA2, SDHD, LSM5, DCAF4, COX10, RBM8A, TIMP1, QKI, TGOLN2, and PTGS2. The chosen DEGs were used to validate irinotecan-resistant cell line models, proving their substantial phylogenetic heterogeneity. These results indicated that in vitro models are highly limited and favor different mechanisms than in vivo, patient-derived ones. Thus, cell lines can be perfectly utilized to analyze specific mechanisms on their molecular levels but cannot mirror the complicated drug resistance network observed in patients.

15 pages, 2179 KiB  
Article
Machine Learning-Based Retention Time Prediction of Trimethylsilyl Derivatives of Metabolites
by Sara M. de Cripan, Adrià Cereto-Massagué, Pol Herrero, Andrei Barcaru, Núria Canela and Xavier Domingo-Almenara
Biomedicines 2022, 10(4), 879; https://doi.org/10.3390/biomedicines10040879 - 11 Apr 2022
Cited by 7 | Viewed by 2381
Abstract
In gas chromatography–mass spectrometry-based untargeted metabolomics, metabolites are identified by comparing mass spectra and chromatographic retention time with reference databases or standard materials. In that sense, machine learning has been used to predict the retention time of metabolites lacking reference data. However, the retention time prediction of trimethylsilyl derivatives of metabolites, typically analyzed in untargeted metabolomics using gas chromatography, has been poorly explored. Here, we provide a rationalized framework for machine learning-based retention time prediction of trimethylsilyl derivatives of metabolites in gas chromatography. We compared different machine learning paradigms, in addition to exploring the influence of the computational molecular structure representation to train the prediction models: fingerprint class and fingerprint calculation software. Our study challenged predicted retention time when using chemical ionization and electron impact ionization sources in simulated and real cases, demonstrating a good correct identity ranking capability by machine learning, despite observing a limited false identity filtering power in cases where a spectrum or a monoisotopic mass matches multiple candidates. Specifically, machine learning prediction yielded median absolute and relative retention index (relative retention time) errors of 37.1 retention index units and 2%, respectively. In addition, fingerprint class and fingerprint calculation software, as well as the molecular structural similarity between the training and test or real case sets, proved to be critical modulators of the prediction performance. Finally, we leveraged the structural similarity between the training and test or real case set to determine the probability that the prediction error is below a specific threshold. Overall, our study demonstrates that predicted retention time can provide insights into the true structure of unknown metabolites by ranking from the most to the least plausible molecular identity, and sets the guidelines to assess the confidence in metabolite identification using predicted retention time data.

24 pages, 3562 KiB  
Article
CD5 Deficiency Alters Helper T Cell Metabolic Function and Shifts the Systemic Metabolome
by Kiara V. Whitley, Claudia M. Tellez Freitas, Carlos Moreno, Christopher Haynie, Joshua Bennett, John C. Hancock, Tyler D. Cox, Brett E. Pickett and K. Scott Weber
Biomedicines 2022, 10(3), 704; https://doi.org/10.3390/biomedicines10030704 - 18 Mar 2022
Cited by 1 | Viewed by 2377
Abstract
Metabolic function plays a key role in immune cell activation, destruction of foreign pathogens, and memory cell generation. As T cells are activated, their metabolic profile is significantly changed due to signaling cascades mediated by the T cell receptor (TCR) and co-receptors found on their surface. CD5 is a T cell co-receptor that regulates thymocyte selection and peripheral T cell activation. The removal of CD5 enhances T cell activation and proliferation, but how this is accomplished is not well understood. We examined how CD5 specifically affects CD4+ T cell metabolic function and systemic metabolome by analyzing serum and T cell metabolites from CD5WT and CD5KO mice. We found that CD5 removal depletes certain serum metabolites, and CD5KO T cells have higher levels of several metabolites. Transcriptomic analysis identified several upregulated metabolic genes in CD5KO T cells. Bioinformatic analysis identified glycolysis and the TCA cycle as metabolic pathways promoted by CD5 removal. Functional metabolic analysis demonstrated that CD5KO T cells have higher oxygen consumption rates (OCR) and higher extracellular acidification rates (ECAR). Together, these findings suggest that the loss of CD5 is linked to CD4+ T cell metabolism changes in metabolic gene expression and metabolite concentration.

12 pages, 1406 KiB  
Article
Functional Enrichment Analysis of Regulatory Elements
by Adrian Garcia-Moreno, Raul López-Domínguez, Juan Antonio Villatoro-García, Alberto Ramirez-Mena, Ernesto Aparicio-Puerta, Michael Hackenberg, Alberto Pascual-Montano and Pedro Carmona-Saez
Biomedicines 2022, 10(3), 590; https://doi.org/10.3390/biomedicines10030590 - 03 Mar 2022
Cited by 48 | Viewed by 5615
Abstract
Statistical methods for enrichment analysis are important tools to extract biological information from omics experiments. Although these methods have been widely used for the analysis of gene and protein lists, the development of high-throughput technologies for regulatory elements demands dedicated statistical and bioinformatics tools. Here, we present a set of enrichment analysis methods for regulatory elements, including CpG sites, miRNAs, and transcription factors. Statistical significance is determined via a power weighting function for target genes and tested by the Wallenius noncentral hypergeometric distribution model to avoid selection bias. These new methodologies have been applied to the analysis of a set of miRNAs associated with arrhythmia, showing the potential of this tool to extract biological information from a list of regulatory elements. These new methods are available in GeneCodis 4, a web tool able to perform singular and modular enrichment analysis that allows the integration of heterogeneous information.

14 pages, 2760 KiB  
Article
A Gene-Based Machine Learning Classifier Associated to the Colorectal Adenoma—Carcinoma Sequence
by Antonio Lacalamita, Emanuele Piccinno, Viviana Scalavino, Roberto Bellotti, Gianluigi Giannelli and Grazia Serino
Biomedicines 2021, 9(12), 1937; https://doi.org/10.3390/biomedicines9121937 - 17 Dec 2021
Cited by 5 | Viewed by 2982
Abstract
Colorectal cancer (CRC) carcinogenesis is generally the result of the sequential mutation and deletion of various genes; this is known as the normal mucosa–adenoma–carcinoma sequence. The aim of this study was to develop a predictor-classifier during the “adenoma-carcinoma” sequence using microarray gene expression profiles of primary CRC, adenoma, and normal colon epithelial tissues. Four gene expression profiles from the Gene Expression Omnibus database, containing 465 samples (105 normal, 155 adenoma, and 205 CRC), were preprocessed to identify differentially expressed genes (DEGs) between adenoma tissue and primary CRC. The feature selection procedure, using the sequential Boruta algorithm and Stepwise Regression, determined 56 highly important genes. K-Means methods showed that, using the selected 56 DEGs, the three groups were clearly separate. The classification was performed with machine learning algorithms such as Linear Model (LM), Random Forest (RF), k-Nearest Neighbors (k-NN), and Artificial Neural Network (ANN). The best classification method in terms of accuracy (88.06 ± 0.70) and AUC (92.04 ± 0.47) was k-NN. To confirm the relevance of the predictive models, we applied the four models on a validation cohort: the k-NN model remained the best model in terms of performance, with 91.11% accuracy. Among the 56 DEGs, we identified 17 genes with an ascending or descending trend through the normal mucosa–adenoma–carcinoma sequence. Moreover, using the survival information of the TCGA database, we selected six DEGs related to patient prognosis (SCARA5, PKIB, CWH43, TEX11, METTL7A, and VEGFA). The six-gene-based classifier described in the current study could be used as a potential biomarker for the early diagnosis of CRC.

18 pages, 2926 KiB  
Article
DNA Methylation Signature in Mononuclear Cells and Proinflammatory Cytokines May Define Molecular Subtypes in Sporadic Meniere Disease
by Marisa Flook, Alba Escalera-Balsera, Alvaro Gallego-Martinez, Juan Manuel Espinosa-Sanchez, Ismael Aran, Andres Soto-Varela and Jose Antonio Lopez-Escamez
Biomedicines 2021, 9(11), 1530; https://doi.org/10.3390/biomedicines9111530 - 25 Oct 2021
Cited by 8 | Viewed by 2998
Abstract
Meniere Disease (MD) is a multifactorial disorder of the inner ear characterized by vertigo attacks associated with sensorineural hearing loss and tinnitus with a significant heritability. Although MD has been associated with several genes, no epigenetic studies have been performed on MD. Here we performed whole-genome bisulfite sequencing in 14 MD patients and six healthy controls, with the aim of identifying an MD methylation signature and potential disease mechanisms. We observed a high number of differentially methylated CpGs (DMCs) when comparing MD patients to controls (n = 9545), several of them in hearing loss genes, such as PCDH15, ADGRV1 and CDH23. Bioinformatic analyses of DMCs and cis-regulatory regions predicted phenotypes related to abnormal excitatory postsynaptic currents, abnormal NMDA-mediated receptor currents and abnormal glutamate-mediated receptor currents when comparing MD to controls. Moreover, we identified various DMCs in genes previously associated with cochleovestibular phenotypes in mice. We have also found 12 undermethylated regions (UMRs) that were exclusive to MD, including two UMRs in an inter CpG island in the PHB gene. We suggest that the DNA methylation signature allows distinguishing between MD patients and controls. The enrichment analysis confirms previous findings of a chronic inflammatory process underlying MD.

Review

15 pages, 1330 KiB  
Review
Single-Cell Analysis Using Machine Learning Techniques and Its Application to Medical Research
by Ken Asada, Ken Takasawa, Hidenori Machino, Satoshi Takahashi, Norio Shinkai, Amina Bolatkan, Kazuma Kobayashi, Masaaki Komatsu, Syuzo Kaneko, Koji Okamoto and Ryuji Hamamoto
Biomedicines 2021, 9(11), 1513; https://doi.org/10.3390/biomedicines9111513 - 21 Oct 2021
Cited by 12 | Viewed by 4577
Abstract
In recent years, the diversity of cancer cells in tumor tissues as a result of intratumor heterogeneity has attracted attention. In particular, the development of single-cell analysis technology has made a significant contribution to the field; technologies that are centered on single-cell RNA sequencing (scRNA-seq) have been reported to analyze cancer constituent cells, identify cell groups responsible for therapeutic resistance, and analyze gene signatures of resistant cell groups. However, although single-cell analysis is a powerful tool, various issues have been reported, including batch effects and transcriptional noise due to gene expression variation and mRNA degradation. To overcome these issues, machine learning techniques are currently being introduced for single-cell analysis, and promising results are being reported. In addition, machine learning has also been used in various ways for single-cell analysis, such as single-cell assay of transposase accessible chromatin sequencing (ATAC-seq), chromatin immunoprecipitation sequencing (ChIP-seq) analysis, and multi-omics analysis; thus, it contributes to a deeper understanding of the characteristics of human diseases, especially cancer, and supports clinical applications. In this review, we present a comprehensive introduction to the implementation of machine learning techniques in medical research for single-cell analysis, and discuss their usefulness and future potential.