Multi-omics Data Integration in Complex Diseases

A special issue of Biology (ISSN 2079-7737). This special issue belongs to the section "Bioinformatics".

Deadline for manuscript submissions: 30 June 2024 | Viewed by 18143

Special Issue Editors

Dr. Min Zhao
School of Science, Technology and Engineering, University of the Sunshine Coast, Maroochydore, QLD DC 4558, Australia
Interests: bioinformatics; disease genomics; big data integration

Dr. Ruifeng Hu
Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
Interests: medical genomics; database construction; omics-based data integration

Dr. Dario Di Silvestre
Institute for Biomedical Technologies-National Research Council (ITB-CNR), Via Fratelli Cervi 93, 20090 Segrate, MI, Italy
Interests: proteomics; liquid chromatography; mass-spectrometry; computational biology methods; biomarker discovery; systems biology; protein–protein interaction network; co-expression networks

Special Issue Information

Dear Colleagues,

Each year, complex genetic diseases are responsible for millions of deaths worldwide. Genetic changes can have effects at multiple levels, such as altered gene expression, post-transcriptional dysregulation, altered protein expression, and protein mis-localization. Despite developments in large-scale genomics, transcriptomics, and proteomics technologies, it is still challenging to interpret molecular intricacies and variations across these levels. Multi-omics data integration is therefore a critical step in advancing our understanding of the molecular mechanisms of human disease.

In this fast-growing multidisciplinary field, the use of genomics, other big data, and artificial intelligence will improve population health through more accurate diagnostic and prognostic prediction. This Special Issue therefore intends to cover manuscripts that provide valuable insight for both the basic and clinical research communities. In particular, we encourage submissions on: (1) molecular insights into common and unique disease mechanisms obtained using multiple omics technologies; (2) novel computational frameworks for genetic variant classification and functional evaluation; (3) practical omics-guided diagnosis, prognosis, and treatment strategies. We are open to different types of manuscripts, including original research and meta-analysis articles, timely reviews, and short communications.

Dr. Min Zhao
Dr. Ruifeng Hu
Dr. Dario Di Silvestre
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Biology is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • omics
  • disease
  • data integration
  • bioinformatics
  • genome

Published Papers (8 papers)

Research

12 pages, 6026 KiB  
Article
Identifying the Common Cell-Free DNA Biomarkers across Seven Major Cancer Types
Biology 2023, 12(7), 934; https://doi.org/10.3390/biology12070934 - 29 Jun 2023
Viewed by 978
Abstract
Blood-based detection of circulating cell-free DNA (cfDNA) is a non-invasive and easily accessible method for early cancer detection. Despite the extensive utility of cfDNA, there are still many challenges to developing clinical biomarkers. For example, cfDNA with genetic alterations often composes a small portion of the DNA circulating in plasma, which can be confounded by cfDNA contributed by normal cells. Therefore, filtering out the potential false-positive cfDNA mutations from healthy populations will be important for cancer-based biomarkers. Additionally, many low-frequency genetic alterations are easily overlooked in a small number of cfDNA-based cancer tests. We hypothesize that the combination of diverse types of cancer studies on cfDNA will provide us with a new perspective on the identification of low-frequency genetic variants across cancer types for promoting early diagnosis. By building a standardized computational pipeline for 1358 cfDNA samples across seven cancer types, we prioritized 129 shared genetic variants in the major cancer types. Further functional analysis of the 129 variants found that they are mainly enriched in ribosome pathways such as cotranslational protein targeting to the membrane, some of which are tumour suppressors, oncogenes, and genes related to cancer initiation. In summary, our integrative analysis revealed the important roles of ribosomal proteins as common biomarkers in early cancer diagnosis.
(This article belongs to the Special Issue Multi-omics Data Integration in Complex Diseases)

12 pages, 2532 KiB  
Article
A Database of Lung Cancer-Related Genes for the Identification of Subtype-Specific Prognostic Biomarkers
Biology 2023, 12(3), 357; https://doi.org/10.3390/biology12030357 - 24 Feb 2023
Cited by 1 | Viewed by 1768
Abstract
The molecular subtype is critical for accurate treatment and follow-up in patients with lung cancer; however, information regarding subtype-associated genes is dispersed among thousands of published studies. Systematic curation and cross-validation of the scientific literature would provide a solid foundation for comparative genetic studies of the major molecular subtypes of lung cancer. Here, we constructed a literature-based lung cancer gene database (LCGene). In the current release, we collected and curated 2507 unique human genes, including 2267 protein-coding and 240 non-coding genes from comprehensive manual examination of 10,960 PubMed article abstracts. Extensive annotations were added to aid identification of differentially expressed genes, potential gene editing sites, and non-coding gene regulation. For instance, we prepared 607 curated genes with CRISPR knockout information in 43 lung cancer cell lines. Further comparison of these implicated genes among different subtypes identified several subtype-specific genes with high mutational frequencies. Common tumor suppressors and oncogenes shared by lung adenocarcinoma and lung squamous cell carcinoma, for example, exhibited different mutational frequencies and prognostic features, suggesting the presence of subtype-specific biomarkers. Our retrospective analysis revealed 43 small cell lung cancer-specific genes. Moreover, 52 tumor suppressors and oncogenes shared by lung adenocarcinoma and squamous cell carcinoma confirmed the different molecular mechanisms of these two cancer subtypes. The subtype-based genetic differences, when combined, may provide insight into subtype-specific biomarkers for genetic testing.

28 pages, 2298 KiB  
Article
Interpretable and Predictive Deep Neural Network Modeling of the SARS-CoV-2 Spike Protein Sequence to Predict COVID-19 Disease Severity
Biology 2022, 11(12), 1786; https://doi.org/10.3390/biology11121786 - 08 Dec 2022
Cited by 5 | Viewed by 1656
Abstract
Through the COVID-19 pandemic, SARS-CoV-2 has gained and lost multiple mutations in novel or unexpected combinations. Predicting how complex mutations affect COVID-19 disease severity is critical in planning public health responses as the virus continues to evolve. This paper presents a novel computational framework to complement conventional lineage classification and applies it to predict the severe disease potential of viral genetic variation. The transformer-based neural network model architecture has additional layers that provide sample embeddings and sequence-wide attention for interpretation and visualization. First, training a model to predict SARS-CoV-2 taxonomy validates the architecture’s interpretability. Second, an interpretable predictive model of disease severity is trained on spike protein sequence and patient metadata from GISAID. Confounding effects of changing patient demographics, increasing vaccination rates, and improving treatment over time are addressed by including demographics and case date as independent input to the neural network model. The resulting model can be interpreted to identify potentially significant virus mutations and proves to be a robust predictive tool. Although trained on sequence data obtained entirely before the availability of empirical data for Omicron, the model can predict Omicron’s reduced risk of severe disease, in accord with epidemiological and experimental data.

17 pages, 4266 KiB  
Article
A Genomics Resource for 12 Edible Seaweeds to Predict Seaweed-Secreted Peptides with Potential Anti-Cancer Function
Biology 2022, 11(10), 1458; https://doi.org/10.3390/biology11101458 - 04 Oct 2022
Viewed by 1684
Abstract
Seaweeds are multicellular marine macroalgae with natural compounds that have potential anticancer activity. To date, the identification of those compounds has relied on purification and assay, yet few have been documented. Additionally, the genomes and associated proteomes of edible seaweeds that have been identified thus far are scattered among different resources, with no systematic summary available, which hinders the development of a large-scale omics analysis. To enable this, we constructed a comprehensive genomics resource for the edible seaweeds. These data could be used for systematic metabolomics and a proteome search for anti-cancer compounds and peptides. In brief, we integrated and annotated 12 publicly available edible seaweed genomes (8 species and 268,071 proteins). In addition, we integrated the new seaweed genomic resources with established cancer bioinformatics pipelines to help identify potential seaweed proteins that could help mitigate the development of cancer. We present 7892 protein domains that were predicted to be associated with cancer proteins based on a protein domain–domain interaction. The most enriched protein families were associated with protein phosphorylation and insulin signalling, both of which are recognised to be crucial molecular components for patient survival in various cancers. In addition, we found 6692 seaweed proteins that could interact with over 100 tumour suppressor proteins, of which 147 are predicted to be secreted proteins. In conclusion, our genomics resource not only may be helpful in exploring the genomics features of these edible seaweeds but also may provide a new avenue to explore the molecular mechanisms for seaweed-associated inhibition of human cancer development.

18 pages, 3951 KiB  
Article
Integrated In Silico Analyses Identify PUF60 and SF3A3 as New Spliceosome-Related Breast Cancer RNA-Binding Proteins
Biology 2022, 11(4), 481; https://doi.org/10.3390/biology11040481 - 22 Mar 2022
Cited by 3 | Viewed by 3030
Abstract
More women are diagnosed with breast cancer (BC) than any other type of cancer. Although large-scale efforts have completely redefined cancer, a cure remains unattainable. In that respect, new molecular functions of the cell should be investigated, such as post-transcriptional regulation. RNA-binding proteins (RBPs) are emerging as critical post-transcriptional modulators of tumorigenesis, but only a few have clear roles in BC. To recognize new putative breast cancer RNA-binding proteins, we performed integrated in silico analyses of all human RBPs (n = 1392) in three major cancer databases and identified five putative BC RBPs (PUF60, TFRC, KPNB1, NSF, and SF3A3), which showed robust oncogenic features related to their genomic alterations, immunohistochemical changes, high interconnectivity with cancer driver genes (CDGs), and tumor vulnerabilities. Interestingly, some of these RBPs have never been studied in BC, but their oncogenic functions have been described in other cancer types. Subsequent analyses revealed PUF60 and SF3A3 as central elements of a spliceosome-related cluster involving RBPs and CDGs. Further research should focus on the mechanisms by which these proteins could promote breast tumorigenesis, with the potential to reveal new therapeutic pathways along with novel drug-development strategies.

15 pages, 907 KiB  
Article
Integration of Multimodal Data from Disparate Sources for Identifying Disease Subtypes
Biology 2022, 11(3), 360; https://doi.org/10.3390/biology11030360 - 24 Feb 2022
Cited by 5 | Viewed by 2965
Abstract
Studies over the past decade have generated a wealth of molecular data that can be leveraged to better understand cancer risk, progression, and outcomes. However, understanding the progression risk and differentiating long- and short-term survivors cannot be achieved by analyzing data from a single modality due to the heterogeneity of disease. Using a scientifically developed and tested deep-learning approach that leverages aggregate information collected from multiple repositories with multiple modalities (e.g., mRNA, DNA methylation, miRNA) could lead to a more accurate and robust prediction of disease progression. Here, we propose an autoencoder-based multimodal data fusion system, in which a fusion encoder flexibly integrates collective information available through multiple studies with partially coupled data. Our results on a fully controlled simulation-based study have shown that inferring the missing data through the proposed data fusion pipeline allows a predictor that is superior to other baseline predictors with missing modalities. Results have further shown that short- and long-term survivors of glioblastoma multiforme, acute myeloid leukemia, and pancreatic adenocarcinoma can be successfully differentiated with AUCs of 0.94, 0.75, and 0.96, respectively.

17 pages, 4035 KiB  
Article
Identification of Key Proteins from the Alternative Lengthening of Telomeres-Associated Promyelocytic Leukemia Nuclear Bodies Pathway
Biology 2022, 11(2), 185; https://doi.org/10.3390/biology11020185 - 25 Jan 2022
Cited by 3 | Viewed by 3354
Abstract
Alternative lengthening of telomeres-associated promyelocytic leukemia nuclear bodies (APBs) are a hallmark of telomere maintenance. In the last few years, APBs have been described as the main place where telomeric extension occurs in ALT-positive cancer cell lines. Different sets of proteins have been associated with APB function; however, the molecular mechanisms behind their assembly, colocalization, and clustering of telomeres, among others, remain unclear. To improve the understanding of APBs in the ALT pathway, we integrated multiomics analyses to evaluate genomic, transcriptomic and proteomic alterations, and functional interactions of 71 APBs-related genes/proteins in 32 Pan-Cancer Atlas studies from The Cancer Genome Atlas Consortium (TCGA). As a result, we identified 13 key proteins that showed distinctive mutations, interactions, and functional enrichment patterns across all the cancer types. We propose this set of proteins as candidates for future ex vivo and in vivo validation analyses to improve the understanding of the ALT pathway, fill the current research gap about APB function and the role of APBs in ALT, and be considered as potential therapeutic targets for the diagnosis and treatment of ALT-positive cancers in the future.

Review

23 pages, 1835 KiB  
Review
Integration of Omics Data and Network Models to Unveil Negative Aspects of SARS-CoV-2, from Pathogenic Mechanisms to Drug Repurposing
Biology 2023, 12(9), 1196; https://doi.org/10.3390/biology12091196 - 31 Aug 2023
Viewed by 1115
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused the COVID-19 health emergency, affecting and killing millions of people worldwide. Following SARS-CoV-2 infection, COVID-19 patients show a spectrum of symptoms ranging from asymptomatic to very severe manifestations. In particular, bronchial and pulmonary cells, involved at the initial stage, trigger a hyper-inflammation phase, damaging a wide range of organs, including the heart, brain, liver, intestine and kidney. Due to the urgent need for solutions to limit the virus’ spread, most efforts were initially devoted to mapping outbreak trajectories and variant emergence, as well as to the rapid search for effective therapeutic strategies. Samples collected from hospitalized or dead COVID-19 patients from the early stages of the pandemic have been analyzed over time, and to date they still represent an invaluable source of information to shed light on the molecular mechanisms underlying the organ/tissue damage, the knowledge of which could offer new opportunities for diagnostics and therapeutic designs. For these purposes, in combination with clinical data, omics profiles and network models play a key role in providing a holistic view of the pathways, processes and functions most affected by viral infection. In fact, in addition to epidemiological purposes, networks are being increasingly adopted for the integration of multiomics data, and recently their use has expanded to the identification of drug targets or the repositioning of existing drugs. These topics will be covered here by exploring the landscape of SARS-CoV-2 survey-based studies using systems biology approaches derived from omics data, paying particular attention to those that have considered samples of human origin.
