Machine Learning and Data Analysis in Bioinformatics

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematical Biology".

Deadline for manuscript submissions: closed (31 January 2024) | Viewed by 3476

Special Issue Editor


Dr. Ian Morilla
Guest Editor
1. Laboratory Analysis, Geometry and Applications (LAGA), University Sorbonne Paris Nord, 93430 Paris, France
2. Department of Genetics, University of Malaga, 29010 Malaga, Spain
Interests: unsupervised machine learning methods; topological data analysis and deep learning; hidden patterns; causal relationships from high-dimensional biological data

Special Issue Information

Dear Colleagues,

The field of biological research is generating vast amounts of high-dimensional data that require sophisticated analytical tools to uncover the hidden patterns and causal relationships that underlie biological processes. Unsupervised machine learning methods, such as clustering and dimensionality reduction techniques, have been widely used to identify subgroups within large datasets and to visualize complex data structures.

Recently, topological data analysis (TDA) has emerged as a powerful tool to analyze high-dimensional data and extract meaningful features. By focusing on the shape and structure of the data, rather than just the individual data points, TDA can identify topological features and structures in the data that traditional statistical methods may miss.

Deep learning techniques, such as feed-forward and convolutional neural networks, have also shown great promise in identifying subtle patterns and relationships within large datasets. These methods can learn complex representations of the data and can be used to make accurate predictions based on high-dimensional inputs.

In this Special Issue, we invite researchers to submit their original research articles, reviews, and perspectives on unsupervised machine learning methods, topological data analysis, and deep learning in the context of biological data. We encourage submissions that focus on novel applications, methodologies, and algorithmic developments in these areas, as well as studies that showcase the potential of these techniques for advancing our understanding of complex biological systems.

Dr. Ian Morilla
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • unsupervised machine learning
  • data analysis
  • deep learning
  • hidden patterns
  • causal relationships
  • high-dimensional data
  • biological research
  • network analysis
  • complex data structures

Published Papers (4 papers)

Research

16 pages, 763 KiB  
Article
The Genotypic Imperative: Unraveling Disease-Permittivity in Functional Modules of Complex Diseases
by Abdoul K. Kaba, Kelly L. Vomo-Donfack and Ian Morilla
Mathematics 2023, 11(24), 4916; https://doi.org/10.3390/math11244916 - 11 Dec 2023
Viewed by 557
Abstract
In complex diseases, the interactions among genes are commonly elucidated through the lens of graphs. Amongst these genes, certain ones form bi-functional modules within the graph, contingent upon their (anti)correlation with a specific functional state, such as susceptibility to a genetic disorder of non-Mendelian traits. Consequently, a disease can be delineated by a finite number of these discernible modules. Within each module, there exist allelic variants that pose a genetic risk, thus qualifying as genetic risk factors. These factors precipitate a permissive state, which, if all other modules also align in the same permissive state, can ultimately lead to the onset of the disease in an individual. To gain a deeper insight into the incidence of a disease, it becomes imperative to acquire a comprehensive understanding of the genetic transmission of these factors. In this work, we present a non-linear model for this transmission, drawing inspiration from the classic theory of the Bell experiment. This model aids in elucidating the variances observed in SNP interactions concerning the risk of disease.
(This article belongs to the Special Issue Machine Learning and Data Analysis in Bioinformatics)

15 pages, 364 KiB  
Article
A Novel Nonlinear Dynamic Model Describing the Spread of Virus
by Veli B. Shakhmurov, Muhammet Kurulay, Aida Sahmurova, Mustafa Can Gursesli and Antonio Lanata
Mathematics 2023, 11(20), 4226; https://doi.org/10.3390/math11204226 - 10 Oct 2023
Viewed by 745
Abstract
This study proposes a nonlinear mathematical model of virus transmission. The interaction between viruses and immune cells is investigated using phase-space analysis. Specifically, the work focuses on the dynamics and stability behavior of the mathematical model of a virus spread in a population and its interaction with human immune system cells. The endemic equilibrium points are found, and local stability analysis of all equilibrium points of the related model is obtained. Further, the global stability analysis, either at disease-free equilibria or in endemic equilibria, is discussed by constructing the Lyapunov function, which shows the validity of the model concerned. Finally, a simulated solution is achieved, and the relationship between viruses and immune cells is highlighted.
(This article belongs to the Special Issue Machine Learning and Data Analysis in Bioinformatics)

13 pages, 3081 KiB  
Article
A Novel Meta-Analysis-Based Regularized Orthogonal Matching Pursuit Algorithm to Predict Lung Cancer with Selected Biomarkers
by Sai Wang, Bin-Yuan Wang and Hai-Fang Li
Mathematics 2023, 11(19), 4171; https://doi.org/10.3390/math11194171 - 05 Oct 2023
Viewed by 793
Abstract
Biomarker selection for predictive analytics encounters the problem of identifying a minimal-size subset of genes that is maximally predictive of an outcome of interest. For lung cancer gene expression datasets, it is a great challenge to handle the characteristics of small sample size, high dimensionality, and high noise, as well as the low reproducibility of important biomarkers in different studies. In this paper, our proposed meta-analysis-based regularized orthogonal matching pursuit (MA-ROMP) algorithm not only gains strength by using multiple datasets to identify important genomic biomarkers efficiently, but also keeps the selection flexible among datasets to take into account data heterogeneity through a hierarchical decomposition on regression coefficients. For a case study of lung cancer, we downloaded GSE10072, GSE19188 and GSE19804 from the GEO database with inconsistent experimental conditions, sample preparation methods, different study groups, etc. Compared with state-of-the-art methods, our method shows the highest accuracy, of up to 95.63%, with the best discriminative ability (AUC 0.9756) as well as a more than 15-fold decrease in its training time. The experimental results on both simulated data and several lung cancer gene expression datasets demonstrate that MA-ROMP is a more effective tool for biomarker selection and learning cancer prediction.
(This article belongs to the Special Issue Machine Learning and Data Analysis in Bioinformatics)
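
For readers unfamiliar with the base technique, the following is a minimal sketch of plain orthogonal matching pursuit (OMP), the greedy sparse-selection procedure that MA-ROMP extends. It is not the authors' MA-ROMP: the meta-analysis across datasets, the regularization step, and the hierarchical decomposition of regression coefficients are all omitted. The function name omp and its parameters are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def omp(X, y, k):
    """Plain orthogonal matching pursuit (illustrative, not MA-ROMP).

    Greedily selects up to k columns of X (e.g. genes) that best explain y,
    refitting least squares on the selected columns after every pick.
    Returns the sparse coefficient vector and the sorted selected indices.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []            # indices of the columns selected so far
    beta = np.empty(0)      # least-squares coefficients on the support

    for _ in range(k):
        # Score each column by its absolute correlation with the residual.
        corr = np.abs(X.T @ residual)
        if support:
            corr[support] = -np.inf  # never select the same column twice
        support.append(int(np.argmax(corr)))

        # Orthogonal step: refit on all selected columns, update residual.
        beta, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ beta

    coef = np.zeros(X.shape[1])
    if support:
        coef[support] = beta
    return coef, sorted(support)
```

In a biomarker setting, the rows of X would be samples and the columns gene expression features, so the returned support plays the role of the selected gene subset.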

16 pages, 3260 KiB  
Article
An Improved Graph Isomorphism Network for Accurate Prediction of Drug–Drug Interactions
by Sile Wang, Xiaorui Su, Bowei Zhao, Pengwei Hu, Tao Bai and Lun Hu
Mathematics 2023, 11(18), 3990; https://doi.org/10.3390/math11183990 - 20 Sep 2023
Viewed by 1066
Abstract
Drug–drug interaction (DDI) prediction is one of the essential tasks in drug development to ensure public health and patient safety. Drug combinations with potentially severe DDIs have been verified to threaten the safety of patients critically, and it is therefore of great significance to develop effective computational algorithms for identifying potential DDIs in clinical trials. By modeling DDIs with a graph structure, recent attempts have been made to solve the prediction problem of DDIs by using advanced graph representation learning techniques. Still, their representational capacity is limited by isomorphic structures that are frequently observed in DDI networks. To address this problem, we propose a novel algorithm called DDIGIN to predict DDIs by incorporating a graph isomorphism network (GIN) such that more discriminative representations of drugs can be learned for improved performance. Given a DDI network, DDIGIN first initializes the representations of drugs with Node2Vec according to the topological structure and then optimizes these representations by propagating and aggregating the first-order neighboring information in an injective way. By doing so, more powerful representations can be learned for drugs with isomorphic structures. Last, DDIGIN estimates the interaction probability for pairwise drugs by multiplying their representations in an end-to-end manner. Experimental results demonstrate that DDIGIN outperforms several state-of-the-art algorithms on the ogbl-ddi (Acc = 0.8518, AUC = 0.8594, and AUPR = 0.9402) and DDInter datasets (Acc = 0.9763, AUC = 0.9772, and AUPR = 0.9868). In addition, our case study indicates that incorporating GIN enhances the expressive power of drug representations for improved performance of DDI prediction.
(This article belongs to the Special Issue Machine Learning and Data Analysis in Bioinformatics)
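
The aggregation step described in this abstract can be illustrated with a minimal sketch of one standard GIN layer followed by a pairwise score. This is not the DDIGIN implementation. It assumes the usual GIN update, h_v' = MLP((1 + eps) * h_v + sum of neighbor h_u), a two-layer MLP without biases, and a sigmoid of the dot product as the reading of "multiplying their representations". The Node2Vec initialization is left out, so the input embeddings H are simply supplied by the caller. All function and parameter names are hypothetical.

```python
import numpy as np


def gin_layer(H, A, W1, W2, eps=0.0):
    """One GIN-style update on node embeddings (illustrative sketch).

    H  : (n_nodes, d) current drug embeddings
    A  : (n_nodes, n_nodes) adjacency matrix without self-loops
    W1 : (d, hidden) and W2 : (hidden, d_out) weights of a bias-free MLP
    """
    # Sum aggregation over first-order neighbors plus a weighted self term.
    aggregated = (1.0 + eps) * H + A @ H
    # Two-layer MLP with a ReLU in between.
    return np.maximum(aggregated @ W1, 0.0) @ W2


def interaction_probability(H, i, j):
    """Sigmoid of the dot product of two drug embeddings (assumed scoring)."""
    return 1.0 / (1.0 + np.exp(-(H[i] @ H[j])))
```

Sum aggregation, unlike mean or max, keeps neighborhoods with different multisets of embeddings distinguishable, which is the injectivity property the abstract refers to.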
