Mathematical Modelling and Machine Learning Methods for Bioinformatics and Data Science Applications

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: closed (15 October 2021) | Viewed by 16770

Printed Edition Available!
A printed edition of this Special Issue is available here.

Special Issue Editors


Guest Editor
Department of Information Engineering and Mathematics, University of Siena, 53100 Siena, Italy
Interests: artificial intelligence; machine learning; deep learning; approximation theory; bioinformatics; image processing

Guest Editor
Department of Information Engineering and Mathematics, University of Siena, 53100 Siena, Italy
Interests: numerical analysis; scientific computing; geometric modeling; constrained interpolation and approximation; isogeometric analysis

Special Issue Information

Dear Colleagues,

While statistics has historically been used to solve problems in data science, mathematical methods and machine learning (ML) can be extremely helpful, especially for building concise decision models, making fast approximations, and predicting evolving phenomena based on known samples. In particular, mathematical modeling and machine learning methods are increasingly used to help interpret biomedical data produced by high-throughput genomics and proteomics projects. Indeed, as the study of biological systems becomes more quantitative, the role played by mathematical analysis increases. This ranges from the macroscopic (e.g., how to model the spread of a disease across a community) to the microscopic (e.g., how to determine the three-dimensional structure of proteins from the knowledge of their amino acid sequence).

The revolution in biological and information technologies has produced huge amounts of data and is accelerating the process of knowledge discovery from biological systems. Furthermore, clinical data complement biological data, allowing for detailed descriptions of both healthy and diseased states, as well as disease progression and response to therapies. With medical imaging playing an increasingly prominent role in disease diagnosis, interest in medical image processing has also increased significantly over the past decades, with deep learning methods attracting more and more attention.

However, although advances in machine learning algorithms have been deemed critical for improving performance in analyzing huge datasets, their opacity, if not supported by prior mathematical modeling of the problem, could prevent human experts, and especially doctors, from trusting these algorithms and their results.

This Special Issue provides a platform for researchers from academia and industry to present their new and unpublished work and to promote future studies in the emerging field of applying mathematically grounded ML models to highly sensitive data.

Dr. Maria Lucia Sampoli
Dr. Monica Bianchini
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Mathematical modeling
  • Numerical methods
  • Machine learning
  • Bioinformatics
  • Data science

Published Papers (6 papers)

Research

14 pages, 10125 KiB  
Article
A Mixed Statistical and Machine Learning Approach for the Analysis of Multimodal Trail Making Test Data
by Niccolò Pancino, Caterina Graziani, Veronica Lachi, Maria Lucia Sampoli, Emanuel Ștefǎnescu, Monica Bianchini and Giovanna Maria Dimitri
Mathematics 2021, 9(24), 3159; https://doi.org/10.3390/math9243159 - 8 Dec 2021
Cited by 7 | Viewed by 3065
Abstract
Eye-tracking can offer a novel clinical practice and a non-invasive tool to detect neuropathological syndromes. In this paper, we present an analysis of data obtained from the visual sequential search test. Such a test can be used to evaluate the capacity of looking at objects in a specific order, and its successful execution requires the optimization of the perceptual resources of foveal and extrafoveal vision. The main objective of this work is to determine whether patterns can be found within the data that discriminate among people with chronic pain, extrapyramidal patients and healthy controls. We employed statistical tests to evaluate differences among groups, considering three novel indicators: blinking rate, average blinking duration and maximum pupil size variation. Additionally, to separate the three groups based on scan-path images, which appear very noisy and all similar to each other, we applied deep learning techniques to embed them into a larger transformed space. We then applied a clustering approach to correctly detect and classify the three cohorts. Preliminary experiments show promising results.

13 pages, 937 KiB  
Article
Visual Sequential Search Test Analysis: An Algorithmic Approach
by Giuseppe Alessio D’Inverno, Sara Brunetti, Maria Lucia Sampoli, Dafin Fior Muresanu, Alessandra Rufa and Monica Bianchini
Mathematics 2021, 9(22), 2952; https://doi.org/10.3390/math9222952 - 18 Nov 2021
Cited by 2 | Viewed by 1881
Abstract
In this work, we present an algorithmic approach to the analysis of the Visual Sequential Search Test (VSST) based on the episode matching method. The data set included two groups of patients, one with Parkinson’s disease and another with chronic pain syndrome, along with a control group. The VSST is an eye-tracking-based modified version of the Trail Making Test (TMT), which evaluates higher-order cognitive functions. The episode matching method is traditionally used in bioinformatics applications. Here it is used in a different context, to assign scores to a set of patients performing a specific VSST task. Experimental results provide statistical evidence of different behaviour among the classes of patients, according to their pathologies.
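As a rough illustration of the episode matching idea mentioned in this abstract, the sketch below solves the textbook form of the problem: find the shortest window of a sequence that contains a given pattern as a subsequence. It is not the scoring procedure of the paper; the function name `shortest_episode` and the toy gaze sequence are hypothetical and chosen only for this example.

```python
def shortest_episode(text, pattern):
    """Return (start, end) of the shortest slice text[start:end] that
    contains `pattern` as a subsequence, or None if there is no match.

    Illustrative brute-force version, O(len(text)^2) in the worst case.
    """
    if not pattern:
        return (0, 0)
    best = None
    n = len(text)
    for start in range(n):
        # A minimal window must begin with the first pattern symbol.
        if text[start] != pattern[0]:
            continue
        j = 0  # index of the next pattern symbol still to be matched
        for end in range(start, n):
            if text[end] == pattern[j]:
                j += 1
                if j == len(pattern):
                    # Greedy matching gives the earliest possible end
                    # for this start, hence the shortest window.
                    if best is None or end + 1 - start < best[1] - best[0]:
                        best = (start, end + 1)
                    break
    return best


# Hypothetical gaze sequence: targets 1-4 visited with distractors a, b.
gaze = "1a2b1234"
window = shortest_episode(gaze, "1234")
print(window, gaze[window[0]:window[1]])
```

A shorter window means the target order was followed with fewer detours, which is one plausible way such a match could be turned into a score.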

13 pages, 3818 KiB  
Article
Machine Learning Techniques Applied to Predict Tropospheric Ozone in a Semi-Arid Climate Region
by Md Al Masum Bhuiyan, Ramanjit K. Sahi, Md Romyull Islam and Suhail Mahmud
Mathematics 2021, 9(22), 2901; https://doi.org/10.3390/math9222901 - 15 Nov 2021
Cited by 5 | Viewed by 2060
Abstract
In the last decade, ground-level ozone exposure has led to a significant increase in environmental and health risks. Thus, it is essential to measure and monitor atmospheric ozone concentration levels. Specifically, recent improvements in machine learning (ML) processes, based on statistical modeling, have provided a better approach to addressing these risks. In this study, we compare Naive Bayes, K-Nearest Neighbors, Decision Tree, Stochastic Gradient Descent, and Extreme Gradient Boosting (XGBoost) algorithms and their ensemble technique to classify ground-level ozone concentration in the El Paso-Juarez area. As El Paso-Juarez is a non-attainment city, the concentrations of several air pollutants and meteorological parameters were analyzed. We found that the ensemble (soft voting classifier) of the algorithms used in this paper provides high classification accuracy (94.55%) for the ozone dataset. Furthermore, variables that are highly responsible for the high ozone concentration, such as nitrogen oxides (NOx), wind speed and gust, and solar radiation, were identified.

16 pages, 8709 KiB  
Article
A Multi-Stage GAN for Multi-Organ Chest X-ray Image Generation and Segmentation
by Giorgio Ciano, Paolo Andreini, Tommaso Mazzierli, Monica Bianchini and Franco Scarselli
Mathematics 2021, 9(22), 2896; https://doi.org/10.3390/math9222896 - 14 Nov 2021
Cited by 8 | Viewed by 2610
Abstract
Multi-organ segmentation of X-ray images is of fundamental importance for computer-aided diagnosis systems. However, the most advanced semantic segmentation methods rely on deep learning and require a huge amount of labeled images, which are rarely available due to both the high cost of human resources and the time required for labeling. In this paper, we present a novel multi-stage generation algorithm based on Generative Adversarial Networks (GANs) that can produce synthetic images along with their semantic labels and can be used for data augmentation. The main feature of the method is that, unlike other approaches, generation occurs in several stages, which simplifies the procedure and allows it to be used on very small datasets. The method was evaluated on the segmentation of chest radiographic images, showing promising results. The multi-stage approach achieves state-of-the-art results and, when very few images are used to train the GANs, outperforms the corresponding single-stage approach.

18 pages, 481 KiB  
Article
Interactions Obtained from Basic Mechanistic Principles: Prey Herds and Predators
by Cecilia Berardo, Iulia Martina Bulai and Ezio Venturino
Mathematics 2021, 9(20), 2555; https://doi.org/10.3390/math9202555 - 12 Oct 2021
Cited by 2 | Viewed by 2100
Abstract
We investigate four predator–prey Rosenzweig–MacArthur models in which the prey exhibit herd behaviour and only the individuals on the edge of the herd are subjected to the predators’ attacks. The key concept is the herding index, i.e., the parameter defining the characteristic shape of the herd. We derive the population equations from the individual state transitions using the mechanistic approach and time scale separation method. We consider one predator and one prey species, linear and hyperbolic responses and the occurrence of predators’ intraspecific competition. For all models, we study the equilibria and their stability and we give the bifurcation analysis. We use standard numerical methods and the software Xppaut to obtain the one-parameter and two-parameter bifurcation diagrams.
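For orientation, a generic Rosenzweig–MacArthur-type system with a herding index can be written as below. This is an illustrative textbook-style form, not the equations derived in the paper, and all symbols are assumptions introduced here: N and P are prey and predator densities, r and K the prey growth rate and carrying capacity, a the attack rate, h the handling time, e the conversion efficiency, m the predator mortality, and α the herding index.

```latex
\begin{align}
\frac{dN}{dt} &= r N \left(1 - \frac{N}{K}\right)
                 - \frac{a N^{\alpha} P}{1 + a h N^{\alpha}}, \\
\frac{dP}{dt} &= \frac{e\, a N^{\alpha} P}{1 + a h N^{\alpha}} - m P .
\end{align}
```

Under these assumptions, α = 1 recovers the classical model in which every prey individual is exposed, α = 1/2 corresponds to a herd occupying a two-dimensional region whose edge scales with the square root of its size, and h = 0 reduces the hyperbolic response to a linear one in N^α.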

14 pages, 289 KiB  
Article
Alzheimer Identification through DNA Methylation and Artificial Intelligence Techniques
by Gerardo Alfonso Perez and Javier Caballero Villarraso
Mathematics 2021, 9(19), 2482; https://doi.org/10.3390/math9192482 - 4 Oct 2021
Cited by 4 | Viewed by 1924
Abstract
A nonlinear approach to identifying combinations of CpG DNA methylation data as biomarkers for Alzheimer's disease (AD) is presented in this paper. It will be shown that the presented algorithm can substantially reduce the number of CpGs used while generating forecasts that are more accurate than those obtained using all the CpGs available. It is assumed that the process, in principle, can be nonlinear; hence, a nonlinear approach might be more appropriate. The proposed algorithm selects which CpGs to use as input data in a classification problem that tries to distinguish between patients suffering from AD and healthy control individuals. This type of classification problem is suitable for techniques such as support vector machines. The algorithm was used both at the single-dataset level and with multiple datasets. Developing robust algorithms for multiple datasets is challenging, due to the impact that small differences in laboratory procedures have on the obtained data. The approach followed in the paper can be expanded to multiple datasets, allowing for a gradually more granular understanding of the underlying process. A 92% successful classification rate was obtained using the proposed method, which is higher than the result obtained using all the CpGs available. This is likely due to the reduction in the dimensionality of the data obtained by the algorithm, which, in turn, helps to reduce the risk of reaching a local minimum.