Special Issue "Application of Bioinformatics in Microbiome"

A special issue of Genes (ISSN 2073-4425). This special issue belongs to the section "Microbial Genetics and Genomics".

Deadline for manuscript submissions: 25 December 2023 | Viewed by 5233

Special Issue Editor

The Forsyth Institute, Cambridge, MA 02142, USA
Interests: microbiome research; next-generation sequencing; gene markers; genomics; meta-genomics; meta-transcriptomics; bioinformatics analysis; artificial intelligence

Special Issue Information

Dear Colleagues,

The microorganisms found in an environment, together with their activities, are collectively referred to as the microbiome. The advent of sequencing technologies has enabled new means of studying microbiomes; it is now possible to answer the questions “who is there”, “what can they do” and “what are they doing” for a habitat by sequencing its marker genes, meta-genomes and meta-transcriptomes, respectively. High-throughput next-generation sequencing (NGS) has produced an explosion of sequencing data, which require novel and efficient computing algorithms, software and pipelines to process, manage and interpret the information they contain. Bioinformatics is an interdisciplinary field of science that combines biology, computer science, informatics, mathematics and statistics to analyze and interpret biological and clinical data. It is essential for extracting meaningful information from cryptic NGS data.

This Special Issue aims to demonstrate the latest developments in bioinformatics in the NGS era that have helped advance our understanding of the microbiome. The issue’s scope includes, but is not limited to, the following topics:

  1. New computing algorithms for analyzing NGS data;
  2. Data management platforms (online or standalone desktop databases) for NGS data;
  3. Pipeline development (a collection of analytic software streamlined from upstream to downstream applications);
  4. Bioinformatics applications on cloud platforms;
  5. Artificial intelligence (AI) in microbiome research;
  6. New microbiome research discovery, both medical and non-medical, with a significant bioinformatics component;
  7. Reviews of the above applications.

Dr. Tsute Chen
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Genes is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • microbiome
  • microbiota
  • next-generation sequencing
  • genomics
  • meta-genomics
  • meta-transcriptomics
  • taxonomy
  • artificial intelligence
  • bioinformatics

Published Papers (4 papers)

Research

Article
Characterization and Genome Study of a Newly Isolated Temperate Phage Belonging to a New Genus Targeting Alicyclobacillus acidoterrestris
Genes 2023, 14(6), 1303; https://doi.org/10.3390/genes14061303 - 20 Jun 2023
Viewed by 771
Abstract
The spoilage of juices by Alicyclobacillus spp. remains a serious problem in industry and leads to economic losses. Compounds such as guaiacol and halophenols, which are produced by Alicyclobacillus, create undesirable flavors and odors and, thus, decrease the quality of juices. The inactivation of Alicyclobacillus spp. constitutes a challenge because it is resistant to environmental factors, such as high temperatures, and active acidity. However, the use of bacteriophages seems to be a promising approach. In this study, we aimed to isolate and comprehensively characterize a novel bacteriophage targeting Alicyclobacillus spp. The Alicyclobacillus phage strain KKP 3916 was isolated from orchard soil against the Alicyclobacillus acidoterrestris strain KKP 3133. The bacterial host’s range and the effect of phage addition at different rates of multiplicity of infections (MOIs) on the host’s growth kinetics were determined using a Bioscreen C Pro growth analyzer. The Alicyclobacillus phage strain KKP 3916 retained its activity in a wide range of temperatures (from 4 °C to 30 °C) and active acidity values (pH from 3 to 11). At 70 °C, the activity of the phage decreased by 99.9%. In turn, at 80 °C, no activity against the bacterial host was observed. Thirty minutes of exposure to UV reduced the activity of the phages by almost 99.99%. Based on transmission-electron microscopy (TEM) and whole-genome sequencing (WGS) analyses, the Alicyclobacillus phage strain KKP 3916 was classified as a tailed bacteriophage. The genomic sequencing revealed that the newly isolated phage had linear double-stranded DNA (dsDNA) with sizes of 120 bp and 131 bp and 40.3% G+C content. Of the 204 predicted proteins, 134 were of unknown function, while the remainder were annotated as structural, replication, and lysis proteins. No genes associated with antibiotic resistance were found in the genome of the newly isolated phage. However, several regions, including four associated with integration into the bacterial host genome and excisionase, were identified, which indicates the temperate (lysogenic) life cycle of the bacteriophage. Due to the risk of its potential involvement in horizontal gene transfer, this phage is not an appropriate candidate for further research on its use in food biocontrol. To the best of our knowledge, this is the first article on the isolation and whole-genome analysis of the Alicyclobacillus-specific phage.
(This article belongs to the Special Issue Application of Bioinformatics in Microbiome)

Article
Novel Clustering Methods Identified Three Caries Status-Related Clusters Based on Oral Microbiome in Thai Mother–Child Dyads
Genes 2023, 14(3), 641; https://doi.org/10.3390/genes14030641 - 03 Mar 2023
Viewed by 764
Abstract
Early childhood caries (ECC) is a disease that globally affects pre-school children. It is important to identify both protective and risk factors associated with this disease. This paper examined a set of saliva samples of Thai mother–child dyads and aimed to analyze how the maternal factors and oral microbiome of the dyads influence the development of ECC. However, heterogeneous latent subpopulations may exist that have different characteristics in terms of caries development. Therefore, we introduce a novel method to cluster the correlated outcomes of dependent observations while selecting influential independent variables to unearth latent groupings within this dataset and reveal their association in each group. This paper describes the discovery of three heterogeneous clusters in the dataset, each with its own unique mother–child outcome trend, as well as identifying several microbial factors that contribute to ECC. Significantly, the three identified clusters represent three typical clinical conditions in which mother–child dyads have typical (cluster 1), high–low (cluster 2), and low–high caries experiences (cluster 3) compared to the overall trend of mother–child caries status. Intriguingly, the variables identified as the driving attributes of each cluster, including specific taxa, have the potential to be used in the future as caries preventive measures.
(This article belongs to the Special Issue Application of Bioinformatics in Microbiome)

Article
SCP4ssd: A Serverless Platform for Nucleotide Sequence Synthesis Difficulty Prediction Using an AutoML Model
Genes 2023, 14(3), 605; https://doi.org/10.3390/genes14030605 - 28 Feb 2023
Viewed by 868
Abstract
DNA synthesis is widely used in synthetic biology to construct and assemble sequences ranging from short RBS to ultra-long synthetic genomes. Many sequence features, such as the GC content and repeat sequences, are known to affect the synthesis difficulty and subsequently the synthesis cost. In addition, there are latent sequence features, especially local characteristics of the sequence, which might affect the DNA synthesis process as well. Reliable prediction of the synthesis difficulty for a given sequence is important for reducing the cost, but this remains a challenge. In this study, we propose a new automated machine learning (AutoML) approach to predict the DNA synthesis difficulty, which achieves an F1 score of 0.930 and outperforms the current state-of-the-art model. We found local sequence features that were neglected in previous methods, which might also affect the difficulty of DNA synthesis. Moreover, experimental validation based on ten genes of Escherichia coli strain MG1655 shows that our model can achieve an 80% accuracy, which is also better than the state of the art. In addition, we developed the cloud platform SCP4SSD using an entirely cloud-based serverless architecture for the convenience of the end users.
(This article belongs to the Special Issue Application of Bioinformatics in Microbiome)
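
The abstract above names GC content and repeat sequences as features that affect synthesis difficulty, and reports an F1 score. As a minimal illustrative sketch only (the function names below are hypothetical and are not taken from SCP4SSD, and a homopolymer run is just one simple stand-in for a repeat feature), these quantities could be computed as follows:

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated base (assumed repeat feature)."""
    longest = 0
    run = 0
    prev = ""
    for base in seq.upper():
        run = run + 1 if base == prev else 1
        longest = max(longest, run)
        prev = base
    return longest


def f1_score(y_true, y_pred, positive=1) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

For example, `gc_content("ATGC")` gives 0.5, and `f1_score` takes two equal-length label lists, such as observed and predicted "difficult to synthesize" flags.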

Article
Supervised Machine Learning Enables Geospatial Microbial Provenance
Genes 2022, 13(10), 1914; https://doi.org/10.3390/genes13101914 - 21 Oct 2022
Cited by 1 | Viewed by 2038
Abstract
The recent increase in publicly available metagenomic datasets with geospatial metadata has made it possible to determine location-specific, microbial fingerprints from around the world. Such fingerprints can be useful for comparing microbial niches for environmental research, as well as for applications within forensic science and public health. To determine the regional specificity for environmental metagenomes, we examined 4305 shotgun-sequenced samples from the MetaSUB Consortium dataset, the most extensive public collection of urban microbiomes, spanning 60 different cities, 30 countries, and 6 continents. We were able to identify city-specific microbial fingerprints using supervised machine learning (SML) on the taxonomic classifications, and we also compared the performance of ten SML classifiers. We then further evaluated the five algorithms with the highest accuracy, with city and continental accuracy ranging from 85–89% and 90–94%, respectively. Thereafter, we used these results to develop Cassandra, a random-forest-based classifier that identifies bioindicator species to aid in fingerprinting and can infer higher-order microbial interactions at each site. We further tested the Cassandra algorithm on the Tara Oceans dataset, the largest collection of marine-based microbial genomes, where it classified the oceanic sample locations with 83% accuracy. These results and code show the utility of SML methods and Cassandra to identify bioindicator species across both oceanic and urban environments, which can help guide ongoing efforts in biotracing, environmental monitoring, and microbial forensics (MF).
(This article belongs to the Special Issue Application of Bioinformatics in Microbiome)

Planned Papers

The list below represents only planned manuscripts. Some of these manuscripts have not been received by the Editorial Office yet. Papers submitted to MDPI journals are subject to peer-review.

1. Title: An AutoML model for predicting the difficulty of nucleotide sequence synthesis

Abstract: DNA synthesis is widely used in synthetic biology to construct and assemble sequences ranging from short RBS to ultra-long synthetic genomes. Many sequence features, such as GC content and repeat sequences, are known to affect the synthesis difficulty and subsequently the synthesis cost. In addition, there are latent sequence features, especially local characteristics of the sequence, which might affect the DNA synthesis process as well. Reliable prediction of synthesis difficulty for a given sequence is important for reducing cost, but this remains a challenge. In this study, we propose a new Automated Machine Learning (AutoML) approach to predict DNA synthesis difficulty, which achieves an F1 score of 0.930 and outperforms the current state-of-the-art model. We found local sequence features that were neglected in previous methods, which might also affect the difficulty of DNA synthesis. Moreover, experimental validation based on 10 genes of Escherichia coli MG1655 shows that our model can achieve 80% accuracy, which is also better than the state of the art. The standalone version is provided at https://github.com/tibbdc/scp4ssd. In addition, we developed the cloud platform SCP4SSD (https://scp4ssd.biodesign.ac.cn), using an entirely cloud-based serverless architecture.

 

2. Title: Pulmonary Bacteriobiota as a prognostic factor in critically ill patients  

Abstract: This study aims to use Illumina high-throughput sequencing platforms to identify the bacterial composition of bronchial secretion samples from mechanically ventilated patients and to establish this composition as a prognostic factor for survival. An observational, longitudinal, prospective study was carried out in a polyvalent intensive care unit on critically ill patients mechanically ventilated for non-respiratory indications, among other exclusion criteria; samples were obtained by endotracheal aspiration and subsequently characterized by sequencing the 16S ribosomal RNA gene. The predominant phyla were Proteobacteria, Firmicutes and Bacteroidota. In surviving patients, the predominant phyla were Proteobacteria, Bacteroidota and Firmicutes, whereas in deceased patients they were Firmicutes, Proteobacteria and Bacteroidota. Neither alpha nor beta diversity differed significantly between the two groups. In this group of patients, the microbial composition could not be associated with disease severity.
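
The abstract above compares alpha and beta diversity between patient groups without stating which metrics were used. As a rough sketch only, assuming the Shannon index for alpha diversity and Bray-Curtis dissimilarity for beta diversity (both common choices, but assumptions here, as are the function names), the two quantities can be computed from per-taxon read counts like this:

```python
import math


def shannon_index(counts):
    """Shannon alpha diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two count vectors over the same taxa.

    0.0 means identical composition; 1.0 means no shared taxa.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must list counts for the same taxa")
    total = sum(sample_a) + sum(sample_b)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(sample_a, sample_b)) / total
```

Each input is a list of counts with one entry per taxon, in the same taxon order for every sample. Group-level comparisons such as those reported in the abstract would additionally need a statistical test, which this sketch does not cover.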

3. Title: Three Clusters of Caries Status Identified for Thai Mother-Child Dyads Based on the Oral Microbiome and Maternal Relatedness

Abstract: The severity of early childhood caries (ECC) as a disease is apparent globally. It is important to identify both protective and offensive factors in the development of this disease. This paper concerns the examination of a sample of Thai mother-child dyads, and aims to analyze how the maternal factors and the oral microbiomes of both mother and child influence the development, or lack thereof, of ECC. Here, we introduce a novel clustering method that uses the trajectories of the outcomes of these dependent observations in conjunction with the observed variables to unearth latent groupings within this data set that would satisfy our aims. This paper details the discovery of three latent clusters within the data set, each with its own unique outcome trajectory trend, as well as identifying several microbial factors that contribute to the estimation mechanism of the clustering. Additionally, cross-clustering analysis was performed to identify further differences between the clusters of interest.
 
4. Title: Airway and Oral Microbiome Profiling of SARS-CoV-2 Infected Asthmatic and Non-Asthmatic Cases Revealing Alterations – A Pulmonary Microbial Investigation

Abstract: New evidence strongly implicates host-associated microbiomes in the pathogenesis of respiratory diseases. Microbiome dysbiosis modulates the lung's behavior and impairs the effective functioning of the respiratory system. The upper respiratory tract niches, which host most microbial colonies, offer vital information about the disease condition. Several exogenous and environmental factors influence the development of asthma and chronic lung disease. The relationship between asthma and microbes is reasonably well understood but requires further investigation for substantiation. Comorbidities such as SARS-CoV-2 infection further exacerbate the health condition of individuals with asthma. This study examines the microbial profiles of patients with pre-existing asthma and non-asthmatic patients infected with SARS-CoV-2. The experiment has a two-fold design: it first analyzes the affinity between samples collected from the saliva and nasopharyngeal regions, and then examines microbial pathogenesis, its role in exacerbations of respiratory disease, and the diagnostic biomarkers of the target condition. The study infers that understanding the relationship between the microbiome and respiratory diseases helps in developing coherent probiotics as therapeutics. Findings: Rothia mucilaginosa was less abundant and Corynebacterium tuberculostearicum more abundant in the SARS-CoV-2 asthmatic group. The increase in Streptococcus at the genus level in the SARS-CoV-2 asthmatic group helps discriminate the subgroups. LEfSe analysis identified that Actinobacteriota and Pseudomonadota are enriched in the SARS-CoV-2 non-asthmatic and SARS-CoV-2 asthmatic groups of the salivary microbiome, respectively. A random forest classifier trained on OTUs attained classification accuracy and ROC scores of 84% and 87% on the nasal dataset and 93% and 97.5% on the saliva dataset.