Nature-Inspired Algorithms in Machine Learning (2nd Edition)

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Evolutionary Algorithms and Machine Learning".

Deadline for manuscript submissions: 30 April 2024

Special Issue Editors


Dr. Szymon Łukasik
Guest Editor
Faculty of Physics and Applied Computer Science, AGH University of Science and Technology, 30-059 Kraków, Poland
Interests: computational intelligence; data mining; metaheuristics; dimensionality reduction; unsupervised learning

Dr. Piotr A. Kowalski
Guest Editor
Systems Research Institute, Polish Academy of Sciences, 01-447 Warsaw, Poland
Interests: data mining; artificial intelligence; computational intelligence; neural networks; metaheuristics; supervised learning

Dr. Rohit Salgotra
Guest Editor
Faculty of Physics and Applied Computer Science, AGH University of Science and Technology, 30-059 Kraków, Poland
Interests: nature-inspired algorithms and their applications

Special Issue Information

Dear Colleagues,

We cordially invite you to submit your papers to the Special Issue “Nature-Inspired Algorithms in Machine Learning” of Algorithms, an established MDPI journal indexed, among others, in Clarivate Web of Science and Scopus.

Machine learning algorithms are currently omnipresent in practical solutions spanning from space engineering to e-commerce. Apart from standard statistical approaches, nature-inspired algorithms are also frequently used in this area, owing to the complexity of data exploration tasks and the possibility of including additional factors in the scheme of a nature-inspired algorithm.

Our Special Issue will accept a broad range of new advances in the field of nature-inspired machine learning algorithms. We invite contributions describing new techniques, novel evaluation criteria, and interesting case studies, as well as papers dealing with specific variants of existing algorithms and the challenges of Big Data. A limited number of state-of-the-art reviews will also be considered for publication.

Please feel free to contribute, and do not hesitate to contact us with any questions or concerns.

Dr. Szymon Łukasik
Dr. Piotr A. Kowalski
Dr. Rohit Salgotra
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data science/data mining
  • clustering
  • classification
  • outlier detection
  • dimensionality reduction
  • unsupervised learning
  • supervised learning
  • nature-inspired algorithms
  • metaheuristics


Published Papers (3 papers)


Research

13 pages, 451 KiB  
Article
An Objective Function-Based Clustering Algorithm with a Closed-Form Solution and Application to Reference Interval Estimation in Laboratory Medicine
by Frank Klawonn and Georg Hoffmann
Algorithms 2024, 17(4), 143; https://doi.org/10.3390/a17040143 - 29 Mar 2024
Abstract
Clustering algorithms are usually iterative procedures. In particular, when the clustering algorithm aims to optimise an objective function, as in k-means clustering or Gaussian mixture models, iterative heuristics are required due to the high non-linearity of the objective function. This implies higher computational costs and the risk of finding only a local optimum rather than the global optimum of the objective function. In this paper, we demonstrate that in the case of one-dimensional clustering with one main and one noise cluster, one can formulate an objective function which permits a closed-form solution, with no need for an iteration scheme and with the guarantee of finding the global optimum. We demonstrate how such an algorithm can be applied in the context of laboratory medicine as a method to estimate reference intervals that represent the range of “normal” values.
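The paper's closed-form solution is not reproduced here, but the problem setting it targets can be sketched. The following Python snippet is a minimal brute-force illustration of objective-function-based one-dimensional clustering with one main and one noise cluster: every interval of the sorted data is scored as a candidate main cluster, with a fixed per-point charge for noise points. The objective and the noise_cost parameter are illustrative assumptions, not the authors' formulation, and the quadratic scan is exactly the kind of search a closed-form result avoids.

```python
import numpy as np

def main_noise_clustering(x, noise_cost=1.0):
    """Brute-force 1D clustering into one main cluster and one noise cluster.

    Every point inside a candidate interval of the sorted data is 'main';
    everything else is 'noise'. The objective charges the main cluster its
    sum of squared deviations from its mean and charges each noise point a
    fixed cost. (Illustrative only -- the paper derives a closed-form
    optimum for its specific objective instead of scanning candidates.)
    """
    xs = np.sort(np.asarray(x, dtype=float))
    n = len(xs)
    best = (np.inf, 0, n)               # (objective, start, stop)
    for i in range(n):
        for j in range(i + 1, n + 1):   # xs[i:j] is the candidate main cluster
            seg = xs[i:j]
            sse = np.sum((seg - seg.mean()) ** 2)
            obj = sse + noise_cost * (n - (j - i))
            if obj < best[0]:
                best = (obj, i, j)
    _, i, j = best
    return xs[i:j]                      # points assigned to the main cluster

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(5.0, 0.5, 200),    # main cluster
                       rng.uniform(0.0, 20.0, 20)])  # noise
print(main_noise_clustering(data, noise_cost=2.0).round(2))
```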

12 pages, 826 KiB  
Article
A Markov Chain Genetic Algorithm Approach for Non-Parametric Posterior Distribution Sampling of Regression Parameters
by Parag C. Pendharkar
Algorithms 2024, 17(3), 111; https://doi.org/10.3390/a17030111 - 7 Mar 2024
Abstract
This paper proposes a genetic algorithm-based Markov Chain approach that can be used for the non-parametric estimation of regression coefficients and their statistical confidence bounds. The proposed approach can generate samples from an unknown probability density function if a formal functional form of its likelihood is known. The approach is tested on the non-parametric estimation of regression coefficients, where the least-square minimizing function is considered the maximum likelihood of a multivariate distribution. This approach has an advantage over traditional Markov Chain Monte Carlo methods because it is proven to converge and to generate unbiased samples in a computationally efficient manner.
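As a point of reference for the abstract, here is a minimal random-walk Metropolis sketch of the underlying idea: sampling regression coefficients from the likelihood implied by the least-squares criterion. It is a hedged baseline, not the paper's genetic-algorithm-based chain; the step size, the noise variance sigma2, and the Gaussian-error assumption are all illustrative choices.

```python
import numpy as np

def metropolis_regression(X, y, n_samples=5000, step=0.05, sigma2=1.0, seed=0):
    """Random-walk Metropolis sampling of regression coefficients.

    Target density: the Gaussian likelihood implied by least squares,
    log p(beta) = -SSE(beta) / (2 * sigma2) + const. This is a plain
    MCMC baseline, not the paper's genetic-algorithm-based chain.
    """
    rng = np.random.default_rng(seed)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start at the OLS fit
    def log_lik(b):
        r = y - X @ b
        return -(r @ r) / (2.0 * sigma2)
    samples, ll = [], log_lik(beta)
    for _ in range(n_samples):
        prop = beta + step * rng.standard_normal(beta.shape)  # mutation-like move
        ll_prop = log_lik(prop)
        if np.log(rng.uniform()) < ll_prop - ll:  # Metropolis accept/reject
            beta, ll = prop, ll_prop
        samples.append(beta.copy())
    return np.array(samples)

# Usage: empirical 95% confidence bounds for each coefficient.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
y = X @ np.array([2.0, -1.0]) + rng.normal(scale=1.0, size=100)
draws = metropolis_regression(X, y)
print(np.percentile(draws, [2.5, 97.5], axis=0))
```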

46 pages, 21402 KiB  
Article
On the Development of Descriptor-Based Machine Learning Models for Thermodynamic Properties: Part 2—Applicability Domain and Outliers
by Cindy Trinh, Silvia Lasala, Olivier Herbinet and Dimitrios Meimaroglou
Algorithms 2023, 16(12), 573; https://doi.org/10.3390/a16120573 - 18 Dec 2023
Abstract
This article investigates the applicability domain (AD) of machine learning (ML) models trained on high-dimensional data, for the prediction of the ideal gas enthalpy of formation and entropy of molecules via descriptors. The AD is crucial as it describes the space of chemical characteristics in which the model can make predictions with a given reliability. This work studies the AD definition of a ML model throughout its development procedure: during data preprocessing, model construction and model deployment. Three AD definition methods, commonly used for outlier detection in high-dimensional problems, are compared: isolation forest (iForest), random forest prediction confidence (RF confidence) and k-nearest neighbors in the 2D projection of descriptor space obtained via t-distributed stochastic neighbor embedding (tSNE2D/kNN). These methods compute an anomaly score that can be used instead of the distance metrics of classical low-dimension AD definition methods, the latter being generally unsuitable for high-dimensional problems. Typically, in low- (high-) dimensional problems, a molecule is considered to lie within the AD if its distance from the training domain (anomaly score) is below a given threshold. During data preprocessing, the three AD definition methods are used to identify outlier molecules and the effect of their removal is investigated. A more significant improvement of model performance is observed when outliers identified with RF confidence are removed (e.g., for a removal of 30% of outliers, the MAE (Mean Absolute Error) of the test dataset is divided by 2.5, 1.6 and 1.1 for RF confidence, iForest and tSNE2D/kNN, respectively). While these three methods identify X-outliers, the effect of other types of outliers, namely Model-outliers and y-outliers, is also investigated. In particular, the elimination of X-outliers followed by that of Model-outliers enables us to divide MAE and RMSE (Root Mean Square Error) by 2 and 3, respectively, while reducing overfitting. The elimination of y-outliers does not display a significant effect on the model performance. During model construction and deployment, the AD serves to verify the position of the test data and of different categories of molecules with respect to the training data and associate this position with their prediction accuracy. For the data that are found to be close to the training data, according to RF confidence, and display high prediction errors, tSNE 2D representations are deployed to identify the possible sources of these errors (e.g., representation of the chemical information in the training data).
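Of the three AD definition methods compared, the isolation-forest variant can be sketched with scikit-learn's IsolationForest. The snippet below fits the forest on training descriptors and declares a query molecule inside the AD if its anomaly score is no worse than a percentile threshold of the training scores; the 5% percentile and the synthetic data are illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def fit_ad(X_train, score_percentile=5, seed=0):
    """Define an applicability domain (AD) from isolation-forest scores.

    A query point lies 'inside' the AD if its anomaly score is no worse
    than the chosen percentile of the training scores. The 5% threshold
    is an illustrative assumption, not a value taken from the paper.
    """
    forest = IsolationForest(n_estimators=200, random_state=seed).fit(X_train)
    scores = forest.score_samples(X_train)   # higher = more 'normal'
    threshold = np.percentile(scores, score_percentile)
    return forest, threshold

def in_domain(forest, threshold, X_query):
    return forest.score_samples(X_query) >= threshold

# Usage with synthetic high-dimensional descriptors.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 50))             # training descriptors
X_query = np.vstack([rng.normal(size=(5, 50)),   # in-distribution queries
                     rng.normal(6.0, 1.0, size=(5, 50))])  # far from training
forest, thr = fit_ad(X_train)
print(in_domain(forest, thr, X_query))           # expect ~[True]*5 + [False]*5
```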
