Journal Description
Machine Learning and Knowledge Extraction
Machine Learning and Knowledge Extraction is an international, scientific, peer-reviewed, open access journal. It publishes original research articles, reviews, tutorials, research ideas, short notes and Special Issues that focus on machine learning and applications. Please see our video on YouTube explaining the MAKE journal concept. The journal is published quarterly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), dblp, and other databases.
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 16.7 days after submission; acceptance to publication is undertaken in 2.9 days (median values for papers published in this journal in the second half of 2022).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
- MAKE is a companion journal of Entropy.
Latest Articles
What about the Latent Space? The Need for Latent Feature Saliency Detection in Deep Time Series Classification
Mach. Learn. Knowl. Extr. 2023, 5(2), 539-559; https://doi.org/10.3390/make5020032 - 18 May 2023
Abstract
Saliency methods are designed to provide explainability for deep image processing models by assigning feature-wise importance scores and thus detecting informative regions in the input images. Recently, these methods have been widely adapted to the time series domain, aiming to identify important temporal regions in a time series. This paper extends our former work on identifying the systematic failure of such methods in the time series domain to produce relevant results when informative patterns are based on underlying latent information rather than temporal regions. First, we both visually and quantitatively assess the quality of explanations provided by multiple state-of-the-art saliency methods, including Integrated Gradients, Deep-Lift, Kernel SHAP, and Lime using univariate simulated time series data with temporal or latent patterns. In addition, to emphasize the severity of the latent feature saliency detection problem, we also run experiments on a real-world predictive maintenance dataset with known latent patterns. We identify Integrated Gradients, Deep-Lift, and the input-cell attention mechanism as potential candidates for refinement to yield latent saliency scores. Finally, we provide recommendations on using saliency methods for time series classification and suggest a guideline for developing latent saliency methods for time series.
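As a rough illustration of what a feature-wise importance score looks like, the following minimal sketch approximates Integrated Gradients for a toy logistic model over a short univariate series. The model, its weights, the all-zero baseline, and the step count are assumptions made for this example; none of them are taken from the article.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w, bias):
    """Toy differentiable classifier: logistic regression over time steps."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + bias)

def gradient(x, w, bias):
    """Analytic gradient of the toy model with respect to each input value."""
    p = model(x, w, bias)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, bias, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    averaged = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight path from the baseline to the input.
        point = [b0 + alpha * (xi - b0) for xi, b0 in zip(x, baseline)]
        for i, g in enumerate(gradient(point, w, bias)):
            averaged[i] += g / steps
    # Scale the path-averaged gradient by the input difference.
    return [(xi - b0) * a for xi, b0, a in zip(x, baseline, averaged)]

# Hypothetical weights: only time steps 2 and 3 influence the output.
weights = [0.0, 0.0, 2.0, 1.5, 0.0]
bias = -1.0
series = [0.3, 0.1, 0.9, 0.8, 0.2]
baseline = [0.0] * len(series)

scores = integrated_gradients(series, baseline, weights, bias)
```

In this toy setting the scores sum approximately to the difference between the model output at the input and at the baseline, and time steps with zero weight receive zero attribution.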
(This article belongs to the Special Issue Advances in Explainable Artificial Intelligence (XAI))
Open Access Article
Alzheimer’s Disease Detection from Fused PET and MRI Modalities Using an Ensemble Classifier
Mach. Learn. Knowl. Extr. 2023, 5(2), 512-538; https://doi.org/10.3390/make5020031 - 18 May 2023
Abstract
Alzheimer’s disease (AD) is an old-age disease that comes in different stages and directly affects the different regions of the brain. Research into the detection of AD and its stages has advanced in terms of both single-modality and multimodality approaches. However, sustainable techniques for the detection of AD and its stages still require a greater extent of research. In this study, a multimodal image-fusion method is initially proposed for the fusion of two different modalities, i.e., PET (Positron Emission Tomography) and MRI (Magnetic Resonance Imaging). Further, the features obtained from fused and non-fused biomarkers are passed to the ensemble classifier with a Random Forest-based feature selection strategy. Three classes of Alzheimer’s disease are used in this work, namely AD, MCI (Mild Cognitive Impairment) and CN (Cognitive Normal). In the resulting analysis, the binary classifications, i.e., AD vs. CN and MCI vs. CN, attained an accuracy (Acc) of 99% in both cases. AD vs. MCI detection achieved an adequate accuracy (Acc) of 91%. Furthermore, the multiclass classification, i.e., AD vs. MCI vs. CN, achieved 96% (Acc).
(This article belongs to the Special Issue Machine Learning for Biomedical Data Processing)
Open Access Article
Biologically Inspired Self-Organizing Computational Model to Mimic Infant Learning
Mach. Learn. Knowl. Extr. 2023, 5(2), 491-511; https://doi.org/10.3390/make5020030 - 15 May 2023
Abstract
Recent technological advancements have fostered human–robot coexistence in work and residential environments. The assistive robot must exhibit humane behavior and consistent care to become an integral part of the human habitat. Furthermore, the robot requires an adaptive unsupervised learning model to explore unfamiliar conditions and collaborate seamlessly. This paper introduces variants of growing hierarchical self-organizing map (GHSOM)-based computational models for assistive robots, which construct knowledge from unsupervised exploration-based learning. Traditional self-organizing map (SOM) algorithms have shortcomings, including finite neuron structure, user-defined parameters, and non-hierarchical adaptive architecture. The proposed models overcome these limitations and dynamically grow to form problem-dependent hierarchical feature clusters, thereby allowing associative learning and symbol grounding. Infants can learn from their surroundings through exploration and experience, developing new neuronal connections as they learn. They can also apply their prior knowledge to solve unfamiliar problems. With infant-like emergent behavior, the presented models can operate on different problems without modifications, producing new patterns not present in the input vectors and allowing interactive result visualization. The proposed models are applied to color and handwritten-digit clustering, finger identification, and image classification problems to evaluate their adaptiveness and infant-like knowledge building. The results show that the proposed models are the preferred generalized models for assistive robots.
(This article belongs to the Section Learning)
Open Access Article
Evaluating the Coverage and Depth of Latent Dirichlet Allocation Topic Model in Comparison with Human Coding of Qualitative Data: The Case of Education Research
Mach. Learn. Knowl. Extr. 2023, 5(2), 473-490; https://doi.org/10.3390/make5020029 - 14 May 2023
Abstract
Fields in the social sciences, such as education research, have started to expand the use of computer-based research methods to supplement traditional research approaches. Natural language processing techniques, such as topic modeling, may support qualitative data analysis by providing early categories that researchers may interpret and refine. This study contributes to this body of research and answers the following research questions: (RQ1) What is the relative coverage of the latent Dirichlet allocation (LDA) topic model and human coding in terms of the breadth of the topics/themes extracted from the text collection? (RQ2) What is the relative depth or level of detail among identified topics using LDA topic models and human coding approaches? A dataset of student reflections was qualitatively analyzed using LDA topic modeling and human coding approaches, and the results were compared. The findings suggest that topic models can provide reliable coverage and depth of themes present in a textual collection comparable to human coding but require manual interpretation of topics. The breadth and depth of human coding output is heavily dependent on the expertise of coders and the size of the collection; these factors are better handled in the topic modeling approach.
Open Access Article
A Multi-Input Machine Learning Approach to Classifying Sex Trafficking from Online Escort Advertisements
Mach. Learn. Knowl. Extr. 2023, 5(2), 460-472; https://doi.org/10.3390/make5020028 - 10 May 2023
Abstract
Sex trafficking victims are often advertised through online escort sites. These ads can be publicly accessed, but law enforcement lacks the resources to comb through hundreds of ads to identify those that may feature sex-trafficked individuals. The purpose of this study was to implement and test multi-input, deep learning (DL) binary classification models to predict the probability of an online escort ad being associated with sex trafficking (ST) activity and aid in the detection and investigation of ST. Data from 12,350 scraped and classified ads were split into training and test sets (80% and 20%, respectively). Multi-input models that included recurrent neural networks (RNN) for text classification, convolutional neural networks (CNN, specifically EfficientNetB6 or ENET) for image/emoji classification, and neural networks (NN) for feature classification were trained and used to classify the 20% test set. The best-performing DL model included text and imagery inputs, resulting in an accuracy of 0.82 and an F1 score of 0.70. More importantly, the best classifier (RNN + ENET) correctly identified 14 of 14 sites that had classification probability estimates of 0.845 or greater (1.0 precision); precision was 96% for the multi-input model (NN + RNN + ENET) when only the ads associated with the highest positive classification probabilities (>0.90) were considered (n = 202 ads). The models developed could be productionalized and piloted with criminal investigators, as they could potentially increase their efficiency in identifying potential ST victims.
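The precision figures quoted for high-probability predictions can be read as precision computed only over items whose predicted probability clears a cutoff. The sketch below shows that calculation on made-up scores and labels; the function name, the data, and the use of an inclusive cutoff are assumptions for illustration and do not reproduce the study's pipeline.

```python
def precision_above_threshold(probabilities, labels, threshold):
    """Precision among items whose predicted probability meets the threshold.

    Returns None when no item reaches the threshold, because precision
    is undefined without any positive predictions.
    """
    flagged = [label for p, label in zip(probabilities, labels) if p >= threshold]
    if not flagged:
        return None
    return sum(flagged) / len(flagged)

# Made-up scores and ground-truth labels (1 = positive class, 0 = negative).
probs = [0.97, 0.93, 0.91, 0.88, 0.60, 0.35, 0.10]
truth = [1, 1, 0, 1, 0, 1, 0]

high_confidence = precision_above_threshold(probs, truth, 0.90)
all_positive = precision_above_threshold(probs, truth, 0.50)
```

Raising the cutoff typically trades recall for precision, which is why a model with a moderate overall F1 score can still be very precise on its most confident predictions.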
(This article belongs to the Special Issue Deep Learning Methods for Natural Language Processing)
Open Access Article
Tree-Structured Model with Unbiased Variable Selection and Interaction Detection for Ranking Data
Mach. Learn. Knowl. Extr. 2023, 5(2), 448-459; https://doi.org/10.3390/make5020027 - 09 May 2023
Abstract
In this article, we propose a tree-structured method for either complete or partial rank data that incorporates covariate information into the analysis. We use conditional independence tests based on hierarchical log-linear models for three-way contingency tables to select split variables and cut points, and apply a simple Bonferroni rule to decide whether a node is worth splitting. Through simulations, we also demonstrate that the proposed method is unbiased and effective in selecting informative split variables. Our proposed method can be applied across various fields to provide a flexible and robust framework for analyzing rank data and understanding how various factors affect individual judgments on ranking. This can help improve the quality of products or services and assist with informed decision making.
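A Bonferroni-style stopping rule of the kind mentioned above can be sketched as follows: with m candidate split variables tested at a node, the node is split only if the smallest p-value falls below alpha / m. The function name, the significance level of 0.05, and the p-values are hypothetical; the article's exact rule and test statistics are not reproduced here.

```python
def should_split(p_values, alpha=0.05):
    """Bonferroni rule: split only if the smallest p-value is below alpha / m.

    p_values maps each candidate split variable to the p-value of its test.
    Returns a pair (split decision, selected variable or None).
    """
    if not p_values:
        return False, None
    m = len(p_values)
    best_var = min(p_values, key=p_values.get)
    if p_values[best_var] < alpha / m:
        return True, best_var
    return False, None

# Hypothetical p-values from independence tests at one node.
node_tests = {"age": 0.004, "income": 0.030, "region": 0.410}
decision = should_split(node_tests)
```

Dividing alpha by the number of candidates keeps the chance of a spurious split at a node roughly bounded by alpha, at the cost of some power when many covariates are tested.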
Open Access Article
Artificial Intelligence-Based Prediction of Spanish Energy Pricing and Its Impact on Electric Consumption
Mach. Learn. Knowl. Extr. 2023, 5(2), 431-447; https://doi.org/10.3390/make5020026 - 02 May 2023
Abstract
The energy supply sector faces significant challenges, such as the ongoing COVID-19 pandemic and the ongoing conflict in Ukraine, which affect the stability and efficiency of the energy system. In this study, we highlight the importance of electricity pricing and the need for accurate models to estimate electricity consumption and prices, with a focus on Spain. Using hourly data, we implemented various machine learning models, including linear regression, random forest, XGBoost, LSTM, and GRU, to forecast electricity consumption and prices. Our findings have important policy implications. Firstly, our study demonstrates the potential of using advanced analytics to enhance the accuracy of electricity price and consumption forecasts, helping policymakers anticipate changes in energy demand and supply and ensure grid stability. Secondly, we emphasize the importance of having access to high-quality data for electricity demand and price modeling. Finally, we provide insights into the strengths and weaknesses of different machine learning algorithms for electricity price and consumption modeling. Our results show that the LSTM and GRU artificial neural networks are the best models for price and consumption modeling with no significant difference.
Open Access Article
A Reinforcement Learning Approach for Scheduling Problems with Improved Generalization through Order Swapping
Mach. Learn. Knowl. Extr. 2023, 5(2), 418-430; https://doi.org/10.3390/make5020025 - 29 Apr 2023
Abstract
The scheduling of production resources (such as associating jobs to machines) plays a vital role for the manufacturing industry not only for saving energy, but also for increasing the overall efficiency. Among the different job scheduling problems, the Job Shop Scheduling Problem (JSSP) is addressed in this work. JSSP falls into the category of NP-hard Combinatorial Optimization Problems (COPs), in which solving the problem through exhaustive search becomes infeasible. Simple heuristics such as First-In, First-Out, Largest Processing Time First and metaheuristics such as tabu search are often adopted to solve the problem by truncating the search space. These methods become less viable for large problem sizes, as their solutions are either far from the optimum or time-consuming to obtain. In recent years, the research towards using Deep Reinforcement Learning (DRL) to solve COPs has gained interest and has shown promising results in terms of solution quality and computational efficiency. In this work, we provide a novel approach to solve the JSSP, examining the objectives of generalization and solution effectiveness using DRL. In particular, we employ the Proximal Policy Optimization (PPO) algorithm that adopts the policy-gradient paradigm that is found to perform well in the constrained dispatching of jobs. We incorporated a new method called Order Swapping Mechanism (OSM) in the environment to achieve better generalized learning of the problem. The performance of the presented approach is analyzed in depth by using a set of available benchmark instances and comparing our results with the work of other groups.
(This article belongs to the Topic Artificial Intelligence and Computational Methods: Modeling, Simulations and Optimization of Complex Systems)
Open Access Article
Lottery Ticket Search on Untrained Models with Applied Lottery Sample Selection
Mach. Learn. Knowl. Extr. 2023, 5(2), 400-417; https://doi.org/10.3390/make5020024 - 18 Apr 2023
Abstract
In this paper, we present a new approach to improve tabular datasets by applying the lottery ticket hypothesis to tabular neural networks. Prior approaches were required to train the original large-sized model to find these lottery tickets. In this paper, we eliminate the need to train the original model and discover lottery tickets using networks a fraction of the model’s size. Moreover, we show that we can remove up to 95% of the training dataset to discover lottery tickets, while still maintaining similar accuracy. The approach uses a genetic algorithm (GA) to train candidate pruned models by encoding the nodes of the original model for selection measured by performance and weight metrics. We found that the search process does not require a large portion of the training data, but when the final pruned model is selected it can be retrained on the full dataset, even if it is often not required. We propose a lottery sample hypothesis, similar to the lottery ticket hypothesis, in which a subsample of lottery samples from the training set can train a model with performance equivalent to training on the original dataset. We show that the combination of finding lottery samples alongside lottery tickets can allow for faster searches and greater accuracy.
Open Access Article
A Diabetes Prediction System Based on Incomplete Fused Data Sources
Mach. Learn. Knowl. Extr. 2023, 5(2), 384-399; https://doi.org/10.3390/make5020023 - 10 Apr 2023
Abstract
In recent years, the diabetes population has grown younger. Therefore, it has become a key problem to make a timely and effective prediction of diabetes, especially given a single data source. Meanwhile, there are many data sources of diabetes patients collected around the world, and it is extremely important to integrate these heterogeneous data sources to accurately predict diabetes. For the different data sources used to predict diabetes, the predictors may be different. In other words, some special features exist only in certain data sources, which leads to the problem of missing values. Considering the uncertainty of the missing values within the fused dataset, multiple imputation and a method based on graph representation are used to impute the missing values within the fused dataset. The logistic regression model and stacking strategy are applied for diabetes training and prediction on the fused dataset. It is shown that the idea of combining heterogeneous datasets and imputing the missing values produced in the fusion process can effectively improve the performance of diabetes prediction. In addition, the proposed diabetes prediction method can be further extended to any scenario where heterogeneous datasets with the same label types and different feature attributes exist.
(This article belongs to the Topic Advances in Data Analytics with Applications to Health Care)
Open Access Article
3t2FTS: A Novel Feature Transform Strategy to Classify 3D MRI Voxels and Its Application on HGG/LGG Classification
Mach. Learn. Knowl. Extr. 2023, 5(2), 359-383; https://doi.org/10.3390/make5020022 - 06 Apr 2023
Cited by 1
Abstract
The distinction between high-grade glioma (HGG) and low-grade glioma (LGG) is generally performed with two-dimensional (2D) image analyses that constitute semi-automated tumor classification. However, a fully automated computer-aided diagnosis (CAD) can only be realized using an adaptive classification framework based on three-dimensional (3D) segmented tumors. In this paper, we handle the classification section of a fully automated CAD related to the aforementioned requirement. For this purpose, a 3D to 2D feature transform strategy (3t2FTS) is presented, using first-order statistics (FOS) to form the input data by considering every phase (T1, T2, T1c, and FLAIR) of information on 3D magnetic resonance imaging (3D MRI). Herein, the main aim is the transformation of 3D data analyses into 2D data analyses so that the information can be fed to efficient deep learning methods. In other words, 2D identification (2D-ID) of 3D voxels is produced. In our experiments, eight transfer learning models (DenseNet201, InceptionResNetV2, InceptionV3, ResNet50, ResNet101, SqueezeNet, VGG19, and Xception) were evaluated to reveal the appropriate one for the output of 3t2FTS and to design the proposed framework categorizing the 210 HGG–75 LGG instances in the BraTS 2017/2018 challenge dataset. The hyperparameters of the models were examined in a comprehensive manner to determine the highest performance each model could reach. In our trials, two-fold cross-validation was considered as the test method to assess system performance. Consequently, the highest performance was observed with the framework including the 3t2FTS and ResNet50 models by achieving 80% classification accuracy for the 3D-based classification of brain tumors.
(This article belongs to the Special Issue Machine Learning for Biomedical Data Processing)
Open Access Article
Generalized Persistence for Equivariant Operators in Machine Learning
Mach. Learn. Knowl. Extr. 2023, 5(2), 346-358; https://doi.org/10.3390/make5020021 - 24 Mar 2023
Abstract
Artificial neural networks can learn complex, salient data features to achieve a given task. On the opposite end of the spectrum, mathematically grounded methods such as topological data analysis allow users to design analysis pipelines fully aware of data constraints and symmetries. We introduce an original class of neural network layers based on a generalization of topological persistence. The proposed persistence-based layers allow the users to encode specific data properties (e.g., equivariance) easily. Additionally, these layers can be trained through standard optimization procedures (backpropagation) and composed with classical layers. We test the performance of generalized persistence-based layers as pooling operators in convolutional neural networks for image classification on the MNIST, Fashion-MNIST and CIFAR-10 datasets.
(This article belongs to the Topic Topology vs. Geometry in Data Analysis/Machine Learning)
Open Access Article
Human Action Recognition-Based IoT Services for Emergency Response Management
Mach. Learn. Knowl. Extr. 2023, 5(1), 330-345; https://doi.org/10.3390/make5010020 - 13 Mar 2023
Abstract
Emergency incidents can appear anytime and any place, which makes it very challenging for emergency medical services practitioners to predict the location and the time of such emergencies. The dynamic nature of the appearance of emergency incidents can cause delays in emergency medical services, which can sometimes lead to vital injury complications or even death, in some cases. The delay of emergency medical services may occur as a result of a call that was made too late or because no one was present to make the call. With the emergence of smart cities and promising technologies, such as the Internet of Things (IoT) and computer vision techniques, such issues can be tackled. This article proposes a human action recognition-based IoT services architecture for emergency response management. In particular, the architecture exploits IoT devices (e.g., surveillance cameras) that are distributed in public areas to detect emergency incidents, make a request for the nearest emergency medical services, and send emergency location information. Moreover, this article proposes an emergency incidents detection model, based on human action recognition and object tracking, using image processing and classifying the collected images, based on action modeling. The primary notion of the proposed model is to classify human activity, whether it is an emergency incident or other daily activities, using a Convolutional Neural Network (CNN) and Support Vector Machine (SVM). To demonstrate the feasibility of the proposed emergency detection model, several experiments were conducted using the UR fall detection dataset, which consists of emergency and other daily activities footage. The results of the conducted experiments were promising, with the proposed model scoring 0.99, 0.97, 0.97, and 0.98 in terms of sensitivity, specificity, precision, and accuracy, respectively.
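For readers less familiar with the four reported metrics, the sketch below shows how sensitivity, specificity, precision, and accuracy follow from the counts of a binary confusion matrix. The counts are invented for illustration and are unrelated to the article's experiments; the function name is also an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    Assumes every denominator is nonzero; callers should guard otherwise.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of actual positives detected
        "specificity": tn / (tn + fp),  # share of actual negatives rejected
        "precision": tp / (tp + fp),    # share of positive predictions that are right
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Invented counts: 100 emergency clips and 100 daily-activity clips.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
```

Reporting all four together is useful because accuracy alone can hide a poor detection rate when one class dominates the dataset.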
(This article belongs to the Special Issue Deep Learning in Image Analysis and Pattern Recognition)
Open Access Article
A Survey on GAN Techniques for Data Augmentation to Address the Imbalanced Data Issues in Credit Card Fraud Detection
Mach. Learn. Knowl. Extr. 2023, 5(1), 304-329; https://doi.org/10.3390/make5010019 - 11 Mar 2023
Cited by 1
Abstract
Data augmentation is an important procedure in deep learning. GAN-based data augmentation can be utilized in many domains. For instance, in the credit card fraud domain, the imbalanced dataset problem is a major one as the number of credit card fraud cases is in the minority compared to legal payments. On the other hand, generative techniques are considered effective ways to rebalance the imbalanced class issue, as these techniques balance both minority and majority classes before the training. In a more recent period, Generative Adversarial Networks (GANs) are considered one of the most popular data generative techniques as they are used in big data settings. This research aims to present a survey on data augmentation using various GAN variants in the credit card fraud detection domain. In this survey, we offer a comprehensive summary of several peer-reviewed research papers on GAN synthetic generation techniques for fraud detection in the financial sector. In addition, this survey includes various solutions proposed by different researchers to balance imbalanced classes. In the end, this work concludes by pointing out the limitations of the most recent research articles and future research issues, and proposes solutions to address these problems.
(This article belongs to the Special Issue Privacy and Security in Machine Learning)
Open Access Article
Skew Class-Balanced Re-Weighting for Unbiased Scene Graph Generation
Mach. Learn. Knowl. Extr. 2023, 5(1), 287-303; https://doi.org/10.3390/make5010018 - 10 Mar 2023
Abstract
An unbiased scene graph generation (SGG) algorithm referred to as Skew Class-Balanced Re-Weighting (SCR) is proposed for considering the unbiased predicate prediction caused by the long-tailed distribution. The prior works focus mainly on alleviating the deteriorating performances of the minority predicate predictions, showing drastically dropping recall scores, i.e., losing the majority predicate performances. The trade-off between majority and minority predicate performances in the limited SGG datasets has not yet been correctly analyzed. In this paper, to alleviate the issue, the Skew Class-Balanced Re-Weighting (SCR) loss function is considered for the unbiased SGG models. Leveraged by the skewness of biased predicate predictions, the SCR estimates the target predicate weight coefficient and then re-weights more to the biased predicates for a better trade-off between the majority predicates and the minority ones. Extensive experiments conducted on the standard Visual Genome dataset and Open Image V4 and V6 show the performances and generality of the SCR with the traditional SGG models.
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)
Open Access Article
Painting the Black Box White: Experimental Findings from Applying XAI to an ECG Reading Setting
Mach. Learn. Knowl. Extr. 2023, 5(1), 269-286; https://doi.org/10.3390/make5010017 - 08 Mar 2023
Abstract
The emergence of black-box, subsymbolic, and statistical AI systems has motivated a rapid increase in the interest regarding explainable AI (XAI), which encompasses both inherently explainable techniques, as well as approaches to make black-box AI systems explainable to human decision makers. Rather than always making black boxes transparent, these approaches are at risk of painting the black boxes white, thus failing to provide a level of transparency that would increase the system’s usability and comprehensibility, or even at risk of generating new errors (i.e., white-box paradox). To address these usability-related issues, in this work we focus on the cognitive dimension of users’ perception of explanations and XAI systems. We investigated these perceptions in light of their relationship with users’ characteristics (e.g., expertise) through a questionnaire-based user study involving 44 cardiology residents and specialists in an AI-supported ECG reading task. Our results point to the relevance and correlation of the dimensions of trust, perceived quality of explanations, and tendency to defer the decision process to automation (i.e., technology dominance). This contribution calls for the evaluation of AI-based support systems from a human–AI interaction-oriented perspective, laying the ground for further investigation of XAI and its effects on decision making and user experience.
(This article belongs to the Special Issue Advances in Explainable Artificial Intelligence (XAI))
Open Access Article
A Novel Pipeline Age Evaluation: Considering Overall Condition Index and Neural Network Based on Measured Data
Mach. Learn. Knowl. Extr. 2023, 5(1), 252-268; https://doi.org/10.3390/make5010016 - 20 Feb 2023
Cited by 2
Abstract
Today, the chemical corrosion of metals is one of the main problems of large-scale production, especially in the oil and gas industries. Because corrosion failures cause massive downtime, pipeline corrosion is a central issue in these industries. Therefore, determining the corrosion progress of oil and gas pipelines is crucial for monitoring reliability and mitigating failures, which can positively impact health, safety, and the environment. Gas transmission and distribution pipes and other structures buried (or immersed) in an electrolyte corrode because of the existing conditions and their metallurgical structure. After some time, the resulting damage disrupts an active system and process. The worst corrosion for metals buried in the soil occurs in areas where electrical currents are lost. Therefore, cathodic protection (CP) is the most effective method to prevent the corrosion of structures buried in the soil. Our aim in this paper is first to investigate the effect of stray currents on failure rate using the condition index, and then to estimate the remaining useful life of CP gas pipelines using an artificial neural network (ANN). Predicting future values using previous data based on the time series feature is also possible. Therefore, this paper first uses the general equipment condition monitoring method to detect failures. The time series model of data is then measured and operated by neural networks. Finally, the amount of failure over time is determined.
Open Access Article
Can Principal Component Analysis Be Used to Explore the Relationship of Rowing Kinematics and Force Production in Elite Rowers during a Step Test? A Pilot Study
Mach. Learn. Knowl. Extr. 2023, 5(1), 237-251; https://doi.org/10.3390/make5010015 - 17 Feb 2023
Cited by 1
Abstract
Investigating the relationship between the movement patterns of multiple limb segments during the rowing stroke and the resulting force production in elite rowers can provide foundational insight into optimal technique. It can also highlight potential mechanisms of injury and performance improvement. The purpose of this study was to conduct a kinematic analysis of the rowing stroke together with force production during a step test in elite national-team heavyweight men to evaluate the fundamental patterns that contribute to expert performance. Twelve elite heavyweight male rowers performed a step test on a row-perfect sliding ergometer [5 × 1 min with 1 min rest at set stroke rates (20, 24, 28, 32, 36)]. Joint angle displacement and velocity of the hip, knee and elbow were measured with electrogoniometers, and force was measured with a tension/compression force transducer in line with the handle. To explore interactions between kinematic patterns and stroke performance variables, joint angular velocities of the hip, knee and elbow were entered into principal component analysis (PCA), and separate ANCOVAs were run with each performance variable (peak force, impulse, split time) as the dependent variable, the kinematic loading scores (Kpc,ls) as covariates, and athlete/stroke rate as fixed factors. The results suggested that rowers’ kinematic patterns respond differently across varying stroke rates. The first seven PCs accounted for 79.5% (PC1 [26.4%], PC2 [14.6%], PC3 [11.3%], PC4 [8.4%], PC5 [7.5%], PC6 [6.5%], PC7 [4.8%]) of the variance in the signal. The PCs contributing significantly (p ≤ 0.05) to performance metrics based on PC loading scores from an ANCOVA were (PC1, PC2, PC6) for split time, (PC3, PC4, PC5, PC6) for impulse, and (PC1, PC6, PC7) for peak force. The significant PCs for each performance measure were used to reconstruct the kinematic patterns for split time, impulse and peak force separately.
Overall, PCA was able to differentiate between rowers and stroke rates, and revealed features of the rowing-stroke technique correlated with measures of performance that may highlight meaningful technique-optimization strategies. PCA could be used to provide insight into differences in kinematic strategies that could result in suboptimal performance, potential asymmetries or to determine how well a desired technique change has been accomplished by group and/or individual athletes.
Open Access Article
InvMap and Witness Simplicial Variational Auto-Encoders
Mach. Learn. Knowl. Extr. 2023, 5(1), 199-236; https://doi.org/10.3390/make5010014 - 05 Feb 2023
Abstract
Variational auto-encoders (VAEs) are deep generative models used for unsupervised learning; however, their standard version is not topology-aware in practice since the data topology may not be taken into consideration. In this paper, we propose two different approaches that aim to preserve the topological structure between the input space and the latent representation of a VAE. Firstly, we introduce InvMap-VAE as a way to turn any dimensionality reduction technique, given an embedding it produces, into a generative model within a VAE framework providing an inverse mapping into the original space. Secondly, we propose the Witness Simplicial VAE as an extension of the simplicial auto-encoder to the variational setup using a witness complex for computing the simplicial regularization, and we motivate this method theoretically using tools from algebraic topology. The Witness Simplicial VAE is independent of any dimensionality reduction technique and, together with its extension, Isolandmarks Witness Simplicial VAE, preserves the persistent Betti numbers of a dataset better than a standard VAE.
(This article belongs to the Topic Topology vs. Geometry in Data Analysis/Machine Learning)
Open Access Systematic Review
Machine Learning and Prediction of Infectious Diseases: A Systematic Review
Mach. Learn. Knowl. Extr. 2023, 5(1), 175-198; https://doi.org/10.3390/make5010013 - 01 Feb 2023
Cited by 1
Abstract
The aim of the study is to show whether it is possible to predict infectious disease outbreaks early, by using machine learning. This study was carried out following the guidelines of the Cochrane Collaboration and the meta-analysis of observational studies in epidemiology and the preferred reporting items for systematic reviews and meta-analyses. The suitable bibliography on PubMed/Medline and Scopus was searched by combining text, words, and titles on medical topics. At the end of the search, this systematic review contained 75 records. The studies analyzed in this systematic review demonstrate that it is possible to predict the incidence and trends of some infectious diseases; by combining several techniques and types of machine learning, it is possible to obtain accurate and plausible results.
(This article belongs to the Special Issue Machine Learning for Biomedical Data Processing)
Topics
Topic in
Algorithms, Data, Entropy, MAKE, Mathematics, Robotics
Safe and Secure Autonomous Systems
Topic Editors: Xiaowei Huang, Wenjie Ruan, Xingyu Zhao
Deadline: 30 June 2023
Topic in
Applied Sciences, Sensors, J. Imaging, MAKE
Applications in Image Analysis and Pattern Recognition
Topic Editors: Bin Fan, Wenqi Ren
Deadline: 31 August 2023
Topic in
Applied Sciences, Electronics, MAKE, J. Imaging, Sensors
Applied Computer Vision and Pattern Recognition: 2nd Volume
Topic Editors: Antonio Fernández-Caballero, Byung-Gyu Kim
Deadline: 30 September 2023
Topic in
Entropy, Algorithms, Computation, MAKE, Energies, Materials
Artificial Intelligence and Computational Methods: Modeling, Simulations and Optimization of Complex Systems
Topic Editors: Jaroslaw Krzywanski, Yunfei Gao, Marcin Sosnowski, Karolina Grabowska, Dorian Skrobek, Ghulam Moeen Uddin, Anna Kulakowska, Anna Zylka, Bachil El Fil
Deadline: 20 October 2023
Special Issues
Special Issue in
MAKE
Machine Learning for Biomedical Data Processing
Guest Editors: Abdulhamit Subasi, Humaira Nisar, Saeed Mian Qaisar
Deadline: 15 June 2023
Special Issue in
MAKE
Advances in Explainable Artificial Intelligence (XAI)
Guest Editor: Luca Longo
Deadline: 15 July 2023
Special Issue in
MAKE
Fairness and Explanation for Trustworthy AI
Guest Editors: Jianlong Zhou, Andreas Holzinger, Fang Chen
Deadline: 15 August 2023
Special Issue in
MAKE
Deep Learning in Image Analysis and Pattern Recognition
Guest Editors: Guoqing Chao, Xianzhi Wang
Deadline: 30 August 2023
Topical Collections
Topical Collection in
MAKE
Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction
Collection Editor: Andreas Holzinger