Mach. Learn. Knowl. Extr., Volume 5, Issue 4 (December 2023) – 33 articles

Cover Story: Unraveling the opacity of Deep Reinforcement Learning (DRL), our study delves into optimizing resource use. Contrary to the trend of increasing Experience Replay capacity, we intentionally reduce it, discovering a path to resource-efficient DRL. Across 20 Atari games and capacities from 1×10⁶ to 1×10², we show that reducing capacity from 1×10⁴ to 5×10³ doesn't significantly impact rewards. To enhance interpretability, we visualize Experience Replay with the Deep SHAP Explainer, providing transparent explanations for agent decisions. Our findings advocate for a cautious reduction in Experience Replay, emphasizing interpretable decision explanations for efficient DRL. View this paper
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • Papers are published in both HTML and PDF formats; the PDF is the official version. To view the papers in PDF format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.
26 pages, 16374 KiB  
Article
Statistical Analysis of Imbalanced Classification with Training Size Variation and Subsampling on Datasets of Research Papers in Biomedical Literature
by Jose Dixon and Md Rahman
Mach. Learn. Knowl. Extr. 2023, 5(4), 1953-1978; https://doi.org/10.3390/make5040095 - 11 Dec 2023
Viewed by 1783
Abstract
The overall purpose of this paper is to demonstrate how data preprocessing, training size variation, and subsampling can dynamically change the performance metrics of imbalanced text classification. The methodology combines two supervised learning classification approaches to feature engineering and data preprocessing, five machine learning classifiers, five imbalanced sampling techniques, specified intervals of training and subsampling sizes, and statistical analysis using R and tidyverse. The dataset consists of 1000 portable document format files, divided into five labels, drawn from the World Health Organization Coronavirus Research Downloadable Articles of COVID-19 papers and the PubMed Central databases of non-COVID-19 papers, and binary classification performance is measured with precision, recall, receiver operating characteristic area under the curve, and accuracy. One approach, which labels rows of sentences using regular expressions, significantly improved the performance of the imbalanced sampling techniques over another approach that labels sentences automatically according to how the documents are organized into positive and negative classes; this was verified by statistical analysis with a t-test on the performance metrics recorded across iterations. The study demonstrates the effectiveness of ML classifiers and sampling techniques in text classification datasets, with different performance levels and class imbalance issues observed in manual and automatic methods of data processing. Full article
(This article belongs to the Topic Bioinformatics and Intelligent Information Processing)
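
The interplay of training-size variation and majority-class subsampling described above can be illustrated with a minimal, hedged sketch: a toy imbalanced text-classification loop that varies the training fraction, undersamples the majority class, and records the same four metrics. The toy corpus, classifier choice, and fractions are assumptions for illustration, not the authors' pipeline.
```python
# Minimal sketch of training-size variation with majority-class subsampling.
# Toy data and a single classifier; not the authors' five-classifier pipeline.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical imbalanced corpus: 900 "negative" vs. 100 "positive" sentences.
neg = [f"routine biomedical sentence number {i}" for i in range(900)]
pos = [f"covid related finding sentence number {i}" for i in range(100)]
texts, labels = np.array(neg + pos), np.array([0] * 900 + [1] * 100)

X_tr_txt, X_te_txt, y_tr, y_te = train_test_split(
    texts, labels, test_size=0.3, stratify=labels, random_state=0)
vec = TfidfVectorizer()
X_tr, X_te = vec.fit_transform(X_tr_txt), vec.transform(X_te_txt)

for frac in (0.5, 0.75, 1.0):                        # training-size variation
    n = int(frac * X_tr.shape[0])
    idx = rng.choice(X_tr.shape[0], size=n, replace=False)
    Xs, ys = X_tr[idx], y_tr[idx]

    # Random undersampling of the majority class down to the minority count.
    maj, mino = np.where(ys == 0)[0], np.where(ys == 1)[0]
    keep = np.concatenate([rng.choice(maj, size=len(mino), replace=False), mino])
    clf = LogisticRegression(max_iter=1000).fit(Xs[keep], ys[keep])

    proba = clf.predict_proba(X_te)[:, 1]
    pred = (proba >= 0.5).astype(int)
    print(f"frac={frac:.2f}  acc={accuracy_score(y_te, pred):.3f}  "
          f"prec={precision_score(y_te, pred, zero_division=0):.3f}  "
          f"rec={recall_score(y_te, pred):.3f}  auc={roc_auc_score(y_te, proba):.3f}")
```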

16 pages, 1801 KiB  
Article
Effective Detection of Epileptic Seizures through EEG Signals Using Deep Learning Approaches
by Sakorn Mekruksavanich and Anuchit Jitpattanakul
Mach. Learn. Knowl. Extr. 2023, 5(4), 1937-1952; https://doi.org/10.3390/make5040094 - 11 Dec 2023
Viewed by 1836
Abstract
Epileptic seizures are a prevalent neurological condition that impacts a considerable portion of the global population. Timely and precise identification can result in as many as 70% of individuals achieving freedom from seizures. To achieve this, there is a pressing need for smart, automated systems to assist medical professionals in identifying neurological disorders correctly. Previous efforts have utilized raw electroencephalography (EEG) data and machine learning techniques to classify behaviors in patients with epilepsy. However, these studies required expertise in clinical domains like radiology and clinical procedures for feature extraction. Traditional machine learning for classification relied on manual feature engineering, limiting performance. Deep learning excels at automated feature learning directly from raw data sans human effort. For example, deep neural networks now show promise in analyzing raw EEG data to detect seizures, eliminating intensive clinical or engineering needs. Though still emerging, initial studies demonstrate practical applications across medical domains. In this work, we introduce a novel deep residual model called ResNet-BiGRU-ECA, analyzing brain activity through EEG data to accurately identify epileptic seizures. To evaluate our proposed deep learning model’s efficacy, we used a publicly available benchmark dataset on epilepsy. The results of our experiments demonstrated that our suggested model surpassed both the basic model and cutting-edge deep learning models, achieving an outstanding accuracy rate of 0.998 and the top F1-score of 0.998. Full article
(This article belongs to the Special Issue Sustainable Applications for Machine Learning)

16 pages, 7452 KiB  
Article
Social Intelligence Mining: Unlocking Insights from X
by Hossein Hassani, Nadejda Komendantova, Elena Rovenskaya and Mohammad Reza Yeganegi
Mach. Learn. Knowl. Extr. 2023, 5(4), 1921-1936; https://doi.org/10.3390/make5040093 - 11 Dec 2023
Cited by 1 | Viewed by 1627
Abstract
Social trend mining, situated at the confluence of data science and social research, provides a novel lens through which to examine societal dynamics and emerging trends. This paper explores the intricate landscape of social trend mining, with a specific emphasis on discerning leading and lagging trends. Within this context, our study employs social trend mining techniques to scrutinize X (formerly Twitter) data pertaining to risk management, earthquakes, and disasters. A comprehensive comprehension of how individuals perceive the significance of these pivotal facets within disaster risk management is essential for shaping policies that garner public acceptance. This paper sheds light on the intricacies of public sentiment and provides valuable insights for policymakers and researchers alike. Full article
(This article belongs to the Section Data)

16 pages, 731 KiB  
Article
Generalized Permutants and Graph GENEOs
by Faraz Ahmad, Massimo Ferri and Patrizio Frosini
Mach. Learn. Knowl. Extr. 2023, 5(4), 1905-1920; https://doi.org/10.3390/make5040092 - 09 Dec 2023
Viewed by 1231
Abstract
This paper is part of a line of research devoted to developing a compositional and geometric theory of Group Equivariant Non-Expansive Operators (GENEOs) for Geometric Deep Learning. It has two objectives. The first objective is to generalize the notions of permutants and permutant measures, originally defined for the identity of a single “perception pair”, to a map between two such pairs. The second and main objective is to extend the application domain of the whole theory, which arose in the set-theoretical and topological environments, to graphs. This is performed using classical methods of mathematical definitions and arguments. The theoretical outcome is that, both in the case of vertex-weighted and edge-weighted graphs, a coherent theory is developed. Several simple examples show what may be hoped from GENEOs and permutants in graph theory and how they can be built. Rather than being a competitor to other methods in Geometric Deep Learning, this theory is proposed as an approach that can be integrated with such methods. Full article
(This article belongs to the Section Visualization)

17 pages, 3015 KiB  
Article
Solving Partially Observable 3D-Visual Tasks with Visual Radial Basis Function Network and Proximal Policy Optimization
by Julien Hautot, Céline Teulière and Nourddine Azzaoui
Mach. Learn. Knowl. Extr. 2023, 5(4), 1888-1904; https://doi.org/10.3390/make5040091 - 01 Dec 2023
Viewed by 1420
Abstract
Visual Reinforcement Learning (RL) has been largely investigated in recent decades. Existing approaches are often composed of multiple networks requiring massive computational power to solve partially observable tasks from high-dimensional data such as images. Using State Representation Learning (SRL) has been shown to improve the performance of visual RL by reducing the high-dimensional data into a compact representation, but it still often relies on deep networks and on the environment. In contrast, we propose a lighter, more generic method to extract sparse and localized features from raw images without training. We achieve this using a Visual Radial Basis Function Network (VRBFN), which offers significant practical advantages, including efficient and accurate training with minimal complexity due to its two linear layers. For real-world applications, its scalability and resilience to noise are essential, as real sensors are subject to change and noise. Unlike CNNs, which may require extensive retraining, this network might only need minor fine-tuning. We test the efficiency of the VRBFN representation to solve different RL tasks using Proximal Policy Optimization (PPO). We present a large study and comparison of our extraction methods with five classical visual RL and SRL approaches on five different first-person partially observable scenarios. We show that this approach presents appealing features such as sparsity and robustness to noise and that the results obtained when training RL agents are better than those of the other tested methods on four of the five proposed scenarios. Full article
(This article belongs to the Section Learning)
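
A rough, hedged sketch of training-free radial-basis feature extraction in the spirit described above: fixed Gaussian units on a spatial grid produce one localized intensity response each, which could then feed a small policy network. Grid size and width are arbitrary choices here; this illustrates the general idea only, not the authors' VRBFN.
```python
# Training-free localized image features from fixed Gaussian units on a grid.
# Illustrative sketch only; centers and widths are arbitrary assumptions.
import numpy as np

def rbf_features(img, grid=8, sigma=0.08):
    """img: 2D grayscale array scaled to [0, 1]; one Gaussian-weighted response per grid cell."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    ys, xs = ys / (h - 1), xs / (w - 1)                  # normalized pixel coordinates
    centers = (np.arange(grid) + 0.5) / grid
    feats = []
    for cy in centers:
        for cx in centers:
            weight = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))
            feats.append(float((weight * img).sum() / weight.sum()))   # localized intensity response
    return np.array(feats)                               # length grid * grid, no training involved

frame = np.random.default_rng(0).random((64, 64))        # stand-in for a rendered first-person frame
print(rbf_features(frame).shape)                         # (64,)
```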

11 pages, 695 KiB  
Article
Bayesian Network Structural Learning Using Adaptive Genetic Algorithm with Varying Population Size
by Rafael Rodrigues Mendes Ribeiro and Carlos Dias Maciel
Mach. Learn. Knowl. Extr. 2023, 5(4), 1877-1887; https://doi.org/10.3390/make5040090 - 01 Dec 2023
Viewed by 1394
Abstract
A Bayesian network (BN) is a probabilistic graphical model that can model complex and nonlinear relationships. Its structural learning from data is an NP-hard problem because of its search-space size. One method to perform structural learning is a search and score approach, which uses a search algorithm and structural score. A study comparing 15 algorithms showed that hill climbing (HC) and tabu search (TABU) performed the best overall on the tests. This work performs a deeper analysis of the application of the adaptive genetic algorithm with varying population size (AGAVaPS) on the BN structural learning problem, on which a preliminary test showed it had the potential to perform well. AGAVaPS is a genetic algorithm that uses the concept of life, where each solution stays in the population for a number of iterations. Each individual also has its own mutation rate, and there is a small probability of undergoing mutation twice. A parameter analysis of AGAVaPS in BN structural learning was performed. AGAVaPS was also compared to HC and TABU on six literature datasets considering F1 score, structural Hamming distance (SHD), balanced scoring function (BSF), Bayesian information criterion (BIC), and execution time. HC and TABU performed essentially the same across all tests. AGAVaPS performed better than the other algorithms for F1 score, SHD, and BIC, showing that it can perform well and is a good choice for BN structural learning. Full article
(This article belongs to the Section Learning)
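
The "concept of life" and per-individual mutation rates mentioned above can be sketched generically. The toy below is a hedged skeleton of a genetic algorithm with lifespans and individual mutation rates; the fitness function is a placeholder, not a Bayesian-network structural score such as BIC, and the refill rule is a simplification (in AGAVaPS the population size itself varies).
```python
# Toy genetic-algorithm skeleton with per-individual lifespan and mutation rate,
# in the spirit of AGAVaPS. Fitness is a placeholder, not a BN structural score.
import random

N_EDGES = 6                        # candidate edges encoded as bits of an adjacency mask
TARGET = [1, 0, 1, 1, 0, 1]        # stand-in "true structure"

def fitness(bits):                 # placeholder for BIC/BSF scoring of a BN structure
    return sum(b == t for b, t in zip(bits, TARGET))

def new_individual():
    return {"genes": [random.randint(0, 1) for _ in range(N_EDGES)],
            "life": random.randint(2, 6),              # iterations it remains in the population
            "mut_rate": random.uniform(0.05, 0.3)}     # its own mutation rate

def mutate(ind, passes):
    for _ in range(passes):
        for i in range(N_EDGES):
            if random.random() < ind["mut_rate"]:
                ind["genes"][i] ^= 1

population = [new_individual() for _ in range(20)]
for _ in range(30):
    for ind in population:
        mutate(ind, passes=2 if random.random() < 0.1 else 1)   # small chance of mutating twice
        ind["life"] -= 1
    population = [ind for ind in population if ind["life"] > 0]             # death by age
    population += [new_individual() for _ in range(20 - len(population))]   # refill with newcomers
best = max(population, key=lambda ind: fitness(ind["genes"]))
print(fitness(best["genes"]), best["genes"])
```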

29 pages, 19255 KiB  
Article
Analysing Semi-Supervised ConvNet Model Performance with Computation Processes
by Elie Neghawi and Yan Liu
Mach. Learn. Knowl. Extr. 2023, 5(4), 1848-1876; https://doi.org/10.3390/make5040089 - 29 Nov 2023
Viewed by 1257
Abstract
The rapid development of semi-supervised machine learning (SSML) algorithms has shown enhanced versatility, but pinpointing the primary influencing factors remains a challenge. Historically, deep neural networks (DNNs) have been used to underpin these algorithms, resulting in improved classification precision. This study aims to delve into the performance determinants of SSML models by employing post-hoc explainable artificial intelligence (XAI) methods. By analyzing the components of well-established SSML algorithms and comparing them to newer counterparts, this work redefines semi-supervised computation processes for both data preprocessing and classification. Integrating different types of DNNs, we evaluated the effects of parameter adjustments during training across varied labeled and unlabeled data proportions. Our analysis of 45 experiments showed a notable 8% drop in training loss and a 6.75% enhancement in learning precision when using the Shake-Shake26 classifier with the RemixMatch SSML algorithm. Additionally, our findings suggest a strong positive relationship between the amount of labeled data and training duration, indicating that more labeled data leads to extended training periods, which further influences parameter adjustments in learning processes. Full article
(This article belongs to the Section Learning)

22 pages, 2387 KiB  
Article
Android Malware Classification Based on Fuzzy Hashing Visualization
by Horacio Rodriguez-Bazan, Grigori Sidorov and Ponciano Jorge Escamilla-Ambrosio
Mach. Learn. Knowl. Extr. 2023, 5(4), 1826-1847; https://doi.org/10.3390/make5040088 - 28 Nov 2023
Cited by 1 | Viewed by 1547
Abstract
The proliferation of Android-based devices has brought about an unprecedented surge in mobile application usage, making the Android ecosystem a prime target for cybercriminals. In this paper, a new method for Android malware classification is proposed. The method implements a convolutional neural network for malware classification using images. The research presents a novel approach to transforming the Android Application Package (APK) into a grayscale image. The image creation utilizes natural language processing techniques for text cleaning, extraction, and fuzzy hashing to represent the decompiled code from the APK in a set of hashes after preprocessing, where the image is composed of n fuzzy hashes that represent an APK. The method was tested on an Android malware dataset with 15,493 samples of five malware types. The proposed method showed an increase in accuracy compared to others in the literature, achieving up to 98.24% in the classification task. Full article
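
The image-construction step described above, turning per-chunk hashes of decompiled code into grayscale image rows, can be hedged-sketched as follows. For simplicity, SHA-256 stands in for a real fuzzy-hashing scheme, and the code fragments are invented, so this illustrates only the general mechanism, not the authors' method.
```python
# Sketch of building a grayscale "hash image" from decompiled code chunks.
# SHA-256 is a stand-in for a fuzzy hash; chunks are hypothetical examples.
import hashlib
import numpy as np

def code_chunks_to_image(chunks, width=32):
    """Each chunk of cleaned decompiled code becomes one image row of byte values."""
    rows = []
    for chunk in chunks:
        digest = hashlib.sha256(chunk.encode("utf-8")).digest()   # 32 bytes -> 32 pixels
        rows.append(np.frombuffer(digest, dtype=np.uint8)[:width])
    return np.vstack(rows)                                        # shape: (n_chunks, width)

# Hypothetical preprocessed code fragments extracted from an APK.
chunks = ["invoke-virtual getDeviceId", "const-string http://example.com", "sput-object admin_flag"]
img = code_chunks_to_image(chunks)
print(img.shape, img.dtype)        # (3, 32) uint8 -- ready for a CNN after resizing
```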

30 pages, 3241 KiB  
Article
Detecting Adversarial Examples Using Surrogate Models
by Borna Feldsar, Rudolf Mayer and Andreas Rauber
Mach. Learn. Knowl. Extr. 2023, 5(4), 1796-1825; https://doi.org/10.3390/make5040087 - 27 Nov 2023
Viewed by 1653
Abstract
Deep Learning has enabled significant progress towards more accurate predictions and is increasingly integrated into our everyday lives in real-world applications; this is true especially for Convolutional Neural Networks (CNNs) in the field of image analysis. Nevertheless, it has been shown that Deep Learning is vulnerable against well-crafted, small perturbations to the input, i.e., adversarial examples. Defending against such attacks is therefore crucial to ensure the proper functioning of these models—especially when autonomous decisions are taken in safety-critical applications, such as autonomous vehicles. In this work, shallow machine learning models, such as Logistic Regression and Support Vector Machine, are utilised as surrogates of a CNN based on the assumption that they would be differently affected by the minute modifications crafted for CNNs. We develop three detection strategies for adversarial examples by analysing differences in the prediction of the surrogate and the CNN model: namely, deviation in (i) the prediction, (ii) the distance of the predictions, and (iii) the confidence of the predictions. We consider three different feature spaces: raw images, extracted features, and the activations of the CNN model. Our evaluation shows that our methods achieve state-of-the-art performance compared to other approaches, such as Feature Squeezing, MagNet, PixelDefend, and Subset Scanning, on the MNIST, Fashion-MNIST, and CIFAR-10 datasets while being robust in the sense that they do not entirely fail against selected single attacks. Further, we evaluate our defence against an adaptive attacker in a grey-box setting. Full article
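
The core detection idea, flagging inputs on which a shallow surrogate and the CNN disagree, can be shown with a hedged sketch. The CNN outputs are simulated here, the task is synthetic, and the threshold is an arbitrary choice; this is an illustration of detection strategies (i) and (ii), not the paper's evaluated detectors or feature spaces.
```python
# Sketch of surrogate-based adversarial-example detection: a shallow model trained
# on the same task flags inputs where its prediction deviates from the CNN's.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)            # stand-in task labels

surrogate = LogisticRegression(max_iter=1000).fit(X, y)  # shallow surrogate of the CNN

def is_adversarial(x, cnn_probs, threshold=0.4):
    """Flag x if surrogate and CNN disagree, or if their predictions are far apart."""
    sur_probs = surrogate.predict_proba(x.reshape(1, -1))[0]
    mismatch = sur_probs.argmax() != cnn_probs.argmax()              # (i) prediction deviation
    distance = 0.5 * float(np.abs(sur_probs - cnn_probs).sum())      # (ii) distance of predictions
    return mismatch or distance > threshold

x = np.zeros(20)
x[0] = 2.0                                    # clearly class 1 for the surrogate
cnn_ok = np.array([0.05, 0.95])               # CNN agrees on the clean input
cnn_fooled = np.array([0.95, 0.05])           # a crafted perturbation flips the CNN, not the surrogate
print(is_adversarial(x, cnn_ok), is_adversarial(x, cnn_fooled))   # False True
```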

36 pages, 1553 KiB  
Article
Explainable Artificial Intelligence Using Expressive Boolean Formulas
by Gili Rosenberg, John Kyle Brubaker, Martin J. A. Schuetz, Grant Salton, Zhihuai Zhu, Elton Yechao Zhu, Serdar Kadıoğlu, Sima E. Borujeni and Helmut G. Katzgraber
Mach. Learn. Knowl. Extr. 2023, 5(4), 1760-1795; https://doi.org/10.3390/make5040086 - 24 Nov 2023
Cited by 1 | Viewed by 5199
Abstract
We propose and implement an interpretable machine learning classification model for Explainable AI (XAI) based on expressive Boolean formulas. Potential applications include credit scoring and diagnosis of medical conditions. The Boolean formula defines a rule with tunable complexity (or interpretability) according to which input data are classified. Such a formula can include any operator that can be applied to one or more Boolean variables, thus providing higher expressivity compared to more rigid rule- and tree-based approaches. The classifier is trained using native local optimization techniques, efficiently searching the space of feasible formulas. Shallow rules can be determined by fast Integer Linear Programming (ILP) or Quadratic Unconstrained Binary Optimization (QUBO) solvers, potentially powered by special-purpose hardware or quantum devices. We combine the expressivity and efficiency of the native local optimizer with the fast operation of these devices by executing non-local moves that optimize over the subtrees of the full Boolean formula. We provide extensive numerical benchmarking results featuring several baselines on well-known public datasets. Based on the results, we find that the native local rule classifier is generally competitive with the other classifiers. The addition of non-local moves achieves similar results with fewer iterations. Therefore, using specialized or quantum hardware could lead to a significant speedup through the rapid proposal of non-local moves. Full article
(This article belongs to the Special Issue Advances in Explainable Artificial Intelligence (XAI): 2nd Edition)
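
A hedged sketch of the kind of rule such a classifier produces: a small Boolean formula represented as an expression tree and evaluated on binarized features, including a non-standard operator beyond AND/OR. The formula, feature names, and operators below are invented for illustration; the paper's native local optimizer and ILP/QUBO subtree moves are not shown.
```python
# Evaluating an expressive Boolean-formula rule on binarized features.
# The formula is invented for illustration; it is not a learned model.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Node:
    op: Callable[[List[bool]], bool]   # operator applied to the children's values
    children: list                     # Nodes or feature-name strings (leaves)

def evaluate(node, features):
    if isinstance(node, str):                      # leaf: look up a Boolean feature
        return features[node]
    return node.op([evaluate(c, features) for c in node.children])

AND = lambda v: all(v)
OR = lambda v: any(v)
AT_LEAST_2 = lambda v: sum(v) >= 2                 # richer operator than plain AND/OR

# (income_high OR savings_high) AND AtLeast2(no_defaults, long_history, low_debt)
rule = Node(AND, [Node(OR, ["income_high", "savings_high"]),
                  Node(AT_LEAST_2, ["no_defaults", "long_history", "low_debt"])])

applicant = {"income_high": False, "savings_high": True,
             "no_defaults": True, "long_history": False, "low_debt": True}
print(evaluate(rule, applicant))    # True -> e.g. "approve" under this toy rule
```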

14 pages, 2157 KiB  
Article
FCIoU: A Targeted Approach for Improving Minority Class Detection in Semantic Segmentation Systems
by Jonathan Plangger, Mohamed Atia and Hicham Chaoui
Mach. Learn. Knowl. Extr. 2023, 5(4), 1746-1759; https://doi.org/10.3390/make5040085 - 23 Nov 2023
Viewed by 1311
Abstract
In this paper, we present a comparative study of modern semantic segmentation loss functions and their resultant impact when applied with state-of-the-art off-road datasets. Class imbalance, inherent in these datasets, presents a significant challenge to off-road terrain semantic segmentation systems. With numerous environment classes being extremely sparse and underrepresented, model training becomes inefficient and struggles to comprehend the infrequent minority classes. As a solution to this problem, loss functions have been configured to take class imbalance into account and counteract this issue. To this end, we present a novel loss function, Focal Class-based Intersection over Union (FCIoU), which directly targets performance imbalance through the optimization of class-based Intersection over Union (IoU). The new loss function results in a general increase in class-based performance when compared to state-of-the-art targeted loss functions. Full article
(This article belongs to the Section Learning)
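
The class-based IoU target that FCIoU optimizes can be hedged-sketched as a per-class soft IoU computed from predicted probabilities and one-hot labels, with a focal-style modulation that up-weights poorly segmented (typically minority) classes. The exact focal form (1 − IoU)^γ below is an assumption for illustration, not the published FCIoU definition.
```python
# Focal, class-based soft-IoU loss on dummy data. The (1 - IoU)^gamma modulation
# is an illustrative choice, not the paper's exact FCIoU formulation.
import numpy as np

def focal_class_iou_loss(probs, onehot, gamma=2.0, eps=1e-6):
    """probs, onehot: arrays of shape (n_pixels, n_classes)."""
    inter = (probs * onehot).sum(axis=0)                        # per-class soft intersection
    union = (probs + onehot - probs * onehot).sum(axis=0)       # per-class soft union
    iou = (inter + eps) / (union + eps)
    return float(np.mean((1.0 - iou) ** gamma))                 # poorly segmented classes dominate

rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 4))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)   # softmax over 4 classes
labels = rng.integers(0, 4, size=1000)
onehot = np.eye(4)[labels]
print(round(focal_class_iou_loss(probs, onehot), 4))
```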

29 pages, 1807 KiB  
Article
Active Learning in the Detection of Anomalies in Cryptocurrency Transactions
by Leandro L. Cunha, Miguel A. Brito, Domingos F. Oliveira and Ana P. Martins
Mach. Learn. Knowl. Extr. 2023, 5(4), 1717-1745; https://doi.org/10.3390/make5040084 - 23 Nov 2023
Viewed by 1628
Abstract
The cryptocurrency market has grown significantly, and this quick growth has given rise to scams, making fraud detection mechanisms necessary. This work addresses the challenge of inadequate labeling, a barrier to the training of high-performance supervised classifiers, and aims to lessen the need for laborious and time-consuming manual labeling. Some unlabeled data points have labels that are more pertinent and informative for the supervised model to learn from. We investigate the viability of using unsupervised anomaly detection algorithms and active learning strategies to build an iterative process of acquiring labeled transactions in a cold start scenario, where there are no initially labeled transactions. The goal is to investigate anomaly detection capabilities for a subset of data that maximizes the supervised models' learning potential. According to the results, the anomaly detection algorithms underperformed. The findings suggest that anomaly detection algorithms should be reserved for cold start situations, after which active learning techniques would produce better outcomes and stronger supervised machine learning model performance. Full article
(This article belongs to the Section Privacy)

37 pages, 53762 KiB  
Review
A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS
by Juan Terven, Diana-Margarita Córdova-Esparza and Julio-Alejandro Romero-González
Mach. Learn. Knowl. Extr. 2023, 5(4), 1680-1716; https://doi.org/10.3390/make5040083 - 20 Nov 2023
Cited by 87 | Viewed by 14882
Abstract
YOLO has become a central real-time object detection system for robotics, driverless cars, and video monitoring applications. We present a comprehensive analysis of YOLO’s evolution, examining the innovations and contributions in each iteration from the original YOLO up to YOLOv8, YOLO-NAS, and YOLO with transformers. We start by describing the standard metrics and postprocessing; then, we discuss the major changes in network architecture and training tricks for each model. Finally, we summarize the essential lessons from YOLO’s development and provide a perspective on its future, highlighting potential research directions to enhance real-time object detection systems. Full article
(This article belongs to the Special Issue Deep Learning in Image Analysis and Pattern Recognition)

20 pages, 3524 KiB  
Article
Proximal Policy Optimization-Based Reinforcement Learning and Hybrid Approaches to Explore the Cross Array Task Optimal Solution
by Samuel Corecco, Giorgia Adorni and Luca Maria Gambardella
Mach. Learn. Knowl. Extr. 2023, 5(4), 1660-1679; https://doi.org/10.3390/make5040082 - 20 Nov 2023
Viewed by 2172
Abstract
In an era characterised by rapid technological advancement, the application of algorithmic approaches to address complex problems has become crucial across various disciplines. Within the realm of education, there is growing recognition of the pivotal role played by computational thinking (CT). This skill set has emerged as indispensable in our ever-evolving digital landscape, accompanied by an equal need for effective methods to assess and measure these skills. This research places its focus on the Cross Array Task (CAT), an educational activity designed within the Swiss educational system to assess students' algorithmic skills. Its primary objective is to evaluate pupils' ability to deconstruct complex problems into manageable steps and systematically formulate sequential strategies. The CAT has proven its effectiveness as an educational tool in tracking and monitoring the development of CT skills throughout compulsory education. Additionally, this task presents an enthralling avenue for algorithmic research, owing to its inherent complexity and the necessity to scrutinise the intricate interplay between different strategies and the structural aspects of this activity. This task, deeply rooted in logical reasoning and intricate problem solving, often poses a substantial challenge for human solvers striving for optimal solutions. Consequently, the exploration of computational power to unearth optimal solutions or uncover less intuitive strategies presents a captivating and promising endeavour. This paper explores two distinct algorithmic approaches to the CAT problem. The first approach combines clustering, random search, and move selection to find optimal solutions. The second approach employs reinforcement learning techniques focusing on the Proximal Policy Optimization (PPO) model. The findings of this research not only hold the potential to deepen our understanding of how machines can effectively tackle complex challenges like the CAT problem, but also have broad implications, particularly in educational contexts, where these approaches can be seamlessly integrated into existing tools as a tutoring mechanism, offering assistance to students encountering difficulties. This can ultimately enhance students' CT and problem-solving abilities, leading to an enriched educational experience. Full article

48 pages, 12604 KiB  
Systematic Review
Human Pose Estimation Using Deep Learning: A Systematic Literature Review
by Esraa Samkari, Muhammad Arif, Manal Alghamdi and Mohammed A. Al Ghamdi
Mach. Learn. Knowl. Extr. 2023, 5(4), 1612-1659; https://doi.org/10.3390/make5040081 - 13 Nov 2023
Cited by 5 | Viewed by 4817
Abstract
Human Pose Estimation (HPE) is the task that aims to predict the location of human joints from images and videos. This task is used in many applications, such as sports analysis and surveillance systems. Recently, several studies have embraced deep learning to enhance the performance of HPE tasks. However, building an efficient HPE model is difficult; many challenges, like crowded scenes and occlusion, must be handled. This paper followed a systematic procedure to review different HPE models comprehensively. About 100 articles published since 2014 on HPE using deep learning were selected using several selection criteria. Both image and video data types of methods were investigated. Furthermore, both single and multiple HPE methods were reviewed. In addition, the available datasets, different loss functions used in HPE, and pretrained feature extraction models were all covered. Our analysis revealed that Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) are the most used in HPE. Moreover, occlusion and crowd scenes remain the main problems affecting models’ performance. Therefore, the paper presented various solutions to address these issues. Finally, this paper highlighted the potential opportunities for future work in this task. Full article
(This article belongs to the Section Learning)

23 pages, 5671 KiB  
Article
Reconstruction-Based Adversarial Attack Detection in Vision-Based Autonomous Driving Systems
by Manzoor Hussain and Jang-Eui Hong
Mach. Learn. Knowl. Extr. 2023, 5(4), 1589-1611; https://doi.org/10.3390/make5040080 - 07 Nov 2023
Cited by 1 | Viewed by 1719
Abstract
The perception system is a safety-critical component that directly impacts the overall safety of autonomous driving systems (ADSs). It is imperative to ensure the robustness of the deep-learning model used in the perception system. However, studies have shown that these models are highly vulnerable to the adversarial perturbation of input data. The existing works mainly focused on studying the impact of these adversarial attacks on classification rather than regression models. Therefore, this paper first introduces two generalized methods for perturbation-based attacks: (1) we use naturally occurring noise to create perturbations in the input data, and (2) we introduce modified square, HopSkipJump, and decision-based/boundary attacks against the regression models used in ADSs. Then, we propose a deep-autoencoder-based adversarial attack detector. In addition to offline evaluation metrics (e.g., F1 score and precision), we introduce an online evaluation framework to evaluate the robustness of the model under attack. The framework considers the reconstruction loss of the deep autoencoder to validate the robustness of the models under attack in an end-to-end fashion at runtime. Our experimental results showed that the proposed adversarial attack detector could detect square, HopSkipJump, and decision-based/boundary attacks with a true positive rate (TPR) of 93%. Full article
(This article belongs to the Section Network)
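
The detector's core decision rule, thresholding the reconstruction error of inputs against a threshold calibrated on clean data, can be shown with a hedged sketch. A truncated-SVD projection stands in for the deep autoencoder, and the data, percentile, and "attack" are arbitrary assumptions; this is not the paper's model or online evaluation framework.
```python
# Reconstruction-error attack detection: compress clean inputs to a low-dimensional
# code (SVD stands in for a deep autoencoder), calibrate a threshold on clean data,
# and flag inputs that reconstruct poorly.
import numpy as np

rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 8))
basis = rng.normal(size=(8, 64))
clean = latent @ basis + 0.05 * rng.normal(size=(500, 64))    # clean inputs with low-dim structure

mean = clean.mean(axis=0)
_, _, Vt = np.linalg.svd(clean - mean, full_matrices=False)
W = Vt[:8]                                                    # tied encoder/decoder weights

def recon_error(x):
    code = (x - mean) @ W.T
    recon = code @ W + mean
    return float(np.mean((x - recon) ** 2))

threshold = np.percentile([recon_error(x) for x in clean], 99)   # calibrated on clean data

perturbed = clean[0] + 0.5 * rng.normal(size=64)                 # crude stand-in for an attack
print(recon_error(clean[0]) <= threshold, recon_error(perturbed) > threshold)  # True True (typically)
```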

19 pages, 1405 KiB  
Article
Explainable Stacked Ensemble Deep Learning (SEDL) Framework to Determine Cause of Death from Verbal Autopsies
by Michael T. Mapundu, Chodziwadziwa W. Kabudula, Eustasius Musenge, Victor Olago and Turgay Celik
Mach. Learn. Knowl. Extr. 2023, 5(4), 1570-1588; https://doi.org/10.3390/make5040079 - 25 Oct 2023
Cited by 3 | Viewed by 1526
Abstract
Verbal autopsies (VA) are commonly used in Low- and Medium-Income Countries (LMIC) to determine cause of death (CoD) where death occurs outside clinical settings, with the most commonly used international gold standard being physician medical certification. Interviewers elicit information from relatives of the deceased, regarding circumstances and events that might have led to death. This information is stored in textual format as VA narratives. The narratives entail detailed information that can be used to determine CoD. However, this approach still remains a manual task that is costly, inconsistent, time-consuming and subjective (prone to errors), amongst many drawbacks. As such, this negatively affects the VA reporting process, despite it being vital for strengthening health priorities and informing civil registration systems. Therefore, this study seeks to close this gap by applying novel deep learning (DL) interpretable approaches for reviewing VA narratives and generate CoD prediction in a timely, easily interpretable, cost-effective and error-free way. We validate our DL models using optimisation and performance accuracy machine learning (ML) curves as a function of training samples. We report on validation with training set accuracy (LSTM = 76.11%, CNN = 76.35%, and SEDL = 82.1%), validation accuracy (LSTM = 67.05%, CNN = 66.16%, and SEDL = 82%) and test set accuracy (LSTM = 67%, CNN = 66.2%, and SEDL = 82%) for our models. Furthermore, we also present Local Interpretable Model-agnostic Explanations (LIME) for ease of interpretability of the results, thereby building trust in the use of machines in healthcare. We presented robust deep learning methods to determine CoD from VAs, with the stacked ensemble deep learning (SEDL) approaches performing optimally and better than Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN). Our empirical results suggest that ensemble DL methods may be integrated in the CoD process to help experts get to a diagnosis. Ultimately, this will reduce the turnaround time needed by physicians to go through the narratives in order to be able to give an appropriate diagnosis, cut costs and minimise errors. This study was limited by the number of samples needed for training our models and the high levels of lexical variability in the words used in our textual information. Full article
(This article belongs to the Special Issue Advances in Explainable Artificial Intelligence (XAI): 2nd Edition)

13 pages, 632 KiB  
Article
Evaluating the Role of Machine Learning in Defense Applications and Industry
by Evaldo Jorge Alcántara Suárez and Victor Monzon Baeza
Mach. Learn. Knowl. Extr. 2023, 5(4), 1557-1569; https://doi.org/10.3390/make5040078 - 22 Oct 2023
Viewed by 3844
Abstract
Machine learning (ML) has become a critical technology in the defense sector, enabling the development of advanced systems for threat detection, decision making, and autonomous operations. However, the increasing ML use in defense systems has raised ethical concerns related to accountability, transparency, and bias. In this paper, we provide a comprehensive analysis of the impact of ML on the defense sector, including the benefits and drawbacks of using ML in various applications such as surveillance, target identification, and autonomous weapons systems. We also discuss the ethical implications of using ML in defense, focusing on privacy, accountability, and bias issues. Finally, we present recommendations for mitigating these ethical concerns, including increased transparency, accountability, and stakeholder involvement in designing and deploying ML systems in the defense sector. Full article
(This article belongs to the Special Issue Fairness and Explanation for Trustworthy AI)

18 pages, 2565 KiB  
Article
Enhancing Electrocardiogram (ECG) Analysis of Implantable Cardiac Monitor Data: An Efficient Pipeline for Multi-Label Classification
by Amnon Bleich, Antje Linnemann, Benjamin Jaidi, Björn H. Diem and Tim O. F. Conrad
Mach. Learn. Knowl. Extr. 2023, 5(4), 1539-1556; https://doi.org/10.3390/make5040077 - 21 Oct 2023
Viewed by 1565
Abstract
Implantable Cardiac Monitor (ICM) devices represent, as of today, the fastest-growing market for implantable cardiac devices. As such, they are becoming increasingly common in patients for measuring heart electrical activity. ICMs constantly monitor and record a patient's heart rhythm, and when triggered, send it to a secure server where health care professionals (HCPs) can review it. These devices employ a relatively simplistic rule-based algorithm (due to energy consumption constraints) to make alerts for abnormal heart rhythms. This algorithm is usually parameterized to an over-sensitive mode in order to not miss a case (resulting in a relatively high false-positive rate), and this, combined with the device's nature of constantly monitoring the heart rhythm and its growing popularity, results in HCPs having to analyze and diagnose an ever-growing volume of data. In order to reduce this load, automated methods for ECG analysis have become a valuable tool for assisting HCPs in their analysis. While state-of-the-art algorithms are data-driven rather than rule-based, training data for ICMs often consist of specific characteristics that make their analysis unique and particularly challenging. This study presents the challenges and solutions in automatically analyzing ICM data and introduces a method for its classification that outperforms existing methods on such data. It does so by combining high-frequency noise detection (which often occurs in ICM data) with a semi-supervised learning pipeline that allows for the re-labeling of training episodes, and by using segmentation and dimension-reduction techniques that are robust to morphology variations of the sECG signal (which are typical of ICM data). As a result, it performs better than state-of-the-art techniques on such data, with, e.g., an F1 score of 0.51 vs. 0.38 for our baseline state-of-the-art technique in correctly calling atrial fibrillation in ICM data. As such, it could be used in numerous ways, such as aiding HCPs in the analysis of ECGs originating from ICMs by, e.g., suggesting a rhythm type. Full article
(This article belongs to the Topic Bioinformatics and Intelligent Information Processing)

20 pages, 1444 KiB  
Article
FairCaipi: A Combination of Explanatory Interactive and Fair Machine Learning for Human and Machine Bias Reduction
by Louisa Heidrich, Emanuel Slany, Stephan Scheele and Ute Schmid
Mach. Learn. Knowl. Extr. 2023, 5(4), 1519-1538; https://doi.org/10.3390/make5040076 - 18 Oct 2023
Viewed by 1542
Abstract
The rise of machine-learning applications in domains with critical end-user impact has led to a growing concern about the fairness of learned models, with the goal of avoiding biases that negatively impact specific demographic groups. Most existing bias-mitigation strategies adapt the importance of data instances during pre-processing. Since fairness is a contextual concept, we advocate for an interactive machine-learning approach that enables users to provide iterative feedback for model adaptation. Specifically, we propose to adapt the explanatory interactive machine-learning approach Caipi for fair machine learning. FairCaipi incorporates human feedback in the loop on predictions and explanations to improve the fairness of the model. Experimental results demonstrate that FairCaipi outperforms a state-of-the-art pre-processing bias mitigation strategy in terms of the fairness and the predictive performance of the resulting machine-learning model. We show that FairCaipi can both uncover and reduce bias in machine-learning models and allows us to detect human bias. Full article
(This article belongs to the Special Issue Fairness and Explanation for Trustworthy AI)

26 pages, 4285 KiB  
Article
Deep Learning Techniques for Radar-Based Continuous Human Activity Recognition
by Ruchita Mehta, Sara Sharifzadeh, Vasile Palade, Bo Tan, Alireza Daneshkhah and Yordanka Karayaneva
Mach. Learn. Knowl. Extr. 2023, 5(4), 1493-1518; https://doi.org/10.3390/make5040075 - 14 Oct 2023
Cited by 3 | Viewed by 1705
Abstract
Human capability to perform routine tasks declines with age and age-related problems. Remote human activity recognition (HAR) is beneficial for regular monitoring of the elderly population. This paper addresses the problem of the continuous detection of daily human activities using a mm-wave Doppler radar. In this study, two strategies have been employed: the first method uses un-equalized series of activities, whereas the second method utilizes a gradient-based strategy for equalization of the series of activities. The dynamic time warping (DTW) algorithm and Long Short-Term Memory (LSTM) techniques have been implemented for the classification of un-equalized and equalized series of activities, respectively. The input for DTW was provided using three strategies. The first approach uses the pixel-level data of frames (UnSup-PLevel). In the other two strategies, a convolutional variational autoencoder (CVAE) is used to extract Un-Supervised Encoded features (UnSup-EnLevel) and Supervised Encoded features (Sup-EnLevel) from the series of Doppler frames. The second approach for equalized data series involves the application of four distinct feature extraction methods: convolutional neural networks (CNN), supervised and unsupervised CVAE, and principal component analysis (PCA). The extracted features were considered as an input to the LSTM. This paper presents a comparative analysis of a novel supervised feature extraction pipeline, employing Sup-EnLevel-DTW and Sup-EnLevel-LSTM, against several state-of-the-art unsupervised methods, including UnSup-EnLevel-DTW, UnSup-EnLevel-LSTM, CNN-LSTM, and PCA-LSTM. The results demonstrate the superiority of the Sup-EnLevel-LSTM strategy. However, the UnSup-PLevel strategy worked surprisingly well without using annotations and frame equalization. Full article
(This article belongs to the Section Learning)
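
Of the two classification routes described above, the un-equalized route rests on dynamic time warping, a classic algorithm that can be sketched directly. The implementation below is a textbook DTW distance on 1D feature sequences of unequal length, with toy traces as assumptions; it is not the authors' pipeline, which operates on CVAE-encoded Doppler frames.
```python
# Textbook dynamic time warping (DTW) distance between two feature sequences of
# unequal length, as used for nearest-neighbour classification of activity series.
import numpy as np

def dtw_distance(a, b):
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

walk = np.sin(np.linspace(0, 6, 80))            # toy "walking" Doppler feature trace
walk_slow = np.sin(np.linspace(0, 6, 120))      # same activity performed more slowly
sit = np.zeros(90)                              # toy "sitting" trace
print(dtw_distance(walk, walk_slow) < dtw_distance(walk, sit))   # True: matched by shape, not length
```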

19 pages, 9171 KiB  
Article
Similarity-Based Framework for Unsupervised Domain Adaptation: Peer Reviewing Policy for Pseudo-Labeling
by Joel Arweiler, Cihan Ates, Jesus Cerquides, Rainer Koch and Hans-Jörg Bauer
Mach. Learn. Knowl. Extr. 2023, 5(4), 1474-1492; https://doi.org/10.3390/make5040074 - 12 Oct 2023
Viewed by 1411
Abstract
The inherent dependency of deep learning models on labeled data is a well-known problem and one of the barriers that slows down the integration of such methods into different fields of applied sciences and engineering, in which experimental and numerical methods can easily generate a colossal amount of unlabeled data. This paper proposes an unsupervised domain adaptation methodology that mimics the peer review process to label new observations in a different domain from the training set. The approach evaluates the validity of a hypothesis using domain knowledge acquired from the training set through a similarity analysis, exploring the projected feature space to examine the class centroid shifts. The methodology is tested on a binary classification problem, where synthetic images of cubes and cylinders in different orientations are generated. The methodology improves the accuracy of the object classifier from 60% to around 90% in the case of a domain shift in physical feature space without human labeling. Full article
(This article belongs to the Special Issue Deep Learning in Image Analysis and Pattern Recognition)
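
The centroid-shift acceptance idea above can be illustrated with a hedged sketch: pseudo-label a target-domain sample only if adding it to the hypothesized class barely moves that class's centroid learned from the source domain. The distance test, threshold, and 2D toy features are simplifications, not the paper's peer-reviewing policy or projected feature space.
```python
# Similarity-gated pseudo-labeling: accept a predicted label only if the class
# centroid barely shifts when the new sample is added. Toy 2D features only.
import numpy as np

rng = np.random.default_rng(0)
src_cube = rng.normal([0, 0], 0.3, size=(100, 2))     # source-domain features, class "cube"
src_cyl = rng.normal([3, 3], 0.3, size=(100, 2))      # source-domain features, class "cylinder"
centroids = {"cube": src_cube.mean(axis=0), "cylinder": src_cyl.mean(axis=0)}
members = {"cube": [src_cube], "cylinder": [src_cyl]}

def pseudo_label(feat, max_shift=0.01):
    """Return the accepted label for a target-domain feature vector, or None."""
    label = min(centroids, key=lambda c: np.linalg.norm(feat - centroids[c]))   # hypothesis
    current = np.vstack(members[label])
    shift = np.linalg.norm(np.vstack([current, feat]).mean(axis=0) - centroids[label])
    return label if shift < max_shift else None        # reject doubtful hypotheses

target_near = np.array([0.2, -0.1])       # target-domain sample close to "cube"
target_odd = np.array([1.5, 1.5])         # ambiguous sample between the two classes
print(pseudo_label(target_near), pseudo_label(target_odd))   # cube None
```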

18 pages, 17816 KiB  
Article
Mssgan: Enforcing Multiple Generators to Learn Multiple Subspaces to Avoid the Mode Collapse
by Miguel S. Soriano-Garcia, Ricardo Sevilla-Escoboza and Angel Garcia-Pedrero
Mach. Learn. Knowl. Extr. 2023, 5(4), 1456-1473; https://doi.org/10.3390/make5040073 - 10 Oct 2023
Viewed by 1376
Abstract
Generative Adversarial Networks are powerful generative models that are used in different areas and with multiple applications. However, this type of model has a training problem called mode collapse. This problem causes the generator to not learn the complete distribution of the data with which it is trained. To force the network to learn the entire data distribution, MSSGAN is introduced. This model has multiple generators and distributes the training data in multiple subspaces, where each generator is enforced to learn only one of the groups with the help of a classifier. We demonstrate that our model performs better on the FID and Sample Distribution metrics compared to previous models to avoid mode collapse. Experimental results show how each of the generators learns different information and, in turn, generates satisfactory quality samples. Full article
(This article belongs to the Special Issue Deep Learning in Image Analysis and Pattern Recognition)

23 pages, 3936 KiB  
Article
Explaining Deep Q-Learning Experience Replay with SHapley Additive exPlanations
by Robert S. Sullivan and Luca Longo
Mach. Learn. Knowl. Extr. 2023, 5(4), 1433-1455; https://doi.org/10.3390/make5040072 - 09 Oct 2023
Viewed by 1913
Abstract
Reinforcement Learning (RL) has shown promise in optimizing complex control and decision-making processes, but Deep Reinforcement Learning (DRL) lacks interpretability, limiting its adoption in regulated sectors like manufacturing, finance, and healthcare. Difficulties arise from DRL's opaque decision-making, hindering efficiency and resource use, and this issue is amplified with every advancement. While many seek to move from Experience Replay to A3C, the latter demands more resources. Despite efforts to improve Experience Replay selection strategies, there is a tendency to keep the capacity high. We investigate training a Deep Convolutional Q-learning agent across 20 Atari games while intentionally reducing Experience Replay capacity from 1×10⁶ to 5×10². We find that a reduction from 1×10⁴ to 5×10³ does not significantly affect rewards, offering a practical path to resource-efficient DRL. To illuminate agent decisions and align them with game mechanics, we employ a novel method: visualizing Experience Replay via the Deep SHAP Explainer. This approach fosters comprehension and transparent, interpretable explanations, though any capacity reduction must be made cautiously to avoid overfitting. Our study demonstrates the feasibility of reducing Experience Replay and advocates for transparent, interpretable decision explanations using the Deep SHAP Explainer to promote enhanced resource efficiency in Experience Replay. Full article
(This article belongs to the Special Issue Advances in Explainable Artificial Intelligence (XAI): 2nd Edition)
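
The capacity experiment at the heart of this paper (and of this issue's cover story) revolves around a bounded experience-replay buffer. Below is a minimal, hedged sketch of such a buffer with a deliberately small, configurable capacity; it illustrates only the eviction mechanism, and the Q-learning agent, Atari environments, and Deep SHAP analysis are not reproduced.
```python
# Minimal fixed-capacity experience replay: old transitions are evicted once the
# (intentionally small) capacity is reached, e.g. the paper's 5x10^3 setting.
import random
from collections import deque

class ExperienceReplay:
    def __init__(self, capacity=5_000):
        self.buffer = deque(maxlen=capacity)          # eviction of oldest items is automatic

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

replay = ExperienceReplay(capacity=5_000)
for step in range(20_000):                            # far more steps than capacity
    replay.push(state=step, action=step % 4, reward=0.0, next_state=step + 1, done=False)
print(len(replay.buffer))                             # 5000 -- only the most recent transitions remain
batch = replay.sample(32)                             # minibatch for a Q-learning update
```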

26 pages, 3974 KiB  
Article
Machine Learning Method for Changepoint Detection in Short Time Series Data
by Veronika Smejkalová, Radovan Šomplák, Martin Rosecký and Kristína Šramková
Mach. Learn. Knowl. Extr. 2023, 5(4), 1407-1432; https://doi.org/10.3390/make5040071 - 05 Oct 2023
Viewed by 2211
Abstract
Analysis of data is crucial in waste management to improve effective planning from both short- and long-term perspectives. Real-world data often presents anomalies, but in the waste management sector, anomaly detection is seldom performed. The main goal and contribution of this paper is a proposal of a complex machine learning framework for changepoint detection in a large number of short time series from waste management. In such a case, it is not possible to use only an expert-based approach because of the time-consuming nature and subjectivity of that process. The proposed framework consists of two steps: (1) outlier detection via an outlier test on trend-adjusted data, and (2) changepoint identification via comparison of linear model parameters. In order to use the proposed method, it is necessary to have a sufficient number of experts' assessments of the presence of anomalies in time series. The proposed framework is demonstrated on waste management data from the Czech Republic. It is observed that certain waste categories in specific regions frequently exhibit changepoints. On the micro-regional level, approximately 31.1% of time series contain at least one outlier and 16.4% exhibit changepoints. Certain groups of waste are more prone to the occurrence of anomalies. The results indicate that even in the case of aggregated data, anomalies are not rare, and their presence should always be checked. Full article
(This article belongs to the Section Data)
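
Step (2) of the framework, identifying changepoints by comparing linear-model parameters, can be hedged-sketched on a toy series: fit a line to the segments before and after each candidate split and report the split where the slopes differ most. The decision rule, threshold, and data below are simplifications for illustration, not the authors' calibrated framework.
```python
# Changepoint screening in a short series by comparing linear-model parameters
# on either side of each candidate split; simplified slope-gap decision rule.
import numpy as np

def best_changepoint(y, min_seg=3, slope_gap=0.5):
    t = np.arange(len(y))
    best, best_gap = None, 0.0
    for k in range(min_seg, len(y) - min_seg):
        left = np.polyfit(t[:k], y[:k], 1)[0]        # slope before the candidate split
        right = np.polyfit(t[k:], y[k:], 1)[0]       # slope after the candidate split
        gap = abs(left - right)
        if gap > best_gap:
            best, best_gap = k, gap
    return (best, best_gap) if best_gap > slope_gap else (None, best_gap)

# Toy annual waste series: roughly flat, then a clear upward trend near the end.
y = np.array([10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 10.0, 11.5, 13.0, 14.4, 16.1])
print(best_changepoint(y))      # roughly (8, ...): the slope changes around the ninth observation
```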

25 pages, 1350 KiB  
Review
When Federated Learning Meets Watermarking: A Comprehensive Overview of Techniques for Intellectual Property Protection
by Mohammed Lansari, Reda Bellafqira, Katarzyna Kapusta, Vincent Thouvenot, Olivier Bettan and Gouenou Coatrieux
Mach. Learn. Knowl. Extr. 2023, 5(4), 1382-1406; https://doi.org/10.3390/make5040070 - 04 Oct 2023
Cited by 2 | Viewed by 2507
Abstract
Federated learning (FL) is a technique that allows multiple participants to collaboratively train a Deep Neural Network (DNN) without the need to centralize their data. Among other advantages, it comes with privacy-preserving properties, making it attractive for application in sensitive contexts, such as health care or the military. Although the data are not explicitly exchanged, the training procedure requires sharing information about participants’ models. This makes the individual models vulnerable to theft or unauthorized distribution by malicious actors. To address the issue of ownership rights protection in the context of machine learning (ML), DNN watermarking methods have been developed during the last five years. Most existing works have focused on watermarking in a centralized manner, but only a few methods have been designed for FL and its unique constraints. In this paper, we provide an overview of recent advancements in federated learning watermarking, shedding light on the new challenges and opportunities that arise in this field. Full article
(This article belongs to the Section Privacy)

23 pages, 1188 KiB  
Article
Entropy-Aware Time-Varying Graph Neural Networks with Generalized Temporal Hawkes Process: Dynamic Link Prediction in the Presence of Node Addition and Deletion
by Bahareh Najafi, Saeedeh Parsaeefard and Alberto Leon-Garcia
Mach. Learn. Knowl. Extr. 2023, 5(4), 1359-1381; https://doi.org/10.3390/make5040069 - 04 Oct 2023
Viewed by 1461
Abstract
This paper addresses the problem of learning temporal graph representations, which capture the changing nature of complex evolving networks. Existing approaches mainly focus on adding new nodes and edges to capture dynamic graph structures. However, to achieve more accurate representation of graph evolution, we consider both the addition and deletion of nodes and edges as events. These events occur at irregular time scales and are modeled using temporal point processes. Our goal is to learn the conditional intensity function of the temporal point process to investigate the influence of deletion events on node representation learning for link-level prediction. We incorporate network entropy, a measure of node and edge significance, to capture the effect of node deletion and edge removal in our framework. Additionally, we leveraged the characteristics of a generalized temporal Hawkes process, which considers the inhibitory effects of events where past occurrences can reduce future intensity. This framework enables dynamic representation learning by effectively modeling both addition and deletion events in the temporal graph. To evaluate our approach, we utilize autonomous system graphs, a family of inhomogeneous sparse graphs with instances of node and edge additions and deletions, in a link prediction task. By integrating these enhancements into our framework, we improve the accuracy of dynamic link prediction and enable better understanding of the dynamic evolution of complex networks. Full article
19 pages, 2041 KiB  
Article
Predicting the Long-Term Dependencies in Time Series Using Recurrent Artificial Neural Networks
by Cristian Ubal, Gustavo Di-Giorgi, Javier E. Contreras-Reyes and Rodrigo Salas
Mach. Learn. Knowl. Extr. 2023, 5(4), 1340-1358; https://doi.org/10.3390/make5040068 - 02 Oct 2023
Cited by 1 | Viewed by 2239
Abstract
Long-term dependence is an essential feature for the predictability of time series, and estimating the parameter that describes long memory is essential to describing the behavior of time series models. However, most long memory estimation methods assume that this parameter has a constant value throughout the time series and do not consider that it may change over time. In this work, we propose an automated methodology that combines estimation of the fractional differentiation parameter (and/or Hurst parameter) with Recurrent Neural Networks (RNNs), so that the networks learn and predict long memory dependencies from nonlinear time series. The proposal combines three methods that allow a better approximation of the parameter values for each of the windows obtained, using RNNs as an adaptive method to learn and predict long memory dependencies in time series. For the RNNs, we evaluated four different architectures: the simple RNN, the LSTM, the BiLSTM, and the GRU; these models are built from blocks with gates controlling the cell state and memory. We evaluated the proposed approach on both synthetic and real-world datasets, taking Whittle's estimate of the Hurst parameter, classically obtained in each window, as the reference. For the synthetic data, we simulated ARFIMA models to generate several time series by varying the fractional differentiation parameter; the real-world IPSA stock option index and Tree Ring time series datasets were also evaluated. All of the results show that the proposed approach can predict the Hurst exponent with good performance by selecting the optimal window size and overlap. Full article
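A minimal sketch of the windowed estimation idea, using a crude rescaled-range (R/S) estimate in place of the Whittle estimator actually used in the paper; the window size, overlap, and function names are assumptions, and the resulting per-window sequence is the kind of target an RNN (LSTM/BiLSTM/GRU) could be trained to track:

    import numpy as np

    def rs_hurst(x):
        # Crude single-window rescaled-range (R/S) estimate of the Hurst exponent.
        x = np.asarray(x, dtype=float)
        y = np.cumsum(x - x.mean())
        r = y.max() - y.min()
        s = x.std(ddof=1)
        return float(np.log(r / s) / np.log(len(x))) if s > 0 and r > 0 else float("nan")

    def windowed_hurst(series, window=256, overlap=128):
        # Hurst estimates over overlapping windows of the series.
        step = window - overlap
        return [rs_hurst(series[i:i + window])
                for i in range(0, len(series) - window + 1, step)]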
20 pages, 4233 KiB  
Article
Optimal Topology of Vision Transformer for Real-Time Video Action Recognition in an End-To-End Cloud Solution
by Saman Sarraf and Milton Kabia
Mach. Learn. Knowl. Extr. 2023, 5(4), 1320-1339; https://doi.org/10.3390/make5040067 - 29 Sep 2023
Viewed by 1441
Abstract
This study introduces an optimal topology of vision transformers for real-time video action recognition in a cloud-based solution. Although model performance is a key criterion for real-time video analysis use cases, inference latency plays a more crucial role in adopting such technology in real-world scenarios. Our objective is to reduce the inference latency of the solution while maintaining the vision transformer's performance at an acceptable level. Thus, we employed the optimal cloud components as the foundation of our machine learning pipeline and optimized the topology of vision transformers. We utilized UCF101, including more than one million action recognition video clips. The modeling pipeline consists of a preprocessing module that extracts frames from video clips and the training of two-dimensional (2D) vision transformer models and deep learning baselines. The pipeline also includes a postprocessing step that aggregates the frame-level predictions to generate the video-level predictions at inference. The results demonstrate that our optimal vision transformer model, with an input dimension of 56 × 56 × 3 and eight attention heads, produces an F1 score of 91.497% on the testing set. The optimized vision transformer reduces the inference latency by 40.70%, measured through a batch-processing approach, with a 55.63% faster training time than the baseline. Lastly, we developed an enhanced skip-frame approach to improve the inference latency by finding an optimal ratio of frames for prediction at inference, which further reduced the inference latency by 57.15%. This study reveals that the vision transformer model is highly optimizable for inference latency while maintaining model performance. Full article
(This article belongs to the Special Issue Deep Learning in Image Analysis and Pattern Recognition)
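A minimal sketch of frame-to-video aggregation with a skip-frame ratio, in the spirit of the postprocessing step described above; the skip ratio of 4, the averaging rule, and the function name are illustrative assumptions rather than the paper's tuned configuration:

    import numpy as np

    def video_prediction(frame_scores, skip=4):
        # frame_scores: array of shape (n_frames, n_classes) with per-frame class scores.
        # Score only every `skip`-th frame to cut inference latency, average the
        # retained frame-level scores, and take the argmax as the video-level label.
        sampled = np.asarray(frame_scores, dtype=float)[::skip]
        return int(np.argmax(sampled.mean(axis=0)))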
18 pages, 4649 KiB  
Article
PCa-Clf: A Classifier of Prostate Cancer Patients into Patients with Indolent and Aggressive Tumors Using Machine Learning
by Yashwanth Karthik Kumar Mamidi, Tarun Karthik Kumar Mamidi, Md Wasi Ul Kabir, Jiande Wu, Md Tamjidul Hoque and Chindo Hicks
Mach. Learn. Knowl. Extr. 2023, 5(4), 1302-1319; https://doi.org/10.3390/make5040066 - 27 Sep 2023
Viewed by 1301
Abstract
A critical unmet medical need in prostate cancer (PCa) clinical management centers around distinguishing indolent from aggressive tumors. Traditionally, Gleason grading has been utilized for this purpose. However, tumor classification using Gleason Grade 7 is often ambiguous, as the clinical behavior of these tumors follows a variable clinical course. This study aimed to investigate the application of machine learning (ML) techniques to classify patients into those with indolent and those with aggressive PCa. We used gene expression data from The Cancer Genome Atlas and compared gene expression levels between indolent and aggressive tumors to identify features for developing and validating a range of ML and stacking algorithms. ML algorithms accurately distinguished indolent from aggressive PCas. With an accuracy of 96%, the stacking model was superior to individual ML algorithms when all samples with primary Gleason Grades 6 to 10 were used. Excluding samples with Gleason Grade 7 improved accuracy to 97%. This study shows that ML algorithms and stacking models are powerful approaches for the accurate classification of indolent versus aggressive PCas. Future implementation of this methodology may significantly impact clinical decision making and patient outcomes in the clinical management of prostate cancer. Full article
(This article belongs to the Special Issue Machine Learning for Biomedical Data Processing)
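A minimal scikit-learn sketch of a stacking classifier of the kind described above; the choice of base learners, meta-learner, and hyperparameters is assumed for illustration and is not the paper's reported configuration:

    from sklearn.ensemble import RandomForestClassifier, StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC

    # Base learners feed their predictions to a logistic-regression meta-learner.
    stack = StackingClassifier(
        estimators=[("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
                    ("svm", SVC(probability=True, random_state=0))],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
    # Hypothetical usage: X = gene expression features, y = indolent vs. aggressive labels.
    # stack.fit(X_train, y_train); stack.score(X_test, y_test)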