Deep Learning and Applications

A special issue of Machine Learning and Knowledge Extraction (ISSN 2504-4990). This special issue belongs to the section "Learning".

Deadline for manuscript submissions: closed (20 August 2023)

Special Issue Editors


Prof. Dr. Moamar Sayed-Mouchaweh, Guest Editor
IMT Nord Europe, Institut Mines-Télécom, University Lille, Centre for Digital Systems, F-59000 Lille, France
Interests: data mining; machine learning; self-adaptive evolving intelligent systems; big data challenges in energy

Prof. Dr. Mohd Arif Wani, Guest Editor
Department of Computer Science, University of Kashmir, Srinagar 190006, India
Interests: artificial intelligence; machine learning

Prof. Dr. Vasile Palade, Guest Editor

Prof. Dr. Mehmed M. Kantardzic, Guest Editor
J. B. Speed School of Engineering, University of Louisville, Louisville, KY 40292, USA
Interests: data mining; crowdsourcing; concept drift; streaming data; social networks analysis

Special Issue Information

Dear Colleagues,

This Special Issue will consist of extended papers selected from papers presented at the 21st IEEE International Conference on Machine Learning and Applications (IEEE ICMLA 2022). Please visit the conference website for a detailed description: https://www.icmla-conference.org/icmla22/index.php

Each submission to this Special Issue should contain at least 40% new material, e.g., in the form of technical extensions, more in-depth evaluations, or additional use cases, as well as a revised title, abstract, and keywords. These extended submissions will undergo peer review according to the journal's standard procedures. At least two technical program committee members will review each extended article submitted to this Special Issue; if needed, additional external reviewers will be invited to guarantee a high-quality review process.

Prof. Dr. Moamar Sayed-Mouchaweh
Prof. Dr. Mohd Arif Wani
Prof. Dr. Vasile Palade
Prof. Dr. Mehmed M. Kantardzic
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Machine Learning and Knowledge Extraction is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (6 papers)


Research


20 pages, 878 KiB  
Article
Multi-Task Representation Learning for Renewable-Power Forecasting: A Comparative Analysis of Unified Autoencoder Variants and Task-Embedding Dimensions
by Chandana Priya Nivarthi, Stephan Vogt and Bernhard Sick
Mach. Learn. Knowl. Extr. 2023, 5(3), 1214-1233; https://doi.org/10.3390/make5030062 - 20 Sep 2023
Abstract
Typically, renewable-power-generation forecasting using machine learning involves creating separate models for each photovoltaic or wind park, known as single-task learning models. However, transfer learning has gained popularity in recent years, as it allows for the transfer of knowledge from source parks to target parks. Nevertheless, determining the most similar source park(s) for transfer learning can be challenging, particularly when the target park has limited or no historical data samples. To address this issue, we propose a multi-task learning architecture that employs a Unified Autoencoder (UAE) to initially learn a common representation of input weather features among tasks and then utilizes a Task-Embedding layer in a Neural Network (TENN) to learn task-specific information. This proposed UAE-TENN architecture can be easily extended to new parks with or without historical data. We evaluate the performance of our proposed architecture and compare it to single-task learning models on six photovoltaic and wind farm datasets consisting of a total of 529 parks. Our results show that the UAE-TENN architecture significantly improves power-forecasting performance by 10 to 19% for photovoltaic parks and 5 to 15% for wind parks compared to baseline models. We also demonstrate that UAE-TENN improves forecast accuracy for a new park by 19% for photovoltaic parks, even in a zero-shot learning scenario where there is no historical data. Additionally, we propose variants of the Unified Autoencoder with convolutional and LSTM layers, compare their performance, and provide a comparison among architectures with different numbers of task-embedding dimensions. Finally, we demonstrate the utility of trained task embeddings for interpretation and visualization purposes.
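
As an illustration only: the abstract describes a shared ("unified") autoencoder over weather features combined with a learned per-park task embedding. The PyTorch sketch below captures that general structure under assumed layer sizes and class names (UnifiedAutoencoder, TaskEmbeddingForecaster); it is not the authors' implementation.

```python
import torch
import torch.nn as nn

class UnifiedAutoencoder(nn.Module):
    """Shared encoder/decoder learning a common representation of
    weather features across all parks (tasks). Sizes are assumptions."""
    def __init__(self, n_features: int, latent_dim: int = 16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                                     nn.Linear(64, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                     nn.Linear(64, n_features))

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

class TaskEmbeddingForecaster(nn.Module):
    """Combines the shared latent code with a learned per-park (task)
    embedding to produce a power forecast."""
    def __init__(self, latent_dim: int, n_tasks: int, task_dim: int = 8):
        super().__init__()
        self.task_embedding = nn.Embedding(n_tasks, task_dim)
        self.head = nn.Sequential(nn.Linear(latent_dim + task_dim, 32), nn.ReLU(),
                                  nn.Linear(32, 1))

    def forward(self, z, task_id):
        e = self.task_embedding(task_id)              # (batch, task_dim)
        return self.head(torch.cat([z, e], dim=-1))   # (batch, 1)

# Toy usage: 10 weather features, 5 parks, random data.
uae = UnifiedAutoencoder(n_features=10)
forecaster = TaskEmbeddingForecaster(latent_dim=16, n_tasks=5)
x = torch.randn(32, 10)                               # weather features
task_id = torch.randint(0, 5, (32,))                  # park index per sample
x_hat, z = uae(x)                                     # reconstruction + latent code
power = forecaster(z, task_id)                        # predicted power, shape (32, 1)
```

In a setup like this, a new park would receive its own embedding row (or, in a zero-shot setting, some default embedding) while reusing the shared autoencoder.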

22 pages, 30347 KiB  
Article
Behavior-Aware Pedestrian Trajectory Prediction in Ego-Centric Camera Views with Spatio-Temporal Ego-Motion Estimation
by Phillip Czech, Markus Braun, Ulrich Kreßel and Bin Yang
Mach. Learn. Knowl. Extr. 2023, 5(3), 957-978; https://doi.org/10.3390/make5030050 - 03 Aug 2023
Cited by 2
Abstract
With the ongoing development of automated driving systems, the crucial task of predicting pedestrian behavior is attracting growing attention. The prediction of future pedestrian trajectories from the ego-vehicle camera perspective is particularly challenging due to the dynamically changing scene. Therefore, we present Behavior-Aware Pedestrian Trajectory Prediction (BA-PTP), a novel approach to pedestrian trajectory prediction for ego-centric camera views. It incorporates behavioral features extracted from real-world traffic scene observations, such as the body and head orientation of pedestrians, as well as their pose, in addition to positional information from body and head bounding boxes. For each input modality, we employed independent encoding streams that are combined through a modality attention mechanism. To account for the ego-motion of the camera in an ego-centric view, we introduced the Spatio-Temporal Ego-Motion Module (STEMM), a novel approach to ego-motion prediction. Compared to related work, it utilizes spatial goal points of the ego-vehicle that are sampled from its intended route. We experimentally validated the effectiveness of our approach using two datasets for pedestrian behavior prediction in urban traffic scenes. Based on ablation studies, we show the advantages of incorporating different behavioral features for pedestrian trajectory prediction in the image plane. Moreover, we demonstrate the benefit of integrating STEMM into our pedestrian trajectory prediction method, BA-PTP. BA-PTP achieves state-of-the-art performance on the PIE dataset, outperforming prior work by 7% in MSE-1.5 s and CMSE as well as 9% in CFMSE.
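
To make the modality-attention idea concrete (independent encoding streams per input modality, fused by learned attention weights), here is a minimal PyTorch sketch. The encoder choice (GRUs), the dimensions, and the class name ModalityAttentionFusion are assumptions for illustration; this is not a reproduction of BA-PTP or STEMM.

```python
import torch
import torch.nn as nn

class ModalityAttentionFusion(nn.Module):
    """Encodes each modality (e.g., bounding boxes, head orientation,
    pose) with its own GRU and fuses the streams via attention weights."""
    def __init__(self, modality_dims, hidden_dim=64):
        super().__init__()
        self.encoders = nn.ModuleList(
            [nn.GRU(d, hidden_dim, batch_first=True) for d in modality_dims])
        self.attn = nn.Linear(hidden_dim, 1)

    def forward(self, modalities):
        # modalities: list of tensors, each (batch, time, dim_i)
        feats = []
        for enc, x in zip(self.encoders, modalities):
            _, h = enc(x)                      # h: (1, batch, hidden_dim)
            feats.append(h.squeeze(0))
        feats = torch.stack(feats, dim=1)      # (batch, n_modalities, hidden_dim)
        weights = torch.softmax(self.attn(feats), dim=1)
        return (weights * feats).sum(dim=1)    # fused feature (batch, hidden_dim)

# Toy usage: bounding boxes (4-d), head orientation (1-d), pose (34-d),
# 15 observed frames, predicting 45 future bounding boxes.
fusion = ModalityAttentionFusion([4, 1, 34])
boxes, head, pose = torch.randn(8, 15, 4), torch.randn(8, 15, 1), torch.randn(8, 15, 34)
fused = fusion([boxes, head, pose])            # (8, 64)
decoder = nn.Linear(64, 45 * 4)                # simple stand-in trajectory decoder
future_boxes = decoder(fused).view(8, 45, 4)
```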

23 pages, 903 KiB  
Article
Autoencoder Feature Residuals for Network Intrusion Detection: One-Class Pretraining for Improved Performance
by Brian Lewandowski and Randy Paffenroth
Mach. Learn. Knowl. Extr. 2023, 5(3), 868-890; https://doi.org/10.3390/make5030046 - 31 Jul 2023
Cited by 1
Abstract
The proliferation of novel attacks and growing amounts of data have caused practitioners in the field of network intrusion detection to constantly work towards keeping up with this evolving adversarial landscape. Researchers have been seeking to harness deep learning techniques in efforts to detect zero-day attacks and allow network intrusion detection systems to more efficiently alert network operators. The technique outlined in this work uses a one-class training process to shape autoencoder feature residuals for the effective detection of network attacks. Compared to an original set of input features, we show that autoencoder feature residuals are a suitable replacement and often perform at least as well as the original feature set. This quality allows autoencoder feature residuals to remove the need for extensive feature engineering without reducing classification performance. Additionally, it is found that, even though they generate no new data compared to the original feature set, autoencoder feature residuals often improve classifier performance. Practical side effects of using autoencoder feature residuals emerge when analyzing the potential data-compression benefits they provide.
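
The procedure described here (pretrain an autoencoder on benign traffic only, then feed the per-feature reconstruction residuals to a downstream classifier) can be illustrated with a short PyTorch sketch. The network sizes, training loop, and synthetic data are assumptions, not the authors' experimental setup.

```python
import torch
import torch.nn as nn

class FeatureAE(nn.Module):
    """Small autoencoder whose reconstruction residuals replace the
    raw input features for network intrusion detection."""
    def __init__(self, n_features: int, latent: int = 8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(),
                                 nn.Linear(32, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(),
                                 nn.Linear(32, n_features))

    def forward(self, x):
        return self.dec(self.enc(x))

n_features = 20
ae = FeatureAE(n_features)
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

benign = torch.randn(1024, n_features)   # stand-in for benign-only flows
for _ in range(50):                      # one-class pretraining
    opt.zero_grad()
    loss = loss_fn(ae(benign), benign)
    loss.backward()
    opt.step()

# Residuals for a mixed benign/attack batch become the new feature set
# for any downstream classifier (e.g., a random forest or an MLP).
mixed = torch.randn(256, n_features)
with torch.no_grad():
    residuals = mixed - ae(mixed)        # autoencoder feature residuals
print(residuals.shape)                   # torch.Size([256, 20])
```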

21 pages, 1211 KiB  
Article
Efficient Latent Space Compression for Lightning-Fast Fine-Tuning and Inference of Transformer-Based Models
by Ala Alam Falaki and Robin Gras
Mach. Learn. Knowl. Extr. 2023, 5(3), 847-867; https://doi.org/10.3390/make5030045 - 30 Jul 2023
Abstract
This paper presents a technique to reduce the number of parameters in a transformer-based encoder–decoder architecture by incorporating autoencoders. To discover the optimal compression, we trained different autoencoders on the embedding space (the encoder's output) of several pre-trained models. The experiments reveal that reducing the embedding size has the potential to dramatically decrease GPU memory usage while speeding up the inference process. The proposed architecture was included in the BART model and tested on summarization, translation, and classification tasks. The summarization results show that a 60% decoder size reduction (from 96 M to 40 M parameters) makes inference twice as fast and uses less than half of the GPU memory during the fine-tuning process, with only a 4.5% drop in R-1 score. The same trend is visible for translation and, partially, for classification tasks. Our approach reduces the GPU memory usage and processing time of large-scale sequence-to-sequence models for fine-tuning and inference. The implementation and checkpoints are available on GitHub.
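
A minimal sketch of the general idea, an autoencoder trained on a pre-trained encoder's output so that the decoder can consume a smaller latent representation, is given below. The stand-in hidden states, dimensions, and the class name LatentCompressor are assumptions rather than the paper's BART-based implementation.

```python
import torch
import torch.nn as nn

# Stand-in for hidden states from a pre-trained encoder
# (e.g., BART-base with d_model = 768); here just random activations.
batch, seq_len, d_model, d_compressed = 4, 128, 768, 256
encoder_hidden = torch.randn(batch, seq_len, d_model)

class LatentCompressor(nn.Module):
    """Autoencoder placed between encoder and decoder: the decoder is
    re-dimensioned (and fine-tuned) to read the compressed latent."""
    def __init__(self, d_model: int, d_compressed: int):
        super().__init__()
        self.down = nn.Linear(d_model, d_compressed)
        self.up = nn.Linear(d_compressed, d_model)

    def forward(self, h):
        z = self.down(h)              # compressed representation for the decoder
        return self.up(z), z

compressor = LatentCompressor(d_model, d_compressed)
reconstructed, z = compressor(encoder_hidden)

# Reconstruction objective used to train the compressor on frozen
# encoder outputs before plugging it into the seq2seq model.
recon_loss = nn.functional.mse_loss(reconstructed, encoder_hidden)
print(z.shape, recon_loss.item())     # torch.Size([4, 128, 256]), loss value
```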

17 pages, 653 KiB  
Article
Low Cost Evolutionary Neural Architecture Search (LENAS) Applied to Traffic Forecasting
by Daniel Klosa and Christof Büskens
Mach. Learn. Knowl. Extr. 2023, 5(3), 830-846; https://doi.org/10.3390/make5030044 - 28 Jul 2023
Abstract
Traffic forecasting is an important task for transportation engineering, as it helps authorities to plan and control traffic flow, detect congestion, and reduce environmental impact. Deep learning techniques have gained traction in handling such complex datasets but require expertise in neural architecture engineering, often beyond the scope of traffic management decision-makers. Our study aims to address this challenge by using neural architecture search (NAS) methods. These methods, which simplify neural architecture engineering by discovering task-specific neural architectures, have only recently been applied to traffic prediction. We specifically focus on the performance estimation of neural architectures, a computationally demanding sub-problem of NAS that often hinders the real-world application of these methods. Extending prior work on evolutionary NAS (ENAS), our work evaluates the utility of zero-cost (ZC) proxies, recently emerged cost-effective evaluators of network architectures. These proxies operate without requiring training, thereby circumventing the computational bottleneck, albeit at a slight cost to accuracy. Our findings indicate that, when integrated into the ENAS framework, ZC proxies can accelerate the search process by two orders of magnitude at a small cost in accuracy. These results establish the viability of ZC proxies as a practical solution for accelerating NAS methods while maintaining model accuracy. Our research contributes to the domain by showcasing how ZC proxies can enhance the accessibility and usability of NAS methods for traffic forecasting, despite potential limitations in neural architecture engineering expertise. This novel approach significantly aids the efficient application of deep learning techniques in real-world traffic management scenarios.
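
To illustrate what a zero-cost proxy looks like inside an architecture search loop, the sketch below ranks a population of randomly generated candidate networks with a simple gradient-norm proxy: one forward/backward pass on a single mini-batch, no training. The proxy choice, the toy search space, and the synthetic data are assumptions and not the ENAS framework evaluated in the paper.

```python
import random
import torch
import torch.nn as nn

def build_candidate(widths):
    """Builds a small MLP from a list of hidden-layer widths
    (a stand-in for a traffic-forecasting architecture)."""
    layers, in_dim = [], 12                  # e.g., 12 lagged traffic readings
    for w in widths:
        layers += [nn.Linear(in_dim, w), nn.ReLU()]
        in_dim = w
    layers.append(nn.Linear(in_dim, 1))      # next-step traffic volume
    return nn.Sequential(*layers)

def zero_cost_score(model, x, y):
    """Gradient-norm proxy: larger scores rank architectures higher,
    with no training required."""
    model.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    return sum(p.grad.abs().sum().item()
               for p in model.parameters() if p.grad is not None)

x, y = torch.randn(64, 12), torch.randn(64, 1)        # toy mini-batch
population = [[random.choice([16, 32, 64]) for _ in range(random.randint(1, 3))]
              for _ in range(20)]                      # random candidate widths
scored = [(zero_cost_score(build_candidate(w), x, y), w) for w in population]
best_score, best_widths = max(scored)
print(best_widths, best_score)                         # proxy-selected architecture
```

In an evolutionary search, such a proxy would stand in for full training as the fitness evaluation, which is where the reported speed-up of roughly two orders of magnitude comes from.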

Review


19 pages, 1760 KiB  
Review
Gradient-Based Neural Architecture Search: A Comprehensive Evaluation
by Sarwat Ali and M. Arif Wani
Mach. Learn. Knowl. Extr. 2023, 5(3), 1176-1194; https://doi.org/10.3390/make5030060 - 14 Sep 2023
Abstract
One of the challenges in deep learning involves discovering the optimal architecture for a specific task. This is effectively tackled through Neural Architecture Search (NAS). Neural Architecture Search encompasses three prominent approaches (reinforcement learning, evolutionary algorithms, and gradient descent) that have demonstrated noteworthy potential in identifying good candidate architectures. However, approaches based on reinforcement learning and evolutionary algorithms often necessitate extensive computational resources, requiring hundreds of GPU days or more. Therefore, we confine this work to a gradient-based approach due to its lower computational resource demands. Our objective encompasses identifying the optimal gradient-based NAS method and pinpointing opportunities for future enhancements. To achieve this, a comprehensive evaluation of four major gradient-descent-based architecture search methods for discovering the best neural architecture for image classification tasks is provided. An overview of these gradient-based methods, i.e., DARTS, PDARTS, Fair DARTS and Att-DARTS, is presented. A theoretical comparison, based on search spaces, continuous relaxation strategy and bi-level optimization, for deriving the best neural architecture is then provided. The strong and weak features of these methods are also listed. Experimental results comparing the error rate and computational cost of these gradient-based methods are analyzed. These experiments used the benchmark datasets CIFAR-10, CIFAR-100 and ImageNet. The results show that PDARTS is more accurate and faster than the other examined methods, making it a potent candidate for automating Neural Architecture Search. By conducting this comparative analysis, our research provides valuable insights and future research directions to address the criticisms and gaps in the literature.
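
The mechanism shared by the surveyed methods, DARTS-style continuous relaxation, can be sketched as a softmax-weighted mixture of candidate operations on an edge, with the mixture weights (alpha) learned by gradient descent alongside the network weights. The operation set and sizes below are assumptions chosen for brevity.

```python
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    """One searchable edge: a softmax-weighted mixture of candidate
    operations, with architecture parameters alpha learned by gradient
    descent (the continuous relaxation at the heart of DARTS)."""
    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),   # 3x3 convolution
            nn.Conv2d(channels, channels, 5, padding=2),   # 5x5 convolution
            nn.MaxPool2d(3, stride=1, padding=1),          # max pooling
            nn.Identity(),                                 # skip connection
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = torch.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

mixed = MixedOp(channels=16)
x = torch.randn(2, 16, 32, 32)                 # CIFAR-like feature map
out = mixed(x)                                 # (2, 16, 32, 32)
# After the search, the operation with the largest alpha is kept on each
# edge to derive the discrete architecture.
print(torch.softmax(mixed.alpha, dim=0))
```

The surveyed variants (PDARTS, Fair DARTS, Att-DARTS) modify pieces of this scheme, but the relaxation above is the common starting point.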
