Data Mining and Machine Learning with Applications

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: closed (15 August 2023) | Viewed by 25444

Printed Edition Available!
A printed edition of this Special Issue is available.

Special Issue Editor


Prof. Dr. Wei Fang
Guest Editor
Jiangsu Engineering Center of Network Monitoring, School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing 210044, China
Interests: data mining; big data analytics; knowledge discovery; cloud computing

Special Issue Information

Dear Colleagues,

With the emergence of big data and advances in computing services, artificial intelligence (AI) has attracted increasing attention around the world. AI plays an ever more important role in our daily lives, in applications such as machine learning, pattern recognition, computer vision, data mining, human-machine interfaces, information retrieval, and natural language processing. As a result, an increasing number of researchers and engineers are already involved, or will soon be involved, in the AI field.

This Special Issue aims to bring together leading scientists in deep learning and related areas within artificial intelligence, data mining, and machine learning with applications. Papers using advanced mathematical methods and statistical approaches in these areas are particularly welcome for publication in this Special Issue.

Prof. Dr. Wei Fang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Machine learning
  • Data mining
  • Statistical machine learning
  • Statistical classification
  • Statistical inference
  • Bayesian methods
  • Algorithms and architectures for big data searches, mining, and processing
  • Deep learning
  • Computer vision and image processing
  • Evolutionary computation
  • Knowledge discovery
  • Industrial and medical applications
  • Security applications
  • Applications of unsupervised learning

Published Papers (12 papers)


Research

14 pages, 301 KiB  
Article
ACMKC: A Compact Associative Classification Model Using K-Modes Clustering with Rule Representations by Coverage
by Jamolbek Mattiev, Monte Davityan and Branko Kavsek
Mathematics 2023, 11(18), 3978; https://doi.org/10.3390/math11183978 - 19 Sep 2023
Viewed by 660
Abstract
The generation and analysis of vast amounts of data have become increasingly prevalent in diverse applications. In this study, we propose a novel approach to address the challenge of rule explosion in association rule mining by utilizing the coverage-based representations of clusters determined by K-modes. We utilize the FP-Growth algorithm to generate class association rules (CARs). To further enhance the interpretability and compactness of the rule set, we employ the K-modes clustering algorithm with a distance metric over the binarized rules. The optimal number of clusters is determined using the silhouette score. Representative rules are then selected based on their coverage within each cluster. To evaluate the effectiveness of our approach, we conducted experimental evaluations on both UCI and Kaggle datasets. The results demonstrate a significant reduction in the rule space (71 rules on average, the best result among all state-of-the-art rule-learning algorithms), aligning with our goal of producing compact classifiers. Our approach offers a promising solution for managing rule complexity in association rule mining, thereby facilitating improved rule interpretation and analysis, while maintaining classification accuracy comparable to that of other rule learners (ACMKC: 80.0% on average) on most of the datasets.
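
To make the pipeline concrete, below is a minimal sketch of the clustering-and-selection stage, assuming the class association rules have already been mined (e.g., with FP-Growth) and binarized over the item space; the rule and transaction matrices and the coverage helper are illustrative stand-ins, not the authors' implementation.

```python
# Minimal sketch: K-modes clustering of binarized rules, k chosen by silhouette,
# one representative rule per cluster selected by coverage.
import numpy as np
from kmodes.kmodes import KModes              # pip install kmodes
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
rules = rng.integers(0, 2, size=(200, 30))         # binarized class association rules (stand-in)
transactions = rng.integers(0, 2, size=(500, 30))  # binarized training data (stand-in)

def coverage(rule, transactions):
    """Number of transactions containing every item of the rule's antecedent."""
    return int((transactions[:, rule == 1] == 1).all(axis=1).sum())

# Choose k by silhouette score over a small range, as the abstract describes.
best_k, best_s = 2, -1.0
for k in range(2, 11):
    labels = KModes(n_clusters=k, init="Huang", n_init=5, random_state=0).fit_predict(rules)
    s = silhouette_score(rules, labels, metric="hamming")
    if s > best_s:
        best_k, best_s = k, s

labels = KModes(n_clusters=best_k, init="Huang", n_init=5, random_state=0).fit_predict(rules)

# Keep one representative rule per cluster: the one with the highest coverage.
representatives = [
    max(np.flatnonzero(labels == c), key=lambda i: coverage(rules[i], transactions))
    for c in range(best_k)
]
print(f"k={best_k}, kept {len(representatives)} of {len(rules)} rules")
```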

26 pages, 895 KiB  
Article
An Analysis of Climate Change Based on Machine Learning and an Endoreversible Model
by Sebastián Vázquez-Ramírez, Miguel Torres-Ruiz, Rolando Quintero, Kwok Tai Chui and Carlos Guzmán Sánchez-Mejorada
Mathematics 2023, 11(14), 3060; https://doi.org/10.3390/math11143060 - 11 Jul 2023
Viewed by 4641
Abstract
Several Sun models suggest a radiative balance, where the concentration of greenhouse gases and the albedo effect are related to the Earth’s surface temperature. There is a considerable increase in greenhouse gases due to anthropogenic activities. Climate change correlates with this alteration in the atmosphere and an increase in surface temperature. Efficient forecasting of climate change and its impacts could help in responding to the threat of climate change and in developing sustainably. Many studies have predicted temperature changes in the coming years. The global community needs a model that makes good predictions to determine the best way to deal with this warming. Thus, we propose a finite-time thermodynamic (FTT) approach in the current work. FTT can solve problems such as the faint young Sun paradox. In addition, we use different machine learning models to evaluate our method and compare the experimental predictions and results.
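
The abstract does not detail the endoreversible model, but the radiative balance it refers to can be illustrated with a standard zero-dimensional energy-balance calculation; this textbook model is only a stand-in for the paper's finite-time thermodynamic formulation.

```python
# Zero-dimensional energy balance: absorbed shortwave must equal emitted longwave.
SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
S0 = 1361.0              # solar constant, W m^-2

def surface_temperature(albedo=0.30, emissivity=0.612):
    """Equilibrium surface temperature in kelvin.

    Absorbed shortwave = S0 * (1 - albedo) / 4 balances emitted
    longwave = emissivity * sigma * T^4; a lower effective emissivity
    stands in for a stronger greenhouse effect.
    """
    return (S0 * (1.0 - albedo) / (4.0 * SIGMA * emissivity)) ** 0.25

print(f"{surface_temperature():.1f} K")                  # ~288 K, close to observed
print(f"{surface_temperature(emissivity=0.60):.1f} K")   # stronger greenhouse -> warmer
```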

19 pages, 27566 KiB  
Article
NSNet: An N-Shaped Convolutional Neural Network with Multi-Scale Information for Image Denoising
by Yifen Li and Yuanyang Chen
Mathematics 2023, 11(12), 2772; https://doi.org/10.3390/math11122772 - 19 Jun 2023
Viewed by 1155
Abstract
Deep learning models with convolutional operators have received widespread attention for their good image denoising performance. However, since the convolutional operation prefers to extract local features, the extracted features may lose some global information, such as texture, structure, and color characteristics, when the object in the image is large. To address this issue, this paper proposes an N-shaped convolutional neural network with the ability to extract multi-scale features to capture more useful information and alleviate the problem of global information loss. The proposed network has two main parts: a multi-scale input layer and a multi-scale feature extraction layer. The former uses a two-dimensional Haar wavelet to create an image pyramid, which contains the corrupted image’s high- and low-frequency components at different scales. The latter uses a U-shaped convolutional network to extract features at different scales from this image pyramid. The method uses the mean-squared error as the loss function and a residual learning strategy to learn the image noise directly. Compared with some existing image denoising methods, the proposed method shows good performance in gray and color image denoising, especially in textures and contours.
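
Two ingredients named in the abstract can be sketched briefly: a 2-D Haar pyramid as the multi-scale input, and residual learning with an MSE loss, where the network predicts the noise rather than the clean image. The tiny CNN below is a placeholder, not the paper's N-shaped architecture.

```python
# Haar pyramid + residual-learning training step (minimal sketch).
import torch
import torch.nn as nn
import pywt  # pip install PyWavelets

def haar_pyramid(img, levels=3):
    """Return [(LL, (LH, HL, HH)), ...] per level for a 2-D image array."""
    pyramid, current = [], img
    for _ in range(levels):
        current, details = pywt.dwt2(current, "haar")
        pyramid.append((current, details))
    return pyramid

denoiser = nn.Sequential(          # placeholder network, not the N-shaped model
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)
mse = nn.MSELoss()

clean = torch.rand(8, 1, 64, 64)
noisy = clean + 0.1 * torch.randn_like(clean)
pyramid = haar_pyramid(noisy[0, 0].numpy())   # multi-scale input for one image

# Residual learning: the regression target is the noise itself.
pred_noise = denoiser(noisy)
loss = mse(pred_noise, noisy - clean)
opt.zero_grad(); loss.backward(); opt.step()

denoised = noisy - denoiser(noisy)            # subtract predicted noise at inference
```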

26 pages, 7510 KiB  
Article
Multi-Country and Multi-Horizon GDP Forecasting Using Temporal Fusion Transformers
by Juan Laborda, Sonia Ruano and Ignacio Zamanillo
Mathematics 2023, 11(12), 2625; https://doi.org/10.3390/math11122625 - 08 Jun 2023
Cited by 2 | Viewed by 2066
Abstract
This paper applies a new artificial intelligence architecture, the temporal fusion transformer (TFT), for the joint GDP forecasting of 25 OECD countries at different time horizons. This new attention-based architecture offers significant advantages over other deep learning methods. First, results are interpretable since the impact of each explanatory variable on each forecast can be calculated. Second, it allows for visualizing persistent temporal patterns and identifying significant events and different regimes. Third, it provides quantile regressions and permits training the model on multiple time series from different distributions. Results suggest that TFTs outperform regression models, especially in periods of turbulence such as the COVID-19 shock. Interesting economic interpretations are obtained depending on whether the country has domestic demand-led or export-led growth. In essence, TFT is revealed as a new tool that artificial intelligence provides to economists and policy makers, with enormous prospects for the future.
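
The quantile forecasts the TFT produces rest on the pinball (quantile) loss; a minimal sketch, independent of the authors' exact training setup:

```python
# Pinball (quantile) loss: penalizes under- and over-prediction asymmetrically,
# so each output head learns a different quantile of the forecast distribution.
import torch

def quantile_loss(y_pred, y_true, quantiles=(0.1, 0.5, 0.9)):
    """y_pred: (batch, n_quantiles); y_true: (batch,). Mean pinball loss."""
    losses = []
    for i, q in enumerate(quantiles):
        err = y_true - y_pred[:, i]
        losses.append(torch.max(q * err, (q - 1) * err))
    return torch.mean(torch.stack(losses))

y_true = torch.tensor([2.0, 0.5])
y_pred = torch.tensor([[1.0, 2.0, 3.0],    # q10, q50, q90 forecasts
                       [0.0, 0.4, 1.0]])
print(quantile_loss(y_pred, y_true))
```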

24 pages, 1438 KiB  
Article
Securing IoT Devices Running PureOS from Ransomware Attacks: Leveraging Hybrid Machine Learning Techniques
by Tariq Ahamed Ahanger, Usman Tariq, Fadl Dahan, Shafique A. Chaudhry and Yasir Malik
Mathematics 2023, 11(11), 2481; https://doi.org/10.3390/math11112481 - 28 May 2023
Cited by 3 | Viewed by 2326
Abstract
Internet of Things (IoT) devices are typically small, low-powered devices used for sensing and computing that enable remote monitoring and control of various environments through the Internet. Despite their usefulness in achieving a more connected cyber-physical world, these devices are vulnerable to ransomware attacks due to their limited resources and connectivity. To combat these threats, machine learning (ML) can be leveraged to identify and prevent ransomware attacks on IoT devices before they can cause significant damage. In this research paper, we explore the use of ML techniques to enhance ransomware defense in IoT devices running the PureOS operating system. We have developed a ransomware detection framework using machine learning, which combines the XGBoost and ElasticNet algorithms in a hybrid approach. The design and implementation of our framework are based on the evaluation of various existing machine learning techniques. Our approach was tested using a dataset of real-world ransomware attacks on IoT devices and achieved high accuracy (90%) and low false-positive rates, demonstrating its effectiveness in detecting and preventing ransomware attacks on IoT devices running PureOS.
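
The abstract does not spell out how XGBoost and ElasticNet are combined; one plausible reading is a soft-voting ensemble of an elastic-net-regularized logistic regression and an XGBoost classifier, sketched here on synthetic stand-in data.

```python
# Hypothetical hybrid: soft-voting over an elastic-net logistic regression
# and an XGBoost classifier; the real combination scheme may differ.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier  # pip install xgboost

X, y = make_classification(n_samples=2000, n_features=30, random_state=0)  # stand-in telemetry
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

elastic = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5, max_iter=5000),
)
xgb = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")

hybrid = VotingClassifier([("elasticnet", elastic), ("xgboost", xgb)], voting="soft")
hybrid.fit(X_tr, y_tr)
print(f"accuracy: {hybrid.score(X_te, y_te):.3f}")
```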

30 pages, 5728 KiB  
Article
Log-Linear-Based Logic Mining with Multi-Discrete Hopfield Neural Network
by Gaeithry Manoharam, Mohd Shareduwan Mohd Kasihmuddin, Siti Noor Farwina Mohamad Anwar Antony, Nurul Atiqah Romli, Nur ‘Afifah Rusdi, Suad Abdeen and Mohd. Asyraf Mansor
Mathematics 2023, 11(9), 2121; https://doi.org/10.3390/math11092121 - 30 Apr 2023
Cited by 5 | Viewed by 1131
Abstract
Choosing the best attribute from a dataset is a crucial step in effective logic mining since it has the greatest impact on improving the performance of the induced logic. This can be achieved by removing any irrelevant attributes that could become a logical rule. Numerous strategies are available in the literature to address this issue. However, these approaches only consider low-order logical rules, which limit the logical connection in the clause. Even though some methods produce excellent performance metrics, incorporating optimal higher-order logical rules into logic mining is challenging due to the large number of attributes involved. Furthermore, suboptimal logical rules are trained on an ineffective discrete Hopfield neural network, which leads to suboptimal induced logic. In this paper, we propose a higher-order logic mining method, the multi-unit 3-satisfiability-based reverse analysis with a log-linear approach, which incorporates log-linear analysis during the pre-processing phase. The proposed logic mining also integrates a multi-unit discrete Hopfield neural network to ensure that each 3-satisfiability logic is learned separately. In this context, our proposed logic mining employs three unique optimization layers to improve the final induced logic. Extensive experiments are conducted on 15 real-life datasets from various fields of study. The experimental results demonstrate that our proposed logic mining method outperforms state-of-the-art methods in terms of widely used performance metrics.
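
The discrete Hopfield dynamics underlying satisfiability-based logic mining can be sketched briefly: bipolar neurons updated asynchronously monotonically lower the network energy, and the stable state reached at convergence corresponds to a candidate induced logic. The weights below are random illustrations, not a 3-SAT mapping of any particular dataset.

```python
# Bipolar discrete Hopfield network: asynchronous updates lower the energy
# E = -0.5 * s^T W s - b^T s until a stable state (local minimum) is reached.
import numpy as np

rng = np.random.default_rng(1)
n = 9                                   # e.g., three 3-SAT clauses over 9 literals
W = rng.normal(size=(n, n)); W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)
b = rng.normal(size=n)
s = rng.choice([-1.0, 1.0], size=n)     # random initial state

def energy(s):
    return -0.5 * s @ W @ s - b @ s

for _ in range(50):                     # asynchronous neuron updates
    i = rng.integers(n)
    s[i] = 1.0 if W[i] @ s + b[i] >= 0 else -1.0

print("final energy:", energy(s))       # stable state = candidate induced logic
```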

20 pages, 5126 KiB  
Article
Assisting Glaucoma Screening Process Using Feature Excitation and Information Aggregation Techniques in Retinal Fundus Images
by Ali Raza, Sharjeel Adnan, Muhammad Ishaq, Hyung Seok Kim, Rizwan Ali Naqvi and Seung-Won Lee
Mathematics 2023, 11(2), 257; https://doi.org/10.3390/math11020257 - 04 Jan 2023
Cited by 5 | Viewed by 2364
Abstract
The rapidly increasing prevalence of retinal diseases demands serious attention worldwide. Glaucoma is a critical ophthalmic disease that can cause permanent vision impairment. Typically, ophthalmologists diagnose glaucoma using manual assessments, which is an error-prone, subjective, and time-consuming approach. Therefore, the development of automated methods is crucial to strengthen and assist the existing diagnostic methods. In fundus imaging, optic cup (OC) and optic disc (OD) segmentation are widely accepted by researchers for glaucoma screening assistance. Many research studies have proposed artificial intelligence (AI)-based decision support systems for glaucoma diagnosis. However, existing AI-based methods show serious limitations in terms of accuracy and efficiency. Variations in backgrounds, pixel intensity values, and object sizes make the segmentation challenging. In particular, the OC is usually very small with unclear boundaries, which makes its segmentation even more difficult. To effectively address these problems, a novel feature excitation-based dense segmentation network (FEDS-Net) is developed to provide accurate OD and OC segmentation. FEDS-Net employs feature excitation and information aggregation (IA) mechanisms for enhancing OC and OD segmentation performance. FEDS-Net also uses rapid feature downsampling and efficient convolutional depth for diverse and efficient learning of the network. The proposed framework is comprehensively evaluated on three open databases: REFUGE, Drishti-GS, and Rim-One-r3. FEDS-Net outperformed state-of-the-art methods in segmentation performance. The small number of trainable parameters required (2.73 million) also confirms the superior computational efficiency of our proposed method.
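
The paper's feature-excitation block is not defined in the abstract; a common realization of channel-wise feature excitation is the squeeze-and-excitation (SE) block, sketched below as an assumption about the general mechanism rather than FEDS-Net's exact design.

```python
# Squeeze-and-excitation style channel attention: global-pool each channel,
# learn per-channel weights, and rescale the feature maps.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)           # global spatial average
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.excite(self.squeeze(x).view(b, c))      # per-channel weights in (0, 1)
        return x * w.view(b, c, 1, 1)                    # re-weight feature maps

x = torch.rand(2, 32, 64, 64)       # e.g., fundus-image feature maps
print(SEBlock(32)(x).shape)         # torch.Size([2, 32, 64, 64])
```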

24 pages, 4727 KiB  
Article
A Provable Secure Cybersecurity Mechanism Based on Combination of Lightweight Cryptography and Authentication for Internet of Things
by Adel A. Ahmed, Sharaf J. Malebary, Waleed Ali and Ahmed A. Alzahrani
Mathematics 2023, 11(1), 220; https://doi.org/10.3390/math11010220 - 01 Jan 2023
Cited by 4 | Viewed by 1594
Abstract
Internet of Things devices, platform programs, and network applications are all vulnerable to cyberattacks (digital attacks), which can be prevented at different levels by using cybersecurity protocols. In the Internet of Things (IoT), cyberattacks are specifically intended to retrieve or change/destroy sensitive information, with potential harm that may outweigh the IoT’s advantages. Furthermore, designing a lightweight cybersecurity mechanism that fits resource-constrained IoT devices is a critical challenge. For instance, identifying compromised devices and protecting users’ data and services are general cybersecurity challenges in an IoT system that should be considered. This paper proposes a secure cybersecurity system based on the integration of cryptography with authentication (ELCA) that utilizes elliptic curve Diffie–Hellman (ECDH) for key distribution while resolving the weak-bits problem in the shared secret key. Three systems of integration are investigated, and ELCA proposes a secure integration between authentication and encryption to facilitate the confidential and authentic transfer of messages between IoT devices over an insecure communication channel. Furthermore, the security of ELCA is proven mathematically using the random oracle model and an IoT adversary model. The emulation results show the effectiveness of ELCA’s performance in terms of CPU execution time reduced by 50%, storage cost reduced by 19.6–32%, and energy consumption reduced by 41% compared to the baseline cryptographic algorithms.
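
The ECDH key-agreement step at the core of ELCA can be sketched with the Python `cryptography` package; X25519 is chosen here for brevity (the paper's curve may differ), and the HKDF step merely stands in for the authors' weak-bit handling, which is not reproduced.

```python
# ECDH key agreement between two devices, followed by key derivation.
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives import hashes

# Each IoT device generates an ephemeral key pair and exchanges public keys.
device_a = X25519PrivateKey.generate()
device_b = X25519PrivateKey.generate()

shared_a = device_a.exchange(device_b.public_key())
shared_b = device_b.exchange(device_a.public_key())
assert shared_a == shared_b                      # same shared secret on both sides

# Derive a uniform symmetric key from the raw shared secret.
key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
           info=b"elca-session").derive(shared_a)   # "elca-session" label is illustrative
print(key.hex())
```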

13 pages, 1884 KiB  
Article
Health Status-Based Predictive Maintenance Decision-Making via LSTM and Markov Decision Process
by Pan Zheng, Wenqin Zhao, Yaqiong Lv, Lu Qian and Yifan Li
Mathematics 2023, 11(1), 109; https://doi.org/10.3390/math11010109 - 26 Dec 2022
Cited by 6 | Viewed by 1709
Abstract
Maintenance decision-making is essential to achieve safe and reliable operation with high performance for equipment. To avoid unexpected shutdowns and increase machine life as well as system efficiency, it is fundamental to design an effective maintenance decision-making scheme for equipment. In this paper, we propose a novel maintenance decision-making method for equipment based on Long Short-Term Memory (LSTM) and a Markov decision process, which can provide specific maintenance strategies in different degradation stages of the system. Specifically, the LSTM model is first applied to predict the remaining service life of equipment to quantitatively distinguish its health state. Then, based on the bearing residual life prediction curve, the degradation process model is constructed, and the corresponding parameters of the model are identified. Finally, the bearing degradation curve is obtained from the degradation process model, based on which the Markov decision process model is constructed to provide accurate maintenance strategies for different health conditions of the system. To demonstrate the effectiveness of the proposed method, an experimental study with the full life-cycle data set of rolling bearings is carried out. The experimental results show that the proposed method achieves efficient maintenance decisions for bearings under different health states, which provides a feasible solution for the maintenance of bearing systems.
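
The decision layer can be illustrated with value iteration on a small Markov decision process over discretized health states; the states, transition probabilities, and costs below are illustrative, not the paper's identified degradation model.

```python
# Value iteration over a toy maintenance MDP with three health states.
import numpy as np

states = ["healthy", "degraded", "critical"]
actions = ["run", "maintain", "replace"]

# P[a][s, s'] : transition probabilities under each action.
P = {
    "run":      np.array([[0.90, 0.09, 0.01],
                          [0.00, 0.80, 0.20],
                          [0.00, 0.00, 1.00]]),
    "maintain": np.array([[1.00, 0.00, 0.00],
                          [0.70, 0.30, 0.00],
                          [0.20, 0.60, 0.20]]),
    "replace":  np.array([[1.00, 0.00, 0.00]] * 3),
}
# Immediate costs: running while critical risks an expensive failure.
cost = {"run": np.array([0.0, 1.0, 50.0]),
        "maintain": np.array([2.0, 2.0, 5.0]),
        "replace": np.array([10.0, 10.0, 10.0])}

gamma, V = 0.95, np.zeros(len(states))
for _ in range(500):                      # value iteration to convergence
    Q = np.stack([cost[a] + gamma * P[a] @ V for a in actions])
    V = Q.min(axis=0)

policy = {s: actions[i] for s, i in zip(states, Q.argmin(axis=0))}
print(policy)   # {'healthy': 'run', 'degraded': 'maintain', 'critical': 'maintain'}
```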

25 pages, 3303 KiB  
Article
Performance Analysis of Feature Subset Selection Techniques for Intrusion Detection
by Yousef Almaghthawi, Iftikhar Ahmad and Fawaz E. Alsaadi
Mathematics 2022, 10(24), 4745; https://doi.org/10.3390/math10244745 - 14 Dec 2022
Cited by 3 | Viewed by 1537
Abstract
An intrusion detection system is one of the main defense lines used to provide security to data, information, and computer networks. The problems of this security system are the increased processing time, high false alarm rate, and low detection rate that occur due to the large amount of data containing various irrelevant and redundant features. Therefore, feature selection can solve this problem by reducing the number of features. Choosing appropriate feature selection methods that can reduce the number of features without a negative effect on the classification accuracy is a major challenge. This challenge motivated us to investigate the application of different wrapper feature selection techniques in intrusion detection. The performance of the selected techniques, such as the genetic algorithm (GA), sequential forward selection (SFS), and sequential backward selection (SBS), was analyzed, addressed, and compared to the existing techniques. The efficiency of the three feature selection techniques with two classification methods, support vector machine (SVM) and multilayer perceptron (MLP), was compared. The CICIDS2017, CSE-CIC-IDS2018, and NSL-KDD datasets were considered for the experiments. The experimental results demonstrated the efficiency of the proposed models, which achieved the highest accuracy on the selected datasets.
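
The SFS/SBS wrappers the paper analyzes are available off the shelf; a minimal sketch with scikit-learn's SequentialFeatureSelector and an SVM on synthetic stand-in data (direction="forward" gives SFS, "backward" gives SBS):

```python
# Wrapper feature selection: greedily add (or drop) features, scoring each
# candidate subset by cross-validated classifier performance.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=40, n_informative=8,
                           random_state=0)   # stand-in for an IDS dataset

svm = SVC(kernel="rbf")
sfs = SequentialFeatureSelector(svm, n_features_to_select=8,
                                direction="forward", cv=5, n_jobs=-1)
sfs.fit(X, y)
print("selected features:", sfs.get_support(indices=True))

X_reduced = sfs.transform(X)           # train the final classifier on the subset
print("reduced shape:", X_reduced.shape)
```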

15 pages, 5676 KiB  
Article
A General Framework for Flight Maneuvers Automatic Recognition
by Jing Lu, Hongjun Chai and Ruchun Jia
Mathematics 2022, 10(7), 1196; https://doi.org/10.3390/math10071196 - 06 Apr 2022
Cited by 8 | Viewed by 2026
Abstract
Flight Maneuver Recognition (FMR) refers to the automatic recognition of a series of aircraft flight patterns and is a key technology in many fields. The chaotic nature of the input data and the professional complexity of the identification process make maneuvers difficult and expensive to identify, and none of the existing models generalizes well. A general framework is proposed in this paper that can be used for all kinds of flight tasks, independent of aircraft type. We first preprocess the raw data with an unsupervised clustering method and segment it into maneuver sequences; we then reconstruct the sequences in phase space, calculate their approximate entropy to quantitatively characterize sequence complexity, and thereby distinguish the flight maneuvers. Experiments on a real flight training dataset show that the framework can quickly and correctly identify various flight maneuvers for multiple aircraft types with minimal human intervention.
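
Approximate entropy, the complexity measure the framework relies on, is easy to state in NumPy; m and r below follow the common defaults (m = 2, r = 0.2 times the standard deviation), which the paper may tune differently.

```python
# Approximate entropy (ApEn): low for regular signals, high for irregular ones.
import numpy as np

def approximate_entropy(x, m=2, r=None):
    x = np.asarray(x, dtype=float)
    r = 0.2 * x.std() if r is None else r

    def phi(m):
        # All overlapping windows of length m, compared pairwise with the
        # Chebyshev distance; c[i] is the fraction of windows similar to window i.
        windows = np.lib.stride_tricks.sliding_window_view(x, m)
        dist = np.max(np.abs(windows[:, None, :] - windows[None, :, :]), axis=-1)
        c = (dist <= r).mean(axis=1)
        return np.log(c).mean()

    return phi(m) - phi(m + 1)

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 500)
print(approximate_entropy(np.sin(2 * np.pi * t)))       # regular signal: low ApEn
print(approximate_entropy(rng.standard_normal(500)))    # noise: high ApEn
```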

Review

22 pages, 4940 KiB  
Review
Survey on the Application of Artificial Intelligence in ENSO Forecasting
by Wei Fang, Yu Sha and Victor S. Sheng
Mathematics 2022, 10(20), 3793; https://doi.org/10.3390/math10203793 - 14 Oct 2022
Cited by 6 | Viewed by 2605
Abstract
Climate disasters such as floods and droughts often bring heavy losses to human life, national economies, and public safety. El Niño/Southern Oscillation (ENSO) is one of the most important inter-annual climate signals in the tropics and has a global impact on atmospheric circulation and precipitation. To address the impact of climate change, accurate ENSO forecasts can help prevent related climate disasters. Traditional prediction methods mainly include statistical methods and dynamic methods. However, due to the variability and diversity of the temporal and spatial evolution of ENSO, traditional methods still have great uncertainty in predicting ENSO. In recent years, artificial intelligence technology has developed rapidly and gradually penetrated all aspects of people’s lives, and the climate field has also benefited. For example, deep learning methods can automatically learn and train from a large amount of sample data, obtain excellent feature representations, and effectively improve the performance of various learning tasks; they are widely used in computer vision, natural language processing, and other fields. In 2019, Ham et al. applied a convolutional neural network (CNN) model to ENSO forecasting 18 months in advance, and the winter ENSO forecasting skill reached 0.64, far exceeding the dynamic model’s forecasting skill of 0.5. These results were regarded as pioneering work for deep learning in the field of weather forecasting. This paper introduces the traditional ENSO forecasting methods and focuses on summarizing the latest artificial intelligence methods and their forecasting effects for ENSO forecasting, so as to provide a useful reference for future research.
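
As a toy version of the CNN approach the survey highlights (Ham et al., 2019), a small convolutional network can map stacked sea-surface-temperature anomaly maps to a scalar ENSO index at a chosen lead time; the shapes and data below are synthetic stand-ins, not the original work's training setup.

```python
# Toy CNN regressor: gridded SST anomaly stacks -> scalar Nino3.4 index.
import torch
import torch.nn as nn

class EnsoCNN(nn.Module):
    def __init__(self, in_months=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_months, 32, 5, padding=2), nn.Tanh(), nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 3, padding=1), nn.Tanh(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.LazyLinear(64), nn.Tanh(),
            nn.Linear(64, 1),               # predicted index at the chosen lead time
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

sst = torch.randn(16, 3, 24, 72)   # batch of 3-month SST anomaly stacks (lat x lon)
model = EnsoCNN()
loss = nn.MSELoss()(model(sst), torch.randn(16))
loss.backward()
```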
