Deep Learning for Data Mining: Theory, Methods, and Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 16 June 2024 | Viewed by 3701

Special Issue Editors

School of Information Science and Technology, Shandong Normal University, Jinan 25035, China
Interests: data mining; recommender system; social media mining
School of Computer Science and Technology, Shandong Jianzhu University, Jinan 250101, China
Interests: machine learning; data mining; computer vision
Department of Computer Science and Technology, Ocean University of China, Qingdao 266005, China
Interests: data mining; machine learning; database systems
Tongda College, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Interests: data mining; recommendation systems

Special Issue Information

Dear Colleagues,

Deep learning has achieved significant success in many application areas, such as object recognition, natural language processing, and information retrieval. Owing to the limitations of shallow methods in mining knowledge from data, deep learning techniques are also frequently adopted for data mining tasks such as classification, prediction, time-series analysis, association, and clustering, which has significantly advanced the field of data mining.

This Special Issue aims to provide an academic platform to publish high-quality research papers on deep learning methods and their applications to data mining, including (but not limited to) extended versions of the outstanding SDAI2023 (https://www.sdaai.org.cn/sdai2023) papers.

Potential topics of interest for this Special Issue include:

  • Deep learning theory;
  • Deep learning algorithms;
  • Graph neural networks;
  • Deep reinforcement learning;
  • Classification methods;
  • Association rule mining;
  • User behavior modeling;
  • Click-through rate prediction;
  • Behavior pattern mining;
  • Recommendation methods;
  • Clustering algorithms;
  • Time-series analysis;
  • Spatial data mining;
  • Other deep learning applications for data mining.

Dr. Lei Guo
Prof. Dr. Xiushan Nie
Prof. Dr. Yanwei Yu
Dr. Yonghong Yu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, you can go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • data mining
  • application

Published Papers (5 papers)


Research

15 pages, 1507 KiB  
Article
A Personalized Federated Learning Method Based on Clustering and Knowledge Distillation
by Jianfei Zhang and Yongqiang Shi
Electronics 2024, 13(5), 857; https://doi.org/10.3390/electronics13050857 - 23 Feb 2024
Viewed by 463
Abstract
Federated learning (FL) is a distributed machine learning paradigm that preserves privacy. However, because of data heterogeneity among clients, the shared global model obtained after training cannot fit the distribution of each client’s dataset, and model performance degrades. To address this problem, we propose a personalized federated learning method based on clustering and knowledge distillation, called pFedCK. In this algorithm, each client holds an interactive model that participates in global training and a personalized model that is trained only locally. The two models distill knowledge into each other through the feature representations of their middle layers and their soft predictions. In addition, so that an interactive model obtains model information only from clients with similar data distributions and avoids interference from other, heterogeneous information, the server clusters the clients according to the similarity of the parameter variations uploaded by the different interactive models in every training round. Through this clustering, interactive models with similar data distributions cooperate with each other to better fit the local dataset distribution, and the personalized models improve by indirectly obtaining more valuable information. Finally, we conduct simulation experiments on three benchmark datasets under different data-heterogeneity scenarios. Compared with single-model algorithms, the accuracy of pFedCK improves by an average of 23.4% and 23.8% over FedAvg and FedProx, respectively; compared with typical personalization algorithms, it improves by an average of 0.8% and 1.3%, and by at most 1.0% and 2.9%, over FedDistill and FML. Full article
(This article belongs to the Special Issue Deep Learning for Data Mining: Theory, Methods, and Applications)
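
For readers who want a concrete picture of the server-side step described in the abstract, the following is a minimal sketch that clusters clients by the similarity of their uploaded parameter updates; the use of k-means, the normalization, and the function names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of the clustering step described above:
# group clients by the similarity of their uploaded parameter updates so that
# only clients with similar data distributions share knowledge in a round.
import numpy as np
from sklearn.cluster import KMeans

def cluster_clients_by_update(delta_params: list[np.ndarray], num_clusters: int) -> np.ndarray:
    """delta_params[i] is the flattened parameter change uploaded by client i
    in the current round; returns a cluster label per client."""
    updates = np.stack([d.ravel() for d in delta_params])
    # Normalise so that clustering reflects the direction of the update
    # (a proxy for data distribution) rather than its magnitude.
    norms = np.linalg.norm(updates, axis=1, keepdims=True) + 1e-12
    labels = KMeans(n_clusters=num_clusters, n_init=10).fit_predict(updates / norms)
    return labels

# Clients in the same cluster would then aggregate their interactive models
# with each other only, e.g., by FedAvg-style averaging within the cluster.
```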

15 pages, 3482 KiB  
Article
An Evolved Transformer Model for ADME/Tox Prediction
by Changheng Shao, Fengjing Shao, Song Huang, Rencheng Sun and Tao Zhang
Electronics 2024, 13(3), 624; https://doi.org/10.3390/electronics13030624 - 02 Feb 2024
Viewed by 617
Abstract
Drug discovery aims to keep fueling new medicines to cure and palliate many ailments and some still-untreatable diseases that afflict humanity. The ADME/Tox (absorption, distribution, metabolism, excretion/toxicity) properties of candidate drug molecules are key factors that determine their safety, uptake, elimination, metabolic behavior, and effectiveness in drug research and development. ADME/Tox prediction techniques drastically reduce the fraction of pharmaceutics-related failures in the early stages of drug development. Driven by the expectation of accelerated timelines, reduced costs, and the potential to reveal hidden insights from vast datasets, artificial intelligence techniques such as Graphormer show increasing promise for building custom models for molecular modeling tasks. However, Graphormer and other transformer-based models do not consider molecular fingerprints or the physicochemical descriptors that have proven effective in traditional computational drug research. Here, we propose an enhanced model based on Graphormer that uses a tree model to fully integrate this known information and achieves better prediction and interpretability. More importantly, the model achieves new state-of-the-art results on ADME/Tox property prediction benchmarks, surpassing several challenging models. Experimental results demonstrate an average SMAPE (symmetric mean absolute percentage error) of 18.9 and a PCC (Pearson correlation coefficient) of 0.86 on ADME/Tox prediction test sets. These findings highlight the efficacy of our approach and its potential to enhance drug discovery processes. By leveraging the strengths of Graphormer and incorporating additional molecular descriptors, our model offers improved predictive capability, thereby contributing to the advancement of ADME/Tox prediction in drug development. Integrating these information sources also enables better interpretability, helping researchers understand the underlying factors that influence the predictions. Overall, our work demonstrates the potential of the enhanced model to expedite drug discovery, reduce costs, and improve the success rate of pharmaceutical development efforts. Full article
(This article belongs to the Special Issue Deep Learning for Data Mining: Theory, Methods, and Applications)
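
The hybrid idea above, combining a learned graph-level embedding with classical fingerprints and physicochemical descriptors under a tree model, can be sketched as follows; the feature layout, the gradient-boosted trees, and the SMAPE helper are assumptions for illustration rather than the published pipeline.

```python
# Hedged sketch: concatenate a graph embedding (e.g., from a Graphormer-style
# encoder) with fingerprints and descriptors, then fit a tree-based regressor.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def fit_hybrid_adme_model(graph_embeddings: np.ndarray,
                          fingerprints: np.ndarray,
                          descriptors: np.ndarray,
                          targets: np.ndarray) -> GradientBoostingRegressor:
    """Each argument has one row per molecule; targets holds an ADME/Tox
    property (e.g., clearance) to be regressed."""
    features = np.hstack([graph_embeddings, fingerprints, descriptors])
    model = GradientBoostingRegressor(n_estimators=300, max_depth=4)
    model.fit(features, targets)
    return model

def smape(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Symmetric mean absolute percentage error, the metric reported above."""
    return 100.0 * np.mean(2.0 * np.abs(y_pred - y_true) /
                           (np.abs(y_true) + np.abs(y_pred) + 1e-12))
```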

14 pages, 933 KiB  
Article
Self-Supervised Clustering Models Based on BYOL Network Structure
by Xuehao Chen, Jin Zhou, Yuehui Chen, Shiyuan Han, Yingxu Wang, Tao Du, Cheng Yang and Bowen Liu
Electronics 2023, 12(23), 4723; https://doi.org/10.3390/electronics12234723 - 21 Nov 2023
Viewed by 613
Abstract
Clustering models based on contrastive learning usually rely on a large number of negative pairs to capture uniform representations, which requires a large batch size and high computational complexity. In contrast, some self-supervised methods perform non-contrastive learning and capture discriminative representations with positive pairs only, but they suffer from clustering collapse. To solve these issues, a novel end-to-end self-supervised clustering model is proposed in this paper. The basic self-supervised learning network is first modified, and a Softmax layer is incorporated to obtain cluster assignments as the data representation. Adversarial learning on the cluster assignments is then integrated into the model to further enhance discrimination across different clusters and to mitigate collapse between clusters. To further encourage clustering-oriented guidance, a new cluster-level discrimination is introduced that promotes clustering performance by measuring the self-correlation between the learned cluster assignments. Experimental results on real-world datasets show better performance of the proposed model compared with existing deep clustering methods. Full article
(This article belongs to the Special Issue Deep Learning for Data Mining: Theory, Methods, and Applications)
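
Two ingredients named in the abstract, a Softmax head that produces soft cluster assignments and a cluster-level loss based on the self-correlation of those assignments, can be sketched in PyTorch as below; the layer sizes and the exact loss form are assumptions for illustration, not the published model.

```python
# Hedged sketch of a cluster-assignment head and a cluster-level correlation
# loss between two augmented views (Barlow-Twins-style), assumed here only to
# illustrate the idea of cluster-level discrimination.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClusterHead(nn.Module):
    def __init__(self, feat_dim: int, num_clusters: int):
        super().__init__()
        self.proj = nn.Linear(feat_dim, num_clusters)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Soft cluster assignments: one probability vector per sample.
        return F.softmax(self.proj(features), dim=1)

def cluster_correlation_loss(p1: torch.Tensor, p2: torch.Tensor) -> torch.Tensor:
    """Make the assignments of two views agree cluster-by-cluster (diagonal
    pushed to 1) while keeping different clusters decorrelated (off-diagonal
    pushed to 0), which discourages collapse onto a single cluster."""
    p1 = (p1 - p1.mean(0)) / (p1.std(0) + 1e-6)   # column-wise normalisation
    p2 = (p2 - p2.mean(0)) / (p2.std(0) + 1e-6)
    corr = p1.T @ p2 / p1.shape[0]                 # cluster-by-cluster correlation
    on_diag = (1.0 - torch.diagonal(corr)).pow(2).sum()
    off_diag = (corr - torch.diag(torch.diagonal(corr))).pow(2).sum()
    return on_diag + 0.01 * off_diag
```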

14 pages, 1476 KiB  
Article
Property Analysis of Gateway Refinement of Object-Oriented Petri Net with Inhibitor-Arcs-Based Representation for Embedded Systems
by Chuanliang Xia, Mengying Qin, Yan Sun and Maibo Guo
Electronics 2023, 12(18), 3977; https://doi.org/10.3390/electronics12183977 - 21 Sep 2023
Viewed by 574
Abstract
This paper focuses on embedded system modeling, proposing a solution that obtains a refined net via the refinement operation of an extended Petri net. Object-oriented technology and the Petri net with inhibitor-arcs-based representation for embedded systems (PIRES+) are combined to obtain an object-oriented PIRES+ (OOPIRES+). A gateway refinement method for OOPIRES+ is proposed, and the preservation of liveness, boundedness, reachability, functionality, and timing in the refined net system is investigated. The modeling and analysis of a smart home system is taken as an example to verify the effectiveness of the refinement method. The results provide an effective way to investigate the properties of refined Petri net systems and a favorable means of modeling large-scale, complex embedded systems, with broad application prospects. Full article
(This article belongs to the Special Issue Deep Learning for Data Mining: Theory, Methods, and Applications)
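
As background for the refinement results above, the sketch below shows the standard enabling and firing rule for a Petri net with inhibitor arcs: a transition may fire only if every ordinary input place carries enough tokens and every inhibitor place is empty. It is a generic illustration, not the paper's PIRES+/OOPIRES+ formalism.

```python
# Generic Petri net with inhibitor arcs: enabling and firing of a transition.
from dataclasses import dataclass, field

@dataclass
class Transition:
    inputs: dict[str, int]                              # place -> tokens consumed
    outputs: dict[str, int]                             # place -> tokens produced
    inhibitors: set[str] = field(default_factory=set)   # places that must be empty

def enabled(t: Transition, marking: dict[str, int]) -> bool:
    return (all(marking.get(p, 0) >= w for p, w in t.inputs.items())
            and all(marking.get(p, 0) == 0 for p in t.inhibitors))

def fire(t: Transition, marking: dict[str, int]) -> dict[str, int]:
    assert enabled(t, marking), "transition not enabled"
    m = dict(marking)
    for p, w in t.inputs.items():
        m[p] -= w
    for p, w in t.outputs.items():
        m[p] = m.get(p, 0) + w
    return m
```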

21 pages, 9487 KiB  
Article
An Algorithm Based on DAF-Net++ Model for Wood Annual Rings Segmentation
by Zhedong Ge, Ziheng Zhang, Liming Shi, Shuai Liu, Yisheng Gao, Yucheng Zhou and Qiang Sun
Electronics 2023, 12(14), 3009; https://doi.org/10.3390/electronics12143009 - 09 Jul 2023
Viewed by 818
Abstract
The semantic segmentation of annual rings is a research topic of interest in wood chronology. To address the difficulty of segmenting annual rings in dense areas and their sensitivity to defects such as cracks and wormholes, this paper builds a DAF-Net++ model based on U-Net, with VGG16 as the backbone network and enriched with dense skip connections, CBAM, and DCAM. In this model, VGG16 strengthens image feature extraction, the dense skip connections fuse semantic information from different levels, DCAM provides weighting guidance for shallow features, and CBAM compensates for information lost during down-sampling. Taking Chinese fir as the experimental object, 1700 CT images of transverse wood sections were obtained with medical CT equipment, and 120 of them were randomly selected as the dataset, which was expanded by cropping, rotation, and other augmentations. DAF-Net++ was trained and used to segment the annual rings, and the performance of the model was then evaluated. Training first freezes and then unfreezes the backbone, and uses Focal Loss as the loss function, ReLU as the activation function, and Adam as the optimizer. The experimental results show that, in the segmentation of CT images of Chinese fir annual rings, DAF-Net++ achieves an MIoU of 93.67%, an MPA of 96.76%, a PA of 96.63%, and a Recall of 96.76%. Compared with other semantic segmentation models such as U-Net, U-Net++, and DeepLabv3+, DAF-Net++ has better segmentation performance. Full article
(This article belongs to the Special Issue Deep Learning for Data Mining: Theory, Methods, and Applications)
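
The Focal Loss mentioned in the training setup can be sketched for multi-class segmentation as follows; the gamma and alpha values are common defaults and are assumptions here, not necessarily the values used in the paper.

```python
# Hedged sketch of a multi-class focal loss for dense segmentation logits.
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, target: torch.Tensor,
               gamma: float = 2.0, alpha: float = 0.25) -> torch.Tensor:
    """logits: (N, C, H, W) raw class scores; target: (N, H, W) integer labels.
    Down-weights easy pixels so training focuses on hard regions such as
    densely packed rings or areas near defects."""
    log_p = F.log_softmax(logits, dim=1)
    ce = F.nll_loss(log_p, target, reduction="none")   # per-pixel cross-entropy
    p_t = torch.exp(-ce)                               # probability of the true class
    return (alpha * (1.0 - p_t) ** gamma * ce).mean()
```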
