
Intelligent Data Mining, Analysis and Modeling Based on Machine Learning

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (31 March 2024) | Viewed by 4049

Special Issue Editors

Dr. Danhuai Guo

Guest Editor

College of Information Sciences and Technology, Beijing University of Chemical Technology, Beijing 100029, China
Interests: spatio-temporal big data analysis; artificial intelligence; deep learning; geographic information science

Dr. Zhi Cai

Guest Editor

School of Computer Science, Beijing University of Technology, Beijing 100124, China
Interests: spatio-temporal data analysis and positioning algorithms; geosocial data mining; information retrieval

Dr. Yuping Lai

Guest Editor

School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China
Interests: information security; machine learning

Special Issue Information

Dear Colleagues,

In the realm of data-driven exploration, algorithms seamlessly intertwine with the digital landscape. Our focus converges at the forefront of Intelligent Data Mining, Analysis, and Modeling. This theme delves into the profound integration of machine learning techniques with the domains of data excavation, analysis, and model construction. Within this sphere, we embrace and surmount novel challenges. This Special Issue aims to present pioneering ideas and experimental outcomes in the domain of machine learning-based data mining, spanning from design, services, and theory to practical applications. It serves as a platform for the unveiling of breakthrough concepts and empirical discoveries, encompassing foundational theories to real-world implementations. Join us in exploring the transformative potential of machine learning within the realm of intelligent data exploration.

This Special Issue will publish high-quality and original research papers in the overlapping fields of:

Artificial intelligence;
Machine learning and deep learning;
Computational and data science;
Data integration and preprocessing;
Modeling methods and techniques;
Big data applications and algorithms;
Physics-informed neural network;
Spatiotemporal big data.

Dr. Danhuai Guo
Dr. Zhi Cai
Dr. Yuping Lai
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

artificial intelligence
machine learning and deep learning
computational and data science
data integration and preprocessing
modeling methods and techniques
big data applications and algorithms
physics-informed neural network
spatiotemporal big data

Published Papers (5 papers)

Research

17 pages, 1335 KiB

Open Access Article

Link Prediction Based on Data Augmentation and Metric Learning Knowledge Graph Embedding

by Lijuan Duan, Shengwen Han, Wei Jiang, Meng He and Yuanhua Qiao

Appl. Sci. 2024, 14(8), 3412; https://doi.org/10.3390/app14083412 - 18 Apr 2024

Viewed by 302

Abstract

A knowledge graph is a repository that represents a vast amount of information in the form of triplets. During training for knowledge graph completion, the knowledge graph contains only positive examples, which makes reliable link prediction difficult, especially in the setting of complex relations. At the same time, current techniques that rely on distance models embed entities within Euclidean space, limiting their ability to depict nuanced relationships and failing to capture their semantic importance. This research offers a strategy based on Gibbs sampling and connection embedding to improve the model's competency in handling link prediction within complex relationships. Gibbs sampling is initially used to obtain high-quality negative samples. Following that, the triplet entities are mapped onto a hyperplane defined by the connection. Through metric learning, this procedure produces complex relationship embeddings imbued with semantic meaning. Finally, the method's effectiveness is demonstrated on three link prediction benchmark datasets: FB15k-237, WN11RR and FB15k.
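
The hyperplane mapping mentioned in this abstract can be illustrated with a minimal sketch. It assumes a TransH-style projection (subtracting the component along the hyperplane normal) and a translation-based distance, neither of which the abstract confirms as the authors' exact formulation. The function names and the sample vectors are hypothetical.

```python
import math

def project_onto_hyperplane(v, normal):
    """Project v onto the hyperplane through the origin with the given normal."""
    norm = math.sqrt(sum(n * n for n in normal))
    unit = [n / norm for n in normal]              # unit-length normal vector
    dot = sum(a * b for a, b in zip(v, unit))      # component of v along the normal
    return [a - dot * u for a, u in zip(v, unit)]  # remove that component

def triplet_distance(head, relation, tail, normal):
    """Distance ||h_proj + r - t_proj|| for one (head, relation, tail) triplet."""
    h = project_onto_hyperplane(head, normal)
    t = project_onto_hyperplane(tail, normal)
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, relation, t)))

# Hypothetical 3-dimensional embeddings, chosen only for illustration.
head = [1.0, 2.0, 3.0]
tail = [2.0, 2.0, -1.0]
normal = [0.0, 0.0, 1.0]     # normal of the relation-specific hyperplane
relation = [1.0, 0.0, 0.0]   # translation vector lying in the hyperplane

projected_head = project_onto_hyperplane(head, normal)
score = triplet_distance(head, relation, tail, normal)
print(projected_head, score)
```

A smaller distance indicates a more plausible triplet under this assumed scoring scheme; negative samples, such as those the abstract obtains by Gibbs sampling, would be expected to score higher.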

(This article belongs to the Special Issue Intelligent Data Mining, Analysis and Modeling Based on Machine Learning)

20 pages, 1590 KiB

Open Access Article

Query Optimization in Distributed Database Based on Improved Artificial Bee Colony Algorithm

by Yan Du, Zhi Cai and Zhiming Ding

Appl. Sci. 2024, 14(2), 846; https://doi.org/10.3390/app14020846 - 19 Jan 2024

Viewed by 894

Abstract

Query optimization, which aims to produce the query execution plan with minimum cost, is one of the key factors affecting the performance of database systems. In distributed database systems in particular, multiple copies of the data are stored in different data nodes, which dramatically increases the number of feasible query execution plans for a query statement. As the volume of stored data grows, the cluster size of distributed databases also increases, resulting in poor performance of current query optimization algorithms. In this case, a dynamic perturbation-based artificial bee colony algorithm is proposed to solve the query optimization problem in distributed database systems. The improved artificial bee colony algorithm improves the global search capability by combining the selection, crossover, and mutation operators of the genetic algorithm to overcome the problem of easily falling into a local optimal solution. At the same time, a dynamic perturbation factor is introduced so that the algorithm parameters can vary dynamically with the iteration process and with the convergence degree of the whole population, improving the convergence efficiency of the algorithm. Finally, comparative experiments were conducted to assess the average execution cost of the Top-k query plans generated by the algorithms and the convergence speed of the algorithms for query statements in six different dimension sets. The results demonstrate that the Top-k query plans generated by the proposed method have a lower execution cost and that the method converges faster, which can effectively improve query efficiency. However, this method requires more execution time.

13 pages, 729 KiB

Open Access Article

Hybrid Clustering Algorithm Based on Improved Density Peak Clustering

by Limin Guo, Weijia Qin, Zhi Cai and Xing Su

Appl. Sci. 2024, 14(2), 715; https://doi.org/10.3390/app14020715 - 15 Jan 2024

Viewed by 748

Abstract

In the era of big data, unsupervised learning algorithms such as clustering are particularly prominent. In recent years, there have been significant advancements in clustering algorithm research. The Clustering by Fast Search and Find of Density Peaks algorithm, commonly called density peak clustering (DPC), was proposed in Science in 2014 and automatically finds cluster centers. It is simple, efficient, does not require iterative computation, and is suitable for large-scale and high-dimensional data. However, DPC and most of its refinements have several drawbacks. The method primarily considers the overall structure of the data, often resulting in the oversight of many clusters. The choice of truncation distance affects the calculation of local density values, and varying dataset sizes may necessitate different computational methods, impacting the quality of clustering results. In addition, the initial assignment of labels can cause a 'chain reaction', i.e., if one data point is incorrectly labeled, it may lead to more subsequent data points being incorrectly labeled. In this paper, we propose an improved density peak clustering method, DPC-MS, which uses the mean-shift algorithm to find local density extremes, making the accuracy of the algorithm independent of the truncation distance parameter dc. After finding the local density extreme points, the allocation strategy of the DPC algorithm is employed to assign the remaining points to appropriate local density extreme points, forming the final clusters. The robustness of this method in handling uncertain dataset sizes adds some application value, and several experiments were conducted on synthetic and real datasets to evaluate the performance of the proposed method. The results show that the proposed method outperforms some of the more recent methods in most cases.
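
For readers unfamiliar with the baseline this abstract builds on, the following is a minimal sketch of the original density peak clustering idea: local density from a cutoff distance, distance to the nearest denser point, and label propagation from denser neighbors. It is not the DPC-MS method proposed in the paper. The function name, the tie-breaking rule, and the sample points are assumptions made for illustration.

```python
import math

def density_peaks(points, dc, n_clusters):
    """Baseline DPC sketch: returns (labels, centers) for a list of points."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points closer than the cutoff distance dc.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc) for i in range(n)]
    # Visit points from densest to sparsest; ties are broken by index (an assumption).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n            # distance to the nearest point visited earlier
    nearest_higher = [-1] * n    # index of that point
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])   # densest point: use its largest distance
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        nearest_higher[i] = j
    # Cluster centers: points with the largest product rho * delta.
    centers = sorted(order, key=lambda i: -(rho[i] * delta[i]))[:n_clusters]
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Each remaining point inherits the label of its nearest denser neighbor.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, centers

# Two hypothetical, well-separated groups of five points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels, centers = density_peaks(points, dc=1.2, n_clusters=2)
print(labels, centers)
```

The label propagation step is where the 'chain reaction' described in the abstract can occur: a point that inherits a wrong label passes it on to every sparser point that depends on it. The result also depends directly on the choice of `dc`, which is the sensitivity the paper aims to remove.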

22 pages, 23761 KiB

Open Access Article

Robust Ranking Kernel Support Vector Machine via Manifold Regularized Matrix Factorization for Multi-Label Classification

by Heping Song, Yiming Zhou, Ebenezer Quayson, Qian Zhu and Xiangjun Shen

Appl. Sci. 2024, 14(2), 638; https://doi.org/10.3390/app14020638 - 11 Jan 2024

Viewed by 521

Abstract

Multi-label classification has been extensively researched and utilized for several decades. However, the performance of existing methods is highly susceptible to the presence of noisy data samples, resulting in a significant decrease in accuracy when noise levels are high. To address this issue, we propose a robust ranking support vector machine (Rank-SVM) method that incorporates manifold regularized matrix factorization. Unlike traditional Rank-SVM methods, our approach integrates feature selection and multi-label learning into a unified framework. Within this framework, we employ matrix factorization to learn a low-rank robust subspace within the input space, thereby enhancing the robustness of data representation in high-noise conditions. Additionally, we incorporate manifold structure regularization into the framework to preserve manifold relationships among low-rank samples, which further improves the robustness of the low-rank representation. Leveraging this robust low-rank representation, we extract resilient low-rank features and employ them to construct a more effective classifier. Finally, the proposed framework is extended to derive a kernelized ranking approach for the creation of nonlinear multi-label classifiers. To effectively solve this non-convex kernelized method, we employ the augmented Lagrangian multiplier (ALM) and alternating direction method of multipliers (ADMM) techniques to obtain the optimal solution. Experimental evaluations conducted on various datasets demonstrate that our framework achieves superior classification results and significantly enhances performance in high-noise scenarios.

29 pages, 7450 KiB

Open Access Article

Efficient Diagnosis of Autism Spectrum Disorder Using Optimized Machine Learning Models Based on Structural MRI

by Reem Ahmed Bahathiq, Haneen Banjar, Salma Kammoun Jarraya, Ahmed K. Bamaga and Rahaf Almoallim

Appl. Sci. 2024, 14(2), 473; https://doi.org/10.3390/app14020473 - 05 Jan 2024

Viewed by 881

Abstract

Autism spectrum disorder (ASD) affects approximately 1.4% of the population and imposes significant social and economic burdens. Because its etiology is unknown, effective diagnosis is challenging. Advancements in structural magnetic resonance imaging (sMRI) allow for the objective assessment of ASD by examining structural brain changes. Recently, machine learning (ML)-based diagnostic systems have emerged to expedite and enhance the diagnostic process. However, the expected success in ASD has not yet been achieved. This study evaluates and compares the performance of seven optimized ML models to identify sMRI-based biomarkers for early and accurate detection of ASD in children aged 5 to 10 years. The effects of hyperparameter tuning and feature selection techniques are investigated using two public datasets from the Autism Brain Imaging Data Exchange Initiative. Furthermore, these models are tested on a local Saudi dataset to verify their generalizability. The integration of the grey wolf optimizer with a support vector machine achieved the best performance with an average accuracy of 71% (with further improvement to 71% after adding personal features) using 10-fold cross-validation. The optimized models identified relevant biomarkers for diagnosis, lending credence to their truly generalizable nature and advancing scientific understanding of neurological changes in ASD.
