Advances in Machine Learning and Intelligent Information Systems

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Information Systems".

Deadline for manuscript submissions: 31 July 2024 | Viewed by 23573

Special Issue Editors

Dr. Eftim Zdravevski

Guest Editor

Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Skopje, North Macedonia
Interests: big data; stream processing; machine learning; time series analysis; data warehouses

Prof. Dr. Petre Lameski

Guest Editor

Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Skopje, North Macedonia
Interests: machine learning; multivariate time series data analysis; deep learning; software architectures

Prof. Dr. Ivan Miguel Pires

Guest Editor

Instituto de Telecomunicações, Universidade da Beira Interior, Covilhã, Portugal
Interests: ambient assisted living; data classification; data processing; data fusion; volume; image processing; medical imaging

Special Issue Information

Dear Colleagues,

At present, the success of companies is defined by their ability to cope with and adapt to new needs and upcoming trends. This includes new and ever-changing patterns and requirements in data generation, data acquisition, data processing, data understanding, and data visualization. Furthermore, extracting meaningful knowledge is paramount and challenging in such a dynamic, data-driven world. To help industries cope with these needs, there have been numerous technological developments in recent years in the fields of big data processing, machine learning on streaming data, cloud data warehouses and data lakes, intelligent decision support systems, etc.

This Special Issue encourages the submission of papers presenting state-of-the-art research and application of machine learning approaches in various industrial settings. Topics of interest include (but are not limited to) the following subject categories:

Big data.
Streaming data.
Stream processing.
Scalable cloud infrastructures.
Deep learning and machine learning (DL/ML) on big data.
Real-time analytics.
Multi-variate time series.
Data fusion.
Cloud data warehouses.
Data lakes.
Multi-cloud data processing architectures.
Application of ML in medicine and health informatics.
Application of ML in retail.
Application of ML in banking, financial services, and insurance (BFSI).
Data fabric and data mesh architectures.

Dr. Eftim Zdravevski
Prof. Dr. Petre Lameski
Prof. Dr. Ivan Miguel Pires
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

big data
machine learning
industrial applications
data lakes
data fusion
streaming data
real-time analytics

Published Papers (10 papers)

Research

16 pages, 359 KiB

Open Access Article

Automated Trace Clustering Pipeline Synthesis in Process Mining

by Iuliana Malina Grigore, Gabriel Marques Tavares, Matheus Camilo da Silva, Paolo Ceravolo and Sylvio Barbon Junior

Information 2024, 15(4), 241; https://doi.org/10.3390/info15040241 - 20 Apr 2024

Viewed by 404

Abstract

Business processes have undergone a significant transformation with the advent of the process-oriented view in organizations. The increasing complexity of business processes and the abundance of event data have driven the development and widespread adoption of process mining techniques. However, the size and noise of event logs pose challenges that require careful analysis. The inclusion of different sets of behaviors within the same business process further complicates data representation, highlighting the continued need for innovative solutions in the evolving field of process mining. Trace clustering is emerging as a solution to improve the interpretation of underlying business processes. Trace clustering offers benefits such as mitigating the impact of outliers, providing valuable insights, reducing data dimensionality, and serving as a preprocessing step in robust pipelines. However, designing an appropriate clustering pipeline can be challenging for non-experts due to the complexity of the process and the number of steps involved. For experts, it can be time-consuming and costly, requiring careful consideration of trade-offs. To address the challenge of pipeline creation, the paper proposes a genetic programming solution for trace clustering pipeline synthesis that optimizes a multi-objective function matching clustering and process quality metrics. The solution is applied to real event logs, and the results demonstrate improved performance in downstream tasks through the identification of sub-logs. Full article

(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Information Systems)

26 pages, 7467 KiB

Open Access Article

Exploring Key Issues in Cybersecurity Data Breaches: Analyzing Data Breach Litigation with ML-Based Text Analytics

by Dominik Molitor, Wullianallur Raghupathi, Aditya Saharia and Viju Raghupathi

Information 2023, 14(11), 600; https://doi.org/10.3390/info14110600 - 05 Nov 2023

Viewed by 3313

Abstract

While data breaches are a frequent and universal phenomenon, the characteristics and dimensions of data breaches are unexplored. In this novel exploratory research, we apply machine learning (ML) and text analytics to a comprehensive collection of data breach litigation cases to extract insights from the narratives contained within these cases. Our analysis shows stakeholders (e.g., litigants) are concerned about major topics related to identity theft, hacker, negligence, FCRA (Fair Credit Reporting Act), cybersecurity, insurance, phone device, TCPA (Telephone Consumer Protection Act), credit card, merchant, privacy, and others. The topics fall into four major clusters: “phone scams”, “cybersecurity”, “identity theft”, and “business data breach”. By utilizing ML, text analytics, and descriptive data visualizations, our study serves as a foundational piece for comprehensively analyzing large textual datasets. The findings hold significant implications for both researchers and practitioners in cybersecurity, especially those grappling with the challenges of data breaches. Full article

(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Information Systems)

13 pages, 397 KiB

Open Access Article

Knowledge Graph Based Recommender for Automatic Playlist Continuation

by Aleksandar Ivanovski, Milos Jovanovik, Riste Stojanov and Dimitar Trajanov

Information 2023, 14(9), 510; https://doi.org/10.3390/info14090510 - 16 Sep 2023

Viewed by 1552

Abstract

In this work, we present a state-of-the-art solution for automatic playlist continuation through a knowledge graph-based recommender system. By integrating representational learning with graph neural networks and fusing multiple data streams, the system effectively models user behavior, leading to accurate and personalized recommendations. We provide a systematic and thorough comparison of our results with existing solutions and approaches, demonstrating the remarkable potential of graph-based representation in improving recommender systems. Our experiments reveal substantial enhancements over existing approaches, further validating the efficacy of this novel approach. Additionally, through comprehensive evaluation, we highlight the robustness of our solution in handling dynamic user interactions and streaming data scenarios, showcasing its practical viability and promising prospects for next-generation recommender systems. Full article

(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Information Systems)

18 pages, 5379 KiB

Open Access Article

AttG-BDGNets: Attention-Guided Bidirectional Dynamic Graph IndRNN for Non-Intrusive Load Monitoring

by Zuoxin Wang and Xiaohu Zhao

Information 2023, 14(7), 383; https://doi.org/10.3390/info14070383 - 04 Jul 2023

Viewed by 1096

Abstract

Most current non-intrusive load monitoring methods focus on traditional load characteristic analysis and algorithm optimization, lack knowledge of users’ electricity consumption behavior habits, and have poor accuracy. We propose a novel attention-guided bidirectional dynamic graph IndRNN approach. The method first extends sequence or multidimensional data to a topological graph structure. It effectively utilizes the global context by following an adaptive graph topology derived from each set of data content. Then, the bidirectional Graph IndRNN network (Graph IndRNN) encodes the aggregated signals into different graph nodes, which use node information transfer and aggregation based on the entropy measure, power attribute characteristics, and the time-related structural characteristics of the corresponding device signals. The function dynamically incorporates local and global contextual interactions from positive and negative directions to learn the neighboring node information for non-intrusive load decomposition. In addition, using the sequential attention mechanism as a guide while eliminating redundant information facilitates flexible reasoning and establishes good vertex relationships. Finally, we conducted experimental evaluations on multiple open source data, proving that the method has good robustness and accuracy. Full article

(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Information Systems)

20 pages, 4094 KiB

Open Access Article

Towards Safe Cyber Practices: Developing a Proactive Cyber-Threat Intelligence System for Dark Web Forum Content by Identifying Cybercrimes

by Kanti Singh Sangher, Archana Singh, Hari Mohan Pandey and Vivek Kumar

Information 2023, 14(6), 349; https://doi.org/10.3390/info14060349 - 18 Jun 2023

Cited by 2 | Viewed by 2794

Abstract

The untraceable part of the Deep Web, also known as the Dark Web, is one of the most used “secretive spaces” to execute all sorts of illegal and criminal activities by terrorists, cybercriminals, spies, and offenders. Identifying actions, products, and offenders on the Dark Web is challenging due to its size, intractability, and anonymity. Therefore, it is crucial to intelligently enforce tools and techniques capable of identifying the activities of the Dark Web to assist law enforcement agencies as a support system. Therefore, this study proposes four deep learning architectures (RNN, CNN, LSTM, and Transformer)-based classification models using the pre-trained word embedding representations to identify illicit activities related to cybercrimes on Dark Web forums. We used the Agora dataset derived from the DarkNet market archive, which lists 109 activities by category. The listings in the dataset are vaguely described, and several data points are untagged, which rules out the automatic labeling of category items as target classes. Hence, to overcome this constraint, we applied a meticulously designed human annotation scheme to annotate the data, taking into account all the attributes to infer the context. In this research, we conducted comprehensive evaluations to assess the performance of our proposed approach. Our proposed BERT-based classification model achieved an accuracy score of 96%. Given the unbalancedness of the experimental data, our results indicate the advantage of our tailored data preprocessing strategies and validate our annotation scheme. Thus, in real-world scenarios, our work can be used to analyze Dark Web forums and identify cybercrimes by law enforcement agencies and can pave the path to develop sophisticated systems as per the requirements. Full article

(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Information Systems)

19 pages, 624 KiB

Open Access Article

Artificially Intelligent Readers: An Adaptive Framework for Original Handwritten Numerical Digits Recognition with OCR Methods

by Parth Hasmukh Jain, Vivek Kumar, Jim Samuel, Sushmita Singh, Abhinay Mannepalli and Richard Anderson

Information 2023, 14(6), 305; https://doi.org/10.3390/info14060305 - 26 May 2023

Cited by 3 | Viewed by 3199

Abstract

Advanced artificial intelligence (AI) techniques have led to significant developments in optical character recognition (OCR) technologies. OCR applications, using AI techniques for transforming images of typed text, handwritten text, or other forms of text into machine-encoded text, provide a fair degree of accuracy for general text. However, even after decades of intensive research, creating OCR with human-like abilities has remained elusive. One of the challenges has been that OCR models trained on general text do not perform well on localized or personalized handwritten text due to differences in the writing style of alphabets and digits. This study aims to discuss the steps needed to create an adaptive framework for OCR models, with the intent of exploring a reasonable method to customize an OCR solution for a unique dataset of English-language numerical digits developed for this study. We develop a digit recognizer by training our model on the MNIST dataset with a convolutional neural network and contrast it with multiple models trained on combinations of the MNIST and custom digits. Using our methods, we observed results comparable with the baseline and provided recommendations for improving OCR accuracy for localized or personalized handwritten text. This study also provides an alternative perspective to generating data using conventional methods, which can serve as a gold standard for custom data augmentation to help address the challenges of scarce data and data imbalance. Full article

(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Information Systems)

18 pages, 8972 KiB

Open Access Article

Emoji, Text, and Sentiment Polarity Detection Using Natural Language Processing

by Shelley Gupta, Archana Singh and Vivek Kumar

Information 2023, 14(4), 222; https://doi.org/10.3390/info14040222 - 05 Apr 2023

Cited by 2 | Viewed by 3028

Abstract

Virtual users generate a gigantic volume of unbalanced sentiments over various online crowd-sourcing platforms, which consist of text, emojis, or a combination of both. Accurate analysis of these sentiments brings profits to various industries and their services. State-of-the-art approaches detect sentiment polarity using common sense with text only. This research work proposes an emoji-based framework for cognitive–conceptual–affective computing of sentiment polarity based on the linguistic patterns of text and emojis. The proposed emoji and text-based parser articulates sentiments with proposed linguistic features along with a combination of different emojis to generate the part of speech into n-gram patterns. In this paper, 168,548 tweets by 650 world-famous personages have been downloaded from across the world. The results illustrate that the existence of emojis in sentiments often seems to change the overall polarity of the sentiment. By extension, the CLDR name of the emoji is utilized to evaluate the accurate polarity of emoji patterns, and a dictionary of sentiments is adopted for evaluating the polarity of text. Eventually, the performances of three ML classifiers (SVM, DT, and Naïve Bayes) are evaluated for proposed distinctive linguistic features. The robust experiments indicate that the proposed approach outperforms the SVM classifier as compared to other ML classifiers. The proposed polarity detection generator has achieved an exceptional perspective of sentiments presented in the sentence by employing the flow of concept established, based on linguistic features, polarity inversion, coordination, and discourse patterns, surpassing the performance of extant state-of-the-art approaches. Full article

(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Information Systems)

19 pages, 9708 KiB

Open Access Article

Tracking Unauthorized Access Using Machine Learning and PCA for Face Recognition Developments

by Vasile-Daniel Păvăloaia and George Husac

Information 2023, 14(1), 25; https://doi.org/10.3390/info14010025 - 30 Dec 2022

Cited by 3 | Viewed by 2933

Abstract

In the last two decades, tremendous improvements have been made in the field of artificial intelligence (AI), especially in the sector of face/facial recognition (FR). Over the years, the world has made remarkable progress in the technology that enhanced the use of face detection techniques on common PCs and smartphones. Moreover, the steady progress of programming languages, libraries, frameworks, and tools, combined with the great passion of developers and researchers worldwide, contributes substantially to open-source AI materials that have made machine learning (ML) algorithms available to any scholar with the will to build the software of tomorrow. The study aims to analyze the specialized literature starting from the first prototype delivered by Cambridge University until the most recent discoveries in FR. The purpose is to identify the most proficient algorithms and the existing gap in the specialized literature. The research builds an FR application based on simplicity and efficiency of code that facilitates a person’s face detection using a real-time photo and validates access by querying a given database. The paper contributes to the field through the literature review analysis as well as through customized code in Python, using ML with Principal Component Analysis (PCA), AdaBoost, and MySQL, for the development of a myriad of applications in a variety of domains. Full article

(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Information Systems)

15 pages, 5720 KiB

Open Access Article

The Use of Random Forest Regression for Estimating Leaf Nitrogen Content of Oil Palm Based on Sentinel 1-A Imagery

by Sirojul Munir, Kudang Boro Seminar, Sudradjat, Heru Sukoco and Agus Buono

Information 2023, 14(1), 10; https://doi.org/10.3390/info14010010 - 26 Dec 2022

Cited by 4 | Viewed by 2326

Abstract

For obtaining a spatial map of the distribution of nitrogen nutrients from oil palm plantations, a quite complex Leaf Sampling Unit (LSU) is required. In addition, sample analysis in the laboratory is time consuming and quite expensive, especially for large plantation areas. Monitoring the nutrition of oil palm plants can be achieved using remote-sensing technology. The main obstacles of using passive sensors in multispectral imagery are cloud cover and shadow noise. This research used C-SAR Sentinel equipped with active sensors that can overcome cloud barriers. A model to estimate leaf nitrogen nutrient status was constructed using random forest regression (RFR) based on multiple polarization (VV-VH) and local incidence angle (LIA) data on Sentinel-1A imagery. A sample of 1116 LSU data from different islands (i.e., Sumatra, Java, and Kalimantan) was used to develop the proposed estimation model. The performance evaluation of the model obtained the averaged MAPE, correctness, and MSE of 9.68%, 90.32% and 11.03%, respectively. Spatial maps of the distribution of nitrogen values in certain oil palm areas can be produced and visualized on the web so that they can be accessed easily and quickly for various purposes of oil palm management such as fertilization planning, recommendations, and monitoring. Full article

(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Information Systems)
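
The abstract above reports an averaged MAPE of 9.68% and a "correctness" of 90.32%, two figures that sum to 100%. A minimal Python sketch follows, assuming that correctness is defined as 100% minus MAPE; that definition is inferred from the reported numbers and is not stated in the abstract. The function names and the sample leaf nitrogen values are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def correctness(actual, predicted):
    """Assumed definition: 100% minus MAPE (not confirmed by the source)."""
    return 100.0 - mape(actual, predicted)


# Hypothetical leaf nitrogen values (% dry weight), for illustration only.
measured = [2.50, 2.40, 2.60]
estimated = [2.25, 2.64, 2.60]

error_pct = mape(measured, estimated)
correct_pct = correctness(measured, estimated)
```

Under this assumed definition the two metrics carry the same information, so either one is enough to compare models.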

19 pages, 396 KiB

Open Access Article

Regularized Mixture Rasch Model

by Alexander Robitzsch

Information 2022, 13(11), 534; https://doi.org/10.3390/info13110534 - 10 Nov 2022

Cited by 4 | Viewed by 1516

Abstract

The mixture Rasch model is a popular mixture model for analyzing multivariate binary data. The drawback of this model is that the number of estimated parameters substantially increases with an increasing number of latent classes, which, in turn, hinders the interpretability of model parameters. This article proposes regularized estimation of the mixture Rasch model that imposes some sparsity structure on class-specific item difficulties. We illustrate the feasibility of the proposed modeling approach by means of one simulation study and two simulated case studies. Full article

(This article belongs to the Special Issue Advances in Machine Learning and Intelligent Information Systems)
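
To make the ideas in the abstract above concrete, here is a minimal Python sketch of the standard Rasch response probability, a mixture over latent classes with class-specific item difficulties, and an L1-type penalty on those difficulties. The penalty shown (absolute deviations from the across-class mean per item) is only an illustrative assumption and is not necessarily the one used in the article. The likelihood is also simplified by conditioning on a single ability value instead of integrating over a class-specific ability distribution. All function names are hypothetical.

```python
import math


def rasch_prob(theta, b):
    """Rasch model: probability of a correct response given ability theta
    and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def mixture_likelihood(responses, theta, class_probs, class_difficulties):
    """Likelihood of a binary response pattern under a latent-class mixture.

    class_difficulties[c][i] is the difficulty of item i in class c.
    Simplification: theta is treated as known and shared across classes.
    """
    total = 0.0
    for pi_c, difficulties in zip(class_probs, class_difficulties):
        lik = 1.0
        for x, b in zip(responses, difficulties):
            p = rasch_prob(theta, b)
            lik *= p if x == 1 else 1.0 - p
        total += pi_c * lik
    return total


def l1_penalty(class_difficulties, lam):
    """Illustrative sparsity penalty (an assumption): lam times the sum of
    absolute deviations of class-specific difficulties from the item mean."""
    n_classes = len(class_difficulties)
    n_items = len(class_difficulties[0])
    penalty = 0.0
    for i in range(n_items):
        mean_b = sum(row[i] for row in class_difficulties) / n_classes
        penalty += sum(abs(row[i] - mean_b) for row in class_difficulties)
    return lam * penalty
```

A penalty of this kind shrinks class-specific difficulties toward a common value, so items whose difficulty does not differ between classes need fewer effective parameters.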

Planned Papers

The list below represents only planned manuscripts. Some of these manuscripts have not been received by the Editorial Office yet. Papers submitted to MDPI journals are subject to peer review.

Title: Design, building and deployment of smart applications for predicting Remaining Useful Life (RUL) in industrial case uses
Authors: Marta Zorrilla
Affiliation: Department of Computer Science and Electronics, University of Cantabria, Avda. Los Castros s/n, Santander, 39005, Spain
Abstract: This paper presents a comparative analysis of deep learning techniques for predicting Remaining Useful Life (RUL). We explore various deep learning architectures on distinct datasets, including recurrent neural networks (RNNs, LSTMs and GRUs), convolutional neural networks (CNNs) and Transformers, to assess their effectiveness in RUL estimation. Furthermore, we employ explainability techniques to elucidate the decision-making processes of these models and evaluate their interpretability. By analysing the inner workings of the models, we aim to provide insights into the factors influencing RUL predictions. Through comprehensive experimentation and analysis, this study contributes to the understanding of deep learning methodologies for RUL prediction and underscores the importance of model interpretability in critical applications such as prognostics and health management. In addition, we specify the smart system using the RAI4.0 Metamodel, meant for designing, configuring and automatically deploying distributed stream-based industrial applications. Our findings will offer valuable guidance for practitioners seeking to deploy deep learning techniques effectively in predictive maintenance systems, facilitating informed decision-making and enhancing reliability and efficiency in industrial operations.
