Second Edition of Predictive Analytics and Data Science

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Information and Communications Technology".

Deadline for manuscript submissions: 30 June 2024 | Viewed by 8539

Special Issue Editors


Dr. Agnes Vathy-Fogarassy
Guest Editor
Department of Computer Science and Systems Technology, University of Pannonia, 8200 Veszprém, Hungary
Interests: artificial intelligence; machine learning; data mining; health informatics; network analysis

Prof. Dr. János Abonyi
Guest Editor
MTA-PE Lendület Complex Systems Monitoring Research Group, Department of Process Engineering, University of Pannonia, H-8200 Veszprém, Hungary
Interests: chemical engineering; complex systems; computational intelligence; network science; process engineering

Special Issue Information

Dear Colleagues,

The development and maintenance of predictive data-driven models pose several challenges, such as feature selection, model structure optimisation, sensitivity analysis, model validation, model maintenance, transfer learning and adaptation, model deployment, and evaluation of the benefit of applying the models.

This Special Issue solicits papers covering the development, validation, application, and maintenance of predictive analytics models and presenting real-life applications. The potential topics include, but are not limited to:

  • Classification-based prediction models;
  • Regression-based prediction models;
  • Forecasting using deep learning methods and algorithms;
  • Managing uncertainty and missing data in forecasting;
  • The life cycle of predictive models, and maintaining predictive models;
  • Development and validation of online predictive models;
  • Self-learning predictive models;
  • Predictive analytics in Industry 4.0 (application of sensors, historical experience);
  • Predictive analytics in healthcare and the economy (e.g., patient pathway prediction, predicting complications, customer relationship management, risk reduction, churn prevention, market trend analysis, credit scoring);
  • Social media and text-analysis-based predictive models and systems.

Dr. Agnes Vathy-Fogarassy
Prof. Dr. János Abonyi
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • classification
  • regression
  • deep learning
  • uncertainty
  • validation and maintenance
  • self-learning
  • real-life applications

Published Papers (5 papers)

Research

13 pages, 449 KiB  
Article
A Proactive Decision-Making Model for Evaluating the Reliability of Infrastructure Assets of a Railway System
by Daniel O. Aikhuele and Shahryar Sorooshian
Information 2024, 15(4), 219; https://doi.org/10.3390/info15040219 - 13 Apr 2024
Viewed by 409
Abstract
Railway infrastructure is generally classified as either fixed or movable infrastructure assets. Failure in any of the assets could lead to the complete shutdown and disruption of the entire system, economic loss, inconvenience to passengers and the train operating company(s), and can sometimes result in death or injury in the event of the derailment of the rolling stock. Considering the importance of the railway infrastructure assets, it is necessary to continuously explore their behavior, reliability, and safety. In this paper, a proactive multi-criteria decision-making model that is based on an interval-valued intuitionistic fuzzy set and some quantitative reliability parameters has been proposed for the evaluation of the reliability of the infrastructure assets. Results from the evaluation show that the failure mode ‘Broken and defective rails’ has the most risk and reliability concerns. Hence, priority should be given to this failure mode to avoid a total system collapse.
(This article belongs to the Special Issue Second Edition of Predictive Analytics and Data Science)

19 pages, 574 KiB  
Article
Generally Applicable Q-Table Compression Method and Its Application for Constrained Stochastic Graph Traversal Optimization Problems
by Tamás Kegyes, Alex Kummer, Zoltán Süle and János Abonyi
Information 2024, 15(4), 193; https://doi.org/10.3390/info15040193 - 31 Mar 2024
Viewed by 519
Abstract
We analyzed a special class of graph traversal problems, where the distances are stochastic, and the agent is restricted to take a limited range in one go. We showed that both constrained shortest Hamiltonian pathfinding problems and disassembly line balancing problems belong to the class of constrained shortest pathfinding problems, which can be represented as mixed-integer optimization problems. Reinforcement learning (RL) methods have proven their efficiency in multiple complex problems. However, researchers concluded that the learning time increases radically as the state and action spaces grow. In continuous cases, approximation techniques are used, but these methods have several limitations in mixed-integer search spaces. We present the Q-table compression method as a multistep method with dimension reduction, state fusion, and space compression techniques that project a mixed-integer optimization problem into a discrete one. The RL agent is then trained using an extended Q-value-based method to deliver a human-interpretable model for optimal action selection. Our approach was tested in selected constrained stochastic graph traversal use cases, and the results are compared with the simple grid-based discretization method.

32 pages, 1285 KiB  
Article
Comparative Analysis of NLP-Based Models for Company Classification
by Maryan Rizinski, Andrej Jankov, Vignesh Sankaradas, Eugene Pinsky, Igor Mishkovski and Dimitar Trajanov
Information 2024, 15(2), 77; https://doi.org/10.3390/info15020077 - 31 Jan 2024
Viewed by 2004
Abstract
The task of company classification is traditionally performed using established standards, such as the Global Industry Classification Standard (GICS). However, these approaches heavily rely on laborious manual efforts by domain experts, resulting in slow, costly, and vendor-specific assignments. Therefore, we investigate recent natural language processing (NLP) advancements to automate the company classification process. In particular, we employ and evaluate various NLP-based models, including zero-shot learning, One-vs-Rest classification, multi-class classifiers, and ChatGPT-aided classification. We conduct a comprehensive comparison among these models to assess their effectiveness in the company classification task. The evaluation uses the Wharton Research Data Services (WRDS) dataset, consisting of textual descriptions of publicly traded companies. Our findings reveal that the RoBERTa and One-vs-Rest classifiers surpass the other methods, achieving F1 scores of 0.81 and 0.80 on the WRDS dataset, respectively. These results demonstrate that deep learning algorithms offer the potential to automate, standardize, and continuously update classification systems in an efficient and cost-effective way. In addition, we introduce several improvements to the multi-class classification techniques: (1) in the zero-shot methodology, we use TF-IDF to enhance sector representation, yielding improved accuracy in comparison to standard zero-shot classifiers; (2) next, we use ChatGPT for dataset generation, revealing potential in scenarios where datasets of company descriptions are lacking; and (3) we also employ K-Fold to reduce noise in the WRDS dataset, followed by conducting experiments to assess the impact of noise reduction on the company classification results.

20 pages, 8983 KiB  
Article
An Effective Ensemble Convolutional Learning Model with Fine-Tuning for Medicinal Plant Leaf Identification
by Mohd Asif Hajam, Tasleem Arif, Akib Mohi Ud Din Khanday and Mehdi Neshat
Information 2023, 14(11), 618; https://doi.org/10.3390/info14110618 - 18 Nov 2023
Cited by 2 | Viewed by 2658
Abstract
Accurate and efficient medicinal plant image classification is of utmost importance as these plants produce a wide variety of bioactive compounds that offer therapeutic benefits. With a long history of medicinal plant usage, different parts of plants, such as flowers, leaves, and roots, have been recognized for their medicinal properties and are used for plant identification. However, leaf images are extensively used due to their convenient accessibility and are a major source of information. In recent years, transfer learning and fine-tuning, which use pre-trained deep convolutional networks to extract pertinent features, have emerged as an extremely effective approach for image-identification problems. This study leveraged the power of three component deep convolutional neural networks, namely VGG16, VGG19, and DenseNet201, to derive features from the input images of the medicinal plant dataset, containing leaf images of 30 classes. The models were compared and ensembled to make four hybrid models to enhance the predictive performance by utilizing the averaging and weighted averaging strategies. Quantitative experiments were carried out to evaluate the models on the Mendeley Medicinal Leaf Dataset. The resultant ensemble of VGG19+DenseNet201 with fine-tuning showcased an enhanced capability in identifying medicinal plant images with an improvement of 7.43% and 5.8% compared with VGG19 and VGG16, respectively. Furthermore, VGG19+DenseNet201 can outperform its standalone counterparts by achieving an accuracy of 99.12% on the test set. A thorough assessment with metrics such as accuracy, recall, precision, and the F1-score firmly established the effectiveness of the ensemble strategy.
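
The averaging and weighted averaging strategies mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the three-class probability vectors, and the weights are hypothetical values chosen only to show how per-model class probabilities might be combined.

```python
from typing import List, Sequence


def weighted_average_ensemble(
    prob_sets: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Combine per-model class probabilities with a normalized weighted average."""
    if len(prob_sets) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(prob_sets[0])
    combined = [0.0] * n_classes
    for probs, weight in zip(prob_sets, weights):
        if len(probs) != n_classes:
            raise ValueError("all models must score the same classes")
        for i, p in enumerate(probs):
            combined[i] += (weight / total) * p
    return combined


def predict_class(combined: Sequence[float]) -> int:
    """Return the index of the class with the highest combined probability."""
    return max(range(len(combined)), key=combined.__getitem__)


# Hypothetical softmax outputs of two fine-tuned models for one leaf image.
model_a_probs = [0.6, 0.3, 0.1]
model_b_probs = [0.2, 0.7, 0.1]

# Equal weights reduce to simple averaging.
simple = weighted_average_ensemble([model_a_probs, model_b_probs], [1.0, 1.0])
# Unequal weights (assumed here) let the more trusted model dominate.
weighted = weighted_average_ensemble([model_a_probs, model_b_probs], [3.0, 1.0])
```

In this toy case, simple averaging and weighted averaging select different classes, which shows why the choice of weights matters; in practice the weights would typically be tuned on a validation set.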

30 pages, 2295 KiB  
Article
An Integrated GIS-Based Reinforcement Learning Approach for Efficient Prediction of Disease Transmission in Aquaculture
by Aristeidis Karras, Christos Karras, Spyros Sioutas, Christos Makris, George Katselis, Ioannis Hatzilygeroudis, John A. Theodorou and Dimitrios Tsolis
Information 2023, 14(11), 583; https://doi.org/10.3390/info14110583 - 24 Oct 2023
Viewed by 2270
Abstract
This study explores the design and capabilities of a Geographic Information System (GIS) incorporated with an expert knowledge system, tailored for tracking and monitoring the spread of dangerous diseases across a collection of fish farms. Specifically targeting the aquacultural regions of Greece, the system captures geographical and climatic data pertinent to these farms. A feature of this system is its ability to calculate disease transmission intervals between individual cages and broader fish farm entities, providing crucial insights into the spread dynamics. These data then act as an entry point to our expert system. To enhance the predictive precision, we employed various machine learning strategies, ultimately focusing on a reinforcement learning (RL) environment. This RL framework, enhanced by the Multi-Armed Bandit (MAB) technique, stands out as a powerful mechanism for effectively managing the flow of virus transmissions within farms. Empirical tests highlight the efficiency of the MAB approach, which, in direct comparisons, consistently outperformed other algorithmic options, achieving an impressive accuracy rate of 96%. Looking ahead to future work, we plan to integrate buffer techniques and delve deeper into advanced RL models to enhance our current system. The results set the stage for future research in predictive modeling within aquaculture health management, and we aim to extend our research even further.
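
For readers unfamiliar with the Multi-Armed Bandit idea, the following Python sketch shows one common variant, epsilon-greedy. The abstract does not state which bandit algorithm the authors used, so the algorithm choice, the class and function names, the Bernoulli reward model, and the success rates are all assumptions made for illustration.

```python
import random
from typing import List, Sequence


class EpsilonGreedyBandit:
    """Epsilon-greedy multi-armed bandit with incremental mean estimates."""

    def __init__(self, n_arms: int, epsilon: float = 0.1, seed: int = 0) -> None:
        self.epsilon = epsilon
        self.counts: List[int] = [0] * n_arms      # pulls per arm
        self.values: List[float] = [0.0] * n_arms  # estimated mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self) -> int:
        # Explore a random arm with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm: int, reward: float) -> None:
        # Incremental update of the running mean for the pulled arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def run_simulation(
    true_rates: Sequence[float], steps: int = 5000, seed: int = 0
) -> EpsilonGreedyBandit:
    """Simulate Bernoulli rewards with hypothetical per-arm success rates."""
    bandit = EpsilonGreedyBandit(len(true_rates), epsilon=0.1, seed=seed)
    env_rng = random.Random(seed + 1)
    for _ in range(steps):
        arm = bandit.select_arm()
        reward = 1.0 if env_rng.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


# Three hypothetical actions with assumed success rates.
bandit = run_simulation([0.2, 0.5, 0.8])
best_arm = max(range(3), key=bandit.values.__getitem__)
```

The agent spends most pulls on the arm with the highest estimated reward while still sampling the others occasionally, which is the exploration and exploitation trade-off that bandit methods are designed to handle.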