Knowledge Engineering and Data Mining Volume II

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: 20 June 2024 | Viewed by 13628

Special Issue Editors

Dr. Agnieszka Konys
Guest Editor
Faculty of Computer Science and Information Technology, West Pomeranian University of Technology Szczecin, Zolnierska 49, 71-210 Szczecin, Poland
Interests: ontology; knowledge representation; semantic web technologies; OWL; RDF; knowledge engineering; knowledge bases; knowledge management; reasoning; information extraction; ontology learning; sustainability; sustainability assessment; ontology evaluation

Prof. Dr. Agnieszka Nowak-Brzezińska
Guest Editor
Institute of Computer Science, Faculty of Science and Technology, University of Silesia, ul. Będzińska 39, 41-200 Sosnowiec, Poland
Interests: knowledge representation and reasoning; rule-based knowledge bases; outliers mining; expert systems; decision support systems; information retrieval systems

Special Issue Information

Dear Colleagues,

Extracting knowledge from data is a fundamental process in creating intelligent systems for information retrieval, decision support, and knowledge management. We welcome research on data mining methods, multidimensional data analysis, supervised and unsupervised learning methods, methods of knowledge base management, language ontologies, ontology learning, and related topics. We encourage authors to present new algorithms as well as practical solutions, i.e., applications and systems that demonstrate real deployments of the proposed research results.

This Special Issue covers the entire knowledge engineering pipeline, from data acquisition and data mining to knowledge extraction and exploitation. Its purpose is to gather the many researchers active in the field and to contribute to a collective effort in understanding current trends and open questions in knowledge engineering and data mining. Topics include, but are not limited to:

  • knowledge acquisition and engineering;
  • data mining methods;
  • big knowledge analytics;
  • data mining, knowledge discovery, and machine learning;
  • knowledge modeling and processing;
  • query and natural language processing;
  • data and information modeling;
  • data and information semantics;
  • data-intensive applications;
  • knowledge representation and reasoning;
  • decision support systems;
  • decision-making;
  • group decision-making;
  • rules mining;
  • outliers mining;
  • data exploration;
  • data science;
  • semantic web data and linked data;
  • ontologies and controlled vocabularies;
  • data acquisition;
  • multidimensional data analysis;
  • supervised and unsupervised learning methods;
  • parallel processing and modeling;
  • languages based on parallel programming and data mining.

Dr. Agnieszka Konys
Prof. Dr. Agnieszka Nowak-Brzezińska
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • knowledge engineering
  • knowledge representation and reasoning
  • decision support systems
  • knowledge acquisition
  • outliers mining
  • decision-making
  • data mining
  • data science
  • data exploration
  • multidimensional data analysis
  • supervised and unsupervised learning methods
  • ontology
  • knowledge-based systems
  • ontology learning
  • methods of knowledge base management
  • parallel processing and modeling
  • languages based on parallel programming and data mining

Published Papers (12 papers)


Research

14 pages, 415 KiB  
Article
GPT-Driven Source-to-Source Transformation for Generating Compilable Parallel CUDA Code for Nussinov’s Algorithm
by Marek Palkowski and Mateusz Gruzewski
Electronics 2024, 13(3), 488; https://doi.org/10.3390/electronics13030488 - 24 Jan 2024
Viewed by 850
Abstract
Designing automatic optimizing compilers is an advanced engineering process requiring a great deal of expertise, programming, testing, and experimentation. Maintaining such an approach and adapting it to evolving libraries and environments is a time-consuming effort. In recent years, OpenAI has presented the GPT model, which is applied in many fields such as computer science, image processing, linguistics, and medicine. It also supports automatic programming and translation between programming languages, as well as between human languages. This article verifies the usability of the widely known large language model GPT for translating the non-trivial NPDP code of Nussinov's parallel algorithm, written in the OpenMP standard, into an equivalent parallel CUDA code for NVIDIA graphics cards. The goal of this approach is to avoid creating any post-processing scripts or writing any lines of target code by hand. To validate the output code, we compare the resulting arrays with those calculated by the optimized CPU code generated by polyhedral compilers. Finally, the code is checked for scalability and performance. We concentrate on assessing the capabilities of GPT, highlighting common challenges that can be refined during future learning processes. This will enhance code generation for various platforms by leveraging the outcomes of polyhedral optimizers.
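For readers unfamiliar with the kernel being translated, below is a minimal Python sketch of Nussinov's dynamic-programming recurrence; the sequence, scoring rule, and loop order are illustrative assumptions, not the authors' benchmark code.

```python
# A minimal, illustrative sketch of Nussinov's DP recurrence -- the kernel
# the paper asks GPT to translate from OpenMP to CUDA. The sequence and
# the 0/1 pairing score are simplified assumptions.
def nussinov(seq):
    n = len(seq)
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    N = [[0] * n for _ in range(n)]
    for length in range(1, n):            # fill diagonal by diagonal
        for i in range(n - length):
            j = i + length
            best = N[i + 1][j - 1] + ((seq[i], seq[j]) in pairs)
            # bifurcation: split [i, j] at every k -- the inner loop that
            # dominates the NPDP workload and is the target of tiling
            for k in range(i, j):
                best = max(best, N[i][k] + N[k + 1][j])
            N[i][j] = best
    return N[0][n - 1]

print(nussinov("GGGAAAUCC"))  # maximal number of base pairs
```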

12 pages, 1070 KiB  
Article
Prioritization of Scheduled Surgeries Using Fuzzy Decision Support and Risk Assessment Methods
by Luiza Fabisiak
Electronics 2024, 13(1), 90; https://doi.org/10.3390/electronics13010090 - 25 Dec 2023
Viewed by 530
Abstract
The aim of this study was to develop a method to minimize the risk of cancellation of planned surgery in hospital orthopedic departments. The paper proposes a method that combines multi-criteria and multi-faceted risk assessment using two techniques: the fuzzy TOPSIS (FTOPSIS) method combined with FMEA assessment. The FMEA method presented in this paper uses the FTOPSIS technique of prioritizing preferences according to similarity to the ideal solution, together with a belief structure, in order to overcome the shortcomings of traditional FMEA indicators. Finally, a numerical case study of process optimization for elective surgery in a Polish clinic, focused on planned hip replacements, is presented. The effectiveness of the method in assessing the main factors influencing the cancellation of elective surgery is demonstrated. High accuracy of the results and wide adaptability to other areas are features of this combination of methods. The problem addressed in this publication is the high rate of cancellation of elective surgery; the selection of relevant criteria, their importance, and the preferences of the patients were studied. The results of the method provide a viable action plan for the research problem posed. The proposed method is multifaceted and can be part of an information system to support the reorganization, restructuring, and modification of an operational process.
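To make the ranking step concrete, here is a minimal sketch of crisp TOPSIS, the backbone that FTOPSIS extends with fuzzy numbers; the decision matrix, weights, and criteria below are invented examples, and the fuzzy arithmetic and FMEA belief structure are omitted.

```python
# A minimal sketch of crisp TOPSIS ranking. Alternatives, weights, and
# criteria are made-up examples, not data from the paper's case study.
import numpy as np

def topsis(X, weights, benefit):
    """X: alternatives x criteria; benefit[j] True if higher is better."""
    R = X / np.linalg.norm(X, axis=0)          # vector normalization
    V = R * weights                            # weighted normalized matrix
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_pos = np.linalg.norm(V - ideal, axis=1)  # distance to ideal solution
    d_neg = np.linalg.norm(V - anti, axis=1)   # distance to anti-ideal
    return d_neg / (d_pos + d_neg)             # closeness: higher = better

# three surgeries scored on (urgency, waiting time, cancellation risk)
X = np.array([[7.0, 30.0, 0.2], [9.0, 10.0, 0.5], [5.0, 60.0, 0.1]])
scores = topsis(X, weights=np.array([0.5, 0.3, 0.2]),
                benefit=np.array([True, True, False]))
print(scores.argsort()[::-1])  # priority order of scheduled surgeries
```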

14 pages, 9268 KiB  
Article
A Knowledge Graph Method towards Power System Fault Diagnosis and Classification
by Cheng Li and Bo Wang
Electronics 2023, 12(23), 4808; https://doi.org/10.3390/electronics12234808 - 28 Nov 2023
Viewed by 1062
Abstract
As the scale and complexity of electrical grids continue to expand, the necessity for robust fault detection techniques becomes increasingly urgent. This paper seeks to address the limitations of traditional fault detection approaches, such as the dependence on human experience, low efficiency, and a lack of logical relationships. In response, this study presents a cascaded model that leverages the Random Forest classifier in combination with knowledge reasoning. The proposed method exhibits high efficiency and accuracy in identifying six basic fault types. This approach not only simplifies fault detection and handling processes but also improves their interpretability. The paper begins by constructing a power fault simulation model based on the IEEE 14-bus system. Subsequently, a Random Forest classification model is developed and compared with other commonly used models such as Support Vector Machines (SVMs), k-Nearest Neighbors (KNN), and Naïve Bayes, using metrics such as the F1-score, accuracy, and confusion matrices. Our results reveal that the Random Forest classifier outperforms the other models, particularly on small-sample datasets, with an accuracy of 90%. We then apply knowledge mining technology to create a comprehensive knowledge graph of power faults. Finally, we use the TransE model for knowledge reasoning to enhance interpretability, assist decision making, and validate its reliability.
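A hedged sketch of the model comparison described in the abstract, using scikit-learn on synthetic stand-in data (the authors' IEEE 14-bus simulation data are not reproduced here):

```python
# Comparing Random Forest against SVM, KNN, and Naive Bayes with accuracy
# and macro F1, as in the paper; the six-class synthetic features below
# are an assumed stand-in for simulated fault signatures.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

# small-sample regime with six fault classes, mirroring the paper's setup
X, y = make_classification(n_samples=300, n_features=20, n_informative=10,
                           n_classes=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {"RF": RandomForestClassifier(random_state=0), "SVM": SVC(),
          "KNN": KNeighborsClassifier(), "NB": GaussianNB()}
for name, clf in models.items():
    y_hat = clf.fit(X_tr, y_tr).predict(X_te)
    print(name, accuracy_score(y_te, y_hat),
          f1_score(y_te, y_hat, average="macro"))
```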

21 pages, 1202 KiB  
Article
How Far Have We Progressed in the Sampling Methods for Imbalanced Data Classification? An Empirical Study
by Zhongbin Sun, Jingqi Zhang, Xiaoyan Zhu and Donghong Xu
Electronics 2023, 12(20), 4232; https://doi.org/10.3390/electronics12204232 - 13 Oct 2023
Viewed by 644
Abstract
Imbalanced data are ubiquitous in many real-world applications, and they have drawn a significant amount of attention in the field of data mining. A variety of methods have been proposed for imbalanced data classification, and data sampling methods are the most prevalent because they are independent of the classification algorithm. However, given the growing number of sampling methods, there is no consensus about which sampling method performs best, and contradictory conclusions have been reported. Therefore, in the present study, we conducted an extensive comparison of 16 different sampling methods with four popular classification algorithms, using 75 imbalanced binary datasets from several application domains. Four widely used measures were employed to evaluate the corresponding classification performance. The experimental results showed that none of the sampling methods performed best and stably across all the classification algorithms and evaluation measures. Furthermore, we found that the performance of the different sampling methods was usually affected by the classification algorithm employed. It is therefore important for practitioners and researchers to select appropriate sampling methods and classification algorithms jointly when handling the imbalanced data problems at hand.
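In the spirit of the study's protocol, the sketch below compares a few samplers from the imbalanced-learn library on one synthetic dataset; the sampler subset, classifier, and metric are illustrative choices, not the paper's full 16 x 4 x 75 grid.

```python
# Resample the training split only, then score on the untouched test split.
from imblearn.over_sampling import SMOTE, RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score

# 9:1 imbalanced binary dataset as an assumed stand-in
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

samplers = {"none": None, "SMOTE": SMOTE(random_state=0),
            "over": RandomOverSampler(random_state=0),
            "under": RandomUnderSampler(random_state=0)}
for name, s in samplers.items():
    Xr, yr = (X_tr, y_tr) if s is None else s.fit_resample(X_tr, y_tr)
    clf = DecisionTreeClassifier(random_state=0).fit(Xr, yr)
    print(name, balanced_accuracy_score(y_te, clf.predict(X_te)))
```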

19 pages, 582 KiB  
Article
Electrical Power Edge-End Interaction Modeling with Time Series Label Noise Learning
by Zhenshang Wang, Mi Zhou, Yuming Zhao, Fan Zhang, Jing Wang, Bin Qian, Zhen Liu, Peitian Ma and Qianli Ma
Electronics 2023, 12(18), 3987; https://doi.org/10.3390/electronics12183987 - 21 Sep 2023
Cited by 1 | Viewed by 797
Abstract
In the context of electrical power systems, modeling the edge-end interaction involves understanding the dynamic relationship between different components and endpoints of the system. However, the electrical power time series obtained from user terminals often suffer from low-quality issues such as missing values, numerical anomalies, and noisy labels. These issues can easily reduce the robustness of data mining results for edge-end interaction models. Therefore, this paper proposes a time–frequency noisy label classification (TF-NLC) model, which improves the robustness of edge-end interaction models when dealing with low-quality data. Specifically, we employ two deep neural networks that are trained concurrently, operating on the time and frequency domains, respectively. The two networks mutually guide each other's classification training by selecting clean labels from the small-loss data within each batch. To further improve the robustness of the classification of time- and frequency-domain feature representations, we introduce a time–frequency domain consistency contrastive learning module. By basing the selection of clean labels on time–frequency representations for mutually guided training, TF-NLC can effectively mitigate the negative impact of noisy labels on model training. Extensive experiments on eight electrical power and ten other realistic time series datasets show that the proposed TF-NLC achieves advanced classification performance under different noisy label scenarios. The ablation and visualization experiments further demonstrate the robustness of the proposed method.
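The small-loss co-selection idea at the heart of TF-NLC can be sketched in a few lines of PyTorch; the time/frequency encoders and the contrastive module are omitted, and the keep ratio is an assumed hyperparameter.

```python
# Co-teaching-style small-loss selection: each network picks the
# lowest-loss (presumed clean) samples in a batch and hands them to
# its peer for the peer's update.
import torch
import torch.nn.functional as F

def small_loss_selection(logits_a, logits_b, labels, keep_ratio=0.7):
    """Return the index sets each network trains on, chosen by its peer."""
    n_keep = int(keep_ratio * labels.size(0))
    loss_a = F.cross_entropy(logits_a, labels, reduction="none")
    loss_b = F.cross_entropy(logits_b, labels, reduction="none")
    idx_for_b = torch.argsort(loss_a)[:n_keep]  # A selects clean set for B
    idx_for_a = torch.argsort(loss_b)[:n_keep]  # B selects clean set for A
    return idx_for_a, idx_for_b

# toy batch: 8 samples, 3 classes
logits_a, logits_b = torch.randn(8, 3), torch.randn(8, 3)
labels = torch.randint(0, 3, (8,))
idx_a, idx_b = small_loss_selection(logits_a, logits_b, labels)
loss_a = F.cross_entropy(logits_a[idx_a], labels[idx_a])  # update net A
loss_b = F.cross_entropy(logits_b[idx_b], labels[idx_b])  # update net B
```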

14 pages, 296 KiB  
Article
Time and Energy Benefits of Using Automatic Optimization Compilers for NPDP Tasks
by Marek Palkowski and Mateusz Gruzewski
Electronics 2023, 12(17), 3579; https://doi.org/10.3390/electronics12173579 - 24 Aug 2023
Cited by 1 | Viewed by 725
Abstract
In this article, we analyze program codes generated automatically by three advanced optimizers, Pluto, Traco, and Dapt, applied to the NPDP benchmark set. This benchmark set comprises ten program loops, predominantly from the field of bioinformatics. The codes exemplify dynamic programming, a challenging task for well-known program loop optimization tools. Given the intricacy involved, we opted for three automatic compilers based on the polyhedral model and various loop-tiling strategies. In evaluating the performance of the generated code, we carefully considered locality and concurrency to accurately estimate time and energy efficiency. Notably, we dedicated significant attention to the latest Dapt compiler, which applies space–time loop tiling to generate highly efficient code for the NPDP benchmark suite loops. By employing the aforementioned optimizers and conducting an in-depth analysis, we aim to demonstrate the effectiveness and potential of automatic transformation techniques in enhancing the performance and energy efficiency of dynamic programming codes.
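As a toy illustration of the loop tiling these polyhedral compilers apply, the Python sketch below tiles a matrix product for cache locality; the actual space–time tilings of the NPDP loops are considerably more involved, and the tile size here is an assumption.

```python
# Loop tiling in miniature: iterate over blocks so each tile of A, B,
# and C stays hot in cache while it is reused.
import numpy as np

def matmul_tiled(A, B, tile=32):
    n = A.shape[0]
    C = np.zeros((n, n))
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                C[ii:ii+tile, jj:jj+tile] += (
                    A[ii:ii+tile, kk:kk+tile] @ B[kk:kk+tile, jj:jj+tile])
    return C

n = 64
A, B = np.random.rand(n, n), np.random.rand(n, n)
assert np.allclose(matmul_tiled(A, B), A @ B)  # same result, better locality
```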

17 pages, 559 KiB  
Article
A New Method for Graph-Based Representation of Text in Natural Language Processing
by Barbara Probierz, Anita Hrabia and Jan Kozak
Electronics 2023, 12(13), 2846; https://doi.org/10.3390/electronics12132846 - 27 Jun 2023
Cited by 1 | Viewed by 1806
Abstract
Natural language processing is still an emerging field in machine learning. Access to ever more datasets in textual form, new applications for artificial intelligence, and the need for simple communication with operating systems all affect the importance of natural language processing in evolving artificial intelligence. Traditional methods of text representation, such as Bag-of-Words, have limitations that result from the lack of consideration of semantics and dependencies between words. Therefore, we propose a new approach based on graph representations, which takes into account both the local context and the global relationships between words, allowing for a more expressive textual representation. The aim of the paper is to examine the possibility of using graph representations in natural language processing and to demonstrate their use in text classification. An innovative element of the proposed approach is the use of cliques shared across the graphs representing documents to create a feature vector. Experiments confirm that the proposed approach can improve classification efficiency. Using the new text representation method to predict book categories based on an analysis of their content resulted in accuracy, precision, recall, and an F1-score of over 90%. Moving from traditional approaches to a graph-based approach could make a big difference in natural language processing and text analysis and could open up new opportunities in the field.
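A small sketch of the clique-based representation: build a word co-occurrence graph per document with networkx and take its maximal cliques as candidate features. The windowing rule and example text are assumptions, not the authors' exact pipeline.

```python
# Represent a document as a word co-occurrence graph; cliques shared
# across documents then become slots in a feature vector.
import networkx as nx

def cooccurrence_graph(tokens, window=2):
    G = nx.Graph()
    for i, w in enumerate(tokens):
        for v in tokens[i + 1:i + window + 1]:
            if v != w:
                G.add_edge(w, v)  # edge = words co-occur within the window
    return G

doc = "graph based text representation for natural language processing".split()
G = cooccurrence_graph(doc)
cliques = [frozenset(c) for c in nx.find_cliques(G) if len(c) >= 2]
print(cliques[:5])  # maximal cliques as candidate features
```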

17 pages, 1213 KiB  
Article
Knowledge Discovery in Databases for a Football Match Result
by Szymon Głowania, Jan Kozak and Przemysław Juszczuk
Electronics 2023, 12(12), 2712; https://doi.org/10.3390/electronics12122712 - 17 Jun 2023
Viewed by 1219
Abstract
The analysis of sports data and the possibility of using machine learning to predict sports results is an increasingly popular topic of research and application. The main problem, apart from choosing the right algorithm, is obtaining data that allow for effective prediction. This article presents a comprehensive KDD (Knowledge Discovery in Databases) approach that allows for the appropriate preparation of sports data for result prediction. The first part of the article covers KDD and sports data. The next section presents an approach to developing a dataset on top football leagues. The developed datasets are the main contribution of the article and have been made publicly available to the research community. In the final part of the article, an experiment based on heterogeneous groups of classifiers and the developed datasets is presented, together with its results.
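A heterogeneous group of classifiers of the kind used in the experiment can be sketched with scikit-learn's VotingClassifier; the synthetic features standing in for match statistics and the choice of base learners are assumptions.

```python
# A heterogeneous ensemble: different model families vote on the outcome.
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

# three-class target: home win / draw / away win
X, y = make_classification(n_samples=600, n_features=15, n_informative=8,
                           n_classes=3, random_state=0)
ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(random_state=0)),
                ("nb", GaussianNB())],
    voting="soft")  # average predicted probabilities across the group
print(cross_val_score(ensemble, X, y, cv=5).mean())
```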

11 pages, 598 KiB  
Article
Handling the Complexity of Computing Maximal Consistent Blocks
by Teresa Mroczek
Electronics 2023, 12(10), 2295; https://doi.org/10.3390/electronics12102295 - 19 May 2023
Viewed by 745
Abstract
The maximal consistent blocks technique, adopted from discrete mathematics, describes maximal collections of objects in which all objects are indiscernible in terms of the available information. In this paper, we estimate the total possible number of maximal consistent blocks and prove that the number of such blocks may grow exponentially with respect to the number of attributes for incomplete data with “do not care” conditions. The results indicate that the time complexity of some known algorithms for computing maximal consistent blocks has been underestimated so far. Taking this complexity into account, for the practical use of such blocks, we propose a performance improvement involving the parallelization of the maximal consistent block construction method.
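Under the common formalization in which two objects are consistent when their attribute values agree or at least one is “do not care” (*), maximal consistent blocks correspond to the maximal cliques of the consistency graph, which is the source of the exponential blow-up. A tiny illustrative sketch, with a made-up dataset:

```python
# Maximal consistent blocks as maximal cliques of the consistency graph.
import networkx as nx

def consistent(a, b):
    # values match, or at least one is "do not care"
    return all(x == y or x == "*" or y == "*" for x, y in zip(a, b))

data = [("1", "*"), ("*", "0"), ("1", "0"), ("0", "0")]
G = nx.Graph()
G.add_nodes_from(range(len(data)))
G.add_edges_from((i, j) for i in range(len(data))
                 for j in range(i + 1, len(data))
                 if consistent(data[i], data[j]))
print(list(nx.find_cliques(G)))  # maximal consistent blocks, as index sets
```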

18 pages, 628 KiB  
Article
Householder Transformation-Based Temporal Knowledge Graph Reasoning
by Xiaojuan Zhao, Aiping Li, Rong Jiang, Kai Chen and Zhichao Peng
Electronics 2023, 12(9), 2001; https://doi.org/10.3390/electronics12092001 - 26 Apr 2023
Cited by 1 | Viewed by 1502
Abstract
Knowledge graph reasoning is of great significance for the further development of artificial intelligence and information retrieval, especially reasoning over temporal knowledge graphs. Rotation-based methods have been shown to be effective at modeling entities and relations on a knowledge graph. However, due to their limited capability to represent temporal information, existing approaches can only model some relational patterns and cannot handle temporal combination reasoning. In this regard, we propose HTTR: Householder Transformation-based Temporal knowledge graph Reasoning, which focuses on the characteristics of relations that evolve over time. HTTR first fuses the relational and temporal information in the knowledge graph, then uses the Householder transformation to obtain an orthogonal matrix from the fused information, and finally defines this orthogonal matrix as the rotation of the head entity to the tail entity and calculates the similarity between the rotated vector and the vector representation of the tail entity. In addition, we compare three methods for fusing relational and temporal information; other fusion methods may replace the current one as long as the dimensionality satisfies the requirements. We show that HTTR outperforms state-of-the-art methods in temporal knowledge graph reasoning tasks and is able to learn and infer all four relational patterns over time: symmetric reasoning, antisymmetric reasoning, inversion reasoning, and temporal combination reasoning.
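The core HTTR operation, as described, can be sketched in numpy: build an orthogonal Householder matrix from a fused relation–time vector and rotate the head-entity embedding toward the tail. The additive fusion and cosine scoring below are simplifying assumptions; only the Householder construction itself is standard.

```python
# Householder matrix H = I - 2vv^T (v unit-norm) is orthogonal, so it
# acts as a rotation/reflection of the head-entity embedding.
import numpy as np

def householder(v):
    v = v / np.linalg.norm(v)
    return np.eye(v.size) - 2.0 * np.outer(v, v)

rng = np.random.default_rng(0)
d = 8
head, tail = rng.normal(size=d), rng.normal(size=d)
relation, time = rng.normal(size=d), rng.normal(size=d)

H = householder(relation + time)        # fuse relation and temporal info
rotated = H @ head                      # rotate the head entity
score = rotated @ tail / (np.linalg.norm(rotated) * np.linalg.norm(tail))
print(score)                            # cosine similarity to the tail

assert np.allclose(H @ H.T, np.eye(d))  # Householder matrices are orthogonal
```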

25 pages, 1646 KiB  
Article
The Impact of Data Quality on Software Testing Effort Prediction
by Łukasz Radliński
Electronics 2023, 12(7), 1656; https://doi.org/10.3390/electronics12071656 - 31 Mar 2023
Cited by 1 | Viewed by 1314
Abstract
Background: This paper investigates the impact of data quality on the performance of models predicting software testing effort. Data quality was reflected by training data filtering strategies (data variants) covering combinations of Data Quality Rating, UFP Rating, and a threshold of valid cases. Methods: The experiment used the ISBSG dataset and 16 machine learning models. A process of three-fold cross-validation repeated 20 times was used to train and evaluate each model with each data variant. Model performance was assessed using absolute prediction errors. A ‘win–tie–loss’ procedure, based on the Wilcoxon signed-rank test, was applied to identify the best models and data variants. Results: Most models, especially the most accurate ones, performed best on the complete dataset, even though it contained cases with low data ratings. The detailed results include rankings of the following: (1) models for particular data variants, (2) data variants for particular models, and (3) the best-performing combinations of models and data variants. Conclusions: The arbitrary and restrictive selection of only projects with a Data Quality Rating and UFP Rating of ‘A’ or ‘B’, commonly used in the literature, does not seem justified. It is recommended not to exclude cases with low data ratings, in order to achieve better accuracy for most predictive models of testing effort.
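A hedged sketch of the evaluation protocol: repeated three-fold cross-validation, absolute errors, and a Wilcoxon signed-rank ‘win–tie–loss’ decision between two models. The models, data, and significance threshold are stand-ins, not the ISBSG setup or the paper's 16 learners.

```python
# Collect per-case absolute errors over 3-fold CV repeated 20 times,
# then compare two models with the Wilcoxon signed-rank test.
import numpy as np
from scipy.stats import wilcoxon
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RepeatedKFold

X, y = make_regression(n_samples=200, n_features=10, noise=20, random_state=0)
cv = RepeatedKFold(n_splits=3, n_repeats=20, random_state=0)  # 3-fold x 20

err_a, err_b = [], []
for tr, te in cv.split(X):
    err_a += list(np.abs(y[te] - LinearRegression()
                         .fit(X[tr], y[tr]).predict(X[te])))
    err_b += list(np.abs(y[te] - RandomForestRegressor(random_state=0)
                         .fit(X[tr], y[tr]).predict(X[te])))

stat, p = wilcoxon(err_a, err_b)
if p >= 0.05:
    print("tie")                       # no significant difference
else:
    print("win for", "A" if np.mean(err_a) < np.mean(err_b) else "B")
```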

23 pages, 9327 KiB  
Article
Influence of Different Data Interpolation Methods for Sparse Data on the Construction Accuracy of Electric Bus Driving Cycle
by Xingxing Wang, Peilin Ye, Yelin Deng, Yinnan Yuan, Yu Zhu and Hongjun Ni
Electronics 2023, 12(6), 1377; https://doi.org/10.3390/electronics12061377 - 13 Mar 2023
Cited by 3 | Viewed by 1382
Abstract
Battery electric vehicles (BEVs) are among the most promising new energy vehicle models for industrialization and marketization at this stage, and they are an important way to address urban haze pollution and high fuel costs and to support the sustainable development of the automobile industry. This paper takes pure electric buses as the research object and, relying on the operation information management platform of new energy buses in Nantong city, proposes an electric bus driving cycle construction method based on a mixed interpolation method for processing sparse data. Three different interpolation methods, linear interpolation, step interpolation, and mixed interpolation, were used to preprocess the collected data. Principal component analysis and the K-means clustering algorithm were used to reduce and classify the characteristic parameter matrix. According to the clustering results, libraries of moving sections and idle sections of different categories were established. Based on section duration and the correlation among the various types, several moving sections and idle sections were selected to form a representative driving cycle of Nantong city buses. The results show that the mixed interpolation method, based on linear interpolation and cubic spline interpolation, has a good processing effect: the average relative error between the synthesized driving cycle and the measured data is 15.71%, and the relative error of the seven characteristic parameters is less than 10%, which meets the development requirements. In addition, a comparison with the characteristic parameters of typical worldwide driving cycles (NEDC, WLTC) shows that the constructed driving cycle for Nantong city is reasonable and reliable in representing the driving conditions of pure electric buses there, and it can provide a reference for the optimization of bus energy control strategies.
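The mixed interpolation idea can be sketched with scipy: fill short gaps in a sparse speed series linearly and hand longer gaps to a cubic spline. The gap threshold and the speed signal below are assumptions for demonstration, not the paper's actual rule or bus data.

```python
# Mixed interpolation: linear fill for short gaps, cubic spline for long ones.
import numpy as np
from scipy.interpolate import interp1d, CubicSpline

t_known = np.array([0, 1, 2, 6, 7, 8, 15, 16])     # seconds with GPS data
v_known = np.array([0, 5, 9, 20, 22, 23, 10, 6])   # bus speed, km/h
t_all = np.arange(0, 17)

linear = interp1d(t_known, v_known)(t_all)
spline = CubicSpline(t_known, v_known)(t_all)

gaps = np.diff(t_known)
v_filled = linear.copy()
for i, g in enumerate(gaps):
    if g > 3:  # long gap: trust the smoother cubic spline instead
        seg = (t_all > t_known[i]) & (t_all < t_known[i + 1])
        v_filled[seg] = spline[seg]
print(np.round(v_filled, 1))
```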