Machine Learning and Knowledge Extraction

24 pages, 1534 KiB

Open Access | Feature Paper | Article

Fairness and Explanation in AI-Informed Decision Making

by Alessa Angerschmid, Jianlong Zhou, Kevin Theuermann, Fang Chen and Andreas Holzinger

Mach. Learn. Knowl. Extr. 2022, 4(2), 556-579; https://doi.org/10.3390/make4020026 - 16 Jun 2022

Cited by 52 | Viewed by 10413

AI-assisted decision-making that impacts individuals raises critical questions about transparency and fairness in artificial intelligence (AI). Much research has highlighted the reciprocal relationships between transparency/explanation and fairness in AI-assisted decision-making. Thus, considering their impact on user trust or perceived fairness simultaneously benefits responsible use of socio-technical AI systems, but currently receives little attention. In this paper, we investigate the effects of AI explanations and fairness on human-AI trust and perceived fairness, respectively, in specific AI-based decision-making scenarios. A user study simulating AI-assisted decision-making in two health insurance and medical treatment decision-making scenarios provided important insights. Due to the global pandemic and restrictions thereof, the user studies were conducted as online surveys. From the participants’ trust perspective, fairness was found to affect user trust only under the condition of a low fairness level, with the low fairness level reducing user trust. However, adding explanations helped users increase their trust in AI-assisted decision-making. From the perspective of perceived fairness, our work found that low levels of introduced fairness decreased users’ perceptions of fairness, while high levels of introduced fairness increased users’ perceptions of fairness. The addition of explanations definitely increased the perception of fairness. Furthermore, we found that application scenarios influenced trust and perceptions of fairness. The results show that the use of AI explanations and fairness statements in AI applications is complex: we need to consider not only the type of explanations and the degree of fairness introduced, but also the scenarios in which AI-assisted decision-making is used.

(This article belongs to the Special Issue Fairness and Explanation for Trustworthy AI)

14 pages, 1132 KiB

Open Access | Article

Benefits from Variational Regularization in Language Models

by Cornelia Ferner and Stefan Wegenkittl

Mach. Learn. Knowl. Extr. 2022, 4(2), 542-555; https://doi.org/10.3390/make4020025 - 09 Jun 2022

Cited by 3 | Viewed by 2201

Abstract

Representations from common pre-trained language models have been shown to suffer from the degeneration problem, i.e., they occupy a narrow cone in latent space. This problem can be addressed by enforcing isotropy in latent space. In analogy with variational autoencoders, we suggest applying a token-level variational loss to a Transformer architecture and optimizing the standard deviation of the prior distribution in the loss function as the model parameter to increase isotropy. The resulting latent space is complete and interpretable: any given point is a valid embedding and can be decoded into text again. This allows for text manipulations such as paraphrase generation directly in latent space. Surprisingly, features extracted at the sentence level also show competitive results on benchmark classification tasks.

(This article belongs to the Special Issue Large Language Models: Methods and Applications)

23 pages, 6643 KiB

Open Access | Article

Quality Criteria and Method of Synthesis for Adversarial Attack-Resistant Classifiers

by Anastasia Gurina and Vladimir Eliseev

Mach. Learn. Knowl. Extr. 2022, 4(2), 519-541; https://doi.org/10.3390/make4020024 - 05 Jun 2022

Cited by 1 | Viewed by 2025

Abstract

The topical problem of adversarial attacks on classifiers, mainly implemented using deep neural networks, is considered. This problem is analyzed with a generalization to the case of any classifiers synthesized by machine learning methods. The imperfection of generally accepted criteria for assessing the quality of classifiers, including those used to confirm the effectiveness of protection measures against adversarial attacks, is noted. The reason for the appearance of adversarial examples and other errors of classifiers based on machine learning is investigated. A method for modeling adversarial attacks with a demonstration of the main effects observed during the attack is proposed. It is noted that it is necessary to develop quality criteria for classifiers in terms of potential susceptibility to adversarial attacks. To assess resistance to adversarial attacks, it is proposed to use the multidimensional EDCAP criterion (Excess, Deficit, Coating, Approx, Pref). We also propose a method for synthesizing a new EnAE (Ensemble of Auto-Encoders) multiclass classifier based on an ensemble of quality-controlled one-class classifiers according to EDCAP criteria. The EnAE classification algorithm implements a hard voting approach and can detect anomalous inputs. The proposed criterion, synthesis method and classifier are tested on several data sets with a medium dimension of the feature space.

(This article belongs to the Section Privacy)

17 pages, 2269 KiB

Open Access | Article

Machine and Deep Learning Applications to Mouse Dynamics for Continuous User Authentication

by Nyle Siddiqui, Rushit Dave, Mounika Vanamala and Naeem Seliya

Mach. Learn. Knowl. Extr. 2022, 4(2), 502-518; https://doi.org/10.3390/make4020023 - 19 May 2022

Cited by 15 | Viewed by 4983

Abstract

Static authentication methods, like passwords, grow increasingly weak with advancements in technology and attack strategies. Continuous authentication has been proposed as a solution, in which users who have gained access to an account are still monitored in order to continuously verify that the user is not an imposter who had access to the user credentials. Mouse dynamics is the behavior of a user’s mouse movements and is a biometric that has shown great promise for continuous authentication schemes. This article builds upon our previously published work by evaluating our dataset of 40 users using three machine learning and three deep learning algorithms. Two evaluation scenarios are considered: binary classifiers are used for user authentication, with the top performer being a 1-dimensional convolutional neural network (1D-CNN) with a peak average test accuracy of 85.73% across the top-10 users. Multi-class classification is also examined using an artificial neural network (ANN) which reaches an astounding peak accuracy of 92.48%, the highest accuracy we have seen for any classifier on this dataset.

(This article belongs to the Section Privacy)

14 pages, 1138 KiB

Open Access | Article

TabFairGAN: Fair Tabular Data Generation with Generative Adversarial Networks

by Amirarsalan Rajabi and Ozlem Ozmen Garibay

Mach. Learn. Knowl. Extr. 2022, 4(2), 488-501; https://doi.org/10.3390/make4020022 - 16 May 2022

Cited by 14 | Viewed by 4226

Abstract

With the increasing reliance on automated decision making, the issue of algorithmic fairness has gained increasing importance. In this paper, we propose a Generative Adversarial Network for tabular data generation. The model includes two phases of training. In the first phase, the model is trained to accurately generate synthetic data similar to the reference dataset. In the second phase, we modify the value function to add a fairness constraint, and continue training the network to generate data that is both accurate and fair. We test our results in both the unconstrained and the constrained fair data generation cases. We show that, using a fairly simple architecture and applying quantile transformation of numerical attributes, the model achieves promising performance. In the unconstrained case, i.e., when the model is only trained in the first phase and is only meant to generate accurate data following the same joint probability distribution of the real data, the results show that the model beats the state-of-the-art GANs proposed in the literature to produce synthetic tabular data. Furthermore, in the constrained case in which the first phase of training is followed by the second phase, we train the network and test it on four datasets studied in the fairness literature and compare our results with another state-of-the-art pre-processing method, and present the promising results that it achieves. Compared to other studies utilizing GANs for fair data generation, our model is comparably more stable by using only one critic, and also by avoiding major problems of the original GAN model, such as mode-dropping and non-convergence.

(This article belongs to the Section Data)

14 pages, 805 KiB

Open Access | Article

The Case of Aspect in Sentiment Analysis: Seeking Attention or Co-Dependency?

by Anastazia Žunić, Padraig Corcoran and Irena Spasić

Mach. Learn. Knowl. Extr. 2022, 4(2), 474-487; https://doi.org/10.3390/make4020021 - 13 May 2022

Cited by 2 | Viewed by 2638

Abstract

(1) Background: Aspect-based sentiment analysis (SA) is a natural language processing task, the aim of which is to classify the sentiment associated with a specific aspect of a written text. The performance of SA methods applied to texts related to health and well-being lags behind that of other domains. (2) Methods: In this study, we present an approach to aspect-based SA of drug reviews. Specifically, we analysed signs and symptoms, which were extracted automatically using the Unified Medical Language System. This information was then passed onto the BERT language model, which was extended by two layers to fine-tune the model for aspect-based SA. The interpretability of the model was analysed using an axiomatic attribution method. We performed a correlation analysis between the attribution scores and syntactic dependencies. (3) Results: Our fine-tuned model achieved accuracy of approximately 95% on a well-balanced test set. It outperformed our previous approach, which used syntactic information to guide the operation of a neural network and achieved an accuracy of approximately 82%. (4) Conclusions: We demonstrated that a BERT-based model of SA overcomes the negative bias associated with health-related aspects and closes the performance gap against the state-of-the-art in other domains.

(This article belongs to the Special Issue Large Language Models: Methods and Applications)

28 pages, 3530 KiB

Open Access | Review

Machine Learning in Disaster Management: Recent Developments in Methods and Applications

by Vasileios Linardos, Maria Drakaki, Panagiotis Tzionas and Yannis L. Karnavas

Mach. Learn. Knowl. Extr. 2022, 4(2), 446-473; https://doi.org/10.3390/make4020020 - 07 May 2022

Cited by 55 | Viewed by 20070

Abstract

Recent years include the world’s hottest year and, besides the COVID-19 pandemic, have been marked mainly by climate-related disasters, according to data collected by the Emergency Events Database (EM-DAT). Besides the human losses, disasters cause significant and often catastrophic socioeconomic impacts, including economic losses. Recent developments in artificial intelligence (AI) and especially in machine learning (ML) and deep learning (DL) have been used to better cope with the severe and often catastrophic impacts of disasters. This paper aims to provide an overview of the research studies, presented since 2017, focusing on ML and DL methods developed for disaster management. In particular, focus has been given to studies in the areas of disaster and hazard prediction, risk and vulnerability assessment, disaster detection, early warning systems, disaster monitoring, damage assessment and post-disaster response, as well as case studies. Furthermore, some recently developed ML and DL applications for disaster management have been analyzed. A discussion of the findings is provided as well as directions for further research.

14 pages, 675 KiB

Open Access | Article

Knowledgebra: An Algebraic Learning Framework for Knowledge Graph

by Tong Yang, Yifei Wang, Long Sha, Jan Engelbrecht and Pengyu Hong

Mach. Learn. Knowl. Extr. 2022, 4(2), 432-445; https://doi.org/10.3390/make4020019 - 05 May 2022

Viewed by 2549

Abstract

Knowledge graph (KG) representation learning aims to encode entities and relations into dense continuous vector spaces such that knowledge contained in a dataset could be consistently represented. Dense embeddings trained from KG datasets benefit a variety of downstream tasks such as KG completion and link prediction. However, existing KG embedding methods fall short of providing a systematic solution for the global consistency of knowledge representation. We developed a mathematical language for KG based on an observation of their inherent algebraic structure, which we termed Knowledgebra. By analyzing five distinct algebraic properties, we proved that the semigroup is the most reasonable algebraic structure for the relation embedding of a general knowledge graph. We implemented an instantiation model, SemE, using simple matrix semigroups, which exhibits state-of-the-art performance on standard datasets. Moreover, we proposed a regularization-based method to integrate chain-like logic rules derived from human knowledge into embedding training, which further demonstrates the power of the developed language. As far as we know, by applying abstract algebra in statistical learning, this work develops the first formal language for general knowledge graphs, and also sheds light on the problem of neural-symbolic integration from an algebraic perspective.

(This article belongs to the Section Data)

14 pages, 5010 KiB

Open Access | Article

Estimating the Best Time to View Cherry Blossoms Using Time-Series Forecasting Method

by Tomonari Horikawa, Munenori Takahashi, Masaki Endo, Shigeyoshi Ohno, Masaharu Hirota and Hiroshi Ishikawa

Mach. Learn. Knowl. Extr. 2022, 4(2), 418-431; https://doi.org/10.3390/make4020018 - 30 Apr 2022

Cited by 2 | Viewed by 2513

Abstract

In recent years, tourist information collection using the internet has become common. Tourists are increasingly using internet resources to obtain tourist information. Social network service (SNS) users share tourist information of various kinds. Twitter, one such SNS, has been used for many studies. We are pursuing research supporting a method using Twitter to help tourists obtain information: estimates of the best time to view cherry blossoms. Earlier studies have proposed a low-cost moving average method using geotagged tweets related to location information. Geotagged tweets are helpful as social sensors for real-time estimation and for the acquisition of local tourist information because the information can reflect real-world situations. Earlier studies have used weighted moving averages, indicating that a person can estimate the best time to view cherry blossoms in each prefecture. This study proposes a time-series prediction method using SNS data and machine learning as a new method for estimating the best times for viewing for a certain period. Combining the time-series forecasting method and the low-cost moving average method yields an estimate of the best time to view cherry blossoms. This report describes results confirming the usefulness of the proposed method by experimentation with estimation of the best time to view cherry blossoms in each prefecture and municipality.

21 pages, 6580 KiB

Open Access | Article

Missing Data Estimation in Temporal Multilayer Position-Aware Graph Neural Network (TMP-GNN)

by Bahareh Najafi, Saeedeh Parsaeefard and Alberto Leon-Garcia

Mach. Learn. Knowl. Extr. 2022, 4(2), 397-417; https://doi.org/10.3390/make4020017 - 30 Apr 2022

Cited by 3 | Viewed by 2938

Abstract

GNNs have been proven to perform highly effectively in various node-level, edge-level, and graph-level prediction tasks in several domains. Existing approaches mainly focus on static graphs. However, many graphs change over time: their edges may disappear, or the node/edge attributes may alter from one time to the other. It is essential to consider such evolution in the representation learning of nodes in time-varying graphs. In this paper, we propose a Temporal Multilayer Position-Aware Graph Neural Network (TMP-GNN), a node embedding approach for dynamic graphs that incorporates the interdependence of temporal relations into embedding computation. We evaluate the performance of TMP-GNN on two different representations of temporal multilayered graphs. The performance is assessed against the most popular GNNs on a node-level prediction task. Then, we incorporate TMP-GNN into a deep learning framework to estimate missing data and compare the performance with their corresponding competent GNNs from our former experiment, and a baseline method. Experimental results on four real-world datasets yield up to 58% lower ROC AUC for the pair-wise node classification task, and 96% lower MAE in missing feature estimation, particularly for graphs with a relatively high number of nodes and lower mean degree of connectivity.

(This article belongs to the Section Network)

26 pages, 2768 KiB

Open Access | Article

VloGraph: A Virtual Knowledge Graph Framework for Distributed Security Log Analysis

by Kabul Kurniawan, Andreas Ekelhart, Elmar Kiesling, Dietmar Winkler, Gerald Quirchmayr and A Min Tjoa

Mach. Learn. Knowl. Extr. 2022, 4(2), 371-396; https://doi.org/10.3390/make4020016 - 11 Apr 2022

Cited by 3 | Viewed by 4618

Abstract

The integration of heterogeneous and weakly linked log data poses a major challenge in many log-analytic applications. Knowledge graphs (KGs) can facilitate such integration by providing a versatile representation that can interlink objects of interest and enrich log events with background knowledge. Furthermore, graph-pattern based query languages, such as SPARQL, can support rich log analyses by leveraging semantic relationships between objects in heterogeneous log streams. Constructing, materializing, and maintaining centralized log knowledge graphs, however, poses significant challenges. To tackle this issue, we propose VloGraph—a distributed and virtualized alternative to centralized log knowledge graph construction. The proposed approach does not involve any a priori parsing, aggregation, and processing of log data, but dynamically constructs a virtual log KG from heterogeneous raw log sources across multiple hosts. To explore the feasibility of this approach, we developed a prototype and demonstrate its applicability to three scenarios. Furthermore, we evaluate the approach in various experimental settings with multiple heterogeneous log sources and machines; the encouraging results from this evaluation suggest that the approach can enable efficient graph-based ad-hoc log analyses in federated settings.

(This article belongs to the Special Issue Selected Papers from CD-MAKE 2021 and ARES 2021)

21 pages, 1663 KiB

Open Access | Article

An Attention-Based ConvLSTM Autoencoder with Dynamic Thresholding for Unsupervised Anomaly Detection in Multivariate Time Series

by Tareq Tayeh, Sulaiman Aburakhia, Ryan Myers and Abdallah Shami

Mach. Learn. Knowl. Extr. 2022, 4(2), 350-370; https://doi.org/10.3390/make4020015 - 02 Apr 2022

Cited by 17 | Viewed by 6137

Abstract

As a substantial amount of multivariate time series data is being produced by the complex systems in smart manufacturing (SM), improved anomaly detection frameworks are needed to reduce the operational risks and the monitoring burden placed on the system operators. However, building such frameworks is challenging, as a sufficiently large amount of defective training data is often not available and frameworks are required to capture both the temporal and contextual dependencies across different time steps while being robust to noise. In this paper, we propose an unsupervised Attention-Based Convolutional Long Short-Term Memory (ConvLSTM) Autoencoder with Dynamic Thresholding (ACLAE-DT) framework for anomaly detection and diagnosis in multivariate time series. The framework starts by pre-processing and enriching the data, before constructing feature images to characterize the system statuses across different time steps by capturing the inter-correlations between pairs of time series. Afterwards, the constructed feature images are fed into an attention-based ConvLSTM autoencoder, which aims to encode the constructed feature images and capture the temporal behavior, followed by decoding the compressed knowledge representation to reconstruct the feature images’ input. The reconstruction errors are then computed and subjected to a statistical-based, dynamic thresholding mechanism to detect and diagnose the anomalies. Evaluation results conducted on real-life manufacturing data demonstrate the performance strengths of the proposed approach over state-of-the-art methods under different experimental settings.

(This article belongs to the Section Privacy)

34 pages, 829 KiB

Open Access | Article

Counterfactual Models for Fair and Adequate Explanations

by Nicholas Asher, Lucas De Lara, Soumya Paul and Chris Russell

Mach. Learn. Knowl. Extr. 2022, 4(2), 316-349; https://doi.org/10.3390/make4020014 - 31 Mar 2022

Cited by 4 | Viewed by 2771

Abstract

Recent efforts have uncovered various methods for providing explanations that can help interpret the behavior of machine learning programs. Exact explanations with a rigorous logical foundation provide valid and complete explanations, but they have an epistemological problem: they are often too complex for humans to understand and too expensive to compute even with automated reasoning methods. Interpretability requires good explanations that humans can grasp and can compute. We take an important step toward specifying what good explanations are by analyzing the epistemically accessible and pragmatic aspects of explanations. We characterize sufficiently good, or fair and adequate, explanations in terms of counterfactuals and what we call the conundra of the explainee, the agent that requested the explanation. We provide a correspondence between logical and mathematical formulations for counterfactuals to examine the partiality of counterfactual explanations that can hide biases; we define fair and adequate explanations in such a setting. We provide formal results about the algorithmic complexity of fair and adequate explanations. We then detail two sophisticated counterfactual models, one based on causal graphs, and one based on transport theories. We show that transport-based models have several theoretical advantages over the competition as explanation frameworks for machine learning algorithms.

(This article belongs to the Special Issue Selected Papers from CD-MAKE 2021 and ARES 2021)

Mach. Learn. Knowl. Extr., Volume 4, Issue 2 (June 2022) – 13 articles
