
Improving Predictive Models with Expert Knowledge

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Multidisciplinary Applications".

Deadline for manuscript submissions: closed (30 October 2022) | Viewed by 21438

Special Issue Editor


Prof. Dr. Markus Pauly
Guest Editor
Department of Statistics, TU Dortmund University, 44227 Dortmund, Germany
Interests: asymptotic and nonparametric statistics; multivariate analysis; resampling and statistical/machine learning in theory and practice; survival and time series analysis

Special Issue Information

Dear Colleagues,

Big Data has changed many aspects of present-day statistics: in many fields, predictive models are developed in a purely data-driven way, often under the directive 'the more data the better', building on Peter Norvig's quote 'We don't have better algorithms. We just have more data.' However, the success of Google and Amazon does not transfer one-to-one to other areas. In fact, Big Data is not the same as good data, and for most applications in science or industry, accurate and reasonable predictions require additional insights. This expert or domain knowledge may be given by simple physical constraints on the output or by knowledge about underlying relations, dependencies, or causalities. Unfortunately, there are few studies on methods for the inclusion of expert knowledge, let alone on its (informational and predictive) effect. This is where this Special Issue comes in: we envision systematic studies (simulation and theory) on this topic, covering application areas from natural science models to industrial time-series forecasting. For example, we envision studies that analyze and quantify the improvement that additional information brings to machine-learning methods in terms of entropy or other information-theoretic, information-quality, or importance measures. Other potential examples may cover the influence of the chosen experimental design, the connection to AutoML, missing data, penalization, distributional discrepancies (e.g., measured via the Kullback–Leibler divergence), etc.
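As a minimal, concrete illustration of the last point, the Kullback–Leibler divergence between a hypothetical expert prior and an observed empirical distribution can be computed in a few lines of Python; the probabilities below are placeholders, not data from any of the contributed studies:

```python
import numpy as np
from scipy.stats import entropy

# Hypothetical example: discrepancy between an expert's prior belief about a
# discrete outcome and the empirical distribution observed in the data.
expert_prior = np.array([0.50, 0.30, 0.20])   # assumed expert-specified probabilities
empirical    = np.array([0.45, 0.35, 0.20])   # assumed observed relative frequencies

# Kullback-Leibler divergence D(empirical || expert_prior), in nats.
kl = entropy(empirical, qk=expert_prior)
print(f"KL divergence: {kl:.4f} nats")
```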

Prof. Dr. Markus Pauly
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • classification
  • deep learning
  • domain knowledge
  • feature engineering
  • information gain
  • machine learning
  • regression
  • regularization
  • statistical learning

Published Papers (8 papers)


Research

15 pages, 3316 KiB  
Article
Deep Spatio-Temporal Graph Network with Self-Optimization for Air Quality Prediction
by Xue-Bo Jin, Zhong-Yao Wang, Jian-Lei Kong, Yu-Ting Bai, Ting-Li Su, Hui-Jun Ma and Prasun Chakrabarti
Entropy 2023, 25(2), 247; https://doi.org/10.3390/e25020247 - 30 Jan 2023
Cited by 35 | Viewed by 3529
Abstract
The environment and development are major issues of general concern. After suffering the harms of environmental pollution, humans have begun to pay attention to environmental protection and to research on pollutant prediction. Many air pollutant prediction studies have attempted to predict pollutants by revealing their evolution patterns, emphasizing time-series fitting while ignoring the spatial transmission effects of adjacent areas, which leads to low prediction accuracy. To solve this problem, we propose a time-series prediction network with the self-optimization ability of a spatio-temporal graph neural network (BGGRU) to mine the changing pattern of the time series and the spatial propagation effect. The proposed network includes spatial and temporal modules. The spatial module uses a graph sampling and aggregation network (GraphSAGE) to extract the spatial information of the data. The temporal module uses a Bayesian graph gated recurrent unit (BGraphGRU), which applies a graph network to the gated recurrent unit (GRU) to fit the data's temporal information. In addition, this study used Bayesian optimization to solve the problem of model inaccuracy caused by inappropriate hyperparameters. The high accuracy of the proposed method was verified on actual PM2.5 data from Beijing, China, providing an effective method for predicting PM2.5 concentration.
(This article belongs to the Special Issue Improving Predictive Models with Expert Knowledge)

24 pages, 1075 KiB  
Article
Estimating Gaussian Copulas with Missing Data with and without Expert Knowledge
by Maximilian Kertel and Markus Pauly
Entropy 2022, 24(12), 1849; https://doi.org/10.3390/e24121849 - 19 Dec 2022
Cited by 3 | Viewed by 1215
Abstract
In this work, we present a rigorous application of the Expectation Maximization algorithm to determine the marginal distributions and the dependence structure in a Gaussian copula model with missing data. We further show how to circumvent a priori assumptions on the marginals with semiparametric modeling and outline how expert knowledge on the marginals and the dependence structure can be included. A simulation study shows that the distribution learned through this algorithm is closer to the true distribution than that obtained with existing methods and that the incorporation of domain knowledge provides benefits.
(This article belongs to the Special Issue Improving Predictive Models with Expert Knowledge)
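For orientation, the standard complete-data baseline that the paper's EM procedure extends can be sketched in a few lines: a rank-based (semiparametric) estimate of the Gaussian copula correlation matrix from fully observed data. This is not the authors' missing-data algorithm, and all data below are synthetic:

```python
import numpy as np
from scipy.stats import norm

def gaussian_copula_corr(X: np.ndarray) -> np.ndarray:
    """Rank-based estimate of the Gaussian copula correlation matrix
    from fully observed data (n samples x d variables)."""
    n, _ = X.shape
    # Empirical CDF via ranks, rescaled into the open interval (0, 1).
    u = (np.argsort(np.argsort(X, axis=0), axis=0) + 1) / (n + 1)
    # Map pseudo-observations to Gaussian scores and take their correlation.
    z = norm.ppf(u)
    return np.corrcoef(z, rowvar=False)

# Synthetic demo with a known dependence structure (rho = 0.7) and
# deliberately non-Gaussian marginals.
rng = np.random.default_rng(0)
z = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=1000)
X = np.column_stack([np.exp(z[:, 0]), z[:, 1] ** 3])
print(gaussian_copula_corr(X))  # close to 0.7 off-diagonal
```

Because the estimator only uses ranks, the monotone marginal transformations leave the estimated copula correlation essentially unchanged.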

14 pages, 2387 KiB  
Article
A Dual-Stage Attention Model for Tool Wear Prediction in Dry Milling Operation
by Yongrui Qin, Jiangfeng Li, Chenxi Zhang, Qinpei Zhao and Xiaofeng Ma
Entropy 2022, 24(12), 1733; https://doi.org/10.3390/e24121733 - 28 Nov 2022
Cited by 2 | Viewed by 1158
Abstract
The intelligent monitoring of tool wear status and the prediction of wear are important factors in the intelligent development of the modern machinery industry. Many scholars have used deep learning methods to achieve certain results in tool wear prediction. However, due to the instability and variability of the signal data, some neural network models may suffer from gradient decay between layers. Moreover, most methods focus mainly on feature selection of the input data but ignore how strongly different features influence tool wear. To solve these problems, this paper proposes a dual-stage attention model for tool wear prediction. A CNN-BiGRU-attention network model is designed, which introduces self-attention to extract deep features and emphasize the more important ones. An IndyLSTM is used to construct a stable network that addresses the gradient decay problem between layers. Moreover, an attention mechanism is added to the network to extract the important information from the output sequence, which improves the accuracy of the prediction. An experimental study of tool wear prediction in a dry milling operation demonstrates the viability of this method. Comparison and analysis with regression prediction evaluation indexes show that the proposed method can effectively characterize the degree of tool wear, reduce prediction errors, and achieve good prediction results.
(This article belongs to the Special Issue Improving Predictive Models with Expert Knowledge)

16 pages, 1214 KiB  
Article
Using Background Knowledge from Preceding Studies for Building a Random Forest Prediction Model: A Plasmode Simulation Study
by Lorena Hafermann, Nadja Klein, Geraldine Rauch, Michael Kammer and Georg Heinze
Entropy 2022, 24(6), 847; https://doi.org/10.3390/e24060847 - 20 Jun 2022
Cited by 1 | Viewed by 2365
Abstract
There is increasing interest in machine learning (ML) algorithms for predicting patient outcomes, as these methods are designed to automatically discover complex data patterns. For example, the random forest (RF) algorithm is designed to identify relevant predictor variables out of a large set of candidates. In addition, researchers may use external information for variable selection to improve model interpretability and variable selection accuracy, and thereby prediction quality. However, it is unclear to what extent, if at all, RF and other ML methods may benefit from external information. In this paper, we examine the usefulness of external information from prior variable selection studies that used traditional statistical modeling approaches such as the Lasso, or suboptimal methods such as univariate selection. We conducted a plasmode simulation study based on subsampling a data set from a pharmacoepidemiologic study with nearly 200,000 individuals, two binary outcomes and 1152 candidate (mainly sparse binary) predictor variables. When the scope of candidate predictors was reduced based on external knowledge, RF models achieved better calibration, that is, better agreement between predictions and observed outcome rates. However, prediction quality measured by cross-entropy, AUROC or the Brier score did not improve. We recommend appraising the methodological quality of studies that serve as an external information source for future prediction model development.
(This article belongs to the Special Issue Improving Predictive Models with Expert Knowledge)
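The general idea of restricting a random forest's candidate predictors with external knowledge can be illustrated with a toy synthetic analogue of the comparison (not the paper's plasmode design; the feature indices and settings below are arbitrary):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical setup: 200 candidate predictors, only the first 10 informative
# (shuffle=False keeps the informative features in the leading columns).
X, y = make_classification(n_samples=2000, n_features=200, n_informative=10,
                           n_redundant=0, shuffle=False, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# "External knowledge" is simulated: a prior study flagged features 0-14.
prior_selection = list(range(15))

for name, cols in [("all candidates", slice(None)),
                   ("externally selected", prior_selection)]:
    rf = RandomForestClassifier(n_estimators=300, random_state=1)
    rf.fit(X_tr[:, cols], y_tr)
    p = rf.predict_proba(X_te[:, cols])[:, 1]
    print(f"{name}: Brier={brier_score_loss(y_te, p):.3f}, "
          f"AUROC={roc_auc_score(y_te, p):.3f}")
```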

15 pages, 299 KiB  
Article
Are Experts Well-Calibrated? An Equivalence-Based Hypothesis Test
by Gayan Dharmarathne, Anca M. Hanea and Andrew Robinson
Entropy 2022, 24(6), 757; https://doi.org/10.3390/e24060757 - 27 May 2022
Cited by 1 | Viewed by 1620
Abstract
Estimates based on expert judgements of quantities of interest are commonly used to supplement or replace measurements when the latter are too expensive or impossible to obtain. Such estimates are commonly accompanied by information about the uncertainty of the estimate, such as a credible interval. To be considered well-calibrated, an expert’s credible intervals should cover the true (but unknown) values a certain percentage of time, equal to the percentage specified by the expert. To assess expert calibration, so-called calibration questions may be asked in an expert elicitation exercise; these are questions with known answers used to assess and compare experts’ performance. An approach that is commonly applied to assess experts’ performance by using these questions is to directly compare the stated percentage cover with the actual coverage. We show that this approach has statistical drawbacks when considered in a rigorous hypothesis testing framework. We generalize the test to an equivalence testing framework and discuss the properties of this new proposal. We show that comparisons made on even a modest number of calibration questions have poor power, which suggests that the formal testing of the calibration of experts in an experimental setting may be prohibitively expensive. We contextualise the theoretical findings with a couple of applications and discuss the implications of our findings.
(This article belongs to the Special Issue Improving Predictive Models with Expert Knowledge)
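The equivalence-testing idea can be sketched with a generic exact-binomial TOST (two one-sided tests) for interval coverage; this illustrates the framework rather than the authors' exact procedure, and the numbers are hypothetical:

```python
from scipy.stats import binom

def coverage_tost(hits: int, n: int, stated: float, margin: float) -> float:
    """Two one-sided exact binomial tests (TOST) for equivalence of an
    expert's actual interval coverage to the stated level within +/- margin.
    Returns the TOST p-value (the larger of the two one-sided p-values)."""
    low, high = stated - margin, stated + margin
    # H0a: true coverage <= low   vs.  H1a: coverage > low
    p_low = binom.sf(hits - 1, n, low)
    # H0b: true coverage >= high  vs.  H1b: coverage < high
    p_high = binom.cdf(hits, n, high)
    return max(p_low, p_high)

# Hypothetical example: 16 of 20 calibration questions fall inside the
# expert's 80% credible intervals; equivalence margin of 10 percentage points.
print(coverage_tost(hits=16, n=20, stated=0.80, margin=0.10))
```

With only 20 calibration questions, this returns a TOST p-value of roughly 0.24 even though the observed coverage (80%) matches the stated level exactly, echoing the power problem the abstract describes.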

16 pages, 3322 KiB  
Article
Dynamic Risk Prediction via a Joint Frailty-Copula Model and IPD Meta-Analysis: Building Web Applications
by Takeshi Emura, Hirofumi Michimae and Shigeyuki Matsui
Entropy 2022, 24(5), 589; https://doi.org/10.3390/e24050589 - 22 Apr 2022
Cited by 12 | Viewed by 2759
Abstract
Clinical risk prediction formulas for cancer patients can be improved by dynamically updating them with intermediate events, such as tumor progression. The increased accessibility of individual patient data (IPD) from multiple studies has motivated the development of dynamic prediction formulas that account for between-study heterogeneity. A joint frailty-copula model for overall survival and time to tumor progression has the potential to yield a dynamic prediction formula of death from heterogeneous studies. However, the process of developing, validating, and publishing such a prediction formula is complex and has not been sufficiently described in the literature. In this article, we provide a tutorial on building a web-based application for dynamic risk prediction for cancer patients on the basis of the R packages joint.Cox and Shiny. We demonstrate the proposed methods using a dataset of breast cancer patients from multiple clinical studies. Following this tutorial, we show how one can publish web applications online that can be used by anyone through a smartphone or personal computer. After completing this tutorial, developers acquire the ability to build an online web application using their own datasets.
(This article belongs to the Special Issue Improving Predictive Models with Expert Knowledge)

25 pages, 2826 KiB  
Article
On the Relation between Prediction and Imputation Accuracy under Missing Covariates
by Burim Ramosaj, Justus Tulowietzki and Markus Pauly
Entropy 2022, 24(3), 386; https://doi.org/10.3390/e24030386 - 9 Mar 2022
Cited by 7 | Viewed by 1862
Abstract
Missing covariates in regression or classification problems can prohibit the direct use of advanced tools for further analysis. Recent research shows an increasing trend towards the use of modern machine-learning algorithms for imputation, which stems from their favorable prediction accuracy in various learning problems. In this work, we analyze through simulation the interaction between imputation accuracy and prediction accuracy in regression learning problems with missing covariates when machine-learning-based methods are used for both imputation and prediction. We find that even a slight decrease in imputation accuracy can seriously affect prediction accuracy. In addition, we explore imputation performance when using statistical inference procedures in prediction settings, such as the coverage rates of (valid) prediction intervals. Our analysis is based on empirical datasets from the UCI Machine Learning Repository and an extensive simulation study.
(This article belongs to the Special Issue Improving Predictive Models with Expert Knowledge)
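The interplay between imputation accuracy and downstream prediction accuracy can be mimicked in a small Monte Carlo sketch with synthetic data and generic scikit-learn imputers (not the study's actual design):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import SimpleImputer, IterativeImputer
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(7)
X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=7)
X_miss = X.copy()
X_miss[rng.random(X.shape) < 0.2] = np.nan  # 20% MCAR missingness in covariates

# Keep the fully observed covariates (Xf_*) only to score imputation accuracy.
X_tr, X_te, y_tr, y_te, Xf_tr, Xf_te = train_test_split(
    X_miss, y, X, random_state=7)

for name, imputer in [("mean", SimpleImputer()),
                      ("iterative", IterativeImputer(random_state=7))]:
    X_tr_imp = imputer.fit_transform(X_tr)
    X_te_imp = imputer.transform(X_te)
    # Imputation RMSE over all entries (observed entries contribute zero error).
    imp_rmse = np.sqrt(np.mean((X_te_imp - Xf_te) ** 2))
    model = RandomForestRegressor(random_state=7).fit(X_tr_imp, y_tr)
    pred_rmse = np.sqrt(mean_squared_error(y_te, model.predict(X_te_imp)))
    print(f"{name}: imputation RMSE={imp_rmse:.2f}, prediction RMSE={pred_rmse:.2f}")
```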

17 pages, 4785 KiB  
Article
A Variational Bayesian Deep Network with Data Self-Screening Layer for Massive Time-Series Data Forecasting
by Xue-Bo Jin, Wen-Tao Gong, Jian-Lei Kong, Yu-Ting Bai and Ting-Li Su
Entropy 2022, 24(3), 335; https://doi.org/10.3390/e24030335 - 25 Feb 2022
Cited by 68 | Viewed by 5114
Abstract
Compared with mechanism-based modeling methods, data-driven modeling based on big data has become a popular research field in recent years because of its broad applicability. However, when building a forecasting model in practical settings, more data is not always better. Due to the noise, conflicts, redundancy, and inconsistency of big time-series data, forecasting accuracy may even decrease. This paper proposes a deep network that selects and understands data to improve performance. Firstly, a data self-screening layer (DSSL) with a maximal information distance coefficient (MIDC) is designed to filter input data with high correlation and low redundancy; then, a variational Bayesian gated recurrent unit (VBGRU) is used to improve the anti-noise ability and robustness of the model. A verification experiment on 24 h PM2.5 concentration forecasting with Beijing air quality and meteorological data shows that the proposed model is superior to other models in accuracy.
(This article belongs to the Special Issue Improving Predictive Models with Expert Knowledge)
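The DSSL/MIDC screening layer is specific to this paper; as a loosely analogous, generic pre-screening step, candidate input series can be ranked by their estimated mutual information with the forecasting target. A hedged sketch on synthetic data:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

# Hypothetical design matrix: each column is one candidate input series
# (e.g., lagged pollutant or meteorological measurements), and `target`
# stands in for the PM2.5 value to forecast.
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 8))
target = 2.0 * X[:, 0] - X[:, 3] + rng.normal(scale=0.5, size=500)

# Rank candidate inputs by estimated mutual information with the target and
# keep the most informative ones (a crude stand-in for the paper's DSSL/MIDC).
mi = mutual_info_regression(X, target, random_state=3)
keep = np.argsort(mi)[::-1][:3]
print("MI per candidate:", np.round(mi, 3))
print("selected columns:", keep)
```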
