Article
Peer-Review Record

Explainable Machine Learning Methods for Classification of Brain States during Visual Perception

Mathematics 2022, 10(15), 2819; https://doi.org/10.3390/math10152819
by Robiul Islam 1, Andrey V. Andreev 1,2, Natalia N. Shusharina 2 and Alexander E. Hramov 2,3,*
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 9 June 2022 / Revised: 22 July 2022 / Accepted: 5 August 2022 / Published: 8 August 2022

Round 1

Reviewer 1 Report

The authors presented their work with several experiments using various optimizers. Well, I have some major concerns about this study.

1. I wonder how explainable AI is helping in this study. The authors used different optimizers in a multi-layer perceptron (MLP) and experimented with them. Please describe the role of explainable AI in this study.

2. Please explain why the authors did not use a convolutional neural network, given that an MLP generates a large number of parameters and demands high computational cost.

3. Please also explain the computation time and the configuration of the systems used in this study.

4. It would be good to compare the study with a CNN-based method and show why the MLP is better suited to this study.

Author Response

Comment

  1. I wonder how explainable AI is helping in this study. The authors used different optimizers in a multi-layer perceptron (MLP) and experimented with them. Please describe the role of explainable AI in this study.

Answer

It is well known that one of the central issues in applying artificial intelligence methods to medical and biological tasks is the interpretability of such approaches [1-3]. This is important for creating various assistive medical decision support systems [4,5], in which a medical professional must understand and interpret the decisions obtained using explainable artificial intelligence methods. In this regard, it is of great interest in neuroscience to develop and analyze interpretable approaches for diagnostics based on neuroimaging data. It is especially important to interpret the features on which a particular machine learning system for biological or medical applications is built.

In this paper we apply different machine-learning methods for the classification of brain states and focus on the interpretability of the results. To estimate the influence of different features on the classification process and make the method more interpretable, we use the SHAP library. We introduce four models with different combinations of the ML method parameters. The contributions of the study can be formulated as follows:

- We analyze a complex EEG dataset using machine-learning techniques and find which optimization method is best suited to our dataset;

- We apply SHAP to estimate the influence of different features and make the ML model more interpretable;

- We find the optimizer that works well across the different models.

We have added the corresponding discussion to the Introduction section.

 

  1. Stiglic, G.; Kocbek, P.; Fijacko, N.; Zitnik, M.; Verbert, K.; Cilar, L. Interpretability of machine learning-based prediction models in healthcare. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2020, 10, e1379.
  2. Wang, F.; Kaushal, R.; Khullar, D. Should health care demand interpretable artificial intelligence or accept “black box” medicine? 2020.
  3. Kundu, S. AI in medicine must be explainable. Nature Medicine 2021, 27, 1328.
  4. Belle, V.; Papantonis, I. Principles and practice of explainable machine learning. Frontiers in Big Data 2021, p. 39.
  5. Kaur, S.; Singla, J.; Nkenyereye, L.; Jha, S.; Prashar, D.; Joshi, G.P.; El-Sappagh, S.; Islam, M.S.; Islam, S.R. Medical diagnostic systems using artificial intelligence (AI) algorithms: Principles and perspectives. IEEE Access 2020, 8, 228049–228069.

 

Comment

  2. Please explain why the authors did not use a convolutional neural network, given that an MLP generates a large number of parameters and demands high computational cost.

Answer

Convolutional neural networks are commonly used and have demonstrated great success in image classification, natural language processing, computer vision, etc. [Gu J. et al. Recent advances in convolutional neural networks. Pattern Recognition. 77 (2018) 354-377; Li Z. et al. A survey of convolutional neural networks: analysis, applications, and prospects. IEEE Transactions on Neural Networks and Learning Systems. (2021)]. It should be noted that applying them to brain signals and achieving good accuracy requires a very deep CNN, which leads to a large number of parameters and high computational costs [Xu G. et al. A deep transfer convolutional neural network framework for EEG signal classification. IEEE Access. 7 (2019) 112767-112776; Jiao Z. et al. Deep convolutional neural networks for mental load classification based on EEG data. Pattern Recognition. 76 (2018) 582-595]. At the same time, MLP-based deep learning methods are widely used for classifying EEG signals [Abbasi S. F. et al. EEG-based neonatal sleep-wake classification using multilayer perceptron neural network. IEEE Access. 8 (2020) 183025-183034; Sharma R., Kim M., Gupta A. Motor imagery classification in brain-machine interface with machine learning algorithms: Classical approach to multi-layer perceptron model. Biomedical Signal Processing and Control. 71 (2022) 103101]. We have added the corresponding discussion and references to the Introduction.

 

Comment

  3. Please also explain the computation time and the configuration of the systems used in this study.

Answer

The configuration of the computing system we used to perform the ML computations is as follows:

  • RAM: 503 GB
  • CPU: Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz
  • OS: Ubuntu 18.04.5 LTS 64 bit
  • GPU: none (all computations were performed on the CPU)

The computation time for Case I is 155–162 seconds per model (1 dataset, 1 frequency) when all EEG channels are used, and 74 to 82 s per model when only the EEG channels from the left or right hemisphere are used. For Case II the computation time strongly depends on the activation function and the optimization method; it varies from 288 to 6559 seconds per model, with an average of 2274 s. We have added the corresponding information to the manuscript. We have also added a plot comparing the computation time of all models and optimization methods for Case II (Figure 7, Section 4.2).

 

Comment

  4. It would be good to compare the study with a CNN-based method and show why the MLP is better suited to this study.

Answer

In our previous work [Kuc A. et al. Combining statistical analysis and machine learning for EEG scalp topograms classification. Frontiers in Systems Neuroscience 15 (2021)] we used a CNN to classify brain responses to ambiguous visual stimuli from EEG signals. We found that a CNN trained on 19 subjects could classify the data of a new participant with 74% accuracy. In the current study we use an MLP for the classification of brain states during visual perception and obtain much higher classification accuracy than we achieved with the CNN. We have added the corresponding discussion in Section 4 and Table 17.

Reviewer 2 Report

This study aimed to propose a machine learning architecture for the classification of brain states during visual perception using the EEG signal. I have the following suggestions.

What is the novelty of this study, given that several machine learning architectures for the classification of brain states during visual perception using the EEG signal have been proposed earlier?

The abstract should be rewritten and improved by combining the objectives, short methodology, main findings, and prospective application.

Please write down the contributions of the study at the end of the Introduction section in bulleted form.

Authors should include a conceptual figure of their proposed approach with more details and the model parametrization.

Authors should introduce the numerous applications of EEG across a broad range of areas, such as mental workload and disease prediction. Machine-learning approaches are utilized for stroke prediction in the article "HealthSOS: Real-Time Health Monitoring System for Stroke Prognostics" and in the article "Quantitative Evaluation of Task-Induced Neurological Outcome after Stroke".

EEG is highly sensitive to powerline, muscular, and cardiac artifacts. In the EEG data preprocessing, the authors need to mention how they handled AC power, ECG, and EMG artifacts in the EEG signals. Do the authors think that their proposed method is robust to such kinds of artifacts?

Authors should discuss case studies of EEG biomarkers, such as brain stimulation for different neurological workloads, in the article "Quantifying Physiological Biomarkers of a Microwave Brain Stimulation Device" and in the article "Quantitative Evaluation of EEG-Biomarkers for Prediction of Sleep Stages".

Authors should report the SHAP output visualizations in the appendix. Authors should compare their explainable AI results with other tools, such as LIME.

Authors need to mention the model parameters or hyperparameters of the proposed ML model.

Authors should present the training and validation accuracy and error graphs of the proposed model.

How did the authors deal with dataset class imbalance challenges in classification?

Authors should report more performance measures of their model, such as accuracy, sensitivity, specificity, and precision.

Both training and testing ROC curves need to be shown. What ML model validation method did the authors use?

The discussion section needs to be improved. The authors must discuss the advantages and drawbacks of their proposed method in comparison with other recent studies, adding a table in the discussion section.

From the writing point of view, the manuscript must be checked for typos and the grammatical issues should be improved.

Author Response

Comment

  1. What is the novelty of this study, given that several machine learning architectures for the classification of brain states during visual perception using the EEG signal have been proposed earlier?

Answer

The novelty of this study lies in building an interpretable deep-learning classifier that can work with EEG data for the classification of brain states during visual perception, and in finding the features with the most influence on the classification process, which makes the decisions obtained by AI methods more understandable and interpretable. In the long run, this is important and relevant for various medical AI applications [Kundu, S. AI in medicine must be explainable. Nature Medicine 2021, 27, 1328].

 

Comment

  2. The abstract should be rewritten and improved by combining the objectives, short methodology, main findings, and prospective application.

Answer

We have rewritten the abstract according to the reviewer's comment. The new version of the abstract: "The aim of this work is to find a good mathematical model for the classification of brain states during visual perception, with a focus on the interpretability of the results. To achieve this, we use deep-learning models with different activation functions and optimization methods, compare them, and find the best model for the considered dataset of trials of 31 EEG channels. To estimate the influence of different features on the classification process and make the method more interpretable, we use the SHAP library. We find that the best optimization method is adagrad and the worst one is ftrl. We also find that only adagrad works well for both linear and tangent models. The results could be useful for EEG-based brain-computer interfaces (BCIs) in terms of choosing the appropriate machine learning methods and features for the correct training of the BCI intelligent system."
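For illustration, a minimal sketch of how the same Keras MLP could be compiled with the different optimizers compared in the paper; the layer sizes, feature count, class count, and optimizer list below are placeholders, not the authors' exact architecture, which is given in Tables 2 and 3:

```python
import tensorflow as tf

def build_mlp(optimizer, activation="tanh", n_features=31, n_classes=2):
    # Hypothetical layer sizes; the actual architecture is described in Tables 2 and 3.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation=activation, input_shape=(n_features,)),
        tf.keras.layers.Dense(64, activation=activation),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer=optimizer,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Keras accepts these optimizer names as strings; on the authors' dataset
# "adagrad" performed best and "ftrl" worst.
models = {name: build_mlp(name) for name in ["adagrad", "ftrl", "adam", "sgd", "rmsprop"]}
```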

 

Comment

  3. Please write down the contributions of the study at the end of the Introduction section in bulleted form.

Answer

We have added the contributions of the study in bulleted form at the end of the Introduction section.

 

Comment

  4. Authors should include a conceptual figure of their proposed approach with more details and the model parametrization.

Answer

Thank you for the valuable comment. We have added Figure 3 with the overall structure, in which we explain the proposed approach in detail and illustrate the model parameters.

 

Comment

  5. Authors should introduce the numerous applications of EEG across a broad range of areas, such as mental workload and disease prediction. Machine-learning approaches are utilized for stroke prediction in the article "HealthSOS: Real-Time Health Monitoring System for Stroke Prognostics" and in the article "Quantitative Evaluation of Task-Induced Neurological Outcome after Stroke".

Answer

We have added a brief review of the numerous machine-learning applications of EEG to the Introduction section.

 

Comment

  6. EEG is highly sensitive to powerline, muscular, and cardiac artifacts. In the EEG data preprocessing, the authors need to mention how they handled AC power, ECG, and EMG artifacts in the EEG signals. Do the authors think that their proposed method is robust to such kinds of artifacts?

Answer

After experimental registration, the EEG signals were filtered with a fourth-order Butterworth (1–100) Hz bandpass filter and a 50 Hz notch filter. In addition, independent component analysis (ICA) was performed to remove eye-blink and heartbeat artifacts. It should be noted that this was done three years ago, when the experiments were conducted. In the present study we did not conduct any new experiments; we only used previously recorded data that had already been cleaned of artifacts by the above procedures, and we did not perform any additional manipulations on them.
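For reference, a minimal sketch of such a preprocessing pipeline, assuming MNE-Python and a hypothetical raw recording file (the software and file format actually used for the original recordings may differ):

```python
import mne

# Hypothetical file name; the original recordings are not part of this record.
raw = mne.io.read_raw_brainvision("subject01.vhdr", preload=True)

# Fourth-order Butterworth band-pass filter (1-100 Hz) ...
raw.filter(l_freq=1.0, h_freq=100.0, method="iir",
           iir_params=dict(order=4, ftype="butter"))
# ... and a 50 Hz notch filter against powerline interference.
raw.notch_filter(freqs=50.0)

# ICA to remove eye-blink and heartbeat components.
ica = mne.preprocessing.ICA(n_components=20, random_state=0)
ica.fit(raw)
ica.exclude = [0, 1]  # artifact component indices, normally chosen by visual inspection
raw_clean = ica.apply(raw.copy())
```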

 

Comment

  7. Authors should discuss case studies of EEG biomarkers, such as brain stimulation for different neurological workloads, in the article "Quantifying Physiological Biomarkers of a Microwave Brain Stimulation Device" and in the article "Quantitative Evaluation of EEG-Biomarkers for Prediction of Sleep Stages".

Answer

We have added a brief discussion of machine-learning applications for finding EEG biomarkers and predicting sleep stages to the Introduction section.

 

Comment

  8. Authors should report the SHAP output visualizations in the appendix. Authors should compare their explainable AI results with other tools, such as LIME.

Answer

According to the reviewer's comment, we have moved the SHAP output visualizations to the appendix.

The aim of our work is to find a good mathematical model for the classification of brain states during visual perception, with a focus on the interpretability of the results. To estimate the influence of different features on the classification process and make the method more interpretable, we use the SHAP library. We do not consider using LIME because it does not help to assess the overall behavior of the model: LIME [Ramon, Y., Martens, D., Provost, F., Evgeniou, T. A comparison of instance-level counterfactual explanation algorithms for behavioral and textual data: SEDC, LIME-C and SHAP-C. Advances in Data Analysis and Classification, 14(4) (2020) 801-819] mainly provides single-instance (local) explanations and does not aggregate them into an overall picture of the model.
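As an illustration, a minimal, self-contained sketch of how SHAP can give such a global, model-wide view of feature importance; the data, model, and explainer choice below are placeholders, not the authors' code:

```python
import numpy as np
import shap
import tensorflow as tf

# Synthetic stand-ins for the EEG feature matrices (trials x features).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(300, 31)).astype("float32")
X_test = rng.normal(size=(50, 31)).astype("float32")

# A small placeholder classifier; in the study the trained MLP is explained instead.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="tanh", input_shape=(31,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adagrad", loss="sparse_categorical_crossentropy")

# Model-agnostic KernelExplainer: f maps features to the probability of class 1.
f = lambda x: model.predict(x)[:, 1]
explainer = shap.KernelExplainer(f, X_train[:100])
shap_values = explainer.shap_values(X_test, nsamples=200)

# The summary plot aggregates per-trial attributions into a global ranking of
# feature importance -- the model-wide view that a purely local method such as
# LIME does not provide directly.
shap.summary_plot(shap_values, X_test)
```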

 

Comment

  9. Authors need to mention the model parameters or hyperparameters of the proposed ML model.

Answer

Thank you; we have specified the model parameters in the descriptions of the models in Sections 2.5.1 and 2.5.2 and in Tables 2 and 3.

 

Comment

  10. Authors should present the training and validation accuracy and error graphs of the proposed model.

Answer

We have added graphs illustrating the training and validation accuracy and loss of the proposed models (Figure 5).

 

Comment

  11. How did the authors deal with dataset class imbalance challenges in classification?

Answer

We do not need to deal with class imbalance in the classification because our datasets are balanced: the number of points is the same for each value of the parameter (intensity).

 

Comment

  12. Authors should report more performance measures of their model, such as accuracy, sensitivity, specificity, and precision.

Answer

We have measured precision, recall (sensitivity), F1-score, and specificity for our models and plotted Figure 8 for comparison of the models.
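For reference, a minimal sketch of how these measures can be computed with scikit-learn; the labels below are synthetic placeholders, not results from the study:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

# Placeholder binary labels; in the study these come from the test trials and the model output.
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([0, 1, 0, 0, 1, 1, 1, 1])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = precision_score(y_true, y_pred)
sensitivity = recall_score(y_true, y_pred)   # recall, i.e. true-positive rate
specificity = tn / (tn + fp)                 # true-negative rate
f1 = f1_score(y_true, y_pred)

print(f"precision={precision:.2f}  sensitivity={sensitivity:.2f}  "
      f"specificity={specificity:.2f}  F1={f1:.2f}")
```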

 

Comment

  13. Both training and testing ROC curves need to be shown. What ML model validation method did the authors use?

Answer

We have added ROC curves for Cases I and II in Figure 4. For validation we use the default validation mechanism built into the Keras Sequential model: we hold out 10% of the original dataset as data on which the loss and model metrics are evaluated at the end of each epoch. The model is not trained on this data; it is used only for tuning the hyperparameters so that the model generalizes well to unknown data.
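A minimal, self-contained sketch of this setup (the data, architecture, epoch count, and batch size below are placeholders): Keras holds out 10% of the training data via validation_split and reports its loss and accuracy after every epoch, which also yields the training/validation curves added as Figure 5.

```python
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# Synthetic stand-ins for the EEG feature matrix and labels (trials x features).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 31)).astype("float32")
y_train = rng.integers(0, 2, size=1000)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="tanh", input_shape=(31,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adagrad", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# validation_split=0.1: 10% of the data is held out and evaluated at the end of each epoch.
history = model.fit(X_train, y_train, epochs=20, batch_size=32,
                    validation_split=0.1, verbose=0)

plt.plot(history.history["accuracy"], label="training accuracy")
plt.plot(history.history["val_accuracy"], label="validation accuracy")
plt.xlabel("epoch")
plt.legend()
plt.show()
```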

 

Comment

  14. The discussion section needs to be improved. The authors must discuss the advantages and drawbacks of their proposed method in comparison with other recent studies, adding a table in the discussion section.

Answer

We have added a discussion of the advantages and drawbacks of our proposed method in comparison with other recent studies to the discussion section. We have also added Table 17, which presents a comparative overview of the methodologies and results of the current work and of previous machine learning-based EEG studies aimed at classifying brain states.

 

Comment

  15. From the writing point of view, the manuscript must be checked for typos, and the grammatical issues should be improved.

Answer

Thank you; we have checked the manuscript for typos and grammatical issues and corrected them.

Reviewer 3 Report

The study presents several case studies by using explainable machine learning methods for the classification of brain states during visual perception. This reviewer will recommend approval after a few corrections.
- It is very difficult to find contributions in the manuscript. This means that the storytelling of the paper is insufficient, and it is recommended that the manuscript provide a picture of the structure of the overall system.
- It is recommended that photos of the experimental scene or environment be provided.

- It seems necessary to explain the overall flow and flowchart of the machine learning methods and structure proposed in this manuscript.

Author Response

Comment

  1. It is very difficult to find contributions in the manuscript. This means that the storytelling of the paper is insufficient, and it is recommended that the manuscript provide a picture of the structure of the overall system.

Answer

Thank you for the recommendation. We have added Figure 3 with the structure of the overall system and improved the structure of the manuscript by adding the contributions of the study at the end of the Introduction section and comparison graphs in Section 4.

 

Comment

  2. It is recommended that photos of the experimental scene or environment be provided.

Answer

We have added Figure 2 with the experimental paradigm of the study. The experiments were conducted several years ago, and here we used the available data to build a machine learning model.

 

Comment

  3. It seems necessary to explain the overall flow and flowchart of the machine learning methods and structure proposed in this manuscript.

Answer

We have added Figure 3, in which we show the machine learning methods we use to work with the datasets, together with brief information about them. The detailed structure of the deep learning models is presented in Tables 2 and 3.

Round 2

Reviewer 1 Report

The authors' explanations of the questions are not convincing:

1. According to the answer to the first question, the features are not explainable AI. These are texture features that were introduced in several conventional studies using different methods. So I suggest that you change the title of your manuscript; I cannot agree that this is explainable AI.

2. The authors' answers to comments 2 and 4 merely point to references. The authors should justify experimentally why the MLP is better than a CNN, because the computer vision community already knows that an MLP has a large number of parameters. Citing references is not sufficient to prove the efficiency of this study.

3. Why does the CNN show 74% accuracy while the MLP shows higher accuracy? This is interesting to know. Please justify it.

4. The explanation of Figure 7 is difficult to understand.

Author Response

Comment

  1. According to the answer to the first question, the features are not explainable AI. These are texture features that were introduced in several conventional studies using different methods. So I suggest that you change the title of your manuscript; I cannot agree that this is explainable AI.

Answer

Explainable AI algorithms are considered to follow the principle of explainability [Adadi A., Berrada M. Peeking inside the black-box: a survey on explainable artificial intelligence (XAI). IEEE Access. 2018, 6, 52138-52160]. The concept of explainability does not yet have a unified definition, so there are a number of interpretations [Roscher R. et al. Explainable Machine Learning for Scientific Insights and Discoveries. IEEE Access. 2020, 8, 42200–42216]. One of them is that explainability in ML can be considered as "the collection of features of the interpretable domain, that have contributed for a given example to produce a decision (e.g., classification or regression)" [Montavon G., Samek W., Müller K. R. Methods for interpreting and understanding deep neural networks. Digital Signal Processing. 2018, 73, 1-15]. We use this definition in our research and consider our ML methods explainable because we analyze the influence of the features on the classification process. We have added the corresponding part to the Introduction section for the readers' better understanding.

 

Comment

  2. The authors' answers to comments 2 and 4 merely point to references. The authors should justify experimentally why the MLP is better than a CNN, because the computer vision community already knows that an MLP has a large number of parameters. Citing references is not sufficient to prove the efficiency of this study.

Answer

We thank the Referee for the suggestion to experimentally compare MLP and CNN on the classification of visual stimuli from EEG signals using our dataset. It is a very interesting and complex task that requires additional extensive research. Given that we have investigated long EEG recordings during perception, it requires a new paradigm for building a machine learning model. We will address it in our next work.

 

Comment

  3. Why does the CNN show 74% accuracy while the MLP shows higher accuracy? This is interesting to know. Please justify it.

Answer

In Ref. [Kuc A. et al. Combining Statistical Analysis and Machine Learning for EEG Scalp Topograms Classification. Frontiers in Systems Neuroscience 2021, 15] the authors used a CNN for visual stimuli classification (i.e., an event-related decision-making task) based on two-dimensional EEG scalp topograms. The dataset and the stimuli are different from the ones we have used in the current work. Moreover, we analyze the state of the brain during visual perception (without any decision-making), whereas those authors analyzed evoked potentials. Based on the literature, CNNs usually do not classify evoked potentials with accuracy above 80-90% [Xing J. et al. A CNN-based comparing network for the detection of steady-state visual evoked potential responses. Neurocomputing 2020, 403, 452-461; Waytowich N. et al. Compact convolutional neural networks for classification of asynchronous steady-state visual evoked potentials. Journal of Neural Engineering. 2018, 15(6), 066031; Ravi A. et al. Enhanced System Robustness of Asynchronous BCI in Augmented Reality Using Steady-State Motion Visual Evoked Potential. IEEE Transactions on Neural Systems and Rehabilitation Engineering. 2022, 30, 85-95]. In our work, using deep learning models, it is possible to significantly improve the quality of classification of brain states based on long EEG segments.

 

Comment

  4. The explanation of Figure 7 is difficult to understand.

Answer

We have rewritten the Figure 7 explanation.

Reviewer 2 Report

Section 2.1 should be divided into two parts: dataset and EEG pre-processing.

Author Response

Comment

  1. Section 2.1 should be divided into two parts: dataset and EEG pre-processing.

Answer

Thank you, we have divided Section 2.1 into two parts: 2.1 Datasets Description and 2.2 Preprocessing.

Round 3

Reviewer 1 Report

The authors' replies to the first and third questions are not convincing.
