Article

Using Neural Networks to Uncover the Relationship between Highly Variable Behavior and EEG during a Working Memory Task with Distractors

1
Neuromedical Control Systems Laboratory, Institute for Computational Medicine, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
2
U1077 INSERM-EPHE-UNICAEN, 14032 Caen, France
3
Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD 21218, USA
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
These authors contributed equally to this work.
Mathematics 2022, 10(11), 1848; https://doi.org/10.3390/math10111848
Submission received: 5 March 2022 / Revised: 16 May 2022 / Accepted: 18 May 2022 / Published: 27 May 2022
(This article belongs to the Special Issue From Brain Science to Artificial Intelligence)

Abstract

Value-driven attention capture (VDAC) occurs when previously rewarded stimuli capture attention and impair goal-directed behavior. In a working memory (WM) task with VDAC-related distractors, we observed behavioral variability both within and across individuals. Individuals differ in their ability to maintain relevant information and ignore distractions. These cognitive components shift over time with changes in motivation and attention, making it difficult to identify the neural mechanisms underlying individual differences. In this study, we developed the first participant-specific feedforward neural network (NN) models of reaction time (RT) from neural data during a VDAC WM task. We used short epochs of electroencephalography (EEG) data from 16 participants to develop the NN models of RT, aimed at understanding both WM and VDAC. Using general linear models (GLM), we identified 20 EEG features to predict RT across participants (r = 0.53 ± 0.08). The linear model was compared to the NN model, which improved the predicted trial-by-trial RT for all participants (r = 0.87 ± 0.04). We found that right frontal gamma-band activity and fronto-posterior functional connectivity in the alpha, beta, and gamma bands explain individual differences. Our study shows that NN models can link neural activity to highly variable behavior and can identify potential new targets for neuromodulation interventions.
MSC:
92B20; 92C55; 62J12

1. Introduction

Generally, we find that goals, stimuli, and rewards are intertwined, meaning that stimuli that were previously associated with reward are often the most important items that deserve our attention and help us achieve goals. However, they may also capture attention when they are not relevant to current goals, which can result in impaired goal-directed behavior. This phenomenon is referred to as value-driven attention capture (VDAC) [1,2]. To override such distraction and steer our attention to the currently relevant information, we need cognitive control to ignore previous associations and habits, because what we pay attention to influences both current and future actions.
Individuals differ in their ability to maintain relevant information, their capability to ignore distraction, and the degree to which learned reward associations bias attention. Individual differences in these cognitive processes have been associated with everything from academic success [3] to drug addiction [4]. However, it can be difficult to identify the underlying neural mechanisms of such individual differences, because performance in a single task is dependent on multiple cognitive components. To add further complexity, the strength of each of these components can shift due to context, motivation, and experience for each individual over time. As a result, such differences can manifest in highly variable behavior (i.e., reaction time), within a single participant over a session, as well as across multiple participants in a study.
Various noninvasive neuroimaging techniques such as electroencephalography (EEG), functional magnetic resonance imaging (fMRI), and magnetoencephalography (MEG) can be used to start linking observed behaviors to dynamic neural activity at various spatial and temporal scales [5]. Additionally, neuroimaging has been used across a wide variety of applications, including quantifying mental workload [6], identifying biomarkers to assist disease diagnosis, and using real-time feedback to optimize brain stimulation [7,8]. Moreover, EEG is the primary modality used for brain–computer interfaces, which have been developed for numerous applications (e.g., rehabilitation [9] and gaming [10]).
Currently, neuroimaging studies using EEG, fMRI, and MEG have been used to start untangling these neural complexities observed during VDAC tasks [11,12,13,14]. Previous work on the present dataset [11] revealed that attentional capture by previously rewarded, but no longer relevant stimuli was associated with worse working memory (WM) performance, lower posterior alpha power, and larger event-related potential (ERP) magnitude. The strength of these effects varied with the type of information held in WM [11]. Other studies have focused on other aspects of variability in performance during tasks that also require inhibition of irrelevant information. For example, it has been shown that errors can result in greater top-down, proactive control, both enhancing relevant information processing and inhibiting irrelevant information and slowing responses on subsequent trials, with these processes apparently mediated by different regions in the prefrontal cortex [15].
However, identifying the associations between a vast amount of neuroimaging data and behavior can be challenging, due to the multiple dimensions of variability observed during such tasks, both within and across participants. Therefore, one approach is to apply mathematical modeling techniques to the neural data to track the behavior (i.e., trial-by-trial reaction time) during a WM task. Previous approaches have used multiple function linear modeling to predict WM ability [16], and machine learning for classifying mental load [17,18], decoding for a visual working memory task [19], and predicting reaction time during a change in visual stimulus task [20] and during a lane-keeping task [21]. Currently, no mathematical model exists that is able to predict reaction time using only EEG data during a VDAC task. Multidimensional modeling may be particularly useful for understanding the relationship between distraction control and working memory maintenance because there is variability in each cognitive process and their interaction, across time and across individuals.
Therefore, in this work, we developed the first participant-specific feedforward neural network (NN) models of reaction time (RT) during a VDAC WM task. We hypothesize that the models can highlight relationships between the neural activity recorded via scalp EEG and the participants’ reaction times during the task that might not be evident using typical analysis. Specifically, the models were identified using an EEG dataset from a previous study [11] in which the reaction times for all participants were highly variable across a session. For this work, EEG features were extracted for each trial, and then a feature selection algorithm based on a general linear model (GLM) was used to identify a subset of 20 EEG features to predict each participant’s trial-by-trial RT. Next, we fit the sixteen participant-specific neural network models and compared the results to those of the GLM, which was fit using the same features. Overall, using only 20 EEG features, we were able to predict trial-by-trial RT for all participants, but observed an improvement in the correlation between the predicted RT and recorded RT for the neural network (r = 0.87 ± 0.04) when compared to the GLM (r = 0.53 ± 0.08).
A sensitivity analysis was completed for both the GLM and NN models to determine which EEG features were most important for RT prediction. The NN showed higher sensitivity than the GLM, which indicates the NN has added value. Based on this analysis, we found that right frontal low gamma (35–55 Hz) was the most important EEG feature for predicting RT. Additionally, groups of EEG features in the high beta (20–35 Hz) and low gamma (35–55 Hz) bands modulated with RT. Overall, by using a machine learning approach, we identified EEG features that may help us to better understand which areas and connections in the brain underlie participants’ RT variability related to both distraction control and working memory maintenance.

Significance of Research

  • We have developed the first participant-specific neural network models of reaction time during a working memory task with value-driven distractors based on EEG data features.
  • The model highlights new EEG features that might not have been found by traditional analysis and could be used to identify new targets for neuromodulation intervention.

2. Materials and Methods

2.1. Participants

Twenty-six participants were recruited. All participants had normal or corrected-to-normal vision and gave written informed consent, which was approved by the Institutional Review Board of Johns Hopkins University. In the original study by our group [11], we excluded seven participants from the analyses because of poor performance (<60% accuracy on the task), technical difficulties, or unusable data because of artifacts (<70% of epochs remaining after preprocessing). For this work, we excluded an additional three participants because a substantial number of trials did not have EEG data recorded. As a result, we analyzed a dataset of sixteen participants (fourteen females, mean age 20.9 ± 2.8 years), which included both behavioral and EEG data.

2.2. Task Design

The task required individuals to both maintain information in working memory and inhibit distractors, as shown in Figure 1. In addition to measuring variability in working memory maintenance ability, the task also measured variability in attention capture that depended on how well the individual had learned a previously relevant reward association. To do well on the task, participants had to learn to no longer attend to the stimulus features that were recently relevant. Participants were cued regarding whether they were to remember a location or a relative spatial relationship, and then they were presented with two squares on either side of fixation, which were the sample stimuli for which the cued information was to be remembered. Following the sample presentation and Delay 1, a distractor appeared (i.e., a set of six circles colored differently for each hemifield), which could include the previously rewarded distractor color being either on the same (congruent) or opposite (incongruent) side as the subsequent WM test stimulus. After Delay 2, the participant had 1200 ms to indicate whether the test stimulus matched the remembered sample stimulus according to the relevant spatial dimension.
In total, each participant completed 270 location and 270 relation trials, which were presented in a pseudo-random order. Additionally, throughout the entirety of the testing period, 32 channels of EEG were recorded at 512 Hz using an ActiCHamp System (Brain Products, Gilching, Germany), referenced to Cz. The session took place in an electromagnetically shielded room to reduce noise. Additionally, the participants were instructed to remain as still as possible and to try to blink during the inter-trial period.
The lines that are shown for the location trials in Figure 1 were not present in the stimulus display, but are shown here to illustrate the location that participants would maintain in working memory. For the relation task, participants remembered which of the two squares in each pair was above the other. For the distractors, the color that was paired with a reward from a previous training session was presented in either the right or left visual hemifield. Working memory was tested regarding memory for the location or relation on either the right or left side of fixation. On average, participants tended to be slower and less accurate when the previously rewarded distractor color appeared in the hemifield opposite the one in which working memory was tested, indicating attention capture and its detrimental effect on memory. See [11] for more details.

2.3. EEG Data Processing

To develop a relationship between the EEG and RT data for each participant, we preprocessed the EEG data using a standard processing pipeline in EEGLAB (version 14.4.2b) [22]. First, the data were bandpass filtered (1650th order FIR filter) between 1 Hz and 55 Hz to account for drift and powerline noise. Then, each EEG channel was re-referenced to the average, and the baseline was subtracted to center the timeseries around zero. Eye-blink and other physiological artifacts were then removed using the automatic continuous rejection feature of EEGLAB. The default independent component analysis (ICA) algorithm [23] in EEGLAB used a logistic infomax-based approach [24]. Afterward, each component was plotted onto a 2D map of the scalp, and components associated with eye blinks were selected manually and removed by projecting the sum of the remaining non-artifactual components back onto the scalp. Then, for each trial and electrode, we filtered the cleaned EEG signal, using another 1650th order FIR filter, into five different frequency bands: theta (4–8 Hz), alpha (8–12 Hz), low beta (12–20 Hz), high beta (20–30 Hz), and low gamma (30–55 Hz). Finally, the time period of interest, 1400 ms prior to the distractor (Delay 1), was extracted for each filtered signal.

2.4. Computing EEG Functional Connectivity Using Phase-Locking Value (PLV)

A common measure of functional connectivity is phase-locking value (PLV), which measures the consistency of the phase difference between two signals. If the phase difference varies little across trials, PLV is close to 1 (i.e., high synchrony between regions); with large variability in the phase difference, PLV is close to zero [25]. For our analysis, we created a pair-wise PLV matrix for each frequency band, for each trial.
To compute the PLV, the entire timeseries was converted into an analytic signal using the Hilbert transform [26]. The instantaneous phase (in radians) of the analytic signal for electrode channel $h$ was denoted by $\phi_h(t)$. Subsequently, the phase difference between channels $h$ and $i$ was computed with the following equation,
$$\phi_{hi}(t) = \left(\phi_h(t) - \phi_i(t)\right) \bmod 2\pi \tag{1}$$
Next, the beginning and ending 10% of timeseries values were removed to offset the edge effects from the Hilbert transform. Lastly, PLV was used to compare all pairs of channels against each other via the following equation,
$$\mathrm{PLV}_{hi} = \frac{1}{N}\left|\sum_{k=1}^{N} \exp\left(j\,\theta_{hi}(k\Delta t)\right)\right|, \qquad h, i = 1, 2, \ldots, n, \tag{2}$$
where $\theta_{hi}(k\Delta t)$ was the phase difference between channels $h$ and $i$ (Equation (1)) at time step $k$, $n$ was the total number of electrode channels, $j = \sqrt{-1}$ was the imaginary unit, and $\Delta t = T/N$, where $T$ was the timeseries duration and $N$ was the total number of discrete time steps.
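The PLV computation can be illustrated with a minimal Python sketch. It assumes the instantaneous phases have already been extracted (for example with a Hilbert transform) and are supplied as plain lists of radians; the function name `phase_locking_value` and the `trim_fraction` parameter are hypothetical names chosen for this illustration, not part of the original pipeline.

```python
import cmath
import math


def phase_locking_value(phase_h, phase_i, trim_fraction=0.10):
    """PLV between two instantaneous-phase series given in radians.

    trim_fraction is the share of samples dropped at EACH end to limit
    Hilbert-transform edge effects (10% in the text above).
    """
    if len(phase_h) != len(phase_i):
        raise ValueError("phase series must have equal length")
    n_total = len(phase_h)
    n_trim = int(round(trim_fraction * n_total))  # rounding rule is an assumption
    start, stop = n_trim, n_total - n_trim
    # Phase difference wrapped to [0, 2*pi), as in Equation (1).
    diffs = [(phase_h[k] - phase_i[k]) % (2 * math.pi) for k in range(start, stop)]
    # Average the unit phasors; the magnitude of that average is the PLV.
    mean_phasor = sum(cmath.exp(1j * d) for d in diffs) / len(diffs)
    return abs(mean_phasor)


# Two channels with a constant phase lag are perfectly phase locked.
locked_a = [0.1 * k for k in range(100)]
locked_b = [0.1 * k - 0.5 for k in range(100)]
plv_locked = phase_locking_value(locked_a, locked_b)

# A phase difference that sweeps the full circle uniformly gives a PLV near zero.
sweeping = [2 * math.pi * k / 80 for k in range(100)]
flat = [0.0] * 100
plv_unlocked = phase_locking_value(sweeping, flat)
```

Because only the magnitude of the averaged phasor is kept, a fixed lag between two channels still counts as full synchrony; only variability in the lag lowers the value.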

2.5. PLV Feature Extraction

For each participant, we computed the PLV matrix for each frequency band and trial, using the cleaned epoch of EEG data that was 1400 ms prior to the distractor. Then, we split the brain into six regions which were frontal-left (Fp1, F7, F3), frontal-right (Fp2, F4, F8), temporal-left (FT9, FC5, T7, C3, TP9, CP5), temporal-right (FC6, FT10, C4, T8, CP6, TP10), parietal-left (P7, P3, O1), parietal-right (P4, P8, O2).
From these six regions of the brain, we were interested in the “between region” and “within region” PLV features. To obtain the within region features, we averaged all of the PLV connections within each of the six regions described above. For example, to compute the frontal-left within region feature value, we averaged the Fp1-F7, F7-F3, Fp1-F3 PLV connections together. To compute the between region features, we found the average connection between each pair of the six regions from above. For example, to find the frontal-left to frontal-right connection, we averaged the Fp1-Fp2, Fp1-F4, Fp1-F8, F7-Fp2, F7-F4, F7-F8, F3-Fp2, F3-F4, F3-F8 connections together. For this work, we excluded the frontal-right to parietal-left connection and the frontal-left to parietal-right connection. In total, we computed the within and between region features for each trial, frequency band, and participant. For each participant, we obtained 30 within region features and 40 between region features, which produced a 540 trial by 70 EEG feature matrix. To compare results across participants, we smoothed (smoothing spline factor was 0.15) and normalized the data so each EEG feature and RT timeseries ranged between −1 and 1.
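The region averaging described above can be sketched as follows. This is only an illustration: the PLV matrix is represented as a dictionary keyed by unordered channel pairs, which is an assumption about data layout, and the helper names `within_region_feature` and `between_region_feature` are hypothetical.

```python
from itertools import combinations, product

# Region definitions taken from the text above.
REGIONS = {
    "frontal-left": ["Fp1", "F7", "F3"],
    "frontal-right": ["Fp2", "F4", "F8"],
    "temporal-left": ["FT9", "FC5", "T7", "C3", "TP9", "CP5"],
    "temporal-right": ["FC6", "FT10", "C4", "T8", "CP6", "TP10"],
    "parietal-left": ["P7", "P3", "O1"],
    "parietal-right": ["P4", "P8", "O2"],
}


def within_region_feature(plv, channels):
    """Mean PLV over every unordered channel pair inside one region."""
    pairs = list(combinations(channels, 2))
    return sum(plv[frozenset(p)] for p in pairs) / len(pairs)


def between_region_feature(plv, channels_a, channels_b):
    """Mean PLV over every pair with one channel in each region."""
    pairs = list(product(channels_a, channels_b))
    return sum(plv[frozenset(p)] for p in pairs) / len(pairs)


# Toy PLV values (made up for illustration): 0.2 everywhere except Fp1-F7.
all_channels = [ch for chans in REGIONS.values() for ch in chans]
toy_plv = {frozenset(p): 0.2 for p in combinations(all_channels, 2)}
toy_plv[frozenset(("Fp1", "F7"))] = 0.8

fl_within = within_region_feature(toy_plv, REGIONS["frontal-left"])
fl_fr_between = between_region_feature(
    toy_plv, REGIONS["frontal-left"], REGIONS["frontal-right"]
)
```

With three channels per frontal region, the within region feature averages three connections and the frontal-left to frontal-right feature averages nine, matching the examples in the text.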

2.6. EEG Feature Selection Algorithm Using General Linear Models (GLM)

A critical step toward identifying the participant-specific models was to reduce the complete set of EEG features extracted ( n = 70) in order to prevent over-fitting. The ultimate goal of the algorithm was to find a small number of EEG features that produced a low amount of error across all participants. To test our models, we selected every 5th trial to be a testing trial, which resulted in a training (80%) and testing (20%) dataset split. For the feature selection algorithm, we chose to use a GLM to predict RT, which is described by,
$$\log\left(\widehat{RT}\right) = \sum_{i=1}^{n} \beta_i F_i(t) \tag{3}$$
where $n$ was the total number of EEG features used to estimate the reaction time, $\widehat{RT}$, on trial $t$, and $\beta_i$ was the weight that multiplied each of the EEG feature vectors, $F_i(t)$. We fit the GLM by applying Matlab’s glmfit function to the training dataset. A key advantage of the linear model was that it was easily interpretable, which made it possible to identify which features may be more significant than others.
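As a small illustration of the data split and of how a prediction is read off the log-link model, consider the sketch below. It is not the authors' Matlab code: which trial within each block of five serves as the test trial is an assumption (`offset`), the function names are hypothetical, and no intercept term is included because Equation (3) does not show one.

```python
import math


def split_every_fifth(n_trials, offset=4):
    """Hold out every 5th trial for testing (an 80%/20% split).

    offset selects which position in each block of five is held out;
    the value 4 (the fifth trial of each block) is an assumption.
    """
    test = [t for t in range(n_trials) if t % 5 == offset]
    train = [t for t in range(n_trials) if t % 5 != offset]
    return train, test


def predict_rt(betas, features):
    """Invert the log link: RT_hat = exp(sum_i beta_i * F_i(t))."""
    linear_predictor = sum(b * f for b, f in zip(betas, features))
    return math.exp(linear_predictor)


train_idx, test_idx = split_every_fifth(540)
example_rt = predict_rt([2.0, 0.0], [0.5, 0.3])  # exp(2.0 * 0.5) = e
```

For the 540 trials per participant, this yields 432 training trials and 108 testing trials.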
Therefore, for the feature selection algorithm, we first started with all of the EEG features (n = 70) in the GLM to predict reaction time. Then, we systematically removed one feature from the training dataset at a time to fit a total of 70 separate GLM models. For each fitted GLM, we computed the root-mean-squared error (RMSE) between the RT prediction and recorded RT. Using the results of the 70 separate GLMs, we found the feature which produced the smallest increase in RMSE when removed. Then, the identified feature was removed from the pool of features for the next iteration, so that only 69 features remained. Each iteration removed a single feature, and this process was repeated until five features remained.
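The elimination loop can be sketched generically in Python. In this sketch, `score_subset` is a hypothetical callable standing in for "refit the GLM on the given features and return the RMSE" (for instance, averaged over participants); the toy scorer in the usage example is invented purely to make the block runnable.

```python
def backward_eliminate(features, score_subset, n_keep=5):
    """Greedy backward elimination of features.

    features     -- list of candidate feature names
    score_subset -- hypothetical callable: list of feature names -> RMSE of
                    a model refit on just those features
    n_keep       -- stop once this many features remain
    """
    remaining = list(features)
    removed = []
    while len(remaining) > n_keep:
        # Score the model obtained by leaving out each feature in turn.
        scores = {
            f: score_subset([g for g in remaining if g != f]) for f in remaining
        }
        # Drop the feature whose removal gives the lowest RMSE,
        # i.e., the smallest increase in error.
        drop = min(remaining, key=lambda f: scores[f])
        remaining.remove(drop)
        removed.append(drop)
    return remaining, removed


# Toy scorer (made up): the error is the summed importance of missing features.
importance = {"a": 4.0, "b": 1.0, "c": 3.0, "d": 2.0}


def toy_score(subset):
    return sum(v for name, v in importance.items() if name not in subset)


kept, dropped = backward_eliminate(list(importance), toy_score, n_keep=2)
```

In the toy example the least important features are dropped first, leaving `a` and `c`. The order of removal is worth keeping, since previously removed features are the candidates for the swapping step that follows.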
To further refine our selected features, we swapped features in and out in order to determine the best combination of features for our model. First, the current mean RMSE across all participants based on the initial five features was saved. Next, for each participant, we swapped out each of the current features for a feature that had been previously removed. The resulting RMSE from each iteration was saved, creating a matrix of RMSE values for each swap. Then, for each feature swap in the RMSE matrix, we found the mean value over all participants. The minimum index of this resulting matrix was then compared to the previously saved mean, and, if it was lower, the features were swapped. Then, the algorithm continued until there was no longer a mean RMSE value that was lower than the previous saved step. Then, the entire feature reduction algorithm was repeated multiple times to identify the top 6 features to the top 30 features.

2.7. Neural Network Model of EEG Data to Predict RT

We fit a feedforward neural network for each participant in order to predict their trial-by-trial RT using EEG features from the epoch 1400 ms prior to the distractor. Of the many variations of neural networks, including long short-term memory (LSTM) networks, we chose to use a feedforward network, as it does not have feedback connections, and therefore is more similar to a GLM for comparison.
Figure 2 shows an illustration of the structure of the feedforward neural network used for this application. We utilized the feedforwardnet function in Matlab to initialize the training network. For this application, we were interested in maximizing the performance of the network, while still maintaining a degree of interpretability. Therefore, the input to the network was the subset of selected EEG features, and then the features passed through one hidden layer with five neurons, which was chosen to balance both model performance and reduce over-fitting. Finally, the single output layer predicts the trial-by-trial RT which was based solely on the EEG features. To fit the neural network, we applied the same training dataset that was used previously for the GLM.
In order to avoid over-fitting the model, we used Bayesian regularization backpropagation when training the model (trainbr Matlab function [27,28]). Specifically, the network training function updated the weight and bias values according to Levenberg–Marquardt optimization [29] to produce a network that generalizes well. The algorithm minimized a combination of squared errors and weights to determine the best combinations. For this work, we used Matlab’s default parameter values for trainbr.
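To make the model structure concrete, the sketch below shows a forward pass through a network of the size described above (20 inputs, one hidden layer of five neurons, one output) together with a regularized objective of the general form that Bayesian regularization minimizes. Several details are assumptions: the hyperbolic tangent hidden activation and linear output follow what we understand to be the Matlab `feedforwardnet` defaults, and `alpha` and `beta` are hypothetical names for the weighting hyperparameters that `trainbr` adapts during training. This is not the training procedure itself.

```python
import math


def forward(features, w_hidden, b_hidden, w_out, b_out):
    """One hidden tanh layer followed by a single linear output unit."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def regularized_objective(errors, weights, alpha, beta):
    """Weighted sum of squared errors and squared weights.

    alpha and beta are assumed trade-off hyperparameters; larger alpha
    favors smaller weights and therefore a smoother fit.
    """
    sse = sum(e * e for e in errors)
    ssw = sum(w * w for w in weights)
    return beta * sse + alpha * ssw


# A 20-5-1 network with arbitrary illustrative weights.
n_inputs, n_hidden = 20, 5
w_hidden = [[0.1] * n_inputs for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [1.0] * n_hidden
b_out = 0.5

rt_at_zero_input = forward([0.0] * n_inputs, w_hidden, b_hidden, w_out, b_out)
objective = regularized_objective([1.0, 2.0], [3.0], alpha=0.5, beta=2.0)
```

Penalizing the squared weights alongside the squared errors is what discourages the network from fitting noise in the training trials.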

2.8. Identifying the Number of EEG Features for RT Prediction

To choose the best number of EEG features to predict RT, we started by fitting a neural network (as shown in Figure 2) using the selected five EEG features, and saved the resulting RMSE based on the training datasets. Then, this process was repeated in order to have 1000 NN models using different random initialization that were averaged, for each participant. Next, we repeated the NN fitting process using the remaining subsets of 6 EEG features through 30 EEG features.
Overall, our goal was to have a low RMSE across participants while keeping the number of features small. Therefore, for each subset of EEG features (5 to 30), we identified the minimum of the optimization function ( J ) defined by,
$$J = \left(\text{max RMSE}\right)^2 + \left(\text{mean RMSE}\right)^2 + \left(\text{\# of EEG Features}\right)^2 \tag{4}$$
where max RMSE was the maximum RMSE value across participants for each subset, mean RMSE was the mean RMSE value across participants for each subset, and # of EEG Features was the number of EEG features included in the model. Additionally, each term was normalized to have a maximum value of 1. Based on the results from the training dataset, we chose to use 20 EEG features in the models.
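The trade-off in Equation (4) can be illustrated with a short sketch. The RMSE values below are invented for demonstration only and are not results from the study; the function name `choose_feature_count` is hypothetical, and the three terms are combined as a plain sum of squares.

```python
def choose_feature_count(max_rmse, mean_rmse):
    """Pick the feature count that minimizes J.

    max_rmse, mean_rmse -- dicts mapping feature count -> RMSE statistic
                           across participants for that subset size
    """
    counts = sorted(max_rmse)

    def scaled(values):
        # Normalize so the largest value of each term equals 1.
        top = max(values)
        return [v / top for v in values]

    max_n = scaled([max_rmse[c] for c in counts])
    mean_n = scaled([mean_rmse[c] for c in counts])
    count_n = scaled(counts)
    j = {
        c: a ** 2 + b ** 2 + k ** 2
        for c, a, b, k in zip(counts, max_n, mean_n, count_n)
    }
    best = min(j, key=j.get)
    return best, j


# Made-up RMSE values for three subset sizes.
toy_max = {5: 0.40, 10: 0.20, 30: 0.19}
toy_mean = {5: 0.20, 10: 0.10, 30: 0.09}
best_count, j_values = choose_feature_count(toy_max, toy_mean)
```

In the toy data, going from 10 to 30 features barely lowers the error, so the feature-count term dominates and the smaller subset wins.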

2.9. Comparison of the Linear and Neural Network Models

For each participant, we evaluated the performance of the linear and neural network models for predicting the RTs of every trial during task performance using only a subset of EEG features.

2.9.1. EEG Epoch Size

For all of the methods described up to now, we focused solely on the epoch of EEG data that was 1400 ms prior to the distractor. However, to further analyze the influence of the epoch size on the linear and neural network model performance, we created three additional datasets of EEG features using different time windows: 400 ms prior to the distractor, 700 ms prior to the distractor, and an extended epoch that lasted from 1400 ms prior to the distractor to 1500 ms after the distractor (the 100 ms distractor was ignored). For all new epochs, we followed the exact same preprocessing pipeline and EEG feature extraction methods described previously. In order to compare model performance across epoch sizes, we used the same 20 EEG features across all models and epochs.

2.9.2. Model Performance Quantification

For each participant and epoch, we fit both a final GLM (Equation (3)) and a final set of 5000 neural network models (Figure 2) to predict RT using the training dataset of the chosen EEG features (n = 20). For both models, we computed the Pearson correlation coefficient (r), the maximum absolute error (MAE), and the $R^2$ metric on the training and testing datasets. The Pearson correlation was a measure of linear correlation between two variables (i.e., the predicted RT and recorded RT), and the coefficient ranged between +1 (positive linear correlation) and −1 (negative linear correlation), where 0 represented no correlation. The MAE was the value of the largest error between the recorded RT and the predicted RT. Finally, the $R^2$ metric provided a “goodness-of-fit”, which ranged between 0 (no fit) and 1 (perfect fit). For the neural network results, we averaged all of the 5000 simulations together to obtain an average predicted timeseries and correlation, MAE, and $R^2$ values, for each participant.
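The three metrics can be written out in a few lines of Python. This is a sketch under one stated assumption: $R^2$ is computed here as one minus the ratio of residual to total sum of squares, which is a common definition but is not confirmed by the text; the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def max_abs_error(recorded, predicted):
    """Largest absolute difference between recorded and predicted RT."""
    return max(abs(r - p) for r, p in zip(recorded, predicted))


def r_squared(recorded, predicted):
    """Assumed definition: 1 - SS_residual / SS_total."""
    mean_rec = sum(recorded) / len(recorded)
    ss_res = sum((r - p) ** 2 for r, p in zip(recorded, predicted))
    ss_tot = sum((r - mean_rec) ** 2 for r in recorded)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
recorded_rt = [1.0, 2.0, 3.0, 4.0]
predicted_rt = [1.1, 1.9, 3.2, 3.8]
r_value = pearson_r(recorded_rt, predicted_rt)
worst_error = max_abs_error(recorded_rt, predicted_rt)
fit_quality = r_squared(recorded_rt, predicted_rt)
```

Note that the maximum absolute error reports the single worst trial, so it is more sensitive to outliers than the correlation or $R^2$.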
To statistically compare the metrics of the GLM and NN, we used twelve separate nonparametric Mann–Whitney U tests, one for each epoch and metric combination. To reduce the number of comparisons, we combined the training and testing values together. Then, we applied a Bonferroni correction ( α = 0.004 ) to account for the 12 comparisons.

2.9.3. Sensitivity Analysis

We applied a sensitivity analysis to ascertain the contribution of each EEG feature to both the linear and neural network models. Specifically, we used the full fitted model determined using all the EEG features in the training set and then systematically re-simulated the model 20 times, each time setting a different EEG feature to zero. For each feature, we determined the decrease in model performance by comparing the correlation when a particular EEG feature was removed ($r_{feat}$) to the correlation of the full model ($r_{full}$), which is defined by,
$$\%\,\text{Change} = 100 \times \frac{r_{feat} - r_{full}}{r_{full}} \tag{5}$$
This process was repeated for each person, for both the GLM and each of the fitted neural networks, which were averaged over the 5000 iterations.
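The zero-out procedure can be sketched as follows. The `predict` argument is a hypothetical callable standing in for an already fitted GLM or neural network, and the toy data are invented so that the block runs on its own.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def percent_change(r_feat, r_full):
    """Relative change in correlation when one feature is zeroed."""
    return 100.0 * (r_feat - r_full) / r_full


def sensitivity(predict, feature_rows, recorded):
    """Zero one feature column at a time and report the % change in r.

    predict      -- hypothetical fitted model: feature list -> predicted RT
    feature_rows -- one list of feature values per trial
    """
    r_full = pearson_r(recorded, [predict(row) for row in feature_rows])
    changes = []
    for i in range(len(feature_rows[0])):
        ablated = [row[:i] + [0.0] + row[i + 1:] for row in feature_rows]
        r_feat = pearson_r(recorded, [predict(row) for row in ablated])
        changes.append(percent_change(r_feat, r_full))
    return changes


# Toy model and data (made up): the first feature matters more than the second.
def toy_model(row):
    return 2.0 * row[0] + row[1]


toy_rows = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [4.0, -1.0]]
toy_recorded = [toy_model(row) for row in toy_rows]
toy_changes = sensitivity(toy_model, toy_rows, toy_recorded)
```

A more negative percent change marks a more important feature: in the toy example, zeroing the first feature removes all correlation with the recorded values, while zeroing the second reduces it by roughly 10%.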

3. Results

3.1. Reaction Time Variability during the Working Memory Task

Overall, the RT varied between participants, and also within each participant throughout the course of the session. Figure 3 shows the normalized RT data. In general, the RT timeseries exhibited high-frequency oscillations, which could be due to several factors, including task stimulus differences across trials, noise, or latent variables associated with motivation or attention. Similar high-frequency variability in RT has been observed in many other studies during attention-demanding tasks (e.g., [30]). There are also low-frequency trends, including a decrease in overall RT over the session, which may suggest that participants continue to learn the VDAC task over time.

3.2. EEG Feature Selection Results

In order to compare the EEG features across participants, we reduced the full EEG feature set (computed from the epoch 1400 ms prior to the distractor) using our feature selection algorithm. Figure 4A shows the resulting optimization process described in Equation (4). By taking the minimum of J, we see that 20 EEG features was the best choice for keeping both the mean and maximum RMSE values across participants small while using fewer features.
The result of this selection was a set of 20 EEG features that performed well across all participants (Figure 4B). The within region features are denoted by the black circles, and the black lines represent the between region features, i.e., functional connectivity features. The Alpha, High Beta, and Gamma frequency bands included the largest number of features. The right frontal-parietal connection was selected in the Alpha, Low Beta, High Beta, and Gamma bands. In addition, features within the frontal and parietal regions were selected across multiple bands.

3.3. Linear and Neural Network Model Performance for Various Epoch Sizes

Using the selected set of 20 EEG features, both a participant-specific linear (GLM) and NN model were fitted for the four different epoch sizes: 400 ms, 700 ms, 1400 ms prior to the distractor, and 1400 ms before and after the distractor (denoted as 2800 ms). To compare the performance of the linear model and the neural network model (Figure 5), we computed the Pearson correlation coefficient (r), maximum absolute error (MAE), and $R^2$ for the training and testing datasets, for each epoch.
A series of Mann–Whitney U statistical tests were applied to compare the GLM performance to the NN, for each epoch size and metric. Across all metrics and epochs, the p-values were less than 0.0001, which indicates that the NN outperformed the GLM for predicting reaction time (refer to the Supplementary Materials for detailed results).
Overall, for both models, the performance (shown by the metrics in Figure 5) improved as the epoch size became larger, except for the largest epoch size of 2800 ms. Generally, the linear model showed smaller increases in model performance, relative to the neural network model, as the epoch size increased. For the neural network, the model performance changed more over the different epoch sizes. Model performance was reduced for some participants’ data using the longest epoch size, which may be due to this epoch including some EEG activity from both before and after the distractor stimulus. All other epoch sizes included only activity related to maintenance and preparation for distractor suppression.
On average, the neural network generally outperformed the linear model for all participants on all epochs. This trend is particularly clear in the 1400 ms epoch window, where the neural network performs well on the training set and experiences only a small drop-off during testing, across the three metrics. Therefore, this shows the neural network model was generalizing well and able to predict the RT of the participants well, using the full epoch of data from Delay 1. For our linear model, we have a worse model performance on the training set, but the testing change was generally smaller. Based on the epoch size results, the remaining analysis will focus on using the EEG features, which are computed on the epoch 1400 ms prior to the distractor.

3.4. Comparison of the Linear and Neural Network Model Performance

In Figure 6, we show the RT prediction overlaid on top of the recorded RT for both the linear and neural network model for participants 4, 9, and 13. For the linear model, the shaded bounds represent the 95% confidence intervals generated by the GLM. For the neural network model, the shaded bounds represent one standard deviation over the 5000 iterations. This figure again illustrates that the neural network can capture the trial-by-trial RT fluctuations very well. In contrast, the linear model was able to capture the larger-scale structure well but was unable to fully predict the fine structure (i.e., high-frequency oscillations) for each participant’s RT. Please refer to the Supplementary Materials to see, for each participant, the RT prediction overlaid on top of the recorded RT for both the linear and neural network models.
In order to visualize the two models’ performance for all sixteen participants, a set of scatterplots is shown in Figure 7 to compare the recorded RT with the predicted RT from the model. Again, the neural network model scatterplots have a higher overall $R^2$ value than the linear models. However, these scatterplots also show that both the linear and neural network models are capturing aspects of the trial-by-trial RT well because the predictions are centered on the y = x line. Additionally, the training and testing data points are denoted by the colored circles with no outlines and colored diamonds outlined in black, respectively.

3.5. Sensitivity Analysis for the Linear and Neural Network Models

Finally, the results of the sensitivity analysis for both the linear and neural network models are shown in Figure 8. The color of each feature indicates the mean percent change in correlation when that feature is removed, across participants. Therefore, the darker the color, the more important that EEG feature is to predicting RT. Generally, the results are similar for the linear and neural network models. Across both models, the most important feature was the right frontal gamma feature. Also, both models indicated additional highly important features from the high beta and gamma bands. However, for the linear model, the left frontal-temporal between-region feature in the low beta band was also highly important.

4. Discussion

The current study suggests that mathematical modeling of trial-by-trial RT can provide a new perspective for linking behavioral and neuroimaging data together to examine differences within and across individuals in both working memory maintenance and proactive distractor suppression. Specifically, we used a feature selection algorithm based on a general linear model to reduce the EEG features needed to predict RT. The linear model was successful in capturing large time-scale trends in the RT but was unable to predict the high-frequency oscillations present in the RT. Therefore, we used a feedforward neural network model to accurately capture the high RT variability.
The size of the epoch window influenced model performance, as shown in Figure 5. For both models and all three metrics, performance improved as the epoch size increased, except at the largest epoch size, which included activity from both before and after the distractor stimulus. The other epochs included only activity from before the distractor. These results may indicate that all the information contained within the EEG throughout the Delay 1 period prior to the distractor is important for predicting RT accurately.
On average, the neural network model outperformed the linear model across participants for all epochs. This trend is particularly clear in the 1400 ms epoch window, where the neural network performs well on the training set and experiences only a small drop-off during testing across the three performance metrics. This shows that the neural network model generalized well and was able to predict the participants’ RT using the full epoch of data from Delay 1. Based on the epoch size results, the remaining discussion will focus on results from the EEG features computed on the epoch 1400 ms prior to the distractor.
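To make the epoch definitions concrete, the sketch below slices the four windows compared in this study (400 ms, 700 ms, and 1400 ms before the distractor, and 2800 ms centered on it) from a single trial. It is only an illustration: the function name `extract_epoch`, the 500 Hz sampling rate, the channels-by-samples array layout, and the distractor onset index are all assumptions, not values taken from the study.

```python
import numpy as np


def extract_epoch(trial, onset_idx, fs, length_ms, centered=False):
    """Slice an epoch from a (channels x samples) trial array.

    By default the epoch ends at the distractor onset; with centered=True
    it is centered on the onset instead. Hypothetical helper.
    """
    n_samples = int(round(length_ms * fs / 1000.0))
    start = onset_idx - n_samples // 2 if centered else onset_idx - n_samples
    stop = start + n_samples
    if start < 0 or stop > trial.shape[-1]:
        raise ValueError("epoch extends beyond the recorded trial")
    return trial[..., start:stop]


fs = 500       # sampling rate in Hz (assumed for illustration)
onset = 1500   # sample index of distractor onset (assumed)
trial = np.arange(4 * 3000, dtype=float).reshape(4, 3000)  # placeholder data

# Three windows that contain only pre-distractor activity.
pre_epochs = {ms: extract_epoch(trial, onset, fs, ms) for ms in (400, 700, 1400)}
# One window that spans activity before and after the distractor.
centered_epoch = extract_epoch(trial, onset, fs, 2800, centered=True)
```

Only the centered window contains post-distractor samples, which is the distinction drawn above when interpreting the drop in performance at the largest epoch size.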
Previous work on the present dataset [11] revealed that attentional capture by no-longer-relevant information was associated with lower posterior alpha power contralateral to the distractor, after the distractor. Distractors also resulted in reduced accuracy and longer RT. Those behavioral and neurophysiological effects of the distractor were stronger when spatial relations, as opposed to locations, were maintained in working memory. The results of the current work also indicate that lower alpha power within-region and between-region connectivity modulate with RT (Figure 4 and Figure 8). However, we solely used EEG data recorded before the distractor for the model predictions, which allows us to predict a participant’s reaction time long before they can even answer the question at hand. This suggests that alpha may be an indicator of attentional or working memory state prior to the distractor, possibly reflecting proactive processes that suppress distractor interference with working memory maintenance.
In addition to the role of alpha power, the participant-specific linear and neural network models created during this study suggest that additional regions and connections can provide insight into RT variability. In particular, based on the sensitivity analysis, right frontal gamma power was shown to have the largest impact on both the linear and neural network model predictions. Our findings are consistent with previous work in which proactive control of information processing and response inhibition are mediated in part by right prefrontal cortex, in order to reduce the influence of irrelevant information [11,31,32,33,34], including previous work specifically implicating gamma and alpha functional connectivity between right frontal and occipitoparietal regions in this role [35].
The EEG data used in the modeling were recorded prior to the distractor presentation, but after the encoding of the sample stimuli into working memory. While the results emphasize the role of right frontal gamma during this time period in predicting variability in RT, other frequency bands and other groups of electrodes and their connectivities also appear to contribute to RT. Activity and connectivity in the theta, alpha, beta, and gamma bands have all previously been associated with working memory-related processes, including information processing, maintenance, updating, and response preparation (e.g., [36,37,38]). In addition to the regulation of the control of interfering information [39], alpha activity has also been associated with the active maintenance of relevant information in working memory [40]. Oscillations in the gamma band (30 Hz to 100+ Hz) have been associated with information processing and the integration of perceptual information [41]. The current results add to this literature by identifying the features of the EEG activity during the delay prior to reward-related distraction that appear to be most critical in predicting RT when working memory is then tested after distraction.

Limitations and Future Work

A limitation of this study is the small number of participants (sixteen). However, the current results suggest that, in the future, the methodology could be applied to larger datasets of different tasks or different epochs. By using such models, the contributions of neural activity to behavior may be highlighted by providing additional information that might not be evident using traditional analyses. Additionally, as shown in Figure 5, the model performance of the neural network is far more sensitive to changes in epoch size than that of the GLM.
One possible explanation is that, in the shorter windows, additional artifacts, such as movement, which could remain despite our artifact removal preprocessing, may become more prominent. The effects of these artifacts could be reduced in the larger epochs due to the increased number of data points. For the current study, the experimental dataset did not include accelerometer, electromyography (EMG), or electrocardiogram (ECG) data, so this explanation could not be tested directly. Therefore, further investigation into the effects the epoch size has on neural network model performance is needed.
Finally, in the future, mathematical models, such as the ones described here, will be instrumental in identifying new specific targets for training or neuromodulation interventions for improving performance. Models can highlight interactions between the neural activity and the observed behavior in ways that might not be evident in typical approaches.

5. Conclusions

In this work, we developed the first participant-specific feedforward neural network model to predict reaction time using short EEG epochs recorded during a working memory task in the time period prior to distractor presentation and well before the test stimulus was presented. Overall, we selected 20 EEG features that were able to predict trial-by-trial RT for all participants. We evaluated four different epoch sizes (400 ms, 700 ms, 1400 ms prior to distraction, and 2800 ms centered around the distraction) and determined that the 1400 ms epoch produced the best model performance for both the neural network ( r = 0.87 ± 0.04 ) and GLM ( r = 0.53 ± 0.08 ), across participants. Additionally, sensitivity analysis was completed for both the GLM and NN models to determine which EEG features were most important for RT prediction. We found that right frontal low gamma (35–55 Hz) was the most important EEG feature for predicting RT. Additionally, groups of EEG features in the high beta (20–35 Hz) and low gamma (35–55 Hz) bands modulated with RT. These results are consistent with prior research on working memory and inhibitory control, but by using a machine learning approach, we have improved our understanding of the specific contributions and importance of these aspects of neural activity in predicting within- and between-subject variability in cognitive performance and reaction times.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/math10111848/s1. Figures S1–S16: Predictions of each model overlaid on top of the normalized RT for Participants 1–16, respectively (one figure per participant). In each figure, the shaded bounds for the linear model represent the 95% confidence intervals generated by the GLM, and the shaded bounds for the neural network model represent one standard deviation over the 5000 iterations. File S1: Statistical Comparison between the Neural Network and Linear Model Performance. File S2: Pseudo-code for EEG feature selection algorithm.

Author Contributions

S.M.C. conceived and planned the experiments. T.H. carried out the experiments. C.B., S.M. and S.V.S. designed the model and the computational framework and analyzed the data. All authors contributed to the interpretation of the results. C.B. and S.M. took the lead in writing the manuscript. All authors provided critical feedback and helped shape the research, analysis, and manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Johns Hopkins Science of Learning Research Grant and NIH/NINDS T32 NS070201 (Interdisciplinary Training in Biobehavioral Pain Research).

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board of Johns Hopkins University (protocol code HIRB00004624 approved on 9 June 2016).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are openly available at https://doi.org/10.7281/T1/OOMZS7.

Acknowledgments

The authors would like to thank Kara Blacker and Brian Anderson for their work on the original experimental study and Michelle DiBartolo for her work developing the modeling approach using a different data set.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Anderson, B.A.; Laurent, P.A.; Yantis, S. Value-driven attentional capture. Proc. Natl. Acad. Sci. USA 2011, 108, 10367–10371.
  2. Awh, E.; Belopolsky, A.; Theeuwes, J. Top-down versus bottom-up attentional control: A failed theoretical dichotomy. Trends Cogn. Sci. 2012, 16, 437–443.
  3. Alloway, T.P.; Gathercole, S.; Kirkwood, H.; Elliott, J. The Cognitive and Behavioral Characteristics of Children with Low Working Memory. Child Dev. 2009, 80, 606–621.
  4. Albertella, L.; Le Pelley, M.E.; Chamberlain, S.R.; Westbrook, F.; Fontenelle, L.F.; Segrave, R.; Lee, R.; Pearson, D.; Yücel, M. Reward-related attentional capture is associated with severity of addictive and obsessive–compulsive behaviors. Psychol. Addict. Behav. 2019, 33, 495–502.
  5. Cohen, M.X. It’s about time. Front. Hum. Neurosci. 2011, 5, 2.
  6. So, W.K.Y.; Wong, S.; Mak, J.N.; Chan, R.H.M. An evaluation of mental workload with frontal EEG. PLoS ONE 2017, 12, e0174949.
  7. Tervo, A.E.; Nieminen, J.O.; Lioumis, P.; Metsomaa, J.; Souza, V.H.; Sinisalo, H.; Stenroos, M.; Sarvas, J.; Ilmoniemi, R.J. Closed-loop optimization of transcranial magnetic stimulation with electroencephalography feedback. Brain Stimul. 2022, 15, 523–531.
  8. Boutet, A.; Madhavan, R.; Elias, G.J.B.; Joel, S.E.; Gramer, R.; Ranjan, M.; Paramanandam, V.; Xu, D.; Germann, J.; Loh, A.; et al. Predicting optimal deep brain stimulation parameters for Parkinson’s disease using functional MRI and machine learning. Nat. Commun. 2021, 12, 3043.
  9. Jamil, N.; Belkacem, A.N.; Ouhbi, S.; Lakas, A. Noninvasive Electroencephalography Equipment for Assistive, Adaptive, and Rehabilitative Brain–Computer Interfaces: A Systematic Literature Review. Sensors 2021, 21, 4754.
  10. Serrano-Barroso, A.; Siugzdaite, R.; Guerrero-Cubero, J.; Molina-Cantero, A.; Gomez-Gonzalez, I.; Lopez, J.; Vargas, J. Detecting Attention Levels in ADHD Children with a Video Game and the Measurement of Brain Activity with a Single-Channel BCI Headset. Sensors 2021, 21, 3221.
  11. Hinault, T.; Blacker, K.J.; Gormley, M.; Anderson, B.A.; Courtney, S.M. Value-driven attentional capture is modulated by the contents of working memory: An EEG study. Cogn. Affect. Behav. Neurosci. 2018, 19, 253–267.
  12. Tankelevitch, L.; Spaak, E.; Rushworth, M.F.; Stokes, M.G. Previously Reward-Associated Stimuli Capture Spatial Attention in the Absence of Changes in the Corresponding Sensory Representations as Measured with MEG. J. Neurosci. 2020, 40, 5033–5050.
  13. Bachman, M.D.; Wang, L.; Gamble, M.L.; Woldorff, M.G. Physical Salience and Value-Driven Salience Operate through Different Neural Mechanisms to Enhance Attentional Selection. J. Neurosci. 2020, 40, 5455–5464.
  14. Kim, A.J.; Anderson, B.A. Arousal-Biased Competition Explains Reduced Distraction by Reward Cues Under Threat. J. Vis. 2020, 20, 169.
  15. King, J.; Korb, F.; Von Cramon, D.Y.; Ullsperger, M. Post-Error Behavioral Adjustments Are Facilitated by Activation and Suppression of Task-Relevant and Task-Irrelevant Information Processing. J. Neurosci. 2010, 30, 12759–12769.
  16. Zhang, Y.; Wang, C.; Wu, F.; Huang, K.; Yang, L.; Ji, L. Prediction of working memory ability based on EEG by functional data analysis. J. Neurosci. Methods 2019, 333, 108552.
  17. Eryilmaz, H.; Dowling, K.F.; Hughes, D.E.; Rodriguez-Thompson, A.; Tanner, A.; Huntington, C.; Coon, W.G.; Roffman, J.L. Working memory load-dependent changes in cortical network connectivity estimated by machine learning. NeuroImage 2020, 217, 116895.
  18. Jiao, Z.; Gao, X.; Wang, Y.; Li, J.; Xu, H. Deep Convolutional Neural Networks for mental load classification based on EEG data. Pattern Recognit. 2017, 76, 582–595.
  19. Che, X.; Zheng, Y.; Chen, X.; Song, S.; Li, S. Decoding Color Visual Working Memory from EEG Signals Using Graph Convolutional Neural Networks. Int. J. Neural Syst. 2021, 32, 2250003.
  20. Chowdhury, M.; Dutta, A.; Robison, M.; Blais, C.; Brewer, G.; Bliss, D. Deep Neural Network for Visual Stimulus-Based Reaction Time Estimation Using the Periodogram of Single-Trial EEG. Sensors 2020, 20, 6090.
  21. Reddy, T.K.; Arora, V.; Kumar, S.; Behera, L.; Wang, Y.-K.; Lin, C.-T. Electroencephalogram Based Reaction Time Prediction with Differential Phase Synchrony Representations Using Co-Operative Multi-Task Deep Neural Networks. IEEE Trans. Emerg. Top. Comput. Intell. 2019, 3, 369–379.
  22. Delorme, A.; Makeig, S. EEGLAB: An Open Source Toolbox for Analysis of Single-Trial EEG Dynamics Including Independent Component Analysis. J. Neurosci. Methods 2004, 134, 9–21.
  23. Makeig, S.; Bell, A.J.; Jung, T.-P.; Sejnowski, T.J. Independent component analysis of electroencephalographic data. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 1996; pp. 145–151.
  24. Bell, A.J.; Sejnowski, T.J. An Information-Maximization Approach to Blind Separation and Blind Deconvolution. Neural Comput. 1995, 7, 1129–1159.
  25. Lachaux, J.-P.; Rodriguez, E.; Martinerie, J.; Varela, F.J. Measuring phase synchrony in brain signals. Hum. Brain Mapp. 1999, 8, 194–208.
  26. Selesnick, I. The design of approximate Hilbert transform pairs of wavelet bases. IEEE Trans. Signal Process. 2002, 50, 1144–1152.
  27. Foresee, F.D.; Hagan, M.T. Gauss-Newton approximation to Bayesian learning. In Proceedings of the International Conference on Neural Networks (ICNN’97), Houston, TX, USA, 9–12 June 1997.
  28. MacKay, D.J.C. Bayesian interpolation. Neural Comput. 1992, 4, 415–447.
  29. Moré, J.J. The Levenberg-Marquardt algorithm: Implementation and theory. In Numerical Analysis; Springer: Berlin, Germany, 1978; pp. 105–116.
  30. Castellanos, F.X.; Sonuga-Barke, E.J.; Scheres, A.; Di Martino, A.; Hyde, C.; Walters, J.R. Varieties of Attention-Deficit/Hyperactivity Disorder-Related Intra-Individual Variability. Biol. Psychiatry 2005, 57, 1416–1423.
  31. Stoll, F.M.; Fontanier, V.; Procyk, E. Specific frontal neural dynamics contribute to decisions to check. Nat. Commun. 2016, 7, 11990.
  32. Wu, S.; Hitchman, G.; Tan, J.; Zhao, Y.; Tang, D.; Wang, L.; Chen, A. The neural dynamic mechanisms of asymmetric switch costs in a combined Stroop-task-switching paradigm. Sci. Rep. 2015, 5, 10240.
  33. ElShafei, H.A.; Fornoni, L.; Masson, R.; Bertrand, O.; Bidet-Caulet, A. What’s in Your Gamma? Activation of the Ventral Fronto-Parietal Attentional Network in Response to Distracting Sounds. Cereb. Cortex 2019, 30, 696–707.
  34. Xu, K.Z.; Anderson, B.A.; Emeric, E.E.; Sali, A.W.; Stuphorn, V.; Yantis, S.; Courtney, S.M. Neural Basis of Cognitive Control over Movement Inhibition: Human fMRI and Primate Electrophysiology Evidence. Neuron 2017, 96, 1447–1458.e6.
  35. Hinault, T.; Kraut, M.; Bakker, A.; Dagher, A.; Courtney, S.M. Disrupted Neural Synchrony Mediates the Relationship between White Matter Integrity and Cognitive Performance in Older Adults. Cereb. Cortex 2020, 30, 5570–5582.
  36. Miller, E.K.; Lundqvist, M.; Bastos, A.M. Working Memory 2.0. Neuron 2018, 100, 463–475.
  37. Schneider, D.; Barth, A.; Wascher, E. On the contribution of motor planning to the retroactive cuing benefit in working memory: Evidence by mu and beta oscillatory activity in the EEG. NeuroImage 2017, 162, 73–85.
  38. De Vries, I.E.; Slagter, H.A.; Olivers, C.N. Oscillatory Control over Representational States in Working Memory. Trends Cogn. Sci. 2019, 24, 150–162.
  39. Klimesch, W. Alpha-band oscillations, attention, and controlled access to stored information. Trends Cogn. Sci. 2012, 16, 606–617.
  40. Palva, S.; Palva, J.M. New vistas for α-frequency band oscillations. Trends Neurosci. 2007, 30, 150–158.
  41. Jensen, O.; Gips, B.; Bergmann, T.O.; Bonnefond, M. Temporal coding organized by coupled alpha and gamma oscillations prioritize visual processing. Trends Neurosci. 2014, 37, 357–369.
Figure 1. A schematic of the VDAC WM task for the location and relation trials. The red and blue dots represent an example of a presented distractor.
Figure 2. A diagram of the feedforward neural network structure that is used in this work.
Figure 3. Normalized reaction time for all participants with Participant 4 (P4), Participant 9 (P9), and Participant 13 (P13) highlighted.
Figure 4. EEG feature selection results. (A) For each number of EEG features, a comparison of the normalized mean and maximum RMSE values, and the optimization function (J). The features were computed on the epoch window that was 1400 ms before the distractor. (B) Features (connections) selected by our algorithm as important for predicting RT.
Figure 5. For the four epochs, the (A) Pearson correlation coefficients, (B) maximum absolute error, and (C) R² for the training and testing sets for both the linear and neural network models.
Figure 6. Predictions of each model are overlaid on top of the normalized RT for each of three representative participants: P4, P9, and P13.
Figure 7. Scatterplots of normalized RT versus the predicted RT for both the (A) linear models and (B) neural network models, for each participant.
Figure 8. Sensitivity analysis for both the linear and neural network models. The color bar indicates the percent change in model performance (Pearson correlation coefficient) when each feature was systematically removed.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Beauchene, C.; Men, S.; Hinault, T.; Courtney, S.M.; Sarma, S.V. Using Neural Networks to Uncover the Relationship between Highly Variable Behavior and EEG during a Working Memory Task with Distractors. Mathematics 2022, 10, 1848. https://doi.org/10.3390/math10111848
