Article

Can Brain–Computer Interfaces Replace Virtual Reality Controllers? A Machine Learning Movement Prediction Model during Virtual Reality Simulation Using EEG Recordings

by Jacob Kritikos 1,*, Alexandros Makrypidis 1, Aristomenis Alevizopoulos 2, Georgios Alevizopoulos 3 and Dimitris Koutsouris 4

1 Department of Bioengineering, Imperial College London, London SW7 2BX, UK
2 School of Medicine, National and Kapodistrian University of Athens, 11527 Athens, Greece
3 Psychiatric Clinic, Agioi Anargyroi General Oncological Hospital of Kifissia, 14564 Athens, Greece
4 School of Electrical and Computer Engineering, National Technical University of Athens, 15772 Athens, Greece
* Author to whom correspondence should be addressed.
Virtual Worlds 2023, 2(2), 182-202; https://doi.org/10.3390/virtualworlds2020011
Submission received: 17 January 2023 / Revised: 18 March 2023 / Accepted: 29 May 2023 / Published: 9 June 2023

Abstract:
Brain–Machine Interfaces (BMIs) have made significant progress in recent years; however, several application areas still need improvement, including the accurate prediction of body movement during Virtual Reality (VR) simulations. Achieving a high level of immersion in VR sessions requires bidirectional interaction, which is typically provided by movement-tracking devices such as controllers and body sensors. However, it may be possible to eliminate these external tracking devices by acquiring movement information directly from the motor cortex via electroencephalography (EEG) recordings, which could lead to more seamless and immersive VR experiences. Numerous studies have investigated EEG recordings during movement. While the majority have focused on predicting movement from brain signals, far fewer have examined how such predictions can be utilized during VR simulations, so further research is needed to fully understand the potential for using EEG to predict movement in VR. In this study, we propose two neural network decoders designed to predict pre-movement and during-movement arm behavior from brain activity recorded while participants performed VR simulation tasks. Both decoders employ a Long Short-Term Memory (LSTM) model. The study's findings are highly encouraging, supporting the premise that this technology could eventually replace external tracking devices.

1. Introduction

The idea of the brain being the central organ of cognition was first proposed in the 5th century BC by Alcmaeon, and further contributions were made by Herophilus, Erasistratus, and Galen in the following centuries. Major progress in understanding the human brain occurred during the 20th century, with pioneers such as Hans Berger and Jacques Vidal making significant breakthroughs [1,2]. A Brain–Machine Interface (BMI) comprises two main components: (1) access to the human brain’s activity, and (2) the extraction of useful information from this signal, which can be processed in various ways depending on the specific purpose of the BMI system. The system has interfaces between the brain and some external machine or device, with the goal of modifying interactions between the central nervous system (CNS) and either the body or the external world. BMIs have the potential to impact the lives of individuals with disabilities by providing new ways to interact with the world and to regain lost function [3,4,5,6]. BMI systems have a wide range of potential applications, including the restoration of lost muscle control and the enhancement of mobility for paralyzed individuals [7,8,9]. They can be used for communication and to restore lost body functions in individuals with physical disabilities. BMI systems can improve the natural CNS output and supplement it by providing an additional layer of control and precision. Additionally, they can allow healthy individuals to perform tasks that would be difficult or impossible to achieve using traditional means, such as operating a robotic arm in a hazardous environment [10].
Brain activity needs to be recorded to facilitate interactions through a BMI. To record brain activity for a BMI, various electrophysiological methods can be used, such as EEG, MEG, and fMRI, each with its own advantages and limitations [3,4]. The choice of method will depend on the requirements of the specific BMI system and the characteristics of the brain activity being measured. For example, intracortical microelectrodes allow detailed control of a robotic arm but have rapid wear, while electrodes on the cortical surface provide stable signals but only allow nontargeted control [11]. Effective interaction between the CNS and the targeted device is crucial for successful BMI application, which is a challenging task. BMIs can be categorized as noninvasive, semi-invasive, or invasive, and proper signal acquisition is critical for the function of any BMI. The selection of optimal regions for the recording of signals from the CNS is crucial for effective BMI systems. Several studies have focused on recording signals from the sensorimotor cortical areas, which are easily accessible, produce measurable signals, and can be monitored using noninvasive electrodes [12,13,14]. Deep neural networks (DNNs) have shown promise for decoding neural signals from the motor cortex. DNNs are a type of machine learning algorithm composed of multiple layers of artificial neurons [15,16,17]. These layers allow the network to learn increasingly abstract representations of the input data, leading to better performance on complex tasks, such as image and speech recognition. The network learns to associate specific patterns of neural activity with specific movement directions, enabling it to predict the desired movement from the recorded neural signals. DNNs have been shown to outperform traditional decoding algorithms, such as linear filters and support vector machines, particularly in situations where the relationship between neural signals and movement commands is highly nonlinear.
A holistic BMI, on the other hand, also requires a stimulus, a goal, and a task. Over the past few years, VR technology has made significant advances, becoming more accessible and enabling the development of a wider range of applications. VR technology has the potential to increase surgical efficiency and to improve physical rehabilitation treatments, memory and cognitive abilities, and mental health care [18,19,20,21,22,23]. One of the primary benefits of VR is its ability to create highly realistic and immersive artificial environments, allowing users to experience situations that may not be possible in the real world [24,25,26,27,28]. VR technology has provided clinicians with a powerful tool for delivering immersive therapeutic interventions to their patients. By simply putting on a headset, patients can be transported into a wide range of virtual environments where they can be exposed to a variety of therapeutic tasks and challenges. These simulated environments offer a sense of safety and comfort that can help patients feel more confident and willing to confront their fears and challenges. While the therapeutic tasks take place within the virtual world, research has shown that the learning and skills gained through VR can transfer to real-world situations. VR can be used in conjunction with DNNs, which can be trained to decode neural signals recorded from the motor cortex and then used to control virtual limbs or avatars in a VR environment [29,30]. This approach has several advantages, including a more natural and intuitive interface for users, as well as immersive and engaging training environments that can facilitate motor learning and rehabilitation.
Various DNN models have been utilized to decode neural signals originating from the motor cortex [31,32,33,34]. Nonetheless, few of these models focus on motor activity during a Virtual Reality simulation, and models designed for time-series datasets may be particularly helpful in this setting. In this study, we aim to investigate the potential of a DNN model called Long Short-Term Memory (LSTM), which has proven effective [35,36,37,38] for decoding neural signals recorded from the motor cortex, during Virtual Reality simulations, with a particular focus on improving the immersive qualities of such simulations [39]. Our objective is to gain a deeper understanding of how VR experiences can be designed to be more impactful for users, which could inform the development of more effective therapeutic interventions and other applications of VR technology. Our current research involves the use of a BMI system that records brain activity from the motor cortex using EEG while individuals perform specific arm movement tasks during the VR simulation. Our study aims to investigate how well the LSTM model can predict the path of arm movements from the recorded brain activity. By understanding the relationship between brain activity and arm movement, we aim to develop new ways to assist individuals who require VR treatment (Figure 1).

2. Materials and Methods

A neural decoder was developed with the goal of controlling a hypothetical arm inside the virtual simulation (Figure 2). In the virtual environment, objects were placed at different positions, allowing users to virtually interact with them. This provided a way for users to engage with the virtual environment and perform tasks within it. The placement of objects at different positions allowed for a variety of interactions and challenges for the user, which potentially led to a more immersive and engaging experience. In addition, the variability of the tasks at a small scale (e.g., stereotypical reach-to-grasp movement was executed for the diverse arrangement of positions) promoted the training of a decoder that was invariant to movement-specific EEG signals and more tuned to task-specific generalized activities.
The virtual objects were placed in locations that could be reached only by arm movement, so that the experiment isolated arm motor activity (Figure 3). The objects were positioned within a 60-degree range of the participant's central vision to make them easily visible and reachable. The placement of the objects was carefully planned to avoid intense body stretching or head movement, which could introduce unwanted artifacts or noise into the data; the goal was to record arm movement with as little extraneous movement and noise as possible. By limiting movement to only the dominant hand, we could ensure that any changes in motor activity were directly related to the movement of the arm and not to any other confounding factors. To minimize movement artifacts and noise, participants were instructed to stand steadily and not move their body or head during the experiment. This further ensured that the recorded motor activity was related solely to the movement of the arm and was not influenced by other sources of motion. The careful planning of object positions and participant instructions aimed to provide high-quality EEG data for developing a neural decoder to control arm movement in the virtual environment.
We developed a Virtual Reality (VR) system tailored to the needs of our study that possesses the necessary specifications for our research. The hardware components consisted of a desktop computer with an NVIDIA GeForce GTX 1070 graphics card, AMD Ryzen 7 2700X CPU, 16 GB G.Skill TridentZ DDR4 RAM, HDMI 1.3 video output, 3 USB 3.0 and 1 USB 2.0 ports, an Oculus Rift VR headset, and an Arduino Uno connected to a Seeed 101020052 Grove Electrodermal Activity Sensor to measure skin electrical conductance. The software used included the Windows 10 operating system with Oculus Rift drivers, Unity 3D, Blender 3D computer graphics software, Adobe Photoshop, OVR Plugin, and the Arduino IDE. The GSR sensor values were read by the Arduino at a rate of 100 Hz and used accordingly with our Unity software. Furthermore, we used Matlab for post-data analysis and to form the neural network of the simulation data.
The study included 10 healthy participants, consisting of 7 males and 3 females between the ages of 21 and 42 years old. The use of both male and female participants across a range of ages helps to ensure that the results are more broadly applicable and generalizable. The participants wore EEG equipment to record activity in the motor cortex, specifically using 35 channels (Figure 4). Prior to participating in the study, all individuals provided written informed consent to take part in the research. During the experiment, each participant repeatedly reached targets located at different positions in the virtual environment. Specifically, each participant repeated the movement 50 times for each of the 8 virtual objects, resulting in a total of 400 movements being recorded for each participant. This means that, in total, 4000 movements were recorded for the entire experiment. EEG data were collected using a sampling frequency of 400 Hz. Prior to analysis, the data were passed through a bandpass filter with a range of 0.5–35 Hz. This range was chosen to filter out unwanted noise and electrical interference while retaining the relevant signals associated with motor cortex activity. The 0.5–35 Hz frequency range is widely used in EEG recordings, as it captures movement-related activity in the motor cortex and other relevant brain regions. EEG recordings can be contaminated by various sources of noise, including environmental factors, electrical equipment, and physiological artifacts, such as muscle activity and eye blinks. To minimize the impact of these artifacts, we employed a pre-processing pipeline that included several steps. First, we used a notch filter to remove line noise at 50 Hz and its harmonics. Then, we used a threshold-based approach and independent component analysis (ICA) to identify and remove muscle activity and eye movement artifacts. We also designed the VR simulation to minimize the potential for extraneous muscle activity and movement-related artifacts that might have been introduced by larger body movements. In addition, to reduce movement artifacts, we instructed the participants to only move their dominant arm while keeping the rest of their body still during the experiment. This helped to ensure that the recorded signals were primarily related to motor cortex activity and were not affected by other sources of movement-related noise.
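As an illustration of the filtering stages described above, the band-pass and notch steps could be sketched in Python with SciPy as follows. The cutoff values and sampling rate come from the text; the function and variable names, and the omission of the ICA-based artifact removal step, are our own simplifications rather than the authors' actual pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

FS = 400          # EEG sampling frequency (Hz), as stated above
N_CHANNELS = 35   # number of EEG channels

def preprocess_eeg(raw, fs=FS):
    """Sketch of the described filtering: 0.5-35 Hz band-pass plus a 50 Hz
    notch, applied zero-phase. `raw` is a (channels, samples) array."""
    # 4th-order Butterworth band-pass, 0.5-35 Hz
    b_bp, a_bp = butter(4, [0.5, 35], btype="bandpass", fs=fs)
    x = filtfilt(b_bp, a_bp, raw, axis=-1)
    # Notch filter at 50 Hz to suppress power-line interference
    b_n, a_n = iirnotch(w0=50.0, Q=30.0, fs=fs)
    x = filtfilt(b_n, a_n, x, axis=-1)
    # Remove the per-channel DC component
    return x - x.mean(axis=-1, keepdims=True)

eeg = np.random.randn(N_CHANNELS, 4 * FS)   # 4 s of synthetic data
clean = preprocess_eeg(eeg)
```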
EEG data can be contaminated by various types of noise and artifacts, such as those caused by blinking, swallowing, and abrupt body movements, as well as those resulting from unstable electrode contact or reference electrode issues [40]. To minimize movement-related artifacts, we instructed participants to avoid abrupt movements and to maintain a comfortable resting position while completing the assigned tasks. Additionally, by tracking hand movement, we were able to discard irrelevant parts of the recording and only retain channel recordings collected during or close to the assigned task. Finally, after selecting the recording windows relative to the movement, we removed electrode noise by subtracting the DC component. These steps helped to ensure that the EEG data collected were of high quality and accurately reflected the neural activity of interest [41,42].
The recording window that was selected for analysis began 300 milliseconds (ms) before movement and included 100 ms after movement completion, giving a total of 1000 ms (Figure 5). Implementation of this design allowed the decoder to predict the hand position vector based on neural activity that was relevant to the task assigned to the participant. By including a period of time before movement onset, it was possible to capture preparatory activity, which is an increase in neuronal firing that occurs in response to the presentation of a stimulus and the subsequent motor planning. By examining activity in these different periods, it is possible to gain a more comprehensive understanding of the neural processes involved in movement and how they may vary across the movement cycle. In more detail, the data included the EEG activity of each channel during each trial at each position in the form of a 35 × T array, where T represents time in milliseconds and 35 represents the number of channels.
By capturing the 3D positions of the controllers, the x, y, and z positions of the participant's hand (right or left) were recorded during the experiment. These hand position data were stored in a 3 × T array, where T represents time in milliseconds and each row represents the position along a different axis (x, y, z). This storage method allowed for the efficient organization and analysis of the hand position data, which were used to train and test the neural networks (Figure 6).
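For concreteness, the per-trial data structures described above can be sketched as follows. The array names are ours, and the assumed 600 ms movement duration (which makes the 1000 ms window consistent with the 300 ms pre-movement and 100 ms post-movement margins) is an illustrative inference from the bin counts given later.

```python
import numpy as np

FS = 400                      # EEG sampling frequency (Hz)
WINDOW_MS = 1000              # 300 ms pre-movement + ~600 ms movement + 100 ms post
T = FS * WINDOW_MS // 1000    # 400 samples per trial at 400 Hz

eeg_trial = np.zeros((35, T))    # 35 x T EEG array (channels x time)
hand_trial = np.zeros((3, T))    # 3 x T hand-position array (x, y, z rows)

# Full experiment: 10 participants x 8 targets x 50 repetitions = 4000 trials
eeg_trials = np.zeros((4000, 35, T))
hand_trials = np.zeros((4000, 3, T))
```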
A DNN was chosen as the machine learning model to be trained and tested. This was preferred over other techniques, such as support vector machines and the Kalman filter, due to its ease of implementation: modern Matlab and Python frameworks allow neural network applications to be constructed in a few lines of code, so the primary task becomes preparing the data for training and testing. Neural networks use input data, in this case, brain activity recorded by the electrodes, to identify patterns and relationships with data labels, in this case, arm position. It is important for the training data to be structured in a meaningful way to ensure that any relationships identified are useful and valid for the problem at hand. To this end, the training data were divided into two groups: the data from the initial 300 milliseconds (ms) before movement onset and the remaining data collected during movement, minus the last 100 ms after movement completion. This split allowed for the analysis of neural activity at different stages of the movement process and ensured that the training data were relevant to the problem at hand.
The first group of training data (Figure 7b) was used to train a classification DNN to find a relationship between the EEG signal averaged over time bins and the target position. The training data consisted of the first 300 ms of the signal for all 35 channels, and the training labels were a 1 × 8 vector, with each column representing one of the eight positions. During this 300 ms period, the arm was not moving, and any amplitude/power that occurred was likely due to the presentation of a stimulus (the target position) and subsequent motor planning. This increase in neuronal firing before movement, known as preparatory activity, was used to train the network to predict the position of the movement trajectory. To speed up the computation time, the data were binned into 20 ms chunks, for a total of 15 bins for the 300 ms period, with values normalized between 0 and 1. This allowed the neural network to process less data while still maintaining the necessary resolution for the analysis. Overall, this training approach was designed to identify the relationship between EEG signals and target positions and to enable the neural network to accurately predict the position of the movement trajectory.
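A minimal sketch of this binning and normalization follows, assuming min–max scaling per channel (the exact normalization method is not specified in the text) and the stated 400 Hz sampling rate.

```python
import numpy as np

FS = 400                             # Hz
SAMPLES_PER_BIN = FS * 20 // 1000    # 20 ms bins -> 8 samples per bin

def bin_and_normalize(epoch, n_bins):
    """Average a (channels, samples) epoch into `n_bins` consecutive 20 ms
    bins, then min-max normalize each channel to [0, 1]."""
    ch = epoch.shape[0]
    x = epoch[:, :n_bins * SAMPLES_PER_BIN].reshape(ch, n_bins, SAMPLES_PER_BIN)
    binned = x.mean(axis=-1)                          # (channels, n_bins)
    lo = binned.min(axis=1, keepdims=True)
    hi = binned.max(axis=1, keepdims=True)
    return (binned - lo) / (hi - lo + 1e-12)          # guard against zero range

# Pre-movement window: 300 ms -> 15 bins (input to the classification network)
pre = bin_and_normalize(np.random.randn(35, 120), n_bins=15)
# Movement window: 600 ms -> 30 bins (input to the regression network)
mov = bin_and_normalize(np.random.randn(35, 240), n_bins=30)
```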
The second neural network (Figure 7c) used in the study was a more advanced network designed for regression, which used the remaining data (during movement, minus the last 100 ms) to find the relationship between EEG activity and the hand position vector. To speed up the computation time, the data were binned into 20 ms chunks, for a total of 30 bins with normalized values between 0 and 1. The positional data were also binned, with each 20 ms chunk containing the average position over that time period. The location labels were the binned x and y position data, starting at 301 ms and ending 100 ms before recording termination. This setup reflects the predictive nature of the neural network, which was designed to predict hand position.
LSTM is a special variant of the recurrent neural network (RNN) that is capable of tracking long-term dependencies among input data sequences [43,44]. LSTMs are specifically designed for processing sequences of data over time, where each input to the network is a sequence of values (a sequential signal). In an LSTM network, there are multiple memory cells, each with its own input, output, and forget gates. These gates regulate the flow of information into and out of each cell, allowing the network to selectively remember or forget information based on its relevance to the task at hand. At each time step, the LSTM network receives an input sequence, and the network performs a series of calculations using the input value, the previous cell state, and the previous output. These calculations are used to update the cell state and the output for the current time step. The output of the network can then be used for both classification (the first neural network) and regression (the second neural network). In the case of EEG signals, the LSTM network can be used to classify the signals into different categories based on the task being performed by the subject. The input to the network is a sequence of EEG data, and the output is a label indicating the task being performed; in our case, the output of the first network was one of the 8 angles, and the output of the second network was the arm position [44,45].
$h_t = \sigma_g\left(W x_t + R h_{t-1} + b\right) \odot \sigma_c\left(c_t\right)$  (1)
  • $\sigma_c$: hyperbolic tangent function (state activation); $\sigma_g$: sigmoid function (gate activation);
  • $W$: input weights; $R$: recurrent weights; $b$: bias;
  • $x_t$: input at time step $t$; $h_{t-1}$: output at the previous time step; $c_t$: cell state; $\odot$: element-wise product.
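To make Formula (1) concrete, here is a minimal NumPy sketch of the output computation, treating the cell state $c_t$ as given and using toy dimensions of our own choosing.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_output(W, R, b, x_t, h_prev, c_t):
    """Formula (1): the hidden output h_t is the gate activation
    sigma_g(W x_t + R h_{t-1} + b) times the squashed cell state sigma_c(c_t)."""
    gate = sigmoid(W @ x_t + R @ h_prev + b)   # sigma_g(...)
    return gate * np.tanh(c_t)                 # element-wise product with sigma_c(c_t)

# Toy dimensions: 4 hidden units, 35-dimensional input (one value per channel)
rng = np.random.default_rng(0)
W, R = rng.standard_normal((4, 35)), rng.standard_normal((4, 4))
b, h_prev = np.zeros(4), np.zeros(4)
x_t, c_t = rng.standard_normal(35), rng.standard_normal(4)
h_t = lstm_output(W, R, b, x_t, h_prev, c_t)
```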
In our study, both neural networks employed an LSTM architecture, chosen following the algorithm and formulas provided in the Matlab documentation (mathworks.com/help/deeplearning/ug/long-short-term-memory-networks.html, accessed on 20 May 2023) (Formula (1)). The first neural network, which was designed to predict the position observed by the participant, used a Classification LSTM Network (Figure 8).
The second neural network, which was designed to predict hand movement, used a Regression LSTM Network (Figure 9). Both neural networks had three hidden layers, with one layer dedicated to the LSTM block and two additional hidden layers. These hidden layers were situated between the input and output layers, and they transformed the input data into a form that was more meaningful for the output layer. For the first neural network (which was used to decode the position), the LSTM layer had a size of 15, corresponding to the number of time windows (bins). The second and third hidden layers had sizes of 15 and 20, respectively. The output layer of the first neural network had a size of 8, reflecting the number of different positions. For the second neural network (which was used to decode the movement), the LSTM layer had a size of 30, corresponding to the number of time windows (bins). The second and third hidden layers had sizes of 30 and 32, respectively. The output layer of the second neural network had a size of 30, corresponding to the number of time window bins. These sizes were determined through trial and error, with the aim of finding the optimal number of layers and neurons in each layer to decode target positions and hand positions from test data consisting of average EEG signals in 20 ms chunks over a time period from the initial 0 ms to 300 ms for the first neural network and from the initial 300 ms to the end of the movement for the second neural network.
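The networks themselves were built in Matlab; purely for illustration, models with the stated layer sizes could be sketched in PyTorch as follows. The mapping of the two additional "hidden layers" to fully connected layers with ReLU activations, and the representation of the regression target as one value per output bin, are our assumptions.

```python
import torch
import torch.nn as nn

class PositionClassifier(nn.Module):
    """First network: 15-step sequences of 35-channel bins -> 1 of 8 targets."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=35, hidden_size=15, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(15, 15), nn.ReLU(),
            nn.Linear(15, 20), nn.ReLU(),
            nn.Linear(20, 8),              # 8 classes / angles
        )

    def forward(self, x):                  # x: (batch, 15, 35)
        _, (h_n, _) = self.lstm(x)         # final hidden state
        return self.head(h_n[-1])          # logits over the 8 positions

class TrajectoryRegressor(nn.Module):
    """Second network: 30-step sequences -> 30 binned positions."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=35, hidden_size=30, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(30, 30), nn.ReLU(),
            nn.Linear(30, 32), nn.ReLU(),
            nn.Linear(32, 30),             # one output per time-window bin
        )

    def forward(self, x):                  # x: (batch, 30, 35)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])

logits = PositionClassifier()(torch.randn(2, 15, 35))   # -> (2, 8)
traj = TrajectoryRegressor()(torch.randn(2, 30, 35))    # -> (2, 30)
```

In training, the classifier would pair with a cross-entropy loss and the regressor with a mean-squared-error loss, mirroring the classification/regression distinction described above.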

3. Results

The dataset was randomly divided into training and testing sets, with 80% of the data used for training and 20% used for testing. Specifically, 3200 movements and recordings were used for training (400 for each of the 8 angles), and 800 movements and recordings were used for testing (100 for each of the 8 angles). To prevent data snooping, the testing data were not seen during the model training phase. This ensured that the model was not memorizing the training data but was instead learning the underlying patterns in the data that could generalize well to new, unseen data. LSTM neural networks, which fall under the umbrella of supervised learning, were employed in this study. The input layer of both neural networks was the EEG-recorded time series. For the first model, the targeted 8 angles/classes were the output layer, while for the second model, the (x, y, z) virtual movement recorded from the controllers was the output layer.
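A stratified split along these lines could be sketched as follows; the library choice and names are ours, and the random seed is arbitrary.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# X: 4000 binned EEG trials (random placeholders here); y: labels 0..7,
# with 500 trials per angle (10 participants x 50 repetitions)
X = np.random.randn(4000, 15, 35)
y = np.repeat(np.arange(8), 500)

# Stratified 80/20 split: 400 training and 100 test trials per angle;
# the test set is held out entirely during training to avoid data snooping
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
```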
For the first network, we evaluated the model using the multiclass confusion matrix obtained on the held-out testing data, as presented in Table 1. The neural network's predicted/output distribution is shown in the rows, while the columns indicate the true/expected class distribution. The diagonal elements are the correctly predicted data: 685 of the 800 test samples were correctly predicted, giving an overall accuracy of 85.6%.
Furthermore, we evaluated the first model based on the classification precision (Formula (2)), recall/sensitivity (Formula (3)), F1 score (Formula (4)), specificity (Formula (5)), and negative predictive value (Formula (6)), as presented in Table 2. The precision, recall, and F1 score values for each class provide insights into the performance of the prediction. Precision is the fraction of True Positives among all positive (True Positive and False Positive) predictions, while recall is the fraction of True Positives among all actual positive (True Positive and False Negative) instances.
  • True Positive refers to a sample that is correctly classified as positive, meaning that it was accurately predicted to belong to the actual class (i.e., the correct angle).
  • False Positive refers to a sample that is incorrectly classified as positive, meaning that it was predicted to belong to this class while it does not belong to this class (i.e., incorrect angle).
  • False Negative refers to a sample that is incorrectly classified as negative, meaning that it actually belongs to the class but was predicted to belong to a different class (i.e., the correct angle was missed).
False Positives and False Negatives are both types of error in classification tasks. The main difference between the two is based on whether the predicted class is incorrect or correct. In other words, a False Positive is a sample that is predicted to belong to a certain class but is not actually part of that class, while a False Negative is a sample that is predicted to not belong to a certain class but is actually part of that class.
A high precision value indicates that the model is accurately predicting the True Positives for that class. A high recall value indicates that the model is correctly identifying most of the True Positives for that class. Class/angle 1 had the highest precision and recall values, indicating that the model was able to correctly classify a high proportion of the samples in that class. Classes/angles 2, 5, 6, 7, and 8 also had high precision and recall values, indicating that the model was able to classify the majority of samples in those classes correctly. However, for classes/angles 3 and 4, the model had lower precision and recall values, which means that the model struggled to correctly classify samples in those classes. This may indicate that the features that distinguish classes 3 and 4 from the other classes are more difficult to capture or that the model needs further optimization to better classify samples in these classes/angles. The first neural network model performed rather well in categorizing the EEG motor recording signals into one of the 8 angles, although there is still space for improvement in several classes, according to the precision and recall values.
$\mathrm{Precision}(i) = \dfrac{\mathrm{True\ Positives}}{\mathrm{True\ Positives} + \mathrm{False\ Positives}}$  (2)

$\mathrm{Recall}(i) = \dfrac{\mathrm{True\ Positives}}{\mathrm{True\ Positives} + \mathrm{False\ Negatives}}$  (3)

$F_1\ \mathrm{Score}(i) = \dfrac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$  (4)

$\mathrm{Specificity}(i) = \dfrac{\mathrm{True\ Negatives}}{\mathrm{False\ Positives} + \mathrm{True\ Negatives}}$  (5)

$\mathrm{NPV}(i) = \dfrac{\mathrm{True\ Negatives}}{\mathrm{False\ Negatives} + \mathrm{True\ Negatives}}$  (6)

where $i = 1, 2, \ldots, 8$ is the class/angle number.
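As a worked illustration of Formulas (2)–(6), the per-class metrics can be computed directly from a confusion matrix laid out as in Table 1 (rows = predicted class, columns = true class); the function name is ours.

```python
import numpy as np

def per_class_metrics(cm):
    """Compute Formulas (2)-(6) for each class from an 8 x 8 confusion
    matrix with rows = predicted class and columns = true class."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fp = cm.sum(axis=1) - tp      # predicted as class i, actually another class
    fn = cm.sum(axis=0) - tp      # actually class i, predicted as another class
    tn = cm.sum() - tp - fp - fn
    precision = tp / (tp + fp)                                 # Formula (2)
    recall = tp / (tp + fn)                                    # Formula (3)
    f1 = 2 * precision * recall / (precision + recall)         # Formula (4)
    specificity = tn / (fp + tn)                               # Formula (5)
    npv = tn / (fn + tn)                                       # Formula (6)
    return precision, recall, f1, specificity, npv
```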
The F1 score provides a balance between precision and recall, and it is calculated as the harmonic mean of those two, providing a single metric to evaluate the overall performance of a classifier. In our study, we calculated the F1 score for each class of the first model to evaluate its performance in predicting the virtual arm movement tasks. The results indicate that the model performed well overall, with F1 scores ranging from 0.76 to 0.94. The class with the highest F1 score (0.94) was class 8, which corresponds to the virtual arm movement task of reaching for the bottom-right target. This indicates that the model was particularly accurate when predicting this movement compared to other movements. The next highest F1 scores were observed for class 1 (0.90) and class 7 (0.91), corresponding to the virtual arm movements of reaching for the top-right and bottom-left targets, respectively. The lowest F1 scores were observed for class 3 (0.77) and class 4 (0.76), corresponding to the virtual arm movements of reaching for the middle targets.
The specificity of a model represents the ability of the model to correctly identify True Negatives, that is, the percentage of negative cases that the model correctly identifies. In our study, the specificity of the model ranged from 0.97 to 0.99 across all classes. This indicates that the model was effective for identifying negative cases and reducing False Positives. The Negative Predictive Value (NPV) represents the probability that a negative prediction from the model is accurate. In our study, the NPV ranged from 0.96 to 1.00 across all classes, with most classes achieving a value of 0.98 or higher. This indicates that the model was able to accurately predict negative cases, which is particularly important in cases when trying to replace hand controllers during the Virtual Reality simulation.
For the second LSTM, we evaluated the performance by comparing the predicted hand position to the actual hand position for each of the 30 bins at the (x, y, z) positions. The root mean square deviation (RMSD, also commonly called the root mean squared error) was used as a measure of the distance between the predicted and actual hand positions. The RMSD measures the average deviation of the predicted values from the actual values and is calculated by taking the square root of the average squared differences between the predicted and actual values; a lower RMSD indicates better performance of the second LSTM in predicting the virtual hand movement based on the EEG recordings. We measured the RMSD values in two variations for the second LSTM. The first variation grouped the errors by the class/angle to which each predicted movement belonged (Formula (7)). We present the boxplot for each class (Figure 10). There were no significant differences between the RMSDs of the classes/angles: the error between predicted and actual movements did not depend on the targeted virtual object to which each movement belonged.
$\mathrm{RMSD}(i,j) = \sqrt{\dfrac{\sum_{n=1}^{N}\left(x_n^{a}(i,j) - x_n^{e}(i,j)\right)^2}{N}}$  (7)
  • $\mathrm{RMSD}(i,j)$: root mean square deviation of test trial $(i, j)$;
  • $i = \{1, 2, \ldots, 8\}$: the class/angle number;
  • $j = \{1, 2, \ldots, 100\}$: the test-trial number within each class/angle;
  • $n$: the summation index over bins;
  • $N = 30$: the number of data points (bins) per trial;
  • $x_n^{a}$: actual (x, y, z) position; $x_n^{e}$: estimated (x, y, z) position.
However, when we calculated the RMSDs based on the bin/position, there was a significant difference between the starting and ending positions (Formula (8)). As seen from the error bars, the accuracy of the position prediction improved as the movement progressed (Figure 11). Initially, the algorithm made incorrect predictions (high RMSDs), but as time went on and the movement continued, the algorithm made better movement predictions (lower RMSDs).
$\mathrm{RMSD}(i) = \sqrt{\dfrac{\sum_{n=1}^{N}\left(x_n^{a}(i) - x_n^{e}(i)\right)^2}{N}}$  (8)
  • $\mathrm{RMSD}(i)$: root mean square deviation of bin $i$ across all test trials;
  • $i = \{1, 2, \ldots, 30\}$: the bin number;
  • $n$: the summation index over test trials;
  • $N = 800$: the number of test trials contributing to each bin;
  • $x_n^{a}$: actual (x, y, z) position; $x_n^{e}$: estimated (x, y, z) position.
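As an illustration of the two variants, Formulas (7) and (8) can be computed from arrays of predicted and actual trajectories as follows. Treating the (x, y, z) components as a Euclidean distance per bin is our assumption, since the text does not state how the coordinates are combined.

```python
import numpy as np

# Hypothetical shapes: 800 test trials x 30 bins x 3 coordinates (x, y, z)
actual = np.random.randn(800, 30, 3)
predicted = actual + 0.1 * np.random.randn(800, 30, 3)
labels = np.repeat(np.arange(8), 100)       # class/angle of each test trial

# Squared Euclidean distance between predicted and actual position per bin
sq_dist = ((actual - predicted) ** 2).sum(axis=-1)   # (800, 30)

# Formula (7): one RMSD per test trial (averaged over its 30 bins),
# then grouped by class/angle for the per-class boxplots (Figure 10)
rmsd_per_trial = np.sqrt(sq_dist.mean(axis=1))       # (800,)
rmsd_by_class = [rmsd_per_trial[labels == k] for k in range(8)]

# Formula (8): one RMSD per bin (averaged over all 800 test trials),
# showing how the error evolves as the movement progresses (Figure 11)
rmsd_per_bin = np.sqrt(sq_dist.mean(axis=0))         # (30,)
```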
In addition to assessing the performance of our LSTM models using metrics such as the confusion matrix and RMSD, we also utilized DeepSHAP (Deep Shapley Additive Explanations) to interpret the networks' predictions. DeepSHAP is a method that attributes importance scores to each input feature, which can be represented visually as a heatmap indicating which features have the most significant impacts on the model's prediction. Specifically, we used the DeepSHAP package to generate saliency maps for each virtual arm movement task during the VR simulation, highlighting the EEG signal regions that had the most substantial influences on the LSTM model's output. Our results demonstrate that the first LSTM model relied heavily on specific frequency bands in the EEG signals, particularly in the beta range (Figure 12). We found that the model assigned more weight to certain electrodes over others, with the motor cortex electrodes having the highest relevance scores in areas around channels F1, FZ, F2, FC3, and FC1. Similarly, the second LSTM model also depended significantly on specific frequency bands in the EEG signals, particularly in the gamma range. The model also placed more weight on certain electrodes over others, with the motor cortex electrodes having the highest relevance scores in areas around channels FC5, FC3, FC1, C3, and C1.
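For readers who wish to reproduce this kind of analysis, a hedged sketch using the shap package and the PositionClassifier sketched in Section 2 is given below. This mirrors the described workflow but is not the authors' exact code, and DeepExplainer's support for recurrent layers varies across shap versions (shap.GradientExplainer is a drop-in alternative with the same call pattern).

```python
import numpy as np
import shap
import torch

# `PositionClassifier` refers to the classifier sketched in Section 2
model = PositionClassifier().eval()
background = torch.randn(100, 15, 35)   # reference trials for the explainer
test_batch = torch.randn(8, 15, 35)     # trials to explain

# DeepExplainer implements DeepSHAP; substitute GradientExplainer if the
# installed shap version does not support recurrent layers
explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(test_batch)  # one attribution array per class

# Averaging absolute attributions over trials and time bins ranks the
# channels, which can then be rendered as a scalp heatmap
channel_relevance = np.abs(shap_values[0]).mean(axis=(0, 1))  # (35,)
```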

4. Discussion

4.1. Long Short-Term Memory (LSTM) Advantages

The high accuracy achieved by our LSTM model for predicting movement intentions based on EEG signals recorded from the motor cortex is promising, with an average accuracy of 85%. This suggests that our approach has the potential to provide an effective alternative to traditional Virtual Reality controllers. As Virtual Reality becomes more widely adopted in a variety of settings, there is a growing need for more natural and intuitive interfaces that can replace or supplement existing controllers. Our results suggest that EEG-based BCIs could be a viable solution, offering users a more natural and intuitive way to interact with virtual environments. In more detail, we chose the LSTM over the traditional RNN for decoding EEG signals during Virtual Reality tasks. The LSTM’s unique architecture allows it to capture long-term dependencies in sequential data, which is particularly beneficial for EEG signals with inherent noise. Deep neural networks typically struggle with the decoding of EEG signals due to their focus on local recording areas that can be affected by artifacts [46]. However, the LSTM attempts to extract information from both sequential information and the general behavior of the EEG signal, taking into account information from other electrodes [35,37,38]. LSTMs are able to effectively capture these long-term dependencies by using a specialized architecture that includes gates that allow information to be selectively retained or forgotten. This ability to selectively retain or forget information is what enables LSTMs to track long-term dependencies, and it makes them particularly well-suited for tasks involving time series data. This approach leads to a more robust prediction model, as suggested by several studies. Therefore, LSTM’s ability to capture long-term dependencies in EEG data series makes it a more effective tool than traditional RNNs for decoding EEG signals in Virtual Reality tasks.

4.2. Artifacts and Noise Minimization

EEG recordings can be contaminated by noise from various sources, such as electrical equipment, environmental factors, electrode drift, muscle activity, eye blinks, and heart rate. Line noise from electrical equipment typically appears as peaks in the EEG power spectrum at 50 Hz or 60 Hz, depending on the frequency of the local power grid. In our study, we observed line noise at 50 Hz and its harmonics in some EEG channels, which we removed using a notch filter. Other sources of noise, such as muscle activity and eye blinks, can occur across a broad range of frequencies, often overlapping with the frequency ranges of EEG signals of interest. To minimize the impact of these artifacts, we employed a preprocessing pipeline that included band-pass filtering (0.5–35 Hz), artifact rejection using a threshold-based approach, and an independent component analysis (ICA) to remove eye movement artifacts. We also designed the VR simulation in a way that minimized the potential for extraneous muscle activity and movement-related artifacts that might have been introduced by larger body movements. Moreover, in our study, we attempted to reduce movement artifacts by instructing the participants to only move their dominant arm while keeping the rest of their body still. Finally, it should be noted that while we made efforts to remove various types of artifacts from the EEG data, some non-neural artifacts, such as EMG, may have still been present in the signals. Therefore, some of the features extracted by the LSTM model may have incorporated these artifacts and not purely reflected neural activity.

4.3. Participant Limitations

The limitations of the participant demographics in this study are worth noting. In this study, the participants were limited to university students or staff who had experience with similar experiments, which could have introduced bias into the results. The familiarity of the participants with the study procedures may have influenced their performance, making it difficult to generalize the findings to a more diverse population. However, it is important to note that the primary purpose of this study was to establish the feasibility of using the proposed system for motor cortex decoding, rather than to generalize the findings to a broader population. It is acknowledged that if this system is generalized to participants who do not have experience with such a system, the EEG recordings during the VR simulation could be different. Nevertheless, this study is only the first step towards a more extensive research program aimed at expanding the use of this system for patients with mental and neurological illnesses. Future studies should seek to broaden the participant demographics and explore the use of the proposed system in diverse populations to better understand its potential clinical applications.

4.4. Virtual Reality Task Limitations

One important limitation of the study is the repetition of simple movement tasks used in the experiment. While this approach allowed for a more detailed analysis of the neural activity involved in arm movement and more accurate decoding of the participants’ intended movements based on the EEG data, it also limits the generalization of the model. The participants were asked to repeat the same movement 50 times for each task, which may not accurately reflect real-world scenarios where movements are often more complex and varied. Furthermore, the simplicity of the movements may not be representative of the range of movements that people typically perform in their daily lives. Despite these limitations, the use of a relatively repetitive set of movements across multiple virtual objects provides a strong foundation for developing and testing the neural decoder. This approach ensures that the results are robust and generalizable across a range of different participants and movements. However, it is important to acknowledge that the purpose of this study was limited to establishing the feasibility of using the proposed system for motor cortex decoding, and future studies should aim to expand the range of movements and scenarios tested to better understand the potential clinical applications of the system.

4.5. Motor Cortex Activeness

The primary objective of this study was to demonstrate the potential for machine learning to replace traditional hand controllers during Virtual Reality simulations. Although the main focus was not the interpretation of the EEG data used in this research, the results of the DeepSHAP analysis provide some useful insights into the underlying neural mechanisms involved in controlling virtual arm movements using EEG signals. During the VR simulation, the EEG data recorded from the motor cortex of the brain were found to be highly informative for predicting movement intentions. This finding is consistent with previous studies that have shown that the motor cortex is involved in planning, executing, and controlling movements of the limbs. Additionally, the DeepSHAP analysis revealed that both LSTM models rely heavily on specific frequency bands in the EEG signals: the beta oscillation range (~13–30 Hz) for the first model and the gamma range for the second. This finding is in line with previous studies that have shown that beta and gamma oscillations are associated with visual stimuli and pre-movement [47,48,49] and movement activity [50,51,52,53,54,55]. Furthermore, the relevance scores obtained from the DeepSHAP analysis indicate that certain electrodes in the motor cortex have greater impacts on the LSTM models' outputs. The motor cortex electrodes with the highest relevance scores were found to be located in areas around channels F1, FZ, F2, FC3, and FC1 for the first model and FC5, FC3, FC1, C3, and C1 for the second model. These findings suggest that the frontal and parietal lobes are involved in arm movements and that the cerebellum fine-tunes this movement, playing a crucial role in determining the limb position.

5. Conclusions

In this study, we demonstrated the potential of combining EEG and VR to create an immersive and interactive experience for users. We developed two neural networks, a classification network and a regression network, both with an LSTM architecture, to decode hand positions from the EEG data. Our evaluation of the networks using various metrics showed that both networks performed well, despite some demographic limitations. Our study suggests that LSTM’s ability to capture long-term dependencies in EEG data series makes it a more effective tool than traditional RNNs for decoding EEG signals in Virtual Reality tasks. Our results indicate that the LSTM model can accurately decode neural signals recorded from the motor cortex during Virtual Reality simulations and identify the key features in the EEG signals that contribute to the model’s predictions.
Our team has been focused on enhancing the immersive qualities of VR simulations for the treatment of mental and neurological conditions. Our goal has been to create virtual environments that are as realistic and immersive as possible to optimize the therapeutic effects of this type of treatment. Initially, we began by exploring ways to replace traditional VR controllers with motion-tracking cameras in order to enhance the immersive qualities [56,57,58]. Following that, we created a system that incorporates doctors in VR simulations during treatment [28]. The COVID-19 pandemic highlighted the need for more remote, interactive treatment systems that allow doctors and patients to communicate and work together remotely [59]. After that, we began to use electrophysiology body sensors to track the behaviors and responses of users during VR experiences [60,61]; by using these sensors and incorporating this real-time adaptive capability, we aimed to enhance the therapeutic value of VR exposure therapy [27,62,63,64]. Moreover, most research on the use of EEG in VR has focused on the active or reactive modulation of EEG signals to directly control or interact with the virtual environment [65,66].
Our next goal is to incorporate this system into a Virtual Reality simulation with the aim of completing more complex movements with a more diverse population with the end goal being to use it as a tool for treating neurological or mental disorders. This approach has the potential to provide a noninvasive and engaging treatment option for these disorders. However, further research is needed to test the feasibility and effectiveness of this approach in real-world scenarios and to optimize the model for various users and scenarios.

Author Contributions

Conceptualization, J.K., A.M. and A.A.; methodology, A.A.; software, J.K. and A.M.; validation, G.A. and D.K.; formal analysis, A.A.; investigation, J.K., A.M. and A.A.; resources, J.K.; data curation, J.K., A.M. and A.A.; writing—original draft preparation, J.K.; writing—review and editing, J.K., A.M. and A.A.; visualization, J.K., A.M. and A.A.; supervision, G.A. and D.K.; project administration, G.A. and D.K.; funding acquisition, G.A. and D.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kandel, E.R.; Schwartz, J.H.; Jessell, T.M. (Eds.) Principles of Neural Science, 4th ed.; McGraw-Hill: New York, NY, USA, 2000.
  2. Bear, M.F.; Connors, B.W.; Paradiso, M.A. Neuroscience: Exploring the Brain, 4th ed.; Jones & Bartlett Learning: Burlington, MA, USA, 2015.
  3. Donoghue, J.P. Connecting cortex to machines: Recent advances in brain interfaces. Nat. Neurosci. 2002, 5, 1085–1088.
  4. Krusienski, D.J.; McFarland, D.J.; Principe, J.C.; Wolpaw, J.; Wolpaw, E.W. BCI Signal Processing: Feature Extraction. In Brain-Computer Interfaces: Principles and Practice; Oxford University Press: New York, NY, USA, 2012.
  5. Schwartz, A.B. Cortical Neural Prosthetics. Annu. Rev. Neurosci. 2004, 27, 487–507.
  6. Handbook of Clinical Neurology, 3rd Series; Elsevier: Amsterdam, The Netherlands, 2012.
  7. Zhao, X.; Chu, Y.; Han, J.; Zhang, Z. SSVEP-Based Brain–Computer Interface Controlled Functional Electrical Stimulation System for Upper Extremity Rehabilitation. IEEE Trans. Syst. Man Cybern. Syst. 2016, 46, 947–956.
  8. Thorp, E.B.; Abdollahi, F.; Chen, D.; Farshchiansadegh, A.; Lee, M.-H.; Pedersen, J.P.; Pierella, C.; Roth, E.J.; Gonzalez, I.S.; Mussa-Ivaldi, F.A. Upper Body-Based Power Wheelchair Control Interface for Individuals with Tetraplegia. IEEE Trans. Neural Syst. Rehabil. Eng. 2015, 24, 249–260.
  9. Muller, S.M.T.; Bastos-Filho, T.F.; Sarcinelli-Filho, M. Using a SSVEP-BCI to command a robotic wheelchair. In Proceedings of the 2011 IEEE International Symposium on Industrial Electronics (ISIE), Gdansk, Poland, 27–30 June 2011.
  10. Penaloza, C.I.; Nishio, S. BMI control of a third arm for multitasking. Sci. Robot. 2018, 3, eaat1228.
  11. Peterson, V.; Galván, C.; Hernández, H.; Spies, R. A feasibility study of a complete low-cost consumer-grade brain-computer interface system. Heliyon 2020, 6, e03425.
  12. Negoita, S.; Boone, C.; Anderson, W.S. Long-term Training with a Brain-Machine Interface-Based Gait Protocol Induces Partial Neurological Recovery in Paraplegic Patients. Neurosurgery 2016, 79, N22–N24.
  13. Chase, S.M.; Kass, R.E.; Schwartz, A.B.; Lebedev, M.A.; Nicolelis, M.A.L.; Boulay, C.B.; Pieper, F.; Leavitt, M.; Martinez-Trujillo, J.; Sachs, A.J.; et al. Behavioral and neural correlates of visuomotor adaptation observed through a brain-computer interface in primary motor cortex. J. Neurophysiol. 2012, 108, 624–644.
  14. Mirabella, G.; Lebedev, M. Interfacing to the brain's motor decisions. J. Neurophysiol. 2017, 117, 1305–1319.
  15. Horikawa, T.; Aoki, S.C.; Tsukamoto, M.; Kamitani, Y. Characterization of deep neural network features by decodability from human brain activity. Sci. Data 2019, 6, 190012.
  16. Ye, H.; Liang, L.; Li, G.Y.; Juang, B.-H. Deep Learning-Based End-to-End Wireless Communication Systems with Conditional GANs as Unknown Channels. IEEE Trans. Wirel. Commun. 2020, 19, 3133–3143.
  17. Wang, X.; Liang, X.; Jiang, Z.; Nguchu, B.A.; Zhou, Y.; Wang, Y.; Wang, H.; Li, Y.; Zhu, Y.; Wu, F.; et al. Decoding and mapping task states of the human brain via deep learning. Hum. Brain Mapp. 2019, 41, 1505–1519.
  18. Hussain, I.; Park, S.J. Big-ECG: Cardiographic Predictive Cyber-Physical System for Stroke Management. IEEE Access 2021, 9, 123146–123164.
  19. Park, S.J.; Hussain, I.; Hong, S.; Kim, D.; Park, H.; Benjamin, H.C.M. Real-time Gait Monitoring System for Consumer Stroke Prediction Service. In Proceedings of the 2020 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 4–6 January 2020.
  20. Hussain, I.; Young, S.; Kim, C.H.; Benjamin, H.C.M.; Park, S.J. Quantifying Physiological Biomarkers of a Microwave Brain Stimulation Device. Sensors 2021, 21, 1896.
  21. Hussain, I.; Park, S.-J. Prediction of Myoelectric Biomarkers in Post-Stroke Gait. Sensors 2021, 21, 5334.
  22. Hussain, I.; Park, S.-J. Quantitative Evaluation of Task-Induced Neurological Outcome after Stroke. Brain Sci. 2021, 11, 900.
  23. Hussain, I.; Park, S.J. HealthSOS: Real-Time Health Monitoring System for Stroke Prognostics. IEEE Access 2020, 8, 213574–213586.
  24. Blakey, S.M.; Abramowitz, J.S. The effects of safety behaviors during exposure therapy for anxiety: Critical analysis from an inhibitory learning perspective. Clin. Psychol. Rev. 2016, 49, 1–15.
  25. Bun, P.; Gorski, F.; Grajewski, D.; Wichniarek, R.; Zawadzki, P. Low-Cost Devices Used in Virtual Reality Exposure Therapy. Procedia Comput. Sci. 2017, 104, 445–451.
  26. Gainsford, K.; Fitzgibbon, B.; Fitzgerald, P.B.; Hoy, K.E. Transforming treatments for schizophrenia: Virtual reality, brain stimulation and social cognition. Psychiatry Res. 2020, 288, 112974.
  27. Kritikos, J.; Alevizopoulos, G.; Koutsouris, D. Personalized Virtual Reality Human-Computer Interaction for Psychiatric and Neurological Illnesses: A Dynamically Adaptive Virtual Reality Environment that Changes According to Real-Time Feedback from Electrophysiological Signal Responses. Front. Hum. Neurosci. 2021, 15, 596980.
  28. Caravas, P.; Kritikos, J.; Alevizopoulos, G.; Koutsouris, D. Participant Modeling: The Use of a Guided Master in the Modern World of Virtual Reality Exposure Therapy Targeting Fear of Heights. In Proceedings of the Wearables in Healthcare: Second EAI International Conference, HealthWear 2020, Virtual Event, 10–11 December 2020; Springer: Berlin/Heidelberg, Germany, 2021; Volume 376, pp. 161–174.
  29. Song, F. 3D Virtual Reality Implementation of Tourist Attractions Based on the Deep Belief Neural Network. Comput. Intell. Neurosci. 2021, 2021, 9004797.
  30. Jeong, D.; Yoo, S.; Jang, Y. Motion Sickness Measurement and Analysis in Virtual Reality using Deep Neural Networks Algorithm. J. Korea Comput. Graph. Soc. 2019, 25, 23–32.
  31. Xu, S.; Wang, Z.; Sun, J.; Zhang, Z.; Wu, Z.; Yang, T.; Xue, G.; Cheng, C. Retraction to: Using a deep recurrent neural network with EEG signal to detect Parkinson's disease. Ann. Transl. Med. 2021, 9, 1396.
  32. Ma, Q.; Wang, M.; Hu, L.; Zhang, L.; Hua, Z. A Novel Recurrent Neural Network to Classify EEG Signals for Customers' Decision-Making Behavior Prediction in Brand Extension Scenario. Front. Hum. Neurosci. 2021, 15, 610890.
  33. Brantley, J.A.; Luu, T.P.; Ozdemir, R.; Zhu, F.; Winslow, A.T.; Huang, H.; Contreras-Vidal, J.L. Noninvasive EEG correlates of overground and stair walking. In Proceedings of the 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, FL, USA, 16–20 August 2016.
  34. Naufel, S.; Glaser, J.I.; Kording, K.P.; Perreault, E.; Miller, L.E. A muscle-activity-dependent gain between motor cortex and EMG. J. Neurophysiol. 2019, 121, 61–73.
  35. Hofmann, S.M.; Klotzsche, F.; Mariola, A.; Nikulin, V.; Villringer, A.; Gaebler, M. Decoding subjective emotional arousal from EEG during an immersive virtual reality experience. eLife 2021, 10, e64812.
  36. Nakagome, S.; Luu, T.P.; He, Y.; Ravindran, A.S.; Contreras-Vidal, J.L. An empirical comparison of neural networks and machine learning algorithms for EEG gait decoding. Sci. Rep. 2020, 10, 4372.
  37. Jeong, J.-H.; Shim, K.-H.; Kim, D.-J.; Lee, S.-W. Brain-Controlled Robotic Arm System Based on Multi-Directional CNN-BiLSTM Network Using EEG Signals. IEEE Trans. Neural Syst. Rehabil. Eng. 2020, 28, 1226–1238.
  38. Tortora, S.; Ghidoni, S.; Chisari, C.; Micera, S.; Artoni, F. Deep learning-based BCI for gait decoding from EEG with LSTM recurrent neural network. J. Neural Eng. 2020, 17, 046011.
  39. Wu, J.; Ma, Y.; Ren, Z. Rehabilitative Effects of Virtual Reality Technology for Mild Cognitive Impairment: A Systematic Review with Meta-Analysis. Front. Psychol. 2020, 11, 1811.
  40. Repovš, G. Dealing with Noise in EEG Recording and Data Analysis. Inform. Med. Slov. 2010, 15, 18–25.
  41. Leske, S.; Dalal, S. Reducing power line noise in EEG and MEG data via spectrum interpolation. Neuroimage 2019, 189, 763–776.
  42. Shad, E.H.T.; Molinas, M.; Ytterdal, T. Impedance and Noise of Passive and Active Dry EEG Electrodes: A Review. IEEE Sens. J. 2020, 20, 14565–14577.
  43. Che, Z.; Purushotham, S.; Cho, K.; Sontag, D.; Liu, Y. Recurrent Neural Networks for Multivariate Time Series with Missing Values. Sci. Rep. 2018, 8, 6085.
  44. Borovkova, S.; Tsiamas, I. An ensemble of LSTM neural networks for high-frequency stock market classification. J. Forecast. 2019, 38, 600–619.
  45. Chen, D.; Zhang, J.; Jiang, S. Forecasting the Short-Term Metro Ridership with Seasonal and Trend Decomposition Using Loess and LSTM Neural Networks. IEEE Access 2020, 8, 91181–91187.
  46. Pooja; Pahuja, S.; Veer, K. Recent Approaches on Classification and Feature Extraction of EEG Signal: A Review. Robotica 2021, 40, 77–101.
  47. Mohseni, M.; Shalchyan, V.; Jochumsen, M.; Niazi, I.K. Upper limb complex movements decoding from pre-movement EEG signals using wavelet common spatial patterns. Comput. Methods Programs Biomed. 2020, 183, 105076.
  48. Sburlea, A.I.; Montesano, L.; de la Cuerda, R.C.; Diego, I.M.A.; Miangolarra-Page, J.C.; Minguez, J. Detecting intention to walk in stroke patients from pre-movement EEG correlates. J. Neuroeng. Rehabil. 2015, 12, 113.
  49. Hasan, S.M.S.; Siddiquee, M.R.; Atri, R.; Ramon, R.; Marquez, J.S.; Bai, O. Prediction of gait intention from pre-movement EEG signals: A feasibility study. J. Neuroeng. Rehabil. 2020, 17, 50.
  50. Özmen, N.G. EEG Analysis of Real and Imaginary Arm Movements by Spectral Coherence. Uludağ Univ. J. Fac. Eng. 2021, 26, 109–126.
  51. Filimon, F. Human Cortical Control of Hand Movements: Parietofrontal Networks for Reaching, Grasping, and Pointing. Neuroscientist 2010, 16, 388–407.
  52. Ball, T.; Demandt, E.; Mutschler, I.; Neitzel, E.; Mehring, C.; Vogt, K.; Aertsen, A.; Schulze-Bonhage, A. Movement related activity in the high gamma range of the human EEG. Neuroimage 2008, 41, 302–310.
  53. Kim, J.-H.; Biessmann, F.; Lee, S.-W. Decoding Three-Dimensional Trajectory of Executed and Imagined Arm Movements from Electroencephalogram Signals. IEEE Trans. Neural Syst. Rehabil. Eng. 2015, 23, 867–876.
  54. Schellekens, W.; Bakker, C.; Ramsey, N.F.; Petridou, N. Moving in on human motor cortex. Characterizing the relationship between body parts with non-rigid population response fields. PLoS Comput. Biol. 2022, 18, e1009955.
  55. Planelles, D.; Hortal, E.; Costa, Á.; Úbeda, A.; Iáñez, E.; Azorín, J.M. Evaluating Classifiers to Detect Arm Movement Intention from EEG Signals. Sensors 2014, 14, 18172–18186.
  56. Kritikos, J.; Zoitaki, C.; Tzannetos, G.; Mehmeti, A.; Douloudi, M.; Nikolaou, G.; Alevizopoulos, G.; Koutsouris, D. Comparison between Full Body Motion Recognition Camera Interaction and Hand Controllers Interaction used in Virtual Reality Exposure Therapy for Acrophobia. Sensors 2020, 20, 1244.
  57. Kritikos, J.; Mehmeti, A.; Nikolaou, G.; Koutsouris, D. Fully portable low-cost motion capture system with real-time feedback for rehabilitation treatment. In Proceedings of the International Conference on Virtual Rehabilitation (ICVR), Tel Aviv, Israel, 21–24 July 2019.
  58. Kritikos, J.; Poulopoulou, S.; Zoitaki, C.; Douloudi, M.; Koutsouris, D. Full Body Immersive Virtual Reality System with Motion Recognition Camera Targeting the Treatment of Spider Phobia. In Pervasive Computing Paradigms for Mental Health; Cipresso, P., Serino, S., Villani, D., Eds.; Springer International Publishing: Berlin/Heidelberg, Germany, 2019; Volume 288, pp. 216–230.
  59. Alevizopoulos, A.; Kritikos, J.; Alevizopoulos, G. Intelligent machines and mental health in the era of COVID-19. Psychiatriki 2021, 32, 99–102.
  60. Kritikos, J.; Caravas, P.; Tzannetos, G.; Douloudi, M.; Koutsouris, D. Emotional stimulation during motor exercise: An integration to the holistic rehabilitation framework. In Proceedings of the 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; pp. 4604–4610.
  61. Kritikos, J.; Tzannetos, G.; Zoitaki, C.; Poulopoulou, S.; Koutsouris, D. Anxiety detection from Electrodermal Activity Sensor with movement & interaction during Virtual Reality Simulation. In Proceedings of the International IEEE/EMBS Conference on Neural Engineering (NER), San Francisco, CA, USA, 20–23 March 2019.
  62. Tromp, J.; Peeters, D.; Meyer, A.S.; Hagoort, P. The combined use of virtual reality and EEG to study language processing in naturalistic environments. Behav. Res. Methods 2017, 50, 862–869.
  63. Baumgartner, T.; Valko, L.; Esslen, M.; Jäncke, L. Neural Correlate of Spatial Presence in an Arousing and Noninteractive Virtual Reality: An EEG and Psychophysiology Study. CyberPsychology Behav. 2006, 9, 30–45.
  64. Bayliss, J.D.; Ballard, D.H. Single trial P3 epoch recognition in a virtual environment. Neurocomputing 2000, 32–33, 637–642. [Google Scholar] [CrossRef]
  65. Vortmann, L.-M.; Kroll, F.; Putze, F. EEG-Based Classification of Internally- and Externally-Directed Attention in an Augmented Reality Paradigm. Front. Hum. Neurosci. 2019, 13, 348. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  66. Tremmel, C.; Herff, C.; Sato, T.; Rechowicz, K.; Yamani, Y.; Krusienski, D.J. Estimating Cognitive Workload in an Interactive Virtual Reality Environment Using EEG. Front. Hum. Neurosci. 2019, 13, 401. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Brain–Machine Interface workflow.
Figure 2. The virtual environment with objects placed at different positions.
Figure 3. Eight angles that each participant had to approach in order to complete the 8 different motor tasks: (a) task/angle 1; (b) task/angle 2; (c) task/angle 3; (d) task/angle 4; (e) task/angle 5; (f) task/angle 6; (g) task/angle 7; and (h) task/angle 8.
Figure 4. EEG activity was recorded over the motor cortex using a 35-channel cap; the map of the 35 channel positions over the motor cortex is shown.
Figure 5. The EEG channel recordings were divided into three parts: a pre-movement period of 300 milliseconds (ms), a during-movement period of 600 ms, and a post-movement period of 100 ms. This split allowed neural activity to be analyzed at each stage of the movement process: preparation (pre-movement), execution (during-movement), and post-movement activity.
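To make this segmentation concrete, the following minimal Python sketch slices one trial at its movement-onset sample into the three periods of Figure 5. It is illustrative only: the paper does not publish code, and the 500 Hz sampling rate, function names, and onset index are assumptions.

```python
import numpy as np

FS = 500                                  # assumed sampling rate (Hz)
PRE_MS, DUR_MS, POST_MS = 300, 600, 100   # window lengths from Figure 5

def segment_trial(eeg, onset_idx, fs=FS):
    """Split one trial (channels x samples) at the movement-onset sample
    into pre-movement, during-movement, and post-movement periods."""
    pre  = eeg[:, onset_idx - PRE_MS * fs // 1000 : onset_idx]
    dur  = eeg[:, onset_idx : onset_idx + DUR_MS * fs // 1000]
    post = eeg[:, onset_idx + DUR_MS * fs // 1000 :
                  onset_idx + (DUR_MS + POST_MS) * fs // 1000]
    return pre, dur, post

# e.g., a 35-channel, 2 s trial with movement onset at 1 s:
trial = np.random.randn(35, 2 * FS)
pre, dur, post = segment_trial(trial, onset_idx=FS)
print(pre.shape, dur.shape, post.shape)   # (35, 150) (35, 300) (35, 50)
```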
Figure 6. The 3-D hand movement trajectory, in (x, y, z) coordinates, of one selected trial.
Figure 7. EEG recording storage allowed the data to be organized and analyzed efficiently and facilitated the training and testing of the neural networks. (a) EEG recordings were stored in two-dimensional arrays, with rows representing the EEG channels and columns representing the time windows (bins); (b) the pre-movement period, with a duration of 300 milliseconds (ms), was stored as 15 bins in the second array; (c) the during-movement period, with a duration of 600 ms, was stored as 30 bins in the third array.
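The bin arrays of Figure 7 can be produced by aggregating each segment into consecutive 20 ms windows (15 bins for the 300 ms pre-movement period, 30 bins for the 600 ms during-movement period). A minimal sketch, assuming 500 Hz sampling and mean aggregation within each bin; the paper does not state the aggregation rule:

```python
import numpy as np

FS = 500  # assumed sampling rate (Hz); one 20 ms bin then spans 10 samples

def bin_eeg(segment, fs=FS, bin_ms=20):
    """Average a (channels x samples) segment into consecutive 20 ms bins,
    yielding a (channels x n_bins) array like those in Figure 7."""
    spb = fs * bin_ms // 1000                      # samples per bin
    n_bins = segment.shape[1] // spb
    trimmed = segment[:, : n_bins * spb]           # drop any trailing remainder
    return trimmed.reshape(segment.shape[0], n_bins, spb).mean(axis=2)

pre = np.random.randn(35, 150)    # 300 ms pre-movement segment, 35 channels
dur = np.random.randn(35, 300)    # 600 ms during-movement segment
print(bin_eeg(pre).shape, bin_eeg(dur).shape)   # (35, 15) (35, 30)
```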
Figure 8. LSTM architecture of the first neural network. The input layer contains 35 neurons, one per EEG channel. The first hidden layer consists of LSTM blocks of size 15, followed by two additional hidden layers of sizes 15 and 20, respectively. The output layer has a size of 8, corresponding to the number of different positions.
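The architecture of Figure 8 can be expressed as a stacked recurrent classifier over the 15 pre-movement bins. The sketch below uses Keras as an assumption (the paper does not name its framework); the types of the second and third hidden layers, the activations, and the training objective are illustrative choices consistent with the stated layer sizes.

```python
from tensorflow.keras import layers, models

# Input: sequences of 15 pre-movement bins, each holding 35 EEG-channel values.
model = models.Sequential([
    layers.LSTM(15, return_sequences=True, input_shape=(15, 35)),  # LSTM blocks of size 15
    layers.LSTM(15),                        # second hidden layer, size 15 (type assumed)
    layers.Dense(20, activation="relu"),    # third hidden layer, size 20 (type assumed)
    layers.Dense(8, activation="softmax"),  # one output per target position/angle
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",  # assumed training objective
              metrics=["accuracy"])
model.summary()
```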
Figure 9. LSTM architecture of the second neural network. The input layer contains 35 neurons, one per EEG channel. The first hidden layer consists of LSTM blocks of size 30, followed by two additional hidden layers of sizes 30 and 32, respectively. The output layer has a size of 30, corresponding to the number of time windows (bins).
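A corresponding sketch for the second network of Figure 9, again in Keras. The caption specifies the layer sizes but not the output semantics; since this network's errors are reported as per-bin RMSEs (Figures 10 and 11), the 30 outputs are treated here, as an assumption, as a regression target with one value per time bin, trained with mean-squared error.

```python
from tensorflow.keras import layers, models

# Input: sequences of 30 during-movement bins, each holding 35 EEG-channel values.
model2 = models.Sequential([
    layers.LSTM(30, return_sequences=True, input_shape=(30, 35)),  # LSTM blocks of size 30
    layers.LSTM(30),                      # second hidden layer, size 30 (type assumed)
    layers.Dense(32, activation="relu"),  # third hidden layer, size 32 (type assumed)
    layers.Dense(30),                     # one linear output per time bin (regression assumed)
])
model2.compile(optimizer="adam", loss="mse")  # MSE training pairs naturally with reported RMSEs
```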
Figure 10. The RMSEs of the second neural network were calculated for each of the 8 angles/classes.
Figure 11. The RMSEs of the second neural network were calculated for each of the 30 bins, with each bin representing a 20 ms window of movement over time at one of the eight positions.
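The RMSE values plotted in Figures 10 and 11 are presumably the standard root-mean-square error; the captions do not give the formula, so the conventional per-bin form is reproduced here for reference:

```latex
\mathrm{RMSE}_b = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_{i,b} - y_{i,b}\right)^{2}},
\qquad b = 1, \dots, 30,
```

where y_{i,b} is the recorded hand position in bin b of test trial i, \hat{y}_{i,b} is the network's prediction, and N is the number of test trials; the per-angle RMSEs of Figure 10 would, by the same definition, average over the trials of each class.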
Figure 12. The DeepSHAP heatmap. (a) The motor cortex electrodes with the highest relevance scores for the first neural model were located around channels F1, FZ, F2, FC3, and FC1; (b) the motor cortex electrodes with the highest relevance scores for the second neural model were located around channels FC5, FC3, FC1, C3, and C1.
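Electrode-relevance maps like those in Figure 12 can be obtained, in outline, with the shap package's DeepExplainer, which implements DeepSHAP for deep models. The sketch below is an assumption-laden illustration, not the authors' pipeline: the variable names continue the earlier sketches, the background-sample size is arbitrary, and the exact return shapes of shap_values vary across shap versions.

```python
import numpy as np
import shap  # Python package providing DeepSHAP via shap.DeepExplainer

# X_train, X_test: (trials, bins, channels) EEG arrays; model: a trained Keras LSTM
background = X_train[np.random.choice(len(X_train), 100, replace=False)]
explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(X_test)       # one attribution array per output class

# Average attribution magnitude over classes, trials, and time bins to obtain
# one relevance score per EEG channel (electrode), as in the Figure 12 heatmaps.
relevance = np.abs(np.stack(shap_values)).mean(axis=(0, 1, 2))   # shape: (35,)
print("Most relevant channel indices:", np.argsort(relevance)[::-1][:5])
```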
Table 1. Confusion Matrix of the First Neural Network.

                           Expected Class/Angles
Predicted Class/Angles    1    2    3    4    5    6    7    8
          1              88    7    3    0    0    2    0    0
          2               2   86    4    0    3    1    4    0
          3               0    2   82   14    0    2    0    0
          4               0    0   16   76    2    4    2    0
          5               0    0    1    9   81    5    4    0
          6               4    0    0    0    5   91    0    0
          7               0    8    1    0    0    0   91    0
          8               2    0    5    0    0    5    0   88
Table 2. Per-class performance metrics of the first neural network.

Class   Precision   Recall/Sensitivity   F1-Score   Specificity   Negative Predictive Value
  1       0.88            0.92             0.90        0.98               0.99
  2       0.86            0.83             0.85        0.98               0.98
  3       0.82            0.73             0.77        0.97               0.96
  4       0.76            0.76             0.76        0.97               0.97
  5       0.80            0.89             0.84        0.97               0.99
  6       0.91            0.83             0.87        0.99               0.97
  7       0.91            0.90             0.91        0.99               0.99
  8       0.88            1.00             0.94        0.98               1.00
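The entries of Table 2 follow from the confusion matrix in Table 1 by the standard one-vs-rest definitions, with rows as predicted classes and columns as expected classes, as labeled. A short verification sketch (for example, class 1: precision 88/100 = 0.88 and recall 88/96 ≈ 0.92, matching Table 2):

```python
import numpy as np

# Confusion matrix from Table 1 (rows: predicted class, columns: expected class)
cm = np.array([
    [88,  7,  3,  0,  0,  2,  0,  0],
    [ 2, 86,  4,  0,  3,  1,  4,  0],
    [ 0,  2, 82, 14,  0,  2,  0,  0],
    [ 0,  0, 16, 76,  2,  4,  2,  0],
    [ 0,  0,  1,  9, 81,  5,  4,  0],
    [ 4,  0,  0,  0,  5, 91,  0,  0],
    [ 0,  8,  1,  0,  0,  0, 91,  0],
    [ 2,  0,  5,  0,  0,  5,  0, 88],
])

tp = np.diag(cm).astype(float)
fp = cm.sum(axis=1) - tp      # predicted as this class but expected another
fn = cm.sum(axis=0) - tp      # expected this class but predicted another
tn = cm.sum() - tp - fp - fn

precision   = tp / (tp + fp)
recall      = tp / (tp + fn)                           # sensitivity
f1          = 2 * precision * recall / (precision + recall)
specificity = tn / (tn + fp)
npv         = tn / (tn + fn)                           # negative predictive value

print(np.round(np.c_[precision, recall, f1, specificity, npv], 2))
```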
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
