Article

Emotional Decision-Making Biases Prediction in Cyber-Physical Systems

Integrated Systems Laboratory, Center for Computational Simulation, Universidad Politécnica de Madrid, 28040 Madrid, Spain
* Authors to whom correspondence should be addressed.
Big Data Cogn. Comput. 2019, 3(3), 49; https://doi.org/10.3390/bdcc3030049
Submission received: 28 June 2019 / Revised: 21 August 2019 / Accepted: 28 August 2019 / Published: 30 August 2019

Abstract

This article addresses the challenge of discovering trends in decision-making by capturing emotional data and the influence of possible external stimuli. We conducted an experiment with a significant sample of the workforce and used machine-learning techniques to model the decision-making process. We studied the trends introduced by the emotional status and by the external stimulus that makes these personnel act or report to the supervisor. The main result of this study is a model capable of predicting the bias to act in a specific context. We also studied the relationship between emotions and the probability of acting or correcting the system. The main interest of these findings lies in the ability to influence personnel in advance so as to make their work more efficient and productive, which opens a whole new line of research for the future.

1. Introduction

Research in cyber-physical systems integrates the principles of science in the disciplines of computing and engineering to develop new technology. In industrial practice, many engineering systems have been designed by decoupling the control system from the hardware/software implementation details. After the control system is designed and verified by extensive simulation, ad hoc adjustment methods are used to address modeling discrepancies. The integration of the various subsystems, while keeping the system functional and operational, has been time-consuming and costly. The increasing complexity of components and the use of advanced technologies for sensors and actuators, wireless communications, and multicore processors pose a major challenge for building the next generation of control systems.
An important part of cyber-physical systems, one not commonly considered, is the human part. People are in continuous interaction with the system, and their decisions condition its productivity. We have been looking for a new paradigm that links the physical world to the human world, both working together as one, and we found tools from other knowledge areas that may be useful for the goal we pursue. The management of the emotions present in the workforce of any operations team is an excellent example of the subject of our study. In several research articles [1,2,3,4,5,6,7,8] we found the different relationships inside the psychological side of emotional decision-making, and how these emotions can be redirected thanks to an external stimulus trigger. In fact, our main driver is to find and use the link between psychology and the use of biometrics, traditional telemetry, or system signals, applying machine-learning techniques over a single view of the overall system’s behavior so as to be able to interact with it. We found different examples on which to build our use case [9,10]. A self-explanatory quote from Dr Heinström (2010) [11] is: “We are genetically programmed for a stronger and more instinctive reaction towards negative stimuli than towards positive or neutral ones. This is a survival mechanism developed through evolution, just like the automatic reaction to pain which protects us from physical suffering. The processing of negative stimuli is therefore automatic and immediate, and requires little cognitive capacity [10]. Consequently, every one of us reacts instinctively and intuitively to signals of danger, although some may react more strongly than others”.
This work evaluates, from the contextualized point of view of a cyber-physical system, the influence of stimuli and emotions on decision-making in the operations management tasks of a data center.
The study was made in two ways: from a general point of view and from a specific-purpose point of view. The general study was developed over the entire dataset, obtaining a random forest model capable of predicting with 85% accuracy whether a possible operator inside a data center acts or not. The specific-purpose model focused on the correlation between the external stimulus, the emotion after this stimulus (obtained through the questionnaires), and the final decision (acting/not acting) of the subject. In this case, another random forest model was used, which also predicted the action (or inaction) of the subject with 85% accuracy.

2. Materials and Methods

To implement the whole experiment, we selected a significant sample of people working in technology who must make decisions every day in their jobs. We used statistical tools to check that the sample is relevant for the study. Then, we decided which questions we needed them to answer and evaluated their ability to trigger actions or decisions based on emotions. We relied on the studies from psychology referenced across this article to build the questionnaire and make the right assumptions for designing the experiment.
After that, we decided on the best procedure and tools to determine the predictive model for decision-making, based on the information that we had. We used Python to develop the models with machine-learning techniques. Finally, we contrasted the results of the machine-learning techniques with other statistical procedures and tools to confirm the hypothesis, looking for a new way of corroborating findings from other disciplines, such as the humanities, social sciences, or health sciences, i.e., psychology.

2.1. About the Sample

In the quest to close the actuation loop on emotion awareness and influence, we decided to run an experiment with people of different sexes and ages. The total sample size is 100 and, to be sure that the sample was representative of the universe, we made some calculations. The calculation of the sample size is one of the aspects to be specified in the early phases of survey research, and it determines the degree of credibility that we will grant to the results obtained.
For the sample size estimation, n, we assumed a Gaussian distribution with a confidence level of 95.5% [12,13]:
n = \frac{k^2 \cdot p \cdot q \cdot N}{e^2 \cdot (N - 1) + k^2 \cdot p \cdot q}
N: It is the size of the population or universe (total number of possible respondents).
k: It is a constant that depends on the level of confidence that we assign (see Table 1). The level of confidence indicates the probability that the results of our research are true: a 95.5% confidence is the same as saying that we can be wrong with a probability of 4.5%.
e: It is the desired sampling error, i.e., the difference that can exist between the result obtained by asking a sample of the population and the one we would obtain by asking all of it. In our experiment: if the results of our survey say that 100 people would take an action and we have a sampling error of 5%, then between 95 and 105 people will actually act.
p: It is the proportion of individuals who possess the characteristic of study in the population. This value is generally unknown, and it is usually assumed that p = q = 0.5, which is the safest option.
q: It is the proportion of individuals who do not possess this characteristic, i.e., 1 − p.
n: It is the size of the sample (number of surveys).
However, practical criteria based on experience or simple logic are usually used to calculate the sample size. Some of the most used methods are the following:
  • The budget that we have available for research.
  • Experience in similar studies.
  • The representation of each group considered: choose from each of them a sufficient number of respondents so that the results are indicative of the opinion of that group.
  • Calculations for our study: we contrast the total number of active workers in a country with the people working in data center management-related activities. The country has a population of 47 million people, of whom 19,564,600 are registered active workers, and we estimate that 5% of this working population is related to this specialty (p = 0.05 and q = 0.95). With a confidence level of 95.5% (which determines k = 2) and a sampling error of 5% (e), we would need a sample of at least 76 people to be representative (the sketch after this list reproduces the computation). In our specific study:
    N: 19,564,600 (total active workers in Spain 19 March 2019)
    k: 2 (for 95.5% confidence level)
    e: 5%
    p: 0.05 (proportion of people that would match the desired profile)
    q: 0.95 (rest of the population proportion)
    Calculating the sample size:
    n: 76 is the minimum sample size for our study (we have 100).
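As a quick arithmetic check, the following minimal Python sketch (the function and variable names are ours) reproduces the computation with the values above:

```python
# Minimal sketch of the finite-population sample-size formula from Section 2.1:
# n = (k^2 * p * q * N) / (e^2 * (N - 1) + k^2 * p * q)

def sample_size(N: int, k: float, e: float, p: float) -> float:
    """Minimum sample size for a finite population of size N."""
    q = 1.0 - p  # proportion without the characteristic
    return (k**2 * p * q * N) / (e**2 * (N - 1) + k**2 * p * q)

# Values used in this study:
n = sample_size(N=19_564_600, k=2.0, e=0.05, p=0.05)
print(round(n))  # -> 76, the minimum sample size reported above
```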

2.2. About the Questionnaire

We built a questionnaire following the example of previous scientific articles on decision-making techniques [7,14,15,16] and added some control variables. The objective of the questionnaire and the stimulus is to verify whether, at the end of the study, the subject takes action on the problem presented in the stimulus. This action can be either to solve the problem on their own or to alert a superior.
During this phase, we first asked the participants to define themselves in terms of mood/emotions (Angry: Emotion = 1, Happiness: Emotion = 2, Neutral: Emotion = 3, Sadness: Emotion = 4, Surprise: Emotion = 5) before answering 10 questions related to general decision-making. Then we introduced an external stimulus, consisting of a piece of good or bad news, and asked them again to self-analyze the dominant emotion/mood. Finally, we asked the individuals whether they were willing to act/raise the problem to the chain of command. Therefore, with this study we obtained 24 variables (a sketch of one encoded record follows the list):
  • Emotion A: The initial emotion/mood recorded at the beginning of the experiment. It is indicated as a number from 1–5 that corresponds to the emotions specified above.
  • Question 1A: When I make an important decision, for me, it is essential to overcome doubtful aspects. Questions with the suffix A indicate that no stimulus has been introduced. Every question is evaluated on a scale of 1 to 9 (1 being the minimum score).
  • Question 2A: When I make an important decision, for me, it is essential to organize the actions depending on the time.
  • Question 3A: When I make an important decision, for me, it is essential to define the desired goals.
  • Question 4A: When I make an important decision, for me, it is essential to accept responsibility for the decision.
  • Question 5A: When I make an important decision, for me, it is essential to be motivated to make the decision.
  • Question 6A: When I make an important decision, for me, it is essential to generate emotions that will help me decide.
  • Question 7A: When I make an important decision, for me, it is essential to reflect on the need to make the decision.
  • Question 8A: When I make an important decision, for me, it is essential to plan the actions to be performed.
  • Question 9A: When I make an important decision, for me, it is essential to make decisions without external pressure.
  • Question 10A: When I make an important decision, for me, it is essential to take the goals of the business into account.
  • Stimulus: The external news. It is a binary variable, as the news can be positive/good (0) or negative/bad (1).
  • Emotion B: After the news, the individuals are asked again to state the predominant emotion/mood they feel. Just like Emotion A, the emotion is indicated as a number from 1–5.
  • Question 1B: When I make an important decision, for me, it is essential to overcome doubtful aspects. It is important to note that questions with the suffix B indicate that the stimulus (news) has been applied.
  • Question 2B: When I make an important decision, for me, it is essential to organize the actions depending on the time.
  • Question 3B: When I make an important decision, for me, it is essential to define the desired goals.
  • Question 4B: When I make an important decision, for me, it is essential to accept responsibility for the decision.
  • Question 5B: When I make an important decision, for me, it is essential to be motivated to make the decision.
  • Question 6B: When I make an important decision, for me, it is essential to generate emotions that will help me decide.
  • Question 7B: When I make an important decision, for me, it is essential to reflect on the need to make the decision.
  • Question 8B: When I make an important decision, for me, it is essential to plan the actions to be performed.
  • Question 9B: When I make an important decision, for me, it is essential to make decisions without external pressure.
  • Question 10B: When I make an important decision, for me, it is essential to take the goals of the business into account.
  • Decision: After the stimulus and the ten B questions, we ask the subjects about their willingness to act and raise the problem according to their situation. It is a binary variable: if they decide to act/raise the problem to the chain of command, the result is 1; if they do not, it takes the value 0.
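To make the data layout concrete, the following sketch shows how one participant's record could be encoded; the column names are our own illustration, not the study's originals:

```python
import pandas as pd

# Hypothetical encoding of a single participant using the 24 variables above.
record = {
    "emotion_a": 3,                         # mood before the A questions (1-5)
    **{f"q{i}a": 5 for i in range(1, 11)},  # ten pre-stimulus answers, scale 1-9
    "stimulus": 1,                          # 0 = good news, 1 = bad news
    "emotion_b": 4,                         # mood after the stimulus (1-5)
    **{f"q{i}b": 7 for i in range(1, 11)},  # ten post-stimulus answers, scale 1-9
    "decision": 1,                          # 1 = act / report, 0 = do not act
}

df = pd.DataFrame([record])
assert df.shape[1] == 24  # the 24 variables listed in this section
```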

2.3. Machine-Learning Techniques

No single algorithm dominates when choosing a machine-learning model. Some perform better with large data sets and some perform better with high-dimensional data. Thus, it is important to assess a model’s effectiveness for our particular data set. In this article, we give a high-level overview of how random forest works and discuss the real-world advantages and drawbacks of this model in Appendix A.
Neural networks can be used with small datasets as well, but it depends on the classes we are trying to separate. For instance, if we just try to classify black versus white images, we need very few training examples. Moreover, there are cases that do not allow us to gather many training examples, for instance, predicting tsunamis, so neural networks are perfectly valid for small datasets (as long as we do not end up with overtrained networks). However, since our data has classes that are not clearly separated and few variables, neural networks do not make sense for this problem. For this reason, we considered random forest.
Random Forest is a good model if we want high performance with less need for interpretation [17]. This classification technique was introduced by Breiman [18] and it has been used in numerous articles [19,20,21,22,23,24], but its application in psychology [9,10] and decision-making fields is scarce.
The outcome of this experiment will be explained in the next section, Results.

3. Results

3.1. The Final Results

As mentioned before, the study was made in two ways: from a general point of view and from a specific-purpose point of view. Both are based on the same data, which consist of 100 samples (the stimulus is fully balanced, 50 positive and 50 negative news items, and the proportion of action to inaction is 51 to 49).
The approach of this study is to develop two different models to predict, on the one hand, whether a subject acts (or not) from all the available variables obtained in our study and, on the other hand, the relation between the stimulus, the emotion, and the decision-making after the stimulus, establishing a connection with the final action of the subject. For that, we developed two random forest models in Python. We studied the correlation between variables and their p-values, rejecting the variables that did not reach statistical significance at the p = 0.05 level. We ran different studies and tests with the models to find the hyperparameters of the algorithm that best fit our problem, obtaining the final models detailed below.
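The sketch below outlines a pipeline of this shape in Python: the correlation filter, a p-value filter (the point-biserial test against the binary label is one plausible choice; the exact test is not named here), and a small hyperparameter search for the random forest. It assumes `df` holds the 100 records encoded as in the Section 2.2 sketch; it is an illustration, not our verbatim code:

```python
import numpy as np
from scipy.stats import pointbiserialr
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = df.drop(columns="decision"), df["decision"]

# 1) Drop one variable of any pair correlated above 0.95.
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
X = X.drop(columns=[c for c in upper.columns if (upper[c] > 0.95).any()])

# 2) Reject variables not significant at the p = 0.05 level.
X = X[[c for c in X.columns if pointbiserialr(X[c], y).pvalue < 0.05]]

# 3) Tune a few random forest hyperparameters by cross-validation.
grid = {"n_estimators": [100, 300], "max_depth": [None, 3, 5],
        "max_features": ["sqrt", None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), grid,
                      cv=5, scoring="accuracy").fit(X, y)
print(search.best_params_, search.best_score_)
```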
In this way, the general study was developed over the entire dataset (formed by the 24 variables, except those rejected by their p-values), obtaining a random forest model capable of predicting with 85% accuracy whether a possible operator inside a data center acts or not (the metrics obtained for this model are given in Table 2). The specific-purpose model focused on the correlation between the external stimulus, the emotion after this stimulus (obtained through the questionnaires), and the final decision (acting/not acting) of the subject. In this case, we used another random forest model, which also predicted the action (or inaction) of the subject with 85% accuracy (see Table 2 for the metrics obtained for this model). This specific random forest model was built with only part of the dataset (the variables corresponding to the stimulus, the emotion after the stimulus, the B questions, and the decision variable).

3.1.1. General Random Forest Model

The first model was able to predict, from the whole data flow (the emotion and decision-making answers to the questionnaires, the stimulus, and the emotion and decision-making answers to the questionnaires after the stimulus), the probability of acting/reporting to the chain of command. The importance of the model variables is shown in Figure 1.
We ran two processes to evaluate the data. The first was to check the correlation between the variables and, in case two variables had a high correlation (higher than 95%), reject one of them (the correlation matrix of the variables is shown in Figure 2). The second was a p-value analysis to reject the variables that did not reach statistical significance at p = 0.05.
As shown in Figure 2, there is no variable with a correlation higher than 95%, but after the p-value analysis some variables were rejected. The distribution and frequencies of the selected variables after processing are shown in Figure 3 and Figure 4, respectively.
As we can see in Figure 3, the variables Emotion B (the emotion recorded after the stimulus) and Stimulus have clear distributions: Emotion B = 2 (happiness) tends toward not acting, whereas Stimulus = 1 (equivalent to negative news) produces an active response, driving subjects to action.

3.1.2. Specific Random Forest Model

Apart from having a general model capable of predicting the action of a subject in a data center, it was very interesting to know the importance of stimuli and emotions in decision-making. For this reason, a second model was developed. The variables selected for the analysis were Stimulus, Emotion B, and the answers of the decision-making process after the stimulus. The main goal is to establish the probability of action according to an emotion and a stimulus, which is the reason we developed this experiment.
The procedure was identical to the general model. We evaluated the correlation and the p-value of the variables according to the predicted label (Decision to act). The correlation matrix of this specific model is shown in Figure 5, and the analysis of the distribution and frequencies of selected variables (after p-value processing (p = 0.05)) is shown in Figure 6 and Figure 7, respectively.
Looking at Figure 6, Emotion B = 2 or 3 (happiness and neutral) tends toward not acting, and Stimulus = 1 (negative news) drives the subject to action.
We obtained Table 3 by evaluating the probability of the participants’ willingness to act according to their input stimulus. As we can see in Table 3, a negative stimulus always increases the probability to act, and the emotions that drive us to action are sadness and surprise.
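The sketch below shows how a table like Table 3 can be queried from a fitted model; `model` and the specific-model feature frame `X_specific` (Stimulus, Emotion B, and the B questions) are assumed from the pipeline sketched above, and holding the B answers at their medians is our illustrative choice:

```python
import pandas as pd

template = X_specific.median()        # hold the B answers at their medians
rows = []
for stimulus in (0, 1):               # 0 = positive news, 1 = negative news
    for emotion_b in range(1, 6):     # the five self-reported emotions
        x = template.copy()
        x["stimulus"], x["emotion_b"] = stimulus, emotion_b
        p_act = model.predict_proba(x.to_frame().T)[0, 1]  # P(decision = 1)
        rows.append((stimulus, emotion_b, round(p_act, 3)))

print(pd.DataFrame(rows, columns=["Stimulus", "Emotion B", "Probability to Act"]))
```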
We used Weka [25], a machine-learning software suite (also used for data mining), to check the accuracy and predictions of other models. In particular, we used: J48 trees, random forest, and random trees. The results were almost identical, but the models obtained with our own processing achieved better outcomes (see Appendix B for more details). It is important to note that these models were built from the collection of machine-learning algorithms that Weka provides: Weka did the computation and chose the final model that best fit the dataset, but it applied none of our preprocessing or hyperparameter tuning (as we did in Python for our random forest models).

4. Discussion

In this research study, we highlighted the importance of selecting the right information and the appropriate modeling techniques for the different data sources. We used machine-learning techniques to solve the problem of predicting the willingness to act of the users of a cyber-physical system in a specific context.
Building on the different relationships within the psychological side of the emotional decision-making process, and on how these emotions can be redirected using an external stimulus trigger, this work evaluated, from the contextualized point of view of a cyber-physical system, the influence of stimuli and emotions on decision-making in the operations management tasks of a data center.
The study was able to predict the action of a possible operator inside a data center with 85% accuracy. The results obtained about stimuli and actions reinforce the theories from other articles [11,26] about negative stimuli and how humans are genetically programmed for a stronger and more instinctive reaction towards negative stimuli than towards positive or neutral ones. The processing of negative stimuli is therefore automatic and immediate, and requires little cognitive capacity [27,28]. Fear is also an important factor in negative stimuli, as it increases risk perception [4], so it increases the probability of acting or escalating an issue to the chain of command compared with participants who received a positive stimulus.
We have found an important field of work for the future using emotional variables: we can translate subjective information into technical information and act on it using the commonest predictive modeling techniques. We found tools from other knowledge areas that are useful for the goal we pursue, namely establishing a direct relationship between psychological or subjective variables and technical or numerical variables.
There is a clear path toward closing the loop: aiding operations personnel to enhance their skills and helping them automate the commonest tasks. This is how we can be more productive, especially in changing environments such as smart cities, IoT, and edge data centers. This will be the main research area for the authors of this article. The future of telecommunications will give us a new field for improvement, as we will have to move services closer to the end-user, with high-value information and very low latency.

Author Contributions

Conceptualization, A.C.; methodology, A.C. and J.M.M.; software, A.C. and M.R.; validation, A.C. and J.M.M.; formal analysis, A.C. and M.R.; investigation, A.C.; resources, A.C. and M.R.; data curation, A.C. and M.R.; writing—original draft preparation, A.C. and M.R.; writing—review and editing, A.C. and J.M.M.; visualization, A.C.; supervision, J.M.M.

Acknowledgments

Special thanks to ERIS INNOVATION (www.erisinnovation.com) for its administrative and technical support and for the materials used in the experiments.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Machine-Learning Techniques Used in This Article

Appendix A.1. Random Forest

As we stated previously, no single algorithm dominates when choosing a machine-learning model. Some perform better with large data sets and some perform better with high-dimensional data. Thus, it is important to assess a model’s effectiveness for our particular data set. In this appendix, we give a high-level overview of how random forest works and discuss the real-world advantages and drawbacks of this model.
Random Forest is a good model if we want high performance with less need for interpretation [17]. This classification technique was introduced by Breiman (2001) [18] and it has been used in numerous articles [19,20,21,22,23,24], but its application in psychology [9,10] and decision-making fields is scarce.
What is random forest? Random forests are bagged decision tree models that split on a subset of features on each split. We will look at a single decision tree, then discuss bagged decision trees and finally introduce splitting on a random subset of features.
A decision tree splits the data into smaller data groups based on the features of the data until we have a small enough set of data that only has data points under one label. For example: a decision tree of whether one should play tennis.
In the example above (Figure A1), the decision tree is split on multiple features until we reach a conclusion of “Yes”, we should play tennis, or “No” we should not play tennis. Follow the lines along the tree to determine the decision. For example, if the outlook is overcast, then “Yes” we should play tennis. If the outlook is sunny and humidity is high, then “No” we should not play tennis.
In a decision tree model, these splits are chosen according to a purity measure, i.e., at each node, we want the information gain to be maximized. For a regression problem, we consider the residual sum of squares (RSS), and for a classification problem, we consider the Gini index or entropy.
Figure A1. Decision Tree.
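A minimal sketch of both impurity measures, evaluated on a node from the classic play-tennis example (toy numbers, not this study's data):

```python
import numpy as np

def gini(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# A node holding 9 "play" and 5 "don't play" days: both measures peak for
# mixed nodes and fall to 0 for pure ones; splits are chosen to reduce them.
node = ["yes"] * 9 + ["no"] * 5
print(round(gini(node), 3), round(entropy(node), 3))  # 0.459 0.94
```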

Appendix A.1.1. Bagged Trees

Building on the decision tree concept, we apply the principles of bootstrapping to create bagged trees. Bootstrapping is a sampling technique (Figure A2) in which we randomly sample with replacement from the data set. When bootstrapping, each sample uses only about 2/3 of the distinct data points. The remaining approximately 1/3 of the data (the “out-of-bag” data) is not used in the model and can conveniently be used as a test set. Bagging, or bootstrap aggregating, is where we create bagged trees by training X decision trees on X bootstrapped training sets. The final predicted value is the average value of all our X decision trees. A single decision tree has high variance (it tends to overfit), so by bagging, i.e., combining many weak learners into strong learners, we average away the variance.
Figure A2. Classification.
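A toy sketch of bootstrapping and the resulting out-of-bag set:

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(10)                                   # ten data points

boot = rng.choice(data, size=data.size, replace=True)  # sample with replacement
oob = np.setdiff1d(data, boot)                         # points never drawn

# On average ~2/3 of the points are drawn at least once; the rest are
# "out-of-bag" and serve as a free test set for the tree trained on `boot`.
print("bootstrap sample:", np.sort(boot))
print("out-of-bag:", oob)
```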

Appendix A.1.2. Random Forest

Random forest improves on bagging because it decorrelates the trees by splitting on a random subset of features. This means that at each split of the tree, the model considers only a small subset of features rather than all of the features of the model, i.e., from the set of n available features, a subset of m features (m = the square root of n) is selected at random. This is important so that variance can be averaged away. Consider what would happen if the data set contained a few strong predictors: these predictors would consistently be chosen at the top level of the trees, so we would have very similarly structured trees. In other words, the trees would be highly correlated.
Therefore, in summary of what was stated initially, random forests are bagged decision tree models that split on a subset of features on each split.
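In scikit-learn, for instance, this per-split feature subsampling is exposed as the max_features parameter; a small sketch on synthetic data (our example, not this study's models):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=16, random_state=0)

# max_features="sqrt" implements the m = sqrt(n) rule: each split considers
# only 4 of the 16 features, which decorrelates the individual trees.
rf = RandomForestClassifier(n_estimators=300, max_features="sqrt",
                            oob_score=True, random_state=0).fit(X, y)
print(rf.oob_score_)  # the out-of-bag estimate doubles as a test score
```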
Whether we have a regression or classification task, random forest is an applicable model for our needs. It can handle binary features, categorical features, and numerical features. There is very little preprocessing that needs to be done. The data does not need to be re-scaled or transformed.
They are parallelizable, meaning that we can split the process to multiple machines to run. This results in faster computation time. Boosted models are sequential in contrast, and would take longer to compute.
It is faster to train than decision trees because we are working only on a subset of features in this model, so we can easily work with hundreds of features. Prediction speed is significantly faster than training speed because we can save generated forests for future uses. Random forest handles outliers by essentially binning them. It is also indifferent to non-linear features.
It has methods for balancing error on data sets with unbalanced class populations. Random forest tries to minimize the overall error rate, so when we have an unbalanced data set, the larger class will get a low error rate while the smaller class will have a larger error rate.
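One common remedy in scikit-learn, for example, is class_weight="balanced", which reweights classes inversely to their frequencies; a toy sketch (purely illustrative, since this study's classes are nearly balanced at 51/49):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# A synthetic 9:1 imbalanced problem.
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

# Reweight classes so errors on the minority class are not ignored.
rf = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                            random_state=0).fit(X, y)
```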
Each decision tree has high variance but low bias. Because we average all the trees in a random forest, we average the variance as well, obtaining a model with low bias and moderate variance. Regarding model interpretability: random forest models are not all that interpretable; they are like black boxes. For very large data sets, the size of the trees can take up a lot of memory. Random forest can also tend to overfit, so we should tune the hyperparameters.

Appendix B. Weka Models

In this section, we collect the results obtained with the different models developed using Weka [25]. Figure A3 and Figure A4 show the results of the general and specific J48 tree models, respectively. Results for the random forest and random tree models are shown in Figure A5, Figure A6, Figure A7 and Figure A8.

Appendix B.1. Trees J48

Appendix B.1.1. General Model

Figure A3. Trees J48 Results for the General Model.

Appendix B.1.2. Specific Model

Figure A4. Trees J48 Results for the Specific Model.

Appendix B.2. Random Forest

Appendix B.2.1. General Model

Figure A5. Random Forest Results for the General Model.

Appendix B.2.2. Specific Model

Figure A6. Random Forest Results for the Specific Model.

Appendix B.3. Random Tree

Appendix B.3.1. General Model

Figure A7. Random Tree Results for the General Model.

Appendix B.3.2. Specific Model

Figure A8. Random Tree Results for the Specific Model.

References

  1. Han, S.; Lerner, J.S.; Keltner, D. Feelings and Consumer Decision Making: The Appraisal-Tendency Framework. J. Consum. Psychol. 2007, 17, 158–168. [Google Scholar]
  2. Lerner, J.S.; Li, Y.; Valdesolo, P.; Kassam, K. Emotion and Decision Making. Annu. Rev. Psychol. 2015, 66, 799–823. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  3. Russell, J.A.; Pratt, G. A Description of the Affective Quality Attributed to Environments. J. Personal. Soc. Psychol. 1980, 38, 311–322. [Google Scholar] [CrossRef]
  4. Lerner, J.S.; Keltner, D. Beyond valence: Toward a model of emotion-specific influences on judgement and choice. Cognit. Emot. 2000, 14, 473–493. [Google Scholar] [CrossRef]
  5. Schwarz, N. Emotion, cognition, and decision making. Cognit. Emot. 2000, 14, 433–440. [Google Scholar] [CrossRef]
  6. Velásquez, J.D. Modeling Emotion-Based Decision-Making. In Emotional and Intelligent: The Tangled Knot of Cognition, Proceedings of the 1998 AAAI Fall Symposium, Orlando, FL, USA, 22–24 October 1998; Cañamero, D., Ed.; AAAI Press: Palo Alto, CA, USA, 1998; pp. 164–169. [Google Scholar]
  7. Sanz de Acedo, M.L.; Sanz de Acedo, M.T.; Cardelle-Elawar, M. Factors that affect decision making: Gender. Int. J. Psychol. Psychol. Ther. 2007, 7, 381–391. [Google Scholar]
  8. Bechara, A.; Damasio, H.; Damasio, A.R. Emotion, Decision Making and the Orbitofrontal Cortex. Cereb. Cortex 2000, 10, 295–307. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  9. Hayes, T.; Usami, S.; Jacobucci, R.; McArdle, J.J. Using Classification and Regression Trees (CART) and random forests to analyze attrition: Results from two simulations. Psychol. Aging 2015, 30, 911–929. [Google Scholar] [CrossRef] [PubMed]
  10. Fillingim, R.B.; Ohrbach, R.; Greenspan, J.D.; Knott, C.; Diatchenko, L.; Dubner, R.; Bair, E.; Baraian, C.; Mack, N.; Slade, G.D.; et al. Psychological Factors Associated With Development of TMD: The OPPERA Prospective Cohort Study. J. Pain 2013, 14, T75–T90. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  11. Heinström, J. Negative affectivity—The emotional dimension. In From Fear to Flow: Personality and Information Interaction; Chandos: Amsterdam, The Netherlands, 2010; pp. 75–103. [Google Scholar]
  12. Fisher, R.A. Statistical Methods and Scientific Inference; Oliver and Boyd: Edinburgh, UK, 1956; p. 32. [Google Scholar]
  13. Freund, J.E. Mathematical Statistics; Prentice Hall: Englewood Cliffs, NJ, USA, 1962; pp. 227–228. [Google Scholar]
  14. Colligan, L.; Potts, H.W.W.; Finn, C.T.; Sinkin, R.A. Cognitive workload changes for nurses transitioning from a legacy system with paper documentation to a commercial electronic health record. Int. J. Med. Inf. 2015, 84, 469–476. [Google Scholar] [CrossRef] [PubMed]
  15. Hart, S.G.; Staveland, L.E. Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In Human Mental Workload; Hancock, P.A., Meshkati, N., Eds.; Elsevier: Amsterdam, The Netherlands, 1988; pp. 139–183. [Google Scholar]
  16. Hart, S.G.; NASA Task Load Index (TLX). Paper and Pencil Package; NASA Center, Ames Research Center: Mountain View, CA, USA, 1986; Volume 1.0. [Google Scholar]
  17. Kho, J. Why Random Forest Is My Favorite Machine Learning Model. Towards Data Sci. 2018. Available online: https://towardsdatascience.com/why-random-forest-is-my-favorite-machine-learning-model-b97651fa3706 (accessed on 28 August 2019).
  18. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  19. Prasad, A.M.; Iverson, L.R.; Liaw, A. Newer Classification and Regression Tree Techniques: Bagging and Random Forests for Ecological Prediction. Ecosystems 2006, 9, 181–199. [Google Scholar] [CrossRef]
  20. Chen, X.-W.; Liu, M. Prediction of protein–protein interactions using random decision forest framework. Bioinformatics 2005, 21, 4394–4400. [Google Scholar] [CrossRef] [PubMed]
  21. Palmer, D.S.; O’Boyle, N.M.; Glen, R.C.; Mitchell, J.B.O. Random Forest Models To Predict Aqueous Solubility. J. Chem. Inf. Model. 2007, 47, 150–158. [Google Scholar] [CrossRef] [PubMed]
  22. Jiang, P.; Wu, H.; Wang, W.; Ma, W.; Sun, X.; Lu, Z. MiPred: Classification of real and pseudo microRNA precursors using random forest prediction model with combined features. Nucleic Acids Res. 2007, 35, W339–W344. [Google Scholar] [CrossRef] [PubMed]
  23. Qi, Y.; Klein-Seetharaman, J.; Bar-Joseph, Z. Random Forest Similarity for Protein-Protein Interaction Prediction from Multiple Sources. Biocomputing 2005, 10, 531–542. [Google Scholar]
  24. Cutler, D.R.; Edwards, T.C., Jr.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J. Random Forests for Classification in Ecology. Ecology 2007, 88, 2783–2792. [Google Scholar] [CrossRef]
  25. Frank, E.; Hall, M.A.; Witten, I.H. The WEKA Workbench. In Data Mining: Practical Machine Learning Tools and Techniques, 4th ed.; Morgan Kaufmann: Amsterdam, The Netherlands, 2016. [Google Scholar]
  26. Miyazawa, S.; Iwasaki, S. Effect of negative emotion on visual attention: Automatic capture by fear-related stimuli. Jpn. Psychol. Res. 2009, 51, 13–23. [Google Scholar] [CrossRef]
  27. Bradley, M.M. Emotional memory: A dimensional analysis. In The Emotions: Essays on Emotion Theory; Elsevier: Amsterdam, The Netherlands, 1994; pp. 97–134. [Google Scholar]
  28. Bradley, M.M.; Lang, P.J. Measuring emotion: The self-assessment manikin and the semantic differential. J. Behav. Ther. Exp. Psychiatry 1994, 25, 49–59. [Google Scholar] [CrossRef]
Figure 1. Importance of Model Variables.
Figure 2. Correlation Matrix.
Figure 3. Distribution of variables.
Figure 4. Frequencies of variables.
Figure 5. Correlation Matrix.
Figure 6. Distribution of variables.
Figure 7. Frequencies of variables.
Table 1. The most used k values and their confidence levels.

k                  1.15   1.28   1.44   1.65   1.98   2      2.58
Confidence level   75%    80%    85%    90%    95%    95.5%  99%
Table 2. Model Metrics.

Metric                     General Random Forest   Specific Random Forest
Accuracy                   0.85                    0.85
F1                         0.84                    0.87
Precision                  0.73                    0.75
Recall                     0.80                    1.00
Average Precision          0.81                    0.77
Area under the ROC curve   0.91                    0.98
Table 3. Probabilities to Act according to Stimulus and Emotion.

Stimulus   Emotion B   Probability to Act
0          1           0.157
0          2           0.044
0          3           0.434
0          4           0.677
0          5           0.850
1          1           0.742
1          2           0.672
1          3           0.906
1          4           0.990
1          5           0.990

Positive news: Stimulus = 0; Negative news: Stimulus = 1. Angry: Emotion = 1; Happiness: Emotion = 2; Neutral: Emotion = 3; Sadness: Emotion = 4; Surprise: Emotion = 5.
