Review

Considerations for Developing Robot-Assisted Crisis De-Escalation Practices

1 Department of Psychology, York University, Toronto, ON M3J 1P3, Canada
2 Department of Psychology, University of Guelph, Guelph, ON N1G 2W1, Canada
3 Lassonde School of Engineering, York University, Toronto, ON M3J 1P3, Canada
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(7), 4337; https://doi.org/10.3390/app13074337
Submission received: 7 February 2023 / Revised: 15 March 2023 / Accepted: 23 March 2023 / Published: 29 March 2023
(This article belongs to the Special Issue Advanced Human-Robot Interaction)

Abstract

Robots are increasingly entering the social sphere and taking on more sophisticated roles. One application for which robots are already being deployed is in civilian security tasks, in which robots augment security and police forces. In this domain, robots will encounter individuals in crisis who may pose a threat to themselves, others, or personal property. When human police and security officers engage in such interactions, a key goal is to de-escalate the situation. This paper considers the task of utilizing mobile robots in de-escalation tasks, using the mechanisms developed for de-escalation in human–human interactions. What strategies should a robot follow in order to leverage existing de-escalation approaches? Given these strategies, what sensing and interaction capabilities does a robot need in order to engage in de-escalation tasks with humans? First, we discuss the current understanding of de-escalation with individuals in crisis and present a working model of the de-escalation process and strategies. Next, we review the capabilities that an autonomous agent should demonstrate to be able to apply such strategies in robot-mediated crisis de-escalation. Finally, we explore data-driven approaches to training robots in de-escalation and the next steps in moving the field forward.

1. Introduction

As the development of artificial intelligence (AI) and robotics technology continues to accelerate, mobile robots are increasingly entering the social sphere and taking on more sophisticated roles. For example, robots have been designed to provide education (e.g., [1]), health and home care (e.g., [2]), autism therapy (e.g., [3]), and security services [4]. Regarding the latter, a number of robots have been developed to assist in security and policing tasks. The Knightscope security robot is a representative example: it currently patrols parking lots and structures, shopping malls, hospitals, airports, and corporate campuses across the United States, providing real-time information and surveillance for security teams [5].
There are many potential applications for a security robot. The device could act as a mobile sensor providing information to remote human staff or it could act as a deterrent to crime just by its presence in a particular environment. An expected task in the security setting is one in which the robot interacts with members of the community. This interaction might include relatively simple tasks such as having the robot provide information or directions. Robots could also be expected to engage in complex tasks, such as interacting with an agitated, confrontational, or potentially aggressive individual with the goal of defusing the situation. This process is known as de-escalation.
If robots can engage in effective crisis de-escalation, they could potentially help defuse situations when human personnel are not readily available, or act as part of an on-site team to help shift an agitated individual towards a state of greater calm and control. When human first responders are faced with the task of de-escalating an agitated person, there is potential for physical harm to both the first responder and the community member [6]. Robots would seem to be an attractive enhancement to existing de-escalation strategies, as a robot could at least be used to reduce the risk to the responder. Furthermore, in such an encounter, just as with a human responder, a robot could take steps to secure the environment (e.g., lock doors, remove objects which could be used as weapons) and call for human assistance. Of course, it is not possible to defuse or de-escalate every possible interaction scenario. The more realistic goal is to develop and implement strategies to defuse or de-escalate situations as much as possible. The question remains: Can a robot effectively employ crisis de-escalation strategies in such interactions, and if so, when and how should robots be used in crisis de-escalation?
This paper considers the task of utilizing mobile robots in de-escalation tasks. Section 2 reviews the literature on de-escalation with individuals in crisis. Section 3 summarizes the basic requirement for social robots involved in de-escalation and presents a working model for the process of, and tasks involved in, crisis de-escalation. Section 4 focuses on integrating de-escalation practices into human–robot interactions, including considerations for assessing the individual in crisis and the surrounding environment, planning and orchestrating a response, and communication strategies and principles that support de-escalation. Section 5 discusses data-driven approaches to training robots in de-escalation, and Section 6 focuses on the next steps in moving the field forward.

2. Crisis De-Escalation

For the purposes of this paper, a crisis is defined as a short-term and overwhelming event involving a disruption of an individual’s normal and stable state, in which the usual coping and problem-solving mechanisms do not work [7]. In this paper, we further narrow our lens by focusing specifically on crises in which there is an element of agitation and the potential for aggression. For a safe resolution of the crisis, it is critical for responders to effectively interrupt the progression of agitation to violence [8]. One way of interrupting this progression is through the implementation of de-escalation strategies. De-escalation strategies refer to a complex range of verbal and non-verbal skills designed to enable agitated individuals to rapidly develop their own internal locus of control [9,10] and to shift away from a trajectory that may lead to aggression.
De-escalation is commonly practiced by health care workers and mental health nurses, with an increasing uptake in police and security services [11]. Notably, there is no consensus regarding a clear definition of optimal de-escalation techniques or guidelines [12]. Furthermore, most research on de-escalation focuses on the impact of de-escalation training programs (e.g., Crisis Intervention Training, Six Core Strategies) on reducing physical restraint, injury, or violence towards service providers. Although research on the effectiveness of training is relevant for advancing the field of human-led de-escalation, it does not elucidate which intervention strategies promote effective de-escalation. For example, most de-escalation training programs contain a significant psychoeducational component designed to debunk myths about violent behaviour and equip staff with a better understanding of the underlying psychology of agitated individuals in crisis. The rationale for such training is that, with such information, staff will perceive and approach agitated individuals in a more appropriate manner, perhaps with greater empathy and respect, setting the stage for a better de-escalation interaction and outcome. Although this training may be helpful from a human perspective, this research offers little in the way of understanding the specific, moment-to-moment manifestations of effective de-escalation needed to translate such practices into human–robot interactions.
The evidence base of effective de-escalation techniques for human–human interactions is limited; however, researchers have conducted qualitative studies and literature reviews and have convened expert working groups in an attempt to elucidate the concepts related to human de-escalation interactions and to identify common strategies and best practices across professional disciplines. Hallett and Dickens [12] conducted a concept analysis, reviewing 79 papers investigating human-to-human de-escalation over the past three decades. They identified five domains of de-escalation skills: communication, assessment, self-regulation, actions, and maintaining safety. Mavandadi and colleagues [13] created and validated a modified English-language version of a pre-existing structured measure of de-escalation practices [14] containing seven domains of skills: valuing the client, reducing fear, inquiring about the client's queries and anxieties, providing guidance to the client, working out possible agreements, remaining calm, and risk management. Todak and James [11] conducted a systematic observational study of police de-escalation tactics in Spokane, Washington, identifying eight categories of de-escalation tactics: respect, honesty, calmness, perspective-taking, compromise, listening, getting to the client's level and reducing power imbalances, and client empowerment. They observed that the use of several of these tactics was associated with a calmer citizen demeanor. Finally, Richmond et al. [10] published an expert consensus statement on best practice guidelines for verbal de-escalation of agitated patients in emergency psychiatry, covering ten domains: respecting personal space, not being provocative, establishing verbal contact, being concise, identifying wants and feelings, listening closely to what the patient is saying, agreeing or agreeing to disagree, laying down the law and setting clear limits, offering choices and optimism, and debriefing the patient and staff.
These papers are not an exhaustive list of published models of de-escalation strategies; however, based on a review of the extant literature, they offer a comprehensive overview of the current knowledge of de-escalation practices. At first glance, these models contain both common elements (e.g., remaining calm, listening, use of verbal skills, maintaining safety) and specific techniques relating to each of the domains identified by Hallett and Dickens [12], Richmond et al. [10], and Mavandadi et al. [13]. As part of this review, the strategies referred to within these studies are arranged into a working model of the de-escalation process to inform AI- and robot-based approaches to de-escalation within human–robot interactions (see Figure 1). Notably, though it is possible to construct a list of principles and possible techniques for de-escalation, there is no evidence base to support or offer insights into optimal decision-making around the choice, timing, and sequencing of de-escalation techniques for specific individuals in specific contexts. As is the case with most psychosocial interventions, understanding of the moment-to-moment process of de-escalation and what works for whom and when is currently limited.

3. Basic Requirements for Social Robots and Working Model for De-Escalation

Human interactions are characterized by a high degree of adaptivity. People instinctively change their actions according to the perceived cognitive and affective state of their interactive partners and the environmental context. This social adaptability is made possible through the complex interplay of real-time sensory, cognitive, affective, and behavioural systems that comprise the central nervous system. For robots to successfully enter the social sphere, they will similarly require the ability to perceive and adapt to their interactive partners and the environmental context. Robots will need to be able to perceive and express emotions, sense verbal and non-verbal signals from humans and respond adequately, comprehend and generate natural language, have memory and reasoning capacities, plan actions and execute movements, and demonstrate various social competencies according to what is required by the specific context [15]. Research in computer science and engineering has focused on designing cognitive architectures and adaptive behavioural models inspired by the human brain to endow robotic agents with artificial intelligence, enabling them to exhibit intelligent behaviours, learn new tasks, and adapt to changes in their environment [16]. Robots capable of de-escalation will require sufficient mastery of these baseline skills needed for social robots, in addition to computational models geared specifically towards the de-escalation process.

Working Model of De-Escalation

Grounded in the findings of the qualitative studies, the working group consensus statement, and the literature reviewed above, we created a working model for de-escalation (Figure 1). We opted to develop our own model rather than draw upon one in the literature because the requirements for instructing humans on de-escalation practices differ from those for robots. By developing our own model, we obtain the level of detail and type of stepwise organization required for considering the translation of de-escalation practices into human–robot interactions.
The model begins with principles for verbal and non-verbal communication as well as other general principles that are typically applicable throughout the de-escalation process. For robots, these can be conceived as a checklist of relatively fixed conditions that should be met at most, and in some cases all, timepoints of the interaction. In addition to these general principles, the model includes a stepwise illustration of de-escalation. Richmond et al. [10] described de-escalation as follows:
“De-escalation frequently takes the form of a verbal loop in which the clinician listens to the patient, finds a way to respond that agrees with or validates the patient’s position, and then states what he wants the patient to do (e.g., accept medication, sit down, etc.). The loop repeats as the clinician listens again to the patient’s response. The clinician may have to repeat his message a dozen or more times before it is heard by the patient.”
[10] (p. 19)
Informed by this description, we conceptualized de-escalation as a flexible, creative, and iterative process, in which crisis responders (i.e., people engaging in de-escalation with agitated individuals) move fluidly through the stages, drawing upon a large set of possible skills and approaches to move towards the goal of decreasing agitation and helping individuals regain their own internal locus of control. From a psychological perspective, humans engaged in de-escalation are operating with complex schemas and heuristics (or mental models), drawing upon their training, past experiences, intuition, and rationale about the current situation. Throughout, crisis responders engage in continual, ongoing assessment of the agitated individual to gauge the risk for violence and aggression, whether selected de-escalation strategies are working, and the anticipated trajectory of the situation. There are few fixed rules about which skills and approaches should be included within each de-escalation scenario; however, generally, the interaction should begin with (1) establishing verbal contact (e.g., stating name and role, providing orientation to time and place, etc.) and developing rapport, and (2) gathering information about agitated individuals and identifying their concerns or reasons for agitation and escalation. The latter is important for informing the selection of subsequent de-escalation strategies and identifying conditions that may facilitate de-escalation. Next, crisis responders have multiple options by which to proceed, though their choice should ideally be grounded in their assessment of the individual and the situation. The consensus of experts in emergency medicine is that de-escalation can frequently be successful in less than five minutes [10]. After successful de-escalation, a best practice is to debrief with fellow crisis responders or staff and with the agitated individual whenever possible.
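Read computationally, this loop suggests a control-flow skeleton for a robot's dialogue manager. The following is a minimal sketch of that reading; every helper function here is a hypothetical placeholder for capabilities discussed in Section 4, not an existing API.

```python
# Minimal sketch of the "verbal loop" described by Richmond et al. [10].
# All helpers (listen, validate, request, is_deescalated) are hypothetical
# placeholders for perception and dialogue capabilities discussed in Section 4.
def verbal_loop(listen, validate, request, is_deescalated, max_turns=20):
    """Listen, validate, restate the request; repeat until heard or handed off."""
    for _ in range(max_turns):
        utterance = listen()     # attend to the person's response
        validate(utterance)      # respond in a way that validates their position
        request()                # restate what we want the person to do
        if is_deescalated():
            return True          # success: proceed to debriefing
    return False                 # hand off to human crisis responders
```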

4. Integrating the De-Escalation Process into Human–Robot Interactions

In the following sections, we explore the elements of our model in greater depth and integrate them with robotics research, machine learning, and artificial intelligence.

4.1. Assessment: Sensing the Environment and Knowing When to Intervene

We described assessment as ongoing throughout the de-escalation process. Humans are constantly making “assessments” of their environments, continually sensing and processing information which subsequently informs their actions. When engaged in de-escalation, a specific frame is applied such that crisis responders are intentionally attuned to information that will guide the de-escalation intervention (e.g., whether the individual’s agitation appears to be increasing or decreasing, whether the immediate physical environment and social context contain elements that could exacerbate agitation). Humans are naturally equipped with several modes through which they can sense information as part of their “assessment”, including sight, hearing, touch, olfaction, taste, the vestibular sense (movement and balance), proprioception (where body parts are in relation to one another), and interoception (the sense of the internal state of the body, both conscious and subconscious). Robots must be equipped with the hardware and software necessary to sense information relevant to the tasks required of them. Determining which types of information robots must be able to sense for de-escalation is one of the first design challenges. Broadly, robots should be able to sense information to determine whether the environment is secure (i.e., free of bystanders, potential weapons, etc.) and to assess the level of agitation and its trajectory on a moment-to-moment basis. Developing robots with functionality for the first is a simpler task, as it does not have the same social processing requirements as the latter two.
For human–human interactions, the assessment component of de-escalation comprises “assessing the aggressor’s emotional state or situation”, “observing and recognizing known warning signs of aggression”, “using all five senses to assess the situation”, “judging the anticipated trajectory of the situation in the context of the individual using existing knowledge”, and “knowing when to intervene” [12] (p. 15). These descriptions reflect part of the challenge in translating de-escalation practices into robots. For example, humans experienced in dealing with agitated individuals may have an intuitive sense of what it means to assess the aggressor’s emotional state or situation and its implications for de-escalation. In contrast, robots do not have innate emotion detection or built-in capacities for empathy or “theory of mind” [17,18] to rely upon in making such behavioural calculations. Furthermore, efforts to build robots or AI systems that can estimate emotional state based on text, acoustical properties of speech, and visual appearance have yet to reach the level of performance associated with humans viewing other humans. Some of the other descriptions are even less helpful. “Knowing when to intervene” is a complex process of being attuned to agitated individuals. It involves making decisions about when to intervene in response to the unique context and circumstances, as opposed to applying a blanket rule. It may also serve as a reminder of the “window of opportunity” to intervene prior to escalation to more serious agitation, aggression, or violence. “Knowing when to intervene”, however, offers little guidance on the types of behaviours robots should be attuned to, or on a policy to guide their actions.
The escalation of agitation and aggression has been described as occurring on a continuum. Models of this process typically depict escalation in three phases: (1) trigger/activation phase; (2) escalation phase; and (3) crisis phase. In the trigger phase, a catalyst event that induces stress begins the escalation process [19]. In the escalation phase, anxiety and agitation grow, evoking angry emotions [20]. In the crisis phase, the individual experiences a loss of self-control and may act violently towards others [19]. Kaplan and Wheeler [21] describe two additional phases in their “assault cycle” model: recovery and post-crisis depression. During these phases, agitated individuals begin to regulate their emotions and regain self-control. Their abilities to think and act rationally begin to return; they may feel remorseful about their actions, with a dip in mood to below baseline before returning to normal. During the transition from “crisis” to “recovery”, individuals may continue to experience elevated stress hormones. As such, though the immediate crisis may have passed, anger can be easily reignited, prompting a possible return to the crisis phase [21].
McKnight [20] elaborated on the continuum models of agitation and aggression by describing the progression of emotions and behaviours through the stages. Individuals move from a state of calm through to anxiety, agitation, aggression, and violence. Assessing and distinguishing between the different stages of escalation is far from an exact science with clear algorithms, though some authors have attempted to delineate markers of each of the stages, e.g., [20] (p. 33). Furthermore, there is variability in how individuals display emotions and states of agitation. Robots equipped with the hardware and software to perceive and identify these behaviours could rely on actuarial predictions of the state of agitated individuals; the more behaviours an individual displays within a certain stage along the continuum, the more likely they are to be at that stage.
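As a concrete illustration of this actuarial idea, the following sketch counts how many markers of each stage are currently observed and reports the best-matching stage. The marker sets are invented placeholders; a real system would require empirically validated markers (e.g., drawn from [20]) and a perception pipeline capable of detecting them.

```python
from collections import Counter

# Hypothetical behaviour markers for each escalation stage. These sets are
# illustrative placeholders, not validated indicators.
STAGE_MARKERS = {
    "calm":       {"relaxed_posture", "normal_speech_volume"},
    "anxiety":    {"fidgeting", "rapid_speech", "scanning_environment"},
    "agitation":  {"pacing", "raised_voice", "clenched_fists"},
    "aggression": {"shouting", "verbal_threats", "striking_objects"},
}

def estimate_stage(observed: set) -> str:
    """Actuarial estimate: report the stage whose markers are most represented."""
    counts = Counter({stage: len(markers & observed)
                      for stage, markers in STAGE_MARKERS.items()})
    stage, n_matches = counts.most_common(1)[0]
    return stage if n_matches > 0 else "unknown"

print(estimate_stage({"pacing", "raised_voice", "fidgeting"}))  # -> "agitation"
```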
Experts on de-escalation in the context of emergency medicine and psychiatry recommend the use of objective scales to assess agitation [10]. Examples of such measures include the Overt Agitation Severity Scale (OASS) [22] or the Modified Overt Aggression Scale (MOAS) [23]. These scales are behavioural checklists quantifying the severity of agitation based on observable behaviours described at varying degrees of specificity. For example, the OASS requires the assessor to indicate the degree of the presence of vocalizations and oral/facial movements (e.g., smacking or licking of lips, chewing, jaw clenching, grimacing, spitting), upper torso and upper extremity movements (e.g., slapping, swatting, hitting at objects or others), and lower extremity movements (e.g., pacing, wandering). The MOAS takes a different approach, classifying signs of agitation and aggression into categories of verbal aggression, aggression against property, self-aggression, and physical aggression. Each category contains subcategories of behaviours ranging from the least severe to the most severe form of each type of agitation/aggression. For example, the verbal aggression category asks the assessor to rank agitated individuals according to the structure given in Table 1.
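To make the structure of such checklists concrete for a robot, they can be encoded as rated categories combined into a severity score. The following sketch assumes a MOAS-like scheme in which each category is rated from 0 (absent) to 4 (most severe) and categories are weighted by seriousness; the weights and example ratings here are illustrative assumptions, not the validated instrument.

```python
# Illustrative MOAS-style scoring. Category names follow the text above;
# the weights and the 0-4 rating convention are assumptions for illustration.
CATEGORY_WEIGHTS = {
    "verbal_aggression": 1,
    "aggression_against_property": 2,
    "self_aggression": 3,
    "physical_aggression": 4,
}

def weighted_agitation_score(ratings: dict) -> int:
    """Weighted sum of per-category severity ratings (each assumed 0-4)."""
    return sum(CATEGORY_WEIGHTS[c] * r for c, r in ratings.items())

# Example: mild cursing (1) and pounding on a table (2), nothing else.
print(weighted_agitation_score({
    "verbal_aggression": 1,
    "aggression_against_property": 2,
    "self_aggression": 0,
    "physical_aggression": 0,
}))  # -> 5
```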
There is a gap between what assessment of agitation means for humans and the requirements to equip robots with the necessary skills. For the human assessor, these measurement scales may add an element of “objectivity” and provide benchmarks of behaviours to look for in quantifying the severity of agitation. In contrast, robots are not privy to the baseline knowledge possessed by humans to determine the difference between “curses mildly” or “curses viciously”. Even among humans, there may not be perfect agreement about what constitutes mild vs. vicious cursing, or about any of the other behaviours. Humans, however, can conjure an image or thought of how each of these five subcategories may look, sound, and feel. The descriptions above and behavioural checklists can provide a starting point for elements to consider in developing robots capable of de-escalation, but they fall short in offering the level of specificity required for training a robot in these practices.
To be engaged in de-escalation, robots not only require the ability to identify individuals in crisis and determine their level of agitation, but they also need to be able to assess the impact of de-escalation interventions in real time to determine whether the intervention is working or whether adjustments are necessary. At a macro-level, indicators that de-escalation interventions are working include decreased agitation, evidenced by positive shifts in affect and behaviour. At a micro-level, robots will need to detect indicators that de-escalation is proceeding effectively. These indicators will vary for each task the robot carries out but may include assessing whether agitated individuals have been successfully engaged by the robot (e.g., verbal acknowledgment, eye contact), whether agitated individuals are receptive to de-escalation tasks (e.g., if offered a drink of water, do they accept it? If limits are set, do they abide by them?), and whether an appropriate amount of personal space is being maintained.

4.2. Planning the Response: Making Decisions about What to Do Next

As illustrated in Figure 1, de-escalation is not linear, but rather a flexible and iterative process that may manifest in numerous ways. To maximize the likelihood of a favourable response from agitated individuals, it would be beneficial for robots to select intervention strategies according to the stage of escalation and emotions that agitated individuals are exhibiting. Furthermore, some intervention strategies may have a higher likelihood of achieving successful de-escalation with individuals, both across and within stages of escalation. Thus, it would be advisable for the robot to begin with these strategies, assess their effectiveness with the agitated individual at hand, and shift to a new strategy if unsuccessful. Some characteristics of agitated individuals (e.g., gender, age, reason for agitation, current emotional presentation) may make them more or less likely to respond well to specific de-escalation strategies. For example, Takayama and Pantofaru [24] found that when a robot’s head is oriented toward a person’s face, the minimum comfortable distance from the robot increases for women but decreases for men. They also found that the personality trait of agreeableness was associated with decreased personal space when people approach robots, while neuroticism was associated with increased personal space. The determination of relevant factors to guide optimal decision making for robots engaged in de-escalation remains an open empirical question.
In addition to between-person differences in optimal robot decision making, there can exist within-person considerations. That is to say, the same individual may have varied reactions to robots depending on their current state and the context of their agitation. For example, a grey literature publication on de-escalation training identified common underlying reasons for agitation along with suggested corresponding intervention goals [25]. The underlying reasons were fear, frustration, manipulation, and intimidation. When escalation is motivated by fear, individuals are presumed to be defending themselves against a perceived threat, and the goal of intervention is to respond in ways that will reduce the perceived threat. When frustration underlies the escalation, individuals are presumed to be acting out in response to a need to express intolerable frustration, and the suggested de-escalation approach is to convey that the crisis responder is in control of the environment. If manipulation is assessed to be the underlying motive of escalation, individuals are presumed to be impulsively attempting to obtain something in exchange for maintaining emotional control and not doing something dangerous. In this case, interventions that indicate crisis responders’ detachment and refusal to become involved in manipulation are thought to decrease the likelihood that individuals will perceive a gain from their agitated behaviour and thereby promote de-escalation. Finally, when escalation is driven by the motive to intimidate, individuals are presumed to be engaging in a calculated attempt to obtain something in exchange for the physical safety of others. Clear communication of consequences for aggression and violence is the suggested de-escalation strategy [25]. The validity of this model of the underlying reasons for escalation and suggested response styles needs to be explored empirically; however, it exemplifies how robots’ decision-making models for de-escalation need to consider contextual factors that can vary within individuals and across situations.
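Although this taxonomy awaits empirical validation, it illustrates the kind of contextual lookup a robot's decision model might perform once a motive has been assessed. The following sketch encodes the four motives and suggested intervention goals from [25]; assessing the motive itself, the input to this lookup, remains the unsolved perception problem.

```python
# Sketch of how the motive taxonomy from [25] could seed a robot's strategy
# selection. The mapping encodes the text above; motive detection is assumed.
MOTIVE_TO_GOAL = {
    "fear":         "respond in ways that reduce the perceived threat",
    "frustration":  "convey calm control of the environment",
    "manipulation": "remain detached; do not reward the behaviour",
    "intimidation": "communicate clear consequences for aggression and violence",
}

def initial_strategy(assessed_motive: str) -> str:
    # Fall back to a generic opening when the motive is unclear.
    return MOTIVE_TO_GOAL.get(
        assessed_motive, "establish verbal contact and gather information")
```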
Though shifting to a new intervention strategy may be necessary if the current strategy is ineffective, it will be important for robots to develop an intelligence capable of perceiving when a strategy needs to be repeated versus when it is ineffective and needs to be changed. At times, crisis responders may pursue more than one strategy concurrently (e.g., validating feelings and circumstances while also setting limits, or asking clarification questions while also problem solving and offering food or water). Robots engaged in crisis de-escalation will similarly need to be able to make decisions to move fluidly between strategies. The pacing of interventions will also be an important consideration for robots. For example, robots will need to decide how long to wait before repeating an instruction or request. Humans make decisions around pacing based on a complex array of verbal and non-verbal perceptions about individuals they are interacting with, in combination with individual differences in communication style. Appropriate pacing for human–robot interactions in the context of crisis de-escalation is another empirical domain that remains to be explored.

4.3. Actions to Support De-Escalation

The components discussed in the last two sections occur “behind the scenes”; they are not directly observable to people with whom robots may be interacting. This section addresses the ways in which robots may outwardly engage in de-escalation practices. Much like humans, robots can draw on an array of verbal and non-verbal characteristics and behaviours to convey meaning and exert desired effects on their environment. Though not an all-inclusive list, robots can vary in terms of their appearance (e.g., gender, size, anthropomorphism, use of an avatar, uniform, method for displaying emotion, etc.), language (including both the content of what is said and how the robot says it—tone, volume, speed, accent, etc.), personality (e.g., introverted vs. extraverted), gaze behaviour, movements, and proxemics (i.e., personal space). Researchers within the field of human–robot interaction have begun to explore how design choices within these areas can impact interactions with humans. A comprehensive review of every domain of robot-design research relevant to de-escalation tactics is beyond the scope of this paper, as many warrant their own independent review. Instead, we present representative examples within domains to illustrate the spectrum of design considerations.

4.3.1. Verbal Communication Principles

As outlined in Figure 1, there are numerous verbal communication principles that crisis responders are advised to incorporate throughout de-escalation interventions, regardless of the specific strategies implemented at that moment. These include active listening, paraphrasing, using open questions and clear, concise, and decisive language, demonstrating perspective-taking, and showing empathy and concern. Robots engaged in de-escalation require the ability to understand and respond to open-ended input from agitated individuals, who may be uncooperative and deviate from a predictable conversational flow. Continued advancement in voice-enabled chatbots, harnessing natural language processing technology, will be crucial in developing robots that can utilize sophisticated receptive and expressive language skills to effectively employ these verbal de-escalation strategies (see Xiao et al. [26] for a recent example of a model for an interview chatbot capable of active listening). Regarding empathy, Bejarano and colleagues [27] found that a robot designed to maintain the flow of conversation by asking related vs. unrelated follow-up questions to further understand a person’s feelings was perceived as more empathetic than one that did not use questioning. This finding suggests that humans may be just as sensitive to the nuances that convey empathy and understanding (and related constructs) in robot–human interactions as they are in human–human interactions.
Some research suggests that design choices regarding verbal communication may be more complicated than simply using state-of-the-art AI voice technology. For example, Law and colleagues [28] conducted an experiment in which they manipulated a robot’s perceived emotional intelligence, gender, and communication method (voice vs. text) to explore the impact on trust. Unexpectedly, they found that participants reported greater trust in the robot when it communicated through text rather than voice format. They hypothesized that this effect may have been due to participants’ expectation that the robot would have a more expressive and human-like voice and the violation of this expectation subsequently resulted in reduced trust.

4.3.2. Non-Verbal Communication Principles

Non-verbal communication principles that should be integrated throughout de-escalation interactions include the use of an appropriate tone of voice, appearing calm, engaging in slow and predictable movements, making appropriate, non-provocative eye contact, being non-threatening, maintaining appropriate personal space and physical positioning, exhibiting slow and simple pacing, showing congruence between words and actions, showing concern, and matching agitated individuals to reduce perceived power differentials.
Robot tone of voice can be manipulated using features including volume, speaking speed, and pitch. The pacing of the de-escalation interaction can also be controlled using speaking speed along with the amount of speech. Studies examining the impact of tone of voice on the perception of robot personality suggest that humans are sensitive to these features and use them to attribute different characteristics to robots. For example, research indicates that robots that speak at a higher volume, with fast speed, higher and more varied pitch, and a larger amount of speech are typically perceived as being more extraverted (e.g., [29,30]). Kim et al. [31] found that robot voice volume was negatively related to perceived friendliness. The de-escalation literature offers little insight into what specifically constitutes an appropriate tone of voice in the context of de-escalation. Presumably, however, the tone of voice of crisis responders should be calm, empathic, warm, firm, and non-threatening. Future research could elucidate which configurations of a robot’s tone of voice are most likely to be perceived as having these qualities.
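As one concrete illustration, off-the-shelf text-to-speech engines expose exactly these parameters. The following sketch uses the open-source pyttsx3 engine with an assumed "calm" configuration of rate and volume; which settings are actually perceived as calm and non-threatening is the empirical question raised above.

```python
import pyttsx3  # an off-the-shelf TTS engine; any engine with rate/volume control would do

engine = pyttsx3.init()

# Assumed "calm" configuration: slower and quieter than typical defaults.
# The specific values are illustrative, not empirically validated.
engine.setProperty("rate", 140)    # words per minute (engine default is ~200)
engine.setProperty("volume", 0.6)  # scale of 0.0 to 1.0

engine.say("I am here to help. Can you tell me what is going on?")
engine.runAndWait()
```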
There is a large body of research on the gaze behaviour of robots (see Admoni and Scassellati [32] for a review). First and foremost, design choices dictate the type of gaze behaviours of which robots are capable. Virtual agents (i.e., avatars) can allow for fine-tuned control over the appearance and timing of gaze behaviour, incorporating subtle eyelid, eyebrow, and eyeball movements. Such small and subtle movements are difficult to achieve with physical motors on embodied robots. Virtual agents can mimic human eye movements with greater precision than physical robots because they are animated; however, it should be noted that encoding realistic gaze behaviour is an ongoing area of research [33]. Furthermore, although devices such as the Furhat [34], with a rear-projected head-shaped projection surface, can provide a head-shaped avatar, it is unclear whether projected avatar displays are as effective as physical structures. Given that de-escalation invariably requires a physically embodied robot, designs merging embodied robots with virtual agents (e.g., a robot body with a screen displaying an avatar face) may enable more realistic gaze behaviour.
Second, humans appear to respond differently to robot gaze than to human gaze. A test of reflexive cueing (the tendency of humans to shift their attention in the direction of another person’s averted gaze) found that robots failed to elicit this response in people, suggesting that humans process robot gaze more as directional arrows than as faces [32,35]. Further, eye-tracking studies with infants have found that anticipatory eye gaze does not shift in response to a robot’s referential gaze as it typically does with humans [36]. Though humans may respond differently to robot gaze, research suggests that humans are sensitive to the interplay between their own gaze and a robot’s gaze. Robots that convey joint attention and mutual gaze that is responsive to human interactional partners induce greater self-reports of the “feeling of being looked at” compared with robots with gaze behaviour that is unresponsive to and independent of humans’ gaze [37,38].
The context of human–robot interactions has been found to influence which type of gaze behaviour is best. When conversational topics are emotionally neutral, robots that make eye contact are perceived as more sociable and intelligent. When the topic of conversation is embarrassing, however, robots that avoid eye contact are perceived more favourably [39]. When the goal of the interaction was for robots to persuade their conversational partners, natural gaze behaviour was most effective [40]. Moreover, when persuasive gestures were performed in conjunction with eye gaze, robots’ persuasiveness improved; when performed in the absence of eye gaze, persuasive gestures impeded robots’ persuasiveness [41]. Finally, Admoni and Scassellati [32] note that robots’ gaze behaviour can be used to regulate the pace of conversations, convey mental states, and express personality and emotion. Exploring the ideal type of gaze behaviour within de-escalation interactions, and its impact on humans’ perceptions of and responses to robots in that context, will be an important area for future research.
Robot movement (e.g., speed of approach) and proxemic behaviour are another active field of research (e.g., [24,42,43,44,45]). Interpersonal distancing theories [46,47] propose three functions of interpersonal distancing: (1) protection (perceived threat due to spatial invasion evokes a fight/flight response, with greater distance facilitating easier escape); (2) regulation of arousal (interpersonal distance can be used to control the amount of incoming information and prevent overstimulation); and (3) communication (information about the nature of the relationship between individuals can be communicated through interpersonal distancing, such as through physical closeness). Agitated individuals are likely to be in a state of fight/flight and hyperarousal, and thus their needs regarding personal space likely differ from those of participants typically included in studies of robot movement and proxemics. For de-escalation led by human crisis responders, slow and predictable movements are recommended. One study indicated that humans prefer robots that move more slowly than humans: approximately 1 m per second, just under the average human walking speed [48]. The preference for robots to move slightly more slowly than humans under ordinary circumstances could mean that movements may need to be even slower in de-escalation situations, though this needs to be tested empirically. Macarthur and colleagues [43] found that humans rated robots as more trustworthy when they maintained greater personal space and had a slower speed of approach; however, the authors did not specify the degrees of personal space or speeds that were used in the experiment. Ideally, the movements and proxemic behaviour of crisis responders convey a non-threatening and calm presence to agitated individuals. Robots’ movement and proxemic behaviours should be designed with similar goals in mind. The idea of designing robots to move in the same manner as humans is known as social navigation and is an open area of research in robotics (see Baghi and Dudek [49] as an example). Work to date in social navigation leaves open the question as to whether humans would prefer robots to move in ways similar to or different from humans.
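One way to operationalize "slow and predictable" is to cap the robot's approach speed below the preferred ceiling of roughly 1 m/s [48] and ramp down smoothly near a personal-space buffer, as in the following sketch. The de-escalation speed cap and stop distance used here are assumptions that would need empirical testing with agitated individuals.

```python
# Sketch of a "slow, predictable approach" constraint. The 1 m/s ceiling
# reflects the preference study cited above [48]; the lower de-escalation
# cap, stop distance, and ramp distance are untested assumptions.
ORDINARY_SPEED_CAP = 1.0   # m/s, preferred ceiling in ordinary interactions [48]
DEESCALATION_SPEED = 0.5   # m/s, assumed more conservative cap for crises
STOP_DISTANCE = 2.0        # m, assumed personal-space buffer
RAMP_DISTANCE = 2.0        # m, distance over which to slow to a stop

def approach_velocity(distance_to_person: float) -> float:
    """Forward speed that decreases linearly and stops at the personal-space buffer."""
    if distance_to_person <= STOP_DISTANCE:
        return 0.0
    ramp = min(1.0, (distance_to_person - STOP_DISTANCE) / RAMP_DISTANCE)
    return DEESCALATION_SPEED * ramp
```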
In human–human interactions, congruence between actions and words conveys genuineness and supports a sense of trust, safety, security, and greater predictability within relationships. Above, we described a study in which greater trust was reported when the robot communicated in text rather than voice format, with this effect possibly explained by a violation of the expectation for the robot to have a more expressive and human-like voice [28]. The robot in this case could be considered “incongruent”, in that its voice was not consistent with what participants expected based on its other qualities. Other studies within the field of human–robot interaction have noted effects related to a possible violation of expectations or “incongruence”. Chita-Tegmark et al. [50] found that, despite studies suggesting that women are typically perceived as more emotionally intelligent than men [51,52], male robots were rated as more emotionally intelligent than female robots, not only when the robots had gendered voices, but also when the only indication of their gender was their name. The researchers hypothesized that this finding may be due to a violation of the expectation for female robots to be more emotionally intelligent, whereas male robots were perceived as performing better than expected, thus resulting in more favourable ratings. To determine what congruence looks like for robots engaged in de-escalation, a necessary first step may be developing a thorough understanding of human expectations of robots in this role, so that robots can be designed in a way that is consistent and “congruent” with human expectations.
Robots have access to a range of non-verbal communication mechanisms that are unavailable to humans, which can be uniquely manipulated to produce different effects. One example is the size or height of the robot. Walters and colleagues [53] found that robots with a humanoid face and a shorter height (1.2 m tall) were perceived as less conscientious and more neurotic, while the same robot with a taller height (1.4 m tall) was perceived as more humanlike and conscientious. It is unclear at this time how a robot’s size might impact de-escalation, but studies such as this suggest it will be an important factor to consider.

4.3.3. Specific Tasks

The specific tasks of de-escalation described in Figure 1 (e.g., establish verbal contact, develop rapport, gather information, validate feelings and circumstances, limit setting, use of humour, offer choices, give reassurance/support, problem solving, etc.) do not represent an all-inclusive list of potential de-escalation strategies. Given that robot-led de-escalation has yet to be attempted, there is little evidence to draw upon in terms of how robots can effectively engage agitated individuals in specific de-escalation tasks. Furthermore, de-escalation tasks require a varied and flexible skillset. For example, the skills required to offer an agitated individual food and water versus the verbal and knowledge-based skills required to engage in problem-solving are radically different in terms of their computational and physical requirements. Robots engaged in de-escalation do not need to be able to perform all possible de-escalation tasks—equally effective human crisis responders certainly vary in terms of the specific skills they draw upon and tasks they tend towards. At this point, it is difficult to make recommendations regarding which types of tasks to prioritize, as there is not adequate evidence to inform such recommendations. The starting place for specific tasks may also be dictated by the tasks that are most feasible given the current state of technology and by the context in which the robot is intended to engage in de-escalation (e.g., healthcare, customer service, security, etc.).

5. Training the Robot

Robots designed to engage in crisis de-escalation cannot be programmed in a fixed, algorithmic, rule-based manner, as, for the most part, there are no specific, universal rulesets to govern the interaction (e.g., if the client’s speech volume increases by 25%, then move 1 m away). Even if there were algorithms governing human-to-human de-escalation, whether those would translate into effective robot-to-human de-escalation is an empirical question. Whether humans will respond to robots similarly to how they respond to humans engaged in analogous behaviours is also an open empirical question. It is worth noting that humans undergoing training in de-escalation face the same limitations that robots must overcome with regard to a lack of specific algorithms to guide their behaviour. Though there are some general principles of de-escalation that beginner crisis responders can draw on to inform their initial approach, the development of expertise in this area requires repeated practice and experience. Through an experience-based or “data-driven” learning process, the human brain can detect patterns and develop implicit and explicit “schemas” and “heuristics”, or, in the language of computer programming, “policies”, to guide behaviour in future de-escalation interactions.
There are modern machine-learning techniques that are well suited to data-driven strategies for learning. A range of possible approaches exists, including fuzzy logic [54] and supervised and unsupervised neural networks [55,56]. However, one approach that might be particularly suitable is reinforcement learning. Reinforcement learning [57] is an adaptive form of artificial intelligence in which the machine learns a new policy to follow through the provision of rewards associated with a target system state. A reward function serves as an incentive mechanism to inform the agent (i.e., the robot or machine) what is correct and what is incorrect, with the goal of the agent being to maximize the total reward. In standard reinforcement learning, the agent performs an action in the environment and then receives the next state and a reward. The agent learns by iteratively interacting with the environment.
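This interaction cycle can be made concrete with tabular Q-learning, the simplest instance of the paradigm. The following sketch assumes a discrete, Gym-style environment (a reset() returning a state and a step(action) returning the next state, a reward, and a done flag); a de-escalation environment with such an interface does not yet exist, which is precisely the difficulty discussed below.

```python
import random

# Tabular Q-learning: act, observe reward and next state, update estimates.
# `env` is a placeholder with an assumed Gym-style reset()/step() interface.
def q_learning(env, states, actions, episodes=1000,
               alpha=0.1, gamma=0.95, epsilon=0.1):
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # Temporal-difference update toward reward plus discounted future value.
            best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q
```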
Reinforcement learning has found a wide range of applications from wireless network optimization [58] to power distribution management [59] and the development of socially aware navigation functions for robots [60]. Developing reward functions for robots to learn crisis de-escalation skills is a challenging task, as de-escalation involves a complex series of behaviours, the sequence of which is neither well understood, nor necessarily consistent within and between individuals and contexts. The potential action space for robots training in de-escalation will also be so large that sparse reward environments (i.e., when a very small number of actions return a reward) will likely pose a challenge because selecting actions at random may not achieve the final reward state (i.e., successful de-escalation, however it may be defined). Reward shaping may offer some promise in overcoming this challenge. In this method, a reward function is designed to provide more frequent feedback on appropriate behaviours by rewarding actions that achieve states that are precursors of or close to the final goal state. Still, setting the reward function for de-escalation is more complicated than most, if not all, of the tasks in which reinforcement learning has been successful. For example, when training an agent to play chess or to stack a block on top of another block, the desired end state is more easily defined (i.e., win vs. lose; block balanced on top of other block vs. not). Operationalizing a successful de-escalation intervention is not as easily done. Broadly, the reward function could reinforce actions that are associated with reductions in a person’s agitation, but defining what does and does not represent the end state of reduced agitation is layered, depending on both the context and humans involved.
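A shaped reward for de-escalation might resemble the following sketch, which assumes a numeric agitation score (e.g., a checklist-style measure as in Section 4.1) is available at every step and pays small intermediate rewards for movement toward calm. As discussed below, even a function this simple invites unintended solutions; the terminal bonus and threshold here are purely illustrative.

```python
def shaped_reward(prev_agitation: float, curr_agitation: float,
                  episode_over: bool, calm_threshold: float = 1.0) -> float:
    """Dense feedback: reward each step's reduction in an assumed agitation score."""
    reward = prev_agitation - curr_agitation   # positive when agitation decreases
    if episode_over:
        # Terminal bonus/penalty for ending calm vs. ending escalated (assumed values).
        reward += 10.0 if curr_agitation <= calm_threshold else -10.0
    return reward
```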
There is also the issue of sample inefficiency, given that obtaining the large set of data required to train an agent is problematic. First and foremost, there are significant ethical issues associated with sending an untrained or partially trained agent into situations with agitated individuals to engage in the trial-and-error process of reinforcement learning. Ethical issues aside, it is also hard to imagine where to acquire a sample of agitated individuals large enough to train the agent. For example, training an agent to play Atari 2600 games required millions of attempts to play the game [61]. De-escalation involves a significantly larger action space and more complicated reward function than Atari 2600 games; as such, the number of training attempts required would likely be much larger.
Offline reinforcement learning offers some promise in overcoming these challenges. In offline reinforcement learning, the agent trains on a fixed dataset of previously collected experiences with known trajectories. The agent interacts with the dataset to collect a set of experiences to learn a policy without engaging with a real-world environment. Offline reinforcement learning is particularly valuable for tasks for which real-world interaction is prohibitively dangerous or expensive, as in the case for training a robot in de-escalation. This learning paradigm is limited, however, by the datasets that are available for training. With police bodycams becoming a norm within the field, it is possible that footage collected from police interactions could be one dataset on which robots could be trained. Terrill and Zimmerman [62] offer a systematic methodology for coding and analyzing video data collected from bodycams for patterns of escalation and de-escalation, and observations from their study regarding the strengths and limitations of bodycam footage may help to guide machine learning with this type of data. Police interactions are a highly specific context for de-escalation, however, and would likely not be adequate to train a robot for de-escalation in other settings. As such, it would be beneficial to acquire datasets representative of other fields in which de-escalation practices are often implemented (e.g., emergency medicine, mental health nursing, special education).
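In code, the offline setting changes only where experience comes from: the agent sweeps repeatedly over a fixed log of transitions rather than acting in the world, as in the naive sketch below. A practical offline system would additionally require corrections (e.g., conservative value estimates) for actions poorly represented in the dataset.

```python
# Naive batch Q-learning over logged (state, action, reward, next_state, done)
# tuples, e.g., transitions coded from bodycam footage. No environment
# interaction occurs; the dataset fully determines what can be learned.
def offline_q_learning(dataset, states, actions, epochs=50,
                       alpha=0.05, gamma=0.95):
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(epochs):
        for state, action, reward, next_state, done in dataset:
            best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    return Q
```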
With reinforcement learning, the solutions that machines devise to maximize reward functions are often not the outcomes intended. What is intended is not always consistent with what is incentivized, and it can be difficult at times to capture exactly what an agent should be rewarded to do. Consequently, reward functions use imperfect but easily quantified proxies for desired outcomes. In one example, researchers training an agent on a motorboat race game encountered a problem in which the agent learned to maximize its score by circling around and hitting targets repeatedly, but without finishing the race [63]. With de-escalation, it is not hard to imagine reward functions that could be maximized through strategies causing more harm than good, or strategies that are inherently dangerous. For example, a quick way to reduce agitation would be to cause a person to lose consciousness, but this is not an acceptable strategy for de-escalation.
Lastly, there remains the challenge of equipping robots with the requisite hardware and software to sense the information needed for decision making and to adequately convey the appropriate response to their human interactional partner.

6. Future Directions

Throughout this paper, we have considered the capabilities required by robots to engage in the de-escalation process from start to finish as autonomous agents. Given the complexity of this task as well as the question of whether humans would react favourably to a robot attempting de-escalation, it may be better to consider how robots could assist humans with aspects of the de-escalation process, rather than how robots could engage in de-escalation independently. For example, though we have not addressed it in depth thus far, a crucial step in de-escalation for human teams is the debriefing process. Debriefing is important as it provides an opportunity for humans to reflect on their interventions, consolidate aspects that went well, and identify areas for improvement. Robots could theoretically play a role in augmenting the debriefing process. For example, a robot with a video record of situations in which de-escalation occurred could be trained to identify key moments to present to human teams for discussion. Robots could also have a preventative role. For example, a robot engaged in surveillance in public settings could be trained to identify individuals exhibiting warning signs of significant agitation so that intervention by humans can occur before a crisis point is reached. Robots could perform other assistive roles as well to increase the efficiency and effectiveness of human crisis responders. Finally, rather than directly interacting with agitated individuals, robots could take on an advisory role, recommending interventions to human crisis responders engaged in de-escalation. Trust between human crisis responders and the robot is crucial in this scenario (see de Visser et al. [64] for a discussion on trust in human–robot teams).
Before robots can be trained to perform any role in de-escalation, research is needed to better understand and codify de-escalation. Video-recorded naturalistic observational research on de-escalation would help to fill current gaps in understanding. Naturalistic observations allow researchers to study phenomena up close, in the moment, as they occur, enabling a fine-tuned examination of moment-to-moment processes in de-escalation. Most of the current research on de-escalation has been based on post-hoc qualitative reports from crisis responders, which have provided insight on macro-level themes in de-escalation but have not allowed for in-depth analysis of more micro-level processes. Video-recorded naturalistic observations would also help to build the large datasets required for reinforcement learning regimes to train robots in de-escalation. Though this type of work can be costly, time-consuming, and labour-intensive, it is a necessary first step in the effort to develop robots capable of assisting with such a socially and behaviourally complex task as de-escalation.

Author Contributions

Conceptualization, K.P., D.J.P., S.G.C. and M.J.; methodology, K.P.; writing—original draft preparation, K.P.; writing—review and editing, K.P., D.J.P. and M.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Canadian Innovation for Defence Excellence and Security (IDEaS) Innovation Networks, grant number CFPMN2-027.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Rubenstein, M.; Cimino, B.; Nagpal, R.; Werfel, J. AERobot: An affordable one-robot-per-student system for early robotics education. In Proceedings of the 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26–30 May 2015; pp. 6107–6113.
2. Robinson, N.L.; Connolly, J.; Suddery, G.; Turner, M.; Kavanagh, D.J. A humanoid social robot to provide personalized feedback for health promotion in diet, physical activity, alcohol and cigarette use: A health clinic trial. In Proceedings of the 2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN), Vancouver, BC, Canada, 8–12 August 2021; pp. 720–726.
3. Rakhymbayeva, N.; Amirova, A.; Sandygulova, A. A Long-Term Engagement with a Social Robot for Autism Therapy. Front. Robot. AI 2021, 8, 669972.
4. Brumson, B. Robotics in Security and Military Applications. 2021. Available online: https://www.automate.org/industry-insights/robotics-in-security-and-military-applications (accessed on 28 March 2023).
5. Knightscope. 2022. Available online: https://www.knightscope.com (accessed on 28 March 2023).
6. Taylor, J.A.; Murray, R.; Shepler, L.; Davis, A. Mitigation of Occupational Violence to Firefighters and EMS Responders; FEMA Report HSFE20-15-Q-0053; United States Fire Administration: Washington, DC, USA, 2017. Available online: https://www.usfa.fema.gov/downloads/pdf/publications/mitigation_of_occupational_violence.pdf (accessed on 28 March 2023).
7. Monroe, C.M.; Van Rybroek, G.J.; Maier, G.J. Decompressing aggressive inpatients: Breaking the aggression cycle to enhance positive outcome. Behav. Sci. Law 1988, 6, 543–557.
8. Roberts, A.R. Bridging the past and present to the future of crisis intervention and crisis management. In Crisis Intervention Handbook: Assessment, Treatment, and Research, 3rd ed.; Roberts, A.R., Ed.; Oxford University Press: Oxford, UK, 2005; pp. 3–34.
9. National Institute for Clinical Excellence. Violence: The Short Term Management of Disturbed/Violent Behaviour in Psychiatric In-Patient Settings and Emergency Departments National Cost-Impact Report; National Institute for Clinical Excellence: London, UK, 2005.
10. Richmond, J.S.; Berlin, J.S.; Fishkind, A.B.; Holloman, G.H.; Zeller, S.L.; Wilson, M.P.; Rifai, M.A.; Ng, A.T. Verbal De-escalation of the Agitated Patient: Consensus Statement of the American Association for Emergency Psychiatry Project BETA De-escalation Workgroup. West. J. Emerg. Med. 2012, 13, 17–25.
11. Todak, N.; James, L. A Systematic Social Observation Study of Police De-Escalation Tactics. Police Q. 2018, 21, 509–543.
12. Hallett, N.; Dickens, G.L. De-escalation: A survey of clinical staff in a secure mental health inpatient service. Int. J. Ment. Health Nurs. 2015, 24, 324–333.
13. Mavandadi, V.; Bieling, P.J.; Madsen, V. Effective ingredients of verbal de-escalation: Validating an English modified version of the ‘De-Escalating Aggressive Behaviour Scale’. J. Psychiatr. Ment. Health Nurs. 2016, 23, 357–368.
14. Nau, J.; Halfens, R.; Needham, I.; Dassen, T. The De-Escalating Aggressive Behaviour Scale: Development and psychometric testing. J. Adv. Nurs. 2009, 65, 1956–1964.
15. Fong, T.; Nourbakhsh, I.; Dautenhahn, K. A survey of socially interactive robots. Robot. Auton. Syst. 2003, 42, 143–166.
16. Nocentini, O.; Fiorini, L.; Acerbi, G.; Sorrentino, A.; Mancioppi, G.; Cavallo, F. A Survey of Behavioral Models for Social Robots. Robotics 2019, 8, 54.
17. Premack, D.; Woodruff, G. Does the chimpanzee have a theory of mind? Behav. Brain Sci. 1978, 1, 515–526.
18. Perner, J. Understanding the Representational Mind; MIT Press: Cambridge, MA, USA, 1991.
19. Byrnes, J.D. The aggression continuum: A paradigm shift. Occup. Health Saf. 2000, 69, 70–71.
20. McKnight, S.E. De-Escalating Violence in Health Care: Strategies to Reduce Emotional Tension and Aggression; Sigma Theta Tau International: Indianapolis, IN, USA, 2020.
21. Kaplan, S.G.; Wheeler, E.G. Survival Skills for Working with Potentially Violent Clients. Soc. Casework 1983, 64, 339–346.
22. Yudofsky, S.C.; Kopecky, H.J.; Kunik, M.; Silver, J.M.; Endicott, J. The Overt Agitation Severity Scale for the objective rating of agitation. J. Neuropsychiatry Clin. Neurosci. 1997, 9, 541–548.
23. Kay, S.R.; Wolkenfeld, F.; Murrill, L.M. Profiles of Aggression among Psychiatric Patients. J. Nerv. Ment. Dis. 1988, 176, 539–546.
24. Takayama, L.; Pantofaru, C. Influences on proxemic behaviors in human-robot interaction. In Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, MO, USA, 10–15 October 2009; pp. 5495–5502. Available online: https://dl.acm.org/doi/10.5555/1732643.1732940 (accessed on 28 March 2023).
25. Saskatchewan Association for Safe Workplaces in Health. De-Escalation Verbal Crisis Intervention: Education Session Participant Handout. 2019. Available online: https://www.srsd119.ca/wp-content/uploads/SSS/SASW_De-escalation_Verbal-Crisis-Intervention_WEB.pdf (accessed on 28 March 2023).
26. Xiao, Z.; Zhou, M.X.; Chen, W.; Yang, H.; Chi, C. If I hear you correctly: Building and evaluating interview chatbots with active listening skills. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–14.
27. Bejarano, A.; Lomax, O.; Scherschel, P.; Williams, T. Designing for perceived robot empathy for children in long-term care. In International Conference on Social Robotics; Springer: Cham, Switzerland, 2021; pp. 743–748.
28. Law, T.; Chita-Tegmark, M.; Scheutz, M. The Interplay Between Emotional Intelligence, Trust, and Gender in Human–Robot Interaction. Int. J. Soc. Robot. 2021, 13, 297–309.
29. Chang, R.C.-S.; Lu, H.-P.; Yang, P. Stereotypes or golden rules? Exploring likable voice traits of social robots as active aging companions for tech-savvy baby boomers in Taiwan. Comput. Hum. Behav. 2018, 84, 194–210.
30. Tapus, A.; Ţăpuş, C.; Matarić, M.J. User—Robot personality matching and assistive robot behavior adaptation for post-stroke rehabilitation therapy. Intell. Serv. Robot. 2008, 1, 169–183.
31. Kim, J.; Kwak, S.S.; Kim, M. Entertainment robot personality design based on basic factors of motions: A case study with ROLLY. In Proceedings of the RO-MAN 2009-The 18th IEEE International Symposium on Robot and Human Interactive Communication, Toyama, Japan, 27 September–2 October 2009; pp. 803–808.
32. Admoni, H.; Scassellati, B. Social eye gaze in human-robot interaction: A review. J. Hum. Robot. Interact. 2017, 6, 25–63.
33. Ruhland, K.; Peters, C.E.; Andrist, S.; Badler, J.B.; Badler, N.I.; Gleicher, M.; Mutlu, B.; McDonnell, R. A Review of Eye Gaze in Virtual Agents, Social Robotics and HCI: Behaviour Generation, User Interaction and Perception. Comput. Graph. Forum 2015, 34, 299–326.
34. Al Moubayed, S.; Beskow, J.; Skantze, G.; Granström, B. Furhat: A back-projected human-like robot head for multiparty human-machine interaction. In Cognitive Behavioural Systems; Springer: Berlin/Heidelberg, Germany, 2012; pp. 114–130.
35. Admoni, H.; Bank, C.; Tan, J.; Toneva, M.; Scassellati, B. Robot gaze does not reflexively cue human attention. Proc. Annu. Meet. Cogn. Sci. Soc. 2011, 33, 1983–1988. Available online: https://escholarship.org/uc/item/3pq1v9b0 (accessed on 28 March 2023).
36. Okumura, Y.; Kanakogi, Y.; Kanda, T.; Ishiguro, H.; Itakura, S. Infants understand the referential nature of human gaze but not robot gaze. J. Exp. Child Psychol. 2013, 116, 86–95.
37. Yoshikawa, Y.; Shinozawa, K.; Ishiguro, H.; Hagita, N.; Miyamoto, T. Responsive robot gaze to interaction partner. In Robotics: Science and Systems; 2006; pp. 37–43. Available online: http://www.roboticsproceedings.org/rss02/p37.pdf (accessed on 28 March 2023).
38. Yoshikawa, Y.; Shinozawa, K.; Ishiguro, H.; Hagita, N.; Miyamoto, T. The effects of robot gaze on human attention and memory in a collaborative task. In Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, 9–15 October 2006; pp. 5221–5226.
39. Choi, J.J.; Kim, Y.; Kwak, S.S. Have you ever Lied: The impacts of gaze avoidance on people’s perception of a robot. In Proceedings of the 2013 8th ACM/IEEE International Conference on Human-Robot Interaction (HRI), Tokyo, Japan, 3–6 March 2013; pp. 105–106.
40. Chidambaram, V.; Chiang, Y.H.; Mutlu, B. Designing persuasive robots: How robots might persuade people using vocal and nonverbal cues. In Proceedings of the Seventh Annual ACM/IEEE International Conference on Human-Robot Interaction, Boston, MA, USA, 5–8 March 2012; pp. 293–300.
41. Ham, J.; Cuijpers, R.H.; Cabibihan, J.-J. Combining Robotic Persuasive Strategies: The Persuasive Power of a Storytelling Robot that Uses Gazing and Gestures. Int. J. Soc. Robot. 2015, 7, 479–487.
42. Leichtmann, B.; Nitsch, V. Is the Social Desirability Effect in Human–Robot Interaction overestimated? A Conceptual Replication Study Indicates Less Robust Effects. Int. J. Soc. Robot. 2021, 13, 1013–1031.
43. MacArthur, K.R.; Stowers, K.; Hancock, P.A. Human-robot interaction: Proximity and speed—Slowly back away from the robot! In Advances in Human Factors in Robots and Unmanned Systems; Springer: Cham, Switzerland, 2017; pp. 365–374.
44. Mumm, J.; Mutlu, B. Human-robot proxemics: Physical and psychological distancing in human-robot interaction. In Proceedings of the 6th International Conference on Human-Robot Interaction, Lausanne, Switzerland, 6–9 March 2011; pp. 331–338.
45. Rios-Martinez, J.; Spalanzani, A.; Laugier, C. From Proxemics Theory to Socially-Aware Navigation: A Survey. Int. J. Soc. Robot. 2015, 7, 137–153.
46. Uzzell, D.; Horne, N. The influence of biological sex, sexuality and gender role on interpersonal distance. Br. J. Soc. Psychol. 2006, 45, 579–597.
47. Aiello, J.R. Human spatial behaviour. In Handbook of Environmental Psychology; Stokols, D., Altman, I., Eds.; Wiley: Hoboken, NJ, USA, 1987; Volume 1, pp. 505–531.
48. Butler, J.T.; Agah, A. Psychological Effects of Behavior Patterns of a Mobile Personal Robot. Auton. Robot. 2001, 10, 185–202.
49. Baghi, B.H.; Dudek, G. Sample efficient social navigation using inverse reinforcement learning. arXiv 2021.
50. Chita-Tegmark, M.; Lohani, M.; Scheutz, M. Gender effects in perceptions of robots and humans with varying emotional intelligence. In Proceedings of the 2019 14th ACM/IEEE International Conference on Human–Robot Interaction (HRI), Daegu, Republic of Korea, 11–14 March 2019; pp. 230–238.
51. Petrides, K.V.; Furnham, A.; Martin, G.N. Estimates of Emotional and Psychometric Intelligence: Evidence for Gender-Based Stereotypes. J. Soc. Psychol. 2004, 144, 149–162.
52. Lopez-Zafra, E.; Gartzia, L. Perceptions of gender differences in self-report measures of emotional intelligence. Sex Roles 2014, 70, 479–495.
53. Walters, M.L.; Koay, K.L.; Syrdal, D.S.; Dautenhahn, K.; Te Boekhorst, R. Preferences and Perceptions of Robot Appearance and Embodiment in Human-Robot Interaction Trials. In Proceedings of New Frontiers in Human-Robot Interaction, 2009. Available online: https://uhra.herts.ac.uk/bitstream/handle/2299/9642/903516.pdf?sequence=1&isAllowed=y (accessed on 28 March 2023).
54. De Silva, C.W. Intelligent Control: Fuzzy Logic Applications; CRC Press: Boca Raton, FL, USA, 2018.
55. Domingos, P. A few useful things to know about machine learning. Commun. ACM 2012, 55, 78–87.
56. Hinton, G.E.; Osindero, S.; Teh, Y.W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554.
57. Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2018.
58. Mismar, F.B.; Evans, B.L.; Alkhateeb, A. Deep Reinforcement Learning for 5G Networks: Joint Beamforming, Power Control, and Interference Coordination. IEEE Trans. Commun. 2019, 68, 1581–1592.
59. Gao, Y.; Yu, N. Deep reinforcement learning in power distribution systems: Overview, challenges, and opportunities. In Proceedings of the 2021 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT), Washington, DC, USA, 16–18 February 2021; pp. 1–5.
60. Baghi, B.H.; Konar, A.; Hogan, F.; Jenkin, M.; Dudek, G. SESNO: Sample Efficient Social Navigation from Observation. In Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, 23–27 October 2022; pp. 9164–9171.
61. Terrill, W.; Zimmerman, L. Police Use of Force Escalation and De-escalation: The Use of Systematic Social Observation With Video Footage. Police Q. 2022, 25, 155–177.
62. Mnih, V.; Kavukcuoglu, K.; Silver, D.; Graves, A.; Antonoglou, I.; Wierstra, D.; Riedmiller, M. Playing Atari with deep reinforcement learning. arXiv 2013.
63. Clark, J.; Amodei, D. Faulty Reward Functions in the Wild. 2016. Available online: https://openai.com/blog/faulty-reward-functions/ (accessed on 28 March 2023).
64. de Visser, E.J.; Peeters, M.M.M.; Jung, M.F.; Kohn, S.; Shaw, T.H.; Pak, R.; Neerincx, M.A. Towards a Theory of Longitudinal Trust Calibration in Human–Robot Teams. Int. J. Soc. Robot. 2019, 12, 459–478.
Figure 1. Working model of de-escalation to inform development of robot-assisted de-escalation practices.
Table 1. Agitation ranking from low to high. From MOAS [23].
Rating    Descriptor
0         No verbal aggression
1         Shouts angrily, curses mildly, or makes personal insults
2         Curses viciously, is severely insulting, has temper outbursts
3         Impulsively threatens violence toward others or self
4         Threatens violence toward others or self, repeatedly or deliberately
