# A Complete Analysis on the Risk of Using Quantal Response: When Attacker Maliciously Changes Behavior under Uncertainty


## Abstract


## 1. Introduction

#### Outline of the Article

## 2. Related Work

## 3. Background

**Stackelberg Security Games ($\mathtt{SSG}$s).** There is a set of $\mathbf{T}=\{1,2,\cdots ,T\}$ targets that a defender has to protect using $L<T$ security resources. A pure strategy of the defender is an allocation of these $L$ resources over the $T$ targets. A mixed strategy of the defender is a probability distribution over all pure strategies. In this work, we consider the no-scheduling-constraint game setting, in which each defender mixed strategy can be compactly represented as a coverage vector $\mathbf{x}=\{{x}_{1},{x}_{2},\cdots ,{x}_{T}\}$, where ${x}_{t}\in [0,1]$ is the probability that the defender protects target $t$ and ${\sum}_{t}{x}_{t}\le L$ [28]. We denote by $\mathbf{X}$ the set of all defense strategies. In $\mathtt{SSG}$s, the defender plays first by committing to a mixed strategy, and the attacker responds to this strategy by choosing a single target to attack.
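To make this representation concrete, here is a minimal Python sketch of a coverage vector and the standard per-target expected utilities in an SSG; the payoff arrays are taken from the paper's three-target example game, and all function names are ours:

```python
import numpy as np

# Three-target example game (payoffs from the paper's running example).
R_d = np.array([2.0, 3.0, 1.0])    # defender rewards
P_d = np.array([-1.0, -2.0, 0.0])  # defender penalties
R_a = np.array([2.0, 1.0, 3.0])    # attacker rewards
P_a = np.array([-3.0, -2.0, -3.0]) # attacker penalties

def is_valid_coverage(x, L):
    """Check x is a compact defender mixed strategy: x_t in [0, 1], sum(x) <= L."""
    x = np.asarray(x, dtype=float)
    return bool(np.all((0 <= x) & (x <= 1)) and x.sum() <= L + 1e-9)

def attacker_utilities(x):
    """Attacker expected utility for attacking each target t: covered -> penalty."""
    return x * P_a + (1 - x) * R_a

def defender_utilities(x):
    """Defender expected utility at each target t if t is attacked."""
    return x * R_d + (1 - x) * P_d
```

For instance, the coverage vector `[0, 1, 0]` (one resource fully on target 2) is feasible with $L=1$ and yields attacker expected utilities of 2, −2, and 3 at the three targets.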

**Quantal Response Model ($\mathtt{QR}$).** $\mathtt{QR}$ is a well-known behavioral model used to predict boundedly rational (attacker) decision making in security games [2,6,7]. Essentially, $\mathtt{QR}$ predicts the probability ${q}_{t}(\mathbf{x},\lambda)$ that the attacker attacks each target $t$ using the softmax function:

$$q_t(\mathbf{x},\lambda )=\frac{e^{\lambda {U}_{t}^{a}(\mathbf{x})}}{\sum_{{t}^{\prime}}e^{\lambda {U}_{{t}^{\prime}}^{a}(\mathbf{x})}}$$

where ${U}_{t}^{a}(\mathbf{x})$ is the attacker's expected utility for attacking target $t$ under coverage $\mathbf{x}$, and $\lambda \ge 0$ governs the attacker's rationality: $\lambda =0$ means the attacker attacks uniformly at random, while $\lambda \to \infty$ recovers a perfectly rational attacker.
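The QR prediction can be computed directly as a numerically stable softmax over the attacker's expected utilities; this is a minimal sketch and the function name is ours:

```python
import numpy as np

def qr_attack_probs(att_utils, lam):
    """Quantal response: softmax over attacker expected utilities, scaled by lambda.
    lam = 0 -> uniformly random attacker; large lam -> (nearly) perfectly rational."""
    z = lam * np.asarray(att_utils, dtype=float)
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()
```

With `lam=0` every target is attacked with equal probability; as `lam` grows, the distribution concentrates on the target with the highest attacker expected utility.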

## 4. Attacker Behavior Deception under Unknown Learning Outcome

Given uncertainty about learning outcomes of the defender, can the attacker still benefit from playing deceptively?

#### 4.1. A Polynomial-Time Deception Algorithm

**Theorem 1.**

#### 4.1.1. Decomposability of Deception Space

**Lemma 1.**

**Lemma 2.**

**Lemma 3.**

Algorithm 1: Imitative behavior deception—Decomposition of QR parameter domain into sub-intervals

**Proposition 1.**

**Proof of Lemma 2.**

**Proof of Lemma 3.**

**Proof of Proposition 1.**

- Case 1: ${k}_{j}^{max}<K$ and $l{b}_{{k}_{j}^{max}+1}\le u{b}_{{k}_{j}^{min}}$. This is when (i) the ${j}^{th}$ deception interval does not cover the maximum possible learning outcome ${\lambda}_{K}^{\mathrm{learnt}}$; and (ii) the smallest deception value w.r.t. the learning outcome ${\lambda}_{{k}_{j}^{max}+1}^{\mathrm{learnt}}$ is less than the largest deception value w.r.t. the learning outcome ${\lambda}_{{k}_{j}^{min}}^{\mathrm{learnt}}$. Intuitively, (ii) implies that the upper bound of the ${j}^{th}$ deception interval is strictly less than $l{b}_{{k}_{j}^{max}+1}$. Otherwise, this deception upper bound would correspond to an uncertainty set which covers the learning outcome ${\lambda}_{{k}_{j}^{max}+1}^{\mathrm{learnt}}$, which contradicts the fact that ${\lambda}_{{k}_{j}^{max}}^{\mathrm{learnt}}$ (which is strictly less than ${\lambda}_{{k}_{j}^{max}+1}^{\mathrm{learnt}}$) is the maximum learning outcome for the ${j}^{th}$ deception interval.

- Case 2: ${k}_{j}^{max}=K$ or $l{b}_{{k}_{j}^{max}+1}>u{b}_{{k}_{j}^{min}}$. Note that when $l{b}_{{k}_{j}^{max}+1}>u{b}_{{k}_{j}^{min}}$, the upper bound of the ${j}^{th}$ deception interval must be at most $u{b}_{{k}_{j}^{min}}$. This is to ensure that this upper bound covers the learning outcome ${\lambda}_{{k}_{j}^{min}}^{\mathrm{learnt}}$.

#### 4.1.2. Divide and Conquer: Dividing ${\mathbf{P}}_{\mathrm{discrete}}^{\mathrm{dec}}$ into $O\left(K\right)$ Polynomial Sub-Problems

**Lemma 4.**

**Lemma 5.**

#### 4.2. Solution Quality Analysis

**Theorem 2.**

**Proof.**

#### 4.3. Heuristic to Improve Discretization

## 5. Defender Counter-Deception

**Main Result.**To date, we have not explicitly defined the objective function, ${U}^{d}({\mathcal{H}}^{\mathbf{I}},{\lambda}^{\mathrm{dec}}\left({\mathcal{H}}^{\mathbf{I}}\right))$, except that we know this utility depends on the defense function ${\mathcal{H}}^{\mathbf{I}}$ and the attacker’s deception response ${\lambda}^{\mathrm{dec}}\left({\mathcal{H}}^{\mathbf{I}}\right)$. Now, since ${\mathcal{H}}^{\mathbf{I}}$ maps each possible learning outcome ${\lambda}^{\mathrm{learnt}}$ to a defense strategy, we know that if ${\lambda}^{\mathrm{learnt}}\in {I}_{n}^{d}$, then ${U}^{d}({\mathcal{H}}^{\mathbf{I}},{\lambda}^{\mathrm{dec}}\left({\mathcal{H}}^{\mathbf{I}}\right))={U}^{d}({\mathbf{x}}_{n},{\lambda}^{\mathrm{dec}}\left({\mathcal{H}}^{\mathbf{I}}\right))$, which can be computed using Equation (3). However, due to the deviation of ${\lambda}^{\mathrm{learnt}}$ from the attacker’s deception choice, ${\lambda}^{\mathrm{dec}}\left({\mathcal{H}}^{\mathbf{I}}\right)$, different possible learning outcomes ${\lambda}^{\mathrm{learnt}}$ within $[{\lambda}^{\mathrm{dec}}\left({\mathcal{H}}^{\mathbf{I}}\right)-\delta ,{\lambda}^{\mathrm{dec}}\left({\mathcal{H}}^{\mathbf{I}}\right)+\delta ]$ may belong to different intervals ${I}_{n}^{d}$ (which correspond to different strategies ${\mathbf{x}}_{n}$), leading to different utility outcomes for the defender. One may argue that to cope with this deception-learning uncertainty, we can apply the maximin approach to determine the defender’s worst-case utility if the defender only has the common knowledge that ${\lambda}^{\mathrm{learnt}}\in [{\lambda}^{\mathrm{dec}}\left({\mathcal{H}}^{\mathbf{I}}\right)-\delta ,{\lambda}^{\mathrm{dec}}\left({\mathcal{H}}^{\mathbf{I}}\right)+\delta ]$. 
Furthermore, depending on any additional (private) knowledge the defender has regarding the relation between the attacker’s deception and the actual learning outcome, one might hope to incorporate such knowledge into our model and algorithm to obtain an even better utility outcome for the defender. Interestingly, we show that there is, in fact, a universal optimal defense function for the defender, ${\mathcal{H}}_{*}$, regardless of any additional knowledge that he may have. That is, the defender obtains the highest utility by following this defense function, and additional knowledge beyond the common knowledge cannot make the defender do better. Our main result is formally stated in Theorem 3.

**Theorem 3.**

- If ${\lambda}^{*}={\lambda}^{max}$, choose the interval set ${\mathbf{I}}_{*}=\left\{{I}_{1}^{d}\right\}$ with ${I}_{1}^{d}=[0,{\lambda}^{max}+\delta ]$ covering the entire learning space, and function ${\mathcal{H}}_{*}^{{\mathbf{I}}_{*}}\left({I}_{1}^{d}\right)={\mathbf{x}}_{1}$ where ${\mathbf{x}}_{1}={\mathbf{x}}^{*}$.
- If ${\lambda}^{*}<{\lambda}^{max}$, choose the interval set ${\mathbf{I}}_{*}=\{{I}_{1}^{d},{I}_{2}^{d}\}$ with ${I}_{1}^{d}=[0,{\lambda}^{*}+\delta ]$ and ${I}_{2}^{d}=({\lambda}^{*}+\delta ,{\lambda}^{max}+\delta ]$. In addition, choose the defender strategies ${\mathbf{x}}_{1}={\mathbf{x}}^{*}$ and ${\mathbf{x}}_{2}\in \arg\min_{\mathbf{x}\in \mathbf{X}}{U}^{a}(\mathbf{x},{\lambda}^{max})$ correspondingly.
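The two cases above can be sketched as a small lookup function; this is a minimal illustration (names are ours, and the strategies `x_star` and `x2` are assumed to be precomputed elsewhere):

```python
def counter_deception_strategy(lam_learnt, lam_star, lam_max, delta, x_star, x2):
    """Universal defense function H_*: a single interval when lam_star == lam_max,
    otherwise two intervals split at lam_star + delta."""
    if lam_star == lam_max:
        return x_star                    # I1 = [0, lam_max + delta] covers everything
    if 0 <= lam_learnt <= lam_star + delta:
        return x_star                    # I1 = [0, lam_star + delta]
    return x2                            # I2 = (lam_star + delta, lam_max + delta]
```

For example, with the hypothetical values $\lambda^{*}=0$, $\lambda^{max}=3$, and $\delta=0.25$, a learnt value of 0.1 maps to $\mathbf{x}_{1}=\mathbf{x}^{*}$ while 0.5 maps to $\mathbf{x}_{2}$.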

**Example 1.**

- If the defender learns ${\lambda}^{\mathit{learnt}}\in [0,0.25]$, the defender will play a strategy ${\mathbf{x}}_{1}={\mathbf{x}}^{*}=[0,1,0]$.
- Otherwise, if the defender learns ${\lambda}^{\mathit{learnt}}\in (0.25,3.25]$, the defender then plays ${\mathbf{x}}_{2}=[0.34,0.20,0.46]\in {argmin}_{\mathbf{x}\in \mathbf{X}}{U}^{a}(\mathbf{x},{\lambda}^{max})$.

- If the attacker chooses ${\lambda}^{\mathit{dec}}={\lambda}^{*}=0$, the corresponding learning outcome for the defender can be any value within the range $[0,0.25]$. According to the defense function, the defender will always play the strategy ${\mathbf{x}}_{1}=[0,1,0]$. As a result, the attacker’s expected utility is $\frac{1}{3}\times 2+\frac{1}{3}\times (-2)+\frac{1}{3}\times 3=1.0$.
- Now, if the attacker chooses ${\lambda}^{\mathit{dec}}>{\lambda}^{*}=0$, the corresponding learning outcome for the defender may fall into either $[0,0.25]$ or $(0.25,3.25]$. In particular, if the learning outcome ${\lambda}^{\mathit{learnt}}\in (0.25,3.25]$, the defender plays ${\mathbf{x}}_{2}=[0.34,0.20,0.46]\in \arg\min_{\mathbf{x}\in \mathbf{X}}{U}^{a}(\mathbf{x},{\lambda}^{max})$. In this case, the resulting attacker utility is ${U}^{a}({\mathbf{x}}_{2},{\lambda}^{\mathrm{dec}})\le {U}^{a}({\mathbf{x}}_{2},{\lambda}^{max})=0.33$ (the inequality holds because the attacker utility is an increasing function of ${\lambda}^{\mathit{dec}}$). As a result, the worst-case utility of the attacker is no more than $0.33$, which is strictly lower than the utility of $1.0$ obtained when the attacker mimics ${\lambda}^{\mathit{dec}}={\lambda}^{*}=0$.
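The arithmetic in the first bullet can be checked with a short self-contained sketch, using the attacker payoffs of the example game and a QR response with ${\lambda}^{\mathit{dec}}={\lambda}^{*}=0$, which makes the attack distribution uniform (function names are ours):

```python
import numpy as np

R_a = np.array([2.0, 1.0, 3.0])    # attacker rewards (example game)
P_a = np.array([-3.0, -2.0, -3.0]) # attacker penalties

def att_utils(x):
    # Attacker expected utility per target under coverage x.
    return x * P_a + (1 - x) * R_a

def qr_probs(u, lam):
    # QR attack distribution: softmax of lam * u (stable form).
    z = lam * u
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

x1 = np.array([0.0, 1.0, 0.0])     # defender always plays x1 = [0, 1, 0]
u = att_utils(x1)                  # per-target utilities: [2, -2, 3]
p = qr_probs(u, 0.0)               # lam_dec = 0 -> uniform attack: [1/3, 1/3, 1/3]
expected = float(p @ u)            # (2 - 2 + 3) / 3 = 1.0
```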

**Corollary 1.**

#### 5.1. Analyzing Attacker Deception Adaptation

#### 5.1.1. Decomposability of Attacker Deception Space

Algorithm 2: Counter-deception—Decomposition of QR parameter into sub-intervals

**Lemma 6.**

**Lemma 7.**

#### 5.1.2. Characteristics of Attacker Optimal Deception

**Lemma 8.**

**Lemma 9.**

**Proof.**

**Lemma 10.**

**Proof.**

**Remark 1.**

- Any $j$ such that there is a ${j}^{\prime}>j$ with ${n}_{{j}^{\prime}}^{max}\le {n}_{j}^{max}$
- Any $j<M$ such that ${n}_{j}^{max}=N$

#### 5.2. Finding Optimal Defense Function ${\mathcal{H}}^{\mathbf{I}}$ Given Fixed $\mathbf{I}$: Divide-and-Conquer

**Proposition 2.**

- For all $n>{n}_{{j}^{\mathrm{fea}}}^{max}$, choose ${\mathbf{x}}_{n}={\mathbf{x}}_{>}^{*}$ where ${\mathbf{x}}_{>}^{*}$ is an optimal solution of the following optimization problem:$$\min_{\mathbf{x}\in \mathbf{X}}{U}^{a}(\mathbf{x},{\lambda}^{max})$$
- For all $n\le {n}_{{j}^{\mathrm{fea}}}^{max}$, choose ${\mathbf{x}}_{n}={\mathbf{x}}_{<}^{*}$ where ${\mathbf{x}}_{<}^{*}$ is the optimal solution of the following optimization problem:$${U}_{*}^{d}=\max_{\mathbf{x}\in \mathbf{X}}\;{U}^{d}(\mathbf{x},{\overline{\lambda}}_{{j}^{\mathrm{fea}}}^{\mathrm{dec}})\quad \mathrm{s.t.}\quad {U}^{a}(\mathbf{x},{\overline{\lambda}}_{{j}^{\mathrm{fea}}}^{\mathrm{dec}})\ge {U}^{a}({\mathbf{x}}_{>}^{*},{\lambda}^{max})$$
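As an illustration only, the first sub-problem, $\min_{\mathbf{x}\in\mathbf{X}}{U}^{a}(\mathbf{x},{\lambda}^{max})$, can be approximated by a crude random search over feasible coverage vectors. This is a sketch under the example game's payoffs and a hypothetical ${\lambda}^{max}=3$; the paper's actual approach solves these sub-problems exactly, and all names here are ours:

```python
import numpy as np

rng = np.random.default_rng(0)
R_a = np.array([2.0, 1.0, 3.0])    # attacker rewards (example game)
P_a = np.array([-3.0, -2.0, -3.0]) # attacker penalties
L = 1.0                            # defender resource budget (assumed)

def attacker_qr_utility(x, lam):
    """Attacker's QR expected utility U^a(x, lam)."""
    u = x * P_a + (1 - x) * R_a    # per-target attacker utility
    z = lam * u
    z = z - z.max()
    p = np.exp(z) / np.exp(z).sum()  # QR attack distribution
    return float(p @ u)

def random_search_min(lam, n_samples=20000):
    """Crude random search for x in X approximately minimizing U^a(x, lam)."""
    best_x, best_v = None, np.inf
    for _ in range(n_samples):
        x = rng.random(3)
        if x.sum() > L:
            x = L * x / x.sum()    # project onto the resource budget
        v = attacker_qr_utility(x, lam)
        if v < best_v:
            best_x, best_v = x, v
    return best_x, best_v
```

Any found minimizer pushes the attacker's utility well below what the attacker would get against an undefended game, which is exactly what ${\mathbf{x}}_{>}^{*}$ is for.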

**Proof.**

**Proposition 3.**

**Proof.**

#### 5.3. Completing the Proof of Theorem 3

## 6. Experimental Evaluation

**Algorithms.** We compare three cases: (i) $\mathtt{Non}-\mathtt{Dec}$: the attacker is non-deceptive and the defender assumes so; as a result, both play Strong Stackelberg equilibrium strategies; (ii) $\mathtt{Dec}-\delta $: the attacker is deceptive, while the defender does not handle the attacker’s deception (Section 4); we examine different uncertainty ranges by varying the value of $\delta $; and (iii) $\mathtt{Dec}-\mathtt{Counter}$: the attacker is deceptive while the defender tackles the attacker’s deception (Section 5).

#### Additional Experiment Results

## 7. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## Notes

1. We use a uniform discretization for the sake of solution quality analysis (as we will describe later). Our approach can be generalized to any non-uniform discretization.
2. Lemma 7 is stated for the general case $n>1$, in which the defender’s interval ${I}_{n}^{d}$ is left-open. When $n=1$, where the left bound is included, we have $l{b}_{n}\le {\lambda}^{\mathrm{dec}}\le u{b}_{n+1}$.
3. There is a degenerate case in which ${U}^{a}(\mathbf{x},\lambda )$ is constant for all $\lambda $: when the defense strategy $\mathbf{x}$ leads to an identical expected utility for the attacker across all targets. To avoid this case, we can add a small noise to such a defense strategy $\mathbf{x}$ so that the attacker’s expected utilities vary across the targets, while ensuring that this noise only leads to a small change in the defender’s utility.

## References

- Tambe, M. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned; Cambridge University Press: Cambridge, UK, 2011.
- Yang, R.; Kiekintveld, C.; Ordonez, F.; Tambe, M.; John, R. Improving resource allocation strategy against human adversaries in security games. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, Barcelona, Spain, 16–22 July 2011.
- Nguyen, T.H.; Yang, R.; Azaria, A.; Kraus, S.; Tambe, M. Analyzing the effectiveness of adversary modeling in security games. In Proceedings of the AAAI Conference on Artificial Intelligence, Bellevue, WA, USA, 14–18 July 2013.
- Nguyen, T.H.; Vu, N.; Yadav, A.; Nguyen, U. Decoding the Imitation Security Game: Handling Attacker Imitative Behavior Deception. In Proceedings of the 24th European Conference on Artificial Intelligence, Santiago de Compostela, Spain, 29 August–8 September 2020.
- Gholami, S.; Yadav, A.; Tran-Thanh, L.; Dilkina, B.; Tambe, M. Do not Put All Your Strategies in One Basket: Playing Green Security Games with Imperfect Prior Knowledge. In Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems, Montreal, QC, Canada, 13–17 May 2019; pp. 395–403.
- McFadden, D. Conditional Logit Analysis of Qualitative Choice Behavior. In Frontiers in Econometrics; Zarembka, P., Ed.; Academic Press: New York, NY, USA, 1973.
- McKelvey, R.D.; Palfrey, T.R. Quantal response equilibria for normal form games. Games Econ. Behav. **1995**, 10, 6–38.
- Kar, D.; Nguyen, T.H.; Fang, F.; Brown, M.; Sinha, A.; Tambe, M.; Jiang, A.X. Trends and applications in Stackelberg security games. Handb. Dyn. Game Theory **2017**.
- An, B.; Shieh, E.; Yang, R.; Tambe, M.; Baldwin, C.; DiRenzo, J.; Maule, B.; Meyer, G. A Deployed Quantal Response Based Patrol Planning System for the US Coast Guard. Interfaces **2013**, 43, 400–420.
- Carroll, T.E.; Grosu, D. A game theoretic investigation of deception in network security. Secur. Commun. Netw. **2011**, 4, 1162–1172.
- Fraunholz, D.; Anton, S.D.; Lipps, C.; Reti, D.; Krohmer, D.; Pohl, F.; Tammen, M.; Schotten, H.D. Demystifying Deception Technology: A Survey. arXiv **2018**, arXiv:1804.06196.
- Horák, K.; Zhu, Q.; Bošanský, B. Manipulating adversary’s belief: A dynamic game approach to deception by design for proactive network security. In Proceedings of the International Conference on Decision and Game Theory for Security, Vienna, Austria, 23–25 October 2017; Springer: Cham, Switzerland, 2017; pp. 273–294.
- Zhuang, J.; Bier, V.M.; Alagoz, O. Modeling secrecy and deception in a multiple-period attacker–defender signaling game. Eur. J. Oper. Res. **2010**, 203, 409–418.
- Han, X.; Kheir, N.; Balzarotti, D. Deception techniques in computer security: A research perspective. ACM Comput. Surv. (CSUR) **2018**, 51, 1–36.
- Fugate, S.; Ferguson-Walter, K. Artificial Intelligence and Game Theory Models for Defending Critical Networks with Cyber Deception. AI Mag. **2019**, 40, 49–62.
- Guo, Q.; An, B.; Bosansky, B.; Kiekintveld, C. Comparing strategic secrecy and Stackelberg commitment in security games. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17), Melbourne, Australia, 19–25 August 2017.
- Rabinovich, Z.; Jiang, A.X.; Jain, M.; Xu, H. Information disclosure as a means to security. In Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, Istanbul, Turkey, 4–8 May 2015; pp. 645–653.
- Sinha, A.; Fang, F.; An, B.; Kiekintveld, C.; Tambe, M. Stackelberg Security Games: Looking Beyond a Decade of Success. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18), Stockholm, Sweden, 13–19 July 2018; pp. 5494–5501.
- Xu, H.; Rabinovich, Z.; Dughmi, S.; Tambe, M. Exploring Information Asymmetry in Two-Stage Security Games. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015; pp. 1057–1063.
- Gan, J.; Xu, H.; Guo, Q.; Tran-Thanh, L.; Rabinovich, Z.; Wooldridge, M. Imitative Follower Deception in Stackelberg Games. arXiv **2019**, arXiv:1903.02917.
- Nguyen, T.H.; Wang, Y.; Sinha, A.; Wellman, M.P. Deception in finitely repeated security games. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019.
- Estornell, A.; Das, S.; Vorobeychik, Y. Deception Through Half-Truths. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020.
- Nguyen, T.H.; Sinha, A.; He, H. Partial Adversarial Behavior Deception in Security Games. In Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI), Virtual Conference, 7–15 January 2021.
- Biggio, B.; Nelson, B.; Laskov, P. Poisoning attacks against support vector machines. arXiv **2012**, arXiv:1206.6389.
- Huang, L.; Joseph, A.D.; Nelson, B.; Rubinstein, B.I.; Tygar, J.D. Adversarial machine learning. In Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence, Chicago, IL, USA, 21 October 2011; ACM: New York, NY, USA, 2011; pp. 43–58.
- Steinhardt, J.; Koh, P.W.W.; Liang, P.S. Certified defenses for data poisoning attacks. In Proceedings of the Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 3517–3529.
- Tong, L.; Yu, S.; Alfeld, S. Adversarial Regression with Multiple Learners. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 4946–4954.
- Kiekintveld, C.; Jain, M.; Tsai, J.; Pita, J.; Ordóñez, F.; Tambe, M. Computing optimal randomized resource allocations for massive security games. In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems, Budapest, Hungary, 10–15 May 2009; pp. 689–696.

**Figure 1.** An example of discretizing ${\lambda}^{\mathrm{learnt}}$, ${\mathsf{\Lambda}}^{\mathrm{learnt}}=\{0,0.9,1.7,2.3\}$, and the six resulting attacker sub-intervals and corresponding uncertainty sets, with ${\lambda}^{max}=2$ and $\delta =0.5$. In particular, the first sub-interval of deceptive ${\lambda}^{\mathrm{dec}}$ is $in{t}_{1}=[0,0.4)$, in which any ${\lambda}^{\mathrm{dec}}$ corresponds to the same uncertainty set of possible learning outcomes ${\mathsf{\Lambda}}_{1}^{\mathrm{learnt}}=\left\{0\right\}$.
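The decomposition in Figure 1 can be reproduced with a short sketch (function names are ours): the uncertainty set changes only where ${\lambda}^{\mathrm{dec}}$ crosses a breakpoint $g\pm \delta $ for some grid value $g$, so sweeping between consecutive breakpoints yields the sub-intervals.

```python
def decompose(grid, delta, lam_max):
    """Split [0, lam_max] into sub-intervals of lam_dec over which the
    uncertainty set {g in grid : |g - lam_dec| <= delta} stays constant."""
    # The set changes exactly when lam_dec crosses g - delta or g + delta.
    pts = sorted({b for g in grid for b in (g - delta, g + delta) if 0 < b < lam_max})
    bounds = [0.0] + pts + [lam_max]
    out = []
    for lo, hi in zip(bounds, bounds[1:]):
        mid = (lo + hi) / 2            # any interior point represents the interval
        uset = [g for g in grid if abs(g - mid) <= delta]
        out.append(((lo, hi), uset))
    return out

intervals = decompose([0.0, 0.9, 1.7, 2.3], delta=0.5, lam_max=2.0)
```

On Figure 1's data this yields six sub-intervals, the first being $[0,0.4)$ with uncertainty set $\{0\}$, matching the caption.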

**Figure 2.** An example of a defense function with corresponding sub-intervals and uncertainty sets of the attacker, where ${\lambda}^{max}=2.0$ and $\delta =0.4$. The defense function is determined as: ${I}_{1}^{d}=[0,1.4]$, ${I}_{2}^{d}=(1.4,2.4]$ with corresponding defense strategies $\{{\mathbf{x}}_{1},{\mathbf{x}}_{2}\}$. Then the deception range of the attacker can be divided into three sub-intervals: $in{t}_{1}^{\mathrm{dec}}=[0,1],in{t}_{2}^{\mathrm{dec}}=(1,1.8],in{t}_{3}^{\mathrm{dec}}=(1.8,2]$ with corresponding uncertainty sets ${\mathbf{X}}_{1}^{\mathrm{def}}=\left\{{\mathbf{x}}_{1}\right\},{\mathbf{X}}_{2}^{\mathrm{def}}=\{{\mathbf{x}}_{1},{\mathbf{x}}_{2}\},{\mathbf{X}}_{3}^{\mathrm{def}}=\left\{{\mathbf{x}}_{2}\right\}$. For example, if the attacker plays any ${\lambda}^{\mathrm{dec}}\in in{t}_{2}^{\mathrm{dec}}$, it will lead the defender to play either ${\mathbf{x}}_{1}$ or ${\mathbf{x}}_{2}$, depending on the actual learning outcome of the defender.

| | Target 1 | Target 2 | Target 3 |
|---|---|---|---|
| Def. Reward | 2 | 3 | 1 |
| Def. Penalty | −1 | −2 | 0 |
| Att. Reward | 2 | 1 | 3 |
| Att. Penalty | −3 | −2 | −3 |


© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Nguyen, T.H.; Yadav, A.
A Complete Analysis on the Risk of Using Quantal Response: When Attacker Maliciously Changes Behavior under Uncertainty. *Games* **2022**, *13*, 81.
https://doi.org/10.3390/g13060081
