# The Effects of Personalized Nudges on Cognitively Disengaged Student Behavior in Low-Stakes Assessments


## Abstract


## 1. Introduction

#### The Present Study

The ACT-R cognitive architecture (ACT-R®) has been used to predict and explain learner behavior (for a list of publications, see http://act-r.psy.cmu.edu/publication/, accessed on 30 June 2023). ACT-R has default parameters for the timing of perception, action, and cognitive processes, based on empirical research. For example, it takes 50 ms to encode a visual chunk of information, 200 ms to create a working memory chunk, and 250 ms to prepare a motor movement after deciding to press a button.
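These default timings can be combined into a lower bound on how quickly a genuinely engaged response could plausibly be produced. The sketch below is illustrative only, not the authors' threshold algorithm: the function name and the chunk counts are assumptions, and only the three timing constants come from the text.

```python
# Illustrative sketch (not the authors' code): combining ACT-R-style default
# timing parameters into a lower bound on the response time of a minimally
# engaged answer to an item. Only the three constants come from the text;
# the chunk counts per item are hypothetical.
VISUAL_ENCODE_MS = 50   # encode one visual chunk of information
CREATE_CHUNK_MS = 200   # create one working-memory chunk
MOTOR_PREP_MS = 250     # prepare the button press after deciding

def min_engaged_time_ms(n_visual_chunks: int, n_memory_chunks: int) -> int:
    """Lower-bound response time (ms) for an item that requires encoding
    `n_visual_chunks` pieces of information and forming `n_memory_chunks`
    working-memory chunks before answering."""
    return (n_visual_chunks * VISUAL_ENCODE_MS
            + n_memory_chunks * CREATE_CHUNK_MS
            + MOTOR_PREP_MS)

# e.g., a hypothetical item with 6 visual chunks and 2 working-memory chunks:
threshold = min_engaged_time_ms(6, 2)  # 6*50 + 2*200 + 250 = 950 ms
```

Responses faster than such a bound cannot have involved the full encode-think-respond cycle, which is the intuition behind a theory-driven (rather than normative, data-driven) disengagement threshold.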

## 2. Methods

#### 2.1. Participants

#### 2.2. Materials

#### 2.3. Design and Procedure

#### Experimental Conditions

#### 2.4. Analyses

#### 2.4.1. Data Preprocessing and Analysis Approach

For model selection, we computed the difference in Akaike information criterion values between successive models (ΔAIC = AIC_{previous model} − AIC_{next model}) and selected the next model if ΔAIC was greater than the rule-of-thumb value of 2 (Burnham and Anderson 2002). We ran the models with the bobyqa optimizer. If a model did not converge, we tried different optimizers (i.e., Nelder_Mead, nloptwrap). If the model still did not converge, we selected a simpler model.
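The stepwise AIC rule described above can be sketched in a few lines. This is an illustration of the selection logic only (the authors fit their models in R with lme4), and the function name is an assumption.

```python
# Illustrative sketch of the stepwise model-selection rule: starting from the
# simplest model, step to the next, more complex model only when it improves
# AIC by more than the rule-of-thumb value of 2 (Burnham and Anderson 2002).
AIC_THRESHOLD = 2.0

def select_model(aics: list[float]) -> int:
    """Given AIC values ordered from simplest to most complex model,
    return the index of the selected model."""
    selected = 0
    for nxt in range(1, len(aics)):
        delta = aics[selected] - aics[nxt]  # AIC_previous - AIC_next
        if delta > AIC_THRESHOLD:
            selected = nxt  # complexity is justified by the AIC improvement
        else:
            break  # stop at the first step that does not pay off
    return selected

# e.g., AICs of four increasingly complex candidate models:
best = select_model([1050.0, 1043.5, 1042.9, 1030.0])  # stops at index 1
```

Note that the rule stops at the first non-improving step rather than scanning all candidates, mirroring a forward stepwise procedure.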

#### 2.4.2. Response Disengagement

#### 2.4.3. Performance

#### 2.4.4. Student Behavior after Receiving a Nudge

#### 2.4.5. Metacognitive Measures

## 3. Results

#### 3.1. Response Disengagement

Engagement was descriptively higher in the nudge condition (M_{Block 1} = 0.945, SD = 0.104; M_{Block 2} = 0.888, SD = 0.180) compared to the control condition (M_{Block 1} = 0.931, SD = 0.136; M_{Block 2} = 0.857, SD = 0.196) and the effort instruction condition (M_{Block 1} = 0.919, SD = 0.155; M_{Block 2} = 0.854, SD = 0.213).

#### 3.2. Performance

Performance was descriptively higher in the nudge condition (M_{Block 1} = 10.297, SD = 4.550, Mdn_{Block 1} = 9.87; M_{Block 2} = 10.296, SD = 4.535, Mdn_{Block 2} = 9.80) compared to the control condition (M_{Block 1} = 10.165, SD = 4.358, Mdn_{Block 1} = 9.11; M_{Block 2} = 10.104, SD = 4.283, Mdn_{Block 2} = 9.23). The same pattern did not hold for the effort instruction condition (M_{Block 1} = 9.576, SD = 4.161, Mdn_{Block 1} = 9.33; M_{Block 2} = 9.756, SD = 4.094, Mdn_{Block 2} = 9.37).

#### 3.3. Student Behavior after Receiving a Nudge

(N_{observations} = 126; M = 1.77; SD = 1.77; Max = 14; Min = 1), and 114 students (80%) selected “I will go back to the item” at least once (N_{observations} = 221; M = 1.94; SD = 1.61; Max = 9; Min = 1), regardless of their selection about their effort. Figure 2 presents a Sankey diagram showing the number and percentage of students’ selections among overall observations (N = 512). Among the students who selected “I will go back to the item” (N_{observations} = 221), 84 students (74%) interacted with the response field (i.e., they changed their answer, or they kept their previous answer after initially changing it) in at least one item for which they received a nudge (N_{observations} = 115; M = 1.37; SD = 0.71; Max = 5; Min = 1), and 26 students (23%) left the item without any interaction at least once (N_{observations} = 44; M = 1.69; SD = 1.41; Max = 7; Min = 1). Note that, due to technical problems in the data, we do not have information about students’ behavior for 62 observations (28%).
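The per-student summary statistics reported here (N_{observations}, M, SD, Max, Min) can be derived from a log of nudge-response events. The sketch below uses synthetic data, not the study's records, and the field names and function name are assumptions.

```python
# Illustrative sketch (synthetic data): deriving per-student summary
# statistics for a given nudge selection from (student_id, selection) events.
from collections import Counter
from statistics import mean, stdev

def summarize(events, selection):
    """Count, per student, how often `selection` was made, then summarize
    across the students who made it at least once."""
    counts = Counter(sid for sid, sel in events if sel == selection)
    per_student = list(counts.values())
    return {
        "n_students": len(per_student),
        "n_observations": sum(per_student),
        "mean": mean(per_student),
        "sd": stdev(per_student) if len(per_student) > 1 else 0.0,
        "max": max(per_student),
        "min": min(per_student),
    }

# e.g., three students, one of whom chose "go back" twice:
log = [("s1", "go back"), ("s2", "go back"), ("s1", "go back"),
       ("s3", "best effort")]
stats = summarize(log, "go back")  # n_students=2, n_observations=3
```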

#### 3.4. Metacognitive Measures

## 4. Discussion

#### 4.1. Theoretical Implications

#### 4.2. Practical Implications

#### 4.3. Limitations and Future Directions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Appendix A

**Figure A1.** A comparison of item time thresholds between the theory-driven method that we used in this study and the normative threshold method (NM) with different percentages (10%, 20%, and 30%).

## Appendix B

**Figure A3.** A screenshot of the follow-up nudge screen, which was presented after students clicked one of the buttons in Figure A2.

## Appendix C

**Table A1.** Estimates, standard errors (SE), z-values, and p-values of the final generalized mixed-effects model (binomial) for item response disengagement and performance (i.e., item score).

|  | Item Response Disengagement | | | | Item Score | | | |
|---|---|---|---|---|---|---|---|---|
| Predictors | Estimate | SE | z-Value | p-Value | Estimate | SE | z-Value | p-Value |
| (Intercept) | −5.097 | 0.403 | −12.664 | <.001 | −0.150 | 0.298 | −0.504 | .614 |
| Condition [Control] | 0.364 | 0.158 | 2.305 | .021 | 0.035 | 0.065 | 0.547 | .584 |
| Condition [Instruction] | 0.470 | 0.158 | 2.981 | .003 | −0.058 | 0.065 | −0.900 | .368 |
| Block [2] | 1.056 | 0.061 | 17.463 | <.001 | −0.133 | 0.036 | −3.703 | <.001 |
| Current Math Grade [B] | 0.963 | 0.188 | 5.120 | <.001 | −0.515 | 0.073 | −7.038 | <.001 |
| Current Math Grade [C] | 1.417 | 0.195 | 7.257 | <.001 | −0.956 | 0.078 | −12.239 | <.001 |
| Current Math Grade [D] | 1.701 | 0.217 | 7.856 | <.001 | −1.128 | 0.090 | −12.590 | <.001 |
| Current Math Grade [F] | 2.090 | 0.220 | 9.523 | <.001 | −1.262 | 0.093 | −13.553 | <.001 |

## Appendix D

**Table A2.** Estimates, standard errors (SE), z-values, and p-values of the final generalized mixed-effects model (binomial) for item response disengagement and item score after excluding students who had RTE = 1 (i.e., RTE < 1).

|  | Item Response Disengagement for RTE < 1 | | | | Item Score for RTE < 1 | | | |
|---|---|---|---|---|---|---|---|---|
| Predictors | Estimate | SE | z-Value | p-Value | Estimate | SE | z-Value | p-Value |
| (Intercept) | −3.531 | 0.291 | −12.157 | <.001 | −0.531 | 0.290 | −1.828 | .068 |
| Condition [Control] | 0.383 | 0.134 | 2.862 | .004 | 0.053 | 0.075 | 0.715 | .475 |
| Condition [Instruction] | 0.590 | 0.135 | 4.368 | <.001 | 0.027 | 0.076 | 0.350 | .726 |
| Block [2] | 1.040 | 0.062 | 16.846 | <.001 | −0.182 | 0.048 | −3.769 | <.001 |
| Current Math Grade [B] | 0.384 | 0.170 | 2.264 | .024 | −0.367 | 0.091 | −4.021 | <.001 |
| Current Math Grade [C] | 0.659 | 0.173 | 3.807 | <.001 | −0.647 | 0.095 | −6.836 | <.001 |
| Current Math Grade [D] | 0.964 | 0.190 | 5.083 | <.001 | −0.951 | 0.107 | −8.865 | <.001 |
| Current Math Grade [F] | 1.109 | 0.187 | 5.942 | <.001 | −1.008 | 0.107 | −9.451 | <.001 |

**Table A3.** Estimates, standard errors (SE), z-values, and p-values of the final generalized mixed-effects model (binomial) for item response disengagement and performance after deserving the first nudge.

|  | Item Response Disengagement after Deserving First Nudge | | | | Item Score after Deserving First Nudge | | | |
|---|---|---|---|---|---|---|---|---|
| Predictors | Estimate | SE | z-Value | p-Value | Estimate | SE | z-Value | p-Value |
| (Intercept) | −6.212 | 0.597 | −10.412 | <.001 | −0.080 | 0.308 | −0.259 | .068 |
| Condition [Control] | 0.756 | 0.262 | 2.883 | .004 | −0.010 | 0.075 | −0.139 | .890 |
| Condition [Instruction] | 0.820 | 0.261 | 3.138 | .002 | −0.146 | 0.075 | −1.946 | .052 |
| Block [2] | 0.629 | 0.093 | 6.785 | <.001 | −0.133 | 0.044 | −3.006 | .003 |
| Current Math Grade [B] | 1.292 | 0.319 | 4.056 | <.001 | −0.504 | 0.084 | −5.988 | <.001 |
| Current Math Grade [C] | 2.063 | 0.329 | 6.264 | <.001 | −1.003 | 0.090 | −11.108 | <.001 |
| Current Math Grade [D] | 2.538 | 0.360 | 7.052 | <.001 | −1.131 | 0.104 | −10.878 | <.001 |
| Current Math Grade [F] | 2.910 | 0.363 | 8.009 | <.001 | −1.270 | 0.108 | −11.771 | <.001 |

## Appendix E

**Figure A4.** Percentages of each option of the Likert scale in each condition for (**a**) perceived effort, (**b**) perceived difficulty, (**c**) expected performance, and (**d**) feeling very nervous.

**Table A4.** Estimates, confidence intervals (CI), and p-values of the final regression models for each self-reported measure.

|  | Perceived Effort | | | Perceived Difficulty | | | Expected Performance | | | Feeling Very Nervous | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Predictors | Estimates | CI (95%) | p-Value | Estimates | CI (95%) | p-Value | Odds Ratios | CI (95%) | p-Value | Estimates | CI (95%) | p-Value |
| (Intercept) | 2.57 | 1.83–3.32 | <.001 | 8.66 | 8.05–9.28 | <.001 | 0.39 | 0.21–0.71 | .002 | 3.56 | 2.52–4.60 | <.001 |
| Condition [Control] | −0.13 | −0.44–0.18 | .403 | 0.14 | −0.36–0.64 | .587 | 0.79 | 0.49–1.26 | .320 | −0.21 | −0.65–0.22 | .339 |
| Condition [Instruction] | −0.09 | −0.43–0.25 | .614 | −0.00 | −0.55–0.54 | .989 | 0.96 | 0.57–1.62 | .884 | −0.08 | −0.56–0.40 | .744 |
| RTE | 1.94 | 1.12–2.76 | <.001 | - | - | - | - | - | - | 0.32 | −0.84–1.48 | .587 |
| Math Score | - | - | - | −3.39 | −4.95–−1.82 | <.001 | 17.03 | 3.75–82.00 | <.001 | - | - | - |
| Observations | 416 | | | 397 | | | 401 | | | 410 | | |

## Notes

1. The final models reported in the manuscript include all student responses to the items. To further investigate the effects of nudges, we also conducted two additional analyses: one on a subset of the data that excluded students who did not receive or deserve any nudge (i.e., RTE = 1), and one on a subset that only included data after students’ first detected not-fully-effortful response (i.e., after deserving the first nudge). Both analyses showed patterns similar to the overall results reported in the manuscript, although the effects of nudges were larger after excluding students who had RTE = 1 (i.e., RTE < 1). For brevity and simplicity, we report these additional analyses of disengaged responses and performance in Appendix D.
2. Alternatively, it might be that students felt they had been found out and denied that they did not give their best effort; or they might not have put effort into evaluating their effort and simply clicked the option stating that they gave their best effort (although, if that were the case, we would expect a more random selection among the presented options, possibly skewed toward the last location of the mouse to minimize the effort of clicking an option).

## References

- Anderson, John R., and Jon M. Fincham. 2014. Extending problem-solving procedures through reflection. Cognitive Psychology 74: 1–34. [Google Scholar] [CrossRef]
- Anderson, John R. 2007. How Can the Human Mind Occur in the Physical Universe? New York: Oxford University Press. [Google Scholar]
- Arslan, Burcu, Blair Lehman, Jesse R. Sparks, and Jonathan Steinberg. 2021. Application of a theory-driven approach to detect cognitively disengaged test-taker behavior. NERA Conference Proceedings 3. [Google Scholar]
- Bates, Douglas, Martin Maechler, Ben Bolker, and Steve Walker. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67: 1–48. [Google Scholar] [CrossRef]
- Bennett, Randy Elliot. 2011. Formative assessment: A critical review. Assessment in Education: Principles, Policy & Practice 18: 5–25. [Google Scholar]
- Brown, Ann L. 1987. Metacognition, executive control, self-regulation, and other more mysterious mechanisms. In Metacognition, Motivation, and Understanding. Edited by F. Weinert and R. Kluwe. Hillsdale: Erlbaum, pp. 65–116. [Google Scholar]
- Bulut, Okan, Guher Gorgun, Tarid Wongvorachan, and Bin Tan. 2023. Rapid guessing in low-stakes assessments: Finding the optimal response time threshold with random search and genetic algorithm. Algorithms 16: 89. [Google Scholar] [CrossRef]
- Burnham, Kenneth P., and David R. Anderson. 2002. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd ed. New York: Springer. [Google Scholar]
- Carver, Ronald P. 1992. Reading rate: Theory, research, and practical implications. Journal of Reading 36: 84–95. [Google Scholar]
- Damgaard, Mette Trier, and Helena Skyt Nielsen. 2018. Nudging in education. Economics of Education Review 64: 313–42. [Google Scholar] [CrossRef]
- Deribo, Tobias, Frank Goldhammer, and Ulf Kroehne. 2023. Changes in the speed–ability relation through different treatments of rapid guessing. Educational and Psychological Measurement 83: 473–94. [Google Scholar] [CrossRef]
- Diedenhofen, Birk. 2016. cocron: Statistical Comparisons of Two or more Alpha Coefficients (Version 1.0-1). Available online: http://comparingcronbachalphas.org (accessed on 11 August 2023).
- Efklides, Anastasia. 2011. Interactions of metacognition with motivation and affect in self-regulated learning: The MASRL model. Educational Psychologist 46: 6–25. [Google Scholar] [CrossRef]
- Evans, Jonathan St B. T. 2008. Dual-processing accounts of reasoning, judgment, and social cognition. Annual Review of Psychology 59: 255–78. [Google Scholar] [CrossRef]
- Finn, Bridgid. 2015. Measuring motivation in low-stakes assessments. ETS Research Report Series 2015: 1–17. [Google Scholar] [CrossRef]
- Gneezy, Uri, John A. List, Jeffrey A. Livingston, Xiangdong Qin, Sally Sadoff, and Yang Xu. 2019. Measuring success in education: The role of effort on the test itself. American Economic Review: Insights 1: 291–308. [Google Scholar]
- Goldhammer, Frank, Thomas Martens, and Oliver Lüdtke. 2017. Conditioning factors of test-taking engagement in PIAAC: An exploratory IRT modelling approach considering person and item characteristics. Large-Scale Assessments in Education 5: 18. [Google Scholar] [CrossRef]
- Harris, Charles R., K. Jarrod Millman, Stéfan J. van der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser, Julian Taylor, Sebastian Berg, Nathaniel J. Smith, and et al. 2020. Array programming with NumPy. Nature 585: 357–62. [Google Scholar] [CrossRef] [PubMed]
- Hertwig, Ralph, and Till Grüne-Yanoff. 2017. Nudging and boosting: Steering or empowering good decisions. Perspectives on Psychological Science 12: 973–86. [Google Scholar] [CrossRef]
- Johnson, David Richard, and James C. Creech. 1983. Ordinal measures in multiple indicator models: A simulation study of categorization error. American Sociological Review 48: 398–407. [Google Scholar] [CrossRef]
- Kahneman, Daniel. 2011. Thinking, Fast and Slow. New York: Farrar, Straus, and Giroux. [Google Scholar]
- Kolers, Paul A., and Sandra R. Palef. 1977. Knowing not. Memory & Cognition 5: 553–58. [Google Scholar]
- Lau, Abigail R., Peter J. Swerdzewski, Andrew T. Jones, Robin D. Anderson, and Ross E. Markle. 2009. Proctors matter: Strategies for increasing examinee effort on general education program assessments. The Journal of General Education 58: 196–217. [Google Scholar] [CrossRef]
- Lehman, Blair, Jesse R. Sparks, Jonathan Steinberg, and Jennifer Bailey. 2022. Impact of Varied Item Position on Item-Level Test-Taking Engagement [PowerPoint Slides]. Princeton: Research and Development, Educational Testing Service. [Google Scholar]
- Lenth, Russell V. 2023. emmeans: Estimated Marginal Means, aka Least-Squares Means. R Package Version 1.8.5. Available online: https://CRAN.R-project.org/package=emmeans (accessed on 11 August 2023).
- Levitt, Steven D., John A. List, Susanne Neckermann, and Sally Sadoff. 2016. The behavioralist goes to school: Leveraging behavioral economics to improve educational performance. American Economic Journal: Applied Economics 8: 183–219. [Google Scholar]
- Lieder, Falk, and Thomas L. Griffiths. 2020. Resource-rational analysis: Understanding human cognition as the optimal use of limited computational resources. Behavioral and Brain Sciences 43: e1. [Google Scholar] [CrossRef]
- Lindner, Marlit Annalena, Oliver Lüdtke, and Gabriel Nagy. 2019. The onset of rapid-guessing behavior over the course of testing time: A matter of motivation and cognitive resources. Frontiers in Psychology 10: 1533. [Google Scholar] [CrossRef]
- Lüdecke, Daniel. 2023. sjPlot: Data Visualization for Statistics in Social Science. R Package Version 2.8.14. Available online: https://CRAN.R-project.org/package=sjPlot (accessed on 11 August 2023).
- Maier, Maximilian, František Bartoš, T. D. Stanley, David R. Shanks, Adam J. L. Harris, and Eric-Jan Wagenmakers. 2022. No evidence for nudging after adjusting for publication bias. Proceedings of the National Academy of Sciences 119: e2200300119. [Google Scholar] [CrossRef]
- McKinney, Wes. 2010. Data structures for statistical computing in python. Paper presented at 9th Python in Science Conference, Austin, TX, USA, June 28–July 3; vol. 445, pp. 51–56. [Google Scholar]
- Mertens, Stephanie, Mario Herberz, Ulf J. J. Hahnel, and Tobias Brosch. 2022. The effectiveness of nudging: A meta-analysis of choice architecture interventions across behavioral domains. Proceedings of the National Academy of Sciences 119: e2107346118. [Google Scholar] [CrossRef]
- Metcalfe, Janet, and Bridgid Finn. 2008. Familiarity and retrieval processes in delayed judgments of learning. Journal of Experimental Psychology: Learning, Memory, and Cognition 34: 1084–97. [Google Scholar] [CrossRef] [PubMed]
- Metcalfe, Janet, and David Wiebe. 1987. Intuition in insight and noninsight problem solving. Memory & Cognition 15: 238–246. [Google Scholar]
- Miele, David B., and Abigail A. Scholer. 2018. The role of metamotivational monitoring in motivation regulation. Educational Psychologist 53: 1–21. [Google Scholar] [CrossRef]
- Münscher, Robert, Max Vetter, and Thomas Scheuerle. 2016. A review and taxonomy of choice architecture techniques. Journal of Behavioral Decision Making 29: 511–24. [Google Scholar]
- Nagy, Gabriel, Esther Ulitzsch, and Marlit Annalena Lindner. 2023. The role of rapid guessing and test-taking persistence in modelling test-taking engagement. Journal of Computer Assisted Learning 39: 751–66. [Google Scholar] [CrossRef]
- Norman, Geoff. 2010. Likert scales, levels of measurement and the “laws” of statistics. Advances in Health Sciences Education 15: 625–32. [Google Scholar] [CrossRef] [PubMed]
- Paynter, Christopher A., Lynne M. Reder, and Paul D. Kieffaber. 2009. Knowing we know before we know: ERP correlates of initial feeling-of-knowing. Neuropsychologia 47: 796–803. [Google Scholar] [CrossRef] [PubMed]
- R Development Core Team. 2023. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing. Available online: https://www.R-project.org/ (accessed on 11 August 2023).
- Ramkumar, Anjana, Pieter Jan Stappers, Wiro J. Niessen, Sonja Adebahr, Tanja Schimek-Jasch, Ursula Nestle, and Yu Song. 2017. Using GOMS and NASA-TLX to evaluate human-computer interaction process in interactive segmentation. International Journal of Human-Computer Interaction 33: 123–34. [Google Scholar] [CrossRef]
- Reder, Lynne M., and Frank E. Ritter. 1992. What determines initial feeling of knowing? Familiarity with question terms, not with the answer. Journal of Experimental Psychology: Learning, Memory, and Cognition 18: 435. [Google Scholar] [CrossRef]
- Rowland, Christopher A. 2014. The effect of testing versus restudy on retention: A meta-analytic review of the testing effect. Psychological Bulletin 140: 1432. [Google Scholar] [CrossRef] [PubMed]
- Schneider, Wolfgang, and Cordula Artelt. 2010. Metacognition and mathematics education. ZDM 42: 149–61. [Google Scholar] [CrossRef]
- Simon, Herbert A. 1955. A behavioral model of rational choice. The Quarterly Journal of Economics 69: 99–118. [Google Scholar] [CrossRef]
- Soland, James, Megan Kuhfeld, and Joseph Rios. 2021. Comparing different response time threshold setting methods to detect low effort on a large-scale assessment. Large-Scale Assessments in Education 9: 8. [Google Scholar] [CrossRef]
- Son, Lisa K., and Janet Metcalfe. 2005. Judgments of learning: Evidence for a two-stage process. Memory & Cognition 33: 1116–29. [Google Scholar]
- Stanovich, Keith E., and Richard F. West. 2000. Individual differences in reasoning: Implications for the rationality debate? Behavioral and Brain Sciences 23: 645–726. [Google Scholar] [CrossRef]
- Sullivan, Gail M., and Anthony R. Artino Jr. 2013. Analyzing and interpreting data from Likert-type scales. Journal of Graduate Medical Education 5: 541–42. [Google Scholar] [CrossRef]
- Thaler, Richard H., and Cass R. Sunstein. 2008. Nudge: Improving Decisions About Health, Wealth, and Happiness. London: Penguin. [Google Scholar]
- Thaler, Richard H., Cass R. Sunstein, and John P. Balz. 2013. Choice architecture. The Behavioral Foundations of Public Policy 25: 428–39. [Google Scholar]
- van der Sande, Lisa, Ilona Wildeman, Adriana G. Bus, and Roel van Steensel. 2023. Nudging to stimulate reading in primary and secondary education. Sage Open 13: 2. [Google Scholar] [CrossRef]
- Van Rossum, Guido, and Fred L. Drake. 2009. Python 3 Reference Manual. Scotts Valley: CreateSpace. [Google Scholar]
- Weijers, Robert J., Björn B. de Koning, and Fred Paas. 2021. Nudging in education: From theory towards guidelines for successful implementation. European Journal of Psychology of Education 36: 883–902. [Google Scholar] [CrossRef]
- Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, Alex Hayes, Lionel Henry, Jim Hester, and et al. 2019. Welcome to the tidyverse. Journal of Open Source Software 4: 1686. [Google Scholar] [CrossRef]
- Wise, Steven L. 2017. Rapid-guessing behavior: Its identification, interpretation, and implications. Educational Measurement: Issues and Practice 36: 52–61. [Google Scholar] [CrossRef]
- Wise, Steven L. 2019. An information-based approach to identifying rapid-guessing thresholds. Applied Measurement in Education 32: 325–36. [Google Scholar] [CrossRef]
- Wise, Steven L., and Christine E. DeMars. 2005. Low examinee effort in low-stakes assessment: Problems and potential solutions. Educational Assessment 10: 1–17. [Google Scholar] [CrossRef]
- Wise, Steven L., and Lingling Ma. 2012. Setting response time thresholds for a CAT item pool: The normative threshold method. Paper presented at the Annual Meeting of the National Council on Measurement in Education, Vancouver, BC, Canada, April 4–7. [Google Scholar]
- Wise, Steven L., and Lingyun Gao. 2017. A general approach to measuring test-taking effort on computer-based tests. Applied Measurement in Education 30: 343–54. [Google Scholar] [CrossRef]
- Wise, Steven L., and Lisa F. Smith. 2011. A model of examinee test-taking effort. In High-Stakes Testing in Education: Science and Practice in K-12 Settings. Edited by James A. Bovaird, Kurt F. Geisinger and Chad W. Buckendahl. Washington, DC: American Psychological Association, pp. 139–53. [Google Scholar]
- Wise, Steven L., and Megan R. Kuhfeld. 2020. Using retest data to evaluate and improve effort-moderated scoring. Journal of Educational Measurement 58: 130–49. [Google Scholar] [CrossRef]
- Wise, Steven L., and Xiaojing Kong. 2005. Response time effort: A new measure of examinee motivation in computer-based tests. Applied Measurement in Education 18: 163–83. [Google Scholar] [CrossRef]
- Wise, Steven L., Dena A. Pastor, and Xiaojing J. Kong. 2009. Correlates of rapid-guessing behavior in low-stakes testing: Implications for test development and measurement practice. Applied Measurement in Education 22: 185–205. [Google Scholar] [CrossRef]
- Wise, Steven L., Dennison S. Bhola, and Sheng-Ta Yang. 2006. Taking the time to improve the validity of low-stakes tests: The effort-monitoring CBT. Educational Measurement: Issues and Practice 25: 21–30. [Google Scholar] [CrossRef]
- Wise, Steven L., Megan R. Kuhfeld, and James Soland. 2019. The effects of effort monitoring with proctor notification on test-taking engagement, test performance, and validity. Applied Measurement in Education 32: 183–92. [Google Scholar] [CrossRef]
- Wise, Steven, and Megan Kuhfeld. 2021. A method for identifying partial test-taking engagement. Applied Measurement in Education 34: 150–61. [Google Scholar] [CrossRef]
- Zumbo, Bruno D., and Donald W. Zimmerman. 1993. Is the selection of statistical methods governed by level of measurement? Canadian Psychology 34: 390–400. [Google Scholar] [CrossRef]

**Figure 2.** A Sankey diagram showing the number and percentage of students’ selections among overall observations.

**Figure 3.** Percentage of students’ ratings in the nudge condition for the statements: (**a**) “I received warning messages whenever I did not give my best effort”, and (**b**) “I received warning messages whenever I did not know the answer”.

**Figure 4.** The relationship between (**a**) perceived effort and response time effort (RTE); (**b**) perceived difficulty and actual math score; (**c**) expected performance and actual performance; and (**d**) feeling very nervous due to the mathematics assessment and RTE in each condition.

| Features | Wise et al. (2006) | The Present Study |
|---|---|---|
| Sample | University students | 8th-grade students |
| Domain | Scientific reasoning and fine arts | Mathematics |
| Experimental manipulations | Warning vs. Control | Nudge vs. Control; Nudge vs. Instruction |
| Disengagement measure | Rapid guessing behavior | Not-fully-effortful responses |
| Method to detect disengagement | Data-driven | Theory-driven |
| Item navigation | Students had to answer each item and were not allowed to go back to an item after submitting their answer. | Students could omit answers and were able to navigate between items within a block. |
| Warning/Nudge algorithm | The first warning was presented after detecting three consecutive RGBs. The second warning was presented if the students had another three consecutive RGBs. | Students were presented with a nudge to give their best effort following each first-attempt response that was both incorrect and not fully effortful. |
| Outcome measures | Response time effort, total score | Item response disengagement, item score |

**Table 2.** Odds ratios, confidence intervals (CI), and p-values of the final generalized mixed-effects model (binomial) for item response disengagement and item score.

|  | Item Response Disengagement | | | Item Score | | |
|---|---|---|---|---|---|---|
| Predictors | Odds Ratios | CI (95%) | p-Value | Odds Ratios | CI (95%) | p-Value |
| (Intercept) | 0.01 | 0.00–0.01 | <.001 | 0.89 | 0.50–1.60 | .700 |
| Condition (Control) | 1.44 | 1.06–1.96 | .021 | 0.91 | 0.80–1.03 | .144 |
| Condition (Instruction) | 1.60 | 1.17–2.18 | .003 | 0.97 | 0.85–1.10 | .584 |
| Block (2) | 2.87 | 2.55–3.24 | <.001 | 0.28 | 0.24–0.34 | <.001 |
| Current Math Grade (B) | 2.62 | 1.81–3.79 | <.001 | 0.88 | 0.82–0.94 | <.001 |
| Current Math Grade (C) | 4.13 | 2.81–6.05 | <.001 | 0.60 | 0.52–0.69 | <.001 |
| Current Math Grade (D) | 5.48 | 3.58–8.37 | <.001 | 0.38 | 0.33–0.45 | <.001 |
| Current Math Grade (F) | 8.09 | 5.26–12.43 | <.001 | 0.32 | 0.27–0.39 | <.001 |
| Random Effects | | | | | | |
| σ^{2} | 3.29 | | | 3.29 | | |
| τ_{00 UserID:Teacher} | 1.99 | | | 0.28 | | |
| τ_{00 ItemID} | 0.63 | | | 1.54 | | |
| τ_{00 Teacher} | 1.18 | | | 0.29 | | |
| ICC | 0.54 | | | 0.39 | | |
| N_{UserID} | 782 | | | 780 | | |
| N_{Teacher} | 12 | | | 12 | | |
| N_{ItemID} | 26 | | | 26 | | |
| Observations | 18,295 | | | 18,278 | | |
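The odds ratios in Table 2 are exponentiated logit-scale coefficients, and the 95% CIs follow from the coefficient standard errors. As a worked check against the disengagement model's Condition [Control] coefficient in Table A1 (estimate 0.364, SE 0.158); the helper name below is an assumption:

```python
# Converting a logistic-regression coefficient and its SE into the odds ratio
# and 95% CI reported in Table 2. 1.96 is the standard normal 95% quantile.
import math

def odds_ratio_ci(estimate: float, se: float) -> tuple[float, float, float]:
    """Return (odds ratio, CI lower bound, CI upper bound)."""
    return (math.exp(estimate),
            math.exp(estimate - 1.96 * se),
            math.exp(estimate + 1.96 * se))

# Condition [Control], disengagement model (Table A1): estimate 0.364, SE 0.158
or_, lo, hi = odds_ratio_ci(0.364, 0.158)
# rounds to OR = 1.44 with CI 1.06-1.96, matching Table 2
```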


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

Arslan, B.; Finn, B.
The Effects of Personalized Nudges on Cognitively Disengaged Student Behavior in Low-Stakes Assessments. *J. Intell.* **2023**, *11*, 204.
https://doi.org/10.3390/jintelligence11110204