# A Comparative Study of Item Response Theory Models for Mixed Discrete-Continuous Responses


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Data

#### 2.2. Zero-and-One-Inflated Item Response Models for Bounded Continuous Data

Based on Johnson's S_B distribution, the Continuous Response Model is a special case within Samejima's Graded Response Model framework. The Simplex IRT model utilizes the simplex distribution, which, while less common, offers an alternative approach to modeling bounded continuous data. This model is useful in contexts such as response time analysis, where the data are naturally bounded within a specific range. In this section, we provide a general introduction to the model structure shared by all of these models. All three models operate under the same structure; the only difference is the model-specific density function used for the continuous part of the distribution.
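The shared structure can be illustrated with a small simulation: a response lands exactly at the boundary values 0 or 1 with probabilities from the discrete part of the model, and otherwise is drawn from a bounded continuous density on (0, 1). The sketch below is a hedged illustration, not the paper's exact specification — the multinomial-logit boundary probabilities (including the hypothetical intercept `-3.0` for the mass at 1) and the Beta density with a 2PL-type mean function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_response(theta, alpha, beta, gamma0, gamma1, delta):
    """Draw one mixed discrete-continuous response on [0, 1].

    Discrete part: point masses at 0 and 1 via a multinomial-logit form
    (illustrative assumption). Continuous part: a Beta density whose mean
    follows a 2PL-type item response function of the latent trait theta,
    with discrimination alpha, difficulty beta, and dispersion delta.
    """
    z0 = gamma0 - gamma1 * theta   # propensity of a boundary response at 0
    z1 = -3.0 + gamma1 * theta     # propensity of a boundary response at 1 (hypothetical)
    denom = 1.0 + np.exp(z0) + np.exp(z1)
    p0, p1 = np.exp(z0) / denom, np.exp(z1) / denom
    u = rng.uniform()
    if u < p0:
        return 0.0
    if u < p0 + p1:
        return 1.0
    # Continuous part: Beta in mean-precision form, a = mu*delta, b = (1-mu)*delta
    mu = 1.0 / (1.0 + np.exp(-alpha * (theta - beta)))
    return rng.beta(mu * delta, (1.0 - mu) * delta)

# Responses for examinees with standard-normal latent proficiency
thetas = rng.normal(size=500)
y = np.array([simulate_response(t, alpha=0.6, beta=2.2,
                                gamma0=-11.6, gamma1=0.1, delta=3.5)
              for t in thetas])
print(f"share of boundary responses: {np.mean((y == 0.0) | (y == 1.0)):.3f}")
```

The same skeleton serves all three models: only the draw for the continuous part changes when the Beta density is swapped for the S_B or simplex density.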

#### 2.3. Incorporating Collateral Information

#### 2.4. Model Fitting in Stan

#### 2.5. Disclosure of the Use of AI or AI-Assisted Technologies

## 3. Results

#### 3.1. Model Comparison and Prediction Error

#### 3.2. Model Fit

#### 3.3. Parameter Estimates

## 4. Discussion

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Appendix A. Model-Specific Probability Density Functions

**SB_IRT Model**

**Beta IRT Model**

**Simplex IRT Model**
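As an illustrative sketch of the three candidate densities on (0, 1), the code below implements standard textbook forms: the Beta density in a mean-precision parameterization, the Johnson S_B (logit-normal) density, and the simplex density of Barndorff-Nielsen and Jørgensen. These are assumed parameterizations and may differ in detail from the forms used in the models above.

```python
import numpy as np
from scipy import stats

def beta_pdf(y, mu, delta):
    """Beta density, mean-precision form (assumed): a = mu*delta, b = (1-mu)*delta."""
    return stats.beta.pdf(y, mu * delta, (1.0 - mu) * delta)

def sb_pdf(y, mu, sigma):
    """Johnson S_B (logit-normal) density: logit(y) ~ Normal(mu, sigma)."""
    z = np.log(y / (1.0 - y))
    return stats.norm.pdf(z, mu, sigma) / (y * (1.0 - y))

def simplex_pdf(y, mu, sigma2):
    """Simplex density with unit deviance
    d(y; mu) = (y - mu)^2 / (y (1 - y) mu^2 (1 - mu)^2)."""
    d = (y - mu) ** 2 / (y * (1.0 - y) * mu**2 * (1.0 - mu) ** 2)
    norm_const = np.sqrt(2.0 * np.pi * sigma2 * (y * (1.0 - y)) ** 3)
    return np.exp(-d / (2.0 * sigma2)) / norm_const

# Sanity check: each density should integrate to roughly 1 over (0, 1)
from scipy.integrate import trapezoid
grid = np.linspace(1e-6, 1.0 - 1e-6, 200_001)
for name, pdf in [("Beta", beta_pdf(grid, 0.5, 4.0)),
                  ("S_B", sb_pdf(grid, 0.8, 1.0)),
                  ("Simplex", simplex_pdf(grid, 0.7, 2.0))]:
    print(f"{name}: integral = {trapezoid(pdf, grid):.4f}")
```

All three densities share the same support and mean-type location parameter; they differ mainly in how the dispersion parameter shapes the tails near the boundaries, which is what Figure 1 visualizes.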

## References

- Adams, Raymond J., Mark Wilson, and Margaret Wu. 1997. Multilevel Item Response Models: An Approach to Errors in Variables Regression. Journal of Educational and Behavioral Statistics 22: 47–76. [Google Scholar] [CrossRef]
- Bejar, Isaac I. 1977. An Application of the Continuous Response Level Model to Personality Measurement. Applied Psychological Measurement 1: 509–21. [Google Scholar] [CrossRef]
- Betancourt, Michael. 2018. A Conceptual Introduction to Hamiltonian Monte Carlo. arXiv arXiv:1701.02434. Available online: http://arxiv.org/abs/1701.02434 (accessed on 25 December 2020).
- Bommasani, Rishi, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, and et al. 2021. On the Opportunities and Risks of Foundation Models. arXiv arXiv:2108.07258. Available online: http://arxiv.org/abs/2108.07258 (accessed on 18 August 2021).
- Bonifay, Wes, and Li Cai. 2017. On the complexity of item response theory models. Multivariate Behavioral Research 52: 465–84. [Google Scholar] [CrossRef] [PubMed]
- Brooks, Steve P., and Andrew Gelman. 1998. General Methods for Monitoring Convergence of Iterative Simulations. Journal of Computational and Graphical Statistics 7: 434–55. [Google Scholar] [CrossRef]
- Brumfitt, Shelagh M., and Paschal Sheeran. 1999. The development and validation of the Visual Analogue Self-Esteem Scale (VASES). British Journal of Clinical Psychology 38: 387–400. [Google Scholar] [CrossRef] [PubMed]
- Buck, Gary. 2001. Assessing Listening. Cambridge: Cambridge University Press. [Google Scholar]
- Cardwell, Ramsey, Ben Naismith, Geoffrey T. LaFlair, and Steven Nydick. 2023. Duolingo English Test: Technical Manual. Pittsburgh: Duolingo. Available online: https://duolingo-papers.s3.amazonaws.com/other/technical_manual.pdf (accessed on 30 March 2023).
- de la Torre, Jimmy. 2003. Improving the Accuracy of Item Response Theory Parameter Estimates through Simultaneous Estimation and Incorporation of Ancillary Variables. Ph.D. dissertation, University of Illinois at Urbana-Champaign, Urbana, IL, USA. Available online: https://www.proquest.com/docview/288199771/abstract/5B88C5006775440APQ/1 (accessed on 29 April 2023).
- Ergin, Ezgi Ayturk. 2020. Fitting Propensities of Item Response Theory Models. Ph.D. dissertation, Fordham University, New York, NY, USA. Available online: https://www.proquest.com/docview/2416910996 (accessed on 17 January 2024).
- Flores, Sandra, Jorge Luis Bazán, and Heleno Bolfarine. 2020. A Hierarchical Joint Model for Bounded Response Time and Response Accuracy. In Quantitative Psychology. Edited by Marie Wiberg, Dylan Molenaar, Jorge González, Ulf Böckenholt and Jee-Seon Kim. Cham: Springer International Publishing, pp. 95–109. [Google Scholar] [CrossRef]
- Guthrie, John T., Mary Seifert, Nancy A. Burnham, and Ronald I. Caplan. 1974. The Maze Technique to Assess, Monitor Reading Comprehension. The Reading Teacher 28: 161–68. [Google Scholar]
- Hall, Erika. 2007. Using Collateral Item and Examinee Information to Improve IRT Item Parameter Estimation. Ph.D. dissertation, The University of Iowa, Iowa City, IA, USA. Available online: https://www.proquest.com/docview/304856966/abstract/CA86530008C542E7PQ/1 (accessed on 29 April 2023).
- Hoffman, Matthew D., and Andrew Gelman. 2011. The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo. arXiv arXiv:1111.4246. Available online: http://arxiv.org/abs/1111.4246 (accessed on 25 December 2020).
- Joo, Seang-Hwane, Philseo Lee, and Stephen Stark. 2022. The Explanatory Generalized Graded Unfolding Model: Incorporating Collateral Information to Improve the Latent Trait Estimation Accuracy. Applied Psychological Measurement 46: 3–18. [Google Scholar] [CrossRef] [PubMed]
- Levenshtein, Vladimir I. 1965. Binary codes capable of correcting deletions, insertions, and reversals. Doklady Akademii Nauk SSSR 163: 845–48, English translation in Soviet Physics Doklady 10: 707–10, 1966. [Google Scholar]
- McElreath, Richard. 2017. Markov Chains: Why Walk When You Can Flow? Elements of Evolutionary Anthropology. November 28. Available online: https://elevanth.org/blog/2017/11/28/build-a-better-markov-chain/ (accessed on 25 December 2020).
- Mellenbergh, Gideon J. 1994. A Unidimensional Latent Trait Model for Continuous Item Responses. Multivariate Behavioral Research 29: 223–36. [Google Scholar] [CrossRef] [PubMed]
- Mislevy, Robert J., and Kathleen M. Sheehan. 1989. The role of collateral information about examinees in item parameter estimation. Psychometrika 54: 661–79. [Google Scholar] [CrossRef]
- Molenaar, Dylan, Mariana Cúri, and Jorge L. Bazán. 2022. Zero and One Inflated Item Response Theory Models for Bounded Continuous Data. Journal of Educational and Behavioral Statistics 47: 693–735. [Google Scholar] [CrossRef]
- Müller, Hans. 1987. A rasch model for continuous ratings. Psychometrika 52: 165–81. [Google Scholar] [CrossRef]
- Noel, Yvonnick, and Bruno Dauvier. 2007. A Beta Item Response Model for Continuous Bounded Responses. Applied Psychological Measurement 31: 47–73. [Google Scholar] [CrossRef]
- R Core Team. 2022. R: A Language and Environment for Statistical Computing. [Computer Software]. Vienna: R Foundation for Statistical Computing. Available online: https://www.R-project.org/ (accessed on 11 May 2023).
- Raatz, Ulrich, and Christine Klein-Braley. 1981. The C-Test—A Modification of the Cloze Procedure. In Practice and Problems in Language Testing. Edited by Terry Culhane, Christine Klein-Braley and Douglas Keith Stevenson. Essex: University of Essex Occasional Papers, pp. 113–48. [Google Scholar]
- Rubin, Donald B. 1984. Bayesianly justifiable and relevant frequency calculations for the applied statistician. Annals of Statistics 12: 1151–72. [Google Scholar] [CrossRef]
- Samejima, Fumiko. 1969. Estimation of latent ability using a response pattern of graded scores. Psychometrika 34: 1–97. [Google Scholar] [CrossRef]
- Samejima, Fumiko. 1973. Homogeneous case of the continuous response model. Psychometrika 38: 203–19. [Google Scholar] [CrossRef]
- Sinharay, Sandip, Matthew S. Johnson, and Hal S. Stern. 2006. Posterior predictive assessment of item response theory models. Applied Psychological Measurement 30: 298–321. [Google Scholar] [CrossRef]
- Stan Development Team. 2022a. CmdStan User’s Guide (2.32) [Computer Software]. Available online: https://mc-stan.org/docs/2_32/cmdstan-guide-2_32.pdf (accessed on 11 May 2023).
- Stan Development Team. 2022b. RStan: The R interface to Stan [Computer Software]. Available online: https://mc-stan.org/rstan/index.html (accessed on 11 May 2023).
- Stan Development Team. 2022c. Stan User’s Guide (2.29) [Computer Software]. Available online: https://mc-stan.org (accessed on 14 July 2021).
- Stenhaug, Benjamin A., and Benjamin W. Domingue. 2022. Predictive fit metrics for item response models. Applied Psychological Measurement 46: 136–55. [Google Scholar] [CrossRef] [PubMed]
- Tao, Shuqin. 2009. Using Collateral Information in the Estimation of Sub-Scores—A Fully Bayesian Approach. Ph.D. dissertation, University of Iowa, Iowa City, IA, USA. [Google Scholar] [CrossRef]
- University of Oregon. 2018–2020. 8th Edition of Dynamic Indicators of Basic Early Literacy Skills. [Technical Manual]. Eugene: University of Oregon. Available online: https://dibels.uoregon.edu/sites/default/files/DIBELS8-TechnicalManual_04152020.pdf (accessed on 27 April 2023).

**Figure 1.** Comparison of model-generated response distributions for the Beta, SB, and Simplex IRT models. Latent proficiency is assumed to follow a standard normal distribution. All item parameters except the dispersion parameter were the same across models.

**Figure 2.** Comparison of the sum of squared prediction errors across six folds for the Beta, SB, and Simplex IRT models with and without latent regression. The horizontal line for each fold represents the baseline prediction error when the average response is used as the prediction. A smaller sum of squared errors indicates better performance.
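The fold-wise comparison in Figure 2 can be reproduced in miniature: for each held-out fold, sum the squared differences between observed and predicted responses, and compare against the baseline error obtained by predicting every response with the average observed response. The data and model predictions below are synthetic placeholders, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(7)

def sum_squared_error(observed, predicted):
    """Sum of squared prediction errors for one held-out fold."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sum((observed - predicted) ** 2))

# Synthetic stand-ins for six folds of bounded responses on [0, 1]
folds = [rng.beta(5, 1, size=200) for _ in range(6)]

for i, y_obs in enumerate(folds, start=1):
    # Baseline: predict the average response for every observation
    baseline = sum_squared_error(y_obs, np.full_like(y_obs, y_obs.mean()))
    # Model predictions: synthetic stand-in (observed plus noise, clipped to [0, 1])
    y_hat = np.clip(y_obs + rng.normal(0.0, 0.05, size=y_obs.size), 0.0, 1.0)
    model = sum_squared_error(y_obs, y_hat)
    print(f"fold {i}: model SSE {model:.2f} vs baseline SSE {baseline:.2f}")
```

Note that the baseline SSE for a fold equals the fold size times the variance of its observed responses, which is why it serves as a natural "no-model" reference line.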

**Figure 3.** Density plots of the observed sum score distribution (dashed line) and distributions of sum scores from 3000 posterior samples (gray area) for each model.

**Figure 4.** Comparison of average item scores from observed data and posterior predictive distributions of model-generated data.

**Figure 5.** Comparison of standard deviations of item scores from observed data and posterior predictive distributions of model-generated data.

**Figure 6.** The relationships among the item parameter estimates obtained from the Beta, SB, and Simplex IRT models.

**Figure 7.** The relationships among the person parameter estimates obtained from the Beta, SB, and Simplex IRT models.

**Table 1.** Descriptive statistics for the sum scores from observed data and the average of the posterior distribution of sum scores.

| | Mean | SD | Skewness | Kurtosis |
|---|---|---|---|---|
| Beta IRT | 5.56 | 0.36 | −1.96 | 7.76 |
| SB IRT | 5.56 | 0.36 | −1.87 | 6.72 |
| Simplex IRT | 5.56 | 0.37 | −1.93 | 6.90 |
| Observed Data | 5.56 | 0.38 | −2.90 | 22.63 |

**Table 2.** Descriptive statistics for item and person parameters estimated from the Beta, SB, and Simplex IRT models with latent regression.

| Parameters | Beta IRT: Mean | SD | Min | Max | SB IRT: Mean | SD | Min | Max | Simplex IRT: Mean | SD | Min | Max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\theta$ | 0.00 | 0.86 | −11.86 | 3.05 | 0.00 | 0.85 | −6.99 | 3.17 | 0.00 | 0.85 | −10.17 | 3.20 |
| $\beta$ | 2.18 | 0.53 | −0.09 | 5.15 | 2.41 | 0.54 | −0.16 | 5.27 | 2.20 | 0.51 | 0.12 | 4.52 |
| $\alpha$ | 0.61 | 0.23 | 0.02 | 1.69 | 0.62 | 0.25 | 0.03 | 1.72 | 0.59 | 0.22 | 0.00 | 1.77 |
| ${\gamma}_{0}$ | −11.62 | 1.97 | −14.51 | −4.77 | −11.65 | 1.95 | −15.05 | −4.81 | −11.64 | 1.96 | −14.56 | −4.77 |
| ${\gamma}_{1}$ | 0.10 | 1.27 | −4.47 | 4.62 | 0.09 | 1.28 | −4.64 | 4.35 | 0.08 | 1.25 | −4.93 | 4.38 |
| $\delta$ * | 3.53 | 0.97 | −0.05 | 13.21 | 0.66 | 0.28 | 0.04 | 3.30 | 8.41 | 5.16 | 0.26 | 36.45 |

Note: each group of four columns (Mean, SD, Min, Max) corresponds to one model with latent regression.

| | Writing (${\xi}_{1}$): Posterior Mean | 95% Credible Interval | Speaking (${\xi}_{2}$): Posterior Mean | 95% Credible Interval |
|---|---|---|---|---|
| Beta IRT | 0.351 | (0.346, 0.354) | 0.345 | (0.340, 0.349) |
| SB IRT | 0.347 | (0.342, 0.351) | 0.352 | (0.347, 0.356) |
| Simplex IRT | 0.330 | (0.325, 0.334) | 0.346 | (0.342, 0.351) |


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Zopluoglu, C.; Lockwood, J.R.
A Comparative Study of Item Response Theory Models for Mixed Discrete-Continuous Responses. *J. Intell.* **2024**, *12*, 26.
https://doi.org/10.3390/jintelligence12030026
