# Statistical Properties of Estimators of the RMSD Item Fit Statistic


## Abstract


## 1. Introduction

## 2. RMSD Item Fit Statistic

#### 2.1. Unbiasedness of the Population Value of the RMSD Statistic for a Correctly Specified IRT Model

#### 2.2. Population RMSD Statistic for Misspecified IRT Models

#### 2.3. On the Positive Bias of the Sample-Based RMSD Statistic

## 3. Bias-Corrected RMSD Estimators

#### 3.1. Analytical Bias Correction

#### 3.2. Bootstrap and Jackknife Bias Correction
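The generic textbook form of both resampling corrections can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's implementation: the statistic used here is a toy one (the plug-in variance, which is biased downward), because computing the RMSD itself would require a fitted IRT model. The function names `plug_in_variance`, `bootstrap_bias_corrected`, and `jackknife_bias_corrected`, the number of bootstrap samples, and the seeds are all hypothetical choices. The variable names `orig`, `bbc`, and `jbc` mirror the column labels used in the tables, on the assumption that those labels denote the original, bootstrap bias-corrected, and jackknife bias-corrected estimators.

```python
import random
import statistics


def plug_in_variance(x):
    """Toy statistic with a known downward bias: variance with divisor n."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def bootstrap_bias_corrected(x, statistic, n_boot=2000, seed=0):
    """Subtract the bootstrap bias estimate, mean(theta*) - theta_hat."""
    rng = random.Random(seed)
    n = len(x)
    theta_hat = statistic(x)
    # Resample n observations with replacement and recompute the statistic.
    boot = [statistic(rng.choices(x, k=n)) for _ in range(n_boot)]
    bias_hat = sum(boot) / n_boot - theta_hat
    return theta_hat - bias_hat


def jackknife_bias_corrected(x, statistic):
    """Subtract the jackknife bias estimate, (n - 1) * (mean(theta_(i)) - theta_hat)."""
    n = len(x)
    theta_hat = statistic(x)
    # Recompute the statistic with each observation left out in turn.
    loo = [statistic(x[:i] + x[i + 1:]) for i in range(n)]
    bias_hat = (n - 1) * (sum(loo) / n - theta_hat)
    return theta_hat - bias_hat


# Illustrative data only (assumption): 20 standard normal draws.
data_rng = random.Random(1)
x = [data_rng.gauss(0.0, 1.0) for _ in range(20)]

orig = plug_in_variance(x)
bbc = bootstrap_bias_corrected(x, plug_in_variance)
jbc = jackknife_bias_corrected(x, plug_in_variance)
print(orig, bbc, jbc, statistics.variance(x))
```

For this toy statistic, the jackknife correction reproduces the usual unbiased sample variance (divisor n - 1) exactly, which makes the sketch easy to check. For a nonnegative statistic such as the RMSD, a subtractive correction can in principle produce negative values; how such cases are handled is not covered by this sketch.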

## 4. Numerical Experiments

#### 4.1. Study 1: Correctly Specified IRT Model

#### 4.2. Study 2: Simulated 2PL Model, but Fitted 1PL Model

#### 4.3. Study 3: Unbalanced Differential Item Functioning

#### 4.4. Study 4: Comparing Balanced and Unbalanced Differential Item Functioning

## 5. Discussion

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## Abbreviations

Abbreviation | Meaning
---|---
1PL | one-parameter logistic model
2PL | two-parameter logistic model
DIF | differential item functioning
IRF | item response function
IRT | item response theory
LSA | large-scale assessment
PISA | Programme for International Student Assessment
RMSD | root mean square deviation
RMSE | root mean square error


**Table 1.** Study 1: Mean, standard deviation (SD), and root mean square error (RMSE) for different estimators of the RMSD statistic in a test with $I=9$ items as a function of sample size $N$.

Item | N | Mean (orig) | Mean (abc) | Mean (bbc) | Mean (jbc) | SD (orig) | SD (abc) | SD (bbc) | SD (jbc) | RMSE (orig) | RMSE (abc) | RMSE (bbc) | RMSE (jbc)
---|---|---|---|---|---|---|---|---|---|---|---|---|---
1 | 125 | 0.042 | 0.020 | 0.013 | 0.013 | 0.020 | 0.025 | 0.023 | 0.023 | 0.046 | 0.032 | 0.026 | 0.026
1 | 250 | 0.029 | 0.014 | 0.009 | 0.009 | 0.014 | 0.018 | 0.016 | 0.016 | 0.032 | 0.022 | 0.018 | 0.018
1 | 500 | 0.021 | 0.010 | 0.006 | 0.006 | 0.010 | 0.013 | 0.011 | 0.011 | 0.023 | 0.016 | 0.013 | 0.013
1 | 1000 | 0.015 | 0.007 | 0.005 | 0.005 | 0.007 | 0.009 | 0.008 | 0.008 | 0.017 | 0.012 | 0.010 | 0.010
1 | 2000 | 0.010 | 0.005 | 0.003 | 0.003 | 0.005 | 0.006 | 0.006 | 0.006 | 0.011 | 0.008 | 0.007 | 0.007
2 | 125 | 0.044 | 0.021 | 0.014 | 0.014 | 0.021 | 0.026 | 0.024 | 0.024 | 0.048 | 0.034 | 0.028 | 0.028
2 | 250 | 0.031 | 0.014 | 0.009 | 0.009 | 0.014 | 0.019 | 0.017 | 0.017 | 0.034 | 0.024 | 0.019 | 0.019
2 | 500 | 0.022 | 0.011 | 0.007 | 0.007 | 0.010 | 0.013 | 0.012 | 0.012 | 0.025 | 0.017 | 0.014 | 0.014
2 | 1000 | 0.015 | 0.007 | 0.005 | 0.005 | 0.007 | 0.009 | 0.008 | 0.008 | 0.017 | 0.012 | 0.009 | 0.009
2 | 2000 | 0.011 | 0.006 | 0.004 | 0.004 | 0.005 | 0.007 | 0.006 | 0.006 | 0.012 | 0.009 | 0.007 | 0.007
3 | 125 | 0.039 | 0.023 | 0.013 | 0.012 | 0.018 | 0.023 | 0.021 | 0.021 | 0.043 | 0.033 | 0.025 | 0.024
3 | 250 | 0.028 | 0.017 | 0.009 | 0.009 | 0.013 | 0.017 | 0.015 | 0.015 | 0.031 | 0.024 | 0.018 | 0.018
3 | 500 | 0.019 | 0.011 | 0.006 | 0.006 | 0.009 | 0.012 | 0.011 | 0.011 | 0.022 | 0.016 | 0.012 | 0.012
3 | 1000 | 0.014 | 0.008 | 0.004 | 0.004 | 0.006 | 0.008 | 0.007 | 0.007 | 0.015 | 0.011 | 0.008 | 0.008
3 | 2000 | 0.010 | 0.006 | 0.003 | 0.003 | 0.005 | 0.006 | 0.005 | 0.005 | 0.011 | 0.008 | 0.006 | 0.006
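The three column groups of Table 1 are linked by the usual decomposition RMSE² = bias² + SD². The following Python sketch recomputes the RMSE from the reported mean and SD for two rows of item 1. It assumes that the true RMSD value in Study 1 is zero, in line with the correctly specified model of Section 2.1, and that the SD is taken across simulation replications. The function name `rmse_from_mean_sd` and the dictionary layout are illustrative choices, not part of the article.

```python
import math

# Values copied from Table 1, item 1: N -> estimator -> (mean, SD, reported RMSE).
TABLE1_ITEM1 = {
    125: {
        "orig": (0.042, 0.020, 0.046),
        "abc": (0.020, 0.025, 0.032),
        "bbc": (0.013, 0.023, 0.026),
        "jbc": (0.013, 0.023, 0.026),
    },
    2000: {
        "orig": (0.010, 0.005, 0.011),
        "abc": (0.005, 0.006, 0.008),
        "bbc": (0.003, 0.006, 0.007),
        "jbc": (0.003, 0.006, 0.007),
    },
}


def rmse_from_mean_sd(mean, sd, true_value=0.0):
    """RMSE implied by the mean and SD of an estimator.

    true_value defaults to 0.0, the assumed population RMSD
    under a correctly specified model.
    """
    bias = mean - true_value
    return math.sqrt(bias ** 2 + sd ** 2)


for n, row in TABLE1_ITEM1.items():
    for estimator, (mean, sd, reported) in row.items():
        recomputed = rmse_from_mean_sd(mean, sd)
        print(f"N={n:5d} {estimator:4s} recomputed={recomputed:.4f} reported={reported:.3f}")
```

For the rows used here, the recomputed values agree with the reported RMSE up to the rounding of the three-decimal inputs. With a true value of zero, the RMSE of the original estimator is dominated by its positive bias, whereas for the bias-corrected estimators the SD contributes the larger share.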

**Table 2.** Study 2: Population value of the original RMSD estimator in a test with $I=9$ items as a function of the item discrimination $a$ of misfitting items and the number of misfitting items $I_{\mathrm{misfit}}$.

Item | $a=0$, $I_{\mathrm{misfit}}=1$ | $a=0$, $I_{\mathrm{misfit}}=2$ | $a=0$, $I_{\mathrm{misfit}}=3$ | $a=0.2$, $I_{\mathrm{misfit}}=1$ | $a=0.2$, $I_{\mathrm{misfit}}=2$ | $a=0.2$, $I_{\mathrm{misfit}}=3$ | $a=0.4$, $I_{\mathrm{misfit}}=1$ | $a=0.4$, $I_{\mathrm{misfit}}=2$ | $a=0.4$, $I_{\mathrm{misfit}}=3$ | $a=0.6$, $I_{\mathrm{misfit}}=1$ | $a=0.6$, $I_{\mathrm{misfit}}=2$ | $a=0.6$, $I_{\mathrm{misfit}}=3$
---|---|---|---|---|---|---|---|---|---|---|---|---
1 | 0.011 | 0.018 | 0.036 | 0.008 | 0.014 | 0.033 | 0.006 | 0.011 | 0.026 | 0.004 | 0.007 | 0.018
2 | 0.079 | 0.057 | 0.036 | 0.061 | 0.047 | 0.033 | 0.043 | 0.035 | 0.027 | 0.027 | 0.023 | 0.018
3 | 0.009 | 0.057 | 0.036 | 0.007 | 0.046 | 0.033 | 0.005 | 0.033 | 0.025 | 0.003 | 0.021 | 0.016
4 | 0.011 | 0.018 | 0.019 | 0.008 | 0.014 | 0.017 | 0.006 | 0.011 | 0.014 | 0.004 | 0.007 | 0.009
5 | 0.012 | 0.019 | 0.021 | 0.009 | 0.015 | 0.019 | 0.006 | 0.011 | 0.015 | 0.004 | 0.007 | 0.010
6 | 0.009 | 0.014 | 0.016 | 0.007 | 0.011 | 0.015 | 0.005 | 0.008 | 0.012 | 0.003 | 0.005 | 0.008

**Table 3.** Study 2: Population value of the original RMSD estimator in a test with one misfitting item with an item discrimination of $a=0.2$ as a function of the number of items $I$.

Item | $I=6$ | $I=9$ | $I=12$ | $I=15$
---|---|---|---|---
1 | 0.008 | 0.008 | 0.008 | 0.007
2 | 0.037 | 0.061 | 0.078 | 0.090
3 | 0.007 | 0.007 | 0.007 | 0.006
4 | 0.008 | 0.008 | 0.008 | 0.007
5 | 0.008 | 0.009 | 0.009 | 0.008
6 | 0.007 | 0.007 | 0.007 | 0.006

**Table 4.** Study 2: Mean, standard deviation (SD), and root mean square error (RMSE) for different estimators of the RMSD statistic in a test with $I=9$ items for $I_{\mathrm{misfit}}=1$ or $I_{\mathrm{misfit}}=3$ misfitting items with an item discrimination of $a=0.2$ as a function of sample size $N$.

Item | $I_{\mathrm{misfit}}$ | N | Mean (orig) | Mean (abc) | Mean (bbc) | Mean (jbc) | SD (orig) | SD (abc) | SD (bbc) | SD (jbc) | RMSE (orig) | RMSE (abc) | RMSE (bbc) | RMSE (jbc)
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
2 | 1 | 125 | 0.077 | 0.063 | 0.055 | 0.055 | 0.033 | 0.040 | 0.042 | 0.042 | 0.037 | 0.040 | 0.042 | 0.042
2 | 1 | 250 | 0.071 | 0.064 | 0.060 | 0.060 | 0.026 | 0.029 | 0.031 | 0.031 | 0.028 | 0.030 | 0.031 | 0.031
2 | 1 | 500 | 0.070 | 0.066 | 0.064 | 0.064 | 0.019 | 0.020 | 0.021 | 0.021 | 0.021 | 0.021 | 0.021 | 0.021
2 | 1 | 1000 | 0.068 | 0.067 | 0.066 | 0.066 | 0.013 | 0.014 | 0.014 | 0.014 | 0.015 | 0.015 | 0.015 | 0.015
2 | 1 | 2000 | 0.068 | 0.067 | 0.067 | 0.067 | 0.009 | 0.009 | 0.010 | 0.010 | 0.012 | 0.011 | 0.011 | 0.011
2 | 3 | 125 | 0.066 | 0.050 | 0.043 | 0.043 | 0.030 | 0.038 | 0.038 | 0.038 | 0.045 | 0.041 | 0.040 | 0.040
2 | 3 | 250 | 0.061 | 0.052 | 0.048 | 0.048 | 0.024 | 0.029 | 0.030 | 0.030 | 0.037 | 0.034 | 0.033 | 0.033
2 | 3 | 500 | 0.057 | 0.053 | 0.051 | 0.050 | 0.018 | 0.020 | 0.021 | 0.021 | 0.030 | 0.028 | 0.027 | 0.027
2 | 3 | 1000 | 0.057 | 0.055 | 0.054 | 0.054 | 0.013 | 0.013 | 0.013 | 0.013 | 0.027 | 0.025 | 0.025 | 0.024
2 | 3 | 2000 | 0.056 | 0.055 | 0.055 | 0.055 | 0.009 | 0.009 | 0.009 | 0.009 | 0.024 | 0.024 | 0.023 | 0.023
5 | 1 | 125 | 0.044 | 0.020 | 0.014 | 0.013 | 0.020 | 0.026 | 0.023 | 0.023 | 0.040 | 0.028 | 0.024 | 0.024
5 | 1 | 250 | 0.032 | 0.017 | 0.012 | 0.012 | 0.015 | 0.020 | 0.018 | 0.018 | 0.028 | 0.021 | 0.019 | 0.018
5 | 1 | 500 | 0.023 | 0.012 | 0.009 | 0.009 | 0.011 | 0.014 | 0.014 | 0.013 | 0.018 | 0.015 | 0.014 | 0.013
5 | 1 | 1000 | 0.018 | 0.010 | 0.007 | 0.007 | 0.008 | 0.011 | 0.010 | 0.010 | 0.012 | 0.011 | 0.010 | 0.010
5 | 1 | 2000 | 0.013 | 0.008 | 0.006 | 0.006 | 0.006 | 0.008 | 0.008 | 0.008 | 0.008 | 0.008 | 0.009 | 0.009
5 | 3 | 125 | 0.049 | 0.028 | 0.022 | 0.022 | 0.023 | 0.030 | 0.029 | 0.029 | 0.038 | 0.031 | 0.029 | 0.029
5 | 3 | 250 | 0.037 | 0.023 | 0.018 | 0.018 | 0.018 | 0.023 | 0.023 | 0.023 | 0.025 | 0.023 | 0.023 | 0.023
5 | 3 | 500 | 0.032 | 0.023 | 0.019 | 0.019 | 0.014 | 0.018 | 0.018 | 0.018 | 0.019 | 0.018 | 0.018 | 0.018
5 | 3 | 1000 | 0.029 | 0.024 | 0.022 | 0.022 | 0.011 | 0.013 | 0.014 | 0.014 | 0.014 | 0.014 | 0.014 | 0.014
5 | 3 | 2000 | 0.028 | 0.025 | 0.024 | 0.024 | 0.008 | 0.009 | 0.009 | 0.009 | 0.011 | 0.011 | 0.010 | 0.010

**Table 5.** Study 3: Population value of the original RMSD estimator in a test with $I=9$ items as a function of the uniform differential item functioning effect $\delta$ of misfitting items and the number of misfitting items $I_{\mathrm{misfit}}$.

Item | $\delta=0.2$, $I_{\mathrm{misfit}}=1$ | $\delta=0.2$, $I_{\mathrm{misfit}}=2$ | $\delta=0.2$, $I_{\mathrm{misfit}}=3$ | $\delta=0.4$, $I_{\mathrm{misfit}}=1$ | $\delta=0.4$, $I_{\mathrm{misfit}}=2$ | $\delta=0.4$, $I_{\mathrm{misfit}}=3$ | $\delta=0.6$, $I_{\mathrm{misfit}}=1$ | $\delta=0.6$, $I_{\mathrm{misfit}}=2$ | $\delta=0.6$, $I_{\mathrm{misfit}}=3$ | $\delta=1.0$, $I_{\mathrm{misfit}}=1$ | $\delta=1.0$, $I_{\mathrm{misfit}}=2$ | $\delta=1.0$, $I_{\mathrm{misfit}}=3$
---|---|---|---|---|---|---|---|---|---|---|---|---
1 | 0.005 | 0.006 | 0.026 | 0.009 | 0.012 | 0.054 | 0.013 | 0.018 | 0.083 | 0.019 | 0.027 | 0.144
2 | 0.035 | 0.032 | 0.027 | 0.069 | 0.062 | 0.053 | 0.101 | 0.092 | 0.077 | 0.160 | 0.146 | 0.122
3 | 0.004 | 0.018 | 0.017 | 0.008 | 0.035 | 0.031 | 0.012 | 0.049 | 0.043 | 0.019 | 0.072 | 0.062
4 | 0.005 | 0.006 | 0.012 | 0.009 | 0.012 | 0.025 | 0.013 | 0.018 | 0.036 | 0.019 | 0.027 | 0.059
5 | 0.006 | 0.009 | 0.014 | 0.011 | 0.018 | 0.027 | 0.016 | 0.026 | 0.040 | 0.026 | 0.040 | 0.065
6 | 0.004 | 0.007 | 0.009 | 0.008 | 0.014 | 0.017 | 0.012 | 0.020 | 0.026 | 0.019 | 0.032 | 0.042

**Table 6.** Study 3: Mean, standard deviation (SD), and root mean square error (RMSE) for different estimators of the RMSD statistic in a test with $I=9$ items for $I_{\mathrm{misfit}}=1$ or $I_{\mathrm{misfit}}=3$ misfitting items with a uniform DIF effect of $\delta=0.6$ as a function of sample size $N$.

Item | $I_{\mathrm{misfit}}$ | N | Mean (orig) | Mean (abc) | Mean (bbc) | Mean (jbc) | SD (orig) | SD (abc) | SD (bbc) | SD (jbc) | RMSE (orig) | RMSE (abc) | RMSE (bbc) | RMSE (jbc)
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
2 | 1 | 125 | 0.105 | 0.096 | 0.090 | 0.089 | 0.034 | 0.039 | 0.042 | 0.042 | 0.035 | 0.039 | 0.043 | 0.043
2 | 1 | 250 | 0.103 | 0.100 | 0.097 | 0.097 | 0.025 | 0.027 | 0.028 | 0.028 | 0.025 | 0.027 | 0.028 | 0.028
2 | 1 | 500 | 0.102 | 0.100 | 0.099 | 0.099 | 0.018 | 0.019 | 0.019 | 0.019 | 0.018 | 0.019 | 0.019 | 0.019
2 | 1 | 1000 | 0.101 | 0.100 | 0.099 | 0.099 | 0.013 | 0.013 | 0.013 | 0.013 | 0.013 | 0.013 | 0.014 | 0.014
2 | 1 | 2000 | 0.102 | 0.101 | 0.101 | 0.101 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009
2 | 3 | 125 | 0.083 | 0.072 | 0.064 | 0.063 | 0.033 | 0.039 | 0.042 | 0.041 | 0.033 | 0.040 | 0.044 | 0.044
2 | 3 | 250 | 0.081 | 0.076 | 0.071 | 0.071 | 0.025 | 0.027 | 0.029 | 0.029 | 0.025 | 0.027 | 0.030 | 0.030
2 | 3 | 500 | 0.078 | 0.076 | 0.074 | 0.074 | 0.018 | 0.019 | 0.020 | 0.020 | 0.018 | 0.019 | 0.020 | 0.020
2 | 3 | 1000 | 0.077 | 0.076 | 0.075 | 0.075 | 0.013 | 0.014 | 0.014 | 0.014 | 0.013 | 0.014 | 0.014 | 0.014
2 | 3 | 2000 | 0.078 | 0.077 | 0.077 | 0.077 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009
5 | 1 | 125 | 0.046 | 0.024 | 0.017 | 0.017 | 0.022 | 0.028 | 0.027 | 0.027 | 0.037 | 0.029 | 0.027 | 0.027
5 | 1 | 250 | 0.034 | 0.018 | 0.014 | 0.013 | 0.017 | 0.021 | 0.020 | 0.020 | 0.024 | 0.021 | 0.021 | 0.021
5 | 1 | 500 | 0.026 | 0.016 | 0.012 | 0.012 | 0.013 | 0.016 | 0.016 | 0.016 | 0.016 | 0.016 | 0.017 | 0.017
5 | 1 | 1000 | 0.021 | 0.014 | 0.011 | 0.011 | 0.010 | 0.012 | 0.012 | 0.012 | 0.011 | 0.012 | 0.013 | 0.013
5 | 1 | 2000 | 0.019 | 0.015 | 0.013 | 0.013 | 0.008 | 0.010 | 0.010 | 0.010 | 0.008 | 0.010 | 0.011 | 0.011
5 | 3 | 125 | 0.057 | 0.037 | 0.030 | 0.029 | 0.027 | 0.034 | 0.035 | 0.035 | 0.032 | 0.035 | 0.036 | 0.036
5 | 3 | 250 | 0.048 | 0.036 | 0.031 | 0.031 | 0.022 | 0.027 | 0.028 | 0.028 | 0.023 | 0.027 | 0.030 | 0.029
5 | 3 | 500 | 0.044 | 0.038 | 0.034 | 0.034 | 0.018 | 0.021 | 0.022 | 0.022 | 0.018 | 0.021 | 0.023 | 0.023
5 | 3 | 1000 | 0.041 | 0.038 | 0.036 | 0.036 | 0.012 | 0.014 | 0.015 | 0.015 | 0.013 | 0.014 | 0.015 | 0.015
5 | 3 | 2000 | 0.041 | 0.040 | 0.039 | 0.039 | 0.009 | 0.009 | 0.010 | 0.010 | 0.009 | 0.009 | 0.010 | 0.010

**Table 7.** Study 4: Population value of the original RMSD estimator in a test with two misfitting items with uniform DIF effects of $\left|\delta \right|=0.6$ for balanced and unbalanced DIF as a function of the number of items $I$.

Item | Balanced DIF, $I=6$ | Balanced DIF, $I=9$ | Balanced DIF, $I=12$ | Balanced DIF, $I=15$ | Unbalanced DIF, $I=6$ | Unbalanced DIF, $I=9$ | Unbalanced DIF, $I=12$ | Unbalanced DIF, $I=15$
---|---|---|---|---|---|---|---|---
1 | 0.009 | 0.006 | 0.004 | 0.003 | 0.026 | 0.018 | 0.013 | 0.011
2 | 0.112 | 0.114 | 0.115 | 0.115 | 0.079 | 0.092 | 0.098 | 0.102
3 | 0.092 | 0.092 | 0.092 | 0.091 | 0.039 | 0.049 | 0.054 | 0.057
4 | 0.009 | 0.006 | 0.004 | 0.003 | 0.026 | 0.018 | 0.013 | 0.011
5 | 0.006 | 0.004 | 0.003 | 0.003 | 0.039 | 0.026 | 0.019 | 0.015
6 | 0.002 | 0.002 | 0.001 | 0.001 | 0.030 | 0.020 | 0.015 | 0.012


© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Robitzsch, A. Statistical Properties of Estimators of the RMSD Item Fit Statistic. *Foundations* **2022**, *2*, 488-503. https://doi.org/10.3390/foundations2020032