Computational Aspects and Software in Psychometrics II

A special issue of Psych (ISSN 2624-8611). This special issue belongs to the section "Psychometrics and Educational Measurement".

Deadline for manuscript submissions: closed (31 March 2023) | Viewed by 63842

Special Issue Editor


Dr. Alexander Robitzsch
Guest Editor
IPN – Leibniz Institute for Science and Mathematics Education, University of Kiel, Olshausenstraße 62, 24118 Kiel, Germany
Interests: item response models; linking; methodology in large-scale assessments; multilevel models; missing data; cognitive diagnostic models; Bayesian methods and regularization

Special Issue Information

Dear Colleagues,

Statistical software in psychometrics has made tremendous progress in providing open-source solutions. The focus of this Special Issue is on computational aspects and statistical algorithms for psychometric methods. Software articles introducing new packages and tutorials are particularly welcome, as are simulation studies and articles on implementation aspects of psychometric models. We also invite researchers to submit software reviews that evaluate a single package or several packages, or that compare packages empirically. Potential psychometric models include (but are not limited to) item response models, structural equation models, multilevel models, latent class models, cognitive diagnostic models, and mixture models.

Dr. Alexander Robitzsch
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Psych is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1200 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • statistical software
  • estimation algorithms
  • software tutorials
  • software reviews
  • item response models
  • multilevel models
  • structural equation models
  • latent class models
  • cognitive diagnostic models
  • mixture models
  • open source software

Published Papers (24 papers)


Editorial


5 pages, 197 KiB  
Editorial
Editorial for the Special Issue “Computational Aspects and Software in Psychometrics II”
by Alexander Robitzsch
Psych 2023, 5(3), 996-1000; https://doi.org/10.3390/psych5030065 - 12 Sep 2023
Viewed by 638
Abstract
There has been tremendous progress in statistical software in the field of psychometrics in providing open-source solutions [...] Full article

Research


17 pages, 1697 KiB  
Article
A SAS Macro for Automated Stopping of Markov Chain Monte Carlo Estimation in Bayesian Modeling with PROC MCMC
by Wolfgang Wagner, Martin Hecht and Steffen Zitzmann
Psych 2023, 5(3), 966-982; https://doi.org/10.3390/psych5030063 - 05 Sep 2023
Cited by 2 | Viewed by 733
Abstract
A crucial challenge in Bayesian modeling using Markov chain Monte Carlo (MCMC) estimation is to diagnose the convergence of the chains so that the draws can be expected to closely approximate the posterior distribution on which inference is based. A close approximation guarantees that the MCMC error exhibits only a negligible impact on model estimates and inferences. However, determining whether convergence has been achieved can often be challenging and cumbersome when relying solely on inspecting the trace plots of the chain(s) or manually checking the stopping criteria. In this article, we present a SAS macro called %automcmc that is based on PROC MCMC and that automatically continues to add draws until a user-specified stopping criterion (i.e., a certain potential scale reduction and/or a certain effective sample size) is reached for the chain(s). Full article

18 pages, 750 KiB  
Article
RMX/PIccc: An Extended Person–Item Map and a Unified IRT Output for eRm, psychotools, ltm, mirt, and TAM
by Milica Kabic and Rainer W. Alexandrowicz
Psych 2023, 5(3), 948-965; https://doi.org/10.3390/psych5030062 - 05 Sep 2023
Cited by 2 | Viewed by 1388
Abstract
A constituting feature of item response models is that item and person parameters share a latent scale and are therefore comparable. The Person–Item Map is a useful graphical tool to visualize the alignment of the two parameter sets. However, the “classical” variant has some shortcomings, which are overcome by the new RMX package (Rasch models—eXtended). The package provides the RMX::plotPIccc() function, which creates an extended version of the classical PI Map, termed “PIccc”. It juxtaposes the person parameter distribution to various item-related functions, like category and item characteristic curves and category, item, and test information curves. The function supports many item response models and processes the return objects of five major R packages for IRT analysis. It returns the used parameters in a unified form, thus allowing for their further processing. The R package RMX is freely available at osf.io/n9c5r. Full article
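As a minimal sketch of the intended workflow (the arguments of plotPIccc() beyond the fitted model object are assumptions here; see the package documentation at osf.io/n9c5r):

library(eRm)
library(RMX)   # available from osf.io/n9c5r

# Fit a Rasch model to the binary example data shipped with eRm ...
data(raschdat1)
fit <- RM(raschdat1)

# ... and hand the fitted object to RMX for the extended person-item map.
# plotPIccc() also accepts return objects from psychotools, ltm, mirt, and TAM.
res <- plotPIccc(fit)

# The used parameters are returned in a unified form for further processing.
str(res)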

20 pages, 357 KiB  
Article
Parameter Estimation of KST-IRT Model under Local Dependence
by Sangbeak Ye, Augustin Kelava and Stefano Noventa
Psych 2023, 5(3), 908-927; https://doi.org/10.3390/psych5030060 - 22 Aug 2023
Cited by 2 | Viewed by 761
Abstract
A mantra often repeated in the introductory material to psychometrics and Item Response Theory (IRT) is that a Rasch model is a probabilistic version of a Guttman scale. The idea comes from the observation that a sigmoidal item response function provides a probabilistic version of the characteristic function that models an item response in the Guttman scale. It appears, however, more difficult to reconcile the assumption of local independence, which traditionally accompanies the Rasch model, with the item dependence existing in a Guttman scale. In recent work, an alternative probabilistic version of a Guttman scale was proposed, combining Knowledge Space Theory (KST) with IRT modeling, here referred to as KST-IRT. The present work has, therefore, a two-fold aim. Firstly, the estimation of the parameters involved in KST-IRT models is discussed. In more detail, two estimation methods based on the Expectation Maximization (EM) procedure are suggested, i.e., Marginal Maximum Likelihood (MML) and Gibbs sampling, and are compared on the basis of simulation studies. Secondly, for a Guttman scale, the estimates of the KST-IRT models are compared with those of the traditional combination of the Rasch model plus local independence under the interchange of the data generation processes. Results show that the KST-IRT approach might be more effective in capturing local dependence, as it appears to be more robust under misspecification of the data generation process, but it comes at the price of an increased number of parameters. Full article
20 pages, 950 KiB  
Article
Expanding NAEP and TIMSS Analysis to Include Additional Variables or a New Scoring Model Using the R Package Dire
by Paul Dean Bailey and Blue Webb
Psych 2023, 5(3), 876-895; https://doi.org/10.3390/psych5030058 - 17 Aug 2023
Cited by 1 | Viewed by 861
Abstract
The R packages Dire and EdSurvey allow analysts to make a conditioning model with new variables and then draw new plausible values. This is important because results for a variable not in the conditioning model are biased. For regression-type analyses, users can also use direct estimation to estimate parameters without generating new plausible values. Dire is distinct from other available software in R in that it requires fixed item parameters and simplifies calculation of high-dimensional integrals necessary to calculate composite or subscales. When used with EdSurvey, it is very easy to use published item parameters to estimate a new conditioning model. We show the theory behind the methods in Dire and a coding example where we perform an analysis that includes simple process data variables. Because the process data is not used in the conditioning model, the estimator is biased if a new conditioning model is not added with Dire. Full article

19 pages, 2671 KiB  
Article
Evaluating Model Fit in Two-Level Mokken Scale Analysis
by Letty Koopman, Bonne J. H. Zijlstra and L. Andries Van der Ark
Psych 2023, 5(3), 847-865; https://doi.org/10.3390/psych5030056 - 07 Aug 2023
Cited by 2 | Viewed by 899
Abstract
Currently, two-level Mokken scale analysis for clustered test data is being developed. This paper contributes to this development by providing model-fit procedures for two-level Mokken scale analysis. New theoretical insights suggested that the existing model-fit procedure from traditional (one-level) Mokken scale analyses can be used for investigating model fit at both level 1 (respondent level) and level 2 (cluster level) of two-level Mokken scale analysis. However, the traditional model-fit procedure requires some modifications before it can be used at level 2. In this paper, we made these modifications and investigated the resulting model-fit procedure. For two model assumptions, monotonicity and invariant item ordering, we investigated the false-positive count and the sensitivity count of the level 2 model-fit procedure, with respect to the number of model violations detected, and the number of detected model violations deemed statistically significant. For monotonicity, the detection of model violations was satisfactory, but the significance test lacked power. For invariant item ordering, both aspects were satisfactory. Full article
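The level-2 procedures build on the familiar one-level checks in the mokken package; as a point of reference, the one-level versions look like this (a sketch; the level-2 modifications themselves are introduced in the paper):

library(mokken)

# One-level Mokken checks of monotonicity and invariant item ordering;
# the paper adapts these procedures for level 2 (the cluster level).
data(acl)
Communality <- acl[, 1:10]

mono <- check.monotonicity(Communality)
iio  <- check.iio(Communality)
summary(mono)
summary(iio)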

17 pages, 431 KiB  
Article
An Introduction to Bayesian Knowledge Tracing with pyBKT
by Okan Bulut, Jinnie Shin, Seyma N. Yildirim-Erbasli, Guher Gorgun and Zachary A. Pardos
Psych 2023, 5(3), 770-786; https://doi.org/10.3390/psych5030050 - 23 Jul 2023
Cited by 3 | Viewed by 2276
Abstract
This study aims to introduce Bayesian Knowledge Tracing (BKT), a probabilistic model used in educational data mining to estimate learners’ knowledge states over time. It also provides a practical guide to estimating BKT models using the pyBKT library available in Python. The first section presents an overview of BKT by explaining its theoretical foundations and advantages in modeling individual learning processes. In the second section, we describe different variants of the standard BKT model based on item response theory (IRT). Next, we demonstrate the estimation of BKT with the pyBKT library in Python, outlining data pre-processing steps, parameter estimation, and model evaluation. Different cases of knowledge tracing tasks illustrate how BKT estimates learners’ knowledge states and evaluates prediction accuracy. The results highlight the utility of BKT in capturing learners’ knowledge states dynamically. We also show that the model parameters of BKT resemble the parameters from logistic IRT models. Full article

13 pages, 271 KiB  
Article
Accurate Standard Errors in Multilevel Modeling with Heteroscedasticity: A Computationally More Efficient Jackknife Technique
by Steffen Zitzmann, Sebastian Weirich and Martin Hecht
Psych 2023, 5(3), 757-769; https://doi.org/10.3390/psych5030049 - 21 Jul 2023
Cited by 1 | Viewed by 967
Abstract
In random-effects models, hierarchical linear models, or multilevel models, it is typically assumed that the variances within higher-level units are homoscedastic, meaning that they are equal across these units. However, this assumption is often violated in research. Depending on the degree of violation, this can lead to biased standard errors of higher-level parameters and thus to incorrect inferences. In this article, we describe a resampling technique for obtaining standard errors—Zitzmann’s jackknife. We conducted a Monte Carlo simulation study to compare the technique with the commonly used delete-1 jackknife, the robust standard error in Mplus, and a modified version of the commonly used delete-1 jackknife. Findings revealed that the resampling techniques clearly outperformed the robust standard error in rather small samples with high levels of heteroscedasticity. Moreover, Zitzmann’s jackknife tended to perform somewhat better than the two versions of the delete-1 jackknife and was much faster. Full article
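For readers unfamiliar with cluster-level jackknifing, the following base-R sketch illustrates the generic delete-1 idea at the cluster level (it is an illustration of the principle, not Zitzmann's jackknife as proposed in the paper):

# Generic delete-1 jackknife over clusters for the standard error of a
# level-2 quantity (here: the variance of the cluster means).
cluster_jackknife <- function(y, cluster,
                              stat = function(d) var(tapply(d$y, d$cluster, mean))) {
  dat <- data.frame(y = y, cluster = cluster)
  ids <- unique(dat$cluster)
  J   <- length(ids)
  full <- stat(dat)
  # Recompute the statistic J times, each time deleting one whole cluster.
  loo <- sapply(ids, function(j) stat(dat[dat$cluster != j, ]))
  se  <- sqrt((J - 1) / J * sum((loo - mean(loo))^2))
  list(estimate = full, jackknife_se = se)
}

set.seed(1)
dat <- data.frame(cluster = rep(1:30, each = 10))
dat$y <- rnorm(30)[dat$cluster] + rnorm(nrow(dat))
cluster_jackknife(dat$y, dat$cluster)
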
27 pages, 540 KiB  
Article
Approximate Invariance Testing in Diagnostic Classification Models in the Presence of Attribute Hierarchies: A Bayesian Network Approach
by Alfonso J. Martinez and Jonathan Templin
Psych 2023, 5(3), 688-714; https://doi.org/10.3390/psych5030045 - 13 Jul 2023
Cited by 1 | Viewed by 924
Abstract
This paper demonstrates the process of invariance testing in diagnostic classification models in the presence of attribute hierarchies via an extension of the log-linear cognitive diagnosis model (LCDM). This extension allows researchers to test for measurement (item) invariance as well as attribute (structural) invariance simultaneously in a single analysis. The structural model of the LCDM was parameterized as a Bayesian network, which allows attribute hierarchies to be modeled and tested for attribute invariance via a series of latent regression models. We illustrate the steps for carrying out the invariance analyses through an in-depth case study with an empirical dataset and provide JAGS code for carrying out the analysis within the Bayesian framework. The analysis revealed that a subset of the items exhibit partial invariance, and evidence of full invariance was found at the structural level. Full article

17 pages, 299 KiB  
Article
Detecting Differential Item Functioning in 2PL Multistage Assessments
by Rudolf Debelak, Sebastian Appelbaum, Dries Debeer and Martin J. Tomasik
Psych 2023, 5(2), 461-477; https://doi.org/10.3390/psych5020031 - 31 May 2023
Cited by 1 | Viewed by 961
Abstract
The detection of differential item functioning is crucial for the psychometric evaluation of multistage tests. This paper discusses five approaches presented in the literature: logistic regression, SIBTEST, analytical score-based tests, bootstrap score-based tests, and permutation score-based tests. First, using a simulation study inspired by a real-life large-scale educational assessment, we compare the five approaches with respect to their type I error rate and their statistical power. Then, we present an application to an empirical data set. We find that all approaches show type I error rates close to the nominal alpha level. Furthermore, all approaches are shown to be sensitive to uniform and non-uniform DIF effects, with the score-based tests showing the highest power. Full article
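As an illustration of the first of these approaches (a generic logistic-regression DIF screen in base R, not the paper's implementation), a single item can be tested as follows:

# The item response is regressed on the rest score, group membership, and
# their interaction: a group effect indicates uniform DIF, the interaction
# indicates non-uniform DIF.
dif_logistic <- function(resp, item, group) {
  rest <- rowSums(resp[, -item])    # rest score as the ability proxy
  y    <- resp[, item]
  m0 <- glm(y ~ rest, family = binomial)
  m1 <- glm(y ~ rest + group, family = binomial)    # + uniform DIF
  m2 <- glm(y ~ rest * group, family = binomial)    # + non-uniform DIF
  anova(m0, m1, m2, test = "LRT")
}

set.seed(2)
resp  <- matrix(rbinom(500 * 10, 1, 0.6), nrow = 500)
group <- factor(rep(c("reference", "focal"), each = 250))
dif_logistic(resp, item = 1, group = group)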

11 pages, 2210 KiB  
Article
SAS PROC IRT and the R mirt Package: A Comparison of Model Parameter Estimation for Multidimensional IRT Models
by Ki Cole and Insu Paek
Psych 2023, 5(2), 416-426; https://doi.org/10.3390/psych5020028 - 15 May 2023
Cited by 1 | Viewed by 1244
Abstract
This study investigates the performance of estimation methods for multidimensional IRT models with dichotomous and polytomous data in two well-known IRT programs: SAS PROC IRT and the mirt package in R. A simulation study was used to compare performance on a simple structure Rasch model, complex structure 2PL model, and bifactor graded response model. Under RMSE and bias criteria regarding item parameter recovery, PROC IRT and mirt showed nearly identical performance in the simple structure condition. When a complex structure was used, mirt performed better in terms of the recovery of intercept parameters, while the recovery of slope parameters depended on the program and the sample sizes: PROC IRT tended to be better with small samples (N=500) according to RMSE, and mirt was better for larger samples (N=1000 and 2500) according to RMSE and bias for the slope parameter recovery. When a bifactor structure was used, mirt was preferred in all cases; differences lessened as sample size increased. Full article
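On the mirt side, the three model types of the simulation map onto calls of roughly the following kind (a sketch using the small LSAT7 example shipped with mirt rather than the study's simulated designs; the corresponding SAS PROC IRT syntax is not shown):

library(mirt)

# Dichotomous example data shipped with mirt (5 items).
dat <- expand.table(LSAT7)

# Simple-structure Rasch model.
fit_rasch <- mirt(dat, model = 1, itemtype = "Rasch")

# 2PL model (the study's complex-structure condition uses two correlated
# factors with cross-loadings; with these 5 items a single factor is fit).
fit_2pl <- mirt(dat, model = 1, itemtype = "2PL")

# Bifactor model: a general factor plus two specific factors.
fit_bifac <- bfactor(dat, model = c(1, 1, 1, 2, 2))

coef(fit_2pl, simplify = TRUE)   # slope and intercept estimates for comparison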

20 pages, 5180 KiB  
Article
Bayesian Estimation of Latent Space Item Response Models with JAGS, Stan, and NIMBLE in R
by Jinwen Luo, Ludovica De Carolis, Biao Zeng and Minjeong Jeon
Psych 2023, 5(2), 396-415; https://doi.org/10.3390/psych5020027 - 11 May 2023
Cited by 3 | Viewed by 2054
Abstract
The latent space item response model (LSIRM) is a newly-developed approach to analyzing and visualizing conditional dependencies in item response data, manifested as the interactions between respondents and items, between respondents, and between items. This paper provides a practical guide to the Bayesian estimation of LSIRM using three open-source software options, JAGS, Stan, and NIMBLE in R. By means of an empirical example, we illustrate LSIRM estimation, providing details on the model specification and implementation, convergence diagnostics, model fit evaluations and interaction map visualizations. Full article
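To convey what the Bayesian specification involves, here is a compact JAGS sketch of the basic LSIRM (a two-dimensional latent space with weakly informative priors, ignoring the identification and post-processing steps covered in the paper), run from R via rjags:

library(rjags)

lsirm_model <- "
model {
  for (p in 1:P) {
    theta[p] ~ dnorm(0, 1)      # respondent intercepts
    z[p, 1]  ~ dnorm(0, 1)      # respondent positions in the latent space
    z[p, 2]  ~ dnorm(0, 1)
  }
  for (i in 1:I) {
    beta[i] ~ dnorm(0, 0.25)    # item intercepts
    w[i, 1] ~ dnorm(0, 1)       # item positions in the latent space
    w[i, 2] ~ dnorm(0, 1)
  }
  gamma ~ dlnorm(0, 1)          # positive distance weight
  for (p in 1:P) {
    for (i in 1:I) {
      d[p, i] <- sqrt(pow(z[p, 1] - w[i, 1], 2) + pow(z[p, 2] - w[i, 2], 2))
      logit(prob[p, i]) <- theta[p] + beta[i] - gamma * d[p, i]
      Y[p, i] ~ dbern(prob[p, i])
    }
  }
}"

set.seed(3)
Y <- matrix(rbinom(100 * 10, 1, 0.5), 100, 10)   # toy response matrix
jm <- jags.model(textConnection(lsirm_model),
                 data = list(Y = Y, P = nrow(Y), I = ncol(Y)), n.chains = 2)
draws <- coda.samples(jm, variable.names = c("gamma", "beta"), n.iter = 2000)
summary(draws)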

10 pages, 390 KiB  
Article
bmtest: A Jamovi Module for Brunner–Munzel’s Test—A Robust Alternative to Wilcoxon–Mann–Whitney’s Test
by Julian D. Karch
Psych 2023, 5(2), 386-395; https://doi.org/10.3390/psych5020026 - 10 May 2023
Cited by 5 | Viewed by 3139
Abstract
In psychological research, comparisons between two groups are frequently made to demonstrate that one group exhibits higher values. Although Welch’s unequal variances t-test has become the preferred parametric test for this purpose, surpassing Student’s equal variances t-test, the Wilcoxon–Mann–Whitney test remains the predominant nonparametric approach despite sharing similar limitations with Student’s t-test. Specifically, the Wilcoxon–Mann–Whitney test is associated with strong, unrealistic assumptions and lacks robustness when these assumptions are violated. The Brunner–Munzel test overcomes these limitations, featuring fewer assumptions, akin to Welch’s t-test in the parametric domain, and has thus been recommended over the Wilcoxon–Mann–Whitney test. However, the Brunner–Munzel test is currently unavailable in user-friendly statistical software, such as SPSS, making it inaccessible to many researchers. In this paper, I introduce the bmtest module for jamovi, a freely available user-friendly software. By making the Brunner–Munzel test accessible to a wide range of researchers, the bmtest module has the potential to improve nonparametric statistical analysis in psychology and other disciplines. Full article
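For readers who work in R rather than jamovi, the same test is already available in existing packages, for example in lawstat (this is not the bmtest module itself):

# Brunner-Munzel test of the hypothesis that neither group tends to yield
# larger values than the other; unequal variances and shapes are allowed.
library(lawstat)

set.seed(4)
x <- rnorm(30, mean = 0,   sd = 1)   # group 1
y <- rnorm(40, mean = 0.5, sd = 2)   # group 2

brunner.munzel.test(x, y)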

10 pages, 1754 KiB  
Article
Applications and Extensions of Metric Stability Analysis
by Leah Feuerstahler
Psych 2023, 5(2), 376-385; https://doi.org/10.3390/psych5020025 - 04 May 2023
Cited by 1 | Viewed by 1076
Abstract
Item response theory models and applications are affected by many sources of variability, including errors associated with item parameter estimation. Metric stability analysis (MSA) is one method to evaluate the effects of item parameter standard errors that quantifies how well a model determines the latent trait metric. This paper describes how to evaluate MSA in dichotomous and polytomous data and describes a Bayesian implementation of MSA that does not require a positive definite variance–covariance matrix among item parameters. MSA analyses are illustrated in the context of an oral-health-related quality of life measure administered before and after prosthodontic treatment. The R code to implement the methods described in this paper is provided. Full article

26 pages, 511 KiB  
Article
dexter: An R Package to Manage and Analyze Test Data
by Ivailo Partchev, Jesse Koops, Timo Bechger, Remco Feskens and Gunter Maris
Psych 2023, 5(2), 350-375; https://doi.org/10.3390/psych5020024 - 28 Apr 2023
Cited by 1 | Viewed by 1967
Abstract
In this study, we present a package for R that is intended as a professional tool for the management and analysis of data from educational tests and useful both in high-stakes assessment programs and in survey research. Focused on psychometric models based on the sum score as the scoring rule and having sufficient statistics for their parameters, dexter fully exploits the many theoretical and practical advantages of this choice: lack of unnecessary assumptions, stable and fast estimation, and powerful and sensible diagnostic techniques. It includes an easy-to-use data management system tailored to the structure of test data and compatible with the current paradigm of tidy data. Companion packages currently include a graphical user interface and support for multi-stage testing. Full article
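A minimal sketch of a typical dexter session (the column names of the scoring rules and the exact function arguments are given from memory and should be checked against the package documentation):

library(dexter)

# Scoring rules: every admissible item/response combination and its score.
rules <- data.frame(item_id    = rep(paste0("item", 1:3), each = 2),
                    response   = rep(0:1, times = 3),
                    item_score = rep(0:1, times = 3))

db <- start_new_project(rules, "test_data.db")   # create the project database

set.seed(5)
resp <- data.frame(person_id = paste0("p", 1:50),
                   item1 = rbinom(50, 1, 0.7),
                   item2 = rbinom(50, 1, 0.5),
                   item3 = rbinom(50, 1, 0.3))
add_booklet(db, resp, booklet_id = "booklet1")   # store one test form

tia_tables(db)               # classical item and booklet statistics
parms <- fit_enorm(db)       # extended nominal response model (CML by default)
abil  <- ability(db, parms)  # person ability estimates
head(abil)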

14 pages, 1271 KiB  
Article
Exploring Approaches for Estimating Parameters in Cognitive Diagnosis Models with Small Sample Sizes
by Miguel A. Sorrel, Scarlett Escudero, Pablo Nájera, Rodrigo S. Kreitchmann and Ramsés Vázquez-Lira
Psych 2023, 5(2), 336-349; https://doi.org/10.3390/psych5020023 - 27 Apr 2023
Cited by 1 | Viewed by 1616
Abstract
Cognitive diagnostic models (CDMs) are increasingly being used in various assessment contexts to identify cognitive processes and provide tailored feedback. However, the most commonly used estimation method for CDMs, marginal maximum likelihood estimation with Expectation–Maximization (MMLE-EM), can present difficulties when sample sizes are small. This study compares the results of different estimation methods for CDMs under varying sample sizes using simulated and empirical data. The methods compared include MMLE-EM, Bayes modal, Markov chain Monte Carlo, a non-parametric method, and a parsimonious parametric model such as Restricted DINA. We varied the sample size, and assessed the bias in the estimation of item parameters, the precision in attribute classification, the bias in the reliability estimate, and computational cost. The findings suggest that alternative estimation methods are preferred over MMLE-EM under low sample-size conditions, whereas comparable results are obtained under large sample-size conditions. Practitioners should consider using alternative estimation methods when working with small samples to obtain more accurate estimates of CDM parameters. This study aims to maximize the potential of CDMs by providing guidance on the estimation of the parameters. Full article
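For orientation, the default MMLE-EM fit of the DINA model in the GDINA package looks as follows (a sketch on hand-simulated data; the alternative small-sample estimators compared in the paper live in other packages or behind additional arguments):

library(GDINA)

# Simulate DINA data by hand: 200 examinees, 10 items, 3 attributes.
set.seed(6)
N <- 200; J <- 10; K <- 3
Q <- matrix(rbinom(J * K, 1, 0.5), J, K)
Q[rowSums(Q) == 0, 1] <- 1                    # every item measures something
alpha <- matrix(rbinom(N * K, 1, 0.5), N, K)  # attribute profiles
eta <- (alpha %*% t(Q)) == matrix(rowSums(Q), N, J, byrow = TRUE)
p   <- ifelse(eta, 0.8, 0.2)                  # 1 - slip vs. guessing
dat <- matrix(rbinom(N * J, 1, p), N, J)

# Default estimation: marginal maximum likelihood with the EM algorithm.
fit <- GDINA(dat = dat, Q = Q, model = "DINA")
coef(fit)               # item parameter estimates
head(personparm(fit))   # attribute profile classifications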

29 pages, 4935 KiB  
Article
COPS in Action: Exploring Structure in the Usage of the Youth Psychotherapy MATCH
by Thomas Rusch, Katherine Venturo-Conerly, Gioia Baja and Patrick Mair
Psych 2023, 5(2), 274-302; https://doi.org/10.3390/psych5020020 - 19 Apr 2023
Cited by 1 | Viewed by 1049
Abstract
This article is an introduction to Cluster Optimized Proximity Scaling (COPS) aimed at practitioners, as well as a tutorial on the usage of the corresponding R package cops. COPS is a variant of multidimensional scaling (MDS) that aims at providing a clustered configuration while still representing multivariate dissimilarities faithfully. It subsumes most popular MDS versions as special cases. We illustrate the ideas, use, flexibility and versatility of the method and the package with data from clinical psychology on how modules of the Modular Approach to Therapy for Children (MATCH) are used by clinicians in the wild. We supplement the COPS analyses with density-based hierarchical clustering in the original space and faceting with support vector machines. We find that scaling with COPS gives a sensible and insightful spatial arrangement of the modules, allows easy identification of clusters of modules and provides clear facets of modules corresponding to the MATCH protocols. In that respect COPS works better than both standard MDS and clustering. Full article

25 pages, 689 KiB  
Article
Using Structural Equation Modeling to Reproduce and Extend ANOVA-Based Generalizability Theory Analyses for Psychological Assessments
by Walter P. Vispoel, Hyeryung Lee, Tingting Chen and Hyeri Hong
Psych 2023, 5(2), 249-273; https://doi.org/10.3390/psych5020019 - 13 Apr 2023
Cited by 8 | Viewed by 1475
Abstract
Generalizability theory provides a comprehensive framework for determining how multiple sources of measurement error affect scores from psychological assessments and using that information to improve those assessments. Although generalizability theory designs have traditionally been analyzed using analyses of variance (ANOVA) procedures, the same analyses can be replicated and extended using structural equation models. We collected multi-occasion data from inventories measuring numerous dimensions of personality, self-concept, and socially desirable responding to compare variance components, generalizability coefficients, dependability coefficients, and proportions of universe score and measurement error variance using structural equation modeling versus ANOVA techniques. We further applied structural equation modeling techniques to continuous latent response variable metrics and derived Monte Carlo-based confidence intervals for those indices on both observed score and continuous latent response variable metrics. Results for observed scores estimated using structural equation modeling and ANOVA procedures seldom varied. Differences in reliability between raw score and continuous latent response variable metrics were much greater for scales with dichotomous responses, thereby highlighting the value of doing analyses on both metrics to evaluate gains that might be achieved by increasing response options. We provide detailed guidelines for applying the demonstrated techniques using structural equation modeling and ANOVA-based statistical software. Full article
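As a toy illustration of the core idea (a single-facet persons x occasions design in lavaan with equal loadings and equal error variances; the paper's designs are richer and are also run in ANOVA-based software):

library(lavaan)

# Simulated scores for the same scale administered on three occasions.
set.seed(7)
N <- 300
true_score <- rnorm(N)
dat <- data.frame(occ1 = true_score + rnorm(N, 0, 0.8),
                  occ2 = true_score + rnorm(N, 0, 0.8),
                  occ3 = true_score + rnorm(N, 0, 0.8))

# Persons x occasions G-study analogue: unit loadings, equal error variances.
model <- '
  T =~ 1*occ1 + 1*occ2 + 1*occ3
  occ1 ~~ e*occ1
  occ2 ~~ e*occ2
  occ3 ~~ e*occ3
  T ~~ p*T
  # Generalizability coefficients for one occasion and for the mean of three.
  g1 := p / (p + e)
  g3 := p / (p + e / 3)
'
fit <- sem(model, data = dat)
est <- parameterEstimates(fit)
est[est$op == ":=", ]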

15 pages, 317 KiB  
Article
Effect Sizes for Estimating Differential Item Functioning Influence at the Test Level
by W. Holmes Finch and Brian F. French
Psych 2023, 5(1), 133-147; https://doi.org/10.3390/psych5010013 - 15 Feb 2023
Cited by 2 | Viewed by 1300
Abstract
Differential item functioning (DIF) is a critical step in providing evidence to support a scoring inference in building a validity argument for a psychological or educational assessment. Effect sizes can assist in understanding the accumulation of DIF at the test score level. The current simulation study investigated the performance of several proposed effect size measures under a variety of conditions. Conditions under study included varied sample sizes, DIF effect sizes, the proportion of items with DIF, and the type of DIF (additive vs. non-additive). DIF effect sizes under study included sDTF%, uDTF%, τ̂w², d, R̄Δ², IDIF²*, and SDIFV. The results of this study suggest that across study conditions, τ̂w², IDIF²*, and d were consistently the most accurate measures of the DIF effects. The effect sizes were also estimated in an empirical example. Recommendations and implications for practice are discussed. Full article
12 pages, 321 KiB  
Article
A Cautionary Note Regarding Multilevel Factor Score Estimates from Lavaan
by Steffen Zitzmann
Psych 2023, 5(1), 38-49; https://doi.org/10.3390/psych5010004 - 09 Jan 2023
Cited by 3 | Viewed by 1985
Abstract
To compute factor score estimates, lavaan version 0.6-12 offers the function lavPredict(), which can be applied not only in single-level modeling but also in multilevel modeling, where characteristics of higher-level units such as working environments or team leaders are often assessed by ratings of employees. Surprisingly, the function provides results that deviate from the expected ones. Specifically, whereas the function yields correct EAP estimates of higher-level factors, the ML estimates are counterintuitive and possibly incorrect. Moreover, the function does not provide the expected standard errors. I illustrate these issues using an example from organizational research where team leaders are evaluated by their employees, and I discuss these issues from a measurement perspective. Full article
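The behavior described here can be inspected with the two-level example data shipped with lavaan (a sketch; how the returned level-2 scores are organized may depend on the lavaan version):

library(lavaan)

# Two-level CFA: a within-cluster and a between-cluster factor for y1-y3.
model <- '
  level: 1
    fw =~ y1 + y2 + y3
  level: 2
    fb =~ y1 + y2 + y3
'
fit <- sem(model, data = Demo.twolevel, cluster = "cluster")

# Factor score estimates: the default EAP-type ("EBM") scores behave as
# expected, whereas the ML scores of the higher-level factor are the ones
# the article flags as counterintuitive.
fs_eap <- lavPredict(fit)
fs_ml  <- lavPredict(fit, method = "ML")
str(fs_eap)
str(fs_ml)
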
28 pages, 10038 KiB  
Article
A Tutorial on How to Conduct Meta-Analysis with IBM SPSS Statistics
by Sedat Sen and Ibrahim Yildirim
Psych 2022, 4(4), 640-667; https://doi.org/10.3390/psych4040049 - 22 Sep 2022
Cited by 15 | Viewed by 26307
Abstract
Meta-analysis has become one of the most widely used methodologies in psychological research. Such a technique allows researchers to combine the data sets obtained from several individual studies on the same topic and is thus particularly useful for finding solutions to controversial issues that cannot be solved with individual studies. This paper presents a detailed tutorial on the IBM SPSS software, showing how to implement the statistical analyses required for meta-analysis. Examples are also provided to highlight the main analyses conducted in a meta-analysis. The tutorial ends by discussing the differences between IBM SPSS capabilities and those of other software packages. Full article

14 pages, 351 KiB  
Article
What Is the Maximum Likelihood Estimate When the Initial Solution to the Optimization Problem Is Inadmissible? The Case of Negatively Estimated Variances
by Steffen Zitzmann, Julia-Kim Walther, Martin Hecht and Benjamin Nagengast
Psych 2022, 4(3), 343-356; https://doi.org/10.3390/psych4030029 - 30 Jun 2022
Cited by 3 | Viewed by 1673
Abstract
The default procedures of the software programs Mplus and lavaan tend to yield an inadmissible solution (also called a Heywood case) when the sample is small or the parameter is close to the boundary of the parameter space. In factor models, a negatively estimated variance often occurs. One strategy to deal with this is fixing the variance to zero and then estimating the model again in order to obtain the estimates of the remaining model parameters. In the present article, we present one possible approach for justifying this strategy. Specifically, using a simple one-factor model as an example, we show that the maximum likelihood (ML) estimate of the variance of the latent factor is zero when the initial solution to the optimization problem (i.e., the solution provided by the default procedure) is a negative value. The basis of our argument is the very definition of ML estimation, which requires that the log-likelihood be maximized over the parameter space. We present the results of a small simulation study, which was conducted to evaluate the proposed ML procedure and compare it with Mplus’ default procedure. We found that the proposed ML procedure increased estimation accuracy compared to Mplus’ procedure, rendering the ML procedure an attractive option to deal with inadmissible solutions. Full article
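The strategy whose justification is developed here can be sketched with lavaan (a generic illustration of "fix the offending variance to zero and re-estimate", not the authors' simulation code):

library(lavaan)

# Small sample from a one-factor model whose factor variance is close to zero,
# so the unconstrained estimate may turn out negative (a Heywood case).
set.seed(8)
N <- 40
f <- rnorm(N, 0, 0.1)
dat <- data.frame(y1 = f + rnorm(N), y2 = f + rnorm(N), y3 = f + rnorm(N))

m_free <- ' f =~ 1*y1 + 1*y2 + 1*y3
            f ~~ phi*f '
fit_free <- cfa(m_free, data = dat)
parameterEstimates(fit_free)   # check whether phi is estimated as negative

# Fix the variance to zero and re-estimate the remaining parameters,
# which corresponds to the ML solution on the boundary of the parameter space.
m_fixed <- ' f =~ 1*y1 + 1*y2 + 1*y3
             f ~~ 0*f '
fit_fixed <- cfa(m_fixed, data = dat)
summary(fit_fixed)
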
25 pages, 3717 KiB  
Article
Dealing with Missing Responses in Cognitive Diagnostic Modeling
by Shenghai Dai and Dubravka Svetina Valdivia
Psych 2022, 4(2), 318-342; https://doi.org/10.3390/psych4020028 - 14 Jun 2022
Cited by 1 | Viewed by 2299
Abstract
Missing data are a common problem in educational assessment settings. In the implementation of cognitive diagnostic models (CDMs), the presence and/or inappropriate treatment of missingness may yield biased parameter estimates and diagnostic information. Using simulated data, this study evaluates ten approaches for handling missing data in a commonly applied CDM (the deterministic inputs, noisy “and” gate (DINA) model): treating missing data as incorrect (IN), person mean (PM) imputation, item mean (IM) imputation, two-way (TW) imputation, response function (RF) imputation, logistic regression (LR), expectation-maximization (EM) imputation, full information maximum likelihood (FIML) estimation, predictive mean matching (PMM), and random imputation (RI). Specifically, the current study investigates how the estimation accuracy of item parameters and examinees’ attribute profiles from DINA are impacted by the presence of missing data and the selection of missing data methods across conditions. While no single method was found to be superior to other methods across all conditions, the results suggest the use of FIML, PMM, LR, and EM in recovering item parameters. The selected methods, except for PM, performed similarly across conditions regarding attribute classification accuracy. Recommendations for the treatment of missing responses for CDMs are provided. Limitations and future directions are discussed. Full article
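Two of the simplest treatments compared here, treating missing responses as incorrect (IN) and person mean (PM) imputation, reduce to recoding steps carried out before the DINA model is fitted (a base-R sketch; the study evaluates ten methods in total):

# Toy response matrix with missing entries.
set.seed(9)
dat <- matrix(rbinom(200 * 10, 1, 0.6), 200, 10)
dat[sample(length(dat), 300)] <- NA

# IN: treat every missing response as incorrect.
dat_in <- dat
dat_in[is.na(dat_in)] <- 0

# PM: replace missing responses by the person's mean over observed items
# (rounded here to stay in the 0/1 response format).
pm <- rowMeans(dat, na.rm = TRUE)
dat_pm <- dat
for (j in seq_len(ncol(dat_pm))) {
  miss <- is.na(dat_pm[, j])
  dat_pm[miss, j] <- round(pm[miss])
}

# Either completed matrix can then be passed to a DINA implementation,
# e.g., GDINA::GDINA(dat_in, Q, model = "DINA").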

Review


22 pages, 870 KiB  
Review
Bayesian Regularized SEM: Current Capabilities and Constraints
by Sara van Erp
Psych 2023, 5(3), 814-835; https://doi.org/10.3390/psych5030054 - 03 Aug 2023
Cited by 2 | Viewed by 1638
Abstract
An important challenge in statistical modeling is to balance how well our model explains the phenomenon under investigation with the parsimony of this explanation. In structural equation modeling (SEM), penalization approaches that add a penalty term to the estimation procedure have been proposed to achieve this balance. An alternative to the classical penalization approach is Bayesian regularized SEM in which the prior distribution serves as the penalty function. Many different shrinkage priors exist, enabling great flexibility in terms of shrinkage behavior. As a result, different types of shrinkage priors have been proposed for use in a wide variety of SEMs. However, the lack of a general framework and the technical details of these shrinkage methods can make it difficult for researchers outside the field of (Bayesian) regularized SEM to understand and apply these methods in their own work. Therefore, the aim of this paper is to provide an overview of Bayesian regularized SEM, with a focus on the types of SEMs in which Bayesian regularization has been applied as well as available software implementations. Through an empirical example, various open-source software packages for (Bayesian) regularized SEM are illustrated and all code is made available online to aid researchers in applying these methods. Finally, reviewing the current capabilities and constraints of Bayesian regularized SEM identifies several directions for future research. Full article
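As one concrete instance of the shrinkage-prior idea (a hedged sketch with blavaan; the review covers several packages, and the exact prior syntax depends on the blavaan version and MCMC backend):

library(lavaan)
library(blavaan)

# CFA in which a cross-loading receives a small-variance ("ridge-type") normal
# prior, so it is shrunken toward zero unless the data clearly support it.
model <- '
  f1 =~ x1 + x2 + x3 + prior("normal(0, 0.1)") * x4
  f2 =~ x4 + x5 + x6
'
fit <- bcfa(model, data = HolzingerSwineford1939,
            n.chains = 2, burnin = 500, sample = 1000)
summary(fit)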
