Article

Examining Differences of Invariance Alignment in the Mplus Software and the R Package sirt

by Alexander Robitzsch 1,2
1 IPN–Leibniz Institute for Science and Mathematics Education, Olshausenstraße 62, 24118 Kiel, Germany
2 Centre for International Student Assessment (ZIB), Olshausenstraße 62, 24118 Kiel, Germany
Mathematics 2024, 12(5), 770; https://doi.org/10.3390/math12050770
Submission received: 17 February 2024 / Revised: 24 February 2024 / Accepted: 3 March 2024 / Published: 5 March 2024

Abstract

Invariance alignment (IA) is a multivariate statistical technique to compare the means and standard deviations of a factor variable in a one-dimensional factor model across multiple groups. To date, the IA method is most frequently estimated using the commercial Mplus software. IA has also been implemented in the R package sirt. In this article, the performance of IA in the software packages Mplus and R is compared. It is argued and empirically shown in a simulation study and an empirical example that differences between software packages are primarily caused by different identification constraints in IA. With a change of the identification constraint employing an argument in the IA function in sirt, Mplus and sirt resulted in comparable performance. Moreover, in line with previous work, the simulation study also highlighted that the tuning parameter ε = 0.001 in IA is preferable to ε = 0.01. Furthermore, an empirical example raises the question of whether IA, in its current implementations, behaves as expected in the case of many groups.

1. Introduction

In the comparison of multiple groups in confirmatory factor analysis (CFA) regarding factor variables, some identifying assumptions have to be made. It is frequently assumed that item parameters are equal across groups, denoted as measurement invariance [1]. The concept of invariance has been very prominent in psychology and the social sciences in general [2,3]. For example, in international large-scale assessment studies in education, like the Programme for International Student Assessment (PISA), the necessity of invariance is strongly emphasized [4].
For situations in which measurement invariance is violated, the invariance alignment (IA) method [5,6] (also referred to as alignment optimization [7,8]) has been proposed. The IA method tries to make item parameters as invariant as possible while allowing a few deviations from invariance. By doing so, group comparisons can be made more robust against violations of measurement invariance.
Nowadays, the IA method is frequently applied in social sciences for analyzing questionnaire data [9,10,11,12]. Unfortunately, most methodological developments of IA (but see [13,14,15] for exceptions) are strongly coupled to the popular but commercial (and closed-source) Mplus software [16]. Previous simulation studies for one-dimensional factor models investigated the case of continuous items [5,8,17,18,19], dichotomous items [20,21], and polytomous items [14,22]. The application of IA to multidimensional factor models with continuous items has been investigated in [23,24]. Moreover, IA was studied in longitudinal models in [25,26,27]. The optimization function used in IA also gave rise to extending it to a general framework used in penalized structural equation models [28].
Besides the Mplus software, there exists an alternative implementation in the R package sirt [29]. However, several researchers have pointed out that there could be subtle differences of IA between Mplus and sirt. Unfortunately, there is no systematic comparison of the performance of invariance alignment implementations in Mplus and sirt. This article tries to shed some light on the subtleties of implementation differences of IA. It turns out that different identification constraints are likely the cause of the different results of software packages. By changing the default identification constraint in sirt, Mplus and sirt provided much more similar results. Moreover, the results from a simulation study also question the default choices of tuning parameters in the software packages.
The rest of this article is structured as follows. In Section 2, the background of IA is reviewed. Section 3 discusses the syntax code and estimation options of IA in Mplus and sirt. In Section 4, the two software packages are compared by means of a simulation study. An empirical example is presented in Section 5. Finally, the paper closes with a discussion in Section 6.

2. Invariance Alignment

Let the random variable $X_{ig}$ denote item $i$ ($i = 1, \ldots, I$) in group $g$ ($g = 1, \ldots, G$). A one-dimensional factor model [30] is defined as
$$X_{ig} = \nu_{ig} + \lambda_{ig} F_g + \epsilon_{ig}, \quad F_g \sim N(\mu_g, \sigma_g^2), \quad \epsilon_{ig} \sim N(0, \omega_{ig}), \tag{1}$$
where $\lambda_{ig}$ are item loadings, and $\nu_{ig}$ are item intercepts. Item loadings can be assumed to be positive. If some loading is negative, the corresponding random variable $X_{ig}$ must be multiplied by $-1$. The factor variables $F_g$ and all residual variables $\epsilon_{ig}$ are independent and univariate normally distributed. The factor variable $F_g$ has a factor mean $\mu_g$ and a factor standard deviation $\sigma_g$.
Without additional assumptions, the parameters in (1) are not identified. An identified model is obtained by assuming a standardized latent variable $F_g$ (i.e., with a mean of 0 and a standard deviation of 1):
$$X_{ig} = \nu_{ig,0} + \lambda_{ig,0} F_g + \epsilon_{ig}, \quad F_g \sim N(0, 1), \quad \epsilon_{ig} \sim N(0, \omega_{ig}). \tag{2}$$
The parameters in (1) and (2) are related to each other by
$$\lambda_{ig,0} = \lambda_{ig} \sigma_g \quad \text{and} \quad \nu_{ig,0} = \nu_{ig} + \lambda_{ig} \mu_g = \nu_{ig} + \frac{\lambda_{ig,0}}{\sigma_g} \mu_g. \tag{3}$$
In many applications, the factor means $\mu_g$ and factor standard deviations $\sigma_g$ should be compared across groups. To achieve this, a typical assumption in the social sciences is the property of measurement invariance [1,3]. Measurement invariance presupposes that item loadings $\lambda_{ig}$ and item intercepts $\nu_{ig}$ are equal across groups. That is, there exist common item loadings $\lambda_i$ such that $\lambda_i = \lambda_{ig}$ for all $g = 1, \ldots, G$ and common item intercepts $\nu_i$ such that $\nu_i = \nu_{ig}$ for $g = 1, \ldots, G$ for all items $i = 1, \ldots, I$. The absence of measurement invariance is also labeled as differential item functioning (DIF; [2,31]) in the item response theory literature. If measurement invariance holds, (3) can be rewritten as
$$\lambda_{ig,0} = \lambda_i \sigma_g \quad \text{and} \quad \nu_{ig,0} = \nu_i + \frac{\lambda_{ig,0}}{\sigma_g} \mu_g. \tag{4}$$
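The relation between the two parametrizations can be checked numerically. The following base R lines are a minimal sketch with hypothetical parameter values for a single group (the numbers and variable names are illustrative assumptions, not taken from any dataset); they apply Equation (3) and then invert it for known $\mu_g$ and $\sigma_g$.

```r
# Hypothetical parameters of one group g (illustration only)
lambda <- c(1.0, 1.0, 1.5, 1.0, 1.0)   # item loadings lambda_ig
nu     <- c(0.0, 0.0, 0.0, 0.0, 0.5)   # item intercepts nu_ig
mu     <- 0.3                          # factor mean mu_g
sigma  <- 1.2                          # factor standard deviation sigma_g

# Equation (3): parameters of the standardized model (2)
lambda0 <- lambda * sigma
nu0     <- nu + lambda * mu

# Inversion of Equation (3) for known mu_g and sigma_g
lambda_back <- lambda0 / sigma
nu_back     <- nu0 - lambda0 / sigma * mu

all.equal(lambda_back, lambda)
all.equal(nu_back, nu)
```

In practice, $\mu_g$ and $\sigma_g$ are unknown, which is why the group-wise estimates of model (2) alone do not allow group comparisons without further assumptions.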
The IA method of Asparouhov and Muthén [5,6] tackles situations under sparse violations of measurement invariance. In this case, a few item loadings or item intercepts are allowed to differ across groups, while the majority of items (approximately) fulfills the invariance assumption [32]. This situation is called partial invariance in the literature [33].
The IA estimation method proceeds in two steps. In the first step, the one-dimensional factor model (2) is separately estimated by the maximum likelihood method for all groups. The estimated item parameters $\hat{\lambda}_{ig,0}$ and $\hat{\nu}_{ig,0}$ ($i = 1, \ldots, I$; $g = 1, \ldots, G$) are used as the input of the IA. By rewriting (3) and inserting the estimated item loadings and item intercepts, we obtain
$$\lambda_{ig} - \lambda_{ih} = \frac{\hat{\lambda}_{ig,0}}{\sigma_g} - \frac{\hat{\lambda}_{ih,0}}{\sigma_h} \quad \text{and} \quad \nu_{ig} - \nu_{ih} = \hat{\nu}_{ig,0} - \hat{\nu}_{ih,0} - \frac{\hat{\lambda}_{ig,0}}{\sigma_g} \mu_g + \frac{\hat{\lambda}_{ih,0}}{\sigma_h} \mu_h. \tag{5}$$
These relations motivate the second step, in which the following linking function is minimized to determine the group means $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_G)$ and standard deviations $\boldsymbol{\sigma} = (\sigma_1, \ldots, \sigma_G)$:
$$H(\boldsymbol{\mu}, \boldsymbol{\sigma}) = \sum_{i=1}^{I} \sum_{g=1}^{G-1} \sum_{h=g+1}^{G} w_{i1,gh} \, \rho\left( \frac{\hat{\lambda}_{ig,0}}{\sigma_g} - \frac{\hat{\lambda}_{ih,0}}{\sigma_h} \right) + \sum_{i=1}^{I} \sum_{g=1}^{G-1} \sum_{h=g+1}^{G} w_{i2,gh} \, \rho\left( \hat{\nu}_{ig,0} - \hat{\nu}_{ih,0} - \frac{\hat{\lambda}_{ig,0}}{\sigma_g} \mu_g + \frac{\hat{\lambda}_{ih,0}}{\sigma_h} \mu_h \right), \tag{6}$$
where the weights $w_{i1,gh}$ and $w_{i2,gh}$ are known, and $\rho$ is a nonnegative, symmetric loss function with $\rho(0) = 0$ that is monotonically increasing for nonnegative values of $x$. Asparouhov and Muthén [5] proposed using $w_{i1,gh} = w_{i2,gh} = \sqrt{n_g n_h}$ and $\rho(x) = \sqrt{|x|}$, where $n_g$ denotes the sample size of group $g$.
In the minimization of (6), additional identification constraints must be imposed. As a first alternative, the distribution parameters of the first (or any other) group can be fixed. That is, we set $\mu_1 = 0$ and $\sigma_1 = 1$. As a second alternative, one can simultaneously constrain all estimated parameters. Then, the following identification constraints can be imposed:
$$\sum_{g=1}^{G} \mu_g = 0 \quad \text{and} \quad \prod_{g=1}^{G} \sigma_g = 1. \tag{7}$$
The constraints in (7) state that the arithmetic mean of the factor means equals zero, and the geometric mean of the factor standard deviations equals one.
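To make the structure of the linking function concrete, the following base R code is a minimal sketch of Equation (6) under the first identification constraint ($\mu_1 = 0$, $\sigma_1 = 1$). It is not the implementation used in Mplus or sirt. The function names H_fun and obj, the smoothed loss with a small constant eps, the fully invariant population parameters, and the sample sizes are all assumptions made for illustration.

```r
# Smoothed loss (x^2 + eps)^(p/2); for p = 0.5 it approximates sqrt(|x|)
rho_D <- function(x, p = 0.5, eps = 0.001) (x^2 + eps)^(p / 2)

# Linking function H of Equation (6)
# lambda0, nu0: G x I matrices of group-wise estimates; n: group sample sizes
H_fun <- function(mu, sigma, lambda0, nu0, n, p = 0.5, eps = 0.001) {
  G <- nrow(lambda0)
  val <- 0
  for (g in 1:(G - 1)) {
    for (h in (g + 1):G) {
      w <- sqrt(n[g] * n[h])   # weights proposed in [5]
      d_lam <- lambda0[g, ] / sigma[g] - lambda0[h, ] / sigma[h]
      d_nu  <- nu0[g, ] - nu0[h, ] -
        lambda0[g, ] / sigma[g] * mu[g] + lambda0[h, ] / sigma[h] * mu[h]
      val <- val + w * sum(rho_D(d_lam, p, eps) + rho_D(d_nu, p, eps))
    }
  }
  val
}

# Hypothetical population input under full invariance (3 groups, 5 items)
mu_true    <- c(0, 0.3, 0.8)
sigma_true <- c(1, 1.225, 1.095)
lambda     <- rep(1, 5)                  # common loadings (assumed)
nu         <- rep(0, 5)                  # common intercepts (assumed)
lambda0 <- outer(sigma_true, lambda)     # lambda_ig0 = lambda_i * sigma_g
nu0     <- outer(mu_true, lambda) + matrix(nu, 3, 5, byrow = TRUE)
n       <- rep(500, 3)

# Objective in the free parameters; mu_1 = 0 and sigma_1 = 1 are fixed
obj <- function(par) {
  G <- nrow(lambda0)
  mu    <- c(0, par[1:(G - 1)])
  sigma <- c(1, exp(par[G:(2 * G - 2)]))   # log scale keeps SDs positive
  H_fun(mu, sigma, lambda0, nu0, n)
}

fit <- optim(rep(0, 4), obj, method = "BFGS")
mu_hat    <- c(0, fit$par[1:2])
sigma_hat <- c(1, exp(fit$par[3:4]))
```

Under full invariance, all pairwise differences in (5) vanish at the data-generating parameters, so the objective attains its smallest possible value there. With noninvariant items, the minimizer depends on the loss function, the tuning constant, and the identification constraint, which is the topic of the remainder of this article.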
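Note that the optimization function $H$ of IA defined in (6) can be rewritten as
$$H(\boldsymbol{\mu}, \boldsymbol{\sigma}) = H_1(\boldsymbol{\sigma}) + H_2(\boldsymbol{\mu}, \boldsymbol{\sigma}), \tag{8}$$
where
$$H_1(\boldsymbol{\sigma}) = \sum_{i=1}^{I} \sum_{g=1}^{G-1} \sum_{h=g+1}^{G} \rho\left( \frac{\hat{\lambda}_{ig,0}}{\sigma_g} - \frac{\hat{\lambda}_{ih,0}}{\sigma_h} \right) \quad \text{and} \quad H_2(\boldsymbol{\mu}, \boldsymbol{\sigma}) = \sum_{i=1}^{I} \sum_{g=1}^{G-1} \sum_{h=g+1}^{G} \rho\left( \hat{\nu}_{ig,0} - \hat{\nu}_{ih,0} - \frac{\hat{\lambda}_{ig,0}}{\sigma_g} \mu_g + \frac{\hat{\lambda}_{ih,0}}{\sigma_h} \mu_h \right). \tag{9}$$
However, the function $H_1$ can be conveniently substituted by an alternative. Note that Equation (3) can be rewritten as
$$\log \lambda_{ig,0} = \log \lambda_{ig} + \log \sigma_g \quad \text{and} \quad \nu_{ig,0} = \nu_{ig} + \frac{\lambda_{ig,0}}{\sigma_g} \mu_g. \tag{10}$$
This motivates the alternative optimization function $H_1^*$ for determining standard deviations, which employs logarithmized item loadings (see [34,35])
$$H_1^*(\boldsymbol{\sigma}^*) = \sum_{i=1}^{I} \sum_{g=1}^{G-1} \sum_{h=g+1}^{G} \rho\left( \log \hat{\lambda}_{ig,0} - \log \hat{\lambda}_{ih,0} - \sigma_g^* + \sigma_h^* \right), \tag{11}$$
where $\sigma_g^* = \log \sigma_g$ for $g = 1, \ldots, G$. Due to the required identification constraints, we fix $\sigma_1^* = 0$ (i.e., $\sigma_1 = \exp(\sigma_1^*) = 1$). By minimizing $H_1^*$, a vector of standard deviations $\hat{\boldsymbol{\sigma}}^*$ on the logarithm metric is obtained; that is, $\hat{\boldsymbol{\sigma}}^* = (\hat{\sigma}_1^*, \ldots, \hat{\sigma}_G^*)$. The vector of estimated standard deviations $\hat{\boldsymbol{\sigma}}$ can be obtained by exponentiating all entries in $\hat{\boldsymbol{\sigma}}^*$.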
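The alignment on logarithmized loadings can be sketched in a few lines of base R. The code below is an illustration under stated assumptions, not the sirt implementation: the function name H1_star, the smoothed loss with eps = 0.001, and the fully invariant input loadings are hypothetical choices.

```r
# Smoothed loss (x^2 + eps)^(p/2) used in place of |x|^p
rho_D <- function(x, p = 0.5, eps = 0.001) (x^2 + eps)^(p / 2)

# Hypothetical standardized loadings: 5 invariant items, 3 groups
lambda     <- rep(1, 5)                   # common loadings (assumed)
sigma_true <- c(1, 1.225, 1.095)          # group SDs (assumed)
lambda0    <- outer(sigma_true, lambda)   # G x I matrix, lambda_i * sigma_g

# H_1^* of Equation (11); sigma_1^* = 0 is fixed for identification
H1_star <- function(sigma_star_free, lambda0, p = 0.5, eps = 0.001) {
  sigma_star <- c(0, sigma_star_free)
  loglam <- log(lambda0)
  G <- nrow(loglam)
  val <- 0
  for (g in 1:(G - 1)) {
    for (h in (g + 1):G) {
      res <- loglam[g, ] - loglam[h, ] - sigma_star[g] + sigma_star[h]
      val <- val + sum(rho_D(res, p = p, eps = eps))
    }
  }
  val
}

fit <- optim(par = rep(0, nrow(lambda0) - 1), fn = H1_star,
             lambda0 = lambda0, method = "BFGS")
sigma_hat <- exp(c(0, fit$par))   # back-transformation to the SD metric
```

Because the parameters $\sigma_g^*$ enter (11) additively, no positivity constraint on the standard deviations has to be handled in the optimizer; positivity follows from the exponentiation.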

2.1. Numerical Optimization

As mentioned above, IA uses the loss function $\rho(x) = \sqrt{|x|} = |x|^{0.5}$ as the default in the Mplus software package [16]. However, the loss function $\rho(x) = |x|^{0.25}$ is also available in Mplus [16]. The more general $L_p$ loss function $\rho(x) = |x|^p$ for $p > 0$ has been studied for IA in [13,35]. It has been shown that values of the power $p$ smaller than 0.5 can be advantageous in some situations [13].
In the practical minimization of $H$ involved in IA, the nondifferentiable $L_p$ loss function $\rho(x) = |x|^p$ (for $0 < p \le 1$) is replaced by a differentiable approximation $\rho_D$ (see [5,35])
$$\rho_D(x) = (x^2 + \varepsilon)^{p/2}, \tag{12}$$
where $\varepsilon > 0$ is a tuning parameter that controls the approximation error of $\rho_D$ for $\rho$. The approximation error becomes smaller with $\varepsilon$ values close to zero. However, the minimization of $H$ in IA becomes more difficult when choosing too small values of $\varepsilon$. Practical experience led to proposals $\varepsilon = 0.01$ [5] or $\varepsilon = 0.001$ [35]. The choice $\varepsilon = 0.01$ is the default in Mplus (see [13]).
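The effect of the tuning parameter on the approximation error can be inspected directly. The following base R lines are a small sketch (the grid of $x$ values is an arbitrary choice for illustration) that compares $\rho_D$ from Equation (12) with $\rho(x) = |x|^p$ for $p = 0.5$ and the two proposed values of $\varepsilon$.

```r
rho   <- function(x, p = 0.5) abs(x)^p
rho_D <- function(x, p = 0.5, eps = 0.01) (x^2 + eps)^(p / 2)

x <- c(0, 0.05, 0.1, 0.5, 1)   # arbitrary evaluation points

# Approximation error rho_D(x) - rho(x) for both tuning parameters
err_01  <- rho_D(x, eps = 0.01)  - rho(x)
err_001 <- rho_D(x, eps = 0.001) - rho(x)

round(cbind(x, err_01, err_001), 3)
```

At $x = 0$, the error equals $\varepsilon^{p/2}$, and it shrinks as $|x|$ grows. The approximation is therefore least accurate exactly for the (nearly) invariant item parameters that IA tries to detect, which gives one intuition for why the choice of $\varepsilon$ matters.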

2.2. A More in-Depth Look into the Identification Constraint for Standard Deviations for Many Groups

The IA method measures the similarity between item loadings in the optimization function $H_1$ by
$$H_1(\boldsymbol{\sigma}) = \sum_{i=1}^{I} \sum_{g=1}^{G-1} \sum_{h=g+1}^{G} \rho\left( \frac{\hat{\lambda}_{ig,0}}{\sigma_g} - \frac{\hat{\lambda}_{ih,0}}{\sigma_h} \right). \tag{13}$$
As mentioned above, an identification constraint could be to fix the standard deviation of the first group to 1 or to fix the product of standard deviations to 1. Regarding the choice of the identification constraint in their Mplus software, Asparouhov and Muthén [5] state that “[…] in Mplus by default the parameters are indeed reported in that metric, however, the alignment optimization is carried out using Equation (10) [i.e., the product identification constraint in (7)] to ensure full symmetry between the different groups”. To illustrate this motivation a bit, we rewrite (13) as
$$H_1(\boldsymbol{\sigma}) = \sum_{i=1}^{I} \sum_{h=2}^{G} \rho\left( \frac{\hat{\lambda}_{i1,0}}{\sigma_1} - \frac{\hat{\lambda}_{ih,0}}{\sigma_h} \right) + \sum_{i=1}^{I} \sum_{g=2}^{G-1} \sum_{h=g+1}^{G} \rho\left( \frac{\hat{\lambda}_{ig,0}}{\sigma_g} - \frac{\hat{\lambda}_{ih,0}}{\sigma_h} \right), \tag{14}$$
where we separated the terms that involve the first group (first sum) from those that do not (second sum). If the optimization were carried out based only on the second term in (14), the value of the optimization function would tend to zero as the standard deviations tend to infinity. Hence, fixing the standard deviation $\sigma_1$ to 1 prevents obtaining infinite estimates of $\sigma_g$ for $g = 2, \ldots, G$. If $\sigma_1 = 1$ is specified in the minimization of (14), it becomes clear that the first term in the sum involving the first group becomes less relevant if the number of groups increases (the first term contains $I(G-1)$ summands, whereas the second term contains $I(G-1)(G-2)/2$ summands). Hence, there is a danger that estimated standard deviations are larger if more groups are involved in the analysis. For this reason, the identification constraint $\sigma_1 = 1$ is likely not appropriate in the case of many groups. In contrast, the constraint $\prod_{g=1}^{G} \sigma_g = 1$ would be preferable in this case. The behavior of IA for many groups is analyzed in a simulation study in Section 4 and an empirical example in Section 5.
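The argument based on the decomposition (14) can be illustrated numerically. The base R code below is a sketch with hypothetical standardized loadings for four groups (the matrix lambda0, the function name H1_parts, and the inflation factors are assumptions for illustration). It evaluates the two parts of (14) with $\rho(x) = \sqrt{|x|}$ while $\sigma_1 = 1$ is fixed and the remaining standard deviations are inflated by a common factor.

```r
rho <- function(x) sqrt(abs(x))

# Split H_1 in (14) into the part involving group 1 and the remaining part
H1_parts <- function(sigma, lambda0) {
  G <- nrow(lambda0)
  parts <- c(first = 0, second = 0)
  for (g in 1:(G - 1)) {
    for (h in (g + 1):G) {
      val <- sum(rho(lambda0[g, ] / sigma[g] - lambda0[h, ] / sigma[h]))
      k <- if (g == 1) "first" else "second"
      parts[k] <- parts[k] + val
    }
  }
  parts
}

# Hypothetical standardized loadings for G = 4 groups and I = 5 items
lambda0 <- rbind(c(1.0, 1.0, 1.5, 1.0, 1.0),
                 c(1.2, 1.2, 1.2, 1.2, 0.6),
                 c(1.1, 1.1, 1.1, 0.5, 1.1),
                 c(0.9, 1.0, 0.9, 0.9, 0.9))

# sigma_1 = 1 is fixed; the other SDs are inflated by a common factor
for (infl in c(1, 100, 10000)) {
  print(round(H1_parts(c(1, rep(infl, 3)), lambda0), 3))
}

# Share of pairwise comparisons that involve group 1: 2 / G
G <- c(3, 6, 12, 26)
share_first <- (G - 1) / choose(G, 2)
```

Inflating the standard deviations by a factor $c$ scales the second part by $1/\sqrt{c}$, whereas the first part approaches $\sum_i \rho(\hat{\lambda}_{i1,0})$ and thus penalizes the inflation. Since only a share of $2/G$ of all pairwise comparisons involves the first group, this penalty carries less weight as $G$ grows.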

3. Implementation of Invariance Alignment in Mplus and sirt

We now describe how IA can be estimated with the commercial Mplus software (Version 8.9; [16]) and the R (Version 4.3; [36]) package sirt [29].
Listing 1 contains command-line syntax for the specification of IA in Mplus (see [16,37]). The dataset is locally saved in mydata.dat (see Line 4 in Listing 1) in an appropriate working directory. The IA method should be applied for five items I1, …, I5 (see Line 6 in Listing 1). The numeric grouping variable group is included in the dataset. The grouping variable has to be specified as a known class variable in Mplus (see Lines 8 and 9 in Listing 1).
Listing 1. Specification of invariance alignment in Mplus software.
1   TITLE:
2     Invariance Alignment;
3   DATA:
4     FILE IS mydata.dat;
5   VARIABLE:
6     NAMES ARE group I1 I2 I3 I4 I5;
7     USEVARIABLES ARE group I1 I2 I3 I4 I5;
8     CLASSES = c(3);
9     KNOWNCLASS = c(group = 1 group = 2 group = 3);
10  ANALYSIS:
11    TYPE = MIXTURE;
12    ESTIMATOR = MLR;
13    ALIGNMENT = FIXED(2);  ! group=2 is reference group with zero mean;
14                           ! ALIGNMENT = FREE for method 'FREE';
15
16    TOLERANCE = 0.01;      ! epsilon value;
17    SIMPLICITY = SQRT;     ! for p=0.5;
18                           ! SIMPLICITY = FOURTHRT for p=0.25;
19  MODEL:
20    %overall%
21    f1 BY I1 I2 I3 I4 I5;
22  OUTPUT:
23    alignment;
Mplus has only implemented the product constraint $\prod_{g=1}^{G} \sigma_g = 1$ for standard deviations. The method FIXED (i.e., Line 13 in Listing 1 that states ALIGNMENT=FIXED) utilizes the zero constraint of the factor mean of the first group; that is, $\mu_1 = 0$. The reference to the first group can be changed using the command ALIGNMENT=FIXED(2) (see Line 13 in Listing 1). In this case, Group 2 is used as the reference group. Alternatively, “the FREE alignment optimization estimates $\alpha_1$ [i.e., the factor mean $\mu_1$ of the first group] as an additional parameter” [5]. This specification seems to be overparametrized, and Mplus must have implemented some fix to prevent nonconvergence of the IA optimization problem. The Mplus manual states, “In the FREE setting, all factor means are estimated. FREE is the most general approach” [16]. This statement certainly does not provide enough detail for an independent implementation of the black-box algorithms in the Mplus software. Furthermore, the TOLERANCE argument in Line 16 in Listing 1 specifies the tuning parameter ε that appears in the differentiable approximation (12). The default in Mplus is ε = 0.01. Finally, the SIMPLICITY argument can either choose the power p = 0.5 (i.e., square root SQRT) or p = 0.25 (i.e., fourth root FOURTHRT).
Listing 2 shows how IA can be estimated in the R package sirt [29,38,39]. In the first step, group-specific estimation of the one-dimensional factor models can be carried out with the function sirt::invariance_alignment_cfa_config() (see Line 5 in Listing 2). The group-specific estimated item loadings lambda and item intercepts nu can be extracted from the output of this function (see Lines 9 and 10 in Listing 2). Moreover, the weights $w_{i1,gh}$ and $w_{i2,gh}$ in IA (see Equation (6)) are specified in Line 14 in Listing 2. The specification in this listing ensures the same chosen weights as in Mplus. The function sirt::invariance.alignment() performs IA based on estimated item loadings lambda and item intercepts nu (see Line 17 in Listing 2). The power p in IA can be separately chosen for item loadings (first entry in align.pow) and item intercepts (second entry in align.pow). If the power p = 0.25 instead of the default p = 0.5 should be used in the analysis, users have to specify the argument align.pow=c(0.25,0.25) in the sirt::invariance.alignment() function. The tuning parameter ε in Equation (12) can be specified with the argument eps.
Listing 2. Specification of invariance alignment in the R package sirt.
1   #* define items
2   items <- paste0("I", 1:5)
3
4   #* separate estimate of factor model in groups
5   prep <- sirt::invariance_alignment_cfa_config(dat=dat[,items],
6                     group=dat$group)
7
8   #- extract item loadings and item intercepts
9   lambda <- prep$lambda
10  nu <- prep$nu
11
12  #- define weights
13  Ng <- prep$N
14  wgts <- matrix(sqrt(Ng), length(Ng), ncol(nu))
15
16  #* perform invariance alignment
17  res <- sirt::invariance.alignment(lambda=lambda, nu=nu,
18                    align.pow=c(.5, .5), eps=0.01, wgt=wgts, meth=3)
19
20  #- extract estimated means and standard deviations
21  res$pars
The IA function in the sirt package has four different estimation methods that can be requested with the argument meth. The default meth = 1 uses the optimization function (6) with the identification constraints $\mu_1 = 0$ and $\sigma_1 = 1$. The method meth = 2 performs IA on logarithmized item loadings (see Equation (11)), also using the constraints $\mu_1 = 0$ and $\sigma_1 = 1$. The method meth = 3 implements the product constraint $\prod_{g=1}^{G} \sigma_g = 1$ for standard deviations and the zero mean constraint for the first group (i.e., $\mu_1 = 0$). Hence, this method is expected to perform similarly to Mplus’ FIXED alignment method. Finally, meth = 4 also utilizes the product constraint for standard deviations but freely estimates the first group mean $\mu_1$. To identify the model, a penalty term $\omega W \sum_{g=1}^{G} \mu_g^2$ is added to the optimization function, where $W$ is the sum of the involved weights in the IA optimization function and $\omega = 0.01$ is a small factor to achieve convergence in optimization. Likely, this method has only conceptual similarity with Mplus’ FREE method, and no equivalent performance can be expected.
The estimated distribution parameters can be requested by the list entry pars of the output object (see Line 21 in Listing 2).

4. Simulation Study

4.1. Method

The datasets in this simulation study were simulated from a one-dimensional factor model consisting of I = 5 items and G = 3, 6, 9, or 12 groups. In the case of three groups, the group means were 0, 0.3, and 0.8, and the group standard deviations were 1, 1.225, and 1.095, respectively. With more than three groups, all parameters (i.e., distribution and item parameters) were replicated accordingly. For example, for six groups, the parameters were twice replicated.
All measurement error variances were set to 1 in all groups and uncorrelated with each other. The factor variable and residual variables were normally distributed. There was noninvariance in item intercepts and item loadings. All item intercepts had a value of zero except for a few cases. In the first group, the fifth item intercept was 0.5. In the second group, the first item intercept was 0.5, while the second item had an intercept of 0.5 in the third group. All item loadings had a value of one except for a few cases. In the first group, the third item loading was 1.5. In the second group, the fifth item loading was 0.5, while the fourth item loading was 0.5 in the third group. These parameters were duplicated with more than three groups as described above.
The sample size per group was chosen as N = 250, N = 500, N = 1000, N = 2000, or N = Inf (i.e., infinite sample size). In the case of an infinite sample size, there was no sampling error, and the population parameters were the data-generating parameters. The mean vectors and the covariance matrices are sufficient statistics for the IA method. Datasets with a sample size of N = 9999, whose empirical means and covariances equaled the population means and covariances, respectively, were simulated in this case.
The IA method was applied using the Mplus software (Version 8.9; [16]) and the function invariance.alignment() in the R package sirt (Version 4.1-15; [29]). Both software packages utilized the power p = 0.5 and the tuning parameter choices ε = 0.01 and ε = 0.001. Mplus was used with the FIXED or the FREE methods, while the method meth in sirt was specified as meth = 1, meth = 2, meth = 3, or meth = 4. To compare the performance across methods, the estimates were linearly transformed such that the mean and the SD of the first group were 0 and 1, respectively.
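The linear transformation to the metric of the first group can be written compactly. The base R function below is a minimal sketch; the function name to_first_group_metric and the input values (chosen to satisfy a product constraint) are hypothetical and serve only as an illustration.

```r
# Transform estimates so that group 1 has mean 0 and SD 1
to_first_group_metric <- function(mu, sigma) {
  list(mu = (mu - mu[1]) / sigma[1], sigma = sigma / sigma[1])
}

# Hypothetical estimates reported under the product constraint for SDs
sigma <- c(0.9, 1.1, 1 / (0.9 * 1.1))   # product equals 1
mu    <- c(-0.35, -0.05, 0.40)

tr <- to_first_group_metric(mu, sigma)
tr$mu
tr$sigma
```

The transformation only changes the reporting metric: ratios of standard deviations and mean differences relative to the SD of the first group are preserved. It does not remove differences between estimates that were obtained under different identification constraints during the optimization.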
In total, R = 1000 replications were conducted for each cell of the simulation study. Bias, standard deviation (SD), root mean square error (RMSE), and relative RMSE were computed to assess the performance of the different estimators for factor means and factor standard deviations. To ease the comparability between the different estimation methods, we computed a relative RMSE value, which was defined as the quotient of the RMSE for a particular method and the RMSE of a reference method. This quotient was multiplied by 100 afterward. The reference method was Mplus’ FIXED method with p = 0.5 and ε = 0.01, which is the default in this software package. We also computed the mean absolute difference between estimates of Mplus and sirt to determine possible differences between software packages. Information about model specifications can be found in the material located at https://osf.io/84ne5 (accessed on 17 February 2024).
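The performance measures can be computed as in the following base R sketch. The estimates are randomly generated placeholders and not results of the simulation study; the function name perf, the true value, and the use of the divisor R in the SD are assumptions made for illustration.

```r
# Bias, SD, and RMSE of estimates across replications
perf <- function(est, true) {
  bias   <- mean(est) - true
  sd_est <- sqrt(mean((est - mean(est))^2))   # divisor R (an assumption)
  rmse   <- sqrt(mean((est - true)^2))
  c(bias = bias, sd = sd_est, rmse = rmse)
}

set.seed(1)
true_mu2 <- 0.3
est_ref <- rnorm(1000, mean = 0.25, sd = 0.08)   # placeholder: reference method
est_alt <- rnorm(1000, mean = 0.29, sd = 0.09)   # placeholder: alternative method

perf_ref <- perf(est_ref, true_mu2)
perf_alt <- perf(est_alt, true_mu2)

# Relative RMSE: RMSE of a method divided by the RMSE of the reference, times 100
rel_rmse <- 100 * perf_alt["rmse"] / perf_ref["rmse"]

# Mean absolute difference between two sets of estimates
mad_est <- mean(abs(est_alt - est_ref))
```

With the divisor R, the squared RMSE equals the sum of the squared bias and the squared SD, which makes it easy to see whether a difference in relative RMSE is driven by bias or by variability.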

4.2. Results

In this section, we only present results for the distribution parameters for the second group. The findings for the other groups were very similar.
Table 1 contains the bias of the estimated factor mean $\mu_2$ for different estimation methods in Mplus and sirt. Overall, noticeable bias occurred for ε = 0.01 and p = 0.5. The bias decreased with increasing sample size but was still present for an infinite sample size. Moreover, note that the bias did not disappear with an increasing number of groups. Interestingly, bias was substantially reduced with the tuning parameter ε = 0.001, particularly for sample sizes of at least 1000. For three, six, or nine groups, the method meth = 1 in sirt performed best in terms of bias. In general, the bias of both Mplus methods FIXED and FREE was similar to that obtained from the four methods implemented in sirt. Interestingly, sirt’s method meth = 1 had issues with an increasing number of groups. For G = 12 and N = 250, there was a large bias in estimated factor means, which showed that meth = 1 failed for a large number of groups.
Table 2 shows the relative RMSE of the estimated factor mean $\mu_2$ in the second group. The FREE method in Mplus was slightly inferior to the FIXED method in Mplus for more than three groups. The tuning parameter ε = 0.001 outperformed ε = 0.01 in terms of relative RMSE. This observation was primarily an effect of the larger bias for ε = 0.01. The simulation study also highlighted that the SD for the different estimates was larger for ε = 0.001 than for ε = 0.01.
Table 3 presents the average absolute difference between the Mplus and sirt estimates of the factor mean in the second group. It can be seen that Mplus’ FIXED method was closest to the sirt method meth = 3. The differences to sirt’s meth = 1, which is the default in the R package sirt, were larger. Furthermore, the FREE method of Mplus turned out to perform most similarly to sirt’s meth = 4. However, the differences between the two methods were noticeable. Hence, it can be concluded that there is no equivalent implementation of the Mplus FREE method in the sirt package.
Table 4 shows the bias for the factor SD of the second group for p = 0.5. As for the factor mean, the tuning parameter ε = 0.001 had superior performance compared to ε = 0.01. For the SD, the Mplus methods FIXED and FREE as well as sirt’s meth = 3 and meth = 4 coincided. Overall, the sirt method meth = 1 was preferable for three or six groups, while its performance deteriorated for a larger number of groups. It should be emphasized that the bias did not even disappear in infinite sample sizes for ε = 0.01.
Table 5 presents the relative RMSE for the factor SD in the second group. The specifications with ε = 0.001 were generally preferable over ε = 0.01 in terms of RMSE. The Mplus and sirt methods performed very similarly. The bias issues of sirt’s meth = 1 for many groups (i.e., 9 or 12 groups) also translated into substantially increased RMSE values.
Table 6 displays the mean absolute difference for the estimate of the factor SD in the second group between Mplus and sirt. The Mplus method FIXED had a similar performance to the sirt meth = 3, while Mplus’ FREE method had comparable performance to sirt’s meth = 4.
To conclude, this simulation study demonstrated that the performance of IA estimates in Mplus can be similar to sirt if an appropriate estimation method meth in sirt is chosen. The default sirt method meth = 1 resulted in larger differences to Mplus. However, sirt’s meth = 1 can be preferred over Mplus and the other sirt methods for three or six groups but cannot be recommended for many groups (i.e., at least nine groups). Overall, the tuning parameter ε = 0.001 should be preferred over ε = 0.01 in terms of bias and RMSE.

5. Empirical Example: Asparouhov and Muthén (2014) Dataset

This empirical example uses a dataset that was previously also analyzed in [5,40,41]. The dataset came from the European Social Survey (ESS) conducted in the year 2005 (ESS 2005), which included subjects from 26 countries. The factor variable of tradition and conformity was assessed by four items presented in portrait format, where the scale of the items is such that a high value represents a low level of tradition and conformity. The wording of the four items was as follows (see [5]): It is important for him to be humble and modest. He tries not to draw attention to himself (item TR9); Tradition is important to him. He tries to follow the customs handed down by his religion or family (item TR20); He believes that people should do what they’re told. He thinks people should follow rules at all times, even when no one is watching (item CO7); and It is important for him to always behave properly. He wants to avoid doing anything people would say is wrong (item CO16). The dataset for this empirical example (and used in [5]) was downloaded from https://www.statmodel.com/Alignment.shtml (accessed on 17 February 2024).

5.1. Original Data

We analyzed the original ESS dataset but included only subjects with no missing values on the four items. The dataset used in this article can be found at https://osf.io/84ne5 (accessed on 17 February 2024). In the 26 countries, the sample sizes ranged between 1031 and 2963 persons with a mean of 1869.5 and an SD of 454.7. The IA method was applied with the specifications p = 0.5 and ε = 0.01 in Mplus and sirt. The same six estimation methods (i.e., FIXED and FREE in Mplus as well as meth = 1, meth = 2, meth = 3, and meth = 4 in sirt) were applied to the dataset.
Table 7 shows the estimated factor means and SDs for the 26 countries and the six estimation methods. It can be seen that sirt’s default meth = 1 provides implausible estimates in this example with many groups. However, the sirt methods meth = 2, meth = 3, and meth = 4 performed comparably to Mplus’ FIXED and FREE methods. It turned out that Mplus’ FIXED method was relatively close to sirt’s meth = 3 in terms of absolute differences in estimated factor means (M = 0.010, SD = 0.013, Min = 0.000, Max = 0.070). In addition, estimated factor means were also similar between the Mplus FIXED method and the sirt meth = 2 method (absolute differences: M = 0.012, SD = 0.014, Min = 0.000, Max = 0.068). Moreover, Mplus’ FREE method also performed similarly to sirt’s meth = 4 for estimated factor means (absolute differences: M = 0.010, SD = 0.016, Min = 0.000, Max = 0.086). There was also a close resemblance for estimated factor standard deviations between the Mplus FIXED and sirt meth = 3 methods (absolute differences: M = 0.007, SD = 0.006, Min = 0.000, Max = 0.020). However, the differences between the estimation methods FIXED and FREE in Mplus (or meth = 3 and meth = 4 in sirt) are noteworthy.

5.2. Pseudo-Datasets

In this section, the original ESS dataset is used to create pseudo-datasets that should provide more insights about the different behavior of the estimation methods implemented in Mplus and sirt. The first five countries from the original dataset with sample sizes of 1525, 1695, 2320, 1468, and 1031 subjects are used in the creation of the datasets. It is investigated whether the size of the estimates depends on the number of groups. To enable clean but idealized settings, we varied the number of included groups by replicating the original dataset accordingly. For example, with G = 10 groups, the first five groups were the original five countries, while groups six to ten were also the five countries but labeled as unique groups in the IA estimation. Usually, one would expect that the results of the first five groups should not change if the same dataset appears as duplications in the pseudo-dataset.
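The construction of such pseudo-datasets can be sketched in base R as follows. The function name make_pseudo, the column name group, the assumption that groups are coded as consecutive integers 1, …, G, and the toy data frame are hypothetical and do not reproduce the ESS data.

```r
# Stack n_rep copies of a dataset and give every copy new group labels
# (assumes groups are coded as consecutive integers 1, ..., G)
make_pseudo <- function(dat, n_rep, group_var = "group") {
  G <- length(unique(dat[[group_var]]))
  reps <- lapply(seq_len(n_rep), function(r) {
    d <- dat
    d[[group_var]] <- dat[[group_var]] + (r - 1) * G
    d
  })
  do.call(rbind, reps)
}

# Toy dataset with five groups and one item (placeholder values)
dat <- data.frame(group = rep(1:5, each = 2), I1 = seq_len(10))

pseudo <- make_pseudo(dat, n_rep = 2)   # G = 10 groups
table(pseudo$group)
```

In the stacked data, group 6 contains exactly the same observations as group 1, so any difference between the estimates of these two groups must stem from the estimation method and not from the data.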
Table 8 presents estimated factor means and SDs for the third and the sixth group in the pseudo-datasets involving G = 5, 10, 15, 20, 25, or 30 groups. Note that the sixth group coincided with the first group in the pseudo-datasets and the first country in the original dataset. The distribution parameter estimates were transformed such that the mean and the SD of the first group were 0 and 1, respectively.
The factor mean estimates changed as a function of the number of groups for the Mplus FIXED and all sirt methods. Only for the Mplus FREE method were the estimates invariant with respect to the number of groups. In particular, large differences in the estimates were observed when comparing results in a model with 25 and 30 groups. Because the first group had a (transformed) mean of 0, it would also be expected that Group 6 would have factor mean estimates of 0. However, this was not the case for the estimation methods, except for Mplus’ FREE and sirt’s meth = 4 methods. Overall, this pattern is surprising because it implies that the choice of the reference group (i.e., the first group in our case) and the number of groups strongly affect the estimates of factor means. For the SD, only sirt’s meth = 1 had estimates that depended on the number of groups.

6. Discussion

In this article, we compared the performance of IA estimates of the Mplus software and the R package sirt. There are two alternative identification constraints for estimating standard deviations ψ g . Mplus uses the product constraint g = 1 G ψ g = 1 , which is used in the sirt methods meth = 3 and meth = 4. However, one can alternatively fix the standard deviation of the first group to 1. This is the default in the R package sirt (i.e., meth = 1. The differences between Mplus and the IA function in the sirt package can primarily be traced back to the different identification constraints for standard deviations. The difference between Mplus and sirt can be made smaller by choosing meth = 3, which mimics the identification constraint used in Mplus. Notably, the latter method is preferred for a large(r) number of groups (say, more than eight), while the default of meth = 1 might be preferable for at most six groups. The simulation study and the empirical example demonstrated that the default meth = 1 in the sirt package does not provide trustworthy results, and users are strongly recommended switching to meth = 2 or meth = 3.
Overall, it turned out in the simulation study that the tuning parameter ε = 0.001 generally outperforms the default Mplus choice ε = 0.01. A previous study indicated that the choice of ε is more critical than the choice between the power p = 0.5 or p = 0.25 [15]. Minor reductions regarding bias can be obtained with the power p = 0.25 instead of p = 0.5. However, for reasonably large sample sizes (e.g., more than 500 subjects per group), an $L_0$ loss function [42] can even outperform the $L_p$ loss function for p = 0.5 or p = 0.25 [15].
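The role of ε can be made concrete with a short Python sketch. It assumes the smoothed power loss ρ(x) = (x² + ε)^(p/2), a differentiable approximation of |x|^p commonly used in IA. The function names are hypothetical, and the sketch is not taken from the Mplus or sirt source code.

```python
def lp_loss(x, p=0.5, eps=0.001):
    """Smoothed power loss (x^2 + eps)^(p/2); assumed form, approximates |x|^p."""
    return (x * x + eps) ** (p / 2)

def approximation_gap(x, p=0.5, eps=0.001):
    """Distance between the smoothed loss and the exact power loss |x|^p."""
    return lp_loss(x, p, eps) - abs(x) ** p

# Compare the two values of eps on a small grid of parameter differences.
grid = [0.0, 0.05, 0.1, 0.5]
gaps_eps_01 = [approximation_gap(x, eps=0.01) for x in grid]
gaps_eps_001 = [approximation_gap(x, eps=0.001) for x in grid]
```

Under this assumed form, the smaller ε follows |x|^p more closely near zero, so small parameter differences receive less artificial penalty from the smoothing. This is in line with the lower bias reported for ε = 0.001.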
Regardless of the use of a particular estimation method in Mplus or sirt, we wonder whether the optimization function of IA is suitable in the case of many groups. The pairwise differences between model parameters in the optimization might lead to less stable estimates than a linear model specification that does not involve pairwise differences. There is some evidence that Haberman linking with the $L_{0.5}$ IA loss function could be superior in the estimation of many groups (say, more than 20 groups) in IA (see [35]). Further research is needed to explore possible adaptations of the IA method in the case of many groups.
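To show where the pairwise differences enter, the following Python sketch evaluates a simplified, unweighted alignment objective. It assumes the usual alignment transformation of configural loadings and intercepts (λ = λ₀/ψ and ν = ν₀ − αλ₀/ψ) and the smoothed power loss. All names and numbers are hypothetical, and the group-size weights used in practice are omitted.

```python
from itertools import combinations

def rho(x, p=0.5, eps=0.001):
    """Smoothed power loss (assumed form)."""
    return (x * x + eps) ** (p / 2)

def alignment_objective(lam0, nu0, alpha, psi, p=0.5, eps=0.001):
    """Unweighted sum of losses over all group pairs and items.

    lam0[g][i], nu0[g][i]: configural loadings and intercepts of item i in group g.
    alpha[g], psi[g]: candidate factor mean and factor SD of group g.
    """
    n_groups = len(lam0)
    lam = [[l / psi[g] for l in lam0[g]] for g in range(n_groups)]
    nu = [[n - alpha[g] * l / psi[g] for n, l in zip(nu0[g], lam0[g])]
          for g in range(n_groups)]
    total = 0.0
    for g, h in combinations(range(n_groups), 2):
        for i in range(len(lam0[g])):
            total += rho(lam[g][i] - lam[h][i], p, eps)
            total += rho(nu[g][i] - nu[h][i], p, eps)
    return total

def n_pairs(n_groups):
    """Number of group pairs entering the objective."""
    return n_groups * (n_groups - 1) // 2

# Hypothetical fully invariant two-group example with two items.
lam_true, nu_true = [1.0, 0.8], [0.0, 0.5]
alpha, psi = [0.0, 0.3], [1.0, 1.2]
lam0 = [[l * psi[g] for l in lam_true] for g in range(2)]
nu0 = [[n + l * alpha[g] for n, l in zip(nu_true, lam_true)] for g in range(2)]

obj_true = alignment_objective(lam0, nu0, alpha, psi)
obj_wrong = alignment_objective(lam0, nu0, [0.0, 0.0], [1.0, 1.0])
```

In this sketch, the number of pairwise terms grows quadratically with the number of groups (435 pairs for 30 groups), which is one way to see why the many-group case might behave differently from a linear specification.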
In this article, we only examined estimation differences between Mplus and sirt for normally distributed data. It can be expected that estimation differences due to different identification constraints would similarly be present for ordinal data [6], because IA for ordinal data only changes the model input: it uses item loadings and item thresholds from item response theory models instead of item loadings and item intercepts from a one-dimensional factor model based on the multivariate normal distribution.
The IA method can provide consistent estimation of factor means and standard deviations if there is a sparse pattern of parameters that are noninvariant across groups. It is debatable whether such a sparse pattern of noninvariant effects can be theoretically assumed in empirical datasets [43,44]. However, if researchers believe in such a sparsity assumption, IA can be deemed an effective data-driven method.
The simulation study conducted in this article assumed a sparse structure of noninvariant parameters. It could be that the differences between Mplus and the IA function in the sirt package were larger under different data-generating models. Future research could further investigate the software differences for more data-generating models and could also involve scenarios of a large number of groups.
As a cautionary remark, we would like to add that enough implementation details must appear in publications for commercial black-box software like Mplus to enable independent judgment, evaluation, and reimplementation of existing methods. We believe that undocumented or sparsely documented modeling approaches in commercial software, like the IA method in Mplus, should not be used in substantive and methodological publications because such use fundamentally contradicts the principles of open science.

Funding

This research received no external funding.

Data Availability Statement

Datasets and R code are available at https://osf.io/84ne5 (accessed on 17 February 2024).

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CFA     confirmatory factor analysis
DGM     data-generating model
IA      invariance alignment
RMSE    root mean square error
SD      standard deviation

References

  1. Meredith, W. Measurement invariance, factor analysis and factorial invariance. Psychometrika 1993, 58, 525–543.
  2. Mellenbergh, G.J. Item bias and item response theory. Int. J. Educ. Res. 1989, 13, 127–143.
  3. Millsap, R.E. Statistical Approaches to Measurement Invariance; Routledge: New York, NY, USA, 2011.
  4. van de Vijver, F.J.R. Invariance Analyses in Large-Scale Studies; OECD: Paris, France, 2019.
  5. Asparouhov, T.; Muthén, B. Multiple-group factor analysis alignment. Struct. Equ. Model. Multidiscip. J. 2014, 21, 495–508.
  6. Muthén, B.; Asparouhov, T. IRT studies of many groups: The alignment method. Front. Psychol. 2014, 5, 978.
  7. Cieciuch, J.; Davidov, E.; Schmidt, P. Alignment optimization. Estimation of the most trustworthy means in cross-cultural studies even in the presence of noninvariance. In Cross-Cultural Analysis: Methods and Applications; Davidov, E., Schmidt, P., Billiet, J., Eds.; Routledge: London, UK, 2018; pp. 571–592.
  8. Pokropek, A.; Davidov, E.; Schmidt, P. A Monte Carlo simulation study to assess the appropriateness of traditional and newer approaches to test for measurement invariance. Struct. Equ. Model. Multidiscip. J. 2019, 26, 724–744.
  9. Leitgöb, H.; Seddig, D.; Asparouhov, T.; Behr, D.; Davidov, E.; De Roover, K.; Jak, S.; Meitinger, K.; Menold, N.; Muthén, B.; et al. Measurement invariance in the social sciences: Historical development, methodological challenges, state of the art, and future perspectives. Soc. Sci. Res. 2023, 110, 102805.
  10. Luong, R.; Flake, J.K. Measurement invariance testing using confirmatory factor analysis and alignment optimization: A tutorial for transparent analysis planning and reporting. Psychol. Methods 2023, 28, 905–924.
  11. Sideridis, G.; Alghamdi, M.H. Bullying in middle school: Evidence for a multidimensional structure and measurement invariance across gender. Children 2023, 10, 873.
  12. Tsaousis, I.; Jaffari, F.M. Identifying bias in social and health research: Measurement invariance and latent mean differences using the alignment approach. Mathematics 2023, 11, 4007.
  13. Pokropek, A.; Lüdtke, O.; Robitzsch, A. An extension of the invariance alignment method for scale linking. Psychol. Test Assess. Model. 2020, 62, 303–334. Available online: https://bit.ly/2UEp9GH (accessed on 17 February 2024).
  14. Mansolf, M.; Vreeker, A.; Reise, S.P.; Freimer, N.B.; Glahn, D.C.; Gur, R.E.; Moore, T.M.; Pato, C.N.; Pato, M.T.; Palotie, A.; et al. Extensions of multiple-group item response theory alignment: Application to psychiatric phenotypes in an international genomics consortium. Educ. Psychol. Meas. 2020, 80, 870–909.
  15. Robitzsch, A. Implementation aspects in invariance alignment. Stats 2023, 6, 1160–1178.
  16. Muthén, L.; Muthén, B. Mplus User’s Guide, version 8.9, 1998–2023; Muthén & Muthén: Los Angeles, CA, USA, 2023.
  17. Kim, E.S.; Cao, C.; Wang, Y.; Nguyen, D.T. Measurement invariance testing with many groups: A comparison of five approaches. Struct. Equ. Model. Multidiscip. J. 2017, 24, 524–544.
  18. Lai, M.H.C.; Liu, Y.; Tse, W.W.Y. Adjusting for partial invariance in latent parameter estimation: Comparing forward specification search and approximate invariance methods. Behav. Res. Methods 2022, 54, 414–434.
  19. Muthén, B.; Asparouhov, T. Recent methods for the study of measurement invariance with many groups: Alignment and random effects. Sociol. Methods Res. 2018, 47, 637–664.
  20. DeMars, C.E. Alignment as an alternative to anchor purification in DIF analyses. Struct. Equ. Model. Multidiscip. J. 2020, 27, 56–72.
  21. Finch, W.H. Detection of differential item functioning for more than two groups: A Monte Carlo comparison of methods. Appl. Meas. Educ. 2016, 29, 30–45.
  22. Flake, J.K.; McCoach, D.B. An investigation of the alignment method with polytomous indicators under conditions of partial measurement invariance. Struct. Equ. Model. Multidiscip. J. 2018, 25, 56–70.
  23. Byrne, B.M.; van de Vijver, F.J.R. The maximum likelihood alignment approach to testing for approximate measurement invariance: A paradigmatic cross-cultural application. Psicothema 2017, 29, 539–551.
  24. Marsh, H.W.; Guo, J.; Parker, P.D.; Nagengast, B.; Asparouhov, T.; Muthén, B.; Dicke, T. What to do when scalar invariance fails: The extended alignment method for multi-group factor analysis comparison of latent means across many groups. Psychol. Methods 2018, 23, 524–545.
  25. Kim, E.; Cao, C.; Liu, S.; Wang, Y.; Dedrick, R. Testing measurement invariance over time with intensive longitudinal data and identifying a source of non-invariance. Struct. Equ. Model. Multidiscip. J. 2023, 30, 393–411.
  26. Lai, M.H.C. Adjusting for measurement noninvariance with alignment in growth modeling. Multivar. Behav. Res. 2023, 58, 30–47.
  27. Winter, S.D.; Depaoli, S. An illustration of Bayesian approximate measurement invariance with longitudinal data and a small sample size. Int. J. Behav. Dev. 2020, 44, 371–382.
  28. Asparouhov, T.; Muthén, B. Penalized Structural Equation Models. Struct. Equ. Model. Multidiscip. J. 2023.
  29. Robitzsch, A. sirt: Supplementary Item Response Theory Models. R Package Version 4.1-15. 2024. Available online: https://CRAN.R-project.org/package=sirt (accessed on 6 February 2024).
  30. Bartholomew, D.J.; Knott, M.; Moustaki, I. Latent Variable Models and Factor Analysis: A Unified Approach; Wiley: New York, NY, USA, 2011.
  31. Holland, P.W.; Wainer, H. (Eds.) Differential Item Functioning: Theory and Practice; Lawrence Erlbaum: Hillsdale, NJ, USA, 1993.
  32. van de Schoot, R.; Kluytmans, A.; Tummers, L.; Lugtig, P.; Hox, J.; Muthén, B. Facing off with scylla and charybdis: A comparison of scalar, partial, and the novel possibility of approximate measurement invariance. Front. Psychol. 2013, 4, 770.
  33. Byrne, B.M.; Shavelson, R.J.; Muthén, B. Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychol. Bull. 1989, 105, 456–466.
  34. Haberman, S.J. Linking Parameter Estimates Derived from an Item Response Model through Separate Calibrations; Research Report No. RR-09-40; Educational Testing Service: Princeton, NJ, USA, 2009.
  35. Robitzsch, A. Lp loss functions in invariance alignment and Haberman linking with few or many groups. Stats 2020, 3, 246–283.
  36. R Core Team. R: A Language and Environment for Statistical Computing. 2023. Available online: https://www.R-project.org/ (accessed on 15 March 2023).
  37. Rudnev, M. Alignment Method for Measurement Invariance: Tutorial. Internet Blog Entry. 2019. Available online: http://tinyurl.com/mry3vw99 (accessed on 17 February 2024).
  38. Fischer, R.; Karl, J.A. A primer to (cross-cultural) multi-group invariance testing possibilities in R. Front. Psychol. 2019, 10, 1507.
  39. Han, H. Using measurement alignment in research on adolescence involving multiple groups: A brief tutorial with R. J. Res. Adolesc. 2023, 34, 235–242.
  40. Knoppen, D.; Saris, W. Do we have to combine values in the Schwartz’ human values scale? A comment on the Davidov studies. Surv. Res. Methods 2009, 3, 91–103.
  41. Beierlein, C.; Davidov, E.; Schmidt, P.; Schwartz, S.H.; Rammstedt, B. Testing the discriminant validity of Schwartz’ portrait value questionnaire items—A replication and extension of Knoppen and Saris (2009). Surv. Res. Methods 2012, 6, 25–36.
  42. O’Neill, M.; Burke, K. Variable selection using a smooth information criterion for distributional regression models. Stat. Comput. 2023, 33, 71.
  43. Robitzsch, A. Model-robust estimation of multiple-group structural equation models. Algorithms 2023, 16, 210.
  44. Robitzsch, A.; Lüdtke, O. Why full, partial, or approximate measurement invariance are not a prerequisite for meaningful and valid group comparisons. Struct. Equ. Model. Multidiscip. J. 2023, 30, 859–870.
Table 1. Simulation Study: Bias of the estimated factor mean in the second group $\mu_2$ for different estimation methods in Mplus and sirt for p = 0.5 as a function of the invariance alignment parameter ε, sample size per group N, and number of groups G.
          ε = 0.01                                          ε = 0.001
          Mplus           sirt, meth=                       Mplus           sirt, meth=
G   N     FIXED   FREE    1       2       3       4         FIXED   FREE    1       2       3       4
3   250   −0.075  −0.073  −0.062  −0.071  −0.071  −0.070    −0.048  −0.048  −0.042  −0.050  −0.050  −0.049
3   500   −0.059  −0.056  −0.050  −0.057  −0.056  −0.052    −0.032  −0.028  −0.029  −0.033  −0.033  −0.027
3   1000  −0.046  −0.045  −0.037  −0.043  −0.042  −0.041    −0.017  −0.016  −0.014  −0.017  −0.017  −0.016
3   2500  −0.041  −0.040  −0.033  −0.038  −0.037  −0.036    −0.012  −0.011  −0.009  −0.011  −0.011  −0.010
3   Inf   −0.037  −0.037  −0.030  −0.034  −0.033  −0.033    −0.006  −0.006  −0.005  −0.006  −0.006  −0.006
6   250   −0.073  −0.069  −0.052  −0.070  −0.070  −0.066    −0.046  −0.045  −0.033  −0.046  −0.047  −0.045
6   500   −0.065  −0.064  −0.051  −0.063  −0.063  −0.060    −0.037  −0.036  −0.029  −0.036  −0.037  −0.036
6   1000  −0.048  −0.048  −0.035  −0.046  −0.045  −0.044    −0.020  −0.020  −0.015  −0.020  −0.020  −0.019
6   2500  −0.040  −0.040  −0.029  −0.038  −0.037  −0.037    −0.012  −0.011  −0.008  −0.011  −0.011  −0.011
6   Inf   −0.036  −0.037  −0.027  −0.034  −0.033  −0.033    −0.006  −0.006  −0.005  −0.006  −0.006  −0.006
9   250   −0.075  −0.070  −0.048  −0.075  −0.075  −0.069    −0.047  −0.043  −0.028  −0.050  −0.050  −0.046
9   500   −0.063  −0.061  −0.043  −0.062  −0.061  −0.058    −0.037  −0.033  −0.023  −0.035  −0.036  −0.033
9   1000  −0.050  −0.049  −0.033  −0.048  −0.048  −0.046    −0.022  −0.020  −0.014  −0.022  −0.022  −0.020
9   2500  −0.041  −0.042  −0.027  −0.040  −0.039  −0.039    −0.014  −0.013  −0.009  −0.014  −0.013  −0.013
9   Inf   −0.035  −0.037  −0.023  −0.034  −0.033  −0.033    −0.006  −0.006  −0.004  −0.006  −0.006  −0.006
12  250   −0.086  −0.081   0.320  −0.086  −0.086  −0.078    −0.059  −0.053   3.014  −0.060  −0.060  −0.055
12  500   −0.054  −0.053  −0.024  −0.054  −0.053  −0.050    −0.027  −0.024  −0.009  −0.027  −0.028  −0.024
12  1000  −0.048  −0.048  −0.025  −0.048  −0.047  −0.045    −0.021  −0.019  −0.010  −0.021  −0.021  −0.019
12  2500  −0.041  −0.043  −0.023  −0.041  −0.040  −0.040    −0.014  −0.014  −0.008  −0.014  −0.014  −0.014
12  Inf   −0.034  −0.037  −0.019  −0.034  −0.033  −0.033    −0.006  −0.006  −0.003  −0.006  −0.006  −0.006
Note: Inf = infinite sample size (i.e., using population parameters); Absolute biases larger than 0.03 are printed in bold font. After a linear transformation of the obtained parameter estimates, the first group had a factor mean of 0 and a factor standard deviation of 1.
Table 2. Simulation Study: Relative root mean square error of the estimated factor mean in the second group $\mu_2$ for different estimation methods in Mplus and sirt for p = 0.5 as a function of the invariance alignment parameter ε, sample size per group N, and number of groups G.
          ε = 0.01                                          ε = 0.001
          Mplus           sirt, meth=                       Mplus           sirt, meth=
G   N     FIXED   FREE    1       2       3       4         FIXED   FREE    1       2       3       4
3   250   100     100.3   98.4    98.9    98.9    98.9      94.8    97.2    96.6    96.1    96.2    97.6
3   500   100     98.3    97.2    99.1    99.0    95.4      92.0    89.7    93.7    94.3    94.4    89.3
3   1000  100     99.4    93.7    96.9    96.4    94.9      84.3    84.5    84.0    83.9    83.9    82.3
3   2500  100     99.1    90.1    95.8    94.8    93.2      71.4    70.4    71.0    71.5    71.5    69.4
6   250   100     101.0   99.2    100.6   100.5   99.7      95.9    95.9    99.3    98.5    98.3    96.2
6   500   100     100.8   95.2    99.8    99.5    98.6      86.3    87.0    86.7    87.9    88.0    86.8
6   1000  100     101.6   91.3    98.1    97.6    96.9      82.4    82.6    81.1    82.2    82.2    81.0
6   2500  100     101.5   87.5    97.5    96.5    96.1      70.0    70.0    70.1    71.0    71.0    69.8
9   250   100     100.3   97.8    100.8   100.7   99.5      91.8    91.5    95.6    94.2    94.0    96.2
9   500   100     102.8   94.7    100.1   100.0   100.4     89.1    90.9    89.9    90.6    90.6    90.7
9   1000  100     101.4   89.3    99.1    98.5    97.9      80.4    80.9    79.9    81.3    81.3    80.6
9   2500  100     102.7   85.6    98.7    97.7    97.9      72.0    73.1    71.8    73.0    72.9    72.9
12  250   100     102.2   734.9   100.6   100.4   100.3     91.9    94.5    5980    93.6    93.5    94.8
12  500   100     105.0   96.3    101.4   101.2   103.0     88.9    93.1    92.6    91.8    91.7    93.7
12  1000  100     105.0   89.1    100.7   100.1   101.7     82.5    86.3    83.2    84.3    84.2    86.2
12  2500  100     104.9   80.6    99.9    98.9    100.1     69.8    72.4    69.0    71.2    71.1    72.4
Note: The reference method for the computation of the relative RMSE (fixed at 100) was “Mplus, FIXED” with p = 0.5 and ε = 0.01. Relative RMSE values smaller than 100 are printed in bold font. After a linear transformation of the obtained parameter estimates, the first group had a factor mean of 0 and a factor standard deviation of 1.
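As a reading aid for the bias and relative RMSE tables, the following Python sketch shows how such summaries are typically computed from simulation replications. It assumes that the relative RMSE is 100 times the ratio of a method's RMSE to the RMSE of the reference method. The function names and the replicated estimates are hypothetical and are not taken from the simulation study.

```python
import math

def bias(estimates, true_value):
    """Average deviation of the estimates from the true parameter value."""
    return sum(estimates) / len(estimates) - true_value

def rmse(estimates, true_value):
    """Root mean square error of the estimates around the true value."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))

def relative_rmse(estimates, reference_estimates, true_value):
    """RMSE of a method as a percentage of the RMSE of the reference method."""
    return 100.0 * rmse(estimates, true_value) / rmse(reference_estimates, true_value)

# Hypothetical replicated estimates of a factor mean with true value 0.3.
true_mu2 = 0.3
reference = [0.22, 0.25, 0.28, 0.31, 0.24]
candidate = [0.27, 0.29, 0.31, 0.33, 0.28]

rel = relative_rmse(candidate, reference, true_mu2)
```

Under this definition, values below 100 indicate a smaller RMSE than the reference method, and the reference method itself is fixed at 100.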
Table 3. Simulation Study: Mean absolute difference between different Mplus and sirt estimation methods of the estimated factor mean in the second group $\mu_2$ for p = 0.5 and ε = 0.01 as a function of sample size per group N and number of groups G.
          Mplus, FIXED and sirt, meth       Mplus, FIXED and sirt, meth
G   N     1       2       3       4         1       2       3       4
3   250   0.0174  0.0111  0.0113  0.0152    0.0200  0.0160  0.0161  0.0109
3   500   0.0123  0.0085  0.0087  0.0101    0.0135  0.0109  0.0110  0.0078
3   1000  0.0100  0.0064  0.0067  0.0067    0.0101  0.0072  0.0074  0.0061
3   2500  0.0083  0.0043  0.0049  0.0050    0.0081  0.0044  0.0049  0.0044
6   250   0.0234  0.0045  0.0025  0.0280    0.0411  0.0435  0.0415  0.0110
6   500   0.0159  0.0059  0.0046  0.0715    0.0847  0.0903  0.0877  0.0145
6   1000  0.0126  0.0050  0.0031  0.0651    0.0791  0.0831  0.0808  0.0134
6   2500  0.0109  0.0046  0.0026  0.0490    0.0625  0.0657  0.0636  0.0122
9   250   0.0293  0.0045  0.0025  0.0365    0.0502  0.0529  0.0509  0.0119
9   500   0.0218  0.0045  0.0025  0.0276    0.0411  0.0435  0.0415  0.0114
9   1000  0.0168  0.0061  0.0041  0.0728    0.0881  0.0939  0.0914  0.0153
9   2500  0.0140  0.0051  0.0029  0.0616    0.0766  0.0806  0.0784  0.0140
12  250   0.4073  0.0093  0.0093  0.0214    0.4028  0.0171  0.0169  0.0106
12  500   0.0302  0.0070  0.0071  0.0135    0.0298  0.0099  0.0098  0.0075
12  1000  0.0221  0.0053  0.0054  0.0097    0.0228  0.0070  0.0070  0.0055
12  2500  0.0182  0.0030  0.0032  0.0053    0.0198  0.0044  0.0046  0.0040
Note: After a linear transformation of the obtained parameter estimates, the first group had a factor mean of 0 and a factor standard deviation of 1.
Table 4. Simulation Study: Bias of the estimated factor standard deviation in the second group $\sigma_2$ for different estimation methods in Mplus and sirt for p = 0.5 as a function of the invariance alignment parameter ε, sample size per group N, and number of groups G.
          ε = 0.01                                          ε = 0.001
          Mplus           sirt, meth=                       Mplus           sirt, meth=
G   N     FIXED   FREE    1       2       3       4         FIXED   FREE    1       2       3       4
3   250   −0.064  −0.064  −0.020  −0.067  −0.066  −0.066    −0.037  −0.037  −0.007  −0.041  −0.042  −0.042
3   500   −0.058  −0.058  −0.028  −0.062  −0.059  −0.059    −0.032  −0.032  −0.015  −0.035  −0.036  −0.036
3   1000  −0.046  −0.046  −0.023  −0.050  −0.047  −0.047    −0.020  −0.020  −0.010  −0.022  −0.022  −0.022
3   2500  −0.041  −0.041  −0.022  −0.045  −0.041  −0.041    −0.015  −0.015  −0.008  −0.015  −0.015  −0.015
3   Inf   −0.033  −0.033  −0.018  −0.038  −0.033  −0.033    −0.006  −0.006  −0.003  −0.006  −0.006  −0.006
6   250   −0.071  −0.071   0.014  −0.074  −0.073  −0.073    −0.043  −0.043   0.020  −0.044  −0.045  −0.045
6   500   −0.053  −0.053   0.003  −0.057  −0.054  −0.054    −0.026  −0.026   0.008  −0.026  −0.027  −0.027
6   1000  −0.049  −0.049  −0.006  −0.053  −0.050  −0.050    −0.023  −0.023  −0.002  −0.024  −0.024  −0.024
6   2500  −0.037  −0.037  −0.002  −0.041  −0.038  −0.038    −0.011  −0.011   0.002  −0.011  −0.011  −0.011
6   Inf   −0.033  −0.033  −0.005  −0.038  −0.033  −0.033    −0.006  −0.006  −0.001  −0.006  −0.006  −0.006
9   250   −0.067  −0.067   0.069  −0.071  −0.070  −0.070    −0.040  −0.040   0.061  −0.042  −0.042  −0.042
9   500   −0.049  −0.049   0.042  −0.053  −0.051  −0.051    −0.021  −0.021   0.034  −0.023  −0.023  −0.023
9   1000  −0.042  −0.042   0.026  −0.046  −0.042  −0.042    −0.015  −0.015   0.018  −0.016  −0.016  −0.016
9   2500  −0.038  −0.038   0.016  −0.042  −0.038  −0.038    −0.011  −0.011   0.008  −0.012  −0.012  −0.012
9   Inf   −0.033  −0.033   0.010  −0.038  −0.033  −0.033    −0.006  −0.006   0.002  −0.006  −0.006  −0.006
12  250   −0.062  −0.062   2.041  −0.067  −0.065  −0.065    −0.036  −0.036   14.39  −0.037  −0.038  −0.038
12  500   −0.047  −0.047   0.090  −0.051  −0.049  −0.049    −0.020  −0.020   0.059  −0.021  −0.021  −0.021
12  1000  −0.043  −0.043   0.055  −0.048  −0.044  −0.044    −0.017  −0.017   0.030  −0.018  −0.018  −0.018
12  2500  −0.039  −0.039   0.038  −0.043  −0.039  −0.039    −0.012  −0.012   0.014  −0.013  −0.013  −0.013
12  Inf   −0.033  −0.033   0.028  −0.038  −0.033  −0.033    −0.006  −0.006   0.004  −0.006  −0.006  −0.006
Note: Inf = infinite sample size (i.e., using population parameters); Absolute biases larger than 0.03 are printed in bold font. After a linear transformation of the obtained parameter estimates, the first group had a factor mean of 0 and a factor standard deviation of 1.
Table 5. Simulation Study: Relative root mean square error of the estimated factor standard deviation in the second group $\sigma_2$ for different estimation methods in Mplus and sirt for p = 0.5 as a function of the invariance alignment parameter ε, sample size per group N, and number of groups G.
          ε = 0.01                                          ε = 0.001
          Mplus           sirt, meth=                       Mplus           sirt, meth=
G   N     FIXED   FREE    1       2       3       4         FIXED   FREE    1       2       3       4
3   250   100     100.0   87.5    102.8   102.2   102.3     97.5    97.4    92.6    99.1    99.1    99.1
3   500   100     100.0   82.8    104.1   102.4   102.4     90.5    90.5    83.1    93.6    93.8    93.8
3   1000  100     100.0   83.2    104.8   101.5   101.5     83.9    83.9    80.5    85.1    85.4    85.4
3   2500  100     100.0   75.3    107.2   101.3   101.3     70.3    70.3    66.4    71.4    71.3    71.3
6   250   100     100.0   88.9    102.9   101.6   101.6     93.5    93.5    92.8    94.8    95.1    95.1
6   500   100     100.0   80.8    102.7   100.7   100.7     88.2    88.2    84.3    88.1    88.2    88.2
6   1000  100     100.0   73.5    104.6   101.5   101.5     80.4    80.4    75.0    81.2    81.1    81.1
6   2500  100     100.0   64.8    107.8   101.8   101.8     68.4    68.4    65.7    69.7    69.5    69.5
9   250   100     100.0   111.6   102.8   101.4   101.4     93.5    93.5    113.3   95.1    94.5    94.5
9   500   100     100.0   104.4   103.4   101.4   101.4     87.1    87.1    101.3   88.4    88.1    88.1
9   1000  100     100.0   89.2    104.7   101.0   101.0     80.0    80.0    83.7    80.7    80.6    80.6
9   2500  100     100.0   74.5    107.2   100.6   100.6     68.4    68.4    68.0    69.4    68.5    68.5
12  250   100     100.0   4125    102.9   101.5   101.5     94.2    94.2    30,256  95.8    95.3    95.3
12  500   100     100.0   151.4   103.3   101.1   101.2     87.5    87.5    122.6   88.0    88.1    88.1
12  1000  100     100.0   121.0   106.0   101.3   101.3     79.1    79.1    93.2    79.9    79.4    79.4
12  2500  100     100.0   103.1   107.1   101.2   101.3     67.0    67.0    70.7    67.3    67.1    67.1
Note: The reference method for the computation of the relative RMSE (fixed at 100) was “Mplus, FIXED” with p = 0.5 and ε = 0.01. Relative RMSE values smaller than 100 are printed in bold font. After a linear transformation of the obtained parameter estimates, the first group had a factor mean of 0 and a factor standard deviation of 1.
Table 6. Simulation Study: Mean absolute difference between different Mplus and sirt estimation methods of the estimated factor standard deviation in the second group $\sigma_2$ for p = 0.5 and ε = 0.01 as a function of sample size per group N and number of groups G.
          Mplus, FIXED and sirt, meth       Mplus, FIXED and sirt, meth
G   N     1       2       3       4         1       2       3       4
3   250   0.0438  0.0084  0.0076  0.0076    0.0438  0.0084  0.0076  0.0076
3   500   0.0300  0.0064  0.0051  0.0051    0.0300  0.0064  0.0051  0.0051
3   1000  0.0222  0.0054  0.0037  0.0037    0.0222  0.0054  0.0037  0.0037
3   2500  0.0186  0.0046  0.0024  0.0024    0.0186  0.0046  0.0024  0.0024
6   250   0.0850  0.0045  0.0025  0.0280    0.0411  0.0435  0.0415  0.0110
6   500   0.0565  0.0059  0.0046  0.0715    0.0847  0.0903  0.0877  0.0145
6   1000  0.0431  0.0050  0.0031  0.0651    0.0791  0.0831  0.0808  0.0134
6   2500  0.0345  0.0046  0.0026  0.0490    0.0625  0.0657  0.0636  0.0122
9   250   0.1362  0.0045  0.0025  0.0365    0.0502  0.0529  0.0509  0.0119
9   500   0.0907  0.0045  0.0025  0.0276    0.0411  0.0435  0.0415  0.0114
9   1000  0.0680  0.0061  0.0041  0.0728    0.0881  0.0939  0.0914  0.0153
9   2500  0.0537  0.0051  0.0029  0.0616    0.0766  0.0806  0.0784  0.0140
12  250   2.1033  0.0092  0.0069  0.0069    2.1033  0.0092  0.0069  0.0069
12  500   0.1374  0.0065  0.0045  0.0045    0.1374  0.0065  0.0045  0.0045
12  1000  0.0984  0.0057  0.0035  0.0035    0.0984  0.0057  0.0035  0.0035
12  2500  0.0766  0.0046  0.0023  0.0023    0.0766  0.0046  0.0023  0.0023
Note: After a linear transformation of the obtained parameter estimates, the first group had a factor mean of 0 and a factor standard deviation of 1.
Table 7. Empirical Example, Original Data: Factor mean and factor standard deviation estimates of invariance alignment for 26 countries estimated with Mplus and sirt.
                Mean                                              Standard Deviation
                Mplus           sirt, meth=                       Mplus           sirt, meth=
Country   N     FIXED   FREE    1       2       3       4         FIXED   FREE    1       2       3       4
1         1525   0       0       0       0       0       0        1       1       1       1       1       1
2         1695   0.079   0.026   0.470   0.074   0.074   0.036    0.994   0.994   6.270   0.989   0.993   0.992
3         2320  −0.432  −0.474  −2.958  −0.438  −0.443  −0.474    1.109   1.109   7.396   1.096   1.107   1.107
4         1468   0.263   0.205   1.671   0.260   0.262   0.220    1.086   1.086   6.973   1.085   1.095   1.095
5         1031  −0.579  −0.635  −3.841  −0.588  −0.589  −0.626    0.987   0.987   6.456   0.988   0.990   0.989
6         2296   0.134   0.085   0.786   0.121   0.121   0.085    1.081   1.081   7.002   1.078   1.076   1.076
7         2963   0.316   0.268   2.007   0.305   0.309   0.274    1.135   1.135   7.388   1.123   1.137   1.137
8         1550   0.168   0.105   1.030   0.159   0.164   0.121    1.098   1.098   6.989   1.076   1.112   1.112
9         1793   0.152   0.107   0.930   0.143   0.143   0.108    0.989   0.989   6.447   0.992   0.989   0.989
10        1857  −0.245  −0.293  −1.605  −0.248  −0.251  −0.287    0.992   0.992   6.336   0.979   0.989   0.989
11        1630   0.353   0.311   2.213   0.335   0.336   0.302    1.170   1.170   7.677   1.161   1.164   1.164
12        1703   0.311   0.255   1.942   0.305   0.309   0.266    1.210   1.210   7.515   1.179   1.197   1.197
13        2356   0.106   0.060   0.636   0.099   0.101   0.065    1.145   1.145   7.286   1.128   1.157   1.157
14        2622  −0.424  −0.463  −2.783  −0.425  −0.423  −0.451    1.083   1.083   7.060   1.078   1.072   1.072
15        1562  −0.149  −0.185  −1.052  −0.158  −0.157  −0.185    1.138   1.138   7.475   1.127   1.118   1.118
16        1450  −0.232  −0.278  −1.554  −0.231  −0.231  −0.266    1.105   1.105   7.319   1.091   1.089   1.089
17        2361   0.083   0.031   0.495   0.077   0.076   0.039    1.374   1.374   8.890   1.377   1.369   1.369
18        2166  −0.303  −0.360  −2.090  −0.320  −0.323  −0.361    1.100   1.100   7.145   1.095   1.105   1.105
19        1770   0.334   0.287   2.097   0.327   0.325   0.289    1.065   1.065   6.833   1.066   1.059   1.059
20        1685  −0.288  −0.325  −2.009  −0.305  −0.302  −0.332    1.031   1.031   6.748   1.024   1.016   1.016
21        2120   0.283   0.207   2.376   0.351   0.353   0.293    0.971   0.971   6.501   0.960   0.964   0.964
22        2471  −0.080  −0.130  −0.582  −0.088  −0.088  −0.123    1.088   1.088   7.183   1.082   1.082   1.082
23        1439   0.878   0.822   5.334   0.835   0.877   0.833    1.136   1.136   6.950   1.088   1.143   1.140
24        1358  −0.397  −0.454  −2.669  −0.402  −0.405  −0.444    0.917   0.917   6.033   0.909   0.915   0.914
25        1783  −0.330  −0.377  −2.274  −0.326  −0.333  −0.368    0.997   0.997   6.669   0.955   0.977   0.977
26        1632   0.159   0.110   0.909   0.142   0.140   0.103    1.241   1.241   8.001   1.245   1.234   1.234
Note: N = sample size per country. Table entries with absolute differences smaller than 0.01 between the methods “Mplus, FIXED” and “sirt, meth = 3” are displayed in a gray-colored background. Table entries with absolute differences smaller than 0.01 between the methods “Mplus, FREE” and “sirt, meth = 4” are displayed in a yellow-colored background. After a linear transformation of the obtained parameter estimates, the first country had a factor mean of 0 and a factor standard deviation of 1.
Table 8. Empirical Example, Pseudo-Datasets: Factor mean and factor standard deviation estimates of invariance alignment for Groups 3 and 6 estimated with Mplus and sirt.
              Mean                                              Standard Deviation
              Mplus           sirt, meth=                       Mplus           sirt, meth=
Group   G     FIXED   FREE    1       2       3       4         FIXED   FREE    1       2       3       4
3       5     −0.371  −0.510  −0.422  −0.393  −0.408  −0.470    1.045   1.045   1.115   1.041   1.078   1.078
3       10    −0.358  −0.510  −0.416  −0.375  −0.389  −0.462    1.045   1.045   1.153   1.041   1.078   1.079
3       15    −0.345  −0.510  −0.412  −0.357  −0.370  −0.455    1.045   1.045   1.201   1.041   1.078   1.079
3       20    −0.330  −0.510  −0.413  −0.337  −0.349  −0.449    1.045   1.045   1.277   1.040   1.078   1.079
3       25    −0.310  −0.510  −0.441  −0.314  −0.325  −0.445    1.045   1.045   1.462   1.040   1.078   1.079
3       30    −0.113  −0.510  −0.620  −0.286  −0.296  −0.441    1.045   1.045   2.258   1.041   1.078   1.080
6       10     0.027   0.000   0.042   0.040   0.040   0.000    1.000   1.000   1.063   1.000   1.000   1.000
6       15     0.042   0.000   0.067   0.060   0.060   0.000    1.000   1.000   1.105   1.000   1.000   1.000
6       20     0.059   0.000   0.096   0.082   0.082   0.000    1.000   1.000   1.171   1.000   1.000   1.000
6       25     0.082   0.000   0.142   0.106   0.106   0.000    1.000   1.000   1.334   1.000   1.000   1.000
6       30     0.290   0.000   0.277   0.137   0.137   0.000    1.000   1.000   2.026   1.000   1.000   1.000
Note: G = number of groups. After a linear transformation of the obtained parameter estimates, the first group had a factor mean of 0 and a factor standard deviation of 1.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Robitzsch, A. Examining Differences of Invariance Alignment in the Mplus Software and the R Package Sirt. Mathematics 2024, 12, 770. https://doi.org/10.3390/math12050770
