1. Introduction
Psychosocial disorders in children are widespread, with a prevalence of up to 34.7% [
1]. These disorders have both long- and short-term repercussions, and early recognition may have a significant and beneficial effect on their management [
2,
3]. However, investigating psychological disorders, especially in young children, presents various practical and reliability challenges. The availability of a well-validated instrument that is simple to use with young children helps mitigate these problems to a significant degree.
The Strengths and Difficulties Questionnaire (SDQ) is a prominent instrument for evaluating mental health and behavioral issues in children and adolescents. It was designed by Goodman (1997) in the United Kingdom [
4] and is currently used in both research and therapeutic settings for children aged 2–17 [
5]. Compared to other forms of screening, such as the Child Behavior Checklist [
6], the SDQ has several benefits, such as brevity and a practical assessment of attention deficit and hyperactivity. Largely because of its brevity and its focus on both competencies and difficulties, it has been translated into other languages and is used worldwide (Goodman, 1997; see
https://www.sdqinfo.com/, accessed on 7 February 2022).
Studies indicate that the predictive and concurrent validity of the SDQ is generally good [
7,
8]. Normative SDQ data in school-age children are available in several countries [
9,
10,
11,
12,
13,
14]. Several studies have determined that the internal consistency in individual samples was adequate for most subscales and supported the feasibility of the screening instrument in general. However, mixed findings have also been reported. For instance, Tobia et al. reported that internal consistency was lower for the Peer Problems and Emotional Symptoms Scales [
14]. Husky et al. found a similar lack of consistency in the Peer Problems Scale in seven European countries [
12].
The five-factor structure of the SDQ is receiving increasing support as more studies have used confirmatory factor analysis to test its hypothesized factor structure [
15]. In the U.S., He et al. (2013) reported that the five-factor structure of the SDQ showed a satisfactory fit to the data and was invariant across age, ethnicity, gender, and income subgroups [
11]. The SDQ scores also indicated a significantly increased probability of meeting the DSM-IV criteria for a disorder. In Asia, a CFA of the SDQ likewise supported a five-factor structure [
13]. The psychometric features of the SDQ for preschoolers (ages 3–5) have been the subject of a very small number of investigations, and these studies were mainly conducted in the West [
3,
16,
17]. Their findings demonstrated that the SDQ is suitable as a psychopathological screening tool in linguistically and culturally varied preschool settings and that the SDQ fits the five-factor model in all preschool groups. One study conducted in South Africa on young children aged 4–6 years [
18] reported mixed findings: some subscales did not work in that context and language. In particular, the Conduct Problems Scale was only partially supported, and both the Hyperactivity/Inattention and Peer Problems subscales loaded poorly [
18].
Although a great deal of research has been undertaken on the reliability and validity of the SDQ, several critical areas still require additional exploration [
19]. First, while reliability has been widely investigated, the reliability of some subscales, particularly the Conduct Problems and Peer Problems Scales, seems inadequate. Additionally, a recent systematic review found test-retest reliability and temporal stability to be adequate but not high, whereas sensitivity and positive predictive value were unsatisfactory [
15]. Some studies in Arab countries have used the SDQ [
20,
21,
22]; however, there has been little work on the psychometric characteristics of the instrument for use in Gulf nations, particularly with young children.
Two to four percent of preschoolers display clinically significant levels of disruptive behavior that warrant a diagnosis of attention deficit hyperactivity disorder (ADHD) or conduct disorder, and the expulsion rate for preschoolers is three times higher than that of students in higher grades [
23,
24,
25]. Validating the SDQ for use with preschoolers can therefore be of great help [
7]. Although emotional and behavioral difficulties are quite common among preschoolers, particularly in economically disadvantaged communities, very few of these children receive help because of delays in diagnosis.
2. Literature Review
There is common agreement on the importance of early diagnosis and early interventions for behavioral and emotional problems [
3]. Approximately 10–30% of preschool children exhibit high levels of disruptive behavior [
26]. Preschool children are expelled approximately three times more often than children in grades K–12; approximately 2–4% of preschoolers are reported to exhibit disruptive behaviors to a clinical degree and are diagnosed with attention deficit hyperactivity disorder (ADHD) or oppositional defiant disorder (ODD) [
23,
24]. Yoder and Williford (2019) reported that teachers rated approximately one-fourth of their study sample (
N = 2427 preschoolers) as exhibiting high levels of problem behavior; 16% of the total sample were described as showing these behaviors at a clinical level (i.e., indicated by DSM-5 symptom-count criteria for ODD and ADHD) [
25]. Therefore, screening instruments, especially at this early stage, play a significant role in measuring the different types of behavioral and psychosocial problems, the strengths that might be found, and the severity of these problems [
7].
2.1. Related Studies
Many studies have examined the psychometric properties of different versions of the SDQ in children and young people. These studies generally support the feasibility of this screening instrument. For example, Husky et al. (2018) examined and compared the internal consistency of teacher and parent versions in their study of 541 children aged 5–12 years in seven European countries [
12]. They found that the internal consistency of the total sample was adequate for most subscales, except for Peer Problems. The study by Tobia et al. (2018), in which 301 teachers assessed 3302 children aged 3 to 15 years, validated the internal structure of the Italian teacher version of the SDQ [
14]. Internal consistency was lower for the Emotional Symptoms and Peer Problems Scales. Thus, the psychometric properties of the teacher version for this Italian sample were only partially consistent with those of previous studies in other countries. Español-Martín and colleagues (2021) assessed the reliability and validity of the Spanish versions of the SDQ (teacher, parent, self-report) with a sample of 6775 students ages 5 to 17 [
10]. The results demonstrated acceptable reliability estimates for all SDQ subscales, and the CFA supported the original five-factor model. In the US, He et al. (2013) evaluated the five-factor structure of the SDQ and assessed its convergent validity compared to comprehensive clinical diagnostic assessments using data from the National Comorbidity Survey-Adolescent Supplement, a nationally representative sample of adolescents between 13 and 18 years of age (
N = 6483) [
11].
The findings indicated that the five-factor structure of the SDQ showed a satisfactory fit to the data and was invariant between the age, ethnicity, sex, and income subgroups. The SDQ scores indicated a significantly increased probability of meeting the DSM-IV criteria for a disorder. Numerous studies in Asia have evaluated the validity and reliability of the SDQ. Shibata et al. (2015) used the parent and teacher versions of the SDQ in Japan with a sample of 1487 elementary school children aged 6 to 12 years [
13]. The CFA results of these SDQ reports supported a five-factor structure. Du et al. (2008) described the normative data, validity, and reliability of three Chinese versions of the SDQ in a large sample of 2128 students in Shanghai between the ages of 3 and 17 [
9], providing only partial support for structural validity but good support for convergent validity.
2.2. Arabic Version of the Instrument
Arabic versions of the SDQ have demonstrated good psychometric properties [
27]. Although several studies in Arab countries have used the SDQ [
20,
21,
22,
28,
29], research on the psychometric properties of the instrument for Gulf countries, especially for young children, is limited. Only Ababneh and Alomari (2016) investigated the psychometric properties of the teacher version of the SDQ for children aged four and five years in Jordan (
N = 788) [
28]. Their factor analysis yielded three factors, corresponding to three dimensions of the instrument. They also evaluated concurrent validity with an early development instrument (Cronbach’s α = 0.73). Other studies have focused on older populations, and some have focused on determining the prevalence of emotional and behavioral disorders. Maajeeny (2019) aimed to extend previous efforts to determine the prevalence of behavioral and emotional disorders among children in Saudi Arabia in order to evaluate the demand for intervention services [
22]. He distributed the SDQ to teachers of students aged 4–17 years and concluded that students with emotional and behavioral disorders in Saudi Arabia might represent around 25% of this age group. The results of the SDQ reliability test showed an acceptable level of general reliability (Cronbach’s α = 0.64) [
22]. El-Keshky and Emam (2015) conducted a cross-cultural examination of the teacher version of the SDQ in Saudi Arabia (
n = 323) and Oman (
n = 439), focusing on emotional and behavioral difficulties in children diagnosed with learning disabilities [
20]. The researchers also examined the structure of the SDQ. Multigroup CFAs based on structural equation modeling indicated cultural invariance for the instrument. The three-factor model was supported and provided a better explanation of the structure of the SDQ. El-Keshky and Emam found that the Arabic version of the SDQ has acceptable psychometric properties and performs consistently across different Arab cultures and genders. Al-Mukhani et al. administered the SDQ to 377 students aged 11 to 16 [
29]. They concluded that the self-report version of the SDQ is a reliable screening instrument for behavioral and psychological problems in children in Oman. Emam et al. (2016) targeted middle school students (
N = 815). Their assessment of the five-factor SDQ structure revealed significant empirical support, and their findings showed a reasonable fit across the three informant forms. Across genders, however, factor variances, factor correlations, and item residuals were not invariant.
2.3. Early Childhood Studies
Research on the psychometric properties of SDQ for preschool children (ages 3–5) is limited [
3,
17]. In a Dutch community sample of children 4 to 7 years of age, Stone et al. (2015) examined the parent (
n = 1513) and teacher (
n = 2238) versions of the SDQ [
7]. They tested reliability, construct validity, and predictive validity. They found that the five-factor structure was supported for both the parent and teacher versions and that the indices of criterion validity were adequate. Dahlberg and colleagues (2019) examined the SDQ in a sample of 3- to 5-year-olds (
N = 17,752) in Sweden [
3]. Their results revealed that the original five-factor SDQ model could be used for younger populations. They also demonstrated good construct validity of the SDQ for a preschool population and suggested that the instrument can be used to evaluate children’s problem behavior from various informant perspectives. Downs et al. (2012) examined the SDQ in a sample (
N = 477) of children from the United States (
n = 298; ages 3 to 5 years) and Germany (
n = 179; ages 3–6 years) [
30]. The SDQ showed adequate validity and reliability across cultures and languages. The results generally support the suitability of the SDQ as a psychopathological screening tool in culturally and linguistically diverse preschool settings, as well as the fit of the five-factor model in all three groups of preschoolers. Croft et al. (2015) tested the SDQ for ages three to four years [
16]. The questionnaires were completed by 16,659 parents as part of the Millennium Cohort Study, which followed the development of children born in the UK during 2000–2001. The CFA indicated the adequacy of the five-factor measurement model. They concluded that the satisfactory psychometric properties of the preschool version supported its use as a screening tool to detect behavioral and emotional problems in children aged 3 to 5 years. Mellins et al. (2018) also examined the psychometric properties of the SDQ, using data from a large sample (
N = 1581) of young children aged 4 to 6 years in South Africa [
18]. Although the total difficulties score, the Emotional Symptoms Scale, and the Prosocial Behavior factor were confirmed, the Conduct Problems Scale was only partially supported, and the Hyperactivity/Inattention and Peer Problems subscales loaded poorly. The authors concluded that, although the SDQ might be a valuable screening instrument in South Africa, some subscales did not work in this context and language; additional measures or modifications could be needed.
2.4. The Current Study
The current study aimed to assess the reliability (Cronbach’s alpha), inter-measure correlations, and discriminant validity of the teacher version of the SDQ, as well as its factor structure using exploratory (EFA) and confirmatory factor analysis (CFA), for pre-kindergarten and kindergarten children in Qatar. There have not been many recent studies on the psychometric properties of the SDQ teacher version in early childhood populations, especially in the Arab world. Studying the psychometric characteristics of the SDQ teacher version and the validity of its use in children of this age group might pave the way for validating broad use of the instrument by teachers in Qatar’s public schools. This, in turn, can aid in the early diagnosis of emotional and behavioral disorders in children. The study aimed to answer two questions: first, whether the Arabic version of the SDQ (teacher version) is a reliable instrument for assessing preschoolers in Qatar; and second, whether the five-factor SDQ framework fits the data from this study. For this purpose, I conducted a validation and psychometric assessment of the SDQ teacher report version in a group of four- to five-year-old children. To my knowledge, this is the first study from Qatar to analyze the factor structure of the SDQ (teacher version) in a representative sample of preschoolers.
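As an illustration of the reliability statistic named above, Cronbach’s alpha for a subscale with k items is k/(k − 1) × (1 − sum of item variances / variance of the summed score). The following Python sketch computes it for a single subscale. The function name `cronbach_alpha` and the ratings are hypothetical and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for one subscale.

    rows: one list of item scores per child (all lists the same length).
    Uses sample variances (n - 1 denominator) throughout.
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equal-length rows")

    # Variance of each item across children (columns of the data).
    item_variance_sum = sum(variance(col) for col in zip(*rows))
    # Variance of the summed subscale score across children.
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("summed scores have zero variance")

    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical teacher ratings (0, 1, or 2) on a five-item subscale.
example_ratings = [
    [0, 1, 0, 1, 0],
    [2, 2, 1, 2, 1],
    [1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0],
    [2, 1, 2, 2, 2],
    [1, 0, 1, 1, 1],
]
print(f"alpha = {cronbach_alpha(example_ratings):.2f}")
```

Alpha rises when items covary strongly relative to their individual variances, which is why a subscale whose items do not move together produces a low value.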
5. Discussion
This study examined the validity and application of the hypothesized factor structure of the SDQ (teacher version) in a sample of preschoolers in Qatar. The findings indicate the poor internal reliability of the Peer Problems Scale. The results also suggest that, unlike several other studies, the five-factor structure does not fit well with these data. In particular, the EFA identified five factors that accounted for 60% of the variance; however, two items, namely, somatic symptoms and nervousness in new situations, had higher loadings on the Peer Problems Scale than on the Emotional Symptoms Scale. The Conduct Problems Scale had a high loading on the three items. I also extracted a Hyperactivity/Inattention Scale from the five original items. These items had high loadings, but the two positively worded items were also highly loaded on the Prosocial Scale.
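The idea of a set of factors "accounting for" a share of the variance can be illustrated with a short Python sketch. It assumes factors are retained by the eigenvalue-greater-than-one rule applied to the item correlation matrix, which is a principal-components-style simplification and not necessarily the extraction method used in this study. The function name `kaiser_summary` and the correlation matrix are invented for illustration.

```python
import numpy as np


def kaiser_summary(corr):
    """Count eigenvalues > 1 and the share of variance they account for.

    corr: square item correlation matrix (ones on the diagonal).
    Returns (number_retained, proportion_of_variance, eigenvalues_descending).
    """
    corr = np.asarray(corr, dtype=float)
    # eigvalsh handles symmetric matrices and returns ascending eigenvalues.
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    retained = eigenvalues[eigenvalues > 1.0]
    # The eigenvalues of a correlation matrix sum to the number of items.
    proportion = float(retained.sum() / corr.shape[0])
    return len(retained), proportion, eigenvalues


# Hypothetical correlations among six items forming two clusters of three.
r = 0.6
cluster = np.full((3, 3), r) + (1 - r) * np.eye(3)
example_corr = np.block([
    [cluster, np.zeros((3, 3))],
    [np.zeros((3, 3)), cluster],
])

n_factors, share, eig = kaiser_summary(example_corr)
print(n_factors, round(share, 3), np.round(eig, 2))
```

The retention count alone says nothing about which items load on which factor, so it cannot by itself confirm that the retained factors match the intended subscales.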
In the CFA, none of the fit indices reached the recommended goodness-of-fit ranges or values. Based on these results, I conclude that these data do not straightforwardly reproduce the original five factors of the SDQ [
36,
37]. Although five factors were identified in these data (with eigenvalues > 1) and four were extracted mainly from their original items, one factor did not appear in the results. Moreover, the five positively worded items functioned in a different direction from the other items on the scale. This finding suggests a method effect, that is, variance attributable to the measurement technique (here, the favorable wording of these items) rather than to the underlying construct [
38]. It is noteworthy that in a study of children in grades 2–4, correlations between latent factors were substantial, especially when the Conduct Problems factor was included, showing a large overlap between problem areas; a modified version of Goodman’s five-factor model demonstrated a satisfactory fit [
39].
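To make the notion of fit indices concrete, the sketch below computes three commonly reported ones (RMSEA, CFI, and TLI) from model and baseline chi-square statistics using their standard formulas. It is an assumption that these are among the indices meant here. The function names and the numeric inputs are hypothetical and are not the results of this study.

```python
from math import sqrt


def rmsea(chi2, df, n):
    """Root mean square error of approximation for sample size n."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to the baseline (null) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index (non-normed fit index)."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


# Hypothetical values for a 25-item, five-factor model (illustration only).
chi2_m, df_m = 900.0, 265
chi2_b, df_b = 3000.0, 300
sample_size = 500

print(f"RMSEA = {rmsea(chi2_m, df_m, sample_size):.3f}")
print(f"CFI   = {cfi(chi2_m, df_m, chi2_b, df_b):.3f}")
print(f"TLI   = {tli(chi2_m, df_m, chi2_b, df_b):.3f}")
```

Widely cited rules of thumb treat CFI and TLI values near or above 0.90 to 0.95 and RMSEA values near or below 0.06 to 0.08 as acceptable, although the exact cutoffs vary between authors.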
In their meta-analysis, Stone et al. found that the bulk of the research supporting the five-factor structure was conducted on adolescents and that the findings favored the parental assessment of children’s social development using the SDQ [
8]. However, in a subsequent study, Stone et al. verified the applicability of five-factor models in parent and teacher versions for children aged 4–7 [
7]. Interestingly, in a study of younger and older children, factor loadings were found to be acceptable for all groups, although teacher ratings of older children had better internal consistency than parent ratings of younger children [
40]. The authors also suggested using a customized approach when using the SDQ for low-risk epidemiological samples. Notably, the five-factor structure was not validated in a large community sample of 7- to 17-year-olds, with none of the subscales being unidimensional [
41]. Some other studies found little evidence for the five-factor structure [
42,
43], revealing problems in the SDQ component structure, especially with reverse-coded questions [
41], internal consistency [
44], and high correlations between scales [
45].
Although the results of this study do not agree with those of other studies in which CFA supported a five-factor model [
10,
46,
47], it is important to note that, in several studies, modified five-factor structure models were used and/or multiple models were found suitable. In this study, since the objective was to validate the original SDQ, I did not modify the model. By contrast, Boe et al. used a modified model, and Yao et al. concluded that both five-factor and higher-order structures are suitable. Boe et al. also suggested that their findings would have been more positive if they had considered local dependencies within the variables [
46]. Model fit can be further improved by specifying cross-loadings of favorably worded problem items on the Prosocial Scale and minor variables on the Hyperactivity Scale, as shown in a recent study of Swedish adolescents [
36].
In close agreement with the findings of this study, the internal consistency of the various SDQ scales was found to be good, with the exception of the Peer Problems Scale, in young children in the studies by Husky et al., which covered seven European countries, and Hagquist et al., conducted in Sweden [
12,
48]. Supporting the use of specific subscales, Gustafsson et al. determined that for children aged 1–3, the Hyperactivity subscale was valid, whereas for ages 4–5, the full original SDQ, in a four-factor solution, demonstrated acceptable validity [
49]. Other studies have indicated that no norm scores are available for the Conduct Problems Scale and Peer Problems Scale for ages 2–3 and 4–5 owing to low internal consistency [
16,
50,
51].
As stated above, although adding a method component might improve model fit, I did not study techniques to overcome incorrect solutions in these models because I used a confirmatory approach and the objective was to validate the original five-factor structure. Although the findings do not fully support the five-factor model or internal consistency for all subscales, EFA and CFA are just one type of verification approach. Specific scales and combinations of scales could be effective depending on the research problem, and these aspects should be explored further in the future. As Karlsson et al. contended, the SDQ has sufficient factorial validity for epidemiological studies if investigators recognize its flaws and explain the implications of the method and related components [
36].
Strengths and Limitations
The key strength of this study is that it is the first study in Qatar to focus on preschoolers. The findings may motivate other researchers to further explore different versions of the SDQ in different populations, which would accelerate the clinical and practical applications of this tool. This study had some limitations. First, I only included public preschools; therefore, the results may not apply to private institutions. Moreover, CFA is a rigorous analysis; hence, its use may limit generalization. The data were reported by teachers and may not be fully accurate. It is also possible that the five-factor model did not fit the data because some key nuance was lost in the translation of the original SDQ into Arabic. In the future, validating the SDQ against other mental health instruments would strengthen its construct validity.