Article

Assessing Autistic Traits in Toddlers Using a Data-Driven Approach with DSM-5 Mapping

1
Abu Dhabi School of Management, Abu Dhabi P.O. Box 6844, United Arab Emirates
2
Manukau Institute of Technology, Auckland 2023, New Zealand
3
Higher Colleges of Technology, Abu Dhabi P.O. Box 25026, United Arab Emirates
4
ASDTests, Auckland 0610, New Zealand
*
Author to whom correspondence should be addressed.
Bioengineering 2023, 10(10), 1131; https://doi.org/10.3390/bioengineering10101131
Submission received: 31 July 2023 / Revised: 27 August 2023 / Accepted: 7 September 2023 / Published: 27 September 2023

Abstract

Autistic spectrum disorder (ASD) is a neurodevelopmental condition that characterises a range of people, from individuals who are unable to speak to others who have good verbal communication. The disorder affects the way people see, think, and behave, including their communications and social interactions. Identifying autistic traits, preferably in the early stages, is fundamental for clinicians in expediting referrals, and hence enabling patients to access required healthcare services. This article investigates various ASD behavioral features in toddlers and proposes a data process using machine-learning techniques. The aims of this study were to identify early behavioral features that can help detect ASD in toddlers and to map these features to the neurodevelopment behavioral areas of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). To achieve these aims, the proposed data process assesses several behavioral features using feature-selection techniques, then constructs a classification model based on the chosen features. The empirical results show that during the screening process of toddlers, cognitive features related to communications, social interactions, and repetitive behaviors were most relevant to ASD. For the machine-learning algorithms, the predictive accuracies of the Bayesian network (Bayes Net) and logistic regression (LR) models derived from the ASD behavioral data subsets were consistent, pointing to the suitability of machine-learning techniques for predicting ASD.

Graphical Abstract

1. Introduction

ASD is a neurodevelopmental condition that affects people’s perception of the world [1]. According to [2], one in every 100 children is estimated to have ASD. Zeidan et al. (2022) [3] stated that there are more male cases of ASD than female cases. In the USA, the average age for detecting autism is four years, and families sometimes need to wait 13 months after their children’s first screening for further assessments. According to Kosmicki et al. (2015) [4], this period can be longer for minority communities or for people of lower socioeconomic status. A study by [5] found that the average age of patients diagnosed with ASD is approximately 60 months.
In spite of the progress made in the early detection of autism, a clear decline was noticed during the COVID-19 pandemic, according to a report of the Centers for Disease Control and Prevention (CDC, 2023) [6]. Each person with autism is affected in a different way. People with autism can also exhibit other neurodevelopmental disorders, such as attention deficit hyperactivity disorder (ADHD) [1,7].
Currently, with the rapid developments in computer applications and data analytics, which have enhanced the collection and availability of data [8], the healthcare sector relies heavily on data-based systems to guide decisions. Medical datasets that are processed using artificial intelligence (AI) and data-analytics techniques can be vital in the discovery of useful insights for diagnosing patients and improving their treatments [9]. The application of data-driven approaches to improve the ASD detection rate and to determine influential ASD traits has gained attention in the medical community [10,11,12,13,14,15,16]. However, little work has focused on detecting ASD traits in toddlers [10,11,17,18] to investigate early indicators of ASD or the relationship of such indicators to the neurodevelopment behavioral areas described in the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) [19].
This research proposes a data-driven process to improve the screening of ASD in toddlers, applying machine-learning algorithms and using data collected by the AutismAI system [20]. AutismAI embeds medical ASD-screening questionnaires, such as the short version of the Quantitative Checklist for Autism in Toddlers (Q-CHAT-10) [21], and others. The proposed data process can be used to determine a minimal set of dissimilar features with which to build classification models for assessing ASD. By identifying these features from ASD–toddler data, medical professionals can accurately carry out ASD screening in a cost-effective manner, using models developed by classification algorithms, and thereby reach clinical screening decisions as soon as these behavioral traits are identified during a clinical session.
Unlike most current data-driven methods that have been used to study the behavioral features of ASD in toddlers, such as those of [17,22], the current study’s main contribution is to consider not only the relationships between the behavioral features of the ASD-screening methods used for assessment, but also the correlations between the features themselves. We can therefore identify possibly redundant features in ASD-screening methods. More importantly, this study maps the relationship between the ASD features and their neurodevelopment behavioral areas according to the DSM-5, opening discussions about which criteria are most vital in early diagnosis and which criteria pertain to the most behavioral features within the ASD diagnostic tools.
This paper is organized as follows: Section 2 provides a literature review and sets out work related to ASD detection in toddlers. Section 3 outlines the methodology followed. In Section 4, an analysis of the experimental findings is described and discussed. Finally, the conclusion section summarizes the results and outlines the limitations of the study’s experiments.

2. Literature Review

Related Works

Zhu et al. (2023) [10] proposed a machine-learning approach to detect ASD in toddlers, using an audio- and face-based detection system to assess response-to-name (RTN) as an ASD indicator. They compared human-based ratings and computer-based ratings for RTN. In that study, 125 toddlers aged between 16 months and 30 months participated in the experiment, 61 of whom had been diagnosed with ASD, 31 of whom were diagnosed with developmental delays, and 33 of whom showed typical development. Caregivers and clinical evaluators were involved in the study. In the proposed system, the main features used to detect RTN were response latency, response duration, and head pose. For facial feature detection, the DLib library was used with a deep-learning method [23]. The human-rated approach and the computer-based approach were compared using area-under-curve (AUC) and accuracy measurements. According to the authors, both approaches reported consistent results. The AUC measurements reported a significant difference between human-based ratings and computer-based ratings (0.91 and 0.81, respectively). The other performance measurements reported no significant difference. The computer-based detection of RTN reported sensitivity, specificity, and accuracy of 80.0%, 69.8%, and 74.8%, respectively. For the human-based rating approach, the specificity was 82.5%, the sensitivity was 83.3%, and the accuracy was 82.9%. The authors emphasized that the use of machine-learning approaches for the early detection of ASD was promising; however, relying on a single factor, such as RTN, was not sufficient, and the dynamics and the evolution of ASD behavior required further consideration in future studies.
Whereas [10] relied on a single predictor (response-to-name) assessed via a semi-structured experiment with audio and video recordings, in 2022, [11] chose to detect ASD in children using different variables, including demographic information, medical diagnoses, and healthcare service procedures that were collected and extracted from individual cases. The dataset used was the IBM MarketScan Health Claims Database, which covered claims from 2005 to 2016. Their study included 38,576 observations (12,743 of which were ASD-diagnosed and 25,833 of which were non-ASD-diagnosed). The chronological age of the ASD cohort ranged from 18 months to 30 months, while the ages of the non-ASD cohort extended to 60 months. The researchers applied LR and random forest (RF) algorithms to discriminate between ASD and non-ASD cases. The RF algorithm reported better performance than the LR algorithm (AUCROC: 0.78 and 0.76, respectively), with specificity of 93% at the age of 24 months. In addition, the authors determined that better detection was found as the age of the toddlers increased. This was expected, as ASD symptoms, in particular the behavioral and social symptoms, become more obvious and pronounced as the age of a toddler increases [24]. The authors stated that reliance on medical and insurance claims alone was not sufficient. They recommended that in future studies, the integration of data related to ASD developmental screening results and behavior data would be beneficial.
In 2022, [12] developed a system to pre-diagnose autism that can be used by caregivers, parents, and autistic people (children and adults). Their system uses questions adopted from medical questionnaires, such as Q-CHAT-10 and the various versions of AQ-10. Their system consists of a two-layer convolutional neural network (CNN) with 32 and 64 filters, respectively, and a dataset covering different age categories (toddlers, adolescents, and adults) that was collected by [25,26] with 6075 data observations. The reported performance measurements for accuracy, sensitivity, and specificity were 95.53%, 97.63%, and 98.63%, respectively. The developed CNN outperformed the other machine-learning algorithms, including the decision tree algorithm, the rule induction algorithm, and the Bayes Net algorithm. The authors recommended investigating other machine-learning approaches, such as deep learning, using more complex features such as videos and images to detect ASD. They also recommended using cluster analysis to recognize and tune the treatment strategies for different ASD cases.
In 2022, [13] suggested using a machine-learning approach to evaluate the predictive performance of the ASD-screening process. They suggested a system with two phases. The first phase was a pre-diagnostic phase, where the input dataset was clustered based on independent features related to communication, repetitive traits, and social traits, using a self-organizing map (SOM). A new class label was created and compared with the existing class label, to refine the dataset and reduce the bias of the screening system’s assigned class. In this phase, 85% accuracy was achieved. In the second phase (the classification phase), the refined dataset was used and the prediction of ASD was remeasured. RF and naïve Bayes (NB) algorithms were used for classification. Approximately 2000 data instances were used to assess the derived models. This experiment provided good performance results in terms of accuracy, precision, and recall for both classifiers (NB: 93%, 93%, and 94%; RF: 96%, 96%, and 97%). The RF classifier reported better performance results than the NB classifier. The evaluation of predictive models of ASD-detection systems using unsupervised learning is a creative approach to improving the quality of machine-learning screening systems for ASD and reducing any bias related to medical screening.
In 2020, [22] discussed the issue of imbalanced class labels and how prediction performance can vary. They used the NB algorithm on nine ASD datasets, with several resampling techniques and different class-label ratios, with 100 runs for each. The original dataset used was generated by the ASDTest system [25]. The authors used 1000 observations (975 with no ASD traits and 25 with ASD traits). The resampling techniques used with the NB classifier were the synthetic minority oversampling technique (SMOTE), the random oversampling (ROS) technique, and the random undersampling (RUS) technique. SMOTE with the NB classifier reported the best ROC, while NB with RUS reported the lowest performance results. Using the resampling techniques with NB helped to improve the predictive performance on the imbalanced ASD datasets. That research was one of the few studies to investigate the imbalanced class label in ASD screening, together with those of [13,15,16].
Washington et al. (2019) [27] explored feature redundancy by performing filter, wrapper, and embedded feature-selection analyses. The data used were aggregated from seven different sources and consisted of 16,527 completed social responsiveness scale (SRS) [28] child/adolescent questionnaires. The SRS is a 65-item questionnaire that is completed by a caregiver about a child. Univariate filter feature selection was applied by the authors to measure the correlation between each feature and the outcome (class). They incorporated the recursive-feature-elimination (RFE) wrapper method, repeatedly removed the weakest feature, and created predictive models until the number of features achieved the desired performance. A support vector machine (SVM) algorithm was applied at each step of the RFE procedure to remove a single feature. During the classification step, principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and a denoising autoencoder were used with a multi-layer perceptron (MLP) classifier to process the top-ranking items and to derive a classification model with an AUC of 92%.
In 2020, [29] determined that early detection of the symptoms of ASD can improve the quality of life of the diagnosed individuals. The authors used integrated data from the University of California, Irvine (UCI) Machine Learning Repository, which consisted of three datasets with 20 common attributes. Data were cleaned for missing values and outliers. During the experiments, training and testing data settings in the ratio of 80:20 were applied, with cross-validation. SVM and NB algorithms were used for classification, along with CNN, LR, and K-nearest neighbor algorithms. The derived predictive accuracy in detecting ASD using the CNN algorithm reached 98.30%.
In 2020, [30] suggested using feature selection with classification techniques to decrease data dimensionality and to choose only relevant features to enhance classification accuracy. They explained that various feature-selection methods can be used in ASD research and can help in improving the efficiency and the performance of machine-learning classification algorithms such as the flat, the streaming, and the structured feature engineering approaches. According to their recommendation, feature selection in ASD research requires additional investigation, due to its essential role in the pre-processing phase.
Kosmicki et al. (2015) [4] attempted to develop a faster and more accurate method of detecting ASD than the current standard methods. They used machine-learning techniques to evaluate the clinical assessment of ASD, using the Autism Diagnostic Observation Schedule (ADOS) [31] to test whether a smaller subset of behaviors could differentiate between children who exhibited ASD traits and those who did not. The ADOS presents behavioral observations in a clinical setting and comprises four modules with different levels of cognitive functioning. The authors estimated that 27% of individuals are undiagnosed at the age of 8 years. Eight machine-learning algorithms were used, with feature selection, to process ADOS-related data. The results showed that the number of behavior traits was reduced from 28 to 9 for ADOS Module 2, and from 28 to 12 for ADOS Module 3. The LR algorithm showed good performance results when processing the 9-trait data of Module 2, with sensitivity and specificity of 98.81% and 89.39%, respectively. The SVM algorithm used on Module 3 data exhibited sensitivity and specificity (the true-negative rate) of 97.71% and 97.20%, respectively. In 2015, [4] claimed that these results encouraged the development of screening-based instruments for ASD detection and mobile-health approaches that enable individuals to receive necessary care.
In 2019, [32] considered that the revised Modified Checklist for Autism in Toddlers (M-CHAT-R) [33] was good for the initial screening of children for autism, but that it required follow-up questions, as the interpretation of its scores by humans can be biased. The authors believed that AI methods, such as a feed-forward artificial neural network (fANN) [34], could overcome the barriers to ASD screening. Hence, processing real data related to the M-CHAT could not only be accessible, with low-cost screening, but could also be reliable for rural, minority, and low-socioeconomic populations with low education levels. In their experiments, [32] included a total of 16,168 toddlers. The machine-learning technique was applied to the complete dataset, filtered by race, gender, and the parents’ education levels, and the results were then compared. The results produced high rates of correct classification using 18 and 14 attributes, respectively. The researchers claimed that the machine-learning method was comparable to the M-CHAT-R in accuracy of ASD diagnosis, while using fewer items.
Rahman et al., (2020) [30] determined that early identification approaches for ASD were limited and that most toddlers were not identified until after the age of four years, even though evidence suggested that early intervention and diagnosis could lead to major developmental improvements for the children. They applied machine-learning methods to electronic medical records (EMRs) to predict ASD early in a child’s life. The data included 1397 ASD children and 94,741 non-ASD children, born between January 1997 and December 2008. Parental sociodemographic information, parental medical history, and prescribed medication data were used to create 89 features for training and testing with various machine-learning algorithms, including multivariate LR, artificial neural networks, and RF. Additionally, 10-fold cross-validation was used to evaluate prediction performance by computing the area under the operating characteristic curve (AUC, C-statistic), sensitivity, specificity, accuracy, false positive rate, and precision. All machine-learning models produced similar performances, with a C-statistic of 0.709, sensitivity of 29.93%, specificity of 98.18%, accuracy of 95.62%, a false positive rate of 1.8%, and precision of 43.35% for the prediction of ASD in the dataset. Table 1 provides a summary of the literature review.

3. Methodology

This research followed the methodology depicted in Figure 1. The dataset used was collected from a mobile application called AutismAI. This dataset consisted of several characterized behavioral features related to Q-CHAT-10, together with three AQ screening methods, plus other characteristics that are useful in detecting ASD. The dataset comprised 2048 data observations across four age groups: toddlers (aged 12–36 months), children (aged 3–11 years), adolescents (aged 12–16 years), and adults (aged 17 years and over). Our focus was on toddlers only, a subset of 401 observations. The data collection by AutismAI was conducted after obtaining ethical approval from the host educational institutions. According to the authors, during data collection there was no direct human-to-human contact; instead, participants such as parents used the system after agreeing, via an electronic consent form, that the data would be used for research purposes only and would be stored on a secured cloud database [20].
The medical questionnaire used for feature assessment was the short version of the Q-CHAT-10. Baron-Cohen et al. [35] created the Checklist for Autism in Toddlers (CHAT), a screening instrument that identifies 18-month-old children who are at risk of autism. Before such tools, early detection of autism was rare before the age of three years, as it is an uncommon condition and no specialized screening tools existed. By the age of 18 months, children on the spectrum exhibit an absence of behaviors such as joint attention and pretend play. Joint attention is the ability to establish a shared focus of attention with another person, while pretend play is the ability to engage in imaginative play, either alone or with another person.
CHAT comprises nine questions that a health professional asks a parent, followed by direct observation of five aspects. The key items are related to pretend play, protodeclarative pointing, following a point, pretending, and producing a point. If the child fails all of these items, the child is predicted to be at the greatest risk of autism. CHAT was tested on a population of 16,000, and the high-risk criteria had a sensitivity of 18%, a specificity of 100%, a positive predictive value of 75%, and a negative predictive value of 99.7% [35]. Q-CHAT originally consisted of 25 questions; it was later reduced by [21] to 10 questions in the Q-CHAT-10 version, to speed up the screening process. Table 2 shows the items/questions of the screening methods considered. For each of the 10 questions, a score of 1 or 0 is recorded, based on a parent’s answer. If the total score is more than 3, the child is referred for further assessment by health professionals [21].
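The Q-CHAT-10 decision rule described above (ten binary item scores, referral when the total exceeds 3) can be sketched as a small function. This is an illustration of the scoring logic only, not clinical software:

```python
def qchat10_needs_referral(answers):
    """Apply the Q-CHAT-10 rule: ten items scored 0 or 1;
    a total score greater than 3 triggers a referral for
    further assessment by health professionals."""
    assert len(answers) == 10 and all(a in (0, 1) for a in answers)
    return sum(answers) > 3

# A total of 3 does not trigger a referral; a total of 4 does.
print(qchat10_needs_referral([1, 1, 1, 0, 0, 0, 0, 0, 0, 0]))  # False
print(qchat10_needs_referral([1, 1, 1, 1, 0, 0, 0, 0, 0, 0]))  # True
```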
Figure 2 indicates that male toddlers score higher on Q-CHAT-10 than female toddlers. This suggests that boys have a higher risk of ASD than girls, but this result could also be because more male toddlers than female toddlers were screened using the AutismAI app.
Attributes for id, date, and AutismAgeCategory were removed, as these were irrelevant for the experiment. To conduct further feature selection, as there are potentially four class attributes, the Score, DNNPrediction, and IsASDDignosed attributes were also removed, as they basically represented the same class and could have added extra weight to the model. Finally, the values of the Q1–Q10 attributes were transformed into nominal notations, as the responses to these questions from the Q-CHAT-10 could be either 0 or 1.
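The preprocessing steps just described can be sketched in pandas. The column names below follow the attributes named in the text (including the spelling IsASDDignosed); the exact names and the class-attribute name in the real export may differ, so this is a hypothetical sketch:

```python
import pandas as pd

# Toy stand-in for the AutismAI export; column names are assumptions.
df = pd.DataFrame({
    "id": [1, 2], "date": ["2023-01-01", "2023-01-02"],
    "AutismAgeCategory": ["Toddler", "Toddler"],
    "Score": [4, 2], "DNNPrediction": [1, 0], "IsASDDignosed": [1, 0],
    **{f"Q{i}": [1, 0] for i in range(1, 11)},
    "ClassASD": [1, 0],
})

# Drop irrelevant identifiers and the three redundant class-like attributes.
df = df.drop(columns=["id", "date", "AutismAgeCategory",
                      "Score", "DNNPrediction", "IsASDDignosed"])

# Treat the binary Q1-Q10 responses as nominal (categorical) values.
q_cols = [f"Q{i}" for i in range(1, 11)]
df[q_cols] = df[q_cols].astype("category")

print(list(df.columns))  # only Q1..Q10 and the class attribute remain
```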
We used different feature-selection methods, including Pearson correlation, Relief-f [36], and gain ratio (GR) [37], as these methods are commonly used in medical research and provide distinct mathematical approaches to determining the relevance of each feature. GR can be calculated as follows:
GR(att) = IG(att) / H(att)
where IG(att) is the information gain (IG) of the attribute att with respect to the class, and H(att) is the entropy of the attribute att. GR is used to reduce bias towards multi-valued attributes, which is one of the main limitations of the IG metric [38].
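For discrete attributes such as the binary Q-CHAT-10 responses, the gain-ratio formula above can be computed directly from counts. A minimal sketch (not the WEKA implementation used in the study):

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def gain_ratio(feature, labels):
    """GR(att) = IG(att) / H(att): information gain about the class,
    normalized by the entropy of the attribute itself."""
    n = len(labels)
    # Entropy of the class after splitting on each attribute value.
    split_entropy = sum(
        (sum(1 for f in feature if f == v) / n)
        * entropy([l for f, l in zip(feature, labels) if f == v])
        for v in set(feature)
    )
    ig = entropy(labels) - split_entropy
    h_att = entropy(feature)
    return ig / h_att if h_att > 0 else 0.0

# A binary feature that perfectly separates the class has GR = 1.
print(gain_ratio([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0
```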
The Pearson correlation can be calculated as follows:
PCor = [ΣXY − (ΣX)(ΣY)/n] / √{[ΣX² − (ΣX)²/n][ΣY² − (ΣY)²/n]}
The Pearson correlation coefficient defines a linear correlation between two sets of data. It is essentially the covariance of the two variables divided by the product of their standard deviations. The Relief-f method is inspired by instance-based learning and detects features that are statistically relevant to the target variable. It calculates a score for each feature which is applied to a rank and, then, selects top-scoring features for feature selection. Relief-f evaluates the importance of an attribute by repeatedly sampling an instance and considering the value of the given attribute for the nearest instance of the same class and a different class [36]. Relief-f calculates the weight of a feature (wj) using Equation (3) [39]:
wj = (1/N) Σi=1..N [δ(xij, xim) − δ(xij, xih)]
where
  • N is the number of instances in a dataset with M features;
  • xi = (xi1, xi2, …, xiM) is an instance, for i = 1, 2, …, N;
  • xih is the nearest instance of the same class as xi (nearest hit neighbor);
  • xim is the nearest instance of a different class (nearest miss neighbor);
  • δ(xij, xih) is the difference between the feature-j values of xi and its nearest hit neighbor xih; and
  • δ(xij, xim) is the difference between the feature-j values of xi and its nearest miss neighbor xim.
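The weight update above can be sketched for a binary class using a single nearest hit and nearest miss per instance. This is a simplified Relief variant for illustration; the full Relief-f algorithm averages over k neighbors and handles multi-class data:

```python
import math

def relief_weights(X, y):
    """Simplified Relief for a binary class: a feature's weight
    grows when its value differs on the nearest miss and shrinks
    when it differs on the nearest hit."""
    n, m = len(X), len(X[0])
    w = [0.0] * m
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        h = min(hits, key=lambda j: math.dist(X[i], X[j]))
        ms = min(misses, key=lambda j: math.dist(X[i], X[j]))
        for f in range(m):
            w[f] += (abs(X[i][f] - X[ms][f]) - abs(X[i][f] - X[h][f])) / n
    return w

# Feature 0 tracks the class; feature 1 is noise, so w[0] > w[1].
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 1, 1]
print(relief_weights(X, y))
```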
For the classification step of the data process, we used two different algorithms—LR and Bayes Net. These algorithms have been used previously in medical research [40,41,42], and they have different learning methods. LR is a classification algorithm that uses a logistic function to model the probability of a possible outcome of a single trial. The logistic function is designed for classification and is useful for understanding the effects that various independent variables have on a single outcome variable. LR works on a binary outcome variable, under the assumption that the independent variables are independent of each other.
LR can be calculated as follows:
log(p / (1 − p)) = β0 + β1x1 + β2x2
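The log-odds model above converts a linear combination of the features into a probability via the logistic function. A minimal numeric sketch with hypothetical coefficients (not fitted to the study's data):

```python
import math

def logistic_probability(x, beta):
    """P(class = 1) from log(p / (1 - p)) = b0 + b1*x1 + b2*x2.
    beta[0] is the intercept; beta[1:] pair with the features in x."""
    log_odds = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Two binary Q-CHAT-style features with illustrative coefficients.
beta = [-2.0, 1.5, 2.5]  # hypothetical b0, b1, b2
print(logistic_probability([0, 0], beta))  # low probability
print(logistic_probability([1, 1], beta))  # high probability
```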
The Bayes Net algorithm, derived from the Bayes theorem, is a classification technique that uses a probabilistic graphical model to represent features and that can be used on a variety of tasks, including prediction, diagnostics, reasoning, etc. Bayes Net is built on probability distribution and relies on the laws of probability for prediction and anomaly detection. Bayes Net supports both discrete and continuous attributes, and it can be defined as follows:
P(A,B) = P(A|B)P(B) = P(B|A)P(A), and therefore P(A|B) = P(B|A)P(A) / P(B)
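Bayes' theorem, as stated above, can be illustrated with a small screening-style calculation. The numbers here are hypothetical, chosen only to show how a prior is updated by an observation:

```python
def posterior(prior_a, likelihood_b_given_a, prob_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / prob_b

# Illustrative numbers only: P(ASD) = 0.02,
# P(positive screen | ASD) = 0.9, P(positive screen) = 0.1.
p = posterior(0.02, 0.9, 0.1)
print(p)  # ≈ 0.18: the observation raises the prior of 0.02 considerably
```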

4. Experimental Analysis

Stratified 10-fold cross-validation was used during the training phase of the algorithms. It is an evaluation technique that partitions the data into ten folds, each preserving the overall class distribution, and repeatedly trains the model on nine folds while testing it on the remaining one [43]. The considered classification algorithms were used to derive models for the prediction of ASD, and the models were evaluated in this research using accuracy, specificity, and sensitivity rates. For the ASD feature–feature analysis, a heat map generated in RStudio was used. The WEKA [44] machine-learning tool was used for running the classification models. None of the hyperparameters for the feature-selection and classification methods were changed to specific values; hence, we used the default values.
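The evaluation setup above can be sketched in scikit-learn (the study used WEKA; this is an equivalent sketch, and the data below are a synthetic stand-in, since the toddler dataset itself is not reproduced here):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the 401-toddler dataset with 10 features.
X, y = make_classification(n_samples=401, n_features=10, random_state=0)

# Stratified 10-fold CV: each fold keeps the class proportions of the whole set.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=cv, scoring="accuracy")
print(f"mean accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})")
```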
The results from applying the considered feature-selection techniques (Relief-f, GR, and Pearson correlation) are shown in Table 3. It is worth mentioning that most of the features belonged to Category A of the DSM-5 (focused on social communication and interaction across different contexts), while two features (Q8 and Q10) belonged to Category B of the DSM-5 (focused on restricted, repetitive patterns of behavior, interests, or activities), albeit only partly for Q8. In addition, it was evident that the attributes that could help detect ASD in toddlers were Q1–Q9, as they appeared before other attributes, such as demographics, achieving the highest ranking scores in each feature-selection method used. The top-ranked questions across the three feature-selection methods were Q3, Q4, Q5, Q6, Q8, and Q9 (see Table 3). For instance, according to the GR and the Pearson correlation, the three most influential questions were Q6 (0.28/0.59), Q9 (0.26/0.56), and Q5 (0.24/0.54). These questions confirmed deficits in joint attention, social communication, and interaction, and could detect a non-verbal communication problem and the toddler’s ability to pretend-play. However, the Relief-f method reported different results for the second and third ranks: its top-three ranked questions were Q6 (0.3), Q5 (0.3), and Q8 (0.28). Question 5 detected a social communication problem that manifested as the toddler’s inability to engage in pretend play.
All feature-selection methods showed that Q5 and Q6 were strong predictors of autism in toddlers. These questions are related to social communication and interaction, as per the DSM-5, and belong to the same DSM-5 domain. Moreover, these results were consistent with previous studies conducted in ASD research: social communication and interaction deficits and delays in language ability are early symptoms that trigger the need for further autism screening [45].
According to the DSM-5, autism diagnostic criteria can be split into two groups of features. One is “persistent deficits in social communication and social interaction” and the other is “restricted, repetitive patterns of behavior, interests, or activities” (Autism Speaks, 2013). The first group can be further divided into two distinct categories: communication and social interaction. Table 1 earlier shows the breakdown of Q1–Q10 according to the DSM-5. The purpose of this research was to identify the features that can help detect ASD in toddlers, so we can break this down further to find the most influential social behaviors according to the DSM-5 guidelines. Q1–Q9 were the top features from the selection methods, covering social-interaction and communication cognitive behaviors, while Q10 relates to repetitive behaviors and was ranked the lowest of all Q-CHAT-10 responses. Table 4 also shows Set 3, which is made up of the top five features: Q4, Q5, Q6, Q9, and Q8. All of these features are in the social-interaction category, except Q8, which concerns mainly communication and, in some cases, repetitive behavior (speech). The results of the feature-selection methods enabled us to create four different sets of features for the classification, as shown in Table 4. Social interaction plays the biggest role in assessing toddlers and determining whether they are on the spectrum.
Classification methods were used on four sets of features to compare the performance of the models and to help us evaluate certain behavioral features. The first set contained a full set of attributes; the second set included Q1–Q9, as they were the common highest-ranked features in all three feature selection results; the third set had the top five attributes; and the last set contained the lowest-scoring attributes.
We used the LR algorithm for this classification, adding the Q-CHAT-10 features incrementally to the classifier: we started with Q1 against the class, then added Q2, and so on, recording the change in sensitivity after each feature addition. Figure 3 shows the changes in the sensitivity of the LR classifier as the features were added. The reported sensitivity of the model when only Q1 was used was 85.5%. When Q2–Q4 were added to the data with Q1, decreases in sensitivity were recorded (79.70%), indicating a negative effect on the model’s sensitivity. However, this decrease reversed when Q5 was added to the data—the sensitivity increased to 90.9%. It is worth emphasizing that Q5 was one of the three top-ranked features selected by the feature-selection methods.
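The incremental procedure just described can be sketched as a loop that grows the feature set one question at a time and records cross-validated sensitivity (recall). The data below are synthetic noisy stand-ins for the Q-CHAT-10 responses, for illustration only:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)

# Synthetic binary class and ten noisy binary "questions" (30% flips).
n = 401
y = rng.integers(0, 2, n)
X = np.column_stack([(y ^ (rng.random(n) < 0.3)).astype(int)
                     for _ in range(10)])

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
sensitivities = []
for k in range(1, 11):  # Q1 alone, then Q1-Q2, ... up to Q1-Q10
    scores = cross_val_score(LogisticRegression(), X[:, :k], y,
                             cv=cv, scoring="recall")  # recall = sensitivity
    sensitivities.append(scores.mean())

for k, s in enumerate(sensitivities, start=1):
    print(f"Q1-Q{k}: sensitivity {s:.3f}")
```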
Figure 4 shows the correlations among the Q-CHAT-10 attributes only. As shown in the color legend, blue represents a negative correlation, while red represents a positive one (ranging from −1 to 1). As depicted in Figure 4, Q3 was highly correlated with four questions (Q4, Q9, Q1, and Q5). In addition, it was evident that Q3 and Q4 had the largest correlation, of 0.48, followed by Q6 and Q4, which had a correlation of 0.42, and then by Q7 and Q5, which had a correlation of 0.41. Referring to Table 1, Q3 and Q4 both measured whether a child was able to express a certain interest by pointing, which could explain their high correlation. In addition, Q6 and Q4 were related—they measured whether a child looked or pointed to where an adult was looking. Furthermore, Q7 and Q5 measured whether a child showed comfort in playing roles—both questions measured a child’s ability to interact in a given social context. Only Q7 and Q10 had no correlation (close to 0) with each other. Q10 was also only slightly correlated with the other features, achieving at most 0.13, with Q4.
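The pairwise correlations visualized in the heat map can be reproduced with a plain correlation matrix. The study used RStudio; the pandas equivalent below uses synthetic 0/1 responses as a stand-in for the real Q-CHAT-10 columns:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic binary responses standing in for the Q-CHAT-10 columns.
df = pd.DataFrame(rng.integers(0, 2, size=(401, 10)),
                  columns=[f"Q{i}" for i in range(1, 11)])

# Pairwise Pearson correlations, as visualized in the heat map.
corr = df.corr()

# Report the most correlated pair of distinct questions.
mask = ~np.eye(10, dtype=bool)  # ignore the trivial diagonal (r = 1)
i, j = np.unravel_index(np.abs(corr.values * mask).argmax(), corr.shape)
print(corr.columns[i], corr.columns[j], round(corr.values[i, j], 3))
```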
The results (sensitivity, specificity, and accuracy) of the classification algorithms on the entire set of features serve as a point of reference when comparing these algorithms on the other feature subsets, as shown in Figure 5. The accuracies of the LR and Bayes Net models were consistent when processing Sets 2, 3, and 4, compared with the models derived from Set 1. In classifying toddlers for ASD, the Bayes Net algorithm achieved better accuracy on Set 2 than on Set 1: it reported 23 false positives and 14 false negatives on Set 1, compared with 15 false positives and 10 false negatives on Set 2. The Bayes Net and LR algorithms derived models from Sets 2 and 3 with consistent accuracy, specificity, and sensitivity rates. Both algorithms performed worst on Set 4: LR reported 133 false negatives and 61 false positives, while Bayes Net reported 59 false positives and 120 false negatives.
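The sensitivity, specificity, and accuracy figures discussed throughout derive from confusion-matrix counts such as the false-positive and false-negative numbers reported for each set. A minimal helper shows the arithmetic; note that the true-positive and true-negative counts in the example are hypothetical placeholders, since the per-set class totals are not reported here:

```python
def screening_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# Bayes Net on Set 1 reported 23 false positives and 14 false negatives;
# the tp/tn values here are hypothetical, chosen only to illustrate the formulas.
sens, spec, acc = screening_metrics(tp=150, fp=23, tn=120, fn=14)
```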

5. Conclusions, Limitations, and Ethical Implications

This research assessed behavioral features related to the Q-CHAT-10 ASD medical screening method for toddlers by applying machine-learning techniques. The data were first filtered to obtain toddler observations, and descriptive analysis was conducted to understand the data before any further testing. GR, Pearson correlation, and Relief-F feature-selection techniques were used to evaluate social, behavioral, communication, and repetitive-behavior features that could help with the early detection of ASD traits in toddlers. The preliminary feature-selection assessment reported consistent results in terms of the influential features for ASD. It was evident that behavioral traits, such as the social-interaction cognitive behaviors captured by the Q-CHAT-10 responses, were significant features for predicting ASD in toddlers. The results showed that Set 3, made up of four social features (Q4, Q5, Q6, and Q9) and one communication feature (Q8), was significant in determining ASD during clinical screening. The feature-selection results allowed us to create three different feature sets for the classification phase; these could be embedded within a digital system for clinicians to use during the clinical assessment of autism. Social interaction played the biggest part in assessing toddlers and determining whether they could be classified as having ASD traits, at least based on the machine-learning techniques and the dataset we considered. However, the results showed some overlap among the features, with some measuring the same domain within the DSM-5, such as Q3 and Q4 or Q5 and Q7.
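For concreteness, gain-ratio (GR) scores of the kind listed in Table 3 are the information gain of a feature about the class, normalised by the feature's own entropy. The sketch below is a minimal NumPy implementation for discrete features; the study itself used WEKA [44], so this illustrates the measure rather than the exact tool used:

```python
import numpy as np

def entropy(values):
    """Shannon entropy (in bits) of a discrete array."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def gain_ratio(x, y):
    """Information gain of feature x about class y, normalised by H(x)."""
    h_y = entropy(y)
    h_y_given_x = sum((x == v).mean() * entropy(y[x == v]) for v in np.unique(x))
    h_x = entropy(x)
    return (h_y - h_y_given_x) / h_x if h_x > 0 else 0.0

# Sanity check: a feature identical to the class carries maximal information
y = np.array([0, 1] * 50)
perfect = gain_ratio(y.copy(), y)  # 1.0
```

Ranking all candidate features by this score and keeping the top entries yields subsets analogous to Set 3 in Table 4.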
The predictive models derived by the machine-learning algorithms (LR and Bayes Net) showed competitive performance in classifying toddlers. When processing Set 1, which contained all features, the derived models provided high accuracy, specificity, and sensitivity rates. The accuracy rates of the LR and Bayes Net algorithms on the behavioral data subsets (2 and 3) were within acceptable medical-standard rates, reaching up to 95% accuracy with the LR algorithm. In addition, the models derived from Subset 3 (the minimal data subset) showed competitive sensitivity rates, reaching up to 89% with the Bayes Net algorithm. This demonstrates that processing only five toddler behavioral features with data-driven algorithms produces competitive accuracy, specificity, and sensitivity.
A limitation of this study was the low number of toddler data observations available for generating the models. In addition, the study was restricted to the machine-learning and feature-selection techniques described; including other techniques, especially advanced AI methods such as deep learning that can exploit other feature types, such as eye-tracking data, may be a way forward. In the near future, we plan to include more data observations and expand the set of techniques used. This will allow a more comprehensive conclusion about how feature selection affects the detection of ASD, especially when including complex features related to play or social interaction captured from videos, as well as the movements of toddlers.
The application of AI and machine learning to ASD classification presents a variety of ethical implications that require careful consideration. While these technologies hold promise for enhancing early detection and improving personalized interventions for individuals on the spectrum, they also raise concerns around privacy, bias, and informed consent. Privacy is a primary concern for individuals with ASD: AI systems often require vast amounts of data for training, including personal information about patients, so ensuring the security and proper handling of these data is crucial to prevent unauthorized access.
Moreover, bias in classification models is an ethical challenge. If the training data are not diverse and sufficiently representative, the resulting models may reflect that bias and lead to inaccurate diagnoses. Efforts must be made to mitigate bias and ensure fairness in the development and deployment of these systems. Finally, informed consent becomes a critical issue when using AI and machine-learning techniques for ASD classification. Individuals and their families should have a clear understanding of how their data will be used, the potential benefits, and the limitations of automated assessments.

Author Contributions

Conceptualization, R.T. and F.T.; methodology, N.A., R.T. and F.T.; software, N.A., R.T. and H.M.; validation, N.A. and F.T.; formal analysis, H.M. and N.A.; investigation, N.A., R.T. and H.M.; data curation, N.A. and R.T.; writing, N.A., R.T., H.M. and F.T.; review and editing, N.A. and F.T.; visualization, N.A., R.T. and H.M.; supervision, F.T. and N.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not Applicable.

Informed Consent Statement

Not Applicable.

Data Availability Statement

The data are available in the Kaggle data repository and were originally generated by the Autism AI authors.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lord, C.; Elsabbagh, M.; Baird, G.; Veenstra-Vanderweele, J. Autism spectrum disorder. Lancet 2018, 392, 508–520. [Google Scholar] [CrossRef] [PubMed]
  2. World Health Organization. Autism Spectrum Disorders. 2023. Available online: https://www.who.int/news-room/fact-sheets/detail/autism-spectrum-disorders (accessed on 22 March 2023).
  3. Zeidan, J.; Fombonne, E.; Scorah, J.; Ibrahim, A.; Durkin, M.S.; Saxena, S.; Yusuf, A.; Shih, A.; Elsabbagh, M. Global prevalence of autism: A systematic review update. Autism Res. 2022, 15, 778–790. [Google Scholar] [CrossRef] [PubMed]
  4. Kosmicki, J.A.; Sochat, V.; Duda, M.; Wall, D.P. Searching for a minimal set of behaviors for autism detection through feature selection-based machine learning. Transl. Psychiatry 2015, 5, e514. [Google Scholar] [CrossRef]
  5. Hof, M.; Tisseur, C.; van Berckelear-Onnes, I.; van Nieuwenhuyzen, A.; Daniels, A.M.; Deen, M.; Hoek, H.W.; Ester, W.A. Age at autism spectrum disorder diagnosis: A systematic review and meta-analysis from 2012 to 2019. Autism 2020, 25, 862–873. [Google Scholar] [CrossRef]
  6. Centers for Disease Control and Prevention. ADDM Community Report 2023. Available online: https://www.cdc.gov/ncbddd/autism/pdf/ADDM-Community-Report-SY2020-h.pdf (accessed on 23 June 2023).
  7. Hargitai, L.D.; Livingston, L.A.; Waldren, L.H.; Robinson, R.; Jarrold, C.; Shah, P. Attention-deficit hyperactivity disorder traits are a more important predictor of internalising problems than autistic traits. Sci. Rep. 2023, 13, 31. [Google Scholar] [CrossRef] [PubMed]
  8. Dash, S.; Shakyawar, S.K.; Sharma, M.; Kaushik, S. Big data in healthcare: Management, analysis and future prospects. J. Big Data 2019, 6, 54. [Google Scholar] [CrossRef]
  9. Ahsan, M.M.; Luna, S.A.; Siddique, Z. Machine-Learning-Based Disease Diagnosis: A Comprehensive Review. Healthcare 2022, 10, 541. [Google Scholar] [CrossRef] [PubMed]
  10. Zhu, F.L.; Wang, S.H.; Liu, W.B.; Zhu, H.L.; Li, M.; Zou, X.B. A multimodal machine learning system in early screening for toddlers with autism spectrum disorders based on the response to name. Front. Psychiatry 2023, 14, 1039293. [Google Scholar] [CrossRef] [PubMed]
  11. Chen, Y.H.; Chen, Q.; Kong, L.; Liu, G. Early detection of autism spectrum disorder in young children with machine learning using medical claims data. BMJ Health Care Inf. 2022, 29, e100544. [Google Scholar] [CrossRef]
  12. Shahamiri, S.R.; Thabtah, F.; Abdelhamid, N. A new classification system for autism based on machine learning of artificial intelligence. Technol. Health Care 2022, 30, 605–622. [Google Scholar] [CrossRef]
  13. Thabtah, F.; Spencer, R.; Abdelhamid, N.; Kamalov, F.; Wentzel, C.; Ye, Y.; Dayara, T. Autism screening: An unsupervised machine learning approach. Health Inf. Sci. Syst. 2022, 10, 26. [Google Scholar] [CrossRef]
  14. Scott, A.J.W.; Wang, Y.; Abdel-Jaber, H.; Thabtah, F.; Ray, S.K. Improving screening systems of autism using data sampling. Technol. Health Care 2021, 29, 897–909. [Google Scholar] [CrossRef] [PubMed]
  15. Abdelhamid, N.; Padmavathy, A.; Peebles, D.; Thabtah, F.; Goulder-Horobin, D. Data Imbalance in Autism Pre-Diagnosis Classification Systems: An Experimental Study. J. Inf. Knowl. Manag. 2020, 19, 2040014. [Google Scholar] [CrossRef]
  16. Erkan, U.; Thanh, D.N. Autism Spectrum Disorder Detection with Machine Learning Methods. Curr. Psychiatry Res. Rev. 2020, 15, 297–308. [Google Scholar] [CrossRef]
  17. Chan, S.; Thabtah, F.; Abdel-Jaber, H.; Guerrero, F. Autism detection for toddlers from behavioural indicators using classification techniques. Intell. Decis. Technol. 2022, 16, 589–599. [Google Scholar] [CrossRef]
  18. Rajab, K.D.; Padmavathy, A.; Thabtah, F. Machine Learning Application for Predicting Autistic Traits in Toddlers. Arab. J. Sci. Eng. 2021, 46, 3793–3805. [Google Scholar] [CrossRef]
  19. American Psychiatric Association. Cautionary statement for forensic use of DSM-5. In Diagnostic and Statistical Manual of Mental Disorders, 5th ed.; American Psychiatric Association: Washington, DC, USA, 2013. [Google Scholar]
  20. Shahamiri, S.R.; Thabtah, F. Autism AI: A New Autism Screening System Based on Artificial Intelligence. Cogn. Comput. 2020, 12, 766–777. [Google Scholar] [CrossRef]
  21. Allison, C.; Baron-Cohen, S.; Wheelwright, S.; Charman, T.; Richler, J.; Pasco, G.; Brayne, C. The Q-CHAT (Quantitative CHecklist for Autism in Toddlers): A Normally Distributed Quantitative Measure of Autistic Traits at 18–24 Months of Age: Preliminary Report. J. Autism Dev. Disord. 2008, 38, 1414–1425. [Google Scholar] [CrossRef]
  22. Thabtah, F.; Hammoud, S.; Kamalov, F.; Gonsalves, A. Data imbalance in classification: Experimental evaluation. Inf. Sci. 2020, 513, 429–441. [Google Scholar] [CrossRef]
  23. Kazemi, V.; Sullivan, J. One millisecond face alignment with an ensemble of regression trees. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014. [Google Scholar] [CrossRef]
  24. Anderson, D.K.; Oti, R.S.; Lord, C.; Welch, K. Patterns of Growth in Adaptive Social Abilities Among Children with Autism Spectrum Disorders. J. Abnorm. Child Psychol. 2009, 37, 1019–1034. [Google Scholar] [CrossRef]
  25. Thabtah, F. ASD Tests. Google.com. 2017. Available online: https://play.google.com/store/apps/details?id=com.asd.asdquiz&hl=en (accessed on 11 December 2022).
  26. Thabtah, F.; Kamalov, F.; Rajab, K. A new computational intelligence approach to detect autistic features for autism screening. Int. J. Med. Inform. 2018, 117, 112–124. [Google Scholar] [CrossRef] [PubMed]
  27. Washington, P.; Paskov, K.M.; Kalantarian, H.; Stockham, N.; Voss, C.; Kline, A.; Patnaik, R.; Chrisman, B.; Varma, M.; Tariq, Q.; et al. Feature Selection and Dimension Reduction of Social Autism Data. Pac. Symp. Biocomput. 2020, 25, 707–718. [Google Scholar] [CrossRef]
  28. Constantino, J.N. Social Responsiveness Scale. Encycl. Autism Spectr. Disord. 2013, 2919–2929. [Google Scholar] [CrossRef]
  29. Raj, S.; Masood, S. Analysis and Detection of Autism Spectrum Disorder Using Machine Learning Techniques. Procedia Comput. Sci. 2020, 167, 994–1004. [Google Scholar] [CrossRef]
  30. Rahman, R.; Kodesh, A.; Levine, S.Z.; Sandin, S.; Reichenberg, A.; Schlessinger, A. Identification of newborns at risk for autism using electronic medical records and machine learning. Eur. Psychiatry 2020, 63, e22. [Google Scholar] [CrossRef]
  31. Molloy, C.A.; Murray, D.S.; Akers, R.; Mitchell, T.; Manning-Courtney, P. Use of the Autism Diagnostic Observation Schedule (ADOS) in a clinical setting. Autism 2011, 15, 143–162. [Google Scholar] [CrossRef]
  32. Achenie, L.E.K.; Scarpa, A.; Factor, R.S.; Wang, T.; Robins, D.L.; McCrickard, D.S. A Machine Learning Strategy for Autism Screening in Toddlers. J. Dev. Behav. Pediatr. 2019, 40, 369–376. [Google Scholar] [CrossRef]
  33. Chlebowski, C.; Robins, D.L.; Barton, M.L.; Fein, D. Large-Scale Use of the Modified Checklist for Autism in Low-Risk Toddlers. Pediatrics 2013, 131, e1121–e1127. [Google Scholar] [CrossRef]
  34. Schalkoff, R.J. Artificial Neural Networks; McGraw-Hill Science, Engineering & Mathematics: New York, NY, USA, 1997. [Google Scholar]
  35. Baron-Cohen, S.; Wheelwright, S.; Cox, A.; Baird, G.; Charman, T.; Swettenham, J.; Drew, A.; Doehring, P. Early identification of autism by the CHecklist for Autism in Toddlers (CHAT). J. R. Soc. Med. 2000, 93, 521–525. [Google Scholar] [CrossRef]
  36. Kira, K.; Rendell, L.A. A Practical Approach to Feature Selection. Mach. Learn. Proc. 1992, 1992, 249–256. [Google Scholar] [CrossRef]
  37. Priyadarsini, R.P.; Valarmathi, M.L.; Sivakumari, S. Gain Ratio Based Feature Selection Method For Privacy Preservation. ICTACT J. Soft Comput. 2011, 1, 201–205. [Google Scholar]
  38. Trabelsi, M.; Meddouri, N.; Maddouri, M. A New Feature Selection Method for Nominal Classifier based on Formal Concept Analysis. Procedia Comput. Sci. 2017, 112, 186–194. [Google Scholar] [CrossRef]
  39. Robnik-Šikonja, M.; Kononenko, I. An adaptation of Relief for attribute estimation in regression. In Proceedings of the Fourteenth International Conference on Machine Learning (ICML'97), Nashville, TN, USA, 8–12 July 1997; pp. 296–304.
  40. Vishal, V.; Singh, A.K.; Jinila, Y.B.; Shyry, S.P.; Jabez, J. A Comparative Analysis of Prediction of Autism Spectrum Disorder (ASD) using Machine Learning. In Proceedings of the 2022 6th International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 28–30 April 2022; pp. 1355–1358. [Google Scholar]
  41. Thabtah, F.; Abdelhamid, N.; Peebles, D. A machine learning autism classification based on logistic regression analysis. Health Information Science and Systems 2019, 7, 12. [Google Scholar] [CrossRef]
  42. Masum, M.N.; Faruk, A.; Shahriar, H. A Comparative Study of Machine Learning-based Autism Spectrum Disorder Detection with Feature Importance Analysis. STI 2022. Available online: https://www.researchgate.net/publication/359081817_A_Comparative_Study_of_Machine_Learningbased_Autism_Spectrum_Disorder_Detection_with_Feature_Importance_Analysis (accessed on 17 February 2023).
  43. Hanif, M.K.; Ashraf, N.; Sarwar, M.U.; Adinew, D.M.; Yaqoob, R. Employing Machine Learning-Based Predictive Analytical Approaches to Classify Autism Spectrum Disorder Types. Complexity 2022, 2022, 1–10. [Google Scholar] [CrossRef]
  44. Smith, T.C.; Frank, E. Introducing Machine Learning Concepts with WEKA. Methods Mol. Biol. 2016, 1418, 353–378. [Google Scholar] [CrossRef]
  45. Parmeggiani, A.; Corinaldesi, A.; Posar, A. Early features of autism spectrum disorder: A cross-sectional study. Ital. J. Pediatr. 2019, 45, 144. [Google Scholar] [CrossRef]
Figure 1. Methodology Followed.
Figure 2. Breakdown of scores by males and females.
Figure 3. Sensitivity changes as attributes are added.
Figure 4. Feature-to-feature correlation for Q-CHAT-10 attributes.
Figure 5. Accuracy, specificity, and sensitivity rates derived from different data subsets.
Table 1. Literature review summary.
Methods Used | Data Used | Performance | Reference
DLiB library; deep learning; Kaldi toolkit | 125 toddlers | Sensitivity: 80.00%; Specificity: 69.80%; Accuracy: 74.80% | [10]
LR; RF | IBM MarketScan Health Claims database; 38,576 observations | RF: AUC-ROC: 78.00%, Specificity: 93.00%. LR: AUC-ROC: 76.00%, Specificity: 90.00% | [11]
CNN | Dataset collected by ASDTests; 6075 observations | Accuracy: 95.53%; Sensitivity: 97.63%; Specificity: 98.63% | [12]
SOM; RF; NB | 2000 observations | NB: Accuracy: 93.00%, Precision: 93.00%, Recall: 94.00%. RF: Accuracy: 96.00%, Precision: 96.00%, Recall: 97.00% | [13]
C4.5; RIPPER; RF; NB | ASDTest dataset; 1054 toddler data observations; no data sampling | NB: Sensitivity: 96.20%. C4.5: Sensitivity: 92.30%. RIPPER: Sensitivity: 92.40%. RF: Sensitivity: 95.30% | [15]
Symmetrical uncertainty (SU), IG, fast-correlation-based filter (FCBF), leave-one-out cross-validation (LOOCV), Gini index, and chi-square; ID3; AdaBoost; kNN | ASDTest dataset; 1054 toddler data observations; no data sampling | Sensitivity rates between 93% and 98%, depending on the feature sets used by the classification algorithm; the highest sensitivity rate was achieved by AdaBoost | [18]
NB with data sampling: SMOTE; RUS | ASDTest dataset; 1118 adult data observations | SMOTE + NB: Sensitivity: 96.00%. RUS + NB: Sensitivity: 94.00%. No sampling + NB: Sensitivity: 93.00% | [22]
mRMR and chi-square feature selection; C4.5; RIPPER; RF; NB; SVM | ASDTest dataset; 1054 toddler data observations; no data sampling | Sensitivity rates between 93% and 97.50%, depending on the feature sets used by the classification algorithm; the highest sensitivity rate was achieved by the SVM | [17]
NB with data sampling: SMOTE; ROS; RUS | ASDTest dataset; over 1000 observations | SMOTE + NB: Sensitivity: 95.00%, Specificity: 94.00%, Precision: 95.90%. ROS + NB: Sensitivity: 94.20%, Specificity: 96.45%, Precision: 94.30%. RUS + NB: Sensitivity: 94.30%, Specificity: 96.09%, Precision: 94.30% | [14]
kNN; SVM; RF | ASD dataset of the UCI machine-learning data repository | kNN: Accuracy: 95.70%, Sensitivity: 95.15%, F-measure: 94.64%, AUC: 96.00%. SVM: Accuracy: 99.90%, Sensitivity: 99.90%, F-measure: 99.90%, AUC: 100%. RF: Accuracy: 99.90%, Sensitivity: 99.90%, F-measure: 99.90%, AUC: 99.90% | [16]
Multilayer perceptron (MLP) classifier | Social Responsiveness Scale (SRS) [28] child/adolescent questionnaire; 16,527 children/adolescents | AUC: 92.00% | [27]
SVM; CNN; ANN | Integrated data from the UCI machine-learning data repository, consisting of three datasets with 20 common attributes | CNN: Accuracy: 98.30%. SVM: Accuracy: 97.95%. ANN: Accuracy: 97.60% | [29]
Multivariate LR; MLP; RF | EMR data from a single Israeli health maintenance organization; 96,138 children's EMRs | Multivariate LR: Accuracy: 94.90%, AUC: 72.70%. MLP: Accuracy: 95.50%, AUC: 70.00%. RF: Accuracy: 96.50%, AUC: 69.30% | [30]
Table 2. Mapping Q-CHAT-10 items to the corresponding DSM-5 domains.
Question Number | Question Details | Corresponding DSM-5 Domain
Q1 | Does your child look at you when you call his/her name? | Deficits in social communication and interaction (problems with social initiation and response)
Q2 | How easy is it for you to have eye contact with your child? | Deficits in social communication and interaction (non-verbal communication problems)
Q3 | Does your child point to indicate that s/he wants something (e.g., a toy that is out of reach)? | Deficits in joint attention and social communication and interaction (non-verbal communication problems)
Q4 | Does your child point to share interest with you (e.g., pointing at an interesting sight)? | Deficits in joint attention and social communication and interaction (non-verbal communication problems)
Q5 | Does your child pretend (e.g., care for dolls, talk on a toy phone)? | Deficits in social communication and interaction related to pretend play
Q6 | Does your child follow where you are looking? | Deficits in joint attention and social communication and interaction (non-verbal communication problems)
Q7 | If you or someone else in the family is visibly upset, does your child show signs of wanting to comfort them (e.g., stroking their hair, hugging them)? | Deficits in social communication and interaction (problems with social initiation and response)
Q8 | Would you describe your child's first words as: very typical, quite typical, slightly unusual, very unusual, or my child doesn't speak? | Deficits in social communication and interaction related to language development; stereotyped/repetitive speech
Q9 | Does your child use simple gestures (e.g., wave goodbye)? | Deficits in social communication and interaction (non-verbal communication problems)
Q10 | Does your child stare at nothing with no apparent purpose? | Restricted and repetitive patterns of behavior, interests, or activities (stereotyped behaviors)
Table 3. Feature selection results.
Attribute | Gain Ratio Score | Attribute | Pearson Correlation Score | Attribute | Relief Score
Q6 | 0.281297 | Q6 | 0.5978 | Q6 | 0.30551
Q9 | 0.263325 | Q9 | 0.5653 | Q5 | 0.30501
Q5 | 0.240884 | Q5 | 0.5492 | Q8 | 0.28446
Q4 | 0.222894 | Q4 | 0.5229 | Q9 | 0.28521
Q3 | 0.222216 | Q8 | 0.5163 | Q4 | 0.24612
Q8 | 0.20949 | Q7 | 0.4805 | Q3 | 0.20902
Q7 | 0.182863 | Q3 | 0.4783 | Q2 | 0.19799
Q2 | 0.163217 | Q2 | 0.461 | Q7 | 0.19098
Q1 | 0.146544 | Q1 | 0.4181 | Q1 | 0.17945
FamilyASDHistory | 0.013727 | FamilyASDHistory | 0.1316 | Q10 | 0.05514
Ethnicity | 0.006148 | Age | 0.1201 | Ethnicity | 0.02957
Jaundice | 0.006557 | Jaundice | 0.0903 | FamilyASDHistory | 0.02331
User | 0.005728 | User | 0.0802 | User | 0.01805
Q10 | 0.000547 | Q10 | 0.0272 | Age | 0.00827
Sex | 0.000169 | Ethnicity | 0.0252 | Jaundice | 0.001
Age | 0 | Sex | 0.0137 | Sex | −0.00802
Table 4. Feature sets to be tested by classification.
Set 1 (no feature selection): Q1 (communication), Q2 (social interaction), Q3 (communication), Q4 (social interaction), Q5 (social interaction), Q6 (social interaction), Q7 (social interaction), Q8 (communication), Q9 (social interaction), Q10 (repetitive patterns), Age, Sex, Ethnicity, Jaundice, Family ASD history, User.
Set 2 (Q1 to Q9): Q1, Q2, Q3, Q4, Q5, Q6, Q7, Q8, Q9.
Set 3 (highest-scoring attributes): Q6, Q9, Q5, Q8, Q4.
Set 4 (secondary/lowest-scoring attributes): FamilyASDHistory, Age, Q10, User, Jaundice, Ethnicity, Sex.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Abdelhamid, N.; Thind, R.; Mohammad, H.; Thabtah, F. Assessing Autistic Traits in Toddlers Using a Data-Driven Approach with DSM-5 Mapping. Bioengineering 2023, 10, 1131. https://doi.org/10.3390/bioengineering10101131
