A Hybrid CNN-LSTM Random Forest Model for Dysgraphia Classification from Hand-Written Characters with Uniform/Normal Distribution

Masood, Fahad; Khan, Wajid Ullah; Ullah, Khalil; Khan, Ahmad; Alghamedy, Fatemah H.; Aljuaid, Hanan

doi:10.3390/app13074275


by Fahad Masood ¹,², Wajid Ullah Khan ¹, Khalil Ullah ³, Ahmad Khan ⁴, Fatemah H. Alghamedy ⁵ and Hanan Aljuaid ⁶,*

¹ Department of Computing, Abasyn University, Peshawar 25000, Pakistan

² Department of Electronics, Quaid i Azam University, Islamabad 45320, Pakistan

³ Department of Software Engineering, University of Malakand, Chakdara 18800, Pakistan

⁴ Department of Software Engineering, Mirpur University of Science and Technology, Mirpur 10250, Pakistan

⁵ Department of Computer, Applied College, Imam Abdulrahman Bin Faisal University, Dammam 31441, Saudi Arabia

⁶ Department of Computer Science, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University Riyadh, Riyadh 11671, Saudi Arabia

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(7), 4275; https://doi.org/10.3390/app13074275

Submission received: 7 March 2023 / Revised: 22 March 2023 / Accepted: 22 March 2023 / Published: 28 March 2023

(This article belongs to the Special Issue Artificial Intelligence and Robotics in Healthcare)


Abstract

Parkinson’s disease (PD) dysgraphia is a disorder that affects most PD patients and is characterized by handwriting anomalies caused mostly by motor dysfunction. Several effective approaches to quantifying PD dysgraphia have been used, including online handwriting processing. In this research, an integrated approach that combines convolutional neural network (CNN) and long short-term memory (LSTM) layers with a Random Forest (RF) classifier is proposed for dysgraphia classification. The proposed approach uses uniform and normal distributions to randomly initialize the weights and biases of the CNN and LSTM layers. The CNN-LSTM model predictions are paired with the RF classifier to enhance the model’s accuracy and robustness. The suggested method shows promise in identifying handwritten symbols produced by people with dysgraphia, with the RF classifier improving the accuracy of the CNN-LSTM model. The suggested strategy may assist people with dysgraphia in writing tasks and enhance their general writing skills. The experimental results indicate that the suggested approach reaches a maximum accuracy of 96.1% with uniform initialization and 97.6% with normal initialization.

Keywords: handwritten character recognition; Random Forest; CNN; LSTM

1. Introduction

Dysgraphia is a neurological condition that affects the written representation of words and symbols. In modern culture, dysgraphia may be a major issue, as we still rely largely on our capacity to communicate through written language. Moreover, dysgraphia in a school context might have an impact on a child’s self-esteem, normal development, and academic results [1,2]. Early diagnosis that accurately identifies the source of a child’s learning problem allows the child to receive assistance and improve their writing sooner, and it also helps teachers adapt their teaching methods [3].

Dysgraphia is frequently linked to other conditions such as dyslexia. These conditions appear to involve comparable brain regions on a neurophysiological basis [4]. Dysgraphia is also related to developmental coordination issues [5,6], as well as a broader oral and written language acquisition difficulty [7]. Dysgraphia does not constitute a unified concept [8] and can be characterized by a variety of handwriting characteristics [9]. Acquired dysgraphia is less prevalent and is generally caused by an accident or illness that affects parts of the brain [10]. Specific handwriting alterations are common in patients with Alzheimer’s disease [11], and handwriting is likewise affected in Parkinson’s disease dysgraphia. Figure 1 shows handwriting samples from a non-dysgraphic individual and a dysgraphia patient.

Handwritten character recognition for dysgraphia patients, especially the identification of broken text, is a relatively challenging task. The use of machine learning and deep learning approaches has resulted in a significant rise in the text recognition rate [12,13,14]. Recent developments in digital image processing and machine/deep learning models have shown good results in various applied domains and can be used in OCR [15,16,17,18]. Recurrent Neural Networks (RNNs) can capture contextual knowledge by carrying information from all preceding inputs forward to each output, building a memory through their sequential processing of the data.

The convolutional neural network (CNN) is a deep learning architecture that is biologically inspired by the human brain’s sensory perception process. Convolution layers are used for feature selection. CNN layers are classified into three types: convolutional, pooling, and fully connected. CNN and long short-term memory (LSTM) are among the most extensively used deep learning algorithms. Compared with typical machine learning methods, which apply a linear or nonlinear feature transformation in a single layer, deep learning architectures allow more sophisticated models to be learned that retrieve hierarchical characteristics from data [19].

The proposed approach in this study has considerable implications for people who have dysgraphia, a neurological condition that affects writing ability. The hand-written character recognition system can help people with dysgraphia write more quickly and accurately. The CNN and LSTM layers’ weights and biases are arbitrarily initialized using uniform and normal distributions, which aids in avoiding local minima and improves the model’s overall performance. Additionally, the CNN-LSTM model’s final layer’s feature extraction procedure enables the extraction of significant and instructive features that can be applied to classification using a Random Forest model. The importance of this research for people with dysgraphia lies in its ability to enhance their general quality of life. Dysgraphia can have an effect on several aspects of quality of life, such as social interactions, employment, and schooling. The suggested method can help people with dysgraphia become more proficient writers, which can enhance their success in school and at work. Additionally, this study may have an impact on enhancing the technology’s usability for people with impairments. Overall, the suggested method may have significant effects on raising the standard of living for people who have dysgraphia and other writing disabilities.

The rest of the paper is organized as follows: Section 2 discusses the previously reported deep learning algorithms for handwritten character recognition. Section 3 describes the detailed methodology, as well as data preparation for the proposed model. It also includes a full discussion of the model implementation. Section 4 describes the CNN-LSTM model’s performance as well as the experimental data. Section 5 offers a concise description of the research work.

2. Literature Review

There have been various initiatives to use machine learning to detect dysgraphia. Different aspects of handwriting may be tested and studied using tablets. The features to be retrieved, such as writing speed, breaks, and pen lifting, are motivated by cognitive and neurological studies on dysgraphia [20,21]. Digitized testing offers features that could not previously be measured, such as handwriting speed, pressure, in-air movement, and acceleration, in contrast to conventional clinical testing, which primarily relies on static features, such as writing density or text shape and the amount of time needed to complete tasks. Asselborn et al. distinguished between four categories of characteristics, kinematic, static, tilt, and pressure [9]. Similar characteristics were used by Mekyska et al., as well as non-linear dynamic and kinematic features [22]. Three machine learning techniques are most often employed to recognize dysgraphia. Random forests were used instead of the conventional BHK testing by Asselborn et al. Random forests were used by Mekyska et al. to identify dysgraphia in 8- to 9-year-olds [22].

Support vector machines (SVMs) were used in another approach by Rosenbloom and Dror to detect dysgraphia in young people [23]. SVM was also used by Kurniawan et al. to recognize dysgraphia [24]. In their study, children were required to write in a straight line on a mobile screen, which differs from the regular writing environment. Another approach used to identify dysgraphia is neural networks (NNs). Simple NNs with six hidden neurons were used by Samodro and Sihwi [25]. Deep learning was used by Kariyawasham et al. to test for dysgraphia [26]. Others experimented with enhanced features for the analysis; for instance, Zvoncak et al. employed derivative and wavelet transform features [27,28].

The ability of CNNs to recognize significant properties of an object (a handwritten character, an image, a face, etc.) automatically without human supervision or involvement makes them more effective than their counterparts (Multi-layer perceptron (MLP), etc.). Simard et al. developed a generic convolutional neural network architecture for processing and eliminating the difficult neural network training procedure for visual documents [29]. Wang et al. introduced a unique strategy for end-to-end text recognition based on multi-layer CNNs, achieving good results [30].

For semantic segmentation, Badrinarayanan et al. presented a deep convolution network design. SegNet is a segmentation architecture that comprises an encoder network, a pixel classification layer, and a decoder network. While decoding, the suggested approach exploited max pooling parameters of feature space and achieved good results. The strategy was also studied and compared to existing strategies for interpreting road scenes and inside environments [31,32,33]. A unique multi-objective optimization framework is proposed for selecting the most relevant local areas from a character picture. The work was also assessed using solitary handwritten English numbers [34].

The performance of CNNs is mostly determined by the selection of hyperparameters that are typically determined by trial and error [35]. Various hyper-parameters, such as activation function, number of epochs, learning rate, kernel size, hidden layers, hidden units, etc. are used to control the method an algorithm is trained on for learning from data [36]. Hyper-parameters are distinct from model parameters and should be chosen before training begins. Some common CNN models are GoogleNet, AlexNet, and VGG-16 which feature 150, 78, 57, and 27 hyper-parameters, respectively [37,38,39,40,41].

Several studies have been implemented using uniform and normal distributions to solve random variable problems for different object sizes [42,43]. The analytical average was performed and the results are compared with the results obtained from the numerical average. A comprehensive survey was performed for dysgraphia diagnosis based on machine learning and non-machine methods [44]. These methods include psychological, machine learning-based, commercial-based, and automated diagnosis without machine learning. The primary goal of the research was to examine artificial intelligence-based solutions for diagnosing dysgraphia in children. This study assesses the data-collecting strategy, significant handwriting traits, and machine learning methods used in the literature for dysgraphia diagnosis. This also covers certain non-artificial intelligence-based automated systems. Furthermore, the shortcomings of existing techniques are discussed, and a novel paradigm for dysgraphia diagnosis is suggested.

3. Materials and Methods

The proposed architecture is divided into various sections as shown in Figure 2. A CNN-LSTM model is developed using uniform and normal distributions to randomly determine the weights and biases of the CNN and LSTM layers. The model’s weights and biases are optimized using a collection of classified samples after training on the preprocessed dataset of handwritten characters. A Random Forest classifier is trained using the activations of the CNN-LSTM model’s final layer as variables. The features are obtained for each image in the preprocessed dataset. The extracted features are used to train a Random Forest classifier, which increases the model’s resilience and accuracy. To increase accuracy even more, the classifier combines the CNN-LSTM model’s predictions. The model is also tested on a collection of images produced by people with dysgraphia to determine its ability to identify handwriting from people with dysgraphia. The performance of the approach is compared to that of other current methods for hand-written character identification, such as conventional machine learning and deep learning methods.

3.1. Data Collection

The dataset used in this research was collected by trained professionals from 200 individuals (both dysgraphic and non-dysgraphic). Their ages ranged from eight to eleven years; 50 of them were left-handed, and the remaining 150 were right-handed. Sixty percent were male and 40% were female. For data collection, a digitizing tablet with a wireless pen with a pressure-sensitive tip was used. This digitizer captures temporal, kinematic, and pressure data with high accuracy while the electronic pen is on the surface or in the air, and it also produces accurate spatial measurements when the pen is hovering above the surface. The advantage of employing in-air data is that it records the rotation and movement of the student’s wrist and hand even when the pen is not on the screen, which can offer highly significant information.

3.2. Uniform and Normal Distributions

A uniform distribution is a continuous probability distribution in which all values within a particular interval are equally likely. A normal distribution is a continuous probability distribution that is symmetric about its mean, with values near the mean more likely than values far from it. Both distributions can be used to set the weights and biases of the CNN and LSTM layers randomly but within a predetermined range or scale. The random start breaks the symmetry between the weights and biases, which can help avoid overfitting and improve convergence during training. The expected value of a function f(x) under either distribution is given by the following integral [43],

\langle f(x) \rangle = \int_{-\infty}^{+\infty} f(x) \, P_{x}(x) \, dx

(1)

where the probability density function (pdf) P_{x}(x) of a uniformly or normally distributed variable x is given by Equations (2) and (3), respectively,

P_{x}(x) = \begin{cases} \frac{1}{n - m}, & m \leq x \leq n \\ 0, & x < m \text{ or } x > n \end{cases}

(2)

P_{x}(x) = \frac{1}{\sigma_{p} \sqrt{2 \pi}} \exp\left[- \frac{(x - \mu_{p})^{2}}{2 \sigma_{p}^{2}}\right]

(3)

where \mu_{p} and \sigma_{p}^{2} are the mean and variance. For the uniform distribution on [m, n], they follow from Equation (1) as,

\mu_{p} = E(X) = \int_{m}^{n} x \, \frac{1}{n - m} \, dx = \frac{m + n}{2}

(4)

\sigma_{p}^{2} = E\left((X - \mu_{p})^{2}\right) = \int_{m}^{n} (x - \mu_{p})^{2} \, \frac{1}{n - m} \, dx = \frac{(n - m)^{2}}{12}

(5)
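As a rough illustration of the initialization described in this subsection, the sketch below draws weights from a uniform and a normal distribution and compares the sample moments of the uniform draws with the closed-form mean and variance of a uniform distribution. The range (-0.05, 0.05), the standard deviation 0.05, the sample count, and all function names are assumptions made for the example; the paper does not report the values it used.

```python
import random


def init_uniform(n_weights, low, high, rng):
    """Draw weights from U(low, high), the density in Equation (2)."""
    return [rng.uniform(low, high) for _ in range(n_weights)]


def init_normal(n_weights, mean, std, rng):
    """Draw weights from N(mean, std^2), the density in Equation (3)."""
    return [rng.gauss(mean, std) for _ in range(n_weights)]


def sample_mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    mu = sample_mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical range and scale; not taken from the paper.
low, high = -0.05, 0.05
w_uniform = init_uniform(100_000, low, high, rng)
w_normal = init_normal(100_000, 0.0, 0.05, rng)

# Closed-form moments of U(m, n): mean (m + n) / 2, variance (n - m)^2 / 12.
uniform_mean = (low + high) / 2
uniform_var = (high - low) ** 2 / 12

print("uniform mean:", sample_mean(w_uniform), "expected:", uniform_mean)
print("uniform variance:", sample_variance(w_uniform), "expected:", uniform_var)
print("normal variance:", sample_variance(w_normal), "expected:", 0.05 ** 2)
```

With a large number of draws, the sample moments should land close to the closed-form values, which is a quick sanity check on any initializer of this kind.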

3.3. Convolutional Neural Network

CNN is a multi-layered deep neural network (DNN) whose design derives from the concept of the receptive field and from the neocognitron. Because it includes additional deep layers, a DNN model learns richer representations than shallower neural networks. CNN has a strong distortion tolerance due to its spatial configuration and weight-sharing method when dealing with image recognition and classification [35].

During feature extraction, CNN’s essential structures include the combination of a local receptive field, weight sharing, and subsampling along with dimensionality reduction. The weight-sharing technique is used to minimize model complexity, improve performance, and efficiently govern the number of weights. The input image data are mapped to an output, which means it receives incoming image data, analyses it, and accurately predicts the class of the image.

CNN handles input data arranged as a two-dimensional array more effectively. We employed CNN in the CNN-LSTM model to extract meaningful features from the input. The handwritten text image is used as input in this stage and processed to extract its characteristics.

3.4. Long Short-Term Memory

Natural language processing relies heavily on RNNs, particularly LSTM. RNNs, unlike feedforward networks, include feedback loops and can handle tasks based on a temporal sequence. Researchers introduced LSTM to overcome the problem of gradient explosion or vanishing that can occur while training standard RNNs. It typically consists of a cell, an input gate, a forget gate, and an output gate. LSTM is used in this study to categorize the input data into several categories. For an input vector x and the trained weights, the perceptron output y is calculated by the following Equation [45],

y = f (\sum_{i = 1}^{N} W_{i} x_{i} + b)

(6)
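Equation (6) can be sketched in a few lines. The sigmoid activation, the function names, and the numeric values below are illustrative assumptions; the text does not specify which activation f is used at this point.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def perceptron(x, weights, bias, activation=sigmoid):
    """Equation (6): y = f(sum_i W_i * x_i + b)."""
    if len(x) != len(weights):
        raise ValueError("x and weights must have the same length")
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return activation(z)


# Illustrative values only.
x = [1.0, 2.0, 3.0]
weights = [0.5, -0.25, 0.0]
bias = 0.0

# The weighted sum is 0.5 - 0.5 + 0.0 = 0, so the output is sigmoid(0).
y = perceptron(x, weights, bias)
print(y)
```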

3.5. Feature Extraction

Feature selection and extraction are crucial concepts in machine learning that have a large impact on model performance. CNN is well suited to extracting useful features for character recognition. To reduce the number of features and improve the classification algorithm’s effectiveness, functional features have been taken from the data. The convolution operation, the ReLU (rectified linear unit) activation function, and the pooling operation are used in feature extraction, as shown in the equations below:

O_{a,b} = \sum_{l} \sum_{k} f[l, k] \, X[a - l, b - k]

(7)

F\_Y_{a,b} = \max(0, F\_O_{a,b})

(8)

Here f[l, k] is the convolution filter, X is the input, and F\_O_{a,b} denotes the convolution output O_{a,b} of Equation (7). The dysgraphia patient’s input data are fed through a convolutional layer, pooling, ReLU, and dropout layer to extract meaningful features. The input is a matrix of rank 2 with M × N (rows/columns) entries indexed by (a, b), where 0 \leq a \leq M, 0 \leq b \leq N. The convolutional layer followed by the activation yields the final and relevant feature map values, F\_Y_{a,b}. The activation function is employed at each layer to allow the model to solve nonlinear problems, as illustrated in Equation (8), while dropout and max pooling are used to decrease the computational load.
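The convolution, ReLU, and max-pooling steps described above can be sketched on a tiny matrix. This is a minimal pure-Python illustration, assuming a "valid" output region (only positions where the filter fully overlaps the input) and a 2 × 2 pooling window; the input values, the filter, and the function names are made up for the example and are not the paper's configuration.

```python
def conv2d(X, f):
    """Valid-region 2D convolution: O[a][b] = sum_l sum_k f[l][k] * X[a-l][b-k]."""
    M, N = len(X), len(X[0])
    Kh, Kw = len(f), len(f[0])
    out = []
    for a in range(Kh - 1, M):
        row = []
        for b in range(Kw - 1, N):
            row.append(sum(f[l][k] * X[a - l][b - k]
                           for l in range(Kh) for k in range(Kw)))
        out.append(row)
    return out


def relu(feature_map):
    """Element-wise max(0, value), as in Equation (8)."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]


# Illustrative 3x3 input and 2x2 filter.
X = [[1.0, 0.0, 2.0],
     [0.0, 3.0, 0.0],
     [4.0, 0.0, 1.0]]
f = [[1.0, 0.0],
     [0.0, -1.0]]

conv_out = conv2d(X, f)      # each entry is X[a][b] - X[a-1][b-1]
activated = relu(conv_out)   # negative responses are clipped to zero
pooled = max_pool(activated)
print(conv_out, activated, pooled)
```

Deep learning frameworks usually implement cross-correlation (no filter flip) under the name convolution; the sketch follows the flipped-index form written in Equation (7).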

The extracted features that are used for handwritten character recognition include the following: stroke-based (number of strokes, the direction and curvature of the strokes, and the length and width of the strokes), shape-based (height, width, and aspect ratio of the characters), texture-based (presence of loops, dots, or other patterns), structural (number and placement of components within the characters), and grid-based (position of the characters in relation to other characters).

3.6. Classification

A pretrained LSTM is used for character categorization. The pretrained LSTM is followed by a batch normalization (BN) layer, fully connected (FC) layers, and a SoftMax function. In the proposed algorithm, the accuracy is much lower during the testing phase than during the training phase, which is related to overfitting and complicates the balancing and regularization of hyperparameters. BN is used to improve the stability of the learning process and decrease internal covariate shift. During the training phase, BN evaluates the mean and variance at each intermediate layer. The normalized input for each layer is calculated using the determined mean and variance, as indicated in Equations (4) and (5), where ‘x’ is replaced with x_{i}, giving

x_{i}^{'} = \frac{x_{i} - \mu}{\sqrt{\sigma^{2} + \epsilon}}

(9)

During model training, the scale parameter \omega and the shift parameter \delta applied to the normalized input are learned along with the other parameters. Equations (5) to (9) can be represented after BN as,

\sigma_{f}^{2} = \frac{N}{N (N - 1)} \sum_{i = 1}^{i^{'}} \sigma^{2}(i)

(10)

\mu_{f} = \frac{\sum_{i = 1}^{i^{'}} \mu(i)}{N}

(11)

y_{i f} = \frac{\omega}{\sqrt{\sigma_{f}^{2} + \epsilon}} x + \left(\delta - \frac{\omega \times \mu_{f}}{\sqrt{\sigma_{f}^{2} + \epsilon}}\right)

(12)

where i^{'} is the number of batches with multiple samples. The goal of model training is to update the filter weights so that the predicted and actual classes are as close as feasible. A forward pass through the network produces the predicted value. The loss function is computed to measure the improvement of the suggested model: the overall loss at the final layer is calculated by comparing the predicted value with the corresponding target value over repeated forward passes.
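The batch-normalization step can be sketched for a single mini-batch of scalars. The code normalizes with Equation (9) and then checks that the scale-and-shift form used at inference, obtained by substituting Equation (9) into y = ω x' + δ, gives the same outputs. The batch values, ω = 1.5, δ = 0.2, the epsilon, and the function names are assumptions for illustration; statistics aggregated over several batches are replaced here by the statistics of one batch for brevity.

```python
import math

EPS = 1e-5  # small constant for numerical stability (assumed value)


def batch_stats(batch):
    """Mean and (biased) variance of one mini-batch of scalars."""
    mu = sum(batch) / len(batch)
    var = sum((x - mu) ** 2 for x in batch) / len(batch)
    return mu, var


def normalize(batch, mu, var, eps=EPS):
    """Equation (9): x_i' = (x_i - mu) / sqrt(var + eps)."""
    return [(x - mu) / math.sqrt(var + eps) for x in batch]


def folded_affine(x, mu_f, var_f, omega, delta, eps=EPS):
    """Inference-time form: scale * x + offset, equal to omega * x' + delta."""
    scale = omega / math.sqrt(var_f + eps)
    offset = delta - omega * mu_f / math.sqrt(var_f + eps)
    return scale * x + offset


batch = [1.0, 2.0, 3.0, 4.0]
mu, var = batch_stats(batch)
normalized = normalize(batch, mu, var)

omega, delta = 1.5, 0.2  # hypothetical learned scale and shift
direct = [omega * xn + delta for xn in normalized]
folded = [folded_affine(x, mu, var, omega, delta) for x in batch]

print(mu, var)
print(direct)
print(folded)
```

Folding the normalization into one multiply and one add per activation is what makes BN cheap at inference time.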

3.7. Random Forest

Random forest is a powerful ensemble learning method that can be used in combination with deep learning models, such as CNN-LSTM, for handwritten character identification. While CNN-LSTM is successful in feature extraction and sequence learning, it may suffer from overfitting and bias due to the high number of parameters involved. The random forest can help alleviate these problems by adding an additional layer of learning that can capture the nonlinear connections and interactions between the extracted features. The random forest can learn to categorize the extracted features into various classes, and its output can then be merged with the CNN-LSTM output for more accurate estimations [46].
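One way to picture this stage is the sketch below, which trains scikit-learn's `RandomForestClassifier` on stand-in feature vectors and merges its class probabilities with a stand-in network output by simple averaging. Everything specific here is an assumption: the synthetic Gaussian features replace real CNN-LSTM activations, the "network probabilities" are fabricated from the same features purely to have something to merge, and averaging is only one possible merging rule, since the text does not state how the two outputs are combined.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Stand-in for CNN-LSTM activations: 200 samples x 16 features, two classes.
n_per_class, n_features = 100, 16
features = np.vstack([
    rng.normal(loc=-1.0, scale=1.0, size=(n_per_class, n_features)),
    rng.normal(loc=1.0, scale=1.0, size=(n_per_class, n_features)),
])
labels = np.array([0] * n_per_class + [1] * n_per_class)

# Hold out a quarter of the samples for evaluation.
order = rng.permutation(len(labels))
train_idx, test_idx = order[:150], order[150:]

# Stand-in for the network's softmax output on the held-out samples.
logits = features[test_idx].mean(axis=1)
p1 = 1.0 / (1.0 + np.exp(-logits))
network_proba = np.column_stack([1.0 - p1, p1])

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(features[train_idx], labels[train_idx])
forest_proba = forest.predict_proba(features[test_idx])

# Assumed merging rule: average the two sets of class probabilities.
combined_proba = (network_proba + forest_proba) / 2.0
combined_pred = combined_proba.argmax(axis=1)
accuracy = float((combined_pred == labels[test_idx]).mean())
print(accuracy)
```

In a real pipeline, `features` would be the activations exported from the trained network for each preprocessed image, and the held-out split would follow the cross-validation protocol rather than a single random permutation.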

4. Results and Discussion

The model is trained in a fully supervised manner to minimize the loss, and the loss from the SoftMax is backpropagated to the CNN layers. The weights and biases are initialized at each layer using random values.

4.1. Implementation Details

In this research, all models are run on an Intel Core i7-1165G7 CPU with 16 GB of DDR4 memory and integrated Intel Iris Xe graphics. With a batch size of 32, we employ adaptive moment estimation (Adam) as the optimizer for backpropagation. The learning rate is set at 0.0001 and is halved if the test accuracy does not improve after 20 epochs. Early stopping is used to end training automatically. All of the designs presented in this section are tested using 5-fold cross-validation. The entire data set is separated into five non-intersecting subsets. Each time, one of the five parts is used as the validation set and the rest as the training set. The final result is the mean accuracy over the five folds.
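The evaluation protocol described here can be sketched without any training framework. The code below builds five non-intersecting folds and implements a small scheduler that halves the learning rate after 20 epochs without improvement. The class and function names, the shuffling seed, and the use of 200 samples are illustrative assumptions, not the authors' code.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle range(n_samples), split it into k disjoint folds, and
    yield (train_indices, validation_indices) once per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


class HalveOnPlateau:
    """Halve the learning rate when accuracy has not improved for `patience` epochs."""

    def __init__(self, lr=1e-4, patience=20):
        self.lr = lr
        self.patience = patience
        self.best = float("-inf")
        self.stale_epochs = 0

    def step(self, accuracy):
        if accuracy > self.best:
            self.best = accuracy
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                self.lr *= 0.5
                self.stale_epochs = 0
        return self.lr


# Example: five folds over 200 samples, 160 for training and 40 for validation each time.
for fold_number, (train, val) in enumerate(k_fold_indices(200, k=5), start=1):
    print(fold_number, len(train), len(val))
```

The reported figure would then be the mean of the five per-fold accuracies.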

4.2. Selection of Hyperparameters

The role of hyperparameters, such as the number of convolution filters, batch size, learning rate, pooling and kernel size, optimizer type, and epochs, in model performance has been observed during classification. As the number of convolution filters increases, the model learns more complicated features, but the number of parameters also rises, which can cause overfitting. As a result, appropriate filter selection at each layer is critical. Sixty-four filters are used in the first layer, and to compensate for the downsampling induced by the pooling layer, the number of filters in the second convolutional layer is doubled.

Small filters are commonly used to learn low-level characteristics, whereas larger filters work better for high-level, more specific features. In the proposed model, layer 1 uses a large kernel size, while the other layers use smaller kernel sizes. The large filter is applied first because it summarizes generic information in one value and has a greater global influence on the entire image, although it misses local features. A small filter size is used in the following layers to learn local and individual characteristics. The kernel lengths of the convolutional layers are first fixed at 5 and 3. Experiments are then performed with different network widths (which include LSTM units) to observe how they affect the results. The number of filters (C) is doubled every time from 16 to 256, whereas the number of LSTM units (N3) is generally half of or equal to the preceding layer’s kernel width [47]. According to the results, model performance increases as the number of LSTM units and the number of filters increase. The Adam optimization algorithm is used to update the weights.
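To make the trade-off between width and parameter count concrete, the sketch below counts trainable parameters for a network with two convolutional layers followed by one LSTM layer. It assumes one-dimensional convolutions (suggested by the term "kernel lengths"), a single input channel, an LSTM fed directly by the second convolutional layer, and one bias vector per LSTM gate; none of these details is confirmed by the text, and the candidate widths are illustrative.

```python
def conv1d_params(kernel_len, in_channels, filters):
    """Weights plus one bias per filter for a 1D convolutional layer."""
    return kernel_len * in_channels * filters + filters


def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and one bias vector."""
    return 4 * (units * (input_dim + units) + units)


def network_params(width, kernels, in_channels=1):
    """Parameter count for two conv layers followed by one LSTM layer.

    width = (filters_1, filters_2, lstm_units); kernels = (length_1, length_2).
    """
    c1, c2, n3 = width
    k1, k2 = kernels
    return (conv1d_params(k1, in_channels, c1)
            + conv1d_params(k2, c1, c2)
            + lstm_params(c2, n3))


# Widths doubled from 16 to 256 filters, with kernel lengths fixed at 5 and 3.
for width in [(16, 32, 32), (32, 64, 64), (64, 128, 128), (128, 256, 256)]:
    print(width, network_params(width, kernels=(5, 3)))
```

Under these assumptions, each doubling of the width roughly quadruples the parameter count, which is why a wider network is more accurate but also more prone to overfitting.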

4.3. Result Analysis

In this section, results for various performance metrics are discussed. First, the kernel size is chosen from 3 to 9, and L1 is kept higher than L2. Table 1 shows that the accuracy fluctuates somewhat when the kernel size changes. Second, it is noticeable that the network’s accuracy does not increase when more layers are added to make the network deeper. In short, while the depth of the convolutional layers has an effect on accuracy, it is not decisive. In other words, increasing the width of the convolutional layers is more successful than making the network deeper for the suggested strategy. We found that increasing the number of filters together with the number of LSTM units (the width) is the most effective way to increase the model’s performance. However, it also increases the number of parameters. To balance these two aspects, we select a network with a width of (64-128-128) and a kernel size of (5-3) for further discussion. The maximum accuracy of the proposed model is 96.1% and 97.6% for the uniform and normal distributions, respectively, as mentioned in the last row of Table 1.

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

(13)

\text{Specificity} = \frac{TN}{TN + FP}

(14)

\text{Sensitivity} = \frac{TP}{TP + FN}

(15)
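Equations (13)–(15) can be computed directly from confusion-matrix counts. The sketch below is a minimal illustration with made-up labels, where 1 stands for the positive (dysgraphic) class; the function names and the example labels are assumptions, not data from the study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary classification task."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred != positive:
            tn += 1
        elif truth != positive and pred == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def specificity(tn, fp):
    return tn / (tn + fp)


def sensitivity(tp, fn):
    return tp / (tp + fn)


# Illustrative labels only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, tn, fp, fn), specificity(tn, fp), sensitivity(tp, fn))
```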

Various evaluation metrics, such as accuracy, specificity, and sensitivity, given in Equations (13)–(15), are also used to evaluate model performance, as reported in Table 2 and Table 3 for different tasks and features. The model accuracy for the sentence task is low compared with the other tasks, whereas the highest accuracy is obtained for task l for both the uniform and normal distributions. This is because the sentence contains a large number of letters that are difficult to write, whereas the ‘l’ task is easy for a dysgraphia patient to write. It is noteworthy that the accuracies for the uniform and normal distributions are almost the same, with the normal distribution having a slightly larger value than the uniform distribution. The proposed CNN-LSTM model is also compared with the other models reported in [48,49]. Results in Table 4 suggest that the proposed model ranks second with a width of (64-128-128) and kernel size of (5-3). It is worth mentioning that greater accuracy may be achieved if the width of the network is increased.

Figure 3 and Figure 4 show the comparison results of accuracy, specificity, and sensitivity for various tasks and features. It has been noticed that the task l has high accuracy due to the easiness of writing while the accuracy for the sentence is low due to the complexity of the text. The accuracy for grid-based features is high as it is based on the position of the characters within a grid or matrix, such as the position of the characters in relation to other characters or the position of the characters within a specific cell of the grid. The accuracy of the texture-based feature is low among all the features as these features are based on the texture of the characters, such as the presence of loops, dots, or other patterns within the characters. Figure 5 represents comparison results for various methods as mentioned in [48,49]. It is noteworthy that the proposed method is the second best among all the other methods.

There are several challenges to getting results for this work. One possible problem is obtaining a big and varied collection of handwritten characters from dysgraphia patients for training and evaluating the model. Another issue is choosing suitable hyperparameters for the model, as optimal values differ based on the dataset and problem being addressed. Furthermore, interpreting the model’s findings can be difficult, as deep learning models are frequently referred to as “black boxes” because of their complicated internal workings. This makes determining the precise variables influencing the model’s output and identifying areas for improvement challenging.

Figure 6 and Figure 7 show the comparison results of the accuracy for various tasks in terms of the uniform and normal distributions. The dashed line represents the uniform distribution, and the solid line represents the normal distribution. It is noteworthy that the two sets of results are close to each other, with the normal distribution slightly ahead of the uniform distribution.

5. Conclusions and Future Work

In this research, a CNN-LSTM model for handwritten character recognition that integrates a Random Forest classifier with uniform and normal weight initialization was presented. The features were extracted using the penultimate layer of the CNN-LSTM model, and a Random Forest classifier was trained on these features. The uniform and normal distributions were used to randomly initialize the weights and biases of the CNN and LSTM layers, which helped to avoid local minima during training and improved the overall performance of the model. The suggested method presented encouraging results in correctly reading handwritten characters, which is crucial for people with dysgraphia who might have trouble writing clearly. By combining CNN-LSTM and Random Forest, we were able to draw on the strengths of each model and produce more precise and reliable identification performance. Overall, the findings show the potential of combining deep learning and machine learning methods to recognize handwritten characters. These results may have significant applications in the fields of pattern recognition and computer vision.

The suggested CNN-LSTM model can be improved and optimized in future studies. One possible approach is to investigate the use of transfer learning methods, which have been demonstrated to be successful in decreasing training time and increasing accuracy in a variety of image recognition tasks. Another direction for future research is to look into using additional features in the feature extraction process, to enhance the model’s ability to identify dysgraphic handwriting. Furthermore, the proposed model can be tested on bigger datasets to verify its performance and generalizability.

Author Contributions

Conceptualization, F.M.; methodology, F.M. and W.U.K.; software, W.U.K.; validation, W.U.K., K.U. and A.K.; formal analysis, W.U.K.; investigation, F.M.; resources, W.U.K.; data curation, F.M.; writing—original draft preparation, F.M.; writing—review and editing, W.U.K., A.K. and F.H.A.; visualization, W.U.K., K.U. and A.K.; supervision, F.M., F.H.A., H.A. and H.A.; project administration, F.M., K.U. and H.A.; funding acquisition, H.A. All authors have read and agreed to the published version of the manuscript.

Funding

Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R54), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data collection used for the statistical analysis in the paper is accessible at the given repository, accessed on 22 March 2023 (https://github.com/researchgroupatcs/research). Due to ethical limits, the information collected cannot be made public. Furthermore, the parents of all children did not explicitly agree to the public disclosure of the child’s crucial biographical information and writing exams. The university’s Ethics Committee responsible for this research did not permit the public sharing of participant data.

Acknowledgments

The researchers would like to thank Princess Nourah bint Abdulrahman University for funding the publication of this project.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CNN	Convolutional Neural Network
LSTM	Long Short-Term Memory
DNN	Deep Neural Network


Figure 1. Handwriting samples of a normal and a dysgraphia patient. (a) Normal handwriting. (b) Dysgraphia patient’s handwriting.

Figure 2. Architecture of the proposed model.

Figure 3. Comparison results of accuracy, specificity and sensitivity for various tasks.

Figure 4. Comparison results of accuracy, specificity and sensitivity for various features.

Figure 5. Comparison results of accuracy, specificity and sensitivity for various methods.

Figure 6. Comparison results of accuracy for various tasks in terms of uniform and normal distributions.

Figure 7. Comparison results of accuracy for various features in terms of uniform and normal distributions.

Table 1. Comparative analysis of the proposed scheme with different kernel sizes and width for uniform and normal distributions.

Kernel Size ($L_1$-$L_2$)	Width ($C$-$2C$-$N_3$)	Depth	Accuracy$_{uni}$	Accuracy$_{norm}$	Accuracy$_{mean}$
5-3	16-32-16	3	90.3%	90.7%	90.5%
5-3	16-32-32	3	90.6%	91.4%	90.9%
5-3	32-64-32	3	90.9%	91.9%	91.4%
5-3	32-64-64	3	91.4%	92.3%	91.8%
5-3	64-128-64	3	91.7%	92.6%	92.1%
5-3	64-128-128	3	92.3%	92.9%	92.6%
5-3	128-256-128	3	93.3%	94.6%	93.9%
5-3	128-256-256	3	95.1%	96.1%	95.6%
5-3	256-512-256	3	96.1%	97.6%	96.8%

Table 2. Accuracy, specificity, and sensitivity for various tasks.

Task	Accuracy$_{uni}$ (%)	Accuracy$_{norm}$ (%)	Accuracy$_{mean}$ (%)	Specificity (%)	Sensitivity (%)
l	96.0	97.0	96.5	96.9	96.2
l$_{max}$	94.8	95.2	95.0	94.8	95.1
le	91.9	92.7	92.3	92.8	91.8
le$_{max}$	90.7	91.3	91.0	90.2	91.8
Leto	93.0	93.4	93.2	91.8	94.7
Lamoken	93.6	94.8	94.2	94.5	93.8
Hrackarstvo	89.7	90.6	90.1	88.6	91.6
Sentence	88.8	89.6	89.2	88.6	89.8
All	92.3	93.0	92.6	92.2	93.1
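
The relation between the three accuracy columns of Table 2 can be checked with a short script. This is only an illustrative sketch: it assumes that Accuracy$_{mean}$ is the average of the uniform and normal accuracies truncated to one decimal place, a rule that is inferred from the tabulated values rather than stated by the authors. The names `TABLE_2`, `mean_tenths`, and `check_table` are hypothetical helpers introduced here.

```python
# Illustrative check only. ASSUMPTION: Accuracy_mean is the average of the
# uniform and normal accuracies, truncated (not rounded) to one decimal place.
# Values are copied from Table 2 and stored as integer tenths of a percent
# (96.0% -> 960) so that no floating-point rounding is involved.
TABLE_2 = {
    # task: (accuracy_uni, accuracy_norm, reported accuracy_mean)
    "l": (960, 970, 965),
    "l_max": (948, 952, 950),
    "le": (919, 927, 923),
    "le_max": (907, 913, 910),
    "Leto": (930, 934, 932),
    "Lamoken": (936, 948, 942),
    "Hrackarstvo": (897, 906, 901),
    "Sentence": (888, 896, 892),
    "All": (923, 930, 926),
}


def mean_tenths(acc_uni, acc_norm):
    """Average two accuracies given in tenths of a percent, truncating."""
    return (acc_uni + acc_norm) // 2


def check_table(rows):
    """Return the tasks whose reported mean differs from the computed mean."""
    return [
        task
        for task, (acc_uni, acc_norm, reported) in rows.items()
        if mean_tenths(acc_uni, acc_norm) != reported
    ]


if __name__ == "__main__":
    print("Rows not matching the assumed rule:", check_table(TABLE_2))
```

Under this assumed rule every row of Table 2 is reproduced; the same check could be applied to the other tables, where individual rows may deviate from it.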

Table 3. Accuracy, specificity, and sensitivity for various features.

Feature	Accuracy$_{uni}$ (%)	Accuracy$_{norm}$ (%)	Accuracy$_{mean}$ (%)	Specificity (%)	Sensitivity (%)
Stroke	94.3	95.7	95.0	94.9	95.1
Shape	93.8	94.4	94.1	92.1	95.7
Texture	91.6	93.8	92.7	91.4	93.8
Structural	92.8	93.4	93.1	92.4	93.8
Grid	94.9	96.9	95.9	96.5	95.3

Table 4. Comparative results for various models.

Method	Accuracy (%)	Specificity (%)	Sensitivity (%)
Adaboost	82.5	78.7	84.7
SVM	80.8	86.4	78.5
RF	78.6	85.3	74.4
LSTM	90.1	91.4	91.3
RPN	88.2	86.5	86.3
Fast RCNN	91.8	91.2	91.1
Faster RCNN	92.1	91.7	91.5
R2CNN	94.2	92.1	91.5
NDR-R2CNN	98.2	96.4	100
CNN-LSTM	92.6	92.2	93.1

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
