# Mining Educational Data to Predict Students’ Performance through Procrastination Behavior


## Abstract


## 1. Introduction

- How accurately can our proposed algorithm predict students’ performance through their procrastination behaviors?
- Which classification method offers superior predictive power when the feature vectors are labeled with different numbers of classes? What is the effect of using continuous versus categorical feature vectors on different classification methods?

## 2. Previous Research

#### 2.1. Academic Procrastination

#### 2.2. Educational Data Mining

#### 2.2.1. Clustering Methods in the Context of Education

#### 2.2.2. Classification Methods in the Context of Education

#### 2.3. Procrastination Prediction Using EDM Methods

## 3. Method

#### 3.1. Problem Description

#### 3.2. PPP: Prediction of Students’ Performance through Procrastination Behavior

#### 3.2.1. Building the Feature Vector of Assignment Submission Behavior


**Algorithm 1.** Development of feature vectors X and Y

Input: $\mathit{OpenD}^{a}, \mathit{FirstviewD}^{s}, \mathit{SubmissionD}^{s}, \mathit{Deadline}^{a}$, S, A

Output: Feature vectors X and Y

1: Initialize j = |S|, i = |A|

2: while n < j do

3: while m < i do

4: $x_{nm}[v_{1}] = \frac{\mathit{Deadline}^{m} - \mathit{SubmissionD}_{m}^{n}}{\mathit{Deadline}^{m} - \mathit{OpenD}^{m}}$

5: $x_{nm}[v_{2}] = \frac{\mathit{FirstviewD}_{m}^{n} - \mathit{OpenD}^{m}}{\mathit{Deadline}^{m} - \mathit{OpenD}^{m}}$

6: if $x_{nm}[v_{1}] \le 0$ then

7: $y_{nm}[w_{1}] = 0$

8: else

9: $y_{nm}[w_{1}] = 1$

10: if $x_{nm}[v_{2}] \ge \mathrm{median}\left(x_{nm}[v_{2}]\right)$ then

11: $y_{nm}[w_{2}] = 0$

12: else

13: $y_{nm}[w_{2}] = 1$

14: end if

15: end while

16: end while

17: return feature vectors X and Y
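The feature construction in Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `spare_and_inactive` and `categorize` and all timestamps are hypothetical, and the median is taken over the pairs passed in because the algorithm does not say which population it is computed on.

```python
from datetime import datetime
from statistics import median


def spare_and_inactive(open_d, first_view, submission, deadline):
    """Return (v1, v2) for one student-assignment pair.

    v1 (spare time): fraction of the assignment window left at submission.
    v2 (inactive time): fraction of the window elapsed before the first view.
    """
    window = (deadline - open_d).total_seconds()
    v1 = (deadline - submission).total_seconds() / window
    v2 = (first_view - open_d).total_seconds() / window
    return v1, v2


def categorize(pairs):
    """Map continuous (v1, v2) pairs to categorical (w1, w2) pairs.

    w1 = 0 when v1 <= 0 (submitted at or after the deadline), else 1.
    w2 = 0 when v2 >= median, else 1. Assumption: the median is computed
    over the pairs passed in.
    """
    med = median(v2 for _, v2 in pairs)
    return [(0 if v1 <= 0 else 1, 0 if v2 >= med else 1) for v1, v2 in pairs]


# Hypothetical timestamps for one assignment that is open for ten days.
open_d = datetime(2019, 3, 1)
deadline = datetime(2019, 3, 11)
events = [
    (datetime(2019, 3, 2), datetime(2019, 3, 6)),   # early view, early submission
    (datetime(2019, 3, 9), datetime(2019, 3, 12)),  # late view, late submission
    (datetime(2019, 3, 5), datetime(2019, 3, 10)),  # mid view, on-time submission
]
X = [spare_and_inactive(open_d, fv, sub, deadline) for fv, sub in events]
Y = categorize(X)
```

A negative spare time, as for the second student above, marks a submission after the deadline and is what sets $w_1 = 0$.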

#### 3.2.2. Finding the Optimal Number of Classes Using Clustering

**Algorithm 2.** Discovering the optimal number of clusters using the Elbow method

Input: feature vectors (output of Algorithm 1) without class labels, the maximum number of clusters k

Output: (validated) optimal number of clusters

1: while $i < k$ do

2: Construct a similarity graph and let W be its weighted adjacency matrix

3: Compute the unnormalized Laplacian L

4: Compute the first k eigenvectors $u_{1}, \dots, u_{k}$ of the generalized eigenproblem $Lu = \lambda Du$

5: Let $U \in \mathbb{R}^{n \times k}$ be the matrix containing the vectors $u_{1}, \dots, u_{k}$ as columns

6: For $i = 1, \dots, n$, let $y_{i} \in \mathbb{R}^{k}$ be the vector corresponding to the i-th row of U

7: Cluster the points $(y_{i})_{i=1,\dots,n}$ in $\mathbb{R}^{k}$ with the k-means algorithm into clusters $C_{1}, \dots, C_{k}$

8: Calculate the distortion score

9: end while

10: Plot the distortion score against the number of clusters k

11: Take the location of the bend (knee) in the plot as the optimal number of clusters

12: Validate the optimal number of clusters through further (statistical) analysis of different numbers of clusters

13: return the optimal number of clusters
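The elbow step of Algorithm 2 can be illustrated with a small pure-Python sketch. Two simplifications are assumed here: plain k-means is run directly on the raw points instead of on the spectral embedding (steps 2 to 6 are skipped), and the bend is located as the largest second difference of the distortion curve rather than read from a plot. The data points and the helper names `kmeans` and `elbow` are hypothetical.

```python
def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest current center.
        centers.append(max(points, key=lambda p: min(d2(p, c) for c in centers)))

    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: d2(p, centers[i]))
            clusters[idx].append(p)
        # Recompute each center as the mean of its cluster (keep it if empty).
        new_centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers

    # Distortion: sum of squared distances to the nearest center.
    distortion = sum(min(d2(p, c) for c in centers) for p in points)
    return centers, distortion


def elbow(distortions):
    """Return the k (1-based) with the largest second difference in distortion."""
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(distortions) - 1):
        bend = distortions[i - 1] - 2 * distortions[i] + distortions[i + 1]
        if bend > best_bend:
            best_k, best_bend = i + 1, bend
    return best_k


# Hypothetical two-dimensional feature points forming three loose groups.
points = [
    (0.90, 0.10), (0.85, 0.15), (0.95, 0.05),
    (0.10, 0.10), (0.05, 0.15), (0.15, 0.05),
    (0.50, 0.80), (0.45, 0.85), (0.55, 0.75),
]
distortions = [kmeans(points, k)[1] for k in range(1, 7)]
best_k = elbow(distortions)
```

The second-difference rule is only a heuristic; step 12 of the algorithm, which validates the choice statistically, remains necessary when the curve has no sharp bend.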

#### 3.2.3. Classification of Students Using Class Labels

**Algorithm 3.** PPP: Prediction of students’ performance through procrastination behavior

Input: $\mathit{OpenD}^{a}, \mathit{FirstviewD}^{s}, \mathit{SubmissionD}^{s}, \mathit{Deadline}^{a}$, S, A

Output: prediction of procrastination behavior (whether a student is a procrastinator, a procrastinator candidate, or a non-procrastinator)

1: Implement Algorithm 1 to build feature vectors X and Y

2: Implement Algorithm 2 to produce the optimal number of clusters from the feature vectors

3: Apply the classification algorithms using the class labels

4: L-SVM, R-SVM, GP, DT, RF, NN, ADB, and NB

5: Compare the performance of the classification algorithms on test data

6: $P_{c} = P_{1}, P_{2}, P_{3}, \dots, P_{n}$

7: while $i \le n$ do

8: if $P_{c_{i}} > P_{c_{i+1}}$ then

9: $C \leftarrow c_{i}$

10: else

11: $C \leftarrow c_{i+1}$

12: end if

13: end while

14: Choose the best-performing classification algorithm $C$

15: Employ the classifier $C$ to predict procrastination

16: return prediction of procrastinator, procrastinator candidate, or non-procrastinator
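The model-selection loop in Algorithm 3 (steps 6 to 14) amounts to keeping the classifier with the highest performance score. The sketch below is one reading of that loop as a running maximum, which is an assumption about the intent of the pairwise comparison; `select_best` is a hypothetical helper name. The scores are the average three-class accuracies reported in the results tables of this article.

```python
def select_best(scores):
    """Return the classifier name with the highest score.

    Keeps the current best and replaces it whenever a later candidate
    scores strictly higher, so ties go to the earlier classifier.
    """
    names = list(scores)
    best = names[0]
    for name in names[1:]:
        if scores[name] > scores[best]:
            best = name
    return best


# Average three-class accuracy per classifier, from the results tables.
continuous = {"L-SVM": 0.952, "R-SVM": 0.929, "GP": 0.937, "DT": 0.927,
              "RF": 0.926, "NN": 0.933, "ADB": 0.920, "NB": 0.885}
categorical = {"L-SVM": 0.930, "R-SVM": 0.954, "GP": 0.961, "DT": 0.963,
               "RF": 0.963, "NN": 0.965, "ADB": 0.963, "NB": 0.962}

best_continuous = select_best(continuous)
best_categorical = select_best(categorical)
```

On these figures the selection differs by feature type: L-SVM for continuous features and NN for categorical features.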

## 4. Experimental Results

#### 4.1. Dataset

#### 4.2. Label

#### 4.3. Results

#### 4.3.1. Phase 1: Clustering Development and Analysis

#### 4.3.2. Phase 2: Classification

^{−9} for NB. We used accuracy, F1-score, precision, and recall as performance metrics to evaluate the classification techniques. Table 5 lists the average of each performance metric across the different k-fold settings for all classification methods.
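The four metrics can be computed from true and predicted labels as sketched below. The averaging scheme over classes is not stated in the text; this sketch assumes support-weighted averaging, under which recall equals accuracy by construction, as is the case for every classifier in the results tables. The labels and the function name `weighted_metrics` are hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    support = Counter(y_true)
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        prec = tp / predicted if predicted else 0.0
        rec = tp / count
        f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        weight = count / n  # share of samples belonging to this class
        precision += weight * prec
        recall += weight * rec
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 0 = procrastinator, 1 = candidate, 2 = non-procrastinator.
y_true = [0, 0, 1, 1, 1, 2, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 2, 2, 2]
m = weighted_metrics(y_true, y_pred)
```

In a k-fold evaluation these values would be computed on each held-out fold and then averaged.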

## 5. Discussion

## 6. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A

**Table A1.** Performance metrics of classification methods in three-class at different k-folds (i.e., 5, 10, 15, and 20). The suffix of each row label gives the number of folds.

Three-Class | L-SVM | R-SVM | GP | DT | RF | NN | ADB | NB |
---|---|---|---|---|---|---|---|---|
**Continuous Features** | | | | | | | | |
Precision_5 | 0.958 | 0.933 | 0.940 | 0.933 | 0.933 | 0.939 | 0.927 | 0.898 |
Precision_10 | 0.963 | 0.937 | 0.944 | 0.933 | 0.934 | 0.938 | 0.930 | 0.899 |
Precision_15 | 0.952 | 0.934 | 0.945 | 0.941 | 0.940 | 0.944 | 0.937 | 0.893 |
Precision_20 | 0.954 | 0.934 | 0.943 | 0.933 | 0.934 | 0.940 | 0.930 | 0.879 |
**Categorical Features** | | | | | | | | |
Precision_5 | 0.867 | 0.923 | 0.941 | 0.948 | 0.952 | 0.955 | 0.957 | 0.958 |
Precision_10 | 0.866 | 0.913 | 0.934 | 0.941 | 0.944 | 0.945 | 0.949 | 0.949 |
Precision_15 | 0.868 | 0.923 | 0.937 | 0.944 | 0.951 | 0.953 | 0.955 | 0.957 |
Precision_20 | 0.868 | 0.921 | 0.941 | 0.951 | 0.953 | 0.955 | 0.957 | 0.959 |

Three-Class | L-SVM | R-SVM | GP | DT | RF | NN | ADB | NB |
---|---|---|---|---|---|---|---|---|
**Continuous Features** | | | | | | | | |
Accuracy_5 | 0.950 | 0.924 | 0.932 | 0.926 | 0.926 | 0.934 | 0.918 | 0.884 |
Accuracy_10 | 0.959 | 0.930 | 0.938 | 0.927 | 0.926 | 0.931 | 0.920 | 0.885 |
Accuracy_15 | 0.946 | 0.932 | 0.941 | 0.932 | 0.931 | 0.937 | 0.927 | 0.890 |
Accuracy_20 | 0.951 | 0.930 | 0.935 | 0.923 | 0.922 | 0.928 | 0.916 | 0.881 |
**Categorical Features** | | | | | | | | |
Accuracy_5 | 0.929 | 0.954 | 0.961 | 0.961 | 0.961 | 0.963 | 0.962 | 0.962 |
Accuracy_10 | 0.930 | 0.951 | 0.959 | 0.962 | 0.961 | 0.962 | 0.960 | 0.960 |
Accuracy_15 | 0.930 | 0.955 | 0.960 | 0.962 | 0.963 | 0.965 | 0.963 | 0.962 |
Accuracy_20 | 0.929 | 0.954 | 0.963 | 0.967 | 0.966 | 0.968 | 0.965 | 0.963 |

Three-Class | L-SVM | R-SVM | GP | DT | RF | NN | ADB | NB |
---|---|---|---|---|---|---|---|---|
**Continuous Features** | | | | | | | | |
F1_5 | 0.949 | 0.922 | 0.931 | 0.925 | 0.926 | 0.933 | 0.917 | 0.874 |
F1_10 | 0.958 | 0.929 | 0.937 | 0.928 | 0.927 | 0.932 | 0.921 | 0.881 |
F1_15 | 0.948 | 0.935 | 0.944 | 0.936 | 0.935 | 0.940 | 0.929 | 0.895 |
F1_20 | 0.953 | 0.934 | 0.938 | 0.927 | 0.926 | 0.933 | 0.921 | 0.894 |
**Categorical Features** | | | | | | | | |
F1_5 | 0.963 | 0.969 | 0.973 | 0.971 | 0.972 | 0.973 | 0.970 | 0.970 |
F1_10 | 0.963 | 0.969 | 0.974 | 0.974 | 0.972 | 0.974 | 0.972 | 0.972 |
F1_15 | 0.963 | 0.971 | 0.972 | 0.973 | 0.974 | 0.975 | 0.973 | 0.973 |
F1_20 | 0.962 | 0.972 | 0.979 | 0.979 | 0.978 | 0.979 | 0.978 | 0.977 |
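The per-fold values in Table A1 relate to the averaged results by a simple column mean. The short sketch below averages the three-class precision with continuous features over the four k-fold settings; the resulting means agree, to within rounding, with the three-class precision row for continuous features in the averaged results table. The variable names are illustrative only.

```python
from statistics import mean

# Three-class precision with continuous features, copied from Table A1.
classifiers = ["L-SVM", "R-SVM", "GP", "DT", "RF", "NN", "ADB", "NB"]
precision_by_fold = {
    5:  [0.958, 0.933, 0.940, 0.933, 0.933, 0.939, 0.927, 0.898],
    10: [0.963, 0.937, 0.944, 0.933, 0.934, 0.938, 0.930, 0.899],
    15: [0.952, 0.934, 0.945, 0.941, 0.940, 0.944, 0.937, 0.893],
    20: [0.954, 0.934, 0.943, 0.933, 0.934, 0.940, 0.930, 0.879],
}

# Average each classifier's column across the four k-fold settings.
average_precision = {
    name: mean(column)
    for name, column in zip(classifiers, zip(*precision_by_fold.values()))
}
```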

## References

1. Rovai, A.P.; Jordan, H. Blended learning and sense of community: A comparative analysis with traditional and fully online graduate courses. Int. Rev. Res. Open Distrib. Learn. **2004**, 5.
2. Phillips, R. Tools Used in Learning Management Systems: Analysis of WebCT Usage Logs. Available online: https://pdfs.semanticscholar.org/b416/28c1adc770c11b559d5916b3548b7c579c18.pdf (accessed on 20 December 2019).
3. Romero, C.; Ventura, S.; García, E. Data mining in course management systems: Moodle case study and tutorial. Comput. Educ. **2008**, 51, 368–384.
4. Kotsiantis, S.; Tselios, N.; Filippidi, A.; Komis, V. Using learning analytics to identify successful learners in a blended learning course. Int. J. Technol. Enhanc. Learn. **2013**, 5, 133–150.
5. Azevedo, R.; Cromley, J.G.; Seibert, D. Does adaptive scaffolding facilitate students’ ability to regulate their learning with hypermedia? Contemp. Educ. Psychol. **2004**, 29, 344–370.
6. Hooshyar, D.; Kori, K.; Pedaste, M.; Bardone, E. The potential of open learner models to promote active thinking by enhancing self-regulated learning in online higher education learning environments. Br. J. Educ. Technol. **2019**.
7. Richardson, M.; Abraham, C.; Bond, R. Psychological correlates of university students’ academic performance: A systematic review and meta-analysis. Psychol. Bull. **2012**, 138, 353.
8. Michinov, N.; Brunot, S.; Le Bohec, O.; Juhel, J.; Delaval, M. Procrastination, participation, and performance in online learning environments. Comput. Educ. **2011**, 56, 243–252.
9. Tuckman, B.W. Relations of academic procrastination, rationalizations, and performance in a web course with deadlines. Psychol. Rep. **2005**, 96, 1015–1021.
10. Cerezo, R.; Esteban, M.; Sánchez-Santillán, M.; Núñez, J.C. Procrastinating behavior in computer-based learning environments to predict performance: A case study in Moodle. Front. Psychol. **2017**, 8, 1403.
11. Cerezo, R.; Sánchez-Santillán, M.; Paule-Ruiz, M.P.; Núñez, J.C. Students’ LMS interaction patterns and their relationship with achievement: A case study in higher education. Comput. Educ. **2016**, 96, 42–54.
12. Visser, L.; Korthagen, F.; Schoonenboom, J. Influences on and consequences of academic procrastination of first-year student teachers. Pedagog. Stud. **2015**, 92, 394–412.
13. Kostopoulos, G.; Karlos, S.; Kotsiantis, S. Multi-view Learning for Early Prognosis of Academic Performance: A Case Study. IEEE Trans. Learn. Technol. **2019**, 12, 212–224.
14. Kotsiantis, S. Educational data mining: A case study for predicting dropout-prone students. Int. J. Knowl. Eng. Soft Data Paradig. **2009**, 1, 101–111.
15. Hellas, A.; Ihantola, P.; Petersen, A.; Ajanovski, V.V.; Gutica, M.; Hynninen, T.; Knutas, A.; Leinonen, J.; Messom, C.; Liao, S.N. Predicting Academic Performance: A Systematic Literature Review; ACM: New York, NY, USA, 2018; pp. 175–199.
16. Marbouti, F.; Diefes-Dux, H.A.; Madhavan, K. Models for early prediction of at-risk students in a course using standards-based grading. Comput. Educ. **2016**, 103, 1–15.
17. Drăgulescu, B.; Bucos, M.; Vasiu, R. Predicting assignment submissions in a multi-class classification problem. TEM J. **2015**, 4, 244.
18. Schraw, G.; Wadkins, T.; Olafson, L. Doing the things we do: A grounded theory of academic procrastination. J. Educ. Psychol. **2007**, 99, 12.
19. Ryan, R.M.; Deci, E.L. Intrinsic and extrinsic motivations: Classic definitions and new directions. Contemp. Educ. Psychol. **2000**, 25, 54–67.
20. Ferrari, J.R. AARP Still Procrastinating?: The No-Regrets Guide to Getting it Done; John Wiley & Sons: New York, NY, USA, 2011.
21. Sigall, H.; Kruglanski, A.; Fyock, J. Wishful thinking and procrastination. J. Soc. Behav. Personal. **2000**, 15, 283–296.
22. Chun Chu, A.H.; Choi, J.N. Rethinking procrastination: Positive effects of “active” procrastination behavior on attitudes and performance. J. Soc. Psychol. **2005**, 145, 245–264.
23. Steel, P. The nature of procrastination: A meta-analytic and theoretical review of quintessential self-regulatory failure. Psychol. Bull. **2007**, 133, 65.
24. Ackerman, D.S.; Gross, B.L. My instructor made me do it: Task characteristics of procrastination. J. Mark. Educ. **2005**, 27, 5–13.
25. Van Eerde, W. A meta-analytically derived nomological network of procrastination. Personal. Individ. Differ. **2003**, 35, 1401–1418.
26. Díaz-Morales, J.F.; Ferrari, J.R.; Cohen, J.R. Indecision and avoidant procrastination: The role of morningness-eveningness and time perspective in chronic delay lifestyles. J. Gen. Psychol. **2008**, 135, 228–240.
27. Visser, R.M.; Kunze, A.E.; Westhoff, B.; Scholte, H.S.; Kindt, M. Representational similarity analysis offers a preview of the noradrenergic modulation of long-term fear memory at the time of encoding. Psychoneuroendocrinology **2015**, 55, 8–20.
28. Hen, M.; Goroshit, M. Academic procrastination, emotional intelligence, academic self-efficacy, and GPA: A comparison between students with and without learning disabilities. J. Learn. Disabil. **2014**, 47, 116–124.
29. Balkıs, M. Academic efficacy as a mediator and moderator variable in the relationship between academic procrastination and academic achievement. Eurasian J. Educ. Res. **2011**, 45, 1–16.
30. You, J.W. The relationship among academic procrastination, self-regulated learning, fear, academic self-efficacy, and perceived academic control in e-learning. J. Educ. Inf. Media **2012**, 18, 249–271.
31. Akinsola, M.K.; Tella, A.; Tella, A. Correlates of academic procrastination and mathematics achievement of university undergraduate students. Eurasia J. Math. Sci. Technol. Educ. **2007**, 3, 363–370.
32. Klingsieck, K.B.; Fries, S.; Horz, C.; Hofer, M. Procrastination in a distance university setting. Distance Educ. **2012**, 33, 295–310.
33. Melton, A.W. The situation with respect to the spacing of repetitions and memory. J. Verbal Learn. Verbal Behav. **1970**, 9, 596–606.
34. Elvers, G.C.; Polzella, D.J.; Graetz, K. Procrastination in online courses: Performance and attitudinal differences. Teach. Psychol. **2003**, 30, 159–162.
35. Wighting, M.J.; Liu, J.; Rovai, A.P. Distinguishing sense of community and motivation characteristics between online and traditional college students. Q. Rev. Distance Educ. **2008**, 9, 285–295.
36. Dutt, A.; Ismail, M.A.; Herawan, T. A systematic review on educational data mining. IEEE Access **2017**, 5, 15991–16005.
37. Abu Tair, M.M.; El-Halees, A.M. Mining educational data to improve students’ performance: A case study. Min. Educ. Data Improv. Stud. Perform. A Case Study **2012**, 2, 2.
38. Li, C.; Yoo, J. Modeling student online learning using clustering. In Proceedings of the 44th Annual Southeast Regional Conference, Melbourne, Florida, 10–12 March 2006; pp. 186–191.
39. Pedaste, M.; Sarapuu, T. Developing an effective support system for inquiry learning in a Web-based environment. J. Comput. Assist. Learn. **2006**, 22, 47–62.
40. Von Luxburg, U. A tutorial on spectral clustering. Stat. Comput. **2007**, 17, 395–416.
41. Gao, L.; Wan, B.; Fang, C.; Li, Y.; Chen, C. Automatic Clustering of Different Solutions to Programming Assignments in Computing Education. In Proceedings of the ACM Conference on Global Computing Education, Chengdu, China, 17–19 May 2019; pp. 164–170.
42. Kotsiantis, S.B.; Zaharakis, I.; Pintelas, P. Supervised machine learning: A review of classification techniques. Emerg. Artif. Intell. Appl. Comput. Eng. **2007**, 160, 3–24.
43. Gkontzis, A.; Kotsiantis, S.; Panagiotakopoulos, C.; Verykios, V. A predictive analytics framework as a countermeasure for attrition of students. Interact. Learn. Environ. **2019**, 25, 1–5.
44. Tomasevic, N.; Gvozdenovic, N.; Vranes, S. An overview and comparison of supervised data mining techniques for student exam performance prediction. Comput. Educ. **2020**, 143, 103676.
45. Kotsiantis, S.; Patriarcheas, K.; Xenos, M. A combinational incremental ensemble of classifiers as a technique for predicting students’ performance in distance education. Knowl.-Based Syst. **2010**, 23, 529–535.
46. Ahmad, F.; Ismail, N.H.; Aziz, A.A. The prediction of students’ academic performance using classification data mining techniques. Appl. Math. Sci. **2015**, 9, 6415–6426.
47. Kotsiantis, S.; Pierrakeas, C.; Pintelas, P. Predicting Student’s Performance in Distance Learning using Machine Learning Techniques. Appl. Artif. Intell. **2004**, 18, 411–426.
48. Huang, S.; Fang, N. Predicting student academic performance in an engineering dynamics course: A comparison of four types of predictive mathematical models. Comput. Educ. **2013**, 61, 133–145.
49. Romero, C.; Ventura, S. Educational data mining: A review of the state of the art. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) **2010**, 40, 601–618.
50. Akram, A.; Fu, C.; Li, Y.; Javed, M.Y.; Lin, R.; Jiang, Y.; Tang, Y. Predicting Students’ Academic Procrastination in Blended Learning Course Using Homework Submission Data. IEEE Access **2019**, 7, 102487–102498.
51. Olivé, D.M.; Huynh, D.; Reynolds, M.; Dougiamas, M.; Wiese, D. A Quest for a one-size-fits-all Neural Network: Early Prediction of Students At Risk in Online Courses. IEEE Trans. Learn. Technol. **2019**, 12, 171–183.
52. Tuckman, B.W. Academic Procrastinators: Their Rationalizations and Web-Course Performance. Available online: https://eric.ed.gov/?id=ED470567 (accessed on 20 December 2019).
53. Michinov, N.; Primois, C. Improving productivity and creativity in online groups through social comparison process: New evidence for asynchronous electronic brainstorming. Comput. Hum. Behav. **2005**, 21, 11–28.

**Figure 4.** Performance metrics of classification methods at different k-fold for two-class: (**a**) precision, (**b**) accuracy, and (**c**) F1-score.

**Figure 5.** Performance metrics of classification methods at different k-fold for three-class: (**a**) precision, (**b**) accuracy, and (**c**) F1-score.

**Figure 6.** Performance metrics of classification methods at different k-fold for four-class: (**a**) precision, (**b**) accuracy, and (**c**) F1-score.

Study | Objective | Behavioral pattern before submission: inactive time | Behavioral pattern before submission: spare time | Attributes | Classification techniques |
---|---|---|---|---|---|
[17] | Prediction of assignment submission | no | no | students’ activity data; course and assignment information | DT (CART), Random Forest, NN, GaussianNB, Logit, LDA, SVC |
[50] | Prediction of students’ procrastination | no | yes | grade | ZeroR, OneR, ID3, J48, Random Forest, Decision Stump, JRip, PART, NBTree, Prism |
[51] | Prediction of students at risk through assignment submission | no | no | students’ activity data; course and assignment information; peers’ activity data | Neural Network |
Our work | Prediction of procrastination | yes | yes | students’ activity and assignment data; grade | L-SVM, R-SVM, Gaussian Processes, Decision Tree, Random Forest, Neural Network, AdaBoost, Naive Bayes |

Notation | Explanation |
---|---|
$S, A$ | The set of students and the set of assignments |
$s, a$ | A specific student and a specific assignment |
$v_{1}, v_{2}, w_{1}, w_{2}$ | Spare time and inactive time, as continuous values ($v_{1}, v_{2}$) and as categorical values ($w_{1}, w_{2}$) |
$\mathit{OpenD}^{a}$ | The open date of the assignment |
$\mathit{Deadline}^{a}$ | The due date of the assignment |
$\mathit{FirstviewD}^{s}$ | The date on which the student first viewed the assignment |
$\mathit{SubmissionD}^{s}$ | The date on which the student submitted the assignment |
$x_{i}, y_{i}$ | A pair of continuous and categorical features for an assignment i, $x_{i} = (v_{1}, v_{2})$, $y_{i} = (w_{1}, w_{2})$ |
$X_{j}, Y_{j}$ | Continuous and categorical feature vectors for a student j, $X_{j} = (x_{1j}, x_{2j}, \dots, x_{ij})$, $Y_{j} = (y_{1j}, y_{2j}, \dots, y_{ij})$ |
W | Weighted adjacency matrix |
L | Unnormalized Laplacian |
u | Eigenvector |
U | The matrix containing the eigenvectors |
P | The set of performance metrics |
C | The best classification method |

Dataset | Course | Period | Type | # of Assignments | # of Students |
---|---|---|---|---|---|
Dataset 1 (16 continuous features) | Teaching and reflection | 2019 | blended | 8 | 242 |
Dataset 2 (16 categorical features) | Teaching and reflection | 2019 | blended | 8 | 242 |

| | Spare time ($v_{1}$) | Inactive time ($v_{2}$) | Score |
|---|---|---|---|
| spare time ($v_{1}$) | 1 | −0.495 | 0.901 |
| inactive time ($v_{2}$) | −0.495 | 1 | −0.508 |
| score | 0.901 | −0.508 | 1 |
| count | 242 | 242 | 242 |
| mean | 7.185 | 4 | 80.902 |
| standard deviation | 1.867 | 2.578 | 24.579 |
| minimum | 0 | 0 | −3.333 |
| maximum | 8 | 8 | 100 |
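The first three rows of the table above form a correlation matrix between spare time, inactive time, and score. The text does not name the coefficient; the sketch below assumes Pearson's r and shows how such a matrix entry could be computed. The per-student values are hypothetical and are not taken from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical per-student aggregates (not the study's data).
spare = [8, 7, 8, 5, 2, 6]
inactive = [1, 3, 2, 5, 7, 4]
score = [95, 85, 90, 60, 20, 75]

r_spare_score = pearson(spare, score)
r_inactive_score = pearson(inactive, score)
```

The signs in the reported matrix follow the same pattern as in this toy data: spare time correlates positively with score (0.901) and inactive time negatively (−0.508).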

Cluster 2 | L-SVM | R-SVM | GP | DT | RF | NN | ADB | NB |
---|---|---|---|---|---|---|---|---|
**Continuous Features** | | | | | | | | |
Precision | 0.992 | 0.974 | 0.981 | 0.983 | 0.983 | 0.985 | 0.987 | 0.985 |
Recall | 0.993 | 0.980 | 0.984 | 0.985 | 0.985 | 0.986 | 0.986 | 0.984 |
Accuracy | 0.993 | 0.980 | 0.984 | 0.985 | 0.985 | 0.986 | 0.986 | 0.984 |
F1-score | 0.993 | 0.982 | 0.986 | 0.986 | 0.986 | 0.987 | 0.988 | 0.986 |
**Categorical Features** | | | | | | | | |
Precision | 0.984 | 0.992 | 0.989 | 0.991 | 0.990 | 0.990 | 0.990 | 0.989 |
Recall | 0.992 | 0.996 | 0.994 | 0.996 | 0.995 | 0.994 | 0.994 | 0.994 |
Accuracy | 0.992 | 0.996 | 0.994 | 0.996 | 0.995 | 0.994 | 0.994 | 0.994 |
F1-score | 0.996 | 0.998 | 0.997 | 0.998 | 0.997 | 0.997 | 0.997 | 0.997 |

Cluster 3 | L-SVM | R-SVM | GP | DT | RF | NN | ADB | NB |
---|---|---|---|---|---|---|---|---|
**Continuous Features** | | | | | | | | |
Precision | 0.957 | 0.934 | 0.943 | 0.935 | 0.935 | 0.940 | 0.931 | 0.892 |
Recall | 0.952 | 0.929 | 0.937 | 0.927 | 0.926 | 0.933 | 0.920 | 0.885 |
Accuracy | 0.952 | 0.929 | 0.937 | 0.927 | 0.926 | 0.933 | 0.920 | 0.885 |
F1-score | 0.952 | 0.930 | 0.938 | 0.929 | 0.928 | 0.935 | 0.922 | 0.886 |
**Categorical Features** | | | | | | | | |
Precision | 0.867 | 0.920 | 0.938 | 0.946 | 0.950 | 0.952 | 0.954 | 0.956 |
Recall | 0.930 | 0.954 | 0.961 | 0.963 | 0.963 | 0.965 | 0.963 | 0.962 |
Accuracy | 0.930 | 0.954 | 0.961 | 0.963 | 0.963 | 0.965 | 0.963 | 0.962 |
F1-score | 0.963 | 0.970 | 0.974 | 0.975 | 0.974 | 0.975 | 0.973 | 0.973 |

Cluster 4 | L-SVM | R-SVM | GP | DT | RF | NN | ADB | NB |
---|---|---|---|---|---|---|---|---|
**Continuous Features** | | | | | | | | |
Precision | 0.764 | 0.809 | 0.862 | 0.861 | 0.862 | 0.877 | 0.850 | 0.813 |
Recall | 0.842 | 0.841 | 0.880 | 0.868 | 0.868 | 0.881 | 0.837 | 0.805 |
Accuracy | 0.842 | 0.841 | 0.880 | 0.868 | 0.868 | 0.881 | 0.837 | 0.805 |
F1-score | 0.899 | 0.874 | 0.903 | 0.888 | 0.885 | 0.896 | 0.852 | 0.820 |
**Categorical Features** | | | | | | | | |
Precision | 0.596 | 0.778 | 0.843 | 0.856 | 0.858 | 0.866 | 0.848 | 0.855 |
Recall | 0.719 | 0.840 | 0.887 | 0.886 | 0.881 | 0.889 | 0.870 | 0.873 |
Accuracy | 0.719 | 0.840 | 0.887 | 0.886 | 0.881 | 0.889 | 0.870 | 0.873 |
F1-score | 0.788 | 0.874 | 0.911 | 0.905 | 0.898 | 0.905 | 0.889 | 0.891 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Hooshyar, D.; Pedaste, M.; Yang, Y.
Mining Educational Data to Predict Students’ Performance through Procrastination Behavior. *Entropy* **2020**, *22*, 12.
https://doi.org/10.3390/e22010012
