# Predicting and Comparing Students’ Online and Offline Academic Performance Using Machine Learning Algorithms


## Abstract


## 1. Introduction

## 2. Related Works

#### 2.1. The New Way of Teaching

#### 2.2. Behavior Studies and the Application of Markov Models

#### 2.3. On-Site Research

#### 2.4. Previous Applications of Decision Trees and Artificial Neural Networks

#### 2.5. Previous Applications of Support Vector Machines

## 3. Databases

## 4. Correlation Analysis—The Chi-Square Test
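To illustrate the chi-square independence test used to relate each attribute to the PASSED variable, a minimal sketch with `scipy.stats.chi2_contingency` follows; the contingency table below is invented for illustration and does not reproduce the paper's data.

```python
# Hypothetical 2x2 contingency table: rows are two levels of an
# attribute, columns are counts of PASSED vs. NOT PASSED students.
from scipy.stats import chi2_contingency

table = [[30, 10],   # e.g. high prior GPA: 30 passed, 10 failed
         [12, 28]]   # e.g. low prior GPA: 12 passed, 28 failed

chi2, p, dof, expected = chi2_contingency(table)
print(round(chi2, 2), dof)
```

A small p-value (e.g. p < 0.05) suggests the attribute and PASSED are not independent, which is the criterion used to flag attributes as being in relationship with PASSED.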

## 5. Used Machine Learning Algorithms

#### 5.1. Support Vector Machine

#### 5.2. K-Nearest Neighbor

- Euclidean distance (for continuous features): the square root of the sum of the squared differences between the coordinates of two points.
- Manhattan distance (for continuous features): the sum of the absolute differences between the coordinates of two vectors.
- Hamming distance (for categorical features): the count of positions at which two values differ; for a single pair, the distance D is 0 if x and y are equal and 1 otherwise.
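The three distance measures above can be sketched directly in numpy; the function names and test points are illustrative, not from the paper.

```python
import numpy as np

def euclidean(x, y):
    # Square root of the sum of squared coordinate differences.
    return float(np.sqrt(np.sum((np.asarray(x) - np.asarray(y)) ** 2)))

def manhattan(x, y):
    # Sum of absolute coordinate differences.
    return float(np.sum(np.abs(np.asarray(x) - np.asarray(y))))

def hamming(x, y):
    # Count of positions where the categorical values differ.
    return int(sum(a != b for a, b in zip(x, y)))

print(euclidean([0, 0], [3, 4]))                   # 5.0
print(manhattan([0, 0], [3, 4]))                   # 7.0
print(hamming(["a", "b", "c"], ["a", "x", "c"]))   # 1
```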

#### 5.3. Decision Tree

#### 5.4. Random Forest

#### 5.5. Limitations of the Different Algorithms and Databases

## 6. Method

- numpy: a library for numerical computing in Python;
- pandas: a library for data manipulation and analysis;
- matplotlib: a library for creating static, animated, and interactive visualizations in Python;
- seaborn: a data-visualization library built on top of matplotlib;
- time: a standard-library module providing various time-related functions;
- scipy: a library for scientific and technical computing;
- astropy: a library for astronomy and astrophysics;
- scikit-learn (imported as sklearn): a library of machine learning algorithms and tools for Python.

- Support vector machine (SVM) was used with three different kernels (linear, polynomial, Gaussian). The algorithm splits the data into training, validation, and testing sets and tunes hyperparameters to find the optimal C value for the SVM. The SVM model is then trained with the optimal C value, and the testing set is used to evaluate the model’s accuracy and F1 score; this process is repeated until a threshold of accuracy and F1 score is reached, after which the results are displayed. The max_iter limit was set to 100, as this value gave the best results. The optimal split states for the linear, polynomial, and Gaussian kernels were 388,628,375; 7,070,621; and 93,895,097, respectively. Random splitting of the data was used to avoid overfitting and to obtain a generalizable model; by setting the seed to a specific number, we can ensure that the data is split in the same way each time the code is run.
- The k-nearest neighbors (KNN) algorithm was used to classify a dataset into two classes (binary classification). The dataset was loaded into x and y, where x contained the features and y the target variable (the class labels). To find the optimal random state, the dataset was repeatedly split into training and testing sets, a KNN model was trained, and its performance was evaluated using accuracy and F1 score; the state yielding the highest scores was chosen. Once the optimal state was found, the dataset was split again and a new KNN model was trained on the training set with fixed hyperparameters (n_neighbors = 7, metric = 'chebyshev'). GridSearchCV was used to determine the best value for the number of neighbors, although it was not used to train the final KNN model; K = 7 was chosen because parameter tuning identified 7 and 11 as the best candidates.
- The decision tree classifier was used to classify a dataset into two classes (binary classification). The dataset is loaded into X_1 and y_1, where X_1 contains the features and y_1 the target variable (the class labels). The algorithm performs a train-test split and fits a decision tree classifier on the training set. It also performs cross-validation using both the k-fold and stratified k-fold methods, with both 5 and 10 folds, to evaluate the performance of the model. The macro-averaged scores for accuracy, precision, recall, and F1-score are computed using the cross_validate function.
- The random forest classifier was trained to predict whether a student has passed a course, based on various features related to their demographics, academic performance, and personal habits. The model is evaluated using a train-test split, as well as cross-validation via both the k-fold and stratified k-fold methods, with accuracy, precision, recall, and F1 score as the evaluation metrics.
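The cross-validation protocol described in the last two bullets can be sketched as follows, run on synthetic data; the fold counts (5 and 10) and the macro-averaged scoring follow the text, while the dataset and tree settings are illustrative assumptions.

```python
# Evaluate a decision tree with k-fold and stratified k-fold CV,
# using 5 and 10 folds, reporting macro-averaged metrics.
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, StratifiedKFold, cross_validate
from sklearn.tree import DecisionTreeClassifier

X_1, y_1 = make_classification(n_samples=300, n_features=8, random_state=0)
scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]

results = {}
for n_splits in (5, 10):
    for cv in (KFold(n_splits=n_splits, shuffle=True, random_state=0),
               StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)):
        scores = cross_validate(DecisionTreeClassifier(random_state=0),
                                X_1, y_1, cv=cv, scoring=scoring)
        results[(type(cv).__name__, n_splits)] = scores["test_f1_macro"].mean()

for key, f1 in results.items():
    print(key, round(f1, 2))
```

The same loop works unchanged for a `RandomForestClassifier`, since `cross_validate` accepts any scikit-learn estimator.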

## 7. Results

#### 7.1. Support Vector Machine

- Linear kernel: used when the data is linearly separable, i.e., when it can be divided with a single line (hyperplane). It is typically applied when a dataset contains a large number of features.
- Polynomial kernel: a kernel function frequently used with SVMs and other kernelized models; it represents the similarity of vectors in a feature space in terms of polynomials of the original variables, enabling the learning of non-linear models.
- Gaussian radial basis function (RBF) kernel: a popular kernel in SVM models; its value depends on the distance of a point from the origin or from some reference point.
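The three kernels above can be compared with a short sketch on synthetic data; the dataset and the default hyperparameters are assumptions for illustration, not the paper's setup.

```python
# Fit an SVC with each kernel and compare held-out accuracy.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=6, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

accuracies = {}
for kernel in ("linear", "poly", "rbf"):
    # "linear" fits a single separating hyperplane, "poly" uses
    # polynomial combinations of features (degree 3 by default),
    # and "rbf" is the Gaussian radial basis function kernel.
    clf = SVC(kernel=kernel).fit(X_tr, y_tr)
    accuracies[kernel] = clf.score(X_te, y_te)

for kernel, acc in accuracies.items():
    print(kernel, round(acc, 2))
```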

#### 7.2. K-Nearest Neighbors

#### 7.3. Decision Tree

#### 7.4. Random Forest

## 8. Discussion

## 9. Conclusions and Future Work

#### 9.1. Conclusions

#### 9.2. Future Work and Suggestions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

- Acevedo-Duque, Á.; Jiménez-Bucarey, C.; Prado-Sabido, T.; Fernández-Mantilla, M.M.; Merino-Flores, I.; Izquierdo-Marín, S.S.; Valle-Palomino, N. Education for Sustainable Development: Challenges for Postgraduate Programmes. Int. J. Environ. Res. Public Health **2023**, 20, 1759.
- Psacharopoulos, G. The Contribution of Education to Economic Growth: International Comparisons. Available online: https://documents.worldbank.org/en/publication/documents-reports/documentdetail/843251487669376366/the-contribution-of-education-to-economic-growth-international-comparisons (accessed on 12 March 2023).
- Thomas, L. Towards equity: Developing a national approach to improving social justice through higher education in England. In Equity Policies in Global Higher Education: Reducing Inequality and Increasing Participation and Attainment; Springer International Publishing: Cham, Switzerland, 2022; pp. 89–115.
- Shah, T.H. Big data analytics in higher education. In Research Anthology on Big Data Analytics, Architectures, and Applications; Information Resources Management Association: Hershey, PA, USA, 2022; pp. 1275–1293.
- Thorn, W.; Vincent-Lancrin, S. Education in the time of COVID-19 in France, Ireland, the United Kingdom and the United States: The nature and impact of remote learning. In Primary and Secondary Education during COVID-19: Disruptions to Educational Opportunity during a Pandemic; Springer: New York, NY, USA, 2022; pp. 383–420.
- Fauzi, M.A. E-learning in higher education institutions during COVID-19 pandemic: Current and future trends through bibliometric analysis. Heliyon **2022**, 8, e09433.
- Xiao, W.; Ji, P.; Hu, J. A survey on educational data mining methods used for predicting students’ performance. Eng. Rep. **2022**, 4, e12482.
- Yağcı, M. Educational data mining: Prediction of students’ academic performance using machine learning algorithms. Smart Learn. Environ. **2022**, 9, 11.
- Zhang, W.; Wang, Y.; Wang, S. Predicting academic performance using tree-based machine learning models: A case study of bachelor students in an engineering department in China. Educ. Inf. Technol. **2022**, 27, 13051–13066.
- Qiu, F.; Zhang, G.; Sheng, X.; Jiang, L.; Zhu, L.; Xiang, Q.; Jiang, B.; Chen, P.K. Predicting students’ performance in e-learning using learning process and behaviour data. Sci. Rep. **2022**, 12, 453.
- Yousafzai, B.K.; Khan, S.A.; Rahman, T.; Khan, I.; Ullah, I.; Ur Rehman, A.; Baz, M.; Hamam, H.; Cheikhrouhou, O. Student-Performulator: Student Academic Performance Using Hybrid Deep Neural Network. Sustainability **2021**, 13, 9775.
- Atlam, E.S.; Ewis, A.; El-Raouf, M.A.; Ghoneim, O.; Gad, I. A new approach in identifying the psychological impact of COVID-19 on university student’s academic performance. Alex. Eng. J. **2022**, 61, 5223–5233.
- Gao, L.; Zhao, Z.; Li, C.; Zhao, J.; Zeng, Q. Deep cognitive diagnosis model for predicting students’ performance. Future Gener. Comput. Syst. **2022**, 126, 252–262.
- Mubarak, A.A.; Cao, H.; Zhang, W. Prediction of students’ early dropout based on their interaction logs in online learning environment. Interact. Learn. Environ. **2020**, 30, 1414–1433.
- Liu, C.; Wang, H.; Du, Y.; Yuan, Z. A Predictive Model for Student Achievement Using Spiking Neural Networks Based on Educational Data. Appl. Sci. **2022**, 12, 3841.
- Chettaoui, N.; Atia, A.; Bouhlel, M.S. Predicting Students Performance Using Eye-Gaze Features in an Embodied Learning Environment. In Proceedings of the 2022 IEEE Global Engineering Education Conference (EDUCON), Tunis, Tunisia, 28–31 March 2022; pp. 704–711.
- Tadayon, M.; Pottie, G.J. Predicting Student Performance in an Educational Game Using a Hidden Markov Model. IEEE Trans. Educ. **2020**, 63, 299–304.
- Nabizadeh, A.H.; Gonçalves, D.; Gama, S.; Jorge, J. Early Prediction of Students’ Final Grades in a Gamified Course. IEEE Trans. Learn. Technol. **2022**, 15, 311–325.
- Hai, L.; Sang, G.; Wang, H.; Li, W.; Bao, X. An Empirical Investigation of University Students’ Behavioural Intention to Adopt Online Learning: Evidence from China. Behav. Sci. **2022**, 12, 403.
- Valdebenito-Villalobos, J.; Parra-Rizo, M.A.; Chávez-Castillo, Y.; Díaz-Vargas, C.; Sanzana Vallejos, G.; Gutiérrez Echavarría, A.; Tapia Figueroa, A.; Godoy Montecinos, X.; Zapata-Lamana, R.; Cigarroa, I. Perception of Cognitive Functions and Academic Performance in Chilean Public Schools. Behav. Sci. **2022**, 12, 356.
- Xu, J.; Moon, K.H.; van der Schaar, M. A Machine Learning Approach for Tracking and Predicting Student Performance in Degree Programs. IEEE J. Sel. Top. Signal Process. **2017**, 11, 742–753.
- Ofori, F.; Maina, E.; Gitonga, R. Using machine learning algorithms to predict students’ performance and improve learning outcome: A literature based review. J. Inf. Technol. **2020**, 4, 33–55.
- Balaji, P.; Alelyani, S.; Qahmash, A.; Mohana, M. Contributions of Machine Learning Models towards Student Academic Performance Prediction: A Systematic Review. Appl. Sci. **2021**, 11, 7.
- Baashar, Y.; Alkawsi, G.; Mustafa, A.; Alkahtani, A.A.; Alsariera, Y.A.; Ali, A.Q.; Hashim, W.; Tiong, S.K. Toward Predicting Student’s Academic Performance Using Artificial Neural Networks (ANNs). Appl. Sci. **2022**, 12, 1289.
- Thaher, T.; Zaguia, A.; Al Azwari, S.; Mafarja, M.; Chantar, H.; Abuhamdah, A.; Turabieh, H.; Mirjalili, S.; Sheta, A. An Enhanced Evolutionary Student Performance Prediction Model Using Whale Optimization Algorithm Boosted with Sine-Cosine Mechanism. Appl. Sci. **2021**, 11, 237.
- Ramaswami, G.; Susnjak, T.; Mathrani, A. On Developing Generic Models for Predicting Student Outcomes in Educational Data Mining. Big Data Cogn. Comput. **2022**, 6, 6.
- Poudyal, S.; Mohammadi-Aragh, M.J.; Ball, J.E. Prediction of Student Academic Performance Using a Hybrid 2D CNN Model. Electronics **2022**, 11, 1005.
- Arcinas, M.M. Design of Machine Learning Based Model to Predict Students Academic Performance. ECS Trans. **2022**, 107, 3207.
- Naicker, N.; Adeliyi, T.; Wing, J. Linear support vector machines for prediction of student performance in school-based education. Math. Probl. Eng. **2020**, 2020, 4761468.
- Csafrit. Higher Education Students Performance Evaluation. 2021. Available online: https://www.kaggle.com/datasets/csafrit2/higher-education-students-performance-evaluation (accessed on 1 November 2022).
- kartikaya924. Student-Performance-Prediction-Using-Data-Mining-Techniques: A Semantic Approach towards Student Performance Prediction Using Data Mining Techniques. Available online: https://github.com/kartikaya924/Student-Performance-Prediction-using-Data-Mining-Techniques (accessed on 1 November 2022).
- Salcedo-Sanz, S.; Rojo-Álvarez, J.L.; Martínez-Ramón, M.; Camps-Valls, G. Support vector machines in engineering: An overview. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. **2014**, 4, 234–267.
- Janni, M.; Coppedè, N.; Bettelli, M.; Briglia, N.; Petrozza, A.; Summerer, S.; Vurro, F.; Danzi, D.; Cellini, F.; Marmiroli, N.; et al. In Vivo Phenotyping for the Early Detection of Drought Stress in Tomato. Plant Phenomics **2019**, 2019, 1–10.
- Berwick, R. An Idiot’s Guide to Support Vector Machines (SVMs). 2013. Available online: https://web.mit.edu/6.034/wwwbob/svm-notes-long-08.pdf (accessed on 13 February 2023).
- Prasatha, V.; Alfeilate, H.A.A.; Hassanate, A.; Lasassmehe, O.; Tarawnehf, A.S.; Alhasanatg, M.B.; Salmane, H.S.E. Effects of distance measure choice on KNN classifier performance: A review. arXiv **2017**, arXiv:1708.04321.
- Shah, R. Introduction to k-Nearest Neighbors (kNN) Algorithm. Medium, 2021. Available online: https://ai.plainenglish.io/introduction-to-k-nearest-neighbors-knn-algorithm-e8617a448fa8 (accessed on 13 February 2023).
- Kotsiantis, S.B. Decision trees: A recent overview. Artif. Intell. Rev. **2013**, 39, 261–283.
- Biau, G.; Scornet, E. A random forest guided tour. Test **2016**, 25, 197–227.
- Kashyap, K. Machine Learning: Decision Trees and Random Forest Classifiers. Medium, 2019. Available online: https://medium.com/analytics-vidhya/machine-learning-decision-trees-and-random-forest-classifiers-81422887a544 (accessed on 13 February 2023).
- Manna, S. Evaluation Metrics Part 3. Medium, 2020. Available online: https://medium.com/the-owl/evaluation-metrics-part-3-47c315e07222 (accessed on 13 February 2023).

**Figure 6.** Support vector machine hyperplane [33].

**Figure 7.** Support vector machine margin [34].

**Figure 8.** K-nearest neighbor algorithm [36].

**Figure 9.** Random forest algorithm [39].

**Figure 10.** ROC curve description [40].

| Online Attribute | p | In Relationship with PASSED | Offline Attribute | p | In Relationship with PASSED |
|---|---|---|---|---|---|
| GENDER | 1.0 | | 1- Student Age | 0.5653471007309754 | |
| SCHOOL | 0.30312215312723617 | | 2- Sex | 0.5523722889801679 | |
| AGE | 0.2953693982712599 | | 3- Graduated high-school type | 0.24781084162944328 | |
| GPA | $3.0519065109580407\times 10^{-169}$ | ✓ | 4- Scholarship type | 0.11572254970006551 | |
| TOOL | 0.28673716710442554 | | 5- Additional work | 0.7826380994749932 | |
| TIME | 0.008962416672052801 | ✓ | 6- Regular artistic or sports activity | 0.5523722889801679 | |
| STUDY_DIGITAL | 0.08621366366725645 | | 7- Do you have a partner | 0.6303571835687162 | |
| DISTRACTED | 0.7399025825207426 | | 8- Total salary if available | 0.6099332258713447 | |
| FIX_BEDTIME | 0.6908995564635236 | | 9- Transportation to the university | 0.5322122277880168 | |
| SLEEPING_HABBIT | 0.47716005566088127 | | 10- Accommodation type in Cyprus | 0.20099557574466903 | |
| SCREEN_EXPOSURE | 0.9837331924869825 | | 11- Mother’s education | 0.8447430862982488 | |
| DISTANCE_LEARNING | 0.87106656348976 | | 12- Father’s education | 0.4446173066557606 | |
| ISOLATION | 0.19302015313488863 | | 13- Number of sisters/brothers if available | 0.3751796318164345 | |
| UNI_LEARNING_STRENGTHENING | 0.7302432920209609 | | 14- Parental status | 0.26226459029998406 | |
| STAYING_HOME_LAZINESS | 0.9494226741965801 | | 15- Mother’s occupation | 0.5866029226108516 | |
| TOOLS_BOREDOM | 0.857639492044402 | | 16- Father’s occupation | 0.970296826082659 | |
| STRESS | 0.9438510256976413 | | 17- Weekly study hours | 0.730484272332087 | |
| VOLUME_OF_ASSIGNMENTS | 0.6252988719785977 | | 18- Reading frequency non-scientific books/journals | 0.6578769407759745 | |
| ONLINE_QUIZ_NERVOUSNESS | 0.2998717029054319 | | 19- Reading frequency scientific books/journals | 0.47322234454747136 | |
| | | | 20- Attendance to the seminars/conferences | 0.6301485020030153 | |
| | | | 21- Impact of your projects/activities on your success | 0.36687804454852474 | |
| | | | 22- Attendance to classes | 0.16635800926992822 | |
| | | | 23- Preparation to midterm exams 1 | 0.31046666478253787 | |
| | | | 24- Preparation to midterm exams 2 | 0.7229766411774667 | |
| | | | 25- Taking notes in classes | 0.3169771850067032 | |
| | | | 26- Listening in classes | 0.670788988176968 | |
| | | | 27- Discussion improves my interest | 0.31097735731866283 | |
| | | | 28- Flip-classroom | 0.7258411373761349 | |
| | | | 29- Cumulative grade point last semester (/4.00) | 0.001270588208597507 | ✓ |
| | | | 30- Cumulative grade point in the graduation (/4.00) | 0.06807720013490783 | |
| | | | 31- Course ID | 0.0026331340604132353 | ✓ |
| | | | 32- OUTPUT Grade | $4.549289632793446\times 10^{-28}$ | ✓ |

| Metric | Linear Kernel | Polynomial Kernel | Gaussian Kernel |
|---|---|---|---|
| Training time | 1 ms | 0 ms | 1 ms |
| Accuracy (%) | 87.5 | 70.8 | 70.83 |
| Confusion matrix | [[0, 7], [0, 17]] | [[2, 3], [4, 15]] | [[0, 3], [0, 21]] |
| F1 score | 0.41 | 0.59 | 0.47 |
| ROC_AUC score | 0.50 | 0.59 | 0.50 |

- Model accuracy: 98%
- Training time: 0 ms
- F1 score: 0.5
- ROC_AUC score: 0.5

**Offline**

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0.0 | 0.00 | 0.00 | 0.00 | 13 |
| 1.0 | 0.70 | 1.00 | 0.83 | 31 |
| Accuracy | | | 0.70 | 44 |
| Macro avg | 0.35 | 0.50 | 0.41 | 44 |
| Weighted avg | 0.50 | 0.70 | 0.58 | 44 |

**Online**

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0.0 | 0.00 | 0.00 | 0.00 | 1 |
| 1.0 | 1.00 | 1.00 | 1.00 | 236 |
| Accuracy | | | 1.00 | 237 |
| Macro avg | 0.50 | 0.50 | 0.50 | 237 |
| Weighted avg | 0.99 | 1.00 | 0.99 | 237 |

**Offline**

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0.0 | 1.00 | 1.00 | 1.00 | 8 |
| 1.0 | 1.00 | 1.00 | 1.00 | 36 |
| Accuracy | | | 1.00 | 44 |
| Macro avg | 1.00 | 1.00 | 1.00 | 44 |
| Weighted avg | 1.00 | 1.00 | 1.00 | 44 |

Confusion matrix: [[8, 0], [0, 26]]

**Online**

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0.0 | 1.00 | 1.00 | 1.00 | 1 |
| 1.0 | 1.00 | 1.00 | 1.00 | 236 |
| Accuracy | | | 1.00 | 237 |
| Macro avg | 1.00 | 1.00 | 1.00 | 237 |
| Weighted avg | 1.00 | 1.00 | 1.00 | 237 |

Confusion matrix: [[1, 0], [0, 236]]

**Offline**

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0.0 | 0.80 | 1.00 | 0.89 | 8 |
| 1.0 | 1.00 | 0.94 | 0.97 | 36 |
| Accuracy | | | 0.95 | 44 |
| Macro avg | 0.90 | 0.97 | 0.93 | 44 |
| Weighted avg | 0.96 | 0.95 | 0.96 | 44 |

Confusion matrix: [[8, 0], [2, 34]]

**Online**

| | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0.0 | 0.00 | 0.00 | 0.00 | 1 |
| 1.0 | 1.00 | 1.00 | 1.00 | 236 |
| Accuracy | | | 1.00 | 237 |
| Macro avg | 0.50 | 0.50 | 0.50 | 237 |
| Weighted avg | 0.99 | 1.00 | 0.99 | 237 |

Confusion matrix: [[0, 1], [0, 236]]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Holicza, B.; Kiss, A. Predicting and Comparing Students’ Online and Offline Academic Performance Using Machine Learning Algorithms. *Behav. Sci.* **2023**, *13*, 289.
https://doi.org/10.3390/bs13040289

**AMA Style**

Holicza B, Kiss A. Predicting and Comparing Students’ Online and Offline Academic Performance Using Machine Learning Algorithms. *Behavioral Sciences*. 2023; 13(4):289.
https://doi.org/10.3390/bs13040289

**Chicago/Turabian Style**

Holicza, Barnabás, and Attila Kiss. 2023. "Predicting and Comparing Students’ Online and Offline Academic Performance Using Machine Learning Algorithms" *Behavioral Sciences* 13, no. 4: 289.
https://doi.org/10.3390/bs13040289