ClickTree: a tree-based method for predicting math students’ performance based on clickstream data

Narges Rohani, Behnam Rohani, Areti Manataki

Research output: Contribution to journal › Article › peer-review

Abstract

The prediction of student performance and the analysis of students’ learning behaviour play an important role in enhancing online courses. By analysing a massive amount of clickstream data that captures student behaviour, educators can gain valuable insights into the factors that influence students’ academic outcomes and identify areas of improvement in courses. In this study, we developed ClickTree, a tree-based methodology, to predict student performance in mathematical problems in end-unit assignments based on students’ clickstream data. Utilising extensive clickstream data, we extracted a novel set of features at three levels (problem-level, assignment-level and student-level), and we trained a CatBoost tree to predict whether a student will successfully answer a problem in an end-unit assignment or not. The developed method achieved an Area under the ROC Curve (AUC) of approximately 79% in the Educational Data Mining Cup 2023 and ranked second in the competition. Our results indicate that students who performed well in end-unit assignment problems engaged more with in-unit assignments and answered more problems correctly, while those who struggled had higher tutoring request rates. We also found that students face more difficulties with “check all that apply” types of problems. Moreover, Algebra II was the most difficult subject for students. The proposed method can be utilised to improve students’ learning experiences, and the insights from this study can be integrated into mathematics courses to enhance students’ learning outcomes. The code and implementation are available at https://www.kaggle.com/code/nargesrohani/clicktree/notebook.
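As a purely illustrative sketch, and not the authors' implementation (which is in the linked notebook), the snippet below shows how student-level features might be aggregated from a clickstream log and how an AUC score can be computed from predicted scores. The log layout, the action names and the feature names are all assumptions made for this example; the CatBoost training step is omitted so that the sketch needs only the Python standard library.

```python
from collections import defaultdict

# Hypothetical toy clickstream: (student_id, assignment_id, action) tuples.
# The action names are illustrative assumptions, not the dataset's schema.
LOG = [
    ("s1", "a1", "correct_response"),
    ("s1", "a1", "tutoring_requested"),
    ("s1", "a1", "wrong_response"),
    ("s2", "a1", "correct_response"),
    ("s2", "a1", "correct_response"),
]


def student_level_features(log):
    """Aggregate per-student action counts into simple rate features."""
    counts = defaultdict(lambda: defaultdict(int))
    for student, _assignment, action in log:
        counts[student][action] += 1

    features = {}
    for student, c in counts.items():
        total = sum(c.values())
        answered = c["correct_response"] + c["wrong_response"]
        features[student] = {
            "n_actions": total,
            "tutoring_rate": c["tutoring_requested"] / total,
            "correct_rate": c["correct_response"] / answered if answered else 0.0,
        }
    return features


def auc(labels, scores):
    """AUC as the probability that a random positive example is scored
    above a random negative one (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


FEATURES = student_level_features(LOG)
```

In a fuller pipeline, features like these would be joined with problem-level and assignment-level features and passed to a gradient-boosted tree classifier, with `auc` (or a library equivalent) applied to its predicted probabilities on held-out data.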
Original language: English
Pages (from-to): 32-57
Number of pages: 26
Journal: Journal of Educational Data Mining
Volume: 16
Issue number: 2
Publication status: Published - 17 Oct 2024

Keywords

  • Student performance prediction
  • Educational data mining
  • Mathematics
  • Learning behaviour
  • Learning analytics
