Education data mining methods for predicting students performance

Author(s): Verhun V. R.
Collection number: № 1 (58)
Pages: 71-78

The ability to predict student performance may open many opportunities for creating advanced, personalized educational programs and may influence the educational process and the quality of education. Most scientific publications investigate the feasibility of predicting students’ final grades and seek new methods and approaches that make predictions clearer and more precise. In most cases, data mining methods are used. All the data generated by learning management systems is considered for predictive modelling. Many research publications in the area of educational data mining have appeared, but many potential fields of research remain, since the scientific results are highly dependent on the context and the data structure. Prediction of students’ performance also depends on learning styles and teaching styles.

This research considers the problem of predicting the success of students in educational programs. Machine learning methods for solving classification tasks have been selected for predicting student success. The data set on which the performance of the selected methods was studied is described. Selected performance metrics have been used to evaluate the accuracy of the selected methods. The methods for solving the classification problem with the best performance indicators have been determined experimentally. It has been established that logistic regression and random forest provide the highest accuracy. All performance indicators of the algorithms are given, as well as the matrix of errors (confusion matrix). It has been found that the efficiency of applying these methods is not high and depends strongly on the data set, the context and the defined problem.
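
As an illustration of how performance indicators and a matrix of errors (confusion matrix) are typically derived for a classification task, the following is a minimal sketch in plain Python. It is not the code used in the study: the label names ("pass", "fail"), the sample predictions, and the helper names confusion_matrix and classification_metrics are all hypothetical and chosen only for this example.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a nested dict so that matrix[actual][predicted] is a count."""
    counts = Counter(zip(y_true, y_pred))
    return {a: {p: counts[(a, p)] for p in labels} for a in labels}


def classification_metrics(y_true, y_pred, positive):
    """Compute accuracy, plus precision, recall and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical data: actual course outcomes and a classifier's predictions.
y_true = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "fail"]
y_pred = ["pass", "fail", "fail", "pass", "pass", "fail", "pass", "fail"]

matrix = confusion_matrix(y_true, y_pred, labels=["pass", "fail"])
metrics = classification_metrics(y_true, y_pred, positive="pass")

print(matrix)
print(metrics)
```

The same two functions can be applied to the predictions of any classifier, for example logistic regression or random forest, so that the methods are compared on identical metrics.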

Keywords: data mining, forecasting, random forest, regression, success, curriculum, education, education data mining.

doi: 10.32403/1998-6912-2019-1-58-71-78

