Education data mining methods for predicting students performance. Scientific Papers. Ukrainian Academy of Printing

Author(s)	Collection number	Pages
Verhun V. R.	№ 1 (58)	71-78

Summary

The ability to predict student performance can open many opportunities for creating advanced, personalized educational programs and can influence both the educational process and the quality of education. Most scientific publications investigate the feasibility of predicting students’ final grades and seek new methods and approaches that make predictions clearer and more precise. In most cases, data mining methods are used, and all the data generated by learning management systems is considered for predictive modelling. Many research papers in the area of educational data mining are now being published, but many potential fields of research remain, since scientific results depend heavily on the context and the data structure. Prediction of students’ performance also depends on learning styles and teaching styles.

This research considers the problem of predicting the success of students in educational programs. Machine learning methods for solving classification tasks were selected to predict student success. The data set on which the performance of the selected methods was studied is described. Selected performance metrics were used to evaluate the accuracy of the methods. The methods with the best performance indicators for the classification problem were determined experimentally. Logistic regression and random forest were found to provide the highest accuracy. All performance indicators of the algorithms are reported, together with the confusion matrix. The effectiveness of these methods was found not to be high, and it depends strongly on the data set, the context, and the problem definition.
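As an illustration of how performance indicators can be derived from an error (confusion) matrix for a binary classification task, the following minimal sketch uses only the Python standard library. The summary does not state which metrics were used in the study, so accuracy, precision, recall and F1 are assumptions chosen as common examples; the labels "pass" and "fail", the sample data, and the function names `confusion_matrix` and `metrics` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, positive="pass"):
    """Count true/false positives and negatives for a binary task."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return {key: counts[key] for key in ("tp", "fp", "fn", "tn")}


def metrics(cm):
    """Derive common indicators from the confusion matrix counts."""
    tp, fp, fn, tn = cm["tp"], cm["fp"], cm["fn"], cm["tn"]
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical outcomes for eight students (not data from the study).
y_true = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "fail"]
y_pred = ["pass", "fail", "fail", "pass", "pass", "fail", "pass", "fail"]

cm = confusion_matrix(y_true, y_pred)
scores = metrics(cm)
print(cm)
print(scores)
```

In practice the predicted labels would come from a trained classifier such as logistic regression or random forest; here they are hard-coded so that the calculation of the indicators can be followed by hand.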

Keywords: data mining, forecasting, random forest, regression, success, curriculum, education, education data mining.

doi: 10.32403/1998-6912-2019-1-58-71-78
