Vol. 19 No. 1 (2020): Revista UIS Ingenierías
Articles

Comparison of data mining techniques to identify signs of student desertion, based on academic performance

Boris Rainiero Perez-Gutierrez
Universidad Francisco de Paula Santander

Published 2020-01-04

Keywords

  • student data
  • higher education
  • data mining
  • prediction models
  • dropout

How to Cite

Perez-Gutierrez, B. R. (2020). Comparison of data mining techniques to identify signs of student desertion, based on academic performance. Revista UIS Ingenierías, 19(1), 193–204. https://doi.org/10.18273/revuin.v19n1-2020018

Abstract

One of the great challenges for educational institutions is estimating the likelihood that their students will withdraw or drop out. This article presents the results of a comparative study of techniques that support the identification of student dropouts, using the academic records of students in the Systems Engineering program at a university in Colombia. The academic records cover a period of 7 years. Decision trees, logistic regression, and Naive Bayes were compared to determine the best dropout detection technique. Additionally, IBM’s Watson Analytics tool was evaluated to compare its usability and accuracy for a non-expert user. Our experience shows that simple algorithms are sufficient to achieve suitable levels of accuracy. These results are presented to the academic community to help reduce student dropout.
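To make the kind of comparison described in the abstract concrete, the following is a minimal sketch in plain Python, not the study's actual pipeline. It rests on several assumptions: the data are synthetic (two invented features, mean grade and number of failed courses), the decision tree is reduced to a one-level tree (a decision stump) to keep the example short, and all function names (`make_data`, `fit_stump`, `fit_logistic`, `fit_naive_bayes`, `compare`) are hypothetical. It fits the three classifiers on a training split and reports hold-out accuracy for each.

```python
import math
import random


def make_data(n=400, seed=0):
    """Synthetic records (assumed, not the study's data):
    ([mean grade, failed courses], 1 if the student dropped out else 0)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        dropout = 1 if rng.random() < 0.3 else 0
        grade = rng.gauss(2.6 if dropout else 3.6, 0.5)
        failed = max(0.0, rng.gauss(4.0 if dropout else 1.0, 1.5))
        rows.append(([grade, failed], dropout))
    return rows


def fit_stump(train):
    """One-level decision tree: the single feature/threshold split
    with the fewest training errors."""
    best = None
    for j in range(len(train[0][0])):
        for x, _ in train:
            t = x[j]
            for left in (0, 1):  # label predicted when x[j] <= t
                errors = sum((left if xi[j] <= t else 1 - left) != yi
                             for xi, yi in train)
                if best is None or errors < best[0]:
                    best = (errors, j, t, left)
    _, j, t, left = best
    return lambda x: left if x[j] <= t else 1 - left


def fit_logistic(train, lr=0.1, epochs=300):
    """Logistic regression fitted by batch gradient descent."""
    d = len(train[0][0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for x, y in train:
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted prob. minus label
            gw = [g + err * xi for g, xi in zip(gw, x)]
            gb += err
        w = [wi - lr * g / len(train) for wi, g in zip(w, gw)]
        b -= lr * gb / len(train)
    return lambda x: 1 if b + sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0


def fit_naive_bayes(train):
    """Gaussian Naive Bayes: class prior plus per-feature mean and variance."""
    stats = {}
    for c in (0, 1):
        xs = [x for x, y in train if y == c]
        means = [sum(col) / len(xs) for col in zip(*xs)]
        varis = [sum((v - m) ** 2 for v in col) / len(xs) + 1e-9
                 for col, m in zip(zip(*xs), means)]
        stats[c] = (math.log(len(xs) / len(train)), means, varis)

    def score(x, c):
        prior, means, varis = stats[c]
        return prior + sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, varis))

    return lambda x: 1 if score(x, 1) > score(x, 0) else 0


def accuracy(predict, test):
    """Fraction of test records whose label is predicted correctly."""
    return sum(predict(x) == y for x, y in test) / len(test)


def compare(seed=0):
    """Fit each classifier on 300 records and score it on the remaining 100."""
    rows = make_data(seed=seed)
    train, test = rows[:300], rows[300:]
    fitters = {
        "decision stump": fit_stump,
        "logistic regression": fit_logistic,
        "naive Bayes": fit_naive_bayes,
    }
    return {name: accuracy(fit(train), test) for name, fit in fitters.items()}


if __name__ == "__main__":
    for name, acc in compare().items():
        print(f"{name}: {acc:.2f}")
```

A single hold-out split is used here only for brevity. With real academic records, cross-validation and metrics beyond accuracy (for example recall on the dropout class) would likely give a fairer comparison, since dropouts are usually the minority class.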

