Classification Modelling of Random Forest to Identify the Important Factors in Improving the Quality of Education

Aditya Ramadhan, Budi Susetyo, Indahwati


The National Education Standards (SNP) are the minimum criteria that education units and/or educational organizations must meet to realize high-quality national education. Compliance is evaluated through accreditation, and graduate competencies are evaluated nationally through the national examination (UN). The causal relationship between the SNP and the UN has been studied before, but classification modelling has not previously been used to explain that relationship. This study employed random forest for multi-class classification to identify the variables that matter most for improving the quality of education at the high school level (SMA/MA), based on computer-based national exam (UNBK) scores and accreditation results. In the model evaluation, the multi-class random forest model achieved the highest classification accuracy and G-Mean, 88.17% and 48.95% respectively. The model identifies the important factors in classifying the quality of education in terms of the items of the accreditation instrument. The important factors are items 69, 68, 62, 71, 67, 55, 56, 83, 45, 39, 36, 33, 64, 46, and 14. Based on the indicators behind these items, three components of the SNP play an important role in classifying the quality of education: the standards of school facilities (SSP), the standards of teacher and education staff (SPT), and the standards of graduate competency (SKL). The results suggest that regional governments and education units should collaborate in improving SSP, SPT, and SKL.
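The gap between the reported accuracy (88.17%) and G-Mean (48.95%) is typical of imbalanced multi-class data, and a small sketch may help illustrate why. The example below assumes that G-Mean is the geometric mean of the per-class recalls, which is a common definition but is not stated in the abstract. The confusion matrix is invented purely for illustration and is not the study's data; the function names are likewise hypothetical.

```python
import math

def per_class_recall(confusion):
    """Recall of each class from a square confusion matrix (rows = true class)."""
    recalls = []
    for i, row in enumerate(confusion):
        total = sum(row)
        # A class with no true examples gets recall 0.0 to avoid division by zero.
        recalls.append(row[i] / total if total else 0.0)
    return recalls

def accuracy(confusion):
    """Share of all observations that fall on the diagonal."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

def g_mean(confusion):
    """Geometric mean of per-class recalls (assumed definition of G-Mean)."""
    recalls = per_class_recall(confusion)
    return math.prod(recalls) ** (1.0 / len(recalls))

# Hypothetical 3-class confusion matrix (invented, NOT the study's results).
# Rows are true classes, columns are predicted classes; class 0 dominates.
confusion = [
    [900, 5, 5],
    [30, 15, 5],
    [25, 5, 10],
]

acc = accuracy(confusion)
gm = g_mean(confusion)
print(f"accuracy = {acc:.4f}")
print(f"G-Mean   = {gm:.4f}")
```

Because the majority class dominates the total count, accuracy stays high even when the minority classes are mostly misclassified, whereas the geometric mean is pulled down by every weak class recall.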


National education standards; UNBK; classification modelling; multi-class random forest.






Published by INSIGHT - Indonesian Society for Knowledge and Human Development