Facial and Body Gesture Recognition for Determining Student Concentration Level

Xian Yang Chan, Tee Connie, Michael Kah Ong Goh


Online learning has gained immense popularity, especially since the COVID-19 pandemic, but it brings its own set of challenges. One of the most critical is evaluating students' concentration levels during virtual classes. Unlike in a traditional brick-and-mortar classroom, teachers cannot observe students' body language and facial expressions to determine whether they are paying attention. To address this challenge, this study proposes using facial and body gestures to evaluate students' concentration levels. Common gestures such as yawning, playing with fingers or objects, and looking away from the screen indicate a lack of focus. A dataset of images showing students performing various actions and gestures that represent different concentration levels is collected. We propose an enhanced model based on a vision transformer (RViT) to classify the concentration levels. The model incorporates a majority-voting mechanism to maintain prediction accuracy in real time: several frames are classified individually, and the final prediction is the class that receives the most votes. The proposed method achieves a promising 92% accuracy while maintaining efficient computational performance. The system provides an unbiased measure of students' concentration levels, which can be useful in educational settings to improve learning outcomes, enabling educators to foster a more engaging and productive virtual classroom environment.
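The majority-voting step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `majority_vote` and the class labels are hypothetical, and the per-frame predictions are assumed to come from the classifier described in the paper.

```python
from collections import Counter


def majority_vote(frame_predictions):
    """Return the class predicted most often across a window of frames.

    frame_predictions: list of per-frame class labels (e.g. strings).
    Ties are resolved in favour of the class seen earliest, which is
    Counter.most_common's behaviour for equal counts.
    """
    if not frame_predictions:
        raise ValueError("need at least one frame prediction")
    counts = Counter(frame_predictions)
    return counts.most_common(1)[0][0]


# Hypothetical example: five consecutive frames from one student.
clip = ["focused", "focused", "distracted", "focused", "yawning"]
print(majority_vote(clip))  # focused
```

Voting over a window of frames smooths out single-frame misclassifications (e.g. a momentary glance away), which is why it helps sustain accuracy in a real-time setting.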


Vision transformer; random projection; facial expression recognition; gesture recognition; concentration level prediction




R. Shafique, W. Aljedaani, F. Rustam, E. Lee, A. Mehmood, and G. S. Choi, “Role of Artificial Intelligence in Online Education: A Systematic Mapping Study,” IEEE Access, vol. 11, pp. 52570–52584, 2023, doi: 10.1109/ACCESS.2023.3278590.

Y. Shi, F. Sun, H. Zuo, and F. Peng, “Analysis of Learning Behavior Characteristics and Prediction of Learning Effect for Improving College Students’ Information Literacy Based on Machine Learning,” IEEE Access, vol. 11, pp. 50447–50461, 2023, doi: 10.1109/ACCESS.2023.3278370.

A. Revadekar, S. Oak, A. Gadekar, and P. Bide, “Gauging attention of students in an e-learning environment,” in 2020 IEEE 4th Conference on Information & Communication Technology (CICT), Dec. 2020, pp. 1–6. doi: 10.1109/CICT51604.2020.9312048.

D. M. Cretu and Y.-S. Ho, “The Impact of COVID-19 on Educational Research: A Bibliometric Analysis,” Sustainability, vol. 15, no. 6, Art. no. 6, Jan. 2023, doi: 10.3390/su15065219.

A. Kumar et al., “Impact of the COVID-19 pandemic on teaching and learning in health professional education: a mixed methods study protocol,” BMC Medical Education, vol. 21, no. 1, p. 439, Aug. 2021, doi: 10.1186/s12909-021-02871-w.

B. Meriem, H. Benlahmar, M. A. Naji, E. Sanaa, and K. Wijdane, “Determine the Level of Concentration of Students in Real Time from their Facial Expressions,” International Journal of Advanced Computer Science and Applications (IJACSA), vol. 13, no. 1, Art. no. 1, 2022, doi: 10.14569/IJACSA.2022.0130119.

G. J. DuPaul, P. L. Morgan, G. Farkas, M. M. Hillemeier, and S. Maczuga, “Academic and Social Functioning Associated with Attention-Deficit/Hyperactivity Disorder: Latent Class Analyses of Trajectories from Kindergarten to Fifth Grade,” J Abnorm Child Psychol, vol. 44, no. 7, pp. 1425–1438, Oct. 2016, doi: 10.1007/s10802-016-0126-z.

S. T. Lim, J. Y. Yuan, K. W. Khaw, and X. Chew, “Predicting Travel Insurance Purchases in an Insurance Firm through Machine Learning Methods after COVID-19,” Journal of Informatics and Web Engineering, vol. 2, no. 2, Art. no. 2, Sep. 2023, doi: 10.33093/jiwe.2023.2.2.4.

S. V. Mahadevkar et al., “A Review on Machine Learning Styles in Computer Vision—Techniques and Future Directions,” IEEE Access, vol. 10, pp. 107293–107329, 2022, doi: 10.1109/ACCESS.2022.3209825.

M.-C. Su, C.-T. Cheng, M.-C. Chang, and Y.-Z. Hsieh, “A Video Analytic In-Class Student Concentration Monitoring System,” IEEE Transactions on Consumer Electronics, vol. 67, no. 4, pp. 294–304, Nov. 2021, doi: 10.1109/TCE.2021.3126877.

M. M. A. Parambil, L. Ali, F. Alnajjar, and M. Gochoo, “Smart Classroom: A Deep Learning Approach towards Attention Assessment through Class Behavior Detection,” in 2022 Advances in Science and Engineering Technology International Conferences (ASET), Feb. 2022, pp. 1–6. doi: 10.1109/ASET53988.2022.9735018.

J. Zaletelj and A. Košir, “Predicting students’ attention in the classroom from Kinect facial and body features,” EURASIP Journal on Image and Video Processing, vol. 2017, no. 1, p. 80, Dec. 2017, doi: 10.1186/s13640-017-0228-8.

C. Thomas and D. B. Jayagopi, “Predicting student engagement in classrooms using facial behavioral cues,” in Proceedings of the 1st ACM SIGCHI International Workshop on Multimodal Interaction for Education, in MIE 2017. New York, NY, USA: Association for Computing Machinery, Nov. 2017, pp. 33–40. doi: 10.1145/3139513.3139514.

X. Zhang, C.-W. Wu, P. Fournier-Viger, L.-D. Van, and Y.-C. Tseng, “Analyzing students’ attention in class using wearable devices,” in 2017 IEEE 18th International Symposium on A World of Wireless, Mobile and Multimedia Networks (WoWMoM), Jun. 2017, pp. 1–9. doi: 10.1109/WoWMoM.2017.7974306.

Q. Deng and Z. Wu, “Students’ Attention Assessment in eLearning based on Machine Learning,” IOP Conf. Ser.: Earth Environ. Sci., vol. 199, no. 3, p. 032042, Dec. 2018, doi: 10.1088/1755-1315/199/3/032042.

N. Veliyath, P. De, A. A. Allen, C. B. Hodges, and A. Mitra, “Modeling Students’ Attention in the Classroom using Eyetrackers,” in Proceedings of the 2019 ACM Southeast Conference, in ACM SE ’19. New York, NY, USA: Association for Computing Machinery, Apr. 2019, pp. 2–9. doi: 10.1145/3299815.3314424.

D. Canedo, A. Trifan, and A. J. R. Neves, “Monitoring Students’ Attention in a Classroom Through Computer Vision,” in Highlights of Practical Applications of Agents, Multi-Agent Systems, and Complexity: The PAAMS Collection, J. Bajo, J. M. Corchado, E. M. Navarro Martínez, E. Osaba Icedo, P. Mathieu, P. Hoffa-Dąbrowska, E. Del Val, S. Giroux, A. J. M. Castro, N. Sánchez-Pi, V. Julián, R. A. Silveira, A. Fernández, R. Unland, and R. Fuentes-Fernández, Eds., in Communications in Computer and Information Science, vol. 887. Cham: Springer International Publishing, 2018, pp. 371–378. doi: 10.1007/978-3-319-94779-2_32.

S. Li, Y. Dai, K. Hirota, and Z. Zuo, “A Students’ Concentration Evaluation Algorithm Based on Facial Attitude Recognition via Classroom Surveillance Video,” Journal of Advanced Computational Intelligence and Intelligent Informatics, vol. 24, no. 7, pp. 891–899, 2020, doi: 10.20965/jaciii.2020.p0891.

J. N. Mindoro, N. U. Pilueta, Y. D. Austria, L. Lolong Lacatan, and R. M. Dellosa, “Capturing Students’ Attention Through Visible Behavior: A Prediction Utilizing YOLOv3 Approach,” in 2020 11th IEEE Control and System Graduate Research Colloquium (ICSGRC), Aug. 2020, pp. 328–333. doi: 10.1109/ICSGRC49013.2020.9232659.

U. B. P. Shamika, W. A. C. Weerakoon, P. K. P. G. Panduwawala, and K. A. P. Dilanka, “Student concentration level monitoring system based on deep convolutional neural network,” in 2021 International Research Conference on Smart Computing and Systems Engineering (SCSE), Sep. 2021, pp. 119–123. doi: 10.1109/SCSE53661.2021.9568328.

K. Han et al., “A Survey on Vision Transformer,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 1, pp. 87–110, Jan. 2023, doi: 10.1109/TPAMI.2022.3152247.

X. Yu, J. Wang, Y. Zhao, and Y. Gao, “Mix-ViT: Mixing attentive vision transformer for ultra-fine-grained visual categorization,” Pattern Recognition, vol. 135, p. 109131, Mar. 2023, doi: 10.1016/j.patcog.2022.109131.

W. Sun et al., “Vicinity Vision Transformer,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 10, pp. 12635–12649, Oct. 2023, doi: 10.1109/TPAMI.2023.3285569.

Y. Lou, R. Wu, J. Li, L. Wang, X. Li, and G. Chen, “A Learning Convolutional Neural Network Approach for Network Robustness Prediction,” IEEE Transactions on Cybernetics, vol. 53, no. 7, pp. 4531–4544, Jul. 2023, doi: 10.1109/tcyb.2022.3207878.

L. Alzubaidi et al., “Review of deep learning: concepts, CNN architectures, challenges, applications, future directions,” Journal of Big Data, vol. 8, no. 1, p. 53, Mar. 2021, doi: 10.1186/s40537-021-00444-8.

A. Dosovitskiy et al., “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale.” arXiv, Jun. 03, 2021. doi: 10.48550/arXiv.2010.11929.

J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” arXiv, May 24, 2019. doi: 10.48550/arXiv.1810.04805.

A. Vaswani et al., “Attention Is All You Need.” arXiv, Aug. 01, 2023. doi: 10.48550/arXiv.1706.03762.

B. Ghojogh, A. Ghodsi, F. Karray, and M. Crowley, “Johnson-Lindenstrauss Lemma, Linear and Nonlinear Random Projections, Random Fourier Features, and Random Kitchen Sinks: Tutorial and Survey.” arXiv, Aug. 09, 2021. doi: 10.48550/arXiv.2108.04172.

L. Lam and S. Y. Suen, “Application of majority voting to pattern recognition: an analysis of its behavior and performance,” IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 27, no. 5, pp. 553–568, Sep. 1997, doi: 10.1109/3468.618255.

DOI: http://dx.doi.org/10.18517/ijaseit.13.5.19035



Published by INSIGHT - Indonesian Society for Knowledge and Human Development