Dynamic Sign Language Recognition Using Mediapipe Library and Modified LSTM Method

Ridwang, Amil Ahmad Ilham, Ingrid Nurtanio, Syafaruddin

Abstract


Hand gesture recognition (HGR) is a primary mode of communication and human interaction. Beyond enhancing user interaction in human-computer interaction (HCI), HGR can also help overcome language barriers: for example, it can be used to recognize sign language, a visual language expressed through hand movements, poses, and facial expressions that serves as the basic communication mode for deaf people around the world. This research aims to create a new method for detecting dynamic hand movements, poses, and facial expressions in sign language translation systems. A modified Long Short-Term Memory (LSTM) approach and the MediaPipe library are used to recognize dynamic hand movements. In this study, twenty dynamic, context-matched movements were designed to address the challenge of identifying dynamic sign movements. Sequence and image data are collected using MediaPipe Holistic, processed, and trained with the modified LSTM method. The model is trained on training and validation data and evaluated on a separate test set. Training evaluation with a confusion matrix achieved an average accuracy of 99.4% over the twenty trained words at epoch 150. Per-word experiments showed a detection accuracy of 85%, while sentence-level experiments reached 80%. This work is a significant step toward improving the accuracy and practicality of dynamic sign language recognition systems, promising better communication and accessibility for deaf people.
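The abstract describes collecting per-frame MediaPipe Holistic landmarks (pose, face, and both hands) and feeding them as sequences to an LSTM classifier. The sketch below illustrates the common way such landmarks are flattened into fixed-length frame vectors and stacked into a sequence window; the function names, the sequence length of 30 frames, and the zero-filling of undetected body parts are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Feature sizes follow the MediaPipe Holistic output: 33 pose landmarks
# (x, y, z, visibility), 468 face landmarks and 21 landmarks per hand
# (x, y, z each). Together: 132 + 1404 + 63 + 63 = 1662 values per frame.
POSE_DIM = 33 * 4
FACE_DIM = 468 * 3
HAND_DIM = 21 * 3

def flatten_landmarks(pose, face, left_hand, right_hand):
    """Concatenate landmark arrays into one 1662-value frame vector.

    Each argument is a NumPy array of landmark coordinates, or None when
    the detector did not find that body part; missing parts are
    zero-filled so every frame vector has the same length.
    """
    parts = [
        pose.flatten() if pose is not None else np.zeros(POSE_DIM),
        face.flatten() if face is not None else np.zeros(FACE_DIM),
        left_hand.flatten() if left_hand is not None else np.zeros(HAND_DIM),
        right_hand.flatten() if right_hand is not None else np.zeros(HAND_DIM),
    ]
    return np.concatenate(parts)

def build_sequence(frames, seq_len=30):
    """Stack the most recent seq_len frame vectors into one LSTM input."""
    window = frames[-seq_len:]
    return np.stack(window)  # shape: (seq_len, 1662)

# Even a frame with no detections yields the full-length vector, so the
# LSTM always receives a fixed-shape (seq_len, 1662) input.
frame = flatten_landmarks(None, None, None, None)
sequence = build_sequence([frame] * 30)
```

Arrays of this shape would then be batched and passed to an LSTM-based classifier (e.g., a Keras model whose first layer is `LSTM(units, input_shape=(30, 1662))`) with one softmax output per trained word.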

Keywords


Deaf people; modified LSTM; static signs; word signs; sentence signs

References


B. Sundar and T. Bagyammal, "American Sign Language Recognition for Alphabets Using MediaPipe and LSTM," Procedia Comput Sci, vol. 215, pp. 642–651, 2022, doi: 10.1016/j.procs.2022.12.066.

Y. Shi, Y. Li, X. Fu, K. Miao, and Q. Miao, "Review of dynamic gesture recognition," Virtual Reality and Intelligent Hardware, vol. 3, no. 3. KeAi Communications Co., pp. 183–206, Jun. 01, 2021. doi: 10.1016/j.vrih.2021.05.001.

D. K. Jain, A. Kumar, and S. R. Sangwan, "TANA: The amalgam neural architecture for sarcasm detection in indian indigenous language combining LSTM and SVM with word-emoji embeddings," Pattern Recognit Lett, vol. 160, pp. 11–18, Aug. 2022, doi: 10.1016/J.PATREC.2022.05.026.

Y. S. Tan, K. M. Lim, and C. P. Lee, "Hand gesture recognition via enhanced densely connected convolutional neural network," Expert Syst Appl, vol. 175, Aug. 2021, doi: 10.1016/j.eswa.2021.114797.

P. K. Athira, C. J. Sruthi, and A. Lijiya, "A Signer Independent Sign Language Recognition with Co-articulation Elimination from Live Videos: An Indian Scenario," Journal of King Saud University - Computer and Information Sciences, vol. 34, no. 3, pp. 771–781, Mar. 2022, doi: 10.1016/j.jksuci.2019.05.002.

R. O. Maimon-Mor et al., "Talking with Your (Artificial) Hands: Communicative Hand Gestures as an Implicit Measure of Embodiment," iScience, vol. 23, no. 11, Nov. 2020, doi: 10.1016/j.isci.2020.101650.

V. Adithya and R. Rajesh, "Hand gestures for emergency situations: A video dataset based on words from Indian sign language," Data Brief, vol. 31, Aug. 2020, doi: 10.1016/j.dib.2020.106016.

A. P. G and A. P. k, "Design of an integrated learning approach to assist real-time deaf application using voice recognition system," Computers and Electrical Engineering, vol. 102, p. 108145, Sep. 2022, doi: 10.1016/J.COMPELECENG.2022.108145.

C. Hinchcliffe et al., "Language comprehension in the social brain: Electrophysiological brain signals of social presence effects during syntactic and semantic sentence processing," Cortex, vol. 130, pp. 413–425, Sep. 2020, doi: 10.1016/J.CORTEX.2020.03.029.

M. Suneetha, P. MVD, and K. PVV, "Multi-view motion modelled deep attention networks (M2DA-Net) for video based sign language recognition," J Vis Commun Image Represent, vol. 78, p. 103161, Jul. 2021, doi: 10.1016/J.JVCIR.2021.103161.

K. Sadeddine, Z. F. Chelali, R. Djeradi, A. Djeradi, and S. Ben Abderrahmane, "Recognition of user-dependent and independent static hand gestures: Application to sign language," J Vis Commun Image Represent, vol. 79, p. 103193, Aug. 2021, doi: 10.1016/J.JVCIR.2021.103193.

L. R. Cerna, E. E. Cardenas, D. G. Miranda, D. Menotti, and G. Camara-Chavez, "A multimodal LIBRAS-UFOP Brazilian sign language dataset of minimal pairs using a Microsoft Kinect sensor," Expert Syst Appl, vol. 167, p. 114179, Apr. 2021, doi: 10.1016/J.ESWA.2020.114179.

R. Solgi, H. A. Loáiciga, and M. Kram, "Long short-term memory neural network (LSTM-NN) for aquifer level time series forecasting using in-situ piezometric observations," J Hydrol (Amst), vol. 601, Oct. 2021, doi: 10.1016/j.jhydrol.2021.126800.

L. Gao, H. Li, Z. Liu, Z. Liu, L. Wan, and W. Feng, "RNN-Transducer based Chinese Sign Language Recognition," Neurocomputing, vol. 434, pp. 45–54, Apr. 2021, doi: 10.1016/J.NEUCOM.2020.12.006.

S. Subburaj and S. Murugavalli, "Survey on sign language recognition in context of vision-based and deep learning," Measurement: Sensors, vol. 23, p. 100385, Oct. 2022, doi: 10.1016/J.MEASEN.2022.100385.

K. Anand, S. Urolagin, and R. K. Mishra, "How does hand gestures in videos impact social media engagement - Insights based on deep learning," International Journal of Information Management Data Insights, vol. 1, no. 2, Nov. 2021, doi: 10.1016/j.jjimei.2021.100036.

G. A. Rao and P. V. V. Kishore, "Selfie video based continuous Indian sign language recognition system," Ain Shams Engineering Journal, vol. 9, no. 4, pp. 1929–1939, Dec. 2019, doi: 10.1016/j.asej.2016.10.013.

S. K. Devi and S. CN, "Intelligent Deep Learning Empowered Text Detection Model from Natural Scene Images," Int J Adv Sci Eng Inf Technol, vol. 12, no. 3, pp. 1263–1268, 2022, Accessed: Nov. 07, 2023. [Online]. Available: http://dx.doi.org/10.18517/ijaseit.12.3.15771

R. Sreemathy, M. P. Turuk, S. Chaudhary, K. Lavate, A. Ushire, and S. Khurana, "Continuous word level sign language recognition using an expert system based on machine learning," International Journal of Cognitive Computing in Engineering, vol. 4, pp. 170–178, Jun. 2023, doi: 10.1016/J.IJCCE.2023.04.002.

M. A. Almasre and H. Al-Nuaim, "A comparison of Arabic sign language dynamic gesture recognition models," Heliyon, vol. 6, no. 3, p. e03554, Mar. 2020, doi: 10.1016/J.HELIYON.2020.E03554.

R. Gupta and A. Kumar, "Indian sign language recognition using wearable sensors and multi-label classification," Computers & Electrical Engineering, vol. 90, p. 106898, Mar. 2021, doi: 10.1016/J.COMPELECENG.2020.106898.

J. Bora, S. Dehingia, A. Boruah, A. A. Chetia, and D. Gogoi, "Real-time Assamese Sign Language Recognition using MediaPipe and Deep Learning," Procedia Comput Sci, vol. 218, pp. 1384–1393, Jan. 2023, doi: 10.1016/J.PROCS.2023.01.117.

I. A. Putra, O. D. Nurhayati, and D. Eridani, "Human Action Recognition (HAR) Classification Using MediaPipe and Long Short-Term Memory (LSTM)," TEKNIK, vol. 43, no. 2, pp. 190–201, Aug. 2022, doi: 10.14710/teknik.v43i2.46439.

B. Subramanian, B. Olimov, S. M. Naik, S. Kim, K. H. Park, and J. Kim, "An integrated mediapipe-optimized GRU model for Indian sign language recognition," Sci Rep, vol. 12, no. 1, Dec. 2022, doi: 10.1038/s41598-022-15998-7.

R. Ridwang, I. Nurtanio, A. Ahmad Ilham, and S. Syafaruddin, "Deaf Sign Language Translation System with Pose and Hand Gesture Detection Under LSTM-Sequence Classification Model," ICIC Express Letters, vol. 17, no. 7, pp. 809–816, 2023, doi: 10.24507/icicel.17.07.809.

Q. Xiao, X. Chang, X. Zhang, and X. Liu, "Multi-Information Spatial-Temporal LSTM Fusion Continuous Sign Language Neural Machine Translation," IEEE Access, vol. 8, pp. 216718–216728, 2020, doi: 10.1109/ACCESS.2020.3039539.

B. Verma, "A two stream convolutional neural network with bi-directional GRU model to classify dynamic hand gesture," J Vis Commun Image Represent, vol. 87, p. 103554, Aug. 2022, doi: 10.1016/J.JVCIR.2022.103554.

A. S. Agrawal, A. Chakraborty, and C. M. Rajalakshmi, "Real-Time Hand Gesture Recognition System Using MediaPipe and LSTM," International Journal of Research Publication and Reviews, vol. 3, pp. 2509–2515, 2022, [Online]. Available: www.ijrpr.com




DOI: http://dx.doi.org/10.18517/ijaseit.13.6.19401



Published by INSIGHT - Indonesian Society for Knowledge and Human Development