Observing the Performance of the TextRank Algorithm on Automatic Text Summarization for Bahasa Indonesia

Dani Gunawan, Deden Witarsyah, Dedy Syamsuar, Amalia Amalia, Abdurrohman, Romi Fadillah Rahmat

Abstract


Research on automatic text summarization is common for English text. According to a previous study, automatic text summarization in Bahasa Indonesia remains challenging because research in this area is still scarce, especially research on the performance of the TextRank algorithm. Accordingly, this research observes the performance of the TextRank algorithm in summarizing text in Bahasa Indonesia. The TextRank algorithm summarizes a text by ranking its essential words and relevant sentences regardless of the source language. The algorithm uses a vertex to represent a word. The similarity measurement process counts the overlapping words (the words shared by two vertices). These overlapping words are represented by an edge that connects the vertices. Thus, the text forms a graph. This research focuses on the similarity measurement process used to determine the relevant sentences in a text. Because the similarity measurement is critical to the summarization result, this research replaces the original process with the Levenshtein Distance algorithm and observes its performance. To evaluate the results, this research uses reference summaries produced by an expert in Bahasa Indonesia linguistics. The evaluation uses ROUGE-1 and ROUGE-2. The results show that the average ROUGE-1 and ROUGE-2 scores for the TextRank algorithm are 0.439 and 0.3186, respectively, while the modified TextRank obtains 0.3999 and 0.2805, respectively. Neither algorithm produced results as satisfactory as expected.
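The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: vertices are taken to be sentences (as in the sentence-extraction variant of TextRank), the original similarity is the log-normalized word overlap from the TextRank paper, the Levenshtein variant works on characters and is turned into a similarity as 1 - distance / longer length, and ROUGE-N is computed as recall. All function names and parameter defaults are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def overlap_similarity(a, b):
    """Shared words divided by the sum of the log sentence lengths."""
    wa, wb = tokenize(a), tokenize(b)
    if not wa or not wb:
        return 0.0
    denom = math.log(len(wa)) + math.log(len(wb))
    return len(set(wa) & set(wb)) / denom if denom > 0 else 0.0


def levenshtein(s, t):
    """Dynamic-programming edit distance between two sequences."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def levenshtein_similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer string."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 - levenshtein(a, b) / longest if longest else 0.0


def textrank(sentences, similarity, damping=0.85, iterations=50):
    """Score sentences with weighted PageRank over the similarity graph."""
    n = len(sentences)
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, similarity, k=2):
    """Return the k top-ranked sentences in their original order."""
    scores = textrank(sentences, similarity)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def rouge_n(candidate, reference, n=1):
    """ROUGE-N recall: overlapping n-grams over n-grams in the reference."""
    def ngrams(text):
        toks = tokenize(text)
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    return sum((cand & ref).values()) / total if total else 0.0
```

Calling `summarize(sentences, overlap_similarity)` and `summarize(sentences, levenshtein_similarity)` on the same list of sentences, then scoring each output against a reference summary with `rouge_n(..., n=1)` and `rouge_n(..., n=2)`, mirrors the kind of comparison the abstract reports. The scores from this sketch are not expected to reproduce the published figures.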

Keywords


Automatic text summarization; TextRank; modified TextRank; TextRank performance; TextRank Bahasa Indonesia


DOI: http://dx.doi.org/10.18517/ijaseit.13.3.14988



Published by INSIGHT - Indonesian Society for Knowledge and Human Development