Software Traceability in Agile Development Using Topic Modeling

Nuraisa Novia Hidayati, Siti Rochimah, Agus Budi Raharjo


Tracing how requirements are implemented helps build better software: it shows whether the application fulfils users' needs, how far development has progressed, which areas are problematic in the testing process, and how far these extend into the source code. In this paper, the software development method we studied was an agile method, Extreme Programming (XP). The artifacts considered vital in the agile approach include requirement documents, test documents, and source code. We used topic modeling to map the content similarities among those documents and build trace links. We compared three topic modeling methods: Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), and Non-negative Matrix Factorization (NMF). NMF proved the most stable, with an accuracy of 67% for requirements, 59% for testing, and 48% for defect lists in the first application. Results for the second application were more accurate, at 70%, 79%, and 54%. Although LSA outperformed NMF in the second application (LSA achieved an accuracy of 79%, 84%, and 56%), the precision and recall values were almost the same. We successfully found links in the source code based on keywords extracted from each topic. This research also provides guidance on writing requirements in enough detail to simplify tracing, such as using terms consistently, including technical details, and mentioning all the variables involved. In the future, sentence structure and synonyms should be recognized as part of pre-processing to build better trace links.
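The linking and evaluation idea can be illustrated with a minimal sketch. It assumes each artifact has already been reduced to a vector of topic weights by some topic model (LSA, LDA, or NMF), links a requirement to a test document when their cosine similarity passes a threshold, and scores the result with precision and recall. The artifact IDs, the topic weights, the 0.7 threshold, and the function names are all hypothetical; the abstract does not state the exact linking rule the authors used.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def trace_links(sources, targets, threshold=0.7):
    """Link every source/target pair whose similarity reaches the threshold.

    The threshold value is an assumption made for this illustration.
    """
    return {
        (s_id, t_id)
        for s_id, s_vec in sources.items()
        for t_id, t_vec in targets.items()
        if cosine(s_vec, t_vec) >= threshold
    }


def precision_recall(predicted, expected):
    """Precision and recall of predicted links against a manual answer set."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical document-topic weights over three topics (not data from the paper).
requirements = {
    "REQ-1": [0.9, 0.1, 0.0],
    "REQ-2": [0.0, 0.2, 0.8],
}
tests = {
    "TEST-1": [0.8, 0.2, 0.0],
    "TEST-2": [0.1, 0.1, 0.8],
    "TEST-3": [0.5, 0.5, 0.0],
}
# Hypothetical ground truth, as a human analyst might record it.
expected_links = {("REQ-1", "TEST-1"), ("REQ-2", "TEST-2")}

links = trace_links(requirements, tests)
precision, recall = precision_recall(links, expected_links)
print(sorted(links))
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy data, TEST-3 shares enough of REQ-1's dominant topic to pass the threshold even though it is not in the answer set, so recall stays at 1.0 while precision drops to about 0.67. Raising the threshold would trade recall for precision.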


Software traceability; Agile; topic modeling; latent semantic analysis; latent dirichlet allocation; non-negative matrix factorization.








Published by INSIGHT - Indonesian Society for Knowledge and Human Development