Doc2Vec based Question and Answer Search System

HeeSeok Cho, Yong Kim


Interaction in e-learning has positive effects: it improves learner commitment and learning outcomes and reduces the dropout rate. A key interaction channel is the question-and-answer bulletin board, where a learner posts a question about content that is difficult to understand and an instructor responds. Because the instructor answers each question directly, real-time feedback is difficult and instructor fatigue increases. This study aims to reduce answering time and answering cost by developing a question-and-answer search system that automatically retrieves and presents answers to questions that learners post during learning. To this end, this study designed and implemented a system that analyzes questions and answers with Doc2Vec, an embedding technique from natural language processing, and returns the answers to the most similar existing questions. Applying the results of this study to a question-and-answer system is expected to enhance the learning effect by giving learners an immediate answer. In addition, organizations that pay answering fees from the national budget, such as the Korea Educational Broadcasting Corporation, will be able to reduce that spending and direct more investment toward areas such as content quality.
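The retrieval step described above can be illustrated with a minimal sketch. It assumes each archived question has already been mapped to a fixed-length vector (in the study this role is played by Doc2Vec; here the three-dimensional vectors, the sample questions, and the function names are hypothetical placeholders), and it ranks archived questions by cosine similarity, which is a common choice but not necessarily the measure the authors used.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_answer(query_vec, archive):
    """Return (similarity, question, answer) for the archived question closest to query_vec."""
    best = None
    for question, answer, vec in archive:
        score = cosine_similarity(query_vec, vec)
        if best is None or score > best[0]:
            best = (score, question, answer)
    return best


# Hypothetical archive of (question, answer, embedding).
# The embeddings are made-up placeholders, not output of a trained model.
ARCHIVE = [
    ("How do I reset my password?", "Use the account settings page.", [0.9, 0.1, 0.0]),
    ("When is the assignment due?", "See the course schedule.", [0.1, 0.8, 0.3]),
    ("Why does the video not play?", "Try another browser.", [0.0, 0.2, 0.9]),
]

# Placeholder embedding of a newly posted learner question.
new_question_vec = [0.2, 0.7, 0.4]

score, question, answer = most_similar_answer(new_question_vec, ARCHIVE)
print(question, "->", answer)
```

In a real deployment the placeholder vectors would be replaced by embeddings inferred from tokenized question text, and a similarity threshold could decide whether to show the retrieved answer or forward the question to an instructor.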


eLearning; LMS; word2vec; doc2vec; question and answer search.


Published by INSIGHT - Indonesian Society for Knowledge and Human Development