Evaluation of Average Term Occurrences Weighting Technique for Arabic Textual Information Retrieval
References
R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, vol. 9. ACM Press, New York, 1999.
E. Amigó, F. Giner, J. Gonzalo, and F. Verdejo, “On the foundations of similarity in information access,” Inf. Retr. J., vol. 23, no. 3, pp. 216–254, 2020, doi: 10.1007/s10791-020-09375-z.
D. Harman, “Information Retrieval: The Early Years,” Found. Trends® Inf. Retr., vol. 13, no. 5, pp. 425–577, 2019.
G. Domeniconi, G. Moro, R. Pasolini, and C. Sartori, “A study on term weighting for text categorization: A novel supervised variant of tf.idf,” DATA 2015 - 4th Int. Conf. Data Manag. Technol. Appl. Proc., pp. 26–37, 2015, doi: 10.5220/0005511900260037.
Z. H. Deng, K. H. Luo, and H. L. Yu, “A study of supervised term weighting scheme for sentiment analysis,” Expert Syst. Appl., vol. 41, no. 7, pp. 3506–3513, 2014, doi: 10.1016/j.eswa.2013.10.056.
D. Jones et al., “Improving engineering information retrieval by combining TD-IDF and product structure classification,” Proc. Int. Conf. Eng. Des. ICED, vol. 6, no. DS87-6, pp. 41–50, 2017.
S. Robertson, “Understanding inverse document frequency: On theoretical arguments for IDF,” J. Doc., vol. 60, no. 5, pp. 503–520, 2004, doi: 10.1108/00220410410560582.
I. A. & F. A. Belal Abuata, “Improving arabic question answering system by merging aner technique, updated question classification technique and stop words technique,” J. Theor. Appl. Inf. Technol., vol. 98, no. 23, pp. 24–38, 2020.
K. Chen, Z. Zhang, J. Long, and H. Zhang, “Turning from TF-IDF to TF-IGM for term weighting in text classification,” Expert Syst. Appl., vol. 66, pp. 1339–1351, 2016, doi: 10.1016/j.eswa.2016.09.009.
A. El Mahdaouy, S. O. El Alaoui, and E. Gaussier, “Semantically enhanced term frequency based on word embeddings for Arabic information retrieval,” Colloq. Inf. Sci. Technol. Cist, vol. 0, pp. 385–389, 2016, doi: 10.1109/CIST.2016.7805076.
O. A. S. Ibrahim and D. Landa-Silva, “Term frequency with average term occurrences for textual information retrieval,” Soft Comput., vol. 20, no. 8, pp. 3045–3061, 2016, doi: 10.1007/s00500-015-1935-7.
R. Bentrcia, S. Zidat, and F. Marir, “Extracting semantic relations from the Quranic Arabic based on Arabic conjunctive patterns,” J. King Saud Univ. - Comput. Inf. Sci., vol. 30, no. 3, pp. 382–390, 2018, doi: 10.1016/j.jksuci.2017.09.004.
B. Abuata and A. Al-Omari, “A rule-based stemmer for Arabic Gulf dialect,” J. King Saud Univ. - Comput. Inf. Sci., vol. 27, no. 2, pp. 104–112, 2015, doi: 10.1016/j.jksuci.2014.04.003.
A. El Mahdaouy, É. Gaussier, and S. O. El Alaoui, “Exploring term proximity statistic for Arabic information retrieval,” Colloq. Inf. Sci. Technol. Cist, vol. 2015-Janua, no. January, pp. 272–277, 2015, doi: 10.1109/CIST.2014.7016631.
A. A. A. A. Abdulla, H. Lin, B. Xu, and S. K. Banbhrani, “Improving biomedical information retrieval by linear combinations of different query expansion techniques,” BMC Bioinformatics, vol. 17, no. 2, 2016, doi: 10.1186/s12859-016-1092-8.
A. Aizawa, “An information-theoretic perspective of tf-idf measures,” Inf. Process. Manag., vol. 39, no. 1, pp. 45–65, 2003, doi: 10.1016/S0306-4573(02)00021-3.
R. Jin, C. Falusos, and A. G. Hauptmann, “Meta-scoring: Automatically evaluating term weighting schemes in IR without precision-recall,” SIGIR Forum (ACM Spec. Interes. Gr. Inf. Retrieval), pp. 83–89, 2001.
G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Inf. Process. Manag., vol. 24, no. 5, pp. 513–523, 1988.
Z. S. Zubi, “Using some web content mining techniques for Arabic text classification,” Proc. 8th WSEAS Int. Conf. Data Networks, Commun. Comput. DNCOCO ’09, pp. 73–84, 2009.
M. Habib, “An intelligent system for automated arabic text categorization,” 2008.
S. E. Robertson, S. Walker, and M. M. Hancock-Beaulieu, “Large test collection experiments on an operational, interactive system: Okapi at TREC,” Inf. Process. Manag., vol. 31, no. 3, pp. 345–360, 1995, doi: 10.1016/0306-4573(94)00051-4.
S. Jimenez, S. P. Cucerzan, F. A. Gonzalez, A. Gelbukh, and G. Dueñas, “BM25-CTF: Improving TF and IDF factors in BM25 by using collection term frequencies,” J. Intell. Fuzzy Syst., vol. 34, no. 5, pp. 2887–2899, 2018, doi: 10.3233/JIFS-169475.
G. Pandey, Z. Ren, S. Wang, J. Veijalainen, and M. de Rijke, “Linear feature extraction for ranking,” Inf. Retr. J., vol. 21, no. 6, pp. 481–506, 2018, doi: 10.1007/s10791-018-9330-5.
G. A. Tinega, P. W. Mwangi, and D. R. Rimiru, “Text Mining in Digital Libraries using OKAPI BM25 Model,” Int. J. Comput. Appl. Technol. Res., vol. 7, no. 10, pp. 398–406, 2018, doi: 10.7753/ijcatr0710.1003.
A. Lipani, T. Roelleke, M. Lupu, and A. Hanbury, A systematic approach to normalization in probabilistic models, vol. 21, no. 6. Springer Netherlands, 2018.
M. Saad and W. Ashour, “OSAC: Open Source Arabic Corpora,” 6th Int. Conf. Electr. Comput. Syst. (EECS’10), Nov 25-26, 2010, Lefke, Cyprus, pp. 118–123, 2010.
N. Ferro, “Reproducibility Challenges in Information Retrieval Evaluation,” J. Data Inf. Qual., vol. 8, no. 2, pp. 1–4, 2017.
DOI: http://dx.doi.org/10.18517/ijaseit.12.6.13215
Published by INSIGHT - Indonesian Society for Knowledge and Human Development