Hybrid Machine Translation with Multi-Source Encoder-Decoder Long Short-Term Memory in English-Malay Translation

Yin-Lai Yeong, Tien-Ping Tan, Keng Hoon Gan, Siti Khaotijah Mohammad

Abstract


Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) are the state-of-the-art approaches in machine translation (MT). The translation produced by an SMT system is based on the statistical analysis of text corpora, while NMT uses a deep neural network to model and generate a translation. SMT and NMT have their own strengths and weaknesses. SMT may produce a better translation than NMT when only a small parallel text corpus is available. Nevertheless, when the amount of parallel text available is large, the quality of the translation produced by NMT is often higher than that of SMT. Besides that, studies have also shown that the translation produced by SMT is better than NMT when there is a domain mismatch between training and testing. SMT also has an advantage on long sentences. In addition, when a translation produced by an NMT system is wrong, it is very difficult to locate the error. In this paper, we investigate a hybrid approach that combines SMT and NMT to perform English to Malay translation. The motivation for using a hybrid machine translation is to combine the strengths of both approaches to produce a more accurate translation. Our approach uses a multi-source encoder-decoder long short-term memory (LSTM) architecture. The architecture uses two encoders: one to embed the sentence to be translated, and another to embed the initial translation produced by the SMT. The translation from the SMT can be viewed as a “suggestion translation” for the neural MT. Our experiments show that the hybrid MT increases the BLEU scores of our best baseline machine translation in the computer science domain and the news domain from 21.21 and 48.35 to 35.97 and 61.81, respectively.
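
To make the architecture concrete, below is a minimal sketch in PyTorch (not the authors' released code) of a multi-source encoder-decoder LSTM with two encoders: one embeds the English source sentence, the other embeds the SMT suggestion translation, and the decoder is initialised from the combined encoder states. The vocabulary sizes, dimensions, and the state-combination strategy (concatenation followed by a linear projection) are illustrative assumptions.

# Minimal sketch of a multi-source encoder-decoder LSTM (assumed dimensions
# and combination strategy; not the authors' implementation).
import torch
import torch.nn as nn

class MultiSourceLSTM(nn.Module):
    def __init__(self, src_vocab, smt_vocab, tgt_vocab, emb=256, hid=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.smt_emb = nn.Embedding(smt_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.src_enc = nn.LSTM(emb, hid, batch_first=True)  # encoder 1: source sentence
        self.smt_enc = nn.LSTM(emb, hid, batch_first=True)  # encoder 2: SMT suggestion
        self.combine_h = nn.Linear(2 * hid, hid)             # merge final hidden states
        self.combine_c = nn.Linear(2 * hid, hid)             # merge final cell states
        self.decoder = nn.LSTM(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, tgt_vocab)

    def forward(self, src_ids, smt_ids, tgt_ids):
        # Encode both inputs independently.
        _, (h_src, c_src) = self.src_enc(self.src_emb(src_ids))
        _, (h_smt, c_smt) = self.smt_enc(self.smt_emb(smt_ids))
        # Combine the two final states to initialise the decoder (one assumed strategy).
        h0 = torch.tanh(self.combine_h(torch.cat([h_src, h_smt], dim=-1)))
        c0 = torch.tanh(self.combine_c(torch.cat([c_src, c_smt], dim=-1)))
        # Teacher-forced decoding over the target (Malay) tokens.
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), (h0, c0))
        return self.out(dec_out)  # logits over the target vocabulary

# Toy usage with random token ids.
model = MultiSourceLSTM(src_vocab=100, smt_vocab=100, tgt_vocab=100)
src = torch.randint(0, 100, (2, 7))   # English source sentence
smt = torch.randint(0, 100, (2, 8))   # SMT suggestion translation
tgt = torch.randint(0, 100, (2, 6))   # reference Malay translation
logits = model(src, smt, tgt)          # shape: (2, 6, 100)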


Keywords


Hybrid Machine Translation; Statistical Machine Translation; Neural Machine Translation


DOI: http://dx.doi.org/10.18517/ijaseit.8.4-2.6816



