  • Incorporating Statistical Machine Translation Word Knowledge Into Neural Machine Translation
    Wang, Xing; Tu, Zhaopeng; Zhang, Min

    IEEE/ACM transactions on audio, speech, and language processing, 12/2018, Volume: 26, Issue: 12
    Journal Article

    Neural machine translation (NMT) has gained increasing attention in recent years, mainly due to its simplicity yet state-of-the-art performance. However, previous research has shown that NMT suffers from several word-level limitations: a lack of source coverage guidance, poor translation of rare words, and a limited vocabulary, while statistical machine translation (SMT) has complementary properties that correspond well to these limitations. It is therefore natural to improve translation performance by combining the advantages of the two kinds of models. This paper proposes a general framework for incorporating SMT word knowledge into NMT to alleviate these word-level limitations. In our framework, the NMT decoder makes more accurate word predictions by referring to SMT word recommendations in both the training and testing phases. Specifically, the SMT model offers informative word recommendations based on the NMT decoding information. Then, we use the SMT word predictions as prior knowledge to adjust the NMT word generation probability, utilizing a neural-network-based classifier to digest the discrete word knowledge. In this paper, we use two model variants to implement the framework, one with a gating mechanism and the other with a direct competition mechanism. Experimental results on Chinese-to-English and English-to-German translation tasks show that the proposed framework can take advantage of the SMT word knowledge and consistently achieve significant improvements over NMT and SMT baseline systems.
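
    To make the gating variant concrete, the following is a minimal sketch of how SMT word recommendations might be folded into the NMT output distribution. It assumes a PyTorch decoder and illustrative names (SMTGatedOutput, smt_word_ids, smt_scores, hidden/vocab sizes); it is not the authors' implementation, only one plausible reading of the abstract, where a neural classifier scores the discrete SMT recommendations and a learned gate mixes the two distributions.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SMTGatedOutput(nn.Module):
        """Sketch of the gating variant: interpolate the NMT word distribution
        with a distribution built from SMT word recommendations, using a gate
        predicted from the decoder state. All names/sizes are assumptions."""

        def __init__(self, hidden_size: int, vocab_size: int):
            super().__init__()
            self.nmt_proj = nn.Linear(hidden_size, vocab_size)  # standard NMT output layer
            self.smt_scorer = nn.Linear(hidden_size + 1, 1)     # classifier over SMT recommendations (hypothetical)
            self.gate = nn.Linear(hidden_size, 1)               # produces the mixing weight

        def forward(self, decoder_state, smt_word_ids, smt_scores):
            # decoder_state: (batch, hidden)  current NMT decoder hidden state
            # smt_word_ids:  (batch, k)       word ids recommended by the SMT model
            # smt_scores:    (batch, k)       SMT translation scores for those words
            batch, k = smt_word_ids.shape
            vocab_size = self.nmt_proj.out_features

            # 1) Ordinary NMT word distribution.
            p_nmt = F.softmax(self.nmt_proj(decoder_state), dim=-1)           # (batch, V)

            # 2) Neural classifier "digests" the discrete SMT recommendations:
            #    score each recommended word from the decoder state and its SMT score.
            state_rep = decoder_state.unsqueeze(1).expand(batch, k, -1)        # (batch, k, H)
            feats = torch.cat([state_rep, smt_scores.unsqueeze(-1)], dim=-1)   # (batch, k, H+1)
            rec_probs = F.softmax(self.smt_scorer(feats).squeeze(-1), dim=-1)  # (batch, k)

            # Scatter the k recommendation probabilities into a full-vocabulary distribution.
            p_smt = torch.zeros(batch, vocab_size, device=decoder_state.device)
            p_smt.scatter_add_(1, smt_word_ids, rec_probs)                     # (batch, V)

            # 3) Gating mechanism: learned convex combination of the two distributions.
            g = torch.sigmoid(self.gate(decoder_state))                        # (batch, 1)
            return (1.0 - g) * p_nmt + g * p_smt
    ```

    Under the same assumptions, the direct competition variant mentioned in the abstract would drop the learned gate and instead let the NMT logits and the scattered SMT recommendation scores compete within a single softmax over the vocabulary.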