Peer reviewed
  • Low resource machine transl...
    Singh, Salam Michael; Singh, Thoudam Doren

    Expert systems with applications, 12/2022, Volume: 209
    Journal Article

    The language barrier is one of the practical challenges humans face during communication. To overcome it, researchers are focusing on using machines to translate a source language into a target language using the textual representations of the languages. As a result, machine translation (MT) has achieved near human-level translation quality for several resource-rich languages. However, machine translation performance is still far from production-level quality for low resource languages. This work reports a semi-supervised neural machine translation system that boosts translation quality for an extremely resource-constrained language pair, i.e. English–Manipuri. The proposed approach combines self-training and back-translation. The quantitative evaluation shows that system performance improves by +0.9 BLEU after introducing external noise to the input data. Additionally, a multi-reference test dataset developed in-house is used to evaluate the linguistic diversity of the highly agglutinative and morphologically rich Manipuri language. Experimental results attest that the proposed semi-supervised system outperforms the supervised, the pretrained mBART, and existing semi-supervised baselines in terms of automatic scores and subjective evaluation parameters by a significant margin, with improvements of up to +4.5 and +1.2 BLEU over the supervised and mBART baselines, respectively.

    • Back-translation and forward translation improve low resource machine translation.
    • External perturbations to the noisy synthetic data help the model converge.
    • Linguistic variations are tackled via the inclusion of multiple test references.
    • The proposed method is competitive with pre-trained models.
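
The combination of self-training (forward translation), back-translation, and input noise described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the noise function or the training loop, so the `Translator` interface, the `add_noise` and `build_synthetic_pairs` helpers, word dropout plus a light local shuffle as the perturbation, and applying that noise to the source side of the synthetic pairs are all assumptions made here for illustration.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical interface: any function mapping one sentence to its translation.
Translator = Callable[[str], str]


def add_noise(sentence: str, rng: random.Random,
              drop_prob: float = 0.1, max_shift: float = 2.0) -> str:
    """Perturb a sentence with word dropout and a light local shuffle.

    Assumed noise scheme; the abstract only says that external noise is added.
    """
    tokens = sentence.split()
    # Word dropout: each token is removed with probability drop_prob.
    kept = [t for t in tokens if rng.random() >= drop_prob]
    if not kept:
        kept = tokens[:1]  # keep at least one token when the input is non-empty
    # Local shuffle: sort by index plus a small random offset, so tokens
    # stay close to their original positions.
    keys = [i + rng.uniform(0.0, max_shift) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=lambda i: keys[i])
    return " ".join(kept[i] for i in order)


def build_synthetic_pairs(mono_src: List[str], mono_tgt: List[str],
                          forward: Translator, backward: Translator,
                          rng: random.Random,
                          drop_prob: float = 0.1,
                          max_shift: float = 2.0) -> List[Tuple[str, str]]:
    """Create (noisy source, target) pairs from monolingual data."""
    pairs: List[Tuple[str, str]] = []
    # Self-training: translate source-side monolingual text forward and
    # use the model output as a synthetic target.
    for src in mono_src:
        pairs.append((add_noise(src, rng, drop_prob, max_shift), forward(src)))
    # Back-translation: translate target-side monolingual text backward and
    # use the model output as a synthetic source for the genuine target.
    for tgt in mono_tgt:
        synthetic_src = backward(tgt)
        pairs.append((add_noise(synthetic_src, rng, drop_prob, max_shift), tgt))
    return pairs
```

In a semi-supervised setup of this kind, the synthetic pairs would typically be mixed with the genuine parallel corpus before retraining the translation model. How the paper weights or iterates that step is not stated in the abstract.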