(UM)
  • Statistical machine translation from Slovenian to English using reduced morphology
    Sepesy Maučec, Mirjam ; Brest, Janez
    This paper describes a study of word-based statistical machine translation for the Slovenian-English language pair. The problem when dealing with the Slovenian language is data sparsity and consequently, ... error-ridden translations. The aim of the work is to define an approach that reduces the inflectional morphology of Slovenian for translation into a less inflected language. The reduction is performed by a Differential Evolution algorithm, which belongs to the family of evolutionary algorithms and is widely used for global optimization problems. The experiments were carried out using the freely available parallel English-Slovenian SVEZ-IJS corpus, which is lemmatised and annotated with morpho-syntactic description (MSD) tags. A set of baseline experiments is described and compared with experiments done on reduced MSD tags. The paper reports an improvement in translation results over using words, lemmas, and fully morpho-syntactically annotated words.
    Type of material - conference contribution
    Year - 2009
    Language - English
    COBISS.SI-ID - 13398294