ALL libraries (COBIB.SI union bibliographic/catalogue database)
  • Morphology in statistical machine translation
    Sepesy Maučec, Mirjam ; Brest, Janez
    In this paper we discuss statistical machine translation from a more inflected language to a less inflected one, using translation from Slovenian to English as an example. ... The focus is on morphological variation in the source language that is not reflected in the target language but increases data sparsity. Morphological variation in the source language is expressed using a lemma-tag representation of words, where the tag contains the morpho-syntactic description of a word. The idea is to keep only the tags relevant for translation; eliminating the rest reduces data sparsity. Determining the set of relevant tags normally requires expert knowledge. We try to avoid this by using a global optimization algorithm, choosing the population-based Differential Evolution algorithm. The experiments were carried out using the freely available parallel English-Slovenian SVEZ-IJS corpus, which is lemmatised and annotated with morpho-syntactic description tags.
    Type of material - conference contribution
    Publish date - 2008
    Language - English
    COBISS.SI-ID - 12474134
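
The tag-selection idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the corpus entries, the five-character positional tags, the toy cost function, and all function names and parameter values are assumptions made for illustration. The paper's objective would presumably be tied to translation quality, whereas this sketch uses vocabulary size plus an arbitrary penalty as a stand-in. The optimizer is a plain DE/rand/1/bin variant over real vectors that are thresholded into a keep/drop mask.

```python
import random

def reduce_token(lemma, tag, mask):
    """Keep tag positions whose mask entry is True; replace the others with '-'."""
    kept = "".join(ch if keep else "-" for ch, keep in zip(tag, mask))
    return f"{lemma}|{kept}"

def vocabulary_size(corpus, mask):
    """Number of distinct lemma-tag tokens after reduction (a data-sparsity proxy)."""
    return len({reduce_token(lemma, tag, mask) for lemma, tag in corpus})

def to_mask(vector):
    """Threshold a real-valued DE vector into a boolean keep/drop mask."""
    return [x > 0.5 for x in vector]

def differential_evolution(cost, dim, pop_size=10, generations=30,
                           f=0.5, cr=0.9, seed=0):
    """Minimise cost(mask) with DE/rand/1/bin; requires pop_size >= 4."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    population[0] = [1.0] * dim  # baseline individual: keep every tag position
    costs = [cost(to_mask(v)) for v in population]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one dimension is mutated
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == forced:
                    value = population[a][d] + f * (population[b][d] - population[c][d])
                    value = min(1.0, max(0.0, value))  # clip to [0, 1]
                else:
                    value = population[i][d]
                trial.append(value)
            trial_cost = cost(to_mask(trial))
            if trial_cost <= costs[i]:  # greedy selection: never accept a worse trial
                population[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return to_mask(population[best]), costs[best]

# Hypothetical toy corpus of (lemma, tag) pairs; the tags are illustrative.
CORPUS = [
    ("hiša", "Ncfsn"), ("hiša", "Ncfsg"), ("hiša", "Ncfsd"),
    ("hiša", "Ncfpn"), ("hiša", "Ncfpg"),
]

def toy_cost(mask):
    """Assumed stand-in objective: small vocabulary, but position 3 must be kept."""
    penalty = 0 if mask[3] else 10
    return vocabulary_size(CORPUS, mask) + penalty

if __name__ == "__main__":
    mask, cost = differential_evolution(toy_cost, dim=5)
    print(mask, cost)
```

Because the initial population contains the keep-everything mask and selection is greedy, the returned cost can never be worse than that baseline. In a real setting the cost function would be replaced by an evaluation of the translation system trained on the reduced corpus.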