ALL libraries (COBIB.SI union bibliographic/catalogue database)
  • *MWELex - MWE Lexica of Croatian, Slovene and Serbian extracted from parsed corpora
    Ljubešić, Nikola, 1979- ; Dobrovoljc, Kaja ; Fišer, Darja, 1978-
    The paper presents *MWELex, a multilingual lexicon of Croatian, Slovene and Serbian multi-word expressions that were extracted from parsed corpora. The lexica were built with the custom-built DepMWEx ... tool, which uses dependency syntactic patterns to identify MWE candidates in parse trees. The extracted MWE candidates are subsequently scored by co-occurrence and organized by headwords, producing a resource of 23 to 48 thousand headwords and 3.2 to 12 million MWE candidates per language. Similarly, precision over specific syntactic patterns varies greatly: 0.167-0.859 for Croatian and 0.158-1.00 for Slovene. The possible extension of the tool is demonstrated on a simplistic distributional-based extraction of non-transparent MWEs and cross-lingual linking of the extracted lexica.
    Material type - article, component part ; adult, non-fiction
    Year - 2015
    Language - English
    COBISS.SI-ID - 59186786