ALL libraries (COBIB.SI union bibliographic/catalogue database)
  • The siParl corpus of Slovenian parliamentary proceedings [Electronic resource]
    Pančur, Andrej ; Erjavec, Tomaž, 1960-
    The paper describes the process of acquisition, up-translation, encoding, annotation, and distribution of siParl, a collection of the parliamentary debates from the Assembly of the Republic of ... Slovenia 1990/2018, covering the period from just before Slovenia became an independent country in 1991, and almost up to the present. The entire corpus, comprising over 8 thousand sessions, 1 million speeches and 200 million words, was uniformly encoded in accordance with the TEI-based Parla-CLARIN schema for encoding corpora of parliamentary debates, and contains extensive metadata about the speakers, a typology of sessions, etc., as well as structural and editorial annotations. The corpus was also linguistically annotated using state-of-the-art tools. siParl is open source and maintained on GitHub with its major versions archived in the CLARIN.SI repository. It is also available for linguistic and content analysis through the online CLARIN.SI concordancers, thus offering an invaluable resource for scholars studying Slovenian political history.
    Type of material - conference contribution
    Year - 2020
    Language - English
    COBISS.SI-ID - 39765507