The event-related potential (ERP) method has proven to be a useful tool for studying the effects of gender information in language. Studies have shown that a mismatch between the antecedent and the following referent triggers two ERP components, N400 and P600. In the present study, we investigated how grammatical gender affects the mental representation of the grammatical subject. A match-mismatch paradigm was used to investigate how role nouns in the masculine grammatical gender and in gender-balanced forms (the explicit mention of masculine and feminine forms as word pairs) affect the processing of the referent in Slovenian. The morphological complexity of Slovenian required the use of anaphoric verbs instead of the nouns or pronouns on which previous research was based. The results showed that, following both the gender-balanced and the masculine generic forms, P600 (but not N400) was observed in response to the feminine verb but not to the masculine verb. The P600 amplitude was smaller following the gender-balanced form than following the masculine generic form alone. We conclude that gender-balanced forms are more open to feminine continuations than masculine generic forms. This is the first ERP study in Slovenian to address the effects of processing grammatical gender, and it thus contributes to existing research on languages with grammatical gender. A key strength of the study is that it is one of the first ERP studies to test the mental inclusivity of gender-balanced forms.
The rapid growth of social media, news sites, and blogs has increased the opportunity to express and share opinions on the Internet. Researchers from different fields take advantage of the nearly limitless data. As a result, opinion mining, or sentiment analysis, has become an important research discipline over the past decade. In this paper, we focus on target-level sentiment analysis, where the task is to predict the sentiment concerning specific (multiple) entities that appear as coreference mentions throughout the document. We created a new annotated dataset of Slovene news articles, additionally annotated with the named entities and coreferences that form the basis for the proposed task. Using an entity-document representation, we compared the task with traditional sentiment analysis, evaluating traditional machine learning and deep neural network approaches. Judging by the performance of existing approaches, the proposed task is a challenging problem. The best results were achieved with a customised BERT adapter (a minor improvement over a standard text-classification adapter). We outperformed existing aspect-based state-of-the-art approaches by 13%, reaching up to 77% accuracy and a 73% F1 score.
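One way to picture an entity-document representation is sketched below. This is a minimal illustration and not the authors' implementation: it assumes that each entity's "document" is simply the article sentences containing a coreference mention of that entity, joined in article order. The function name `build_entity_documents`, the `(entity_id, sentence_index)` mention format, and the example sentences are all hypothetical.

```python
from collections import defaultdict


def build_entity_documents(sentences, mentions):
    """Group the sentences that mention each entity into one text per entity.

    sentences: list of sentence strings for one article.
    mentions:  list of (entity_id, sentence_index) pairs, one per
               coreference mention (a hypothetical input format).
    """
    sentence_ids = defaultdict(set)
    for entity_id, sentence_index in mentions:
        sentence_ids[entity_id].add(sentence_index)
    # Keep sentences in article order; a sentence is used once per entity
    # even if it contains several mentions of that entity.
    return {
        entity_id: " ".join(sentences[i] for i in sorted(ids))
        for entity_id, ids in sentence_ids.items()
    }


# Invented example article with two entities.
sentences = [
    "The minister presented the reform.",
    "Critics called it rushed.",
    "She defended the plan in parliament.",
]
mentions = [
    ("minister", 0),
    ("reform", 0),
    ("reform", 1),
    ("minister", 2),
    ("reform", 2),
]
entity_docs = build_entity_documents(sentences, mentions)
```

Under this assumption, each entry of `entity_docs` could be passed to any ordinary text classifier, which is what makes the target-level task directly comparable with document-level sentiment analysis.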
This article discusses dilemmas related to feminatives in Slovenian that users have sent to the Language Consulting Service of the ZRC SAZU Fran Ramovš Institute of the Slovenian Language, and also sheds light on these dilemmas from the perspective of wider societal developments. Most dilemmas concern feminatives that are unfamiliar or not included in dictionaries, but dilemmas also often arise when multiple feminatives are included in dictionaries or are viable in terms of word formation. Although the Language Consulting Service is integrated into the search system of the Fran dictionary portal, the feminatives considered, which were not yet in dictionaries when the corresponding questions were submitted, are still not included at the time of writing. This raises the question of systematic treatment and dictionary presentation, especially for feminatives that are uncommon in usage, which are most frequently the subject of dilemmas in the Language Consulting Service.
The paper analyses dilemmas submitted to the Language Consulting Service of ZRC SAZU at the Fran Ramovš Institute of the Slovenian Language that concern feminatives in Slovenian, from a linguistic and a wider societal perspective. Most often these are dilemmas about feminatives that are not recorded in dictionaries or are rare, or about cases where synonymous terms exist or the word formation is questionable.
The paper presents the results of the Janes project, which aimed to develop language resources and tools for Slovene user-generated content. The paper first describes the 200-million-word Janes corpus, which contains tweets, forum posts, news comments, user and talk pages from Wikipedia, and blogs and blog comments, with each text accompanied by rich metadata. It then presents the processing tools developed for Slovene user-generated content: a tokeniser, a word normaliser, a part-of-speech tagger and lemmatiser, and a named entity recogniser. A set of manually annotated datasets was also produced, both for tool training and for linguistic research. The resources and tools are publicly available under Creative Commons licences in the repository of the CLARIN.SI research infrastructure and on GitHub, and the corpora are also available through the CLARIN.SI concordancers.