Online hatred of women in the Incels.me forum
Jaki, Sylvia; De Smedt, Tom; Gwóźdź, Maja ...
Journal of Language Aggression and Conflict, 11/2019, Volume 7, Issue 2
Journal Article
Peer reviewed
This paper presents a study of the (now suspended) online discussion forum Incels.me and its users, involuntary celibates or incels, a virtual community of isolated men without a sexual life, who see women as the cause of their problems and often use the forum for misogynistic hate speech and other forms of incitement. Involuntary celibates have attracted media attention and concern after a killing spree in April 2018 in Toronto, Canada. The aim of this study is to shed light on the group dynamics of the incel community by applying mixed-methods quantitative and qualitative approaches to analyze how the users of the forum create in-group identity and how they construct major out-groups, particularly women. We investigate the vernacular used by incels, apply automatic profiling techniques to determine who they are, discuss the hate speech posted in the forum, and propose a Deep Learning system that is able to detect instances of misogyny, homophobia, and racism with approximately 95% accuracy.
Cryptocurrencies have been the latest technological revolution in the world of finance. Although this revolution is not yet complete and their use as a payment method is still limited, their popularity has vastly increased since 2020 due to speculation about their value. As in any other field, any revolution in economics, technology, education, or society implies a parallel revolution in language. Accordingly, the introduction of cryptocurrencies has led to the emergence of some new forms of language. This quantitative case study aims to analyze the characteristics of that crypto language and identify some of the most common words, acronyms, metaphors, and other popular expressions within this field. To achieve this purpose, a glossary published by the company Bit2Me was used along with the Google search bar, which provided the number of appearances on the net. Results showed that some neologisms had been created, acronyms prevailed over some words and expressions, and the use of animal metaphors was common practice. These results contribute to the field of electronic finance by showing that cryptocurrency users have created their own linguistic rules to communicate among themselves using specific words, as detailed in this paper.
Due to worldwide access to the Internet along with the continuous advances in mobile technologies, physical and digital worlds have become completely blended, and the proliferation of social media platforms has taken a leading role in this evolution. In this paper, we undertake a thorough analysis towards better visualising and understanding the factors that characterise and differentiate social media users affected by mental disorders. We perform different experiments studying multiple dimensions of language, including vocabulary uniqueness, word usage, linguistic style, psychometric attributes, emotions’ co-occurrence patterns, and online behavioural traits, including social engagement and posting trends.
Our findings reveal significant differences in the use of function words, such as adverbs and verb tense, and topic-specific vocabulary, such as biological processes. As for emotional expression, we observe that affected users tend to share emotions more regularly than control individuals on average. Overall, the monthly posting variance of the affected groups is higher than that of the control groups. Moreover, we found evidence suggesting that language use by users who have a mental disorder is less distinguishable on micro-blogging platforms than on other, less restrictive platforms. In particular, we observe less quantifiable differences between affected and control groups on Twitter than on Reddit.
We present the task of identifying the emotions conveyed by the lyrics of Italian opera arias. We shape the task as a multi-class supervised problem, considering the six emotions from Parrot’s tree: love, joy, admiration, anger, sadness, and fear. We manually annotated an opera corpus with 2.5k instances at the verse level and experimented with different classification models and representations to identify the expressed emotions. Our best-performing models consider character 3-gram representations and reach relatively low levels of macro-averaged F1. Such performance reflects the difficulty of the task at hand, partially caused by the size and nature of the corpus: relatively short verses written in 18th-century Italian. Building on what we learned from the verse-level setting, we adopt a higher granularity and increase the size of the corpus. First, we switch from verses to arias in order to have longer and more expressive texts. Second, we construct a new corpus with 40k arias (~90k verses). This new dataset contains silver data, annotated by self-learning on the basis of an ensemble of binary classifiers. We then experiment with more sophisticated representations, by learning an embedding space and using it to train new models for the identification of emotions at the aria level, obtaining a significant performance boost.
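As a rough illustration of two notions named in the abstract above, character 3-gram representations and macro-averaged F1, the following minimal sketch may help. It is not the authors' implementation: the function names `char_ngrams` and `macro_f1` and the toy labels are assumptions made only for this example.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count the overlapping character n-grams in `text` (hypothetical helper)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true positives, false positives, or false negatives scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Toy data, invented for illustration only.
features = char_ngrams("amore")
score = macro_f1(["love", "joy", "love"], ["love", "love", "love"])
```

Because every class contributes equally to the macro average, a class that is never predicted correctly pulls the score down sharply, which is one reason macro-averaged F1 can stay low on small, imbalanced corpora.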
This paper presents the framework and results of the Rest-Mex task at IberLEF 2023, focusing on sentiment analysis and text clustering of tourist texts. The study primarily focuses on texts related to tourist destinations in Mexico, although this edition included data from Cuba and Colombia for the first time. The sentiment analysis task aims to predict the polarity of opinions expressed by tourists, the type of place visited (tourist attraction, hotel, or restaurant), and the country in which it is located. The text clustering task, in turn, aims to classify news articles related to tourism in Mexico. For both tasks, corpora were built using Spanish opinions extracted from TripAdvisor and news articles from Mexican media. This article compares and discusses the results obtained by the participants in both sub-tasks. Additionally, a method is proposed to measure the easiness of a multi-class text classification corpus, along with an approach for system selection in a possible late fusion scheme.
In the prompt-specific holistic score prediction task for Automatic Essay Scoring (AES), the general approaches include pre-trained neural models, coherence models, and hybrid models that incorporate syntactic features into a neural model. In this paper, we propose a novel approach to extract and represent essay coherence features with NSP that matches the state-of-the-art (SOTA) AES coherence model and achieves the best performance for long essays. We apply syntactic feature dense embedding to augment a BERT-based model and achieve the best performance among hybrid methodologies for AES. In addition, we explore various ideas to combine coherence, syntactic information, and semantic embeddings, which no previous study has done. Our combined model also performs better than the available SOTA combined model, even though it does not outperform our syntactic-enhanced neural model. We further compare with the pure neural models and analyze the strengths and weaknesses of our methodologies.
In order to simplify sentences, several rewriting operations can be performed, such as replacing complex words with simpler synonyms, deleting unnecessary information, and splitting long sentences. Despite this multi-operation nature, evaluation of automatic simplification systems relies on metrics that moderately correlate with human judgments on the simplicity achieved by executing specific operations (e.g., simplicity gain based on lexical replacements). In this article, we investigate how well existing metrics can assess sentence-level simplifications where multiple operations may have been applied and which, therefore, require more general simplicity judgments. For that, we first collect a new and more reliable data set for evaluating the correlation of metrics and human judgments of overall simplicity. Second, we conduct the first meta-evaluation of automatic metrics in Text Simplification, using our new data set (and other existing data) to analyze the variation of the correlation between metrics’ scores and human judgments across three dimensions: the perceived simplicity level, the system type, and the set of references used for computation. We show that these three aspects affect the correlations and, in particular, highlight the limitations of commonly used operation-specific metrics. Finally, based on our findings, we propose a set of recommendations for automatic evaluation of multi-operation simplifications, suggesting which metrics to compute and how to interpret their scores.
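To make the idea of correlating metric scores with human judgments concrete, here is a minimal sketch. The abstract does not say which correlation coefficient is used, so Pearson's r is an assumption here, and the function name `pearson` and the sample scores are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists (assumed coefficient)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Invented numbers: one automatic metric score and one human simplicity
# rating per simplified sentence.
metric_scores = [0.31, 0.45, 0.52, 0.70]
human_ratings = [2.0, 2.5, 3.5, 4.0]
r = pearson(metric_scores, human_ratings)
```

In a meta-evaluation of the kind the abstract describes, such a coefficient would be computed separately for each subset of the data (for example per simplicity level or per system type) and then compared across subsets.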
Natural languages come in two different modalities. The impact of modality on grammatical structure and linguistic theory has been discussed at great length in the last 20 years. By contrast, the impact of modality on linguistic data elicitation and collection, corpus studies, and experimental (psycholinguistic) studies is still underinvestigated. In this article, we address specific challenges that arise in judgment data elicitation and experimental studies of sign languages. These challenges are related to the socio-linguistic status of the Deaf community and the larger variability across signers within the same community, to the social status of sign languages, to properties of the visual-gestural modality and its interface with gesture, to methodological aspects of handling sign language data, and to specific linguistic features of sign languages. While some of these challenges also pertain to (some varieties of) spoken languages, other challenges are more modality-specific. The particular combination of challenges discussed in this article seems to be specific to empirical research on sign languages. In addition, we discuss the complementarity of theoretical approaches and experimental studies and show how the interaction of both approaches contributes to a better understanding of sign languages in particular and linguistic structures in general.