Harmful textual content is pervasive on social media, poisoning online communities and negatively impacting participation. A common approach to this issue is developing detection models that rely on human annotations. However, the tasks required to build such models expose annotators to harmful and offensive content and may require significant time and cost to complete. Generative AI models have the potential to understand and detect harmful textual content. We used ChatGPT to investigate this potential and compared its performance with MTurker annotations for three frequently discussed concepts related to harmful textual content on social media: Hateful, Offensive, and Toxic (HOT). We designed five prompts to interact with ChatGPT and conducted four experiments eliciting HOT classifications. Our results show that ChatGPT can achieve an accuracy of approximately 80% when compared to MTurker annotations. Specifically, the model classifies non-HOT comments more consistently than HOT comments relative to human annotations. Our findings also suggest that ChatGPT classifications align with the provided HOT definitions. However, ChatGPT classifies “hateful” and “offensive” as subsets of “toxic.” Moreover, the choice of prompts used to interact with ChatGPT impacts its performance. Based on these insights, our study provides several meaningful implications for employing ChatGPT to detect HOT content, particularly regarding the reliability and consistency of its performance, its understanding and reasoning about the HOT concepts, and the impact of prompts on its performance. Overall, our study provides guidance on the potential of using generative AI models to moderate large volumes of user-generated textual content on social media.
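To make the prompting setup concrete, below is a minimal sketch of eliciting a HOT classification from ChatGPT through the OpenAI Python client. The prompt wording, the model name, and the answer parsing are illustrative assumptions, not the five prompts actually used in the study.

```python
# Minimal sketch of eliciting a HOT (Hateful/Offensive/Toxic) classification
# from ChatGPT via the OpenAI Python client. The prompt wording, model name,
# and parsing below are illustrative assumptions, not the paper's prompts.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Classify the following social media comment. For each of the three "
    "concepts -- hateful, offensive, toxic -- answer yes or no.\n\n"
    "Comment: {comment}\n"
    "Answer as: hateful=<yes/no>, offensive=<yes/no>, toxic=<yes/no>"
)

def classify_hot(comment: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": PROMPT.format(comment=comment)}],
        temperature=0,  # near-deterministic output helps consistency checks
    )
    text = response.choices[0].message.content.lower()
    return {label: f"{label}=yes" in text for label in ("hateful", "offensive", "toxic")}

print(classify_hot("You are all idiots and should leave this site."))
```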
The analysis and detection of offensive content in textual information have become a great challenge for the Natural Language Processing community. Most of the research conducted so far on offensive language detection has addressed this task as a sole optimization objective. However, other linguistic phenomena that are arguably correlated with offensive language, and could therefore help in recognizing this type of problematic content on the Web, have not yet been explored in depth. Thus, the goal of this study is to investigate whether explicit and implicit concepts involved in the expression of offensive language help in the detection of this phenomenon, and how to incorporate these concepts into a computational system. We propose a multi-task learning approach that includes such concepts according to the relevance indicated by a feature selection method based on mutual information. Our experiments show that phenomena such as constructiveness, target group and person, figurative language (sarcasm and mockery), insults, improper language, and emotions, combined together, help to optimize the offensive language detection task, outperforming a state-of-the-art method (the transformer BETO) that we use as our baseline.
• Addressing offensive language detection for Spanish texts.
• Studying implicit and explicit linguistic phenomena for offensive language.
• Assessing the impact of including phenomena via multi-task learning.
• Performance comparison of multi-task learning models with a well-known Transformer.
• Analyzing the knowledge transfer of the explored phenomena.
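As an illustration of the selection step described above, the following sketch ranks auxiliary annotations by their mutual information with the offensive-language label using scikit-learn; the phenomenon names and the synthetic labels are placeholders, not the study's data.

```python
# Hedged sketch: ranking auxiliary linguistic phenomena by mutual information
# with the offensive-language label, as a proxy for the paper's selection step.
# The phenomenon labels and data below are hypothetical placeholders.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
n = 1000
y = rng.integers(0, 2, n)  # offensive (1) vs. not (0), synthetic labels

# Synthetic auxiliary annotations; "insults" is made to agree with y often,
# "constructive" to disagree often, "sarcasm" to be independent.
aux = {
    "insults":      np.where(rng.random(n) < 0.8, y, 1 - y),
    "sarcasm":      rng.integers(0, 2, n),
    "constructive": np.where(rng.random(n) < 0.6, 1 - y, y),
}
X = np.column_stack(list(aux.values()))
mi = mutual_info_classif(X, y, discrete_features=True, random_state=0)
for name, score in sorted(zip(aux, mi), key=lambda t: -t[1]):
    print(f"{name:13s} MI = {score:.3f}")
```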
Exponential growth in social media has led to the increasing popularity of hate speech and hate-based propaganda. Hate speech, or malicious expression, refers to the use of offensive, violent, or insulting language targeting a specific group of people who share a common attribute, such as gender, ethnicity, race, or beliefs. Online hate diffusion has become a serious problem and has prompted a series of international initiatives aimed at defining the problem and developing effective countermeasures. This study explores the intention behind hate speech posting on social media, especially on Twitter. Both dramaturgical models of social interaction and cultivation theory were used to explain the hate speech culture phenomenon. A qualitative method is proposed for this exploration. Results revealed that most previous studies on hate speech focused on the field of computer science and rarely on the communication field. The paper presents the results of past studies and proposes a new framework. The investigation suggests future directions for the problem and possible solutions; the paper begins with the background of the research, a statement of the problem, and the significance of the study, then presents the research questions and objectives before concluding with its limitations.
Document classification is a long-studied problem that continues to attract research attention. With social media becoming part of daily life, and with its misuse, the importance of text classification has grown. This paper investigates the effect of data augmentation with sentence generation on classification performance on an imbalanced dataset. We propose an LSTM-based sentence generation method, represent texts with Term Frequency-Inverse Document Frequency (TF-IDF) and Word2vec, and apply Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbour (KNN), Multilayer Perceptron (MLP), Extremely Randomized Trees (Extra Trees), Random Forest, eXtreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), and Bagging classifiers. Our experimental results on the imbalanced Offensive Language Identification Dataset (OLID) show that machine learning with sentence generation significantly outperforms learning without the generated sentences.
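A minimal sketch of one arm of this comparison, TF-IDF features with Logistic Regression, is shown below; the toy comments and labels are placeholders, and the LSTM sentence-generation step used for augmentation is omitted.

```python
# Minimal sketch of the TF-IDF + Logistic Regression arm of the comparison.
# The toy comments are placeholders; the LSTM sentence-generation step that
# the paper uses to rebalance classes is omitted here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts  = ["have a nice day", "you are a fool", "great match today", "shut up loser"]
labels = [0, 1, 0, 1]  # 0 = not offensive, 1 = offensive (hypothetical)

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(class_weight="balanced"),  # one common answer to imbalance
)
clf.fit(texts, labels)
print(clf.predict(["what a fool you are"]))
```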
The proliferation of harmful content on online platforms is a major societal problem, which comes in many different forms, including hate speech, offensive language, bullying and harassment, misinformation, spam, violence, graphic content, sexual abuse, self-harm, and many others. Online platforms seek to moderate such content to limit societal harm, to comply with legislation, and to create a more inclusive environment for their users. Researchers have developed different methods for automatically detecting harmful content, often focusing on specific sub-problems or on narrow communities, as what is considered harmful often depends on the platform and on the context. We argue that there is currently a dichotomy between what types of harmful content online platforms seek to curb, and what research efforts there are to automatically detect such content. We thus survey existing methods as well as content moderation policies by online platforms in this light and suggest directions for future work.
In the face of uncontrolled offensive content on social media, automated detection emerges as a critical need. This paper tackles this challenge by proposing a novel approach for identifying offensive language in multilingual, code-mixed, and script-mixed settings. The study presents a novel multilingual hybrid dataset constructed by merging diverse monolingual and bilingual resources. Further, we systematically evaluate the impact of input representations (Word2Vec, Global Vectors for Word Representation (or GloVe), Bidirectional Encoder Representations from Transformers (or BERT), and uniform initialization) and deep learning models (Convolutional Neural Network (or CNN), Bidirectional Long Short-Term Memory (or Bi-LSTM), Bi-LSTM-Attention, and fine-tuned BERT) on detection accuracy. Our comprehensive experiments on a dataset of 42,560 social media comments from five languages (English, Hindi, German, Tamil, and Malayalam) reveal the superiority of fine-tuned BERT. Notably, it achieves a macro average F1-score of 0.79 for monolingual tasks and an impressive 0.86 for code-mixed and script-mixed tasks. These findings significantly advance offensive language detection methodologies and shed light on the complex dynamics of multilingual social media, paving the way for more inclusive and safer online communities.
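As a rough illustration of the strongest configuration, the sketch below scores a code-mixed comment with a multilingual BERT sequence classifier via Hugging Face Transformers; the checkpoint name and the two-label head are stand-ins, since the paper's fine-tuned weights are not assumed to be available.

```python
# Hedged sketch: scoring a code-mixed comment with a multilingual BERT
# classifier. "bert-base-multilingual-cased" and the two-label head are
# stand-ins; without the paper's fine-tuned weights the head is untrained.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

inputs = tokenizer("yeh movie ekdum bakwas hai yaar",  # Hindi-English code-mix
                   return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # untrained head: probabilities are meaningless
```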
Fake news, hate speech, and offensive language are a related evil triplet currently affecting modern societies. Text modality for the computational detection of these phenomena has been widely used. In recent times, multimodal studies in this direction have been attracting a lot of interest because of the potential offered by other modalities in contributing to the detection of these menaces. However, a major problem in multimodal content understanding is how to effectively model the complementarity of the different modalities due to their diverse characteristics and features. From a multimodal point of view, the three tasks have been studied mainly using image and text modalities. Improving the effectiveness of the diverse multimodal approaches is still an open research topic. In addition to the traditional text and image modalities, we consider image–texts, which are rarely used in previous studies but contain useful information for enhancing the effectiveness of a prediction model. In order to ease multimodal content understanding and enhance prediction, we leverage recent advances in computer vision and deep learning for these tasks. First, we unify the modalities by creating a text representation of the images and image–texts, in addition to the main text. Secondly, we propose a multi-layer deep neural network with an inter-modal attention mechanism to model the complementarity among these modalities. We conduct extensive experiments involving three standard datasets covering the three tasks. Experimental results show that detection of fake news, hate speech, and offensive language can benefit from this approach. Furthermore, we conduct robust ablation experiments to show the effectiveness of our approach. Our model predominantly outperforms prior works across the datasets.
• A unified deep learning model can be used for multimodal fake news, hate speech, and offensive language detection.
• Unifying modalities is useful for multimodal content understanding.
• Inter-modal attention mechanism is effective for multimodal-based deep learning models.
• The inter-modal attention deep learning framework is effective for fake news, hate speech, and offensive language detection.
• Incorporation of image–texts as an additional modality improves performance; the model can be tuned to use the desired number of modalities.
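One plausible reading of the inter-modal attention mechanism is sketched below in PyTorch: each modality embedding attends over the stack of all three modalities before pooling and classification. The dimensions and pooling choice are illustrative assumptions, not the paper's exact architecture.

```python
# Hedged sketch of one plausible inter-modal attention layer: each modality's
# embedding attends over the stack of all modalities. This is an illustrative
# reading, not the paper's exact architecture; dimensions are arbitrary.
import torch
import torch.nn as nn

class InterModalAttention(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classify = nn.Linear(dim, 2)  # e.g. hateful vs. not

    def forward(self, text, image, image_text):
        # Stack modality embeddings as a length-3 "sequence": (batch, 3, dim)
        modalities = torch.stack([text, image, image_text], dim=1)
        fused, _ = self.attn(modalities, modalities, modalities)
        return self.classify(fused.mean(dim=1))  # pool over modalities

batch, dim = 8, 256
model = InterModalAttention(dim)
out = model(torch.randn(batch, dim), torch.randn(batch, dim), torch.randn(batch, dim))
print(out.shape)  # torch.Size([8, 2])
```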
• Our research aims to detect different types of Arabic offensive language on Twitter.
• RBF records the highest results among the utilized traditional classifiers.
• Results show that the lowest performance is recorded by KNN.
This research aims to detect different types of Arabic offensive language on Twitter. It uses a multiclass classification system in which each tweet is categorized into one or more of the offensive language types based on the word(s) used. In this study, five types are classified: bullying, insult, racism, obscene, and non-offensive. To classify the abusive language, a cascaded model consisting of Bidirectional Encoder Representations from Transformers (BERT) models (AraBERT, ArabicBERT, XLMRoBERTa, GigaBERT, MBERT, and QARiB), deep learning models (1D-CNN, BiLSTM), and Radial Basis Function (RBF) is presented in this work. In addition, various types of machine learning models are utilized. The dataset is collected from Twitter, with each class containing the same number of tweets (a balanced dataset). Each tweet is assigned to one or more of the selected offensive language types to build multiclass and multilabel systems. In addition, a binary dataset is constructed by assigning the tweets to offensive or non-offensive classes. The highest results are obtained from implementing the cascaded model starting with ArabicBERT followed by BiLSTM and RBF, with an accuracy, precision, recall, and F1-score of 98.4%, 98.2%, 92.8%, and 98.4%, respectively. RBF records the highest results among the utilized traditional classifiers with an accuracy, precision, recall, and F1-score of 60% for each measurement individually, while KNN records the lowest results, obtaining 45%, 46%, 45%, and 43% in terms of accuracy, precision, recall, and F1-score, respectively.
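The cascade idea can be sketched roughly as contextual embeddings feeding a BiLSTM encoder whose pooled states go to an RBF-kernel classifier; the checkpoint name and the use of scikit-learn's SVC as the RBF stage are assumptions for illustration only, and may not match the paper's exact components.

```python
# Hedged sketch of the cascade idea: contextual embeddings -> BiLSTM encoder
# -> RBF-kernel classifier. The checkpoint name and the use of SVC as the
# "RBF" stage are assumptions; the paper's exact stages may differ.
import torch
import torch.nn as nn
from sklearn.svm import SVC
from transformers import AutoModel, AutoTokenizer

name = "asafaya/bert-base-arabic"  # assumed ArabicBERT checkpoint
tok = AutoTokenizer.from_pretrained(name)
bert = AutoModel.from_pretrained(name)
bilstm = nn.LSTM(input_size=768, hidden_size=128, bidirectional=True, batch_first=True)

def encode(texts):
    batch = tok(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        hidden = bert(**batch).last_hidden_state  # (B, T, 768)
        out, _ = bilstm(hidden)                   # (B, T, 256)
    return out.mean(dim=1).numpy()                # mean-pool to (B, 256)

X = encode(["tweet one", "tweet two"])            # placeholder tweets (Arabic in practice)
rbf = SVC(kernel="rbf").fit(X, [0, 1])            # stands in for the RBF stage
print(rbf.predict(encode(["another tweet"])))
```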
The mutual causal interdependence between offensive language use and negative emotions has been largely underexplored in public events. Using 784,179 posts about the Tangshan violence event collected from Sina Weibo, nine themes were recognized based on framing theory. The mutual causal relationship between offensive language and negative emotions under each theme was examined through Convergent Cross Mapping. Results suggested that the mutual causal relationships between offensive language and negative emotion intensity differed across themes: bidirectional causality under moral judgement, emotional venting, and power conflict; unidirectional causality under the vulnerable framework and the trust framework; and no causality under other themes. More detailed examination revealed special bidirectional or unidirectional causality between offensive language and some fine-grained negative emotions under the vulnerable framework, the trust framework, and secondary opinion. This study provides insight into the interaction between offensive language and negative emotions and helps emergency managers develop targeted strategies to address the problems of offensive language use and negative emotions.
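For readers unfamiliar with Convergent Cross Mapping, the following bare-bones NumPy sketch cross-maps from the shadow manifold of one series to another and reports the correlation skill; it illustrates the method's core step only, not the study's full pipeline, and the toy coupled series are synthetic.

```python
# Minimal NumPy sketch of Convergent Cross Mapping (CCM): cross-map from the
# shadow manifold of x to y; high skill suggests y's dynamics are encoded in x
# (i.e. y influences x). Bare-bones illustration, not the study's pipeline.
import numpy as np

def ccm_skill(x, y, E=3, tau=1):
    # Time-delay embedding of x: row t is (x_t, x_{t-tau}, ..., x_{t-(E-1)tau})
    T = len(x) - (E - 1) * tau
    shadow = np.column_stack([x[i * tau : i * tau + T] for i in range(E)][::-1])
    targets = y[(E - 1) * tau :]
    preds = np.empty(T)
    for t in range(T):
        d = np.linalg.norm(shadow - shadow[t], axis=1)
        d[t] = np.inf                      # exclude the point itself
        nn = np.argsort(d)[: E + 1]        # E+1 nearest neighbours
        w = np.exp(-d[nn] / max(d[nn][0], 1e-12))
        preds[t] = np.dot(w / w.sum(), targets[nn])
    return np.corrcoef(preds, targets)[0, 1]

# Toy coupled system: y drives x, so cross-mapping x -> y should score high.
rng = np.random.default_rng(0)
y = np.sin(np.linspace(0, 40, 800)) + 0.05 * rng.standard_normal(800)
x = 0.9 * np.roll(y, 2) + 0.05 * rng.standard_normal(800)
print(f"skill x -> y: {ccm_skill(x, y):.2f}")
```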