Peer reviewed · Open access
  • Deep learning-based approac...
    Kaur, Simrat; Singh, Sarbjeet; Kaushal, Sakshi

    International Journal of Cognitive Computing in Engineering, 2024, Volume 5
    Journal Article

    •An abusive language detection model that performs multi-class classification of offensive language.
    •Experimented with five deep learning models: Bi-LSTM, LSTM, Bi-GRU, GRU, and multi-dense LSTM.
    •The dataset is classified into three levels: offensive language categorization (Level A), offensive language detection (Level B), and offensive language target identification (Level C).
    •The Gated Recurrent Unit (GRU) achieved the highest accuracy for Level A (78.65 %) and Level B (88.59 %). However, for Level C, all models except the Long Short-Term Memory (LSTM) model achieved near-perfect accuracy values of 99.9 %.

    With the rapid growth of social media culture, the use of offensive or hateful language has surged, necessitating effective abusive language detection models for online platforms. This paper develops a multi-class classification model to identify different types of offensive language. The input data, in the form of labeled tweets, is classified into offensive language detection, offensive language categorization, and offensive language target identification. The data undergoes pre-processing, which removes NaN values and punctuation and performs tokenization, followed by the generation of a word cloud to assess data quality. The tf-idf technique is then used for feature selection. For classification, multiple deep learning techniques, namely bidirectional gated recurrent unit, multi-dense long short-term memory, bidirectional long short-term memory, gated recurrent unit, and long short-term memory, are applied; all models except long short-term memory achieved a high accuracy of 99.9 % for offensive language target identification. Bidirectional LSTM and multi-dense LSTM obtained the lowest loss and RMSE values of 0.01 and 0.1, respectively.
This research provides valuable insights and contributes to the development of effective abusive language detection methods to promote a safe and respectful online environment. The insights gained can aid platform administrators in efficiently moderating content and taking appropriate actions against offensive language.
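    The preprocessing and feature-extraction steps the abstract describes (dropping missing entries, stripping punctuation, tokenizing, then weighting terms with tf-idf) can be sketched as follows. This is a minimal stdlib-only illustration with hypothetical tweet data, not the paper's actual pipeline or dataset:

    ```python
    # Hedged sketch of the preprocessing + tf-idf steps described in the abstract.
    # The tweets below are invented examples; the paper's dataset differs.
    import math
    import string
    from collections import Counter

    def preprocess(tweets):
        """Drop missing entries (analogous to NaN removal), strip punctuation,
        lowercase, and tokenize on whitespace."""
        cleaned = []
        for t in tweets:
            if t is None:  # stands in for a NaN value in the raw data
                continue
            t = t.translate(str.maketrans("", "", string.punctuation))
            cleaned.append(t.lower().split())
        return cleaned

    def tf_idf(docs):
        """Plain tf-idf: term frequency scaled by log inverse document frequency."""
        n = len(docs)
        df = Counter()  # number of documents containing each term
        for doc in docs:
            df.update(set(doc))
        scores = []
        for doc in docs:
            tf = Counter(doc)
            total = len(doc)
            scores.append(
                {w: (c / total) * math.log(n / df[w]) for w, c in tf.items()}
            )
        return scores

    tweets = ["You are great!", None, "You are awful, truly awful."]
    docs = preprocess(tweets)
    weights = tf_idf(docs)
    # Terms present in every document (e.g. "you") get idf = log(1) = 0,
    # so only discriminative terms carry weight into the classifier.
    ```

    In the paper these weights feed the recurrent classifiers (GRU, LSTM, and their bidirectional and multi-dense variants); a production pipeline would typically use a library implementation such as scikit-learn's `TfidfVectorizer` rather than hand-rolled scoring.
    
    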