Peer-reviewed, Open access
  • Towards Accurate Detection ...
    Alakrot, Azalden; Murray, Liam; Nikolov, Nikola S.

    Procedia Computer Science, 2018, Volume: 142
    Journal Article

    We present the results of predictive modelling for the detection of anti-social behaviour in online communication in Arabic, such as comments which contain obscene or offensive words and phrases. We collected and labelled a large dataset of YouTube comments in Arabic which contains a broad range of both offensive and inoffensive comments. We used this dataset to train a Support Vector Machine classifier and experimented with combinations of word-level features, N-gram features and a variety of pre-processing techniques. We summarise the pre-processing steps and features that allow training a classifier which is more precise, with 90.05% accuracy, than classifiers reported by previous studies on Arabic text.
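
    The abstract mentions word-level and N-gram features without listing the exact pre-processing steps, so the following is only a minimal sketch of how such features could be computed, not the authors' pipeline. The normalisation shown (stripping Arabic diacritics and the tatweel character) is an assumption chosen for illustration, and the names `normalise`, `tokenise`, `ngrams` and `extract_features` are hypothetical.

```python
import re
from collections import Counter

# ASSUMPTION: the source does not name its pre-processing steps. Stripping
# Arabic diacritics (U+064B-U+0652) and the tatweel (U+0640) is used here
# only as an illustrative normalisation.
_DIACRITICS_AND_TATWEEL = re.compile(r"[\u064B-\u0652\u0640]")


def normalise(text):
    """Remove diacritics and tatweel so spelling variants map to one form."""
    return _DIACRITICS_AND_TATWEEL.sub("", text)


def tokenise(text):
    """Split normalised text on whitespace (a deliberately simple tokeniser)."""
    return normalise(text).split()


def ngrams(tokens, n):
    """Return all contiguous word n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, max_n=2):
    """Count word-level (n=1) and higher-order n-gram features up to max_n."""
    tokens = tokenise(text)
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(ngrams(tokens, n))
    return features
```

    Feature counts of this kind would typically be converted to vectors and passed to a classifier such as the Support Vector Machine named in the abstract; that step is omitted here.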