Socially Responsible and Explainable Methods and Corpora for Low-Resource Hate Speech Detection

Language is often used to discriminate against, attack, and terrorize people. In the same settings, stereotypes and prejudices are communicated through language and potentially perpetuated at scale on the web and social media. Although the study of hateful communication is urgent and relevant, there is a significant lack of research on explainable hate speech detection and on social bias in hate speech technologies. To fill these research gaps, and given the severity of abusive comments on social media in Global South countries, this research project investigates and provides socially responsible and explainable methods and data resources for low-resource hate speech detection. Specifically, we focus on Brazilian Portuguese and on Hausa, an African Indigenous language.

As results, we developed a novel bag-of-words machine learning model, optimized with a contextual lexicon, for explainable hate speech detection. Our method incorporates explicit and implicit pejorative terms from a specialized lexicon annotated with contextual information. The proposed method outperformed baselines from the literature and is the current state of the art for Portuguese.

Furthermore, we created the first large-scale expert-annotated corpus for Brazilian hate speech detection, the first expert-annotated corpus for Hausa hate speech detection, and a specialized offensive lexicon. The HateBR corpus was collected from the comment sections of Brazilian politicians' accounts on Instagram and manually annotated by experts. It is composed of 7,000 documents annotated in three layers: a binary classification (offensive versus non-offensive comments), an offensiveness-level classification (highly, moderately, and slightly offensive), and nine hate speech groups (xenophobia, racism, homophobia, sexism, religious intolerance, partyism, apology for the dictatorship, antisemitism, and fatphobia). The HausaHate corpus comprises 2,000 comments extracted from West African Facebook pages and manually annotated by NLP experts in two layers: a binary classification (offensive and non-offensive) and hate speech targets (race, gender, and non-target). The specialized offensive lexicon, titled MOL (Multilingual Offensive Lexicon), holds 1,000 explicit and implicit pejorative terms and expressions that a linguist manually identified in the HateBR corpus. Each entry is annotated with contextual information, that is, whether its offensiveness is context-dependent or context-independent. Both the corpus and the lexicon were annotated by three different experts and achieved high inter-annotator agreement.

We also developed the first web system for offensiveness analysis of Brazilian Portuguese text. The NoHateBrazil web system analyzes fine-grained offensiveness (highly, moderately, and slightly offensive) and provides a new measure of the reliability of the machine learning prediction, which is shown to the user. Finally, we proposed a new explainable method to assess discriminatory social bias in machine learning-based hate speech classifiers. The method, titled Social Stereotype Analysis (SSA), assesses the potential of hate speech classifiers to reflect social stereotypes by contrasting stereotypical beliefs with counter-stereotypes.
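
To make the contextual-lexicon idea more concrete, the following is a minimal sketch of how lexicon-restricted bag-of-words features might be computed. It is not the project's implementation: the toy lexicon, the labels "ci" (context-independent) and "cd" (context-dependent), and the function names are all hypothetical, and the sketch matches single tokens only, whereas MOL also contains multi-word expressions.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (NOT taken from MOL): term -> context label.
# "ci": offensive regardless of context; "cd": offensive only in some contexts.
TOY_LEXICON = {
    "idiot": "ci",
    "trash": "cd",
    "snake": "cd",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def lexicon_features(text, lexicon=TOY_LEXICON):
    """Count lexicon terms in the text, overall and per context label.

    The matched terms double as a simple explanation of the features:
    a reader can see exactly which words contributed.
    """
    term_counts = Counter(t for t in tokenize(text) if t in lexicon)
    label_counts = Counter()
    for term, count in term_counts.items():
        label_counts[lexicon[term]] += count
    return {
        "terms": dict(term_counts),
        "context_independent": label_counts["ci"],
        "context_dependent": label_counts["cd"],
    }


if __name__ == "__main__":
    print(lexicon_features("You idiot, you are trash. Idiot!"))
```

Features of this kind could then be passed to any standard classifier. Restricting the vocabulary to lexicon entries keeps each feature traceable to a specific term, which is one plausible reading of why a lexicon-based representation supports explainability.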

Head
Team

Publications
    • Context-Aware and Expert Data Resources for Brazilian Portuguese Hate Speech Detection
      Vargas, F., Carvalho, I., Pardo, T.A.S., Benevenuto, F.
      Cambridge Natural Language Processing Journal. 2024. pp.1-21. accepted

    • Socially Responsible Hate Speech Detection: Can Classifiers Reflect Social Stereotypes?
      Vargas, F., Carvalho, I., Hürriyetoğlu, A., Pardo, T.A.S., Benevenuto, F.
      Recent Advances in Natural Language Processing (RANLP 2023). pp.1-10. Varna, Bulgaria.

    • NoHateBrazil: A Brazilian Portuguese Text Offensiveness Analysis System
      Vargas, F., Carvalho, I., Schmeisser-Nieto, W., Benevenuto, F., Pardo, T.A.S.
      Recent Advances in Natural Language Processing (RANLP 2023). pp.1-7. Varna, Bulgaria.

    • HateBR: A Large Expert Annotated Corpus of Brazilian Instagram Comments for Offensive Language and Hate Speech Detection
      Vargas, F., Carvalho, I., Góes, F.R., Pardo, T.A.S., Benevenuto, F.
      13th Conference on Language Resources and Evaluation (LREC 2022). pp.7174–7183. Marseille, France.

    • Contextual-Lexicon Approach for Abusive Language Detection
      Vargas, F., Góes, F.R., Carvalho, I., Benevenuto, F., Pardo, T.A.S.
      Recent Advances in Natural Language Processing (RANLP 2021). pp.1442-1451. Held Online.


Resources
Patents
Dataset
  • HateBR: Large-scale expert annotated corpus of Brazilian Instagram comments for abusive language detection.
  • HausaHate: An expert hate speech corpus of Facebook comments for the Hausa African Indigenous language.
Lexicon
Software

Sponsorship