Building Hate Speech Data Resources for the Hausa African Indigenous Language
In African countries, hate speech is an especially serious problem because of a history of ethnic conflict. The continent's Western region in particular still lacks research on hate speech in its indigenous languages. Moreover, because most existing hate speech data resources were developed for English, hate speech technologies for African indigenous languages remain underdeveloped. Finally, language policy is a recurring concern in African countries, with proposals to adopt indigenous languages (e.g., Hausa, Yoruba, Igbo) as national lingua francas as a step toward emancipation from the colonial legacy. In Nigeria, this would mean promoting Hausa over English, which highlights the importance of developing NLP data resources, methods, and tools specifically for the Hausa language. To fill this gap, this research project aims to investigate and build data resources for Hausa, an African indigenous language.
Leader
- Francielle Vargas. University of São Paulo, Brazil
Team
- Shamsuddeen H. Muhammad. Imperial College London, UK
- Ibrahim Said Ahmad. Northeastern University, USA
- Diego Alves. Saarland University, Germany
- Idris Abdulmumin. University of Pretoria, South Africa
- Diallo Mohamed. University of Saint Thomas Aquinas, Burkina Faso
- Samuel Guimarães. Federal University of Minas Gerais, Brazil
- Fabrício Benevenuto. Federal University of Minas Gerais, Brazil
Publications
- HausaHate: An Expert Annotated Corpus for Hausa Hate Speech Detection
Francielle Vargas, Samuel Guimarães, Shamsuddeen H. Muhammad, Diego Alves, Ibrahim Said Ahmad, Idris Abdulmumin, Diallo Mohamed, Thiago A.S. Pardo, Fabrício Benevenuto
8th Workshop on Online Abuse and Harms (WOAH@NAACL 2024). pp. 52–58. Mexico City, Mexico.