Francielle Vargas

Ph.D. Candidate in Computer Science - Natural Language Processing
E-Mail | Google Scholar | LinkedIn | GitHub | ORCID | Lattes | Curriculum Vitae


I am a computer and language scientist with an M.Sc. and a Ph.D. (final year) in Natural Language Processing. During my Ph.D., I was a visiting researcher at the University of Southern California (USC) in the USA and an invited speaker at the Leibniz Institute for the Social Sciences (GESIS) in Germany. I received my M.Sc. in Computer Science and Computational Mathematics from the University of São Paulo (2017). Previously, I obtained a B.S. in Computer Information Systems and a B.A. in Linguistics. I am interested in Natural Language Processing, Machine Learning, and Computational Social Science. My research focuses on improving the explainability, robustness, and fairness of large-scale language models, mostly for misinformation and hate speech applications. To that end, I use machine learning techniques, including neural networks, to design and guide the development of safer, more trustworthy, and more responsible human language technologies.


Research Interests
  • Responsible AI, Explainability, Ethics, Bias Mitigation, and Fairness
  • Fact-Checking, Fake News and Media Bias Detection, Misinformation, Factuality
  • Hate Speech and Offensive Language Detection, Toxicity, Radicalism, Polarization
  • Opinion and Argument Mining, Emotion, Sentiment and Stylistic Analysis, Subjectivity

Research Projects

Awards & Honors
  • Google Latin America Research Award (LARA 2024)
  • NAACL Diversity and Inclusion Award (NAACL 2024)
  • Outstanding Academic Project and Honorable Mention (UFMG 2013)
  • Outstanding Academic Project and Honorable Mention (UFMG 2012)

Invited Talks

Publications
2024
    • Discourse Annotation Guideline for Low-Resource Languages
      Francielle Vargas, Wolfgang Schmeisser-Nieto, Zohar Rabinovich, Thiago A.S. Pardo, Fabrício Benevenuto
      Natural Language Engineering Journal. Cambridge Core. pp. 1–44. To appear.

    • HausaHate: An Expert Annotated Corpus for Hausa Hate Speech Detection
      Francielle Vargas, Samuel Guimarães, Shamsuddeen H. Muhammad, Diego Alves, Ibrahim Said Ahmad, Idris Abdulmumin, Diallo Mohamed, Thiago Pardo, Fabrício Benevenuto
      8th Workshop on Online Abuse and Harms (WOAH @ NAACL 2024). pp. 52–58. Mexico City, Mexico.

    • Extended Multimodal Hate Speech Event Detection During Russia-Ukraine Crisis
      Surendrabikram Thapa, Kritesh Rauniyar, Farhan Jafri, Hariram Veeramani, Raghav Jain, Sandesh Jain, Francielle Vargas, Ali Hürriyetoğlu, Usman Naseem
      7th International Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events from Text (CASE @ EACL 2024). pp. 221–228. St. Julians, Malta.

2023
    • Predicting Sentence-Level Factuality of News and Bias of Media Outlets
      Francielle Vargas, Kokil Jaidka, Thiago A.S. Pardo, Fabrício Benevenuto
      Recent Advances in Natural Language Processing (RANLP 2023). pp. 1197–1206. Varna, Bulgaria.

    • Socially Responsible Hate Speech Detection: Can Classifiers Reflect Social Stereotypes?
      Francielle Vargas, Isabelle Carvalho, Ali Hürriyetoğlu, Thiago A.S. Pardo, Fabrício Benevenuto
      Recent Advances in Natural Language Processing (RANLP 2023). pp. 1187–1196. Varna, Bulgaria.

    • NoHateBrazil: A Brazilian Portuguese Text Offensiveness Analysis System
      Francielle Vargas, Isabelle Carvalho, Wolfgang Schmeisser-Nieto, Fabrício Benevenuto, Thiago A.S. Pardo
      Recent Advances in Natural Language Processing (RANLP 2023). pp. 1180–1186. Varna, Bulgaria.

    • Multimodal Hate Speech Detection
      Surendrabikram Thapa, Farhan Jafri, Ali Hürriyetoğlu, Francielle Vargas, Roy Ka-Wei Lee, Usman Naseem
      6th International Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events from Text (CASE @ RANLP 2023). pp. 151–159. Varna, Bulgaria.

2022
    • HateBR: A Large Expert Annotated Corpus of Brazilian Instagram Comments for Offensive Language and Hate Speech Detection
      Francielle Vargas, Isabelle Carvalho, Fabiana R. Góes, Thiago A.S. Pardo, Fabrício Benevenuto
      13th Conference on Language Resources and Evaluation (LREC 2022). pp. 7174–7183. Marseille, France.

    • Rhetorical Structure Approach for Online Deception Detection: A Survey
      Francielle Vargas, Jonas D'Alessandro, Zohar Rabinovich, Fabrício Benevenuto, Thiago A.S. Pardo
      13th Conference on Language Resources and Evaluation (LREC 2022). pp. 5906–5915. Marseille, France.

    • Studying Dishonest Intentions in Brazilian Portuguese Texts
      Francielle Vargas, Thiago A.S. Pardo
      Deceptive AI. Springer International Publishing: Communications in Computer and Information Science, vol 1296. pp. 166–178.

    • Extended Multilingual Protest News Detection
      Ali Hürriyetoğlu, Osman Mutlu, Fırat Duruşan, Onur Uca, Alaeddin Gürel, Benjamin J. Radford, Yaoyao Dai, Hansi Hettiarachchi, Niklas Stoehr, Tadashi Nomoto, Milena Slavcheva, Francielle Vargas, Aaqib Javid, Erdem Yörük
      5th International Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events from Text (CASE @ EMNLP 2022). pp. 223–228. Abu Dhabi, United Arab Emirates.

2021
    • Contextual-Lexicon Approach for Abusive Language Detection
      Francielle Vargas, Fabiana R. Góes, Isabelle Carvalho, Fabrício Benevenuto, Thiago A.S. Pardo
      Recent Advances in Natural Language Processing (RANLP 2021). pp. 1442–1451. Held Online.

    • Toward Discourse-Aware Models for Multilingual Fake News Detection
      Francielle Vargas, Fabrício Benevenuto, Thiago A.S. Pardo
      Recent Advances in Natural Language Processing (RANLP 2021). pp. 210–218. Held Online.

    • Implicit Opinion Aspect Clues in Portuguese Texts: Analysis and Categorization
      Mateus Tarcinalli Machado, Thiago A.S. Pardo, Evandro Eduardo Seron Ruiz, Ariani Di Felippo, Francielle Vargas
      15th International Conference on the Computational Processing of Portuguese (PROPOR 2021). pp. 68–78. Fortaleza, Brazil.

2020 and before
    • Linguistic Rules for Fine-Grained Opinion Extraction
      Francielle Vargas, Thiago A.S. Pardo
      5th International Workshop on Social Sensing: Special Edition on Narrative Analysis on Social Media (SocialSens @ ICWSM 2020). pp. 1–6. Held Online.

    • Identifying Fine-Grained Opinion and Classifying Polarity on Coronavirus Pandemic
      Francielle Vargas, Rodolfo Sanches Saraiva Dos Santos, Pedro Regattieri Rocha
      9th Brazilian Conference on Intelligent Systems (BRACIS 2020). pp. 511–520. Rio Grande, Brazil.

    • Aspect Clustering Methods for Sentiment Analysis
      Francielle Vargas, Thiago A.S. Pardo
      13th International Conference on the Computational Processing of Portuguese (PROPOR 2018). pp. 365–374. Canela, Brazil.

Committees
Organizing Committee
Program Committee
Natural Language Processing
Computational Social Science and Data Science

Resources
Automated Methods Datasets
  • HateBR: Large-scale expert annotated dataset of Brazilian Instagram comments for abusive language detection
  • HausaHate: An expert-annotated hate speech dataset of Facebook comments for Hausa, an Indigenous African language
  • FactNews: Sentence-level annotated dataset to predict factuality of news articles and bias of media outlets
  • SentiAspect-pt: Aspect-based sentiment analysis annotated dataset of web consumer reviews
  • OPCovidBR: Aspect-based sentiment analysis annotated dataset of Covid-19 tweets
  • Deceiver: Multilingual RST-annotated dataset for fake news detection
Software Lexicons
  • MOL: Multilingual offensive lexicon annotated with contextual information
  • PRO: Taxonomies for aspect-based sentiment analysis

Teaching
  • SCC5809-2021: Neural Networks and Deep Learning. University of São Paulo
  • SCC0605-2020: Computing Theory and Compilers. University of São Paulo
  • SCC0227-2016: Computer Seminars I. University of São Paulo

Industry
  • 2021-2021: Data Scientist. Cisco-Webex
  • 2014-2015: System Analyst. Unisys
  • 2008-2010: System Analyst. Grupo Barcelos