Francielle Vargas
Ph.D. Candidate in Computer Science - Natural Language Processing
I am a computer and language scientist with an M.Sc. in Natural Language Processing, and I am currently a Ph.D. candidate in the same field. I obtained my M.Sc. in Computer Science and Computational Mathematics from the University of São Paulo in 2017. For my Master's thesis, I developed commonsense reasoning for aspect-based opinion mining and summarization. Previously, I obtained a B.S. in Information Systems and a B.A. in Linguistics.
I am interested in Natural Language Processing, Machine Learning, and Computational Social Science. My research investigates safe, trusted, and socially responsible AI systems, as well as computational methods for discourse- and pragmatic-level language understanding and generation. I rely on machine learning techniques, including neural networks, to model and guide natural language system development. The topics that I am currently researching include:
Publications
Preprints
-
FACTual: Explainable Fact-Checking Claims Through Factual Reasoning
Vargas, F., Pardo, T.A.S., Benevenuto, F.
pp.1-7. see
-
TEAR: A Hate Speech Dataset Of Hausa Facebook Comments
Vargas, F., Alves, D., Guimarães, S., Hassan, S., Lamine, D. M., Benevenuto, F.
pp.1-9. see
-
Discourse Annotation Guideline for Low-Resource Languages
Vargas, F., Schmeisser-Nieto, W., Rabinovich, Z., Pardo, T.A.S., Benevenuto, F.
Natural Language Engineering Journal. pp.1-60. see
-
Your Stereotypical Mileage May Vary: Practical Challenges of Evaluating Biases in Multiple Languages and Cultural Contexts
Karën Fort, Laura Alonso Alemany, Jonathan Baum, Luciana Benotti, Julien Bezançon, Claudia Borg, Marthese Borg, Yongjian Chen, Fanny Ducel, Yoann Dupont, Guido Ivetta, Zhijian Li, Margot Mieskes, Marco Naguib, Aurélie Névéol, Yuyan Qian, Matteo Radaelli, Wolfgang Sebastian Schmeisser-Nieto, Emma Raimundo Schulz, Thiziri Saci, Sarah Saidi, Javier Torroba Marchante, Francielle Vargas, Shilin Xie, Sergio E. Zanotto
pp.1-9. see
2023
-
Predicting Sentence-Level Factuality of News and Bias of Media Outlets
Vargas, F., Jaidka, K., Pardo, T.A.S., Benevenuto, F.
Recent Advances in Natural Language Processing (RANLP). pp.1-10. Varna, Bulgaria. accepted
-
Context-Aware and Expert Data Resources for Brazilian Portuguese Hate Speech Detection
Vargas, F., Carvalho, I., Pardo, T.A.S., Benevenuto, F.
Natural Language Engineering Journal. pp.1-21. accepted
-
Socially Responsible Hate Speech Detection: Can Classifiers Reflect Social Stereotypes?
Vargas, F., Carvalho, I., Hürriyetoğlu, A., Pardo, T.A.S., Benevenuto, F.
Recent Advances in Natural Language Processing (RANLP). pp.1-10. Varna, Bulgaria. accepted
-
NoHateBrazil: A Brazilian Portuguese Text Offensiveness Analysis System
Vargas, F., Carvalho, I., Schmeisser-Nieto, W., Benevenuto, F., Pardo, T.A.S.
Recent Advances in Natural Language Processing (RANLP). pp.1-7. Varna, Bulgaria. accepted
-
Multimodal Hate Speech Detection
Thapa, S., Jafri, F. A., Hürriyetoğlu, A., Vargas, F., Lee, R. K., Naseem, U.
6th International Workshop Challenges and Applications of Automated Extraction of Socio-Political Events from Text (RANLP). pp.151-159. Varna, Bulgaria. accepted
2022
-
Rhetorical Structure Approach for Online Deception Detection: A Survey
Vargas, F., D'Alessandro, J., Rabinovich, Z., Benevenuto, F., Pardo, T.A.S.
13th Conference on Language Resources and Evaluation (LREC). pp.5906-5915. Marseille, France. see
-
HateBR: A Large Expert Annotated Corpus of Brazilian Instagram Comments for Abusive Language Detection
Vargas, F., Carvalho, I., Góes, F.R., Pardo, T.A.S., Benevenuto, F.
13th Conference on Language Resources and Evaluation (LREC). pp.7174–7183. Marseille, France. see
-
Studying Dishonest Intentions in Brazilian Portuguese Texts
Vargas, F., Pardo, T.A.S.
Deceptive AI. Springer, vol 1296. pp.166–178. see
-
Extended Multilingual Protest News Detection
Hurriyetoglu, A., Mutlu, O., San, F. D., Uca, O., Gurel, A. S., Radford, B., Dai, Y., Hettiarachchi, H., Stoehr, N., Nomoto, T., Slavcheva, M., Vargas, F., Javid, A., Beyhan, F., Yoruk, E.
5th International Workshop Challenges and Applications of Automated Extraction of Socio-Political Events from Text (EMNLP). pp.223–228. Abu Dhabi, United Arab Emirates. see
2021
-
Contextual-Lexicon Approach for Abusive Language Detection
Vargas, F., Góes, F.R., Carvalho, I., Benevenuto, F., Pardo, T.A.S.
Recent Advances in Natural Language Processing (RANLP). pp.1442-1451. Held Online. see
-
Towards Discourse-Aware Models for Multilingual Fake News Detection
Vargas, F., Benevenuto, F., Pardo, T.A.S.
Recent Advances in Natural Language Processing (RANLP). pp.210-218. Held Online. see
-
Implicit Opinion Aspect Clues in Portuguese Texts: Analysis and Categorization
Machado, M. T., Pardo, T.A.S., Ruiz, E. E. S., Di Felippo, A., Vargas, F.
15th International Conference on the Computational Processing of Portuguese (PROPOR). pp.68-78. Fortaleza, Brazil. see
2020
-
Linguistic Rules for Fine-Grained Opinion Extraction
Vargas, F., Pardo, T.A.S.
5th International Workshop on Social Sensing: Special Edition on Narrative Analysis on Social Media (ICWSM). pp.1-6. Held Online. see
-
Identifying Fine-Grained Opinion and Classifying Polarity on Coronavirus Pandemic
Vargas, F., Santos, R.S.S., Rocha, P.R.
9th Brazilian Conference on Intelligent Systems (BRACIS). pp.511-520. Rio Grande, Brazil. see
2019 and before
-
Aspect Clustering Methods for Sentiment Analysis
Vargas, F., Pardo, T.A.S.
13th International Conference on the Computational Processing of Portuguese (PROPOR). pp.365-374. Canela, Brazil. see
-
The Coreference Annotation of the CSTNews Corpus
Pardo, T.A.S., Baptista, J., Duran, M.S., Nunes, M.G.V., Nóbrega, F.A.A., Aluísio, S.M., Di Felippo, A., Seno, E.R.M., Silva, R.R., Anchieta, R.T., Brum, H.B., Dias, M.S., Martins, R.S.O., Maziero, E.G., Souza, J.W.C., Vargas, F.
2nd Workshop on Evaluation of Human Language Technologies for Iberian Languages (SEPLN). pp.102-112. Murcia, Spain. see
Committees
Organizing Committee
- International AAAI Conference on Web and Social Media (ICWSM 2023) (ICWSM 2022) (ICWSM 2021)
Program Committee
- ACL Student Research Workshop (ACL 2023)
- Discourse and Pragmatics Track (ACL 2023)
- Computational Social Science and Cultural Analytics Track (ACL 2023)
- NLP Applications Track (EMNLP 2022)
- Discourse and Pragmatics Track (EMNLP 2022) (EMNLP 2023)
- Computational Social Science and Cultural Analytics Track (EMNLP 2022) (EMNLP 2023)
- NAACL Student Research Workshop (NAACL 2022)
- Workshop on Online Abuse and Harms (NAACL 2022) (ACL 2023)
- Fact Extraction and VERification (EACL 2023)
- Workshop on Computational Approaches to Discourse (COLING 2022) (ACL 2023)
- Workshop on Argument Mining and Workshop on Computational Models of Natural Argument (COLING 2022) (COMMA 2022) (ICLP 2023)
- Southern California Natural Language Processing Symposium (SoCal NLP 2022)
- Dataset and Demo Track (ICWSM 2023)
- Dataset and Demo Track (CIKM 2023)
- Digital and Social Media Track (HICSS 2024)
Journal Reviewer
- PLoS ONE (Since 2022 - Current)
Resources
Datasets
- HateBR: Large-scale expert annotated dataset of Brazilian Instagram comments for abusive language detection
- TEAR: A hate speech dataset of Facebook comments for the Hausa African Indigenous language
- AspectBR: Aspect-based sentiment analysis annotated dataset of web consumer reviews
- OPCovidBR: Aspect-based sentiment analysis annotated dataset of Covid-19 tweets
- FactNews: Sentence-level annotated dataset to predict factuality and media bias
- Deceiver: Multilingual discourse-annotated dataset for fake news detection
Software
- NoHateBrazil: A Brazilian Portuguese text offensiveness analysis system
- OPCluster: Automatic extraction and clustering of fine-grained opinions
- FACTual: Automated fact-checking and news credibility
Lexicons
- MOL: Multilingual offensive lexicon annotated with contextual information
- PRO: Taxonomies for aspect-based sentiment analysis
Teaching
- SCC5809-2021: Neural Networks and Deep Learning. Graduate Teaching Assistant. University of São Paulo. see
- SCC0605-2020: Computing Theory and Compilers. Graduate Teaching Assistant. University of São Paulo. see
- SCC0227-2016: Computer Seminars I. Graduate Teaching Assistant. University of São Paulo. see
Projects
- 2022-Current: Towards Explicable Fact-Checking Through Factual Reasoning. University of São Paulo
- 2022-2023: MultiCrowsPairs: Measuring Social Biases in Multilingual Masked Language Models. Sorbonne University
- 2022-2023: Predicting Hate Speech for Hausa African Indigenous Language. Federal University of Minas Gerais
- 2020-2023: Discourse-Aware Computational Resources for Fake News Detection. University of São Paulo
- 2020-2022: Socially Responsible Methods and Resources for Hate Speech Detection. University of São Paulo
- 2022-2022: Expanding Evaluation Data for the Multilingual Protest News Detection. Koç University
- 2020-2020: Detecting Antisemitism on Social Media [pdf]. Indiana University Bloomington
- 2019-2020: Aspect-Based Sentiment Analysis. University of São Paulo
- 2015-2017: Summarization for Clever Information Access. University of São Paulo
Honors
- 2021: Research PhD Fellowship. Sinch Company. see
- 2013: Academic Relevance. Federal University of Minas Gerais. see
- 2013: Honorable Mention. Federal University of Minas Gerais. see
- 2012: Academic Relevance. Federal University of Minas Gerais. see