Francielle Vargas

Ph.D. Candidate in Computer Science - Natural Language Processing

I am a computer and language scientist with an M.Sc. and Ph.D. (final year) in Natural Language Processing. I obtained my M.Sc. in Computer Science and Computational Mathematics from the University of São Paulo (awarded in 2017). Previously, I obtained a B.S. in Computer Information Systems and a B.A. in Linguistics.

I am interested in Natural Language Processing, Machine Learning, and Computational Social Science. My research investigates safe, trustworthy, and socially responsible human language technologies. I rely on machine learning techniques, including neural networks, to design and guide the development of natural language systems. The topics that I am currently researching are:

  • fact-checking, fake news and media bias detection, factuality, misinformation
  • hate speech and abusive language detection, radicalism, bias mitigation, fairness
  • opinion and argument mining, emotion, sentiment and stylistic analysis, subjectivity
  • computational models and datasets for discourse-level language understanding and generation

  • Research Projects

    Honors & Awards
    • 2024: Latin America Research Awards (LARA). Google
    • 2013: Outstanding Academic Achievement: Academic Relevance & Honorable Mention. UFMG
    • 2012: Outstanding Academic Achievement: Academic Relevance. UFMG

    Invited Speaker

    Publications
    2024
      • Discourse Annotation Guideline for Low-Resource Languages
        Vargas, F., Schmeisser-Nieto, W., Rabinovich, Z., Pardo, T.A.S., Benevenuto, F.
        Natural Language Engineering Journal. Cambridge Core. pp. 1–33. Accepted.

      • Extended Multimodal Hate Speech Event Detection During Russia-Ukraine Crisis
        Thapa, S., Rauniyar, K., Jafri, F. A., Veeramani, H., Jain, R., Jain, S., Vargas, F., Hürriyetoğlu, A., Naseem, U.
        7th International Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events from Text (EACL). pp. 221–228. St. Julians, Malta.

    2023
      • Predicting Sentence-Level Factuality of News and Bias of Media Outlets
        Vargas, F., Jaidka, K., Pardo, T.A.S., Benevenuto, F.
        Recent Advances in Natural Language Processing (RANLP). pp. 1197–1206. Varna, Bulgaria.

      • Socially Responsible Hate Speech Detection: Can Classifiers Reflect Social Stereotypes?
        Vargas, F., Carvalho, I., Hürriyetoğlu, A., Pardo, T.A.S., Benevenuto, F.
        Recent Advances in Natural Language Processing (RANLP). pp. 1187–1196. Varna, Bulgaria.

      • NoHateBrazil: A Brazilian Portuguese Text Offensiveness Analysis System
        Vargas, F., Carvalho, I., Schmeisser-Nieto, W., Benevenuto, F., Pardo, T.A.S.
        Recent Advances in Natural Language Processing (RANLP). pp. 1180–1186. Varna, Bulgaria.

      • Multimodal Hate Speech Detection
        Thapa, S., Jafri, F. A., Hürriyetoğlu, A., Vargas, F., Lee, R. K., Naseem, U.
        6th International Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events from Text (RANLP). pp. 151–159. Varna, Bulgaria.

    2022
      • Rhetorical Structure Approach for Online Deception Detection: A Survey
        Vargas, F., D'Alessandro, J., Rabinovich, Z., Benevenuto, F., Pardo, T.A.S.
        13th Conference on Language Resources and Evaluation (LREC). pp. 5906–5915. Marseille, France.

      • HateBR: A Large Expert Annotated Corpus of Brazilian Instagram Comments for Offensive Language and Hate Speech Detection
        Vargas, F., Carvalho, I., Góes, F.R., Pardo, T.A.S., Benevenuto, F.
        13th Conference on Language Resources and Evaluation (LREC). pp. 7174–7183. Marseille, France.

      • Studying Dishonest Intentions in Texts
        Vargas, F., Pardo, T.A.S.
        Deceptive AI. Springer, vol. 1296. pp. 166–178.

      • Extended Multilingual Protest News Detection
        Hürriyetoğlu, A., Mutlu, O., San, F. D., Uca, O., Gurel, A. S., Radford, B., Dai, Y., Hettiarachchi, H., Stoehr, N., Nomoto, T., Slavcheva, M., Vargas, F., Javid, A., Beyhan, F., Yoruk, E.
        5th International Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events from Text (EMNLP). pp. 223–228. Abu Dhabi, United Arab Emirates.

    2021
      • Contextual-Lexicon Approach for Abusive Language Detection
        Vargas, F., Góes, F.R., Carvalho, I., Benevenuto, F., Pardo, T.A.S.
        Recent Advances in Natural Language Processing (RANLP). pp. 1442–1451. Held Online.

      • Toward Discourse-Aware Models for Multilingual Fake News Detection
        Vargas, F., Benevenuto, F., Pardo, T.A.S.
        Recent Advances in Natural Language Processing (RANLP). pp. 210–218. Held Online.

    2020
      • Linguistic Rules for Fine-Grained Opinion Extraction
        Vargas, F., Pardo, T.A.S.
        5th International Workshop on Social Sensing: Special Edition on Narrative Analysis on Social Media (ICWSM). pp. 1–6. Held Online.

      • Identifying Fine-Grained Opinion and Classifying Polarity on Coronavirus Pandemic
        Vargas, F., Santos, R.S.S., Rocha, P.R.
        9th Brazilian Conference on Intelligent Systems (BRACIS). pp. 511–520. Rio Grande, Brazil.

    2019 and before
      • Aspect Clustering Methods for Sentiment Analysis
        Vargas, F., Pardo, T.A.S.
        13th International Conference on the Computational Processing of Portuguese (PROPOR). pp. 365–374. Canela, Brazil.

      • The Coreference Annotation of the CSTNews Corpus
        Pardo, T.A.S., Baptista, J., Duran, M.S., Nunes, M.G.V., Nóbrega, F.A.A., Aluísio, S.M., Di Felippo, A., Seno, E.R.M., Silva, R.R., Anchieta, R.T., Brum, H.B., Dias, M.S., Martins, R.S.O., Maziero, E.G., Souza, J.W.C., Vargas, F.
        2nd Workshop on Evaluation of Human Language Technologies for Iberian Languages (SEPLN). pp. 102–112. Murcia, Spain.


    Committees

    Organizing Committee


    Program Committee

    Natural Language Processing

    Computational Social Science and Data Science

    Journal Reviewer


    Resources
    Patents
    Datasets
    • HateBR: Large-scale expert annotated corpus of Brazilian Instagram comments for abusive language detection
    • FactNews: Sentence-level annotated corpus to predict factuality of news articles and bias of media outlets
    • SentiAspect-pt: Aspect-based sentiment analysis annotated corpus of web consumer reviews
    • OPCovidBR: Aspect-based sentiment analysis annotated corpus of Covid-19 tweets
    • Deceiver: Multilingual RST-annotated corpus for fake news detection
    Software
    Lexicons
    • MOL: Multilingual offensive lexicon annotated with contextual information
    • PRO: Taxonomies for aspect-based sentiment analysis

    Teaching

    Industry
    • 2021-2023: Research Fellow. Sinch
    • 2021: Data Scientist. Cisco-Webex
    • 2014-2015: System Analyst. Unisys

    Links

    © Copyright 2021 Francielle Vargas. Hosted by GitHub Pages.