SetembroBR: Social media analysis for early detection of mental health disorders

Overview

The observation that individuals with mental health disorders such as depression and anxiety are often regular users of social media has led to a wide range of studies in the Natural Language Processing (NLP) field on risk assessment based on the kind of language these individuals employ. Existing work is, however, largely dedicated to the English language, and tends to consider publications (e.g., tweets) produced at any time, including those written after the individual has already been clinically diagnosed. Models of this kind therefore focus more on distinguishing individuals with and without a certain disorder, and are perhaps less able to anticipate a disorder as a means to prevent its possible aggravation. Based on these observations, this project proposes to explore the temporal information provided by the Twitter platform for the study and development of computational models for early recognition of depression and anxiety disorder in Portuguese, using a database called the SetembroBR corpus, which is designed to include only texts that are chronologically prior to the date of diagnosis reported by social media users. A study of this kind, in addition to introducing a novel (and possibly more useful) formulation of the computational problem, opens up the opportunity for a number of scientific contributions in the NLP field, including the modeling of textual and non-textual features and the use of recent neural learning methods, and enables novel solutions for a pressing issue of great social interest.
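
The design constraint described above (keeping only texts that precede the reported diagnosis date) can be illustrated with a minimal Python sketch. The `Post` record, the `pre_diagnosis_posts` function, and the `diagnosis_dates` mapping are hypothetical names introduced here for illustration only; they are not part of the SetembroBR release, and the actual corpus construction may differ.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    # Hypothetical record for one social media post.
    user_id: str
    created_at: datetime
    text: str


def pre_diagnosis_posts(posts, diagnosis_dates):
    """Keep only posts written strictly before the author's reported diagnosis date.

    Assumption: users without a known diagnosis date are skipped entirely.
    """
    kept = []
    for post in posts:
        diagnosed_at = diagnosis_dates.get(post.user_id)
        if diagnosed_at is not None and post.created_at < diagnosed_at:
            kept.append(post)
    return kept


# Illustrative data only (not taken from the corpus).
diagnosis_dates = {"u1": datetime(2021, 6, 1)}
posts = [
    Post("u1", datetime(2021, 5, 1), "before"),
    Post("u1", datetime(2021, 7, 1), "after"),
]
kept = pre_diagnosis_posts(posts, diagnosis_dates)
print([p.text for p in kept])  # prints ['before']
```

A strict comparison is used so that a post made on the diagnosis date itself is excluded; whether that boundary matches the corpus is an assumption of this sketch.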

Current status

The project ran from May 2022 to April 2024 under FAPESP grant nr. 2021/08213-0. Complementary funding was provided by CAPES grant nr. 88887.475847/2020-00, and by the Center for Artificial Intelligence (C4AI-USP) with support from the São Paulo Research Foundation (FAPESP grant #2019/07665-4) and from the IBM Corporation.

The corpus has been publicly released for reuse - see download link.

Publications

dos Santos, Wesley Ramos; Amanda Maria Martins Funabashi; Ivandré Paraboni (2020) Searching Brazilian Twitter for signs of mental health issues. 12th Language Resources and Evaluation Conference (LREC-2020), pp. 6113-6119, Marseille, France.

dos Santos, Wesley Ramos; Sungwon Yoon; Ivandré Paraboni (2023) Mental health prediction from social media text using mixture of experts. IEEE Latin America Transactions 21(6), pp. 723-729. 10.1109/TLA.2023.10172137.

da Costa, Pablo Botton; Matheus Camasmie Pavan; Wesley Ramos dos Santos; Samuel Caetano da Silva; Ivandré Paraboni (2023) BERTabaporu: assessing a genre-specific language model for Portuguese NLP. Recent Advances in Natural Language Processing (RANLP-2023), pp. 217-223, Varna, Bulgaria. BERTabaporu download.

dos Santos, Wesley Ramos; Ivandré Paraboni (2023) Predição de transtorno depressivo em redes sociais: BERT supervisionado ou ChatGPT zero-shot? XIV Simpósio Brasileiro de Tecnologia da Informação e da Linguagem Humana (STIL-2023), pp. 11-21. 10.5753/stil.2023.233275.

de Oliveira, Rafael Lage; Ivandré Paraboni (2024) A Bag-of-Users approach to mental health prediction from social media data. 16th International Conference on Computational Processing of Portuguese (PROPOR 2024). Santiago de Compostela, Spain.

de Oliveira, Rafael Lage; João Trevisan Martins; Ivandré Paraboni (2024) Mental health prediction from social media connections. New Review of Hypermedia and Multimedia. 10.1080/13614568.2024.2346227.

Paraboni, Ivandré; Helena de Medeiros Caseli (2024) PLN na saúde mental: detecção de transtornos de saúde mental a partir de texto. In: Processamento de Linguagem Natural: Conceitos, Técnicas e Aplicações em Português, ed. 3. BPLN, pp. 667-687.

dos Santos, Wesley Ramos; Ivandré Paraboni (2024) Prompt-based mental health screening from social media text. Brazilian Workshop on Social Network Analysis and Mining (CSBC 2024 - BraSNAM 2024), pp. 186-192. 10.5753/brasnam.2024.1879.

dos Santos, Wesley Ramos; Rafael Lage de Oliveira; Ivandré Paraboni (2024) SetembroBR: a social media corpus for depression and anxiety disorder prediction. Language Resources and Evaluation 58, pp. 273-300. 10.1007/s10579-022-09633-0.

Nagamatu, Bruno Issamo Tagava; Ivandré Paraboni (2024) Detecção precoce de transtornos de saúde mental em português. Linguamática 16(2), pp. 3-19. 10.21814/lm.16.2.440.

dos Santos, Wesley Ramos; Ivandré Paraboni (2025) Tracking mental health indicators on social media before and after diagnosis. 28th International Conference on Text, Speech and Dialogue (TSD 2025), pp. 3-14, Erlangen, Germany. 10.1007/978-3-032-02551-7_2.

dos Santos, Wesley Ramos; Ivandré Paraboni; Elton Hiroshi Matsushima; Camila Azevedo da Silva; Emily Samara de Moura Meira; João Victor Rodrigues Ferreira Guimarães; Julia da Silva Lins; Laura Enham de Azeredo; Luiz Guilherme Cerqueira Nunes; Vittória Thiengo Silveira Moreira Rego (2025) Mixture of experts for depression and anxiety disorder prediction from textual and non-textual social media data. IEEE Access. 10.1109/ACCESS.2025.3583259.