Code & Dataset

MDD&BD Risk (NAACL 2024)

Under the supervision of a psychiatrist, three trained annotators labeled 1,025 users and their 7,346 anonymized Reddit posts using the open-source text annotation tool Doccano. During annotation, we mainly consider two label categories: (i) Diagnosis Type (e.g., MDD, BD) and (ii) BD Mood Level, on a scale ranging from -3 to 3. If the annotators' labels conflict, all annotators discuss the case and reach an agreement under the supervision of the psychiatrist.
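The conflict check described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' tooling: the `Annotation` record, its field names, the `find_conflicts` helper, and the assumption that labels are compared per post are all hypothetical. Only the -3 to 3 mood range and the MDD/BD example labels come from the description.

```python
from dataclasses import dataclass
from typing import Optional

# Mood level range stated in the dataset description.
MOOD_MIN, MOOD_MAX = -3, 3


@dataclass(frozen=True)
class Annotation:
    """One annotator's labels for one post (hypothetical record layout)."""
    annotator: str
    post_id: str
    diagnosis: str                     # e.g. "MDD" or "BD"
    mood_level: Optional[int] = None   # assumed to be set only when a BD mood level applies

    def __post_init__(self):
        # Reject mood levels outside the documented scale.
        if self.mood_level is not None and not MOOD_MIN <= self.mood_level <= MOOD_MAX:
            raise ValueError(
                f"mood level {self.mood_level} outside [{MOOD_MIN}, {MOOD_MAX}]"
            )


def find_conflicts(annotations):
    """Return the sorted post IDs on which annotators disagree on any label."""
    labels_by_post = {}
    for a in annotations:
        labels_by_post.setdefault(a.post_id, set()).add((a.diagnosis, a.mood_level))
    # More than one distinct (diagnosis, mood_level) pair means a disagreement.
    return sorted(pid for pid, labels in labels_by_post.items() if len(labels) > 1)


# Made-up example data: three annotators, two posts.
annotations = [
    Annotation("a1", "p1", "BD", 2),
    Annotation("a2", "p1", "BD", 2),
    Annotation("a3", "p1", "BD", 1),
    Annotation("a1", "p2", "MDD"),
    Annotation("a2", "p2", "MDD"),
    Annotation("a3", "p2", "MDD"),
]
print(find_conflicts(annotations))  # ['p1']
```

Posts returned by `find_conflicts` would then be the ones taken to the group discussion; the resolution itself remains a manual step.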

This dataset contains assessments of (i) BD symptoms (e.g., manic, anxiety) and (ii) suicidality levels (e.g., ideation, attempt), labeled for 818 users and their 7,592 anonymized Reddit posts.


 

This dataset contains assessments of the severity of suicidality for 866 Reddit users who posted on the r/SuicideWatch subreddit from 2008 to 2015, together with their 79,569 posts uploaded to 37,083 subreddits.





These datasets contain suicide-related and non-suicide-related Korean posts from Naver Cafe, and suicide-related dictionary data for generating suicide word embeddings for Chinese, English, and Korean, respectively.