This paper presents a novel method for gathering data on a range of mental illnesses quickly and cheaply, then focuses on the analysis of four in particular: post-traumatic stress disorder, depression, bipolar disorder, and seasonal affective disorder.
Abstract:
The ubiquity of social media provides a rich opportunity to enhance the data available to mental health clinicians and researchers, enabling a better-informed and better-equipped mental health field. We present analysis of mental health phenomena in publicly available Twitter data, demonstrating how rigorous application of simple natural language processing methods can yield insight into specific disorders as well as mental health writ large, along with evidence that as-of-yet undiscovered linguistic signals relevant to mental health exist in social media. We present a novel method for gathering data for a range of mental illnesses quickly and cheaply, then focus on analysis of four in particular: post-traumatic stress disorder (PTSD), depression, bipolar disorder, and seasonal affective disorder (SAD). We intend for these proof-of-concept results to inform the necessary ethical discussion regarding the balance between the utility of such data and the privacy of mental health related information.
TL;DR: This paper develops a statistical methodology to infer which individuals could undergo transitions from mental health discourse to suicidal ideation, and utilizes semi-anonymous support communities on Reddit as unobtrusive data sources to infer the likelihood of these shifts.
TL;DR: Automated detection methods may help to identify depressed or otherwise at-risk individuals through the large-scale passive monitoring of social media, and in the future may complement existing screening procedures.
TL;DR: It is shown that the content shared by consenting users on Facebook can predict a future occurrence of depression in their medical records, and language predictive of depression includes references to typical symptoms, including sadness, loneliness, hostility, rumination, and increased self-reference.
TL;DR: This paper describes the feasibility of using social media data to detect those at risk for suicide, applies natural language processing and machine learning techniques to detect quantifiable signals around suicide attempts, and outlines designs for an automated system for estimating suicide risk.
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
TL;DR: This book describes ggplot2, a new data visualization package for R that draws on insights from Leland Wilkinson's Grammar of Graphics to provide a powerful and flexible system for creating data graphics.
TL;DR: The Linguistic Inquiry and Word Count (LIWC) system is a text analysis tool that counts words in psychologically meaningful categories. It has been used to detect meaning in a wide variety of experimental settings, revealing attentional focus, emotionality, social relationships, thinking styles, and individual differences.
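The category-counting idea behind LIWC-style analysis can be illustrated with a minimal sketch. This is not the LIWC software or its dictionary, which is proprietary: the lexicon `TOY_LEXICON`, its category names, and the function `category_proportions` are all hypothetical, chosen only to show how the share of words per category might be computed.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only.
# It is NOT the real LIWC dictionary, which is proprietary.
TOY_LEXICON = {
    "first_person": {"i", "me", "my", "myself"},
    "negative_emotion": {"sad", "lonely", "hurt", "afraid"},
    "social": {"friend", "friends", "family", "talk"},
}


def category_proportions(text, lexicon=TOY_LEXICON):
    """Return the fraction of tokens in `text` that fall in each category."""
    # Lowercase the text and keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for category, words in lexicon.items():
            if token in words:
                counts[category] += 1
    total = len(tokens)
    # Normalise by the token count so texts of different lengths are comparable.
    return {cat: (counts[cat] / total if total else 0.0) for cat in lexicon}


scores = category_proportions("I feel sad and lonely without my friends")
```

Here `scores` maps each category to the proportion of words in the sentence that belong to it. Normalising by text length is one simple design choice; raw counts would work equally well for texts of similar size.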
Q1. What are the contributions in "Quantifying mental health signals in Twitter"?
The ubiquity of social media provides a rich opportunity to enhance the data available to mental health clinicians and researchers, enabling a better-informed and better-equipped mental health field. The authors present analysis of mental health phenomena in publicly available Twitter data, demonstrating how rigorous application of simple natural language processing methods can yield insight into specific disorders as well as mental health writ large, along with evidence that as-of-yet undiscovered linguistic signals relevant to mental health exist in social media. The authors present a novel method for gathering data for a range of mental illnesses quickly and cheaply, then focus on analysis of four in particular: post-traumatic stress disorder (PTSD), depression, bipolar disorder, and seasonal affective disorder (SAD). The authors intend for these proof-of-concept results to inform the necessary ethical discussion regarding the balance between the utility of such data and the privacy of mental health related information.
Q2. What future work is mentioned in the paper "Quantifying mental health signals in Twitter"?
Crucially, the authors expect that these novel data collection methods can provide complementary information to existing survey-based methods, rather than supplant them. For many disorders rarer than depression (which has comparatively high incidence rates), the authors suspect that finding any data will be a challenge, in which case combining these methods with the existing survey collection methods may be the best way to obtain sufficient amounts of data for statistical analyses. Uncovering and interpreting these signals can be best accomplished through collaboration between NLP and mental health researchers. The results indicate that individual- and population-level analyses can be made cheaper and more timely than current methods, yet there remains as-of-yet untapped information encoded in language use, promising a rich collaboration between the fields of natural language processing and mental health.