Author

Lukasz Dziurzynski

Bio: Lukasz Dziurzynski is an academic researcher from the University of Pennsylvania. The author has contributed to research on the topics of Personality and Vocabulary. The author has an h-index of 7 and has co-authored 7 publications receiving 2050 citations.

Papers
Journal ArticleDOI
25 Sep 2013-PLOS ONE
TL;DR: This represents the largest study, by an order of magnitude, of language and personality, and found striking variations in language with personality, gender, and age.
Abstract: We analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests, and found striking variations in language with personality, gender, and age. In our open-vocabulary technique, the data itself drives a comprehensive exploration of language that distinguishes people, finding connections that are not captured with traditional closed-vocabulary word-category analyses. Our analyses shed new light on psychosocial processes yielding results that are face valid (e.g., subjects living in high elevations talk about the mountains), tie in with other research (e.g., neurotic people disproportionately use the phrase ‘sick of’ and the word ‘depressed’), suggest new hypotheses (e.g., an active life implies emotional stability), and give detailed insights (males use the possessive ‘my’ when mentioning their ‘wife’ or ‘girlfriend’ more often than females use ‘my’ with ‘husband’ or ‘boyfriend’). To date, this represents the largest study, by an order of magnitude, of language and personality.

1,435 citations

Journal ArticleDOI
TL;DR: Capturing community psychological characteristics through social media is feasible, and these characteristics are strong markers of cardiovascular mortality at the community level.
Abstract: Hostility and chronic stress are known risk factors for heart disease, but they are costly to assess on a large scale. We used language expressed on Twitter to characterize community-level psychological correlates of age-adjusted mortality from atherosclerotic heart disease (AHD). Language patterns reflecting negative social relationships, disengagement, and negative emotions—especially anger—emerged as risk factors; positive emotions and psychological engagement emerged as protective factors. Most correlations remained significant after controlling for income and education. A cross-sectional regression model based only on Twitter language predicted AHD mortality significantly better than did a model that combined 10 common demographic, socioeconomic, and health risk factors, including smoking, diabetes, hypertension, and obesity. Capturing community psychological characteristics through social media is feasible, and these characteristics are strong markers of cardiovascular mortality at the community level.

409 citations

Proceedings Article
28 Jun 2013
TL;DR: This article used LDA topics derived from tweets to predict the subjective well-being of people living in US counties, as measured by representative surveys, and found that the LDA topics provide a greater behavioural and conceptual resolution into life satisfaction than the broad socio-economic and demographic variables.
Abstract: The language used in tweets from 1,300 different US counties was found to be predictive of the subjective well-being of people living in those counties as measured by representative surveys. Topics, sets of co-occurring words derived from the tweets using LDA, improved accuracy in predicting life satisfaction over and above standard demographic and socio-economic controls (age, gender, ethnicity, income, and education). The LDA topics provide a greater behavioural and conceptual resolution into life satisfaction than the broad socio-economic and demographic variables. For example, tied in with the psychological literature, words relating to outdoor activities, spiritual meaning, exercise, and good jobs correlate with increased life satisfaction, while words signifying disengagement like ‘bored’ and ‘tired’ show a negative association.

220 citations

Proceedings ArticleDOI
01 Jan 2016
TL;DR: The task of predicting individual well-being, as measured by a life satisfaction scale, through the language people use on social media is presented, finding that a multi-level cascaded model, using both message-level predictions and user-level features, performs best and outperforms popular lexicon-based happiness models.
Abstract: We present the task of predicting individual well-being, as measured by a life satisfaction scale, through the language people use on social media. Well-being, which encompasses much more than emotion and mood, is linked with good mental and physical health. The ability to quickly and accurately assess it can supplement multi-million dollar national surveys as well as promote whole body health. Through crowd-sourced ratings of tweets and Facebook status updates, we create message-level predictive models for multiple components of well-being. However, well-being is ultimately attributed to people, so we perform an additional evaluation at the user level, finding that a multi-level cascaded model, using both message-level predictions and user-level features, performs best and outperforms popular lexicon-based happiness models. Finally, we suggest that analyses of language go beyond prediction by identifying the language that characterizes well-being.

122 citations

Journal ArticleDOI
TL;DR: A new open language analysis approach is presented that identifies and visually summarizes the dominant naturally occurring words and phrases that most distinguished each Big Five personality trait.
Abstract: Objective: We present a new open language analysis approach that identifies and visually summarizes the dominant naturally occurring words and phrases that most distinguished each Big Five personal...

104 citations


Cited by
Journal ArticleDOI
TL;DR: The aim of this review is to describe the chemical characteristics of compounds present in honey, their stability when heated or stored for long periods of time and the parameters of identity and quality.

810 citations

17 Dec 2010
TL;DR: The authors survey the vast terrain of "culturomics", focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000, using a corpus of digitized texts containing about 4% of all books ever printed.
Abstract: The article, published in Science, on one of the first analytical uses of Google Books, based on n-grams (Google Ngrams). We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of "culturomics", focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can ...

735 citations

Journal ArticleDOI
TL;DR: This work demonstrates how to recruit participants using Facebook, incentivize them effectively, and maximize their engagement, and outlines the most important opportunities and challenges associated with using Facebook for research.
Abstract: Facebook is rapidly gaining recognition as a powerful research tool for the social sciences. It constitutes a large and diverse pool of participants, who can be selectively recruited for both online and offline studies. Additionally, it facilitates data collection by storing detailed records of its users' demographic profiles, social interactions, and behaviors. With participants' consent, these data can be recorded retrospectively in a convenient, accurate, and inexpensive way. Based on our experience in designing, implementing, and maintaining multiple Facebook-based psychological studies that attracted over 10 million participants, we demonstrate how to recruit participants using Facebook, incentivize them effectively, and maximize their engagement. We also outline the most important opportunities and challenges associated with using Facebook for research, provide several practical guidelines on how to successfully implement studies on Facebook, and finally, discuss ethical considerations.

709 citations

Journal ArticleDOI
TL;DR: It is concluded that debate over the optimal name for this broad category of personal qualities obscures substantial agreement about the specific attributes worth measuring and medium-term innovations that may make measures of these personal qualities more suitable for educational purposes are highlighted.
Abstract: There has been perennial interest in personal qualities other than cognitive ability that determine success, including self-control, grit, growth mindset, and many others. Attempts to measure such qualities for the purposes of educational policy and practice, however, are more recent. In this article, we identify serious challenges to doing so. We first address confusion over terminology, including the descriptor "non-cognitive." We conclude that debate over the optimal name for this broad category of personal qualities obscures substantial agreement about the specific attributes worth measuring. Next, we discuss advantages and limitations of different measures. In particular, we compare self-report questionnaires, teacher-report questionnaires, and performance tasks, using self-control as an illustrative case study to make the general point that each approach is imperfect in its own way. Finally, we discuss how each measure's imperfections can affect its suitability for program evaluation, accountability, individual diagnosis, and practice improvement. For example, we do not believe any available measure is suitable for between-school accountability judgments. In addition to urging caution among policymakers and practitioners, we highlight medium-term innovations that may make measures of these personal qualities more suitable for educational purposes.

687 citations