Book Chapter

Personality Recognition from Facebook Text

Abstract
This work presents a Natural Language Processing study on recognising personality traits in text written in Portuguese. To this end, we first built a corpus of Facebook status updates labelled with the personality traits of their authors, and then used it to train a number of computational models of personality recognition. The models range from a standard approach relying on lexical knowledge from the LIWC dictionary and other sources to purely text-based methods such as bag of words and word embeddings, among others. Results suggest that word embedding models slightly outperform the alternatives under consideration, with the advantage of not requiring any language-specific lexical resources.


Citations
Journal Article

Computational personality recognition from Facebook text: psycholinguistic features, words and facets

TL;DR: A series of individual experiments addresses open questions about psycholinguistics-motivated models of personality recognition when such knowledge sources are not available for the target language; the initial results should aid the future development of more robust systems of this kind.
Book Chapter

Personality Identification from Social Media Using Deep Learning: A Review

TL;DR: Support vector machine (SVM), Naive Bayes (NB), multilayer perceptron neural network, and convolutional neural network (CNN) are some of the machine learning techniques used for personality identification in the reviewed literature.
Journal Article

Knowledge Graph-Enabled Text-Based Automatic Personality Prediction

TL;DR: A novel knowledge graph-enabled approach to text-based APP that relies on the Big Five personality traits is presented, which indicated considerable improvements in prediction accuracies in all of the suggested classifiers.
Proceedings Article

Cross-domain Author Gender Classification in Brazilian Portuguese.

TL;DR: A cross-domain gender classification task based on four domains (Facebook, crowd sourced opinions, Blogs and E-gov requests) in the Brazilian Portuguese language is discussed, and a number of simple gender classification models using word- and psycholinguistics-based features alike are introduced.
Proceedings Article

Searching Brazilian Twitter for Signs of Mental Health Issues

TL;DR: This paper describes the initial steps towards building a novel resource in the Brazilian Portuguese language, a corpus intended to support both the recognition of mental health issues and the temporal analysis of these illnesses, and reports initial results of a number of text classification experiments addressing both tasks.
References
Proceedings Article

Distributed Representations of Sentences and Documents

TL;DR: Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents, and its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.
Journal Article

An alternative "description of personality": the big-five factor structure.

TL;DR: The generality of this 5-factor model is here demonstrated across unusually comprehensive sets of trait terms, which suggest their potential utility as Big-Five markers in future studies.
Posted Content

Distributed Representations of Sentences and Documents

TL;DR: The authors proposed paragraph vector, an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents, and achieved new state-of-the-art results on several text classification and sentiment analysis tasks.
Proceedings Article

Linguistic Regularities in Continuous Space Word Representations

TL;DR: The vector-space word representations that are implicitly learned by the input-layer weights are found to be surprisingly good at capturing syntactic and semantic regularities in language, and each relationship is characterized by a relation-specific vector offset.
Journal Article

The MRC Psycholinguistic Database

TL;DR: A computerised database of psycholinguistic information is described, where semantic, syntactic, phonological and orthographic information about some or all of the 98,538 words in the database is accessible, by using a specially-written and very simple programming language.