The FLDA model for aspect-based opinion mining: addressing the cold start problem

doi:10.1145/2488388.2488467

Proceedings Article

- pp 909-918

TLDR

This paper proposes a probabilistic graphical model based on LDA, called Factorized LDA (FLDA), to address the cold start problem and demonstrates the improved effectiveness of the FLDA model in terms of likelihood of the held-out test set.

Abstract:

Aspect-based opinion mining from online reviews has attracted a lot of attention recently. The main goal of all of the proposed methods is extracting aspects and/or estimating aspect ratings. Recent works, which are often based on Latent Dirichlet Allocation (LDA), consider both tasks simultaneously. These models are normally trained at the item level, i.e., a model is learned for each item separately. Learning a model per item is fine when the item has been reviewed extensively and has enough training data. However, in real-life data sets such as those from Epinions.com and Amazon.com, more than 90% of items have less than 10 reviews, so-called cold start items. State-of-the-art LDA models for aspect-based opinion mining are trained at the item level and therefore perform poorly for cold start items due to the lack of sufficient training data. In this paper, we propose a probabilistic graphical model based on LDA, called Factorized LDA (FLDA), to address the cold start problem. The underlying assumption of FLDA is that aspects and ratings of a review are influenced not only by the item but also by the reviewer. It further assumes that both items and reviewers can be modeled by a set of latent factors which represent their aspect and rating distributions. Different from state-of-the-art LDA models, FLDA is trained at the category level and learns the latent factors using the reviews of all the items of a category, in particular the non cold start items, and uses them as prior for cold start items. Our experiments on three real-life data sets demonstrate the improved effectiveness of the FLDA model in terms of likelihood of the held-out test set. We also evaluate the accuracy of FLDA based on two application-oriented measures.
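The abstract defines cold start items as those with fewer than 10 reviews. As an illustration only, the following sketch shows how a review corpus might be partitioned on that threshold. The function names and the input format (one item ID per review) are assumptions made here for the example; they are not taken from the paper, and this is not the FLDA model itself.

```python
from collections import Counter

# Threshold taken from the abstract: items with fewer than 10 reviews
# are called cold start items.
COLD_START_THRESHOLD = 10


def split_cold_start(review_item_ids, threshold=COLD_START_THRESHOLD):
    """Partition items into (cold_start, non_cold_start) sets.

    `review_item_ids` is a hypothetical input: an iterable holding one
    item ID per review, so an item with k reviews appears k times.
    """
    counts = Counter(review_item_ids)
    cold = {item for item, n in counts.items() if n < threshold}
    non_cold = set(counts) - cold
    return cold, non_cold


def cold_start_fraction(review_item_ids, threshold=COLD_START_THRESHOLD):
    """Return the share of reviewed items that fall below the threshold."""
    counts = Counter(review_item_ids)
    if not counts:
        return 0.0
    n_cold = sum(1 for n in counts.values() if n < threshold)
    return n_cold / len(counts)


if __name__ == "__main__":
    # Toy data: item "a" has 12 reviews, "b" has 3, "c" has exactly 10.
    ids = ["a"] * 12 + ["b"] * 3 + ["c"] * 10
    cold, non_cold = split_cold_start(ids)
    print(sorted(cold), sorted(non_cold), cold_start_fraction(ids))
```

Items with no reviews at all never appear in the input, so they are not counted here. A category-level approach such as the one the abstract describes would learn from the reviews of both groups together, with the non cold start items supplying most of the training signal.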

Citations

Hidden factors and hidden topics: understanding rating dimensions with review text

A survey on opinion mining and sentiment analysis

Survey on Aspect-Level Sentiment Analysis

Inferring Networks of Substitutable and Complementary Products

References

Latent Dirichlet Allocation

Matrix Factorization Techniques for Recommender Systems

Mining and summarizing customer reviews

Sentiment Analysis and Opinion Mining
