
Latent Dirichlet allocation

About: Latent Dirichlet allocation is a research topic. Over its lifetime, 5,351 publications have been published within this topic, receiving 212,555 citations. The topic is also known as: LDA.
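To make the topic concrete, here is a minimal, self-contained sketch of fitting an LDA model with scikit-learn on a toy corpus; the corpus and the choice of 2 topics are illustrative assumptions, not taken from any of the papers below.

```python
# Minimal LDA sketch: fit topics on a toy corpus and inspect
# per-document topic proportions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell sharply today",
    "investors sold shares amid market fears",
]

# Bag-of-words document-term matrix
X = CountVectorizer(stop_words="english").fit_transform(docs)

# Fit LDA with 2 latent topics
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)  # rows: documents, cols: topic proportions

# Each row of theta sums to 1: a distribution over the 2 latent topics.
print(theta.shape)
```

Each document is thus summarized as a probability distribution over latent topics, which is the representation the papers below build on.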


Papers
Proceedings Article
23 Jun 2011
TL;DR: This paper proposes ways of employing Latent Dirichlet Allocation in authorship attribution and shows that this approach yields state-of-the-art performance for both a few and many candidate authors, in cases where these authors wrote enough texts to be modelled effectively.
Abstract: The problem of authorship attribution -- attributing texts to their original authors -- has been an active research area since the end of the 19th century, attracting increased interest in the last decade. Most of the work on authorship attribution focuses on scenarios with only a few candidate authors, but recently considered cases with tens to thousands of candidate authors were found to be much more challenging. In this paper, we propose ways of employing Latent Dirichlet Allocation in authorship attribution. We show that our approach yields state-of-the-art performance for both a few and many candidate authors, in cases where these authors wrote enough texts to be modelled effectively.

64 citations
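The general idea of LDA-based authorship attribution can be sketched as follows; this is a hedged simplification (the paper's exact pipeline may differ), where each author is represented by the mean topic distribution of their known texts and a new text is attributed to the author with the most similar profile. The corpus, author names, and the `attribute` helper are illustrative assumptions.

```python
# Hedged sketch: authorship attribution via LDA topic distributions.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

train = {
    "author_a": ["the ship sailed the stormy sea", "waves crashed on the deck"],
    "author_b": ["the court ruled on the appeal", "the judge issued a verdict"],
}
all_docs = [d for docs in train.values() for d in docs]

vec = CountVectorizer()
X = vec.fit_transform(all_docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# One topic-distribution profile per author (mean over their texts)
profiles = {}
for author, docs in train.items():
    theta = lda.transform(vec.transform(docs))
    profiles[author] = theta.mean(axis=0)

def attribute(text):
    """Attribute a text to the author whose profile is most similar."""
    q = lda.transform(vec.transform([text]))[0]
    return max(profiles, key=lambda a: np.dot(q, profiles[a]) /
               (np.linalg.norm(q) * np.linalg.norm(profiles[a])))

print(attribute("the sea was rough near the deck"))
```

With realistically sized corpora per author, these topic profiles become stable enough to discriminate among many candidates, which matches the paper's finding that authors must have written enough texts to be modelled effectively.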

Book ChapterDOI
15 Jul 2010
TL;DR: A probabilistic framework models answerers' interests with a mixture of the Language Model and the Latent Dirichlet Allocation model; experimental results show the proposed method can effectively push new questions to the best answerers.
Abstract: Community question answering (CQA) has become a very popular web service to provide a platform for people to share knowledge. In current CQA services, askers post their questions to the system and wait for answerers to answer them passively. This procedure leads to several drawbacks. Since new questions are presented to all users in the system, the askers can not expect some experts to answer their questions. Meanwhile, answerers have to visit many questions and then pick out only a small part of them to answer. To overcome those drawbacks, a probabilistic framework is proposed to predict best answerers for new questions. By tracking answerers' answering history, interests of answerers are modeled with the mixture of the Language Model and the Latent Dirichlet Allocation model. User activity and authority information is also taken into consideration. Experimental results show the proposed method can effectively push new questions to the best answerers.

64 citations
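The routing idea above can be sketched in a few lines; this is a hedged simplification where answerer interests are LDA topic distributions built from answering history, and answerers are ranked by similarity to a new question. The paper additionally mixes in a Language Model and user activity/authority signals, which are omitted here; the histories and `rank_answerers` helper are illustrative assumptions.

```python
# Hedged sketch: route a new question to the answerers whose
# topic profiles best match it.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Each answerer's answering history, concatenated into one document
history = {
    "user_1": "python list comprehension decorators generators",
    "user_2": "mortgage interest rates refinancing loans",
}

vec = CountVectorizer()
X = vec.fit_transform(list(history.values()))
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
profiles = lda.transform(X)  # one topic vector per answerer

def rank_answerers(question):
    """Rank answerers by cosine similarity to the question's topics."""
    q = lda.transform(vec.transform([question]))[0]
    sims = profiles @ q / (np.linalg.norm(profiles, axis=1) * np.linalg.norm(q))
    order = np.argsort(-sims)
    return [list(history)[i] for i in order]

print(rank_answerers("how do decorators work in python"))
```

Pushing a question to the top-ranked answerers replaces the passive wait-for-answers procedure the abstract criticizes.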

Proceedings ArticleDOI
03 Nov 2014
TL;DR: This paper proposes an LDA-based opinion model named Twitter Opinion Topic Model (TOTM) for opinion mining and sentiment analysis, which leverages hashtags, mentions, emoticons and strong sentiment words that are present in tweets in its discovery process.
Abstract: Aspect-based opinion mining is widely applied to review data to aggregate or summarize opinions of a product, and the current state-of-the-art is achieved with Latent Dirichlet Allocation (LDA)-based model. Although social media data like tweets are laden with opinions, their "dirty" nature (as natural language) has discouraged researchers from applying LDA-based opinion model for product review mining. Tweets are often informal, unstructured and lacking labeled data such as categories and ratings, making it challenging for product opinion mining. In this paper, we propose an LDA-based opinion model named Twitter Opinion Topic Model (TOTM) for opinion mining and sentiment analysis. TOTM leverages hashtags, mentions, emoticons and strong sentiment words that are present in tweets in its discovery process. It improves opinion prediction by modeling the target-opinion interaction directly, thus discovering target specific opinion words, neglected in existing approaches. Moreover, we propose a new formulation of incorporating sentiment prior information into a topic model, by utilizing an existing public sentiment lexicon. This is novel in that it learns and updates with the data. We conduct experiments on 9 million tweets on electronic products, and demonstrate the improved performance of TOTM in both quantitative evaluations and qualitative analysis. We show that aspect-based opinion analysis on massive volume of tweets provides useful opinions on products.

64 citations

Journal ArticleDOI
TL;DR: An unsupervised dependency analysis-based approach is presented to extract Appraisal Expression Patterns (AEPs) from reviews, which represent the manner in which people express opinions regarding products or services and can be regarded as a condensed representation of the syntactic relationship between aspect and sentiment words.
Abstract: With the considerable growth of user-generated content, online reviews are becoming extremely valuable sources for mining customers' opinions on products and services. However, most of the traditional opinion mining methods are coarse-grained and cannot understand natural languages. Thus, aspect-based opinion mining and summarization are of great interest in academic and industrial research. In this paper, we study an approach to extract product and service aspect words, as well as sentiment words, automatically from reviews. An unsupervised dependency analysis-based approach is presented to extract Appraisal Expression Patterns (AEPs) from reviews, which represent the manner in which people express opinions regarding products or services and can be regarded as a condensed representation of the syntactic relationship between aspect and sentiment words. AEPs are high-level, domain-independent types of information, and have excellent domain adaptability. An AEP-based Latent Dirichlet Allocation (AEP-LDA) model is also proposed. This is a sentence-level, probabilistic generative model which assumes that all words in a sentence are drawn from one topic – a generally true assumption, based on our observation. The model also assumes that every review corpus is composed of several mutually corresponding aspect and sentiment topics, as well as a background word topic. The AEP information is incorporated into the AEP-LDA model for mining aspect and sentiment words simultaneously. The experimental results on reviews of restaurants, hotels, MP3 players, and cameras show that the AEP-LDA model outperforms other approaches in identifying aspect and sentiment words.

63 citations

Journal ArticleDOI
TL;DR: This paper presents a fully Bayesian approach for generalized Dirichlet mixtures estimation and selection, based on the Monte Carlo simulation technique of Gibbs sampling mixed with a Metropolis-Hastings step, and obtains a posterior distribution which is conjugate to a generalizedDirichlet likelihood.
Abstract: In this paper, we present a fully Bayesian approach for generalized Dirichlet mixtures estimation and selection. The estimation of the parameters is based on the Monte Carlo simulation technique of Gibbs sampling mixed with a Metropolis-Hastings step. Also, we obtain a posterior distribution which is conjugate to a generalized Dirichlet likelihood. For the selection of the number of clusters, we used the integrated likelihood. The performance of our Bayesian algorithm is tested and compared with the maximum likelihood approach by the classification of several synthetic and real data sets. The generalized Dirichlet mixture is also applied to the problems of IR eye modeling and introduced as a probabilistic kernel for Support Vector Machines.

63 citations
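The conjugacy that such Gibbs samplers rely on can be shown in its simplest (standard Dirichlet) form: with a Dirichlet prior on multinomial parameters, the posterior is again Dirichlet with the observed counts added to the prior. The paper works with the generalized Dirichlet, which enjoys a similar closed-form update; the prior and counts below are illustrative numbers.

```python
# Dirichlet-multinomial conjugacy: posterior = Dirichlet(alpha + counts).
import numpy as np

rng = np.random.default_rng(0)
alpha = np.array([1.0, 1.0, 1.0])   # Dirichlet prior parameters
counts = np.array([10, 2, 5])       # observed multinomial counts
posterior = alpha + counts          # conjugate closed-form update

# Gibbs-style draws of the category probabilities from the posterior
samples = rng.dirichlet(posterior, size=1000)

# The sample mean approaches the posterior mean posterior / posterior.sum()
print(samples.mean(axis=0))
```

This closed-form step is what makes each sweep of the Gibbs sampler cheap; the Metropolis-Hastings step mentioned in the abstract handles the parameters that lack such a conjugate update.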


Network Information
Related Topics (5)
Cluster analysis: 146.5K papers, 2.9M citations, 86% related
Support vector machine: 73.6K papers, 1.7M citations, 86% related
Deep learning: 79.8K papers, 2.1M citations, 85% related
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Convolutional neural network: 74.7K papers, 2M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    323
2022    842
2021    418
2020    429
2019    473
2018    446