Latent Dirichlet allocation

About: Latent Dirichlet allocation, also known as LDA, is a research topic. Over its lifetime, 5,351 publications have been published within this topic, receiving 212,555 citations.

Papers published on a yearly basis

[Chart omitted; x-axis years span 1965 to 2023.]

Papers

Proceedings Article

The Inverse Regression Topic Model

Maxim Rabinovich¹, David M. Blei²

University of Cambridge¹, Princeton University²

21 Jun 2014

TL;DR: This paper introduces the inverse regression topic model (IRTM), a mixed-membership extension of MNIR that combines the strengths of both methodologies, and presents two inference algorithms for the IRTM: an efficient batch estimation algorithm and an online variant, which is suitable for large corpora.

Abstract: Taddy (2013) proposed multinomial inverse regression (MNIR) as a new model of annotated text based on the influence of metadata and response variables on the distribution of words in a document. While effective, MNIR has no way to exploit structure in the corpus to improve its predictions or facilitate exploratory data analysis. On the other hand, traditional probabilistic topic models (like latent Dirichlet allocation) capture natural heterogeneity in a collection but do not account for external variables. In this paper, we introduce the inverse regression topic model (IRTM), a mixed-membership extension of MNIR that combines the strengths of both methodologies. We present two inference algorithms for the IRTM: an efficient batch estimation algorithm and an online variant, which is suitable for large corpora. We apply these methods to a corpus of 73K Congressional press releases and another of 150K Yelp reviews, demonstrating that the IRTM outperforms both MNIR and supervised topic models on the prediction task. Further, we give examples showing that the IRTM enables systematic discovery of in-topic lexical variation, which is not possible with previous supervised topic models.

45 citations

Journal Article

Bridging the gap: Incorporating a semantic similarity measure for effectively mapping PubMed queries to documents.

Sun Kim¹, Nicolas Fiorini¹, W. John Wilbur¹, Zhiyong Lu¹

National Institutes of Health¹

03 Oct 2017-Journal of Biomedical Informatics

TL;DR: A query-document similarity measure motivated by the Word Mover's Distance that helps identify related words when no direct matches are found between a query and a document.

45 citations

Journal Article

On selecting a prior for the precision parameter of Dirichlet process mixture models

Robert M. Dorazio¹,²

University of Florida¹, United States Geological Survey²

01 Sep 2009-Journal of Statistical Planning and Inference

TL;DR: In this paper, an approach is developed for computing a prior for the precision parameter α that can be used in the presence or absence of prior information about the level of clustering.

45 citations

Journal Article

Topic modeling for expert finding using latent Dirichlet allocation

Saeedeh Momtazi¹, Felix Naumann¹

Hasso Plattner Institute¹

01 Sep 2013-Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery

TL;DR: The experimental results show that the proposed topic-based approach, which ranks the experts in the search space given a field of expertise as an input query, outperforms the state-of-the-art profile- and document-based models that use information retrieval methods to rank experts.

Abstract: The task of expert finding is to rank the experts in the search space given a field of expertise as an input query. In this paper, we propose a topic modeling approach for this task. The proposed model uses latent Dirichlet allocation (LDA) to induce probabilistic topics. In the first step of our algorithm, the main topics of a document collection are extracted using LDA. The extracted topics present the connection between expert candidates and user queries. In the second step, the topics are used as a bridge to find the probability of selecting each candidate for a given query. The candidates are then ranked based on these probabilities. The experimental results on the Text REtrieval Conference (TREC) Enterprise track for 2005 and 2006 show that the proposed topic-based approach outperforms the state-of-the-art profile- and document-based models, which use information retrieval methods to rank experts. Moreover, we present the superiority of the proposed topic-based approach to the improved document-based expert finding systems, which consider additional information such as local context, candidate prior, and query expansion.
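The second step described in this abstract, using topics as a bridge between a query and the expert candidates, can be read as a marginalisation over topics: p(candidate | query) = Σ_z p(candidate | z) · p(z | query). The abstract does not state this formula, so the sketch below is one plausible reading and not the authors' method. The function name `rank_candidates`, the candidate names, and all probabilities are hypothetical; in practice both distributions would come from a fitted LDA model.

```python
def rank_candidates(p_topic_given_query, p_candidate_given_topic):
    """Rank candidates by sum_z p(candidate | z) * p(z | query).

    p_topic_given_query: list of topic probabilities for the query.
    p_candidate_given_topic: dict mapping a candidate name to a list of
        p(candidate | topic) values, one per topic (assumed inputs).
    """
    scores = {}
    for candidate, per_topic in p_candidate_given_topic.items():
        # Marginalise over topics: the topics act as the "bridge".
        scores[candidate] = sum(
            p_c_z * p_z_q
            for p_c_z, p_z_q in zip(per_topic, p_topic_given_query)
        )
    # Highest probability first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy values: three topics, three candidates.
# Each topic's column sums to 1 across candidates.
p_topic_given_query = [0.7, 0.2, 0.1]
p_candidate_given_topic = {
    "alice": [0.6, 0.1, 0.3],
    "bob": [0.3, 0.5, 0.2],
    "carol": [0.1, 0.4, 0.5],
}

ranking = rank_candidates(p_topic_given_query, p_candidate_given_topic)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Because each topic's candidate probabilities sum to one in this toy input, the resulting scores also sum to one and can be read as a distribution over candidates for the query.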

45 citations

Book Chapter

Predicting friendship links in social networks using a topic modeling approach

Rohit Parimi¹, Doina Caragea¹

Kansas State University¹

24 May 2011

TL;DR: A topic modeling approach to the problem of predicting new friendships based on interests and existing friendships is proposed, using Latent Dirichlet Allocation to model user interests and, thus, an implicit interest ontology is created.

Abstract: In the recent years, the number of social network users has increased dramatically. The resulting amount of data associated with users of social networks has created great opportunities for data mining problems. One data mining problem of interest for social networks is the friendship link prediction problem. Intuitively, a friendship link between two users can be predicted based on their common friends and interests. However, using user interests directly can be challenging, given the large number of possible interests. In the past, approaches that make use of an explicit user interest ontology have been proposed to tackle this problem, but the construction of the ontology proved to be computationally expensive and the resulting ontology was not very useful. As an alternative, we propose a topic modeling approach to the problem of predicting new friendships based on interests and existing friendships. Specifically, we use Latent Dirichlet Allocation (LDA) to model user interests and, thus, we create an implicit interest ontology. We construct features for the link prediction problem based on the resulting topic distributions. Experimental results on several LiveJournal data sets of varying sizes show the usefulness of the LDA features for predicting friendships.

45 citations

Network Information

Performance

Metrics

Papers: 6,513
Citations: 245,225

No. of papers in the topic in previous years
Year	Papers
2023	323
2022	842
2021	418
2020	429
2019	473
2018	446
