scispace - formally typeset

Latent Dirichlet allocation

About: Latent Dirichlet allocation is a research topic. Over its lifetime, 5,351 publications have been published within this topic, receiving 212,555 citations. The topic is also known as: LDA.
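For readers new to the topic, a minimal LDA fit can be sketched with scikit-learn (assumed available); the toy corpus, topic count, and random seed below are illustrative choices, not drawn from any paper on this page:

```python
# Minimal LDA sketch: fit a 2-topic model to a toy bag-of-words corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "topic models learn latent themes from text",
    "dirichlet priors control topic sparsity",
    "gibbs sampling infers topic assignments",
    "variational inference approximates the posterior",
]

# LDA operates on word-count matrices, so vectorize to counts first.
X = CountVectorizer().fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)  # one topic-mixture row per document
```

Each row of `doc_topics` is a normalized distribution over the two topics, which is the "documents as mixtures of topics" view that the papers below build on.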


Papers
Patent
19 Oct 2010
TL;DR: In this patent, a topic model defining a set of topics is inferred by performing latent Dirichlet allocation (LDA) with an Indian Buffet Process (IBP) compound Dirichlet prior probability distribution.
Abstract: In an inference system for organizing a corpus of objects, feature representations are generated comprising distributions over a set of features corresponding to the objects. A topic model defining a set of topics is inferred by performing latent Dirichlet allocation (LDA) with an Indian Buffet Process (IBP) compound Dirichlet prior probability distribution. The inference is performed using a collapsed Gibbs sampling algorithm by iteratively sampling (1) topic allocation variables of the LDA and (2) binary activation variables of the IBP compound Dirichlet prior. In some embodiments the inference is configured such that each inferred topic model is a clean topic model with topics defined as distributions over sub-sets of the set of features selected by the prior. In some embodiments the inference is configured such that the inferred topic model associates a focused sub-set of the set of topics to each object of the training corpus.

46 citations
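Step (1) of the patent's sampler, iteratively resampling topic-allocation variables, can be illustrated with a toy collapsed Gibbs sampler for *standard* LDA; the IBP compound Dirichlet prior and its binary activation variables (step 2) are not reproduced here, and the corpus, `alpha`, and `beta` are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
docs = [[0, 1, 2, 1], [2, 3, 3, 0], [1, 1, 2, 3]]  # word ids per document
V, K, alpha, beta = 4, 2, 0.1, 0.01                # vocab size, topics, priors

ndk = np.zeros((len(docs), K))    # document-topic counts
nkw = np.zeros((K, V))            # topic-word counts
nk = np.zeros(K)                  # tokens per topic
z = [[0] * len(d) for d in docs]  # topic assignment per token

for d, doc in enumerate(docs):    # random initialization
    for i, w in enumerate(doc):
        k = int(rng.integers(K))
        z[d][i] = k
        ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

for _ in range(50):               # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]           # remove the token's current assignment
            ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
            # collapsed conditional p(z = k | rest), theta and phi integrated out
            p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
            k = int(rng.choice(K, p=p / p.sum()))
            z[d][i] = k
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
```

The count arrays stay consistent with the assignments throughout, which is what makes the collapsed conditional cheap to evaluate at every token.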

Proceedings Article
16 Jun 2013
TL;DR: This work extends LDA by drawing topics from a Dirichlet process whose base distribution is a distribution over all strings, rather than from a finite Dirichlet, and proposes heuristics to dynamically order, expand, and contract the set of words considered in the vocabulary.
Abstract: Topic models based on latent Dirichlet allocation (LDA) assume a predefined vocabulary. This is reasonable in batch settings but not reasonable for streaming and online settings. To address this lacuna, we extend LDA by drawing topics from a Dirichlet process whose base distribution is a distribution over all strings rather than from a finite Dirichlet. We develop inference using online variational inference and -- to only consider a finite number of words for each topic -- propose heuristics to dynamically order, expand, and contract the set of words we consider in our vocabulary. We show our model can successfully incorporate new words and that it performs better than topic models with finite vocabularies in evaluations of topic quality and classification performance.

46 citations
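The online variational inference this paper builds on can be sketched with scikit-learn's `partial_fit` for plain LDA; note the contrast with the paper, since here the vocabulary must be fixed up front, which is exactly the limitation the infinite-vocabulary model removes. The stream and parameters are illustrative:

```python
# Online variational updates for fixed-vocabulary LDA, one minibatch at a time.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

stream = [
    ["new words arrive in streaming text", "online updates process minibatches"],
    ["topic models adapt to incoming documents", "streaming text needs online inference"],
]

# Vocabulary fixed in advance -- the step the paper's model avoids.
vocab = sorted({w for batch in stream for doc in batch for w in doc.split()})
vec = CountVectorizer(vocabulary=vocab)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
for batch in stream:                  # one stochastic variational update per batch
    lda.partial_fit(vec.transform(batch))

topic_word = lda.components_          # unnormalized topic-word weights
```

Each `partial_fit` call performs a stochastic variational update from that minibatch alone, so documents never need to be revisited, matching the streaming setting the abstract describes.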

Journal ArticleDOI
TL;DR: The proposed unsupervised framework provides an effective and efficient data mining solution for developing a deep and comprehensive understanding of drivers’ behavioral characteristics, which will benefit the development of AVs and ADASs.

46 citations

Journal ArticleDOI
TL;DR: This work explores the application of probabilistic latent variable models to microbiome data, with a focus on Latent Dirichlet allocation, Non-negative matrix factorization, and Dynamic Unigram models and develops guidelines for when different methods are appropriate.
Abstract: The human microbiome is a complex ecological system, and describing its structure and function under different environmental conditions is important from both basic scientific and medical perspectives. Viewed through a biostatistical lens, many microbiome analysis goals can be formulated as latent variable modeling problems. However, although probabilistic latent variable models are a cornerstone of modern unsupervised learning, they are rarely applied in the context of microbiome data analysis, in spite of the evolutionary, temporal, and count structure that could be directly incorporated through such models. We explore the application of probabilistic latent variable models to microbiome data, with a focus on Latent Dirichlet allocation, Non-negative matrix factorization, and Dynamic Unigram models. To develop guidelines for when different methods are appropriate, we perform a simulation study. We further illustrate and compare these techniques using the data of Dethlefsen and Relman (2011, Incomplete recovery and individualized responses of the human distal gut microbiota to repeated antibiotic perturbation. Proceedings of the National Academy of Sciences, 108, 4554-4561), a study on the effects of antibiotics on bacterial community composition. Code and data for all simulations and case studies are available publicly.

46 citations
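The paper's LDA-versus-NMF comparison on count data can be sketched as follows; the synthetic taxon-count matrix and factor count are illustrative, and the Dynamic Unigram model is omitted since it has no off-the-shelf scikit-learn analogue:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation, NMF

# Synthetic stand-in for a microbiome count table: samples x taxa.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=3.0, size=(20, 10))

lda = LatentDirichletAllocation(n_components=3, random_state=0)
nmf = NMF(n_components=3, init="nndsvda", random_state=0, max_iter=500)

lda_scores = lda.fit_transform(counts)  # per-sample topic mixtures (rows sum to 1)
nmf_scores = nmf.fit_transform(counts)  # nonnegative loadings, not normalized
```

The contrast visible here motivates the paper's guidelines: LDA returns probability distributions over latent communities, while NMF returns unconstrained nonnegative loadings, so the two factorizations answer subtly different questions about the same count table.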

Journal ArticleDOI
TL;DR: Two supervised topic models for multi-label classification problems are developed that outperform state-of-the-art approaches; they extend Latent Dirichlet Allocation (LDA) via two observations: the frequencies of the labels and the dependencies among different labels.

45 citations


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
86% related
Support vector machine
73.6K papers, 1.7M citations
86% related
Deep learning
79.8K papers, 2.1M citations
85% related
Feature extraction
111.8K papers, 2.1M citations
84% related
Convolutional neural network
74.7K papers, 2M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    323
2022    842
2021    418
2020    429
2019    473
2018    446