Topic

Latent Dirichlet allocation

About: Latent Dirichlet allocation is a research topic. Over its lifetime, 5,351 publications have been published within this topic, receiving 212,555 citations. The topic is also known as LDA.

Papers published on a yearly basis (chart; years shown range from 1965 to 2023)

Papers

Proceedings Article

Accounting for burstiness in topic models

Gabriel Doyle¹, Charles Elkan¹

University of California, San Diego¹

14 Jun 2009

TL;DR: A topic model is introduced that uses Dirichlet compound multinomial (DCM) distributions to model burstiness, the tendency of a word that appears once in a document to appear again, and achieves better held-out likelihood than standard latent Dirichlet allocation (LDA).

Abstract: Many different topic models have been used successfully for a variety of applications. However, even state-of-the-art topic models suffer from the important flaw that they do not capture the tendency of words to appear in bursts; it is a fundamental property of language that if a word is used once in a document, it is more likely to be used again. We introduce a topic model that uses Dirichlet compound multinomial (DCM) distributions to model this burstiness phenomenon. On both text and non-text datasets, the new model achieves better held-out likelihood than standard latent Dirichlet allocation (LDA). It is straightforward to incorporate the DCM extension into topic models that are more complex than LDA.

104 citations

Proceedings Article

Image retrieval on large-scale image databases

Eva Hörster¹, Rainer Lienhart¹, Malcolm Slaney²

University of Augsburg¹, Yahoo!²

09 Jul 2007

TL;DR: This work studies the representation of images by Latent Dirichlet Allocation (LDA) models for content-based image retrieval, and shows the suitability of the approach for large-scale databases.

Abstract: Online image repositories such as Flickr contain hundreds of millions of images and are growing quickly. Along with that, the need to support indexing, searching and browsing is becoming more and more pressing. In this work we employ the image content as a source of information to retrieve images. We study the representation of images by Latent Dirichlet Allocation (LDA) models for content-based image retrieval. Image representations are learned in an unsupervised fashion, and each image is modeled as the mixture of topics/object parts depicted in the image. This allows us to put images into subspaces for higher-level reasoning, which in turn can be used to find similar images. Different similarity measures based on the described image representation are studied. The presented approach is evaluated on a real-world image database consisting of more than 246,000 images and compared to image models based on probabilistic Latent Semantic Analysis (pLSA). Results show the suitability of the approach for large-scale databases. Finally, we incorporate active learning with user relevance feedback in our framework, which further boosts the retrieval performance.

104 citations

Proceedings Article

Cross-Cultural Analysis of Blogs and Forums with Mixed-Collection Topic Models

Michael J. Paul¹, Roxana Girju¹

University of Illinois at Urbana–Champaign¹

06 Aug 2009

TL;DR: A new model, ccLDA, is proposed, which extends the Latent Dirichlet Allocation (LDA) and cross-collection mixture (ccMix) models on blogs and forums, and a qualitative and quantitative analysis of the model on the cross-cultural data is provided.

Abstract: This paper presents preliminary results on the detection of cultural differences from people's experiences in various countries from two perspectives: tourists and locals. Our approach is to develop probabilistic models that would provide a good framework for such studies. Thus, we propose here a new model, ccLDA, which extends the Latent Dirichlet Allocation (LDA) (Blei et al., 2003) and cross-collection mixture (ccMix) (Zhai et al., 2004) models on blogs and forums. We also provide a qualitative and quantitative analysis of the model on the cross-cultural data.

103 citations

Book Chapter

Music Recommender Systems

Markus Schedl¹, Peter Knees¹, Brian McFee², Dmitry Bogdanov³, Marius Kaminskas⁴

Johannes Kepler University of Linz¹, New York University², Pompeu Fabra University³, University College Cork⁴

01 Jan 2015

TL;DR: This chapter gives an introduction to music recommender systems research, highlighting the distinctive characteristics of music, as compared to other kinds of media, and pointing to the most important challenges faced by music recommendation research.

Abstract: This chapter gives an introduction to music recommender systems research. We highlight the distinctive characteristics of music, as compared to other kinds of media. We then provide a literature survey of content-based music recommendation, contextual music recommendation, hybrid methods, and sequential music recommendation, followed by an overview of evaluation strategies and commonly used data sets. We conclude by pointing to the most important challenges faced by music recommendation research.

103 citations

Proceedings Article

Recurrent neural network language model adaptation for multi-genre broadcast speech recognition

Xie Chen¹, Tian Tan², Xunying Liu³, Pierre Lanchantin³, Moquan Wan³, Mark J. F. Gales³, Philip C. Woodland³

California Institute of Technology¹, Shanghai Jiao Tong University², University of Cambridge³

01 Jan 2015

TL;DR: Experiments using a state-of-the-art LVCSR system showed adaptation could yield relative perplexity reductions of 8% over the baseline RNNLM and small but consistent word error rate reductions.

Abstract: Copyright © 2015 ISCA. Recurrent neural network language models (RNNLMs) have recently become increasingly popular for many applications including speech recognition. In previous research RNNLMs have normally been trained on well-matched in-domain data. The adaptation of RNNLMs remains an open research area to be explored. In this paper, genre and topic based RNNLM adaptation techniques are investigated for a multi-genre broadcast transcription task. A number of techniques including Probabilistic Latent Semantic Analysis, Latent Dirichlet Allocation and Hierarchical Dirichlet Processes are used to extract show-level topic information. These were then used as additional input to the RNNLM during training, which can facilitate unsupervised test-time adaptation. Experiments using a state-of-the-art LVCSR system trained on 1000 hours of speech and more than 1 billion words of text showed adaptation could yield relative perplexity reductions of 8% over the baseline RNNLM and small but consistent word error rate reductions.

102 citations

Network Information

Performance Metrics

Papers: 6,513
Citations: 245,225

No. of papers in the topic in previous years
Year	Papers
2023	323
2022	842
2021	418
2020	429
2019	473
2018	446
