
Latent Dirichlet allocation

About: Latent Dirichlet allocation is a research topic. Over its lifetime, 5,351 publications have been published within this topic, receiving 212,555 citations. The topic is also known as: LDA.
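For readers unfamiliar with the model itself: LDA treats each document as a mixture of topics and each topic as a distribution over words. A minimal collapsed Gibbs sampler conveys the idea; the toy corpus, hyperparameters, and variable names below are illustrative assumptions, not taken from any paper on this page.

```python
# Minimal collapsed Gibbs sampler for LDA on a toy corpus.
import random
from collections import defaultdict

docs = [["cat", "dog", "pet", "cat"],
        ["stock", "market", "trade", "stock"],
        ["dog", "pet", "cat", "pet"],
        ["market", "trade", "stock", "market"]]

K, alpha, beta = 2, 0.1, 0.01          # topics, doc-topic prior, topic-word prior
vocab = sorted({w for d in docs for w in d})
V = len(vocab)
random.seed(0)

# z[d][i]: topic of word i in doc d; counts track doc-topic and topic-word usage.
z = [[random.randrange(K) for _ in d] for d in docs]
ndk = [[0] * K for _ in docs]               # doc   -> topic counts
nkw = [defaultdict(int) for _ in range(K)]  # topic -> word counts
nk = [0] * K                                # topic totals
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        t = z[d][i]
        ndk[d][t] += 1; nkw[t][w] += 1; nk[t] += 1

for _ in range(200):                        # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]                     # remove the current assignment
            ndk[d][t] -= 1; nkw[t][w] -= 1; nk[t] -= 1
            # full conditional: p(t) proportional to
            # (ndk + alpha) * (nkw + beta) / (nk + V*beta)
            weights = [(ndk[d][k] + alpha) * (nkw[k][w] + beta) / (nk[k] + V * beta)
                       for k in range(K)]
            t = random.choices(range(K), weights)[0]
            z[d][i] = t
            ndk[d][t] += 1; nkw[t][w] += 1; nk[t] += 1

# Documents about the same theme should end up dominated by the same topic.
top = [max(range(K), key=lambda k: ndk[d][k]) for d in range(len(docs))]
print(top)
```

On this cleanly separated corpus the sampler assigns the two "pets" documents to one topic and the two "markets" documents to the other.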


Papers
Proceedings Article
Liang Yao, Yin Zhang, Baogang Wei, Zhe Jin, Rui Zhang, Yangyang Zhang, Qinfei Chen
12 Feb 2017
TL;DR: This paper proposes a novel knowledge-based topic model that incorporates knowledge graph embeddings into topic modeling, significantly improving semantic coherence and capturing a better representation of a document in the topic space.
Abstract: Probabilistic topic models could be used to extract low-dimensional topics from document collections. However, such models without any human knowledge often produce topics that are not interpretable. In recent years, a number of knowledge-based topic models have been proposed, but they could not process fact-oriented triple knowledge in knowledge graphs. Knowledge graph embeddings, on the other hand, automatically capture relations between entities in knowledge graphs. In this paper, we propose a novel knowledge-based topic model by incorporating knowledge graph embeddings into topic modeling. By combining latent Dirichlet allocation, a widely used topic model, with knowledge encoded by entity vectors, we improve the semantic coherence significantly and capture a better representation of a document in the topic space. Our evaluation results demonstrate the effectiveness of our method.

54 citations
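One simple way entity embeddings can inform topic quality, in the spirit of the paper above, is to score a topic's coherence as the average pairwise similarity of its words' knowledge-graph embeddings. The toy vectors and the scoring mechanism below are illustrative assumptions, not the paper's exact model.

```python
# Score topic coherence with (toy) knowledge-graph entity embeddings.
import math

# Toy 3-d "entity embeddings" (in practice learned by TransE-style methods).
embeddings = {
    "paris":  [0.9, 0.1, 0.0],
    "france": [0.8, 0.2, 0.0],
    "tennis": [0.0, 0.1, 0.9],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def coherence(words):
    """Average pairwise cosine similarity of a topic's word embeddings."""
    vecs = [embeddings[w] for w in words]
    sims = [cosine(u, v) for i, u in enumerate(vecs) for v in vecs[i + 1:]]
    return sum(sims) / len(sims)

# A topic mixing geography with sport is less coherent than a pure one.
print(coherence(["paris", "france"]) > coherence(["paris", "tennis"]))  # True
```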

01 Aug 2012
TL;DR: This article presents a general method to use information retrieved from the Latent Dirichlet Allocation (LDA) topic model for Text Segmentation: using topic assignments instead of words in two well-known Text Segmentation algorithms, namely TextTiling and C99, leads to significant improvements.
Abstract: This article presents a general method to use information retrieved from the Latent Dirichlet Allocation (LDA) topic model for Text Segmentation: Using topic assignments instead of words in two well-known Text Segmentation algorithms, namely TextTiling and C99, leads to significant improvements. Further, we introduce our own algorithm called TopicTiling, which is a simplified version of TextTiling (Hearst, 1997). In our study, we evaluate and optimize parameters of LDA and TopicTiling. A further contribution to improve the segmentation accuracy is obtained through stabilizing topic assignments by using information from all LDA inference iterations. Finally, we show that TopicTiling outperforms previous Text Segmentation algorithms on two widely used datasets, while being computationally less expensive than other algorithms.

54 citations
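The core TopicTiling idea described above can be sketched compactly: represent each sentence by the topic IDs assigned to its words, then look for boundaries where adjacent sentences have dissimilar topic vectors. The topic assignments below are hand-made placeholders, not output of a real LDA inference run.

```python
# Segment boundary detection from topic assignments (TopicTiling-style sketch).
import math
from collections import Counter

def topic_vector(assignments, num_topics):
    """Count how often each topic ID occurs in one sentence's assignments."""
    counts = Counter(assignments)
    return [counts.get(t, 0) for t in range(num_topics)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Topic IDs assigned to the words of four consecutive sentences (illustrative).
sentences = [[0, 0, 1], [0, 1, 0], [2, 2, 3], [3, 2, 2]]
num_topics = 4

# Low similarity between adjacent sentence vectors suggests a segment boundary.
vecs = [topic_vector(s, num_topics) for s in sentences]
gaps = [cosine(vecs[i], vecs[i + 1]) for i in range(len(vecs) - 1)]
boundary = min(range(len(gaps)), key=lambda i: gaps[i])
print(boundary)  # 1: the topics shift between sentences 1 and 2
```

The paper's stabilization trick, using assignments aggregated over all inference iterations rather than a single sample, would simply make these topic vectors less noisy.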

Journal ArticleDOI
TL;DR: In this paper, the authors examined trends in academic research on personal information privacy using Scopus DB and extracted 2356 documents covering journal articles, reviews, book chapters, conference papers and working papers published between 1972 and August 2015.

53 citations

Proceedings ArticleDOI
01 Apr 2014
TL;DR: This work explores topic adaptation on a diverse data set and presents a new bilingual variant of Latent Dirichlet Allocation to compute topic-adapted, probabilistic phrase translation features, and dynamically infer document-specific translation probabilities for test sets of unknown origin.
Abstract: Translating text from diverse sources poses a challenge to current machine translation systems which are rarely adapted to structure beyond corpus level. We explore topic adaptation on a diverse data set and present a new bilingual variant of Latent Dirichlet Allocation to compute topic-adapted, probabilistic phrase translation features. We dynamically infer document-specific translation probabilities for test sets of unknown origin, thereby capturing the effects of document context on phrase translations. We show gains of up to 1.26 BLEU over the baseline and 1.04 over a domain adaptation benchmark. We further provide an analysis of the domain-specific data and show additive gains of our model in combination with other types of topic-adapted features.

53 citations
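The topic-adapted translation features described above boil down to a mixture: a document-specific phrase translation probability is the per-topic probability weighted by the document's inferred topic distribution, roughly p(e | f, doc) = sum over z of p(e | f, z) * p(z | doc). All numbers below are illustrative assumptions.

```python
# Topic-adapted phrase translation probability as a topic mixture (sketch).

# Per-topic probability of translating source phrase f into target phrase e.
p_e_given_f_z = [0.8, 0.2]   # topic 0 favors this translation, topic 1 does not

def adapted_prob(per_topic_probs, doc_topic_dist):
    """Document-specific translation probability as a topic mixture."""
    return sum(p * w for p, w in zip(per_topic_probs, doc_topic_dist))

news_doc = [0.9, 0.1]    # document dominated by topic 0
legal_doc = [0.1, 0.9]   # document dominated by topic 1

print(adapted_prob(p_e_given_f_z, news_doc))   # 0.74
print(adapted_prob(p_e_given_f_z, legal_doc))  # 0.26
```

This is how the same phrase can receive different translation scores depending on the test document's inferred topics, even when its origin domain is unknown.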

Proceedings Article
01 Nov 2011
TL;DR: This paper presents subjLDA, a hierarchical Bayesian model based on latent Dirichlet allocation (LDA) for sentence-level subjectivity detection, which automatically identifies whether a given sentence expresses opinion or states facts.
Abstract: This paper presents a hierarchical Bayesian model based on latent Dirichlet allocation (LDA), called subjLDA, for sentence-level subjectivity detection, which automatically identifies whether a given sentence expresses opinion or states facts. In contrast to most of the existing methods relying on either labelled corpora for classifier training or linguistic pattern extraction for subjectivity classification, we view the problem as weakly-supervised generative model learning, where the only input to the model is a small set of domain independent subjectivity lexical clues. A mechanism is introduced to incorporate the prior information about the subjectivity lexical clues into model learning by modifying the Dirichlet priors of topic-word distributions. The subjLDA model has been evaluated on the Multi-Perspective Question Answering (MPQA) dataset and promising results have been observed in the preliminary experiments. We have also explored adding neutral words as prior information for model learning. It was found that while incorporating subjectivity clues bearing positive or negative polarity can achieve a significant performance gain, the prior lexical information from neutral words is less effective.

53 citations
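The prior-seeding mechanism described in the abstract above, incorporating lexical clues by modifying the Dirichlet priors of topic-word distributions, can be sketched as follows. The vocabulary, clue list, and prior values are illustrative assumptions, not the paper's actual settings.

```python
# Seed a "subjective" topic's Dirichlet prior with subjectivity clue words.
vocab = ["great", "terrible", "report", "said", "love"]
subjectivity_clues = {"great", "terrible", "love"}

num_topics = 2    # topic 0: subjective, topic 1: objective
base_beta = 0.01  # symmetric base prior over the vocabulary
boost = 1.0       # extra prior mass for clue words under the subjective topic

# Start from a symmetric prior, then add mass for clue words in topic 0 only.
beta = [[base_beta] * len(vocab) for _ in range(num_topics)]
for w, word in enumerate(vocab):
    if word in subjectivity_clues:
        beta[0][w] += boost

print(beta[0])  # clue words carry far more prior mass in the subjective topic
```

During inference, the boosted prior pulls clue words toward the subjective topic, which in turn pulls the sentences containing them toward a subjective label.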


Network Information

Related Topics (5):
Cluster analysis: 146.5K papers, 2.9M citations, 86% related
Support vector machine: 73.6K papers, 1.7M citations, 86% related
Deep learning: 79.8K papers, 2.1M citations, 85% related
Feature extraction: 111.8K papers, 2.1M citations, 84% related
Convolutional neural network: 74.7K papers, 2M citations, 83% related
Performance Metrics

No. of papers in the topic in previous years:

Year  Papers
2023     323
2022     842
2021     418
2020     429
2019     473
2018     446