Topic

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over its lifetime, 2,884 publications have been published within this topic, receiving 198,341 citations. The topic is also known as: PLSA.


Papers
Journal ArticleDOI
TL;DR: An interesting application of SVD to text documents is described, where words of similar meaning get mapped to similar low-dimensional locations by taking the top K singular values/vectors.
Abstract: We now describe an interesting application of SVD to text documents. Suppose we represent documents as a bag of words, so X_ij is the number of times word j occurs in document i, for j = 1 : W and i = 1 : D, where W is the number of words and D is the number of documents. To find a document that contains a given word, we can use standard search procedures, but these can get confused by synonymy (different words with the same meaning) and polysemy (same word with different meanings). An alternative approach is to assume that X was generated by some low-dimensional latent representation X̂ ∈ IR^{D×K}, where K is the number of latent dimensions. If we compare documents in the latent space, we should get improved retrieval performance, because words of similar meaning get mapped to similar low-dimensional locations. We can compute a low-dimensional representation of X by computing the SVD and then taking the top K singular values/vectors: X̂ = U_K S_K V_K^T.
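A minimal sketch of this truncated-SVD (latent semantic analysis) construction in NumPy; the toy count matrix and the choice K = 2 are invented for illustration:

import numpy as np

# Toy document-term count matrix X: D=4 documents, W=6 words
# (X[i, j] = count of word j in document i; values invented).
X = np.array([[2, 1, 0, 0, 1, 0],
              [1, 2, 0, 0, 0, 1],
              [0, 0, 3, 1, 0, 0],
              [0, 0, 1, 2, 0, 0]], dtype=float)

K = 2  # number of latent dimensions

# SVD: X = U @ diag(s) @ Vt; keep the top K singular values/vectors.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_hat = U[:, :K] @ np.diag(s[:K]) @ Vt[:K, :]  # rank-K approximation

# Latent document coordinates are the rows of U_K S_K; retrieval
# compares documents (and folded-in queries) here, e.g. by cosine.
doc_latent = U[:, :K] * s[:K]
a, b = doc_latent[0], doc_latent[1]
print(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))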

93 citations

Proceedings ArticleDOI
24 Aug 2008
TL;DR: This work proposes a visualization method based on a topic model for discrete data such as documents; a visualization is obtained by fitting the model to a given set of documents with the EM algorithm, so that documents with similar topics are embedded close together.
Abstract: We propose a visualization method based on a topic model for discrete data such as documents. Unlike conventional visualization methods based on pairwise distances such as multi-dimensional scaling, we consider a mapping from the visualization space into the space of documents as a generative process of documents. In the model, both documents and topics are assumed to have latent coordinates in a two- or three-dimensional Euclidean space, or visualization space. The topic proportions of a document are determined by the distances between the document and the topics in the visualization space, and each word is drawn from one of the topics according to its topic proportions. A visualization, i.e. latent coordinates of documents, can be obtained by fitting the model to a given set of documents using the EM algorithm, resulting in documents with similar topics being embedded close together. We demonstrate the effectiveness of the proposed model by visualizing document and movie data sets, and quantitatively compare it with conventional visualization methods.
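As a rough sketch of the generative step described above (not the authors' code): one natural way, assumed here, to turn visualization-space distances into topic proportions is a softmax of negative squared distances. All sizes and parameters below are invented:

import numpy as np

rng = np.random.default_rng(0)

D, Z, V = 5, 3, 50   # documents, topics, vocabulary size

x = rng.normal(size=(D, 2))               # latent document coordinates (2-D)
phi = rng.normal(size=(Z, 2))             # latent topic coordinates
beta = rng.dirichlet(np.ones(V), size=Z)  # per-topic word distributions

# Topic proportions from distances in the visualization space:
# topics closer to a document get higher probability.
sq_dist = ((x[:, None, :] - phi[None, :, :]) ** 2).sum(-1)  # (D, Z)
logits = -0.5 * sq_dist
theta = np.exp(logits - logits.max(axis=1, keepdims=True))
theta /= theta.sum(axis=1, keepdims=True)                   # (D, Z)

# Generate one word for document d: pick a topic, then a word.
d = 0
z = rng.choice(Z, p=theta[d])
w = rng.choice(V, p=beta[z])
print(theta[d], z, w)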

93 citations

Journal ArticleDOI
TL;DR: The use of latent semantic indexing (LSI) in conjunction with normalization and term weighting for content-based image retrieval is examined, using two different approaches to image feature representation.

92 citations

Proceedings ArticleDOI
01 Jun 2015
TL;DR: An unsupervised topic model for short texts that performs soft clustering over distributed representations of words, using Gaussian mixture models whose components capture the notion of latent topics; it outperforms LDA on short texts in both subjective and objective evaluation.
Abstract: We present an unsupervised topic model for short texts that performs soft clustering over distributed representations of words. We model the low-dimensional semantic vector space represented by the dense distributed representations of words using Gaussian mixture models (GMMs) whose components capture the notion of latent topics. While conventional topic modeling schemes such as probabilistic latent semantic analysis (pLSA) and latent Dirichlet allocation (LDA) need aggregation of short messages to avoid data sparsity in short documents, our framework works on large amounts of raw short texts (billions of words). In contrast with other topic modeling frameworks that use word co-occurrence statistics, our framework uses a vector space model that overcomes the issue of sparse word co-occurrence patterns. We demonstrate that our framework outperforms LDA on short texts through both subjective and objective evaluation. We also show the utility of our framework in learning topics and classifying short texts on Twitter data for English, Spanish, French, Portuguese and Russian.
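A minimal sketch of this soft-clustering idea, assuming scikit-learn's GaussianMixture as the GMM and random vectors standing in for real pre-trained word embeddings; the tiny vocabulary and the word-averaging step for a short text are illustrative assumptions, not the paper's exact pipeline:

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-in for pre-trained word embeddings (e.g. word2vec vectors);
# real usage would load vectors for a large vocabulary.
vocab = ["goal", "match", "team", "stock", "market", "shares"]
vectors = rng.normal(size=(len(vocab), 20))

# Each Gaussian component plays the role of a latent topic.
gmm = GaussianMixture(n_components=2, covariance_type="diag", random_state=0)
gmm.fit(vectors)

# Soft topic assignments per word (component responsibilities).
resp = gmm.predict_proba(vectors)  # shape (n_words, n_topics)

# A short text's topic distribution: average its words' responsibilities.
text = ["stock", "market", "team"]
idx = [vocab.index(w) for w in text]
print(resp[idx].mean(axis=0))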

92 citations

Proceedings Article
04 Nov 2016
TL;DR: PixelVAE as discussed by the authors is a VAE model with an autoregressive decoder based on PixelCNN, which achieves state-of-the-art performance on binarized MNIST, competitive performance on 64x64 ImageNet, and high-quality samples on the LSUN bedrooms dataset.
Abstract: Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and model global structure well but have difficulty capturing small details. PixelCNN models details very well, but lacks a latent code and is difficult to scale for capturing large structures. We present PixelVAE, a VAE model with an autoregressive decoder based on PixelCNN. Our model requires very few expensive autoregressive layers compared to PixelCNN and learns latent codes that are more compressed than a standard VAE while still capturing most non-trivial structure. Finally, we extend our model to a hierarchy of latent variables at different scales. Our model achieves state-of-the-art performance on binarized MNIST, competitive performance on 64x64 ImageNet, and high-quality samples on the LSUN bedrooms dataset.
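A hypothetical minimal sketch of the general idea in PyTorch, not the authors' architecture: a small convolutional VAE whose decoder ends in a few masked (PixelCNN-style, autoregressive) convolutions conditioned on a feature map decoded from the latent code. All layer sizes are invented, and the 28x28 single-channel setup loosely mirrors binarized MNIST:

import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConv2d(nn.Conv2d):
    # PixelCNN-style convolution: each output pixel depends only on
    # pixels above it and to its left. Mask type 'A' also hides the
    # centre pixel (used in the first autoregressive layer).
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        _, _, kh, kw = self.weight.shape
        mask = torch.ones_like(self.weight)
        mask[:, :, kh // 2, kw // 2 + (mask_type == 'B'):] = 0
        mask[:, :, kh // 2 + 1:] = 0
        self.register_buffer('mask', mask)

    def forward(self, x):
        return F.conv2d(x, self.weight * self.mask, self.bias,
                        self.stride, self.padding)

class TinyPixelVAE(nn.Module):
    # Encoder -> latent z -> deconv decoder -> a few cheap masked
    # convolutions that combine the decoded feature map with the input.
    def __init__(self, zdim=32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 32, 4, 2, 1), nn.ReLU(),    # 28 -> 14
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),   # 14 -> 7
            nn.Flatten(), nn.Linear(64 * 7 * 7, 2 * zdim))
        self.dec = nn.Sequential(
            nn.Linear(zdim, 64 * 7 * 7), nn.Unflatten(1, (64, 7, 7)),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),  # 7 -> 14
            nn.ConvTranspose2d(32, 16, 4, 2, 1), nn.ReLU())  # 14 -> 28
        self.ar_in = MaskedConv2d('A', 1, 16, 7, padding=3)
        self.ar_out = MaskedConv2d('B', 16, 1, 7, padding=3)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparam.
        cond = self.dec(z)                        # (B, 16, 28, 28)
        h = F.relu(self.ar_in(x) + cond)          # autoregressive + latent
        return self.ar_out(h), mu, logvar         # Bernoulli logits

# Usage: ELBO = reconstruction + KL, on a fake binarized batch.
model = TinyPixelVAE()
x = torch.rand(8, 1, 28, 28).bernoulli()
logits, mu, logvar = model(x)
rec = F.binary_cross_entropy_with_logits(logits, x, reduction='sum')
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
loss = (rec + kl) / x.size(0)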

92 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations (84% related)
Feature (computer vision): 128.2K papers, 1.7M citations (84% related)
Support vector machine: 73.6K papers, 1.7M citations (84% related)
Deep learning: 79.8K papers, 2.1M citations (83% related)
Object detection: 46.1K papers, 1.3M citations (82% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    19
2022    77
2021    14
2020    36
2019    27
2018    58