Topic
Probabilistic latent semantic analysis
About: Probabilistic latent semantic analysis is a research topic. Over its lifetime, 2,884 publications have appeared on this topic, receiving 198,341 citations. The topic is also known as: PLSA.
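The topic itself can be summarized with a short sketch: pLSA factorizes the document-word co-occurrence distribution as P(d, w) = Σ_z P(z) P(d|z) P(w|z) and fits the factors by EM. Below is a minimal NumPy sketch of that procedure (function and variable names are illustrative, not taken from any listed paper):

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM on a (docs x words) count matrix.

    Decomposes P(d, w) = sum_z P(z) P(d|z) P(w|z).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random normalized initialization of the three factors
    p_z = np.full(n_topics, 1.0 / n_topics)                      # P(z)
    p_d_z = rng.random((n_topics, n_docs))                       # P(d|z)
    p_d_z /= p_d_z.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))                      # P(w|z)
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: posterior P(z|d,w), shape (z, d, w)
        joint = p_z[:, None, None] * p_d_z[:, :, None] * p_w_z[:, None, :]
        joint /= joint.sum(axis=0, keepdims=True) + 1e-12
        # M-step: reweight the posterior by observed counts n(d,w)
        weighted = joint * counts[None, :, :]
        p_w_z = weighted.sum(axis=1)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_d_z = weighted.sum(axis=2)
        p_d_z /= p_d_z.sum(axis=1, keepdims=True) + 1e-12
        p_z = weighted.sum(axis=(1, 2))
        p_z /= p_z.sum()
    return p_z, p_d_z, p_w_z
```

Unlike LDA, pLSA places no Dirichlet prior on the topic mixtures, which is the main modeling difference between the two approaches discussed in the papers below.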
Papers published on a yearly basis
Papers
TL;DR: A semantically enhanced document retrieval system that describes each retrieved document with an ontological, multi-grained network of extracted concepts, together with a SKOS-based ontology created ad hoc for the document corpus, enabling exploration of the concepts at different granularity levels.
19 citations
03 Apr 2017
TL;DR: The experimental results indicate that probabilistic distance measures outperform vector-based distance measures, including Euclidean distance, when clustering a set of documents in the topic space.
Abstract: This paper evaluates, through an empirical study, eight different distance measures used with the LDA + K-means model. We performed our analysis on two commonly used miscellaneous datasets. Our experimental results indicate that probabilistic distance measures outperform vector-based distance measures, including Euclidean distance, when clustering a set of documents in the topic space. Moreover, we investigate the effect of the number of topics and show that K-means combined with the results of the Latent Dirichlet Allocation model yields better results than LDA + Naive and the Vector Space Model.
19 citations
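The distinction the paper draws can be sketched concretely: a probabilistic measure such as the Jensen-Shannon distance treats documents as topic distributions, while Euclidean distance treats them as plain vectors. The helper below and the example distributions are our own illustration, not taken from the paper:

```python
import numpy as np

def jensen_shannon(p, q, eps=1e-12):
    """Jensen-Shannon distance (square root of the JS divergence):
    a symmetric, probabilistic distance between two topic distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))   # Kullback-Leibler divergence
    return np.sqrt(0.5 * kl(p, m) + 0.5 * kl(q, m))

# Hypothetical document-topic distributions from an LDA model
doc_a = [0.70, 0.20, 0.10]
doc_b = [0.55, 0.35, 0.10]
print(jensen_shannon(doc_a, doc_b))                        # probabilistic distance
print(np.linalg.norm(np.array(doc_a) - np.array(doc_b)))   # Euclidean distance
```

Because topic vectors live on the probability simplex, divergence-based measures like this respect their geometry in a way that raw Euclidean distance does not, which is consistent with the paper's finding.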
TL;DR: A flexible multi-task learning framework that identifies latent grouping structures under agnostic settings, where the prior of the latent subspace is unknown to the learner, with proofs of theoretical guarantees on learning performance.
19 citations
TL;DR: This paper proposes a novel transfer learning method, referred to as Multi-Bridge Transfer Learning (MBTL), that learns the distributions in different latent spaces together, and presents an iterative algorithm with a convergence guarantee to solve MBTL.
Abstract: Highlights:
- MBTL constructs multiple latent spaces to exploit more common latent factors.
- MBTL reduces the discrepancies of the distributions in different latent spaces.
- To solve MBTL, we present an iterative algorithm with a convergence guarantee.
- MBTL outperforms state-of-the-art learning methods on several datasets.

Transfer learning, which aims to exploit knowledge from the source domains to promote the learning tasks in the target domains, has attracted extensive research interest recently. The general idea of previous approaches is to model the shared structure in one latent space as the bridge across domains by reducing the distribution divergences. However, latent factors in other latent spaces can also be utilized to draw the corresponding distributions closer and establish additional bridges. In this paper, we propose a novel transfer learning method, referred to as Multi-Bridge Transfer Learning (MBTL), to learn the distributions in the different latent spaces together, so that more shared latent factors can be utilized to transfer knowledge. Additionally, an iterative algorithm with a convergence guarantee, based on non-negative matrix tri-factorization techniques, is proposed to solve the optimization problem. Comprehensive experiments demonstrate that MBTL can significantly outperform state-of-the-art learning methods on topic and sentiment classification tasks.
19 citations
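The building block MBTL relies on, non-negative matrix tri-factorization, can be illustrated with the standard multiplicative updates for minimizing ||X - F S Gᵀ||² under non-negativity. This is a generic sketch of the technique only, without the paper's multi-bridge objective or any additional constraints:

```python
import numpy as np

def tri_factorize(X, k1, k2, n_iter=300, seed=0, eps=1e-9):
    """Non-negative matrix tri-factorization X ~ F @ S @ G.T
    via multiplicative updates (plain least-squares objective)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k1))    # row (e.g. document) cluster indicators
    S = rng.random((k1, k2))   # association between row and column clusters
    G = rng.random((m, k2))    # column (e.g. word) cluster indicators
    for _ in range(n_iter):
        # Each update multiplies by (positive gradient part) / (negative part),
        # which keeps all factors non-negative and decreases the objective.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
    return F, S, G
```

Tri-factorization is a common choice for transfer learning because F and G can cluster rows and columns separately, letting methods share some factors across domains while keeping others domain-specific.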
TL;DR: It is demonstrated that Sharma–Mittal entropy is a convenient tool for selecting both the number of topics and the values of hyper-parameters, simultaneously controlling for semantic stability, which none of the existing metrics can do.
Abstract: Topic modeling is a popular approach for clustering text documents. However, current tools have a number of unsolved problems, such as instability and a lack of criteria for selecting the values of model parameters. In this work, we propose a method to partially solve the problems of optimizing model parameters while simultaneously accounting for semantic stability. Our method is inspired by concepts from statistical physics and is based on Sharma-Mittal entropy. We test our approach on two models, probabilistic Latent Semantic Analysis (pLSA) and Latent Dirichlet Allocation (LDA) with Gibbs sampling, and on two datasets in different languages. We compare our approach against a number of standard metrics, each of which is able to account for just one of the parameters of interest. We demonstrate that Sharma-Mittal entropy is a convenient tool for selecting both the number of topics and the values of hyper-parameters, simultaneously controlling for semantic stability, which none of the existing metrics can do. Furthermore, we show that concepts from statistical physics can contribute to theory construction for machine learning, a rapidly developing field that currently lacks a consistent theoretical grounding.
19 citations
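For reference, the Sharma-Mittal entropy the paper builds on is a two-parameter family, S_{q,r}(p) = [(Σᵢ pᵢ^q)^((1-r)/(1-q)) - 1] / (1-r), which recovers Rényi entropy as r → 1, Tsallis entropy as r → q, and Shannon entropy as both tend to 1. A minimal sketch of the formula itself (our own helper, not the authors' code):

```python
import numpy as np

def sharma_mittal(p, q, r, eps=1e-12):
    """Sharma-Mittal entropy of a discrete distribution p (requires q, r != 1).

    Limits: r -> 1 gives Renyi entropy, r -> q gives Tsallis entropy,
    and q, r -> 1 gives Shannon entropy.
    """
    p = np.asarray(p, dtype=float) + eps   # guard against zero probabilities
    p /= p.sum()
    return ((p ** q).sum() ** ((1.0 - r) / (1.0 - q)) - 1.0) / (1.0 - r)
```

In the paper's setting, p would be a distribution derived from a fitted pLSA or LDA model, and the entropy is tracked while varying the number of topics and the hyper-parameters.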