scispace - formally typeset

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over its lifetime, 2,884 publications have appeared within this topic, receiving 198,341 citations in total. The topic is also known as: PLSA.


Papers
Proceedings ArticleDOI
26 Oct 2010
TL;DR: This study proposes a generative statistical model, named Collaborative Dual-PLSA, to simultaneously capture both the domain distinction and commonality among multiple domains and demonstrates the superiority of the proposed model over existing state-of-the-art methods of supervised and transfer learning.
Abstract: The distribution difference among multiple data domains has been considered for the cross-domain text classification problem. In this study, we show two new observations along this line. First, the data distribution difference may come from the fact that different domains use different key words to express the same concept. Second, the association between this conceptual feature and the document class may be stable across domains. These two issues are actually the distinction and commonality across data domains. Inspired by the above observations, we propose a generative statistical model, named Collaborative Dual-PLSA (CD-PLSA), to simultaneously capture both the domain distinction and commonality among multiple domains. Different from Probabilistic Latent Semantic Analysis (PLSA) with only one latent variable, the proposed model has two latent factors y and z, corresponding to word concept and document class respectively. The shared commonality intertwines with the distinctions over multiple domains, and is also used as the bridge for knowledge transformation. We exploit an Expectation Maximization (EM) algorithm to learn this model, and also propose its distributed version to handle the situation where the data domains are geographically separated from each other. Finally, we conduct extensive experiments over hundreds of classification tasks with multiple source domains and multiple target domains to validate the superiority of the proposed CD-PLSA model over existing state-of-the-art methods of supervised and transfer learning. In particular, we show that CD-PLSA is more tolerant of distribution differences.

46 citations

Journal ArticleDOI
TL;DR: A violation of the naive Bayes model is interpreted as an indication of the presence of latent variables, and it is shown how latent variables can be detected.

46 citations

Journal ArticleDOI
TL;DR: An effective person re-identification method with latent variables is proposed, which represents a pedestrian as the mixture of a holistic model and a number of flexible models, and develops a latent metric learning method for learning the effective metric matrix.
Abstract: In this paper, we propose an effective person re-identification method with latent variables, which represents a pedestrian as the mixture of a holistic model and a number of flexible models. Three types of latent variables are introduced to model uncertain factors in the re-identification problem, including vertical misalignments, horizontal misalignments and leg posture variations. The distance between two pedestrians can be determined by minimizing a given distance function with respect to latent variables, and then be used to conduct the re-identification task. In addition, we develop a latent metric learning method for learning the effective metric matrix, which can be solved in an iterative manner: once the latent information is specified, the metric matrix can be obtained using standard metric learning methods; with the metric matrix computed, the latent variables can be determined by exhaustively searching the state space. Finally, extensive experiments are conducted on seven databases to evaluate the proposed method. The experimental results demonstrate that our method achieves better performance than other competing algorithms.

46 citations
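The alternating scheme described above (latents fixed, fit the metric; metric fixed, search the latent state space exhaustively) can be sketched with a single latent vertical shift and an inverse-regularized-covariance update standing in for a full metric learner. All function names, the stripe-feature representation, and the simplified metric update are assumptions for illustration, not the paper's actual formulation:

```python
import numpy as np

def best_align(a, b, M, shifts=(-1, 0, 1)):
    """Mahalanobis-style distance between two pedestrians, each an (S, d)
    array of horizontal-stripe features, minimized over a latent vertical
    shift by exhaustive search. Returns (distance, aligned residuals)."""
    S = a.shape[0]
    best, best_diff = np.inf, None
    for s in shifts:
        xa = a[max(s, 0):S + min(s, 0)]
        xb = b[max(-s, 0):S + min(-s, 0)]
        diff = xa - xb
        dist = np.einsum('id,de,ie->', diff, M, diff) / len(diff)
        if dist < best:
            best, best_diff = dist, diff
    return best, best_diff

def learn_latent_metric(same_id_pairs, n_iter=3, reg=1e-3):
    """Alternate between (1) re-searching latent shifts under the current
    metric and (2) refitting M from the aligned same-identity residuals
    (inverse regularized covariance, a stand-in for a metric learner)."""
    d = same_id_pairs[0][0].shape[1]
    M = np.eye(d)
    for _ in range(n_iter):
        diffs = np.vstack([best_align(a, b, M)[1] for a, b in same_id_pairs])
        cov = diffs.T @ diffs / len(diffs) + reg * np.eye(d)
        M = np.linalg.inv(cov)
    return M
```

With a same-identity gallery image shifted vertically by one stripe, the exhaustive search recovers the shift and the learned metric separates same-identity from cross-identity pairs.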

Book ChapterDOI
01 Jan 1983
TL;DR: This chapter discusses maximum likelihood estimation in latent variable problems and argues that it is a viable approach to a broad class of such problems.
Abstract: Publisher Summary This chapter discusses the maximum likelihood estimation in a latent variable problem. Latent variates are random variables, which cannot be measured directly, but play essential roles in the description of observable quantities. They occur in a broad range of statistical problems. In the case that the dependent variate y is discrete, latent structure models play an important role, arising in connection with ability tests. Computing uniform residuals is an effective general means to proceed in latent variable problems. In some cases, the subject's ability can be eliminated by conditioning on an appropriate statistic. Maximum likelihood estimation is a viable approach to a broad class of latent variable problems. Generalized linear interactive modeling (GLIM) is an effective tool for carrying out the needed computations. GLIM also contains a high-level syntax for handling variables with factorial structure, vectors, and nonfull rank models. Its powerful directives shorten the length of the program considerably and allow simple simulation of the whole situation for checking programs and logic.

46 citations
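A concrete instance of the latent variable MLE discussed here is the latent class model for binary test items: the latent variate is an unobserved class, and item responses are conditionally independent Bernoulli variables given that class. A minimal EM sketch (illustrative only; the chapter's actual computations use GLIM, and all names here are assumptions):

```python
import numpy as np

def latent_class_em(Y, n_classes, n_iter=200, seed=0):
    """Maximum likelihood for a latent class model of binary items via EM.

    Y: (n_subjects, n_items) 0/1 response matrix. The latent variate is an
    unobserved class; items are independent Bernoulli given that class.
    Returns (pi, p): class weights and per-class item probabilities.
    """
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    pi = np.full(n_classes, 1.0 / n_classes)
    p = rng.uniform(0.25, 0.75, size=(n_classes, Y.shape[1]))
    for _ in range(n_iter):
        # E-step: responsibilities P(class | responses), computed in log space
        log_joint = Y @ np.log(p).T + (1 - Y) @ np.log(1 - p).T + np.log(pi)
        log_joint -= log_joint.max(axis=1, keepdims=True)
        r = np.exp(log_joint)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weighted relative frequencies
        nc = r.sum(axis=0)
        pi = nc / n
        p = np.clip((r.T @ Y) / nc[:, None], 1e-6, 1 - 1e-6)
    return pi, p
```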

Proceedings Article
12 Dec 2011
TL;DR: This work presents a method based on kernel embeddings of distributions for latent tree graphical models with continuous and non-Gaussian variables that can recover the latent tree structures with provable guarantees and perform local-minimum free parameter learning and efficient inference.
Abstract: Latent tree graphical models are natural tools for expressing long range and hierarchical dependencies among many variables which are common in computer vision, bioinformatics and natural language processing problems. However, existing models are largely restricted to discrete and Gaussian variables due to computational constraints; furthermore, algorithms for estimating the latent tree structure and learning the model parameters are largely restricted to heuristic local search. We present a method based on kernel embeddings of distributions for latent tree graphical models with continuous and non-Gaussian variables. Our method can recover the latent tree structures with provable guarantees and perform local-minimum free parameter learning and efficient inference. Experiments on simulated and real data show the advantage of our proposed approach.

46 citations
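The structure-recovery step in methods like this ultimately builds a tree from a (pseudo-)distance matrix over the observed variables; in the paper these distances come from singular values of kernel cross-covariance operators. As a generic illustration of that final step only (not the paper's kernel-embedding distances), here is a minimal neighbor-joining sketch on an additive distance matrix:

```python
import numpy as np

def neighbor_joining(D, names):
    """Recover an unrooted tree topology from an additive distance matrix
    by neighbor joining; returns the sequence of merges performed."""
    D = np.asarray(D, dtype=float).copy()
    nodes = list(names)
    merges = []
    while len(nodes) > 2:
        n = len(nodes)
        row = D.sum(axis=1)
        # Q-criterion: the minimizing pair is a cherry in the true tree
        Q = (n - 2) * D - row[:, None] - row[None, :]
        np.fill_diagonal(Q, np.inf)
        i, j = np.unravel_index(np.argmin(Q), Q.shape)
        if i > j:
            i, j = j, i
        merges.append((nodes[i], nodes[j]))
        # distance from the new internal node to every remaining node
        d_new = 0.5 * (D[i] + D[j] - D[i, j])
        keep = [k for k in range(n) if k not in (i, j)]
        newD = np.zeros((len(keep) + 1, len(keep) + 1))
        newD[:-1, :-1] = D[np.ix_(keep, keep)]
        newD[-1, :-1] = newD[:-1, -1] = d_new[keep]
        nodes = [nodes[k] for k in keep] + [f"({nodes[i]},{nodes[j]})"]
        D = newD
    merges.append((nodes[0], nodes[1]))
    return merges
```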


Network Information

Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations (84% related)
Feature (computer vision): 128.2K papers, 1.7M citations (84% related)
Support vector machine: 73.6K papers, 1.7M citations (84% related)
Deep learning: 79.8K papers, 2.1M citations (83% related)
Object detection: 46.1K papers, 1.3M citations (82% related)
Performance

No. of papers in the topic in previous years:

Year    Papers
2023    19
2022    77
2021    14
2020    36
2019    27
2018    58