Topic

Probabilistic latent semantic analysis

About: Probabilistic latent semantic analysis is a research topic. Over its lifetime, 2884 publications have been published within this topic, receiving 198341 citations. The topic is also known as: PLSA.

Papers

Proceedings Article

Learning Semantics with Deep Belief Network for Cross-Language Information Retrieval

Jungi Kim¹, Jinseok Nam¹, Iryna Gurevych¹

Technische Universität Darmstadt¹

01 Dec 2012

TL;DR: The proposed CLIR framework uses a deep belief network (DBN) for each language and employs canonical correlation analysis (CCA) to construct a shared semantic space; cross-lingual semantic analysis with DBN and CCA is shown to improve on state-of-the-art keyword-based CLIR performance.

Abstract: This paper introduces a cross-language information retrieval (CLIR) framework that combines the state-of-the-art keyword-based approach with a latent semantic-based retrieval model. To capture and analyze the hidden semantics in cross-lingual settings, we construct latent semantic models that map text in different languages into a shared semantic space. Our proposed framework consists of deep belief networks (DBN) for each language and we employ canonical correlation analysis (CCA) to construct a shared semantic space. We evaluated the proposed CLIR approach on a standard ad hoc CLIR dataset, and we show that the cross-lingual semantic analysis with DBN and CCA improves the state-of-the-art keyword-based CLIR performance.

34 citations

Proceedings Article

Probabilistic matrix tri-factorization

Jiho Yoo¹, Seungjin Choi¹

Pohang University of Science and Technology¹

19 Apr 2009

TL;DR: An EM algorithm is developed to learn the PMTF model, showing its equivalence to multiplicative updates derived by an algebraic approach, and the useful behavior of PMTF is demonstrated in a task of document clustering.

Abstract: Nonnegative matrix tri-factorization (NMTF) is a 3-factor decomposition of a nonnegative data matrix, X ≈ USVᵀ, where the factor matrices U, S, and V are restricted to be nonnegative as well. Motivated by the aspect model used for dyadic data analysis as well as in probabilistic latent semantic analysis (PLSA), we present a probabilistic model with two dependent latent variables for NMTF, referred to as probabilistic matrix tri-factorization (PMTF). Each latent variable in the model is associated with the cluster variable for the corresponding object in the dyad, making the model suited to co-clustering. We develop an EM algorithm to learn the PMTF model, showing its equivalence to multiplicative updates derived by an algebraic approach. We demonstrate the useful behavior of PMTF in a task of document clustering. Moreover, we incorporate the likelihood in the PMTF model into existing information criteria so that the number of clusters can be detected, which algebraic NMTF cannot do.

34 citations
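
The abstract above refers to the aspect model behind PLSA and to learning it with EM. As a rough illustration only, the following is a minimal sketch of EM for the standard symmetric aspect model, P(d, w) = Σ_z P(z) P(d|z) P(w|z), on a document-word count matrix. It is not the PMTF algorithm from the paper; the function name `plsa_em`, its parameters, and the toy count matrix are all hypothetical choices made for this example.

```python
import numpy as np


def plsa_em(counts, n_topics, n_iter=50, seed=0):
    """EM for the symmetric PLSA aspect model (illustrative sketch, not PMTF).

    counts: (n_docs, n_words) array of nonnegative co-occurrence counts.
    Returns P(z), P(d|z), P(w|z) and the log-likelihood before each update.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random initial parameters, each row normalised to a distribution.
    p_z = np.full(n_topics, 1.0 / n_topics)
    p_d_z = rng.random((n_topics, n_docs))
    p_d_z /= p_d_z.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    log_liks = []
    for _ in range(n_iter):
        # joint[z, d, w] = P(z) P(d|z) P(w|z)
        joint = p_z[:, None, None] * p_d_z[:, :, None] * p_w_z[:, None, :]
        p_dw = joint.sum(axis=0)  # model estimate of P(d, w)
        log_liks.append(float((counts * np.log(p_dw + 1e-12)).sum()))
        # E-step: posterior P(z | d, w) for every (d, w) pair.
        post = joint / (p_dw + 1e-12)
        # M-step: re-estimate parameters from expected counts n(d, w) P(z | d, w).
        expected = counts[None, :, :] * post
        topic_mass = expected.sum(axis=(1, 2))
        p_d_z = expected.sum(axis=2) / topic_mass[:, None]
        p_w_z = expected.sum(axis=1) / topic_mass[:, None]
        p_z = topic_mass / topic_mass.sum()
    return p_z, p_d_z, p_w_z, log_liks


# Toy usage: four documents over four words, with two rough word groups.
toy_counts = np.array(
    [[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 3, 4], [0, 1, 4, 5]], dtype=float
)
p_z, p_d_z, p_w_z, log_liks = plsa_em(toy_counts, n_topics=2)
```

The sketch materialises the full topic-by-document-by-word array for clarity, which is only practical for small matrices; a sparse formulation would be needed for real corpora.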

Journal Article

Discriminative Joint-Feature Topic Model With Dual Constraints for WCE Classification

Yixuan Yuan¹, Xiwen Yao², Junwei Han², Lei Guo², Max Q.-H. Meng³

Stanford University¹, Northwestern Polytechnical University², The Chinese University of Hong Kong³

01 Jul 2018 - IEEE Transactions on Systems, Man, and Cybernetics

TL;DR: A novel discriminative joint-feature topic model (DJTM) with dual constraints is proposed to classify multiple abnormalities in WCE images; comparisons show that it outperforms existing multiple-abnormality classification methods for WCE images.

Abstract: Wireless capsule endoscopy (WCE) enables clinicians to examine the digestive tract without any surgical operations, at the cost of a large number of images to be analyzed. The main challenge for automatic computer-aided diagnosis arises from the difficulty of robustly characterizing these images. To tackle this problem, a novel discriminative joint-feature topic model (DJTM) with dual constraints is proposed to classify multiple abnormalities in WCE images. We first propose a joint-feature probabilistic latent semantic analysis (PLSA) model, where color and texture descriptors extracted from the same image patches are jointly modeled with their conditional distributions. Then the proposed dual constraints, visual word importance and local image manifold, are embedded into the joint-feature PLSA model simultaneously to obtain discriminative latent semantic topics. The visual word importance constraint is proposed in our DJTM to guarantee that visual words with similar importance come from close latent topics, while the local image manifold constraint enforces that images within the same category share similar latent topics. Finally, each image is characterized by its distribution of latent semantic topics instead of low-level features. Our proposed DJTM achieved an excellent overall recognition accuracy of 90.78%. Comprehensive comparison results demonstrate that our method outperforms existing multiple abnormalities classification methods for WCE images.

34 citations

Proceedings Article

A cross-domain adaptation method for sentiment classification using probabilistic latent analysis

Sheng Gao¹, Haizhou Li¹

Institute for Infocomm Research Singapore¹

24 Oct 2011

TL;DR: A cross-domain topic indexing (CDTI) method is presented, in which a common semantic space is found from the prior between-domain term correspondences and the term co-occurrences in the cross-domain documents; on a multi-domain sentiment classification task, CDTI outperforms the state-of-the-art domain adaptation method and the traditional latent semantic indexing method.

Abstract: Sentiment classification has become attractive in recent years because of its potential commercial applications. It exploits supervised learning methods to learn classifiers from annotated training documents. The challenge in sentiment classification is that the sentiment domains are diverse, heterogeneous and fast-growing. Classifiers trained on one domain (the source domain) may not be able to classify a document from another domain (the target domain). Domain adaptation addresses this problem by making use of labeled samples in the source domain and unlabeled samples in the target domain. This paper presents a new solution, a cross-domain topic indexing (CDTI) method, with which a common semantic space is found from the prior between-domain term correspondences and the term co-occurrences in the cross-domain documents. These observations are characterized with a mixture model in CDTI, with each component being a possible topic shared by the source and target domains. Such common topics are used to index the cross-domain content. We evaluate the algorithms on a multi-domain sentiment classification task, which shows that CDTI outperforms the state-of-the-art domain adaptation method, i.e. spectral feature alignment (SFA), and the traditional latent semantic indexing method.

34 citations

Journal Article

Micro-blog topic detection method based on BTM topic model and K-means clustering algorithm

Weijiang Li¹, Yanming Feng¹, Dongjun Li², Zhengtao Yu¹

Kunming University of Science and Technology¹, Soochow University (Suzhou)²

16 Sep 2016 - Automatic Control and Computer Sciences

TL;DR: The BTM (Biterm Topic Model) topic model is employed to process short texts (micro-blog data) to alleviate the sparsity problem, and the K-means clustering algorithm is integrated into BTM for further topic discovery.

Abstract: The development of micro-blogs, which generate large-scale short texts, provides people with convenient communication. At the same time, discovering topics from short texts has become a genuinely intractable problem. Traditional topic models, such as probabilistic latent semantic analysis (PLSA) and Latent Dirichlet Allocation (LDA), have difficulty modeling short texts: they suffer from severe data sparsity when applied to them. Moreover, the K-means clustering algorithm can make topics discriminative when the data are dense and the differences among topic documents are distinct. In this paper, the BTM topic model is employed to process short texts (micro-blog data) to alleviate the sparsity problem. At the same time, we integrate the K-means clustering algorithm into BTM (Biterm Topic Model) for further topic discovery. Experimental results on Sina micro-blog short text collections demonstrate that our method can discover topics effectively.

33 citations

Performance

Metrics

Papers: 2,984

Citations: 212,744

No. of papers in the topic in previous years
Year	Papers
2023	19
2022	77
2021	14
2020	36
2019	27
2018	58
