
Showing papers on "Probabilistic latent semantic analysis" published in 2010


Journal ArticleDOI
TL;DR: An object detection system based on mixtures of multiscale deformable part models that is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges is described.
Abstract: We describe an object detection system based on mixtures of multiscale deformable part models. Our system is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges. While deformable part models have become quite popular, their value had not been demonstrated on difficult benchmarks such as the PASCAL data sets. Our system relies on new methods for discriminative training with partially labeled data. We combine a margin-sensitive approach for data-mining hard negative examples with a formalism we call latent SVM. A latent SVM is a reformulation of MI-SVM in terms of latent variables. A latent SVM is semiconvex, and the training problem becomes convex once latent information is specified for the positive examples. This leads to an iterative training algorithm that alternates between fixing latent values for positive examples and optimizing the latent SVM objective function.

10,501 citations
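
A minimal numpy/scikit-learn sketch of the alternation described in the abstract: latent placements are fixed for the positive examples under the current weights, after which the remaining problem is a standard convex linear SVM. The helper names and data layout are illustrative, not the authors' implementation.

import numpy as np
from sklearn.svm import LinearSVC

def best_latent_features(w, candidates):
    # Pick the candidate feature vector (latent placement) that scores highest under w.
    return candidates[np.argmax(candidates @ w)]

def train_latent_svm(pos_candidates, neg_features, dim, n_iters=5, C=1.0):
    # pos_candidates: list of (n_candidates_i, dim) arrays, one per positive example
    # neg_features:   (n_neg, dim) array of mined hard-negative feature vectors
    w = np.zeros(dim)
    for _ in range(n_iters):
        # Step 1: fix latent values for the positives using the current model.
        X_pos = np.stack([best_latent_features(w, c) for c in pos_candidates])
        # Step 2: with latent values fixed, optimize the (now convex) SVM objective.
        X = np.vstack([X_pos, neg_features])
        y = np.concatenate([np.ones(len(X_pos)), -np.ones(len(neg_features))])
        clf = LinearSVC(C=C).fit(X, y)
        w = clf.coef_.ravel()
    return w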


Journal Article
TL;DR: This work presents an efficient algorithm for learning with posterior regularization and illustrates its versatility on a diverse set of structural constraints such as bijectivity, symmetry and group sparsity in several large scale experiments, including multi-view learning, cross-lingual dependency grammar induction, unsupervised part-of-speech induction, and bitext word alignment.
Abstract: We present posterior regularization, a probabilistic framework for structured, weakly supervised learning. Our framework efficiently incorporates indirect supervision via constraints on posterior distributions of probabilistic models with latent variables. Posterior regularization separates model complexity from the complexity of structural constraints it is desired to satisfy. By directly imposing decomposable regularization on the posterior moments of latent variables during learning, we retain the computational efficiency of the unconstrained model while ensuring desired constraints hold in expectation. We present an efficient algorithm for learning with posterior regularization and illustrate its versatility on a diverse set of structural constraints such as bijectivity, symmetry and group sparsity in several large scale experiments, including multi-view learning, cross-lingual dependency grammar induction, unsupervised part-of-speech induction, and bitext word alignment.

570 citations
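
The core computational step of the framework is a KL projection of the model posterior onto the constraint set. The toy sketch below, with a single expectation constraint and a hypothetical indicator feature, shows how such a projection can be computed; it illustrates the idea rather than reproducing the authors' algorithm.

import numpy as np
from scipy.optimize import brentq

def project_posterior(q0, f, b):
    # Find q minimizing KL(q || q0) subject to E_q[f] <= b (a single constraint);
    # the solution has the form q ∝ q0 * exp(-lam * f) for some lam >= 0.
    def expectation(lam):
        q = q0 * np.exp(-lam * f)
        return (q / q.sum()) @ f
    lam = 0.0 if expectation(0.0) <= b else brentq(lambda l: expectation(l) - b, 0.0, 100.0)
    q = q0 * np.exp(-lam * f)
    return q / q.sum()

q0 = np.array([0.7, 0.2, 0.1])          # unconstrained posterior over 3 latent states
f = np.array([1.0, 0.0, 0.0])           # feature: indicator of state 0
print(project_posterior(q0, f, b=0.5))  # expected indicator forced down to 0.5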


Book ChapterDOI
01 Jan 2010
TL;DR: A statistical model can be called a latent class (LC) or mixture model if it assumes that some of its parameters differ across unobserved subgroups, LCs, or mixture components as mentioned in this paper.
Abstract: A statistical model can be called a latent class (LC) or mixture model if it assumes that some of its parameters differ across unobserved subgroups, LCs, or mixture components. This rather general idea has several seemingly unrelated applications, the most important of which are clustering, scaling, density estimation, and random-effects modeling. This article describes simple LC models for clustering, restricted LC models for scaling, and mixture regression models for nonparametric random-effects modeling, as well as gives an overview of recent developments in the field of LC analysis. Moreover, attention is paid to topics such as maximum likelihood estimation, identification issues, model selection, and software.

431 citations
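
As a concrete illustration of the simple LC models for clustering mentioned above, the sketch below fits a latent class (Bernoulli mixture) model to binary item responses with EM; the data and the choice of two classes are toy assumptions.

import numpy as np

def latent_class_em(X, K, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)                 # latent class proportions
    theta = rng.uniform(0.25, 0.75, (K, d))  # item-response probabilities per class
    for _ in range(n_iter):
        # E-step: posterior class memberships (responsibilities)
        log_lik = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(pi)
        log_lik -= log_lik.max(axis=1, keepdims=True)
        resp = np.exp(log_lik)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update proportions and conditional response probabilities
        nk = resp.sum(axis=0)
        pi = nk / n
        theta = np.clip((resp.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)
    return pi, theta, resp

# Toy example: two latent classes generating 6 binary items
rng = np.random.default_rng(1)
X = np.vstack([rng.binomial(1, 0.8, (100, 6)), rng.binomial(1, 0.2, (100, 6))])
pi, theta, resp = latent_class_em(X, K=2)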


Proceedings Article
31 Mar 2010
TL;DR: In this article, a variational inference framework for training the Gaussian process latent variable model and thus performing Bayesian nonlinear dimensionality reduction is introduced, which can automatically select the dimensionality of the nonlinear latent space.
Abstract: We introduce a variational inference framework for training the Gaussian process latent variable model and thus performing Bayesian nonlinear dimensionality reduction. This method allows us to variationally integrate out the input variables of the Gaussian process and compute a lower bound on the exact marginal likelihood of the nonlinear latent variable model. The maximization of the variational lower bound provides a Bayesian training procedure that is robust to overfitting and can automatically select the dimensionality of the nonlinear latent space. We demonstrate our method on real world datasets. The focus in this paper is on dimensionality reduction problems, but the methodology is more general. For example, our algorithm is immediately applicable for training Gaussian process models in the presence of missing or uncertain inputs.

338 citations
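
A brief sketch of how such a Bayesian GP-LVM might be trained with the GPy library (the availability of GPy.models.BayesianGPLVM with a default ARD kernel is an assumption of this sketch, and the data are toy noise). The automatic selection of latent dimensionality shows up as ARD lengthscales that grow large for unused latent dimensions.

import numpy as np
import GPy  # assumed available; provides a Bayesian GP-LVM implementation

Y = np.random.default_rng(0).standard_normal((50, 12))   # toy high-dimensional observations
m = GPy.models.BayesianGPLVM(Y, input_dim=5, num_inducing=10)
m.optimize(messages=False, max_iters=200)
print(m.kern.lengthscale)  # large lengthscales indicate latent dimensions pruned away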


Journal ArticleDOI
TL;DR: An effective static technique for automatic bug localization can be built around Latent Dirichlet allocation (LDA), and there is no significant relationship between the accuracy of the LDA-based technique and the size of the subject software system or the stability of its source code base.
Abstract: Context: Some recent static techniques for automatic bug localization have been built around modern information retrieval (IR) models such as latent semantic indexing (LSI). Latent Dirichlet allocation (LDA) is a generative statistical model that has significant advantages, in modularity and extensibility, over both LSI and probabilistic LSI (pLSI). Moreover, LDA has been shown effective in topic model based information retrieval. In this paper, we present a static LDA-based technique for automatic bug localization and evaluate its effectiveness. Objective: We evaluate the accuracy and scalability of the LDA-based technique and investigate whether it is suitable for use with open-source software systems of varying size, including those developed using agile methods. Method: We present five case studies designed to determine the accuracy and scalability of the LDA-based technique, as well as its relationships to software system size and to source code stability. The studies examine over 300 bugs across more than 25 iterations of three software systems. Results: The results of the studies show that the LDA-based technique maintains sufficient accuracy across all bugs in a single iteration of a software system and is scalable to a large number of bugs across multiple revisions of two software systems. The results of the studies also indicate that the accuracy of the LDA-based technique is not affected by the size of the subject software system or by the stability of its source code base. Conclusion: We conclude that an effective static technique for automatic bug localization can be built around LDA. We also conclude that there is no significant relationship between the accuracy of the LDA-based technique and the size of the subject software system or the stability of its source code base. Thus, the LDA-based technique is widely applicable.

299 citations
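
A hypothetical end-to-end sketch of an LDA-based bug-localization pipeline of the kind evaluated above: topics are fit over source files, the bug report is projected into topic space, and files are ranked by similarity. The file contents and the query are toy stand-ins, not the paper's data.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

source_files = {
    "parser.py": "parse token stream grammar syntax error recovery",
    "render.py": "draw widget layout paint screen refresh",
    "auth.py":   "login password token session authentication failure",
}
bug_report = "authentication fails after session token expires"

vec = CountVectorizer()
X = vec.fit_transform(source_files.values())
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

doc_topics = lda.transform(X)                               # file x topic proportions
query_topics = lda.transform(vec.transform([bug_report]))   # bug report in topic space
scores = cosine_similarity(query_topics, doc_topics).ravel()
ranking = sorted(zip(source_files, scores), key=lambda p: -p[1])
print(ranking)  # files most likely to contain the reported bug come first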


Proceedings ArticleDOI
06 Aug 2010
TL;DR: The modeling framework can be viewed as a combination of dimensionality reduction and graphical modeling (to capture remaining statistical structure not attributable to the latent variables) and it consistently estimates both the number of hidden components and the conditional graphical model structure among the observed variables.
Abstract: Suppose we have samples of a subset of a collection of random variables. No additional information is provided about the number of latent variables, nor of the relationship between the latent and observed variables. Is it possible to discover the number of hidden components, and to learn a statistical model over the entire collection of variables? We address this question in the setting in which the latent and observed variables are jointly Gaussian, with the conditional statistics of the observed variables conditioned on the latent variables being specified by a graphical model. As a first step we give natural conditions under which such latent-variable Gaussian graphical models are identifiable given marginal statistics of only the observed variables. Essentially these conditions require that the conditional graphical model among the observed variables is sparse, while the effect of the latent variables is “spread out” over most of the observed variables. Next we propose a tractable convex program based on regularized maximum-likelihood for model selection in this latent-variable setting; the regularizer uses both the ℓ1 norm and the nuclear norm. Our modeling framework can be viewed as a combination of dimensionality reduction (to identify latent variables) and graphical modeling (to capture remaining statistical structure not attributable to the latent variables), and it consistently estimates both the number of hidden components and the conditional graphical model structure among the observed variables. These results are applicable in the high-dimensional setting in which the number of latent/observed variables grows with the number of samples of the observed variables. The geometric properties of the algebraic varieties of sparse matrices and of low-rank matrices play an important role in our analysis.

157 citations
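
The convex program described above can be sketched directly in cvxpy: a sparse conditional precision term S, a positive semidefinite low-rank term L, an l1 penalty on S, and a nuclear-norm penalty on L (which reduces to a trace penalty since L is constrained PSD). The data and the regularization weights below are illustrative only.

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
p, n = 8, 200
data = rng.standard_normal((n, p))
Sigma = np.cov(data, rowvar=False)           # empirical covariance of observed variables

alpha, beta = 0.05, 0.5                      # sparsity / rank trade-off (illustrative)
S = cp.Variable((p, p), symmetric=True)      # conditional precision among observed variables
L = cp.Variable((p, p), PSD=True)            # effect of the hidden variables (low rank)

objective = cp.Minimize(
    -cp.log_det(S - L) + cp.trace(Sigma @ (S - L))
    + alpha * cp.sum(cp.abs(S)) + beta * cp.trace(L)
)
prob = cp.Problem(objective, [S - L >> 0])
prob.solve()
print("estimated number of latent variables:",
      int(np.sum(np.linalg.eigvalsh(L.value) > 1e-4)))  # rank of L = number of hidden components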


Journal ArticleDOI
TL;DR: A novel hierarchical LDA-RF (latent Dirichlet allocation-random forest) model to predict human protein-protein interactions directly from protein primary sequences is proposed, which is featured by a high success rate and a strong ability to handle large-scale data sets by mining the hidden internal structures buried in the noisy amino acid sequences in a low-dimensional latent semantic space.
Abstract: Protein-protein interaction (PPI) is at the core of the entire interactomic system of any living organism. Although many human protein-protein interaction links have been experimentally determined, the number is still small compared to the estimate that there are ~300,000 protein-protein interactions in human beings. Hence, it is still urgent and challenging to develop automated computational methods to accurately and efficiently predict protein-protein interactions. In this paper, we propose a novel hierarchical LDA-RF (latent Dirichlet allocation-random forest) model to predict human protein-protein interactions directly from protein primary sequences, which is featured by a high success rate and a strong ability to handle large-scale data sets by mining the hidden internal structures buried in the noisy amino acid sequences in a low-dimensional latent semantic space. First, the local sequential features represented by conjoint triads are constructed from sequences. Then the gene...

156 citations
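
An illustrative scikit-learn sketch of the LDA-RF idea on toy data: conjoint-triad-style count features are mapped to a low-dimensional latent topic space with LDA, and a random forest is trained on the latent representation of each protein pair. The feature dimensions and labels here are placeholders, not the paper's data or exact pipeline.

import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_pairs, n_features = 200, 343        # 343 = 7^3 conjoint-triad bins per protein (illustrative)
counts = rng.poisson(1.0, (n_pairs, n_features))   # stand-in for conjoint-triad counts
labels = rng.integers(0, 2, n_pairs)               # 1 = interacting pair, 0 = non-interacting

lda = LatentDirichletAllocation(n_components=20, random_state=0)
latent = lda.fit_transform(counts)                 # low-dimensional latent semantic features

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(latent, labels)
print("training accuracy on toy data:", rf.score(latent, labels))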


Patent
Ji-hyun Lee, Chin-Wan Chung
12 Feb 2010
TL;DR: In this paper, a semantic search system using a semantic ranking scheme is presented, which includes an ontology analyzer analyzing ontology data related to a search target to determine a weight value of each property according to a weighting method for each property; and a semantic path extractor extracting all the semantic paths between resources and query keywords and determining a weight value for each extracted semantic path according to the semantic path weight value determination scheme.
Abstract: A semantic search system using a semantic ranking scheme including: an ontology analyzer analyzing ontology data related to a search target to determine a weight value of each property according to a weighting method for each property; a semantic path extractor extracting all the semantic paths between resources and query keywords and determining a weight value of each extracted semantic path according to the semantic path weight value determination scheme by using the weight value of each property; a relevant resource searcher traversing an instance graph of ontology based on a semantic path having a pre-set length and a weight value above an expectation level to search for resources that have a semantic relationship with the query keywords and are declared as a type presented in the query; and a semantic relevance ranker selecting the top-k results having the highest rank from among the candidate results extracted by the relevant resource searcher by using a relevance scoring function.

127 citations


Proceedings Article
09 Oct 2010
TL;DR: This work uses discriminative training to create a projection of documents from multiple languages into a single translingual vector space and evaluates these algorithms on two tasks: parallel document retrieval for Wikipedia and Europarl documents, and cross-lingual text classification on Reuters.
Abstract: Representing documents by vectors that are independent of language enhances machine translation and multilingual text categorization. We use discriminative training to create a projection of documents from multiple languages into a single translingual vector space. We explore two variants to create these projections: Oriented Principal Component Analysis (OPCA) and Coupled Probabilistic Latent Semantic Analysis (CPLSA). Both of these variants start with a basic model of documents (PCA and PLSA). Each model is then made discriminative by encouraging comparable document pairs to have similar vector representations. We evaluate these algorithms on two tasks: parallel document retrieval for Wikipedia and Europarl documents, and cross-lingual text classification on Reuters. The two discriminative variants, OPCA and CPLSA, significantly outperform their corresponding baselines. The largest differences in performance are observed on the task of retrieval when the documents are only comparable and not parallel. The OPCA method is shown to perform best.

126 citations


Proceedings Article
23 Aug 2010
TL;DR: The authors proposed semantic role features for a tree-to-string transducer to model the reordering/deletion of source-side semantic roles, which significantly outperformed systems trained based on Max-Likelihood and EM.
Abstract: We propose semantic role features for a Tree-to-String transducer to model the reordering/deletion of source-side semantic roles. These semantic features, as well as the Tree-to-String templates, are trained based on a conditional log-linear model and are shown to significantly outperform systems trained based on Max-Likelihood and EM. We also show significant improvement in sentence fluency by using the semantic role features in the log-linear model, based on manual evaluation.

120 citations


Proceedings Article
31 Mar 2010
TL;DR: This paper proposes a robust approach to factorizing the latent space into shared and private spaces by introducing orthogonality constraints, which penalize redundant latent representations.
Abstract: Existing approaches to multi-view learning are particularly effective when the views are either independent (i.e., multi-kernel approaches) or fully dependent (i.e., shared latent spaces). However, in real scenarios, these assumptions are almost never truly satisfied. Recently, two methods have attempted to tackle this problem by factorizing the information and learning separate latent spaces for modeling the shared (i.e., correlated) and private (i.e., independent) parts of the data. However, these approaches are very sensitive to parameter settings or initialization. In this paper we propose a robust approach to factorizing the latent space into shared and private spaces by introducing orthogonality constraints, which penalize redundant latent representations. Furthermore, unlike previous approaches, we simultaneously learn the structure and dimensionality of the latent spaces by relying on a regularizer that encourages the latent space of each data stream to be low dimensional. To demonstrate the benefits of our approach, we apply it to two existing shared latent space models that assume full dependence of the views, the sGPLVM and the sKIE, and show that our constraints improve the performance of these models on the task of pose estimation from monocular images.
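
A toy numpy sketch of the kind of orthogonality penalty referred to above: the cross-correlation between shared and private latent coordinates is penalized so the two subspaces do not encode redundant information. The exact formulation in the paper may differ.

import numpy as np

def orthogonality_penalty(Z_shared, Z_private):
    # Squared Frobenius norm of the cross-product; zero when the two subspaces are orthogonal.
    return np.linalg.norm(Z_shared.T @ Z_private, "fro") ** 2

rng = np.random.default_rng(0)
Z_shared = rng.standard_normal((100, 3))    # shared latent coordinates (100 samples)
Z_private = rng.standard_normal((100, 2))   # private latent coordinates for one view
print(orthogonality_penalty(Z_shared, Z_private))  # added to the model objective as a regularizer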

Proceedings ArticleDOI
06 Dec 2010
TL;DR: A large-margin learning framework to discover a predictive latent subspace representation shared by multiple views based on an undirected latent space Markov network that fulfills a weak conditional independence assumption that multi-view observations and response variables are independent given a set of latent variables.
Abstract: Learning from multi-view data is important in many applications, such as image classification and annotation. In this paper, we present a large-margin learning framework to discover a predictive latent subspace representation shared by multiple views. Our approach is based on an undirected latent space Markov network that fulfills a weak conditional independence assumption that multi-view observations and response variables are independent given a set of latent variables. We provide efficient inference and parameter estimation methods for the latent sub-space model. Finally, we demonstrate the advantages of large-margin learning on real video and web image data for discovering predictive latent representations and improving the performance on image classification, annotation and retrieval.

Journal ArticleDOI
10 Jun 2010
TL;DR: A simple approach to learning models of visual object categories from images gathered from Internet image search engines, derived from the probabilistic latent semantic analysis technique for text document analysis, that can be used to automatically learn object models from these data.
Abstract: In this paper, we describe a simple approach to learning models of visual object categories from images gathered from Internet image search engines. The images for a given keyword are typically highly variable, with a large fraction being unrelated to the query term, and thus pose a challenging environment from which to learn. By training our models directly from Internet images, we remove the need to laboriously compile training data sets, required by most other recognition approaches; this opens up the possibility of learning object category models “on-the-fly.” We describe two simple approaches, derived from the probabilistic latent semantic analysis (pLSA) technique for text document analysis, that can be used to automatically learn object models from these data. We show two applications of the learned model: first, to rerank the images returned by the search engine, thus improving the quality of the search engine; and second, to recognize objects in other image data sets.
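
Since both object-category approaches are derived from pLSA, the compact numpy sketch below spells out the pLSA EM updates on a toy term-document (or visual-word) count matrix; it is a generic pLSA implementation, not the authors' object models.

import numpy as np

def plsa(N, K, n_iter=50, seed=0):
    # N: (docs x words) count matrix; returns P(z|d) and P(w|z).
    rng = np.random.default_rng(seed)
    D, W = N.shape
    P_z_d = rng.dirichlet(np.ones(K), size=D)       # (D, K) document-topic distributions
    P_w_z = rng.dirichlet(np.ones(W), size=K)       # (K, W) topic-word distributions
    for _ in range(n_iter):
        # E-step: responsibilities P(z | d, w), shape (D, K, W)
        joint = P_z_d[:, :, None] * P_w_z[None, :, :]
        post = joint / joint.sum(axis=1, keepdims=True)
        # M-step: re-estimate the topic-word and document-topic distributions
        weighted = N[:, None, :] * post             # n(d, w) * P(z | d, w)
        P_w_z = weighted.sum(axis=0) + 1e-12
        P_w_z /= P_w_z.sum(axis=1, keepdims=True)
        P_z_d = weighted.sum(axis=2) + 1e-12
        P_z_d /= P_z_d.sum(axis=1, keepdims=True)
    return P_z_d, P_w_z

N = np.random.default_rng(1).poisson(1.0, (20, 30))   # toy "visual word" counts
P_z_d, P_w_z = plsa(N, K=3)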

Proceedings Article
05 Jun 2010
TL;DR: Experiments show that a categorical model using NMF results in better performance for SemEval and fairy tales, whereas a dimensional model performs better with ISEAR.
Abstract: In this paper we present an evaluation of new techniques for automatically detecting emotions in text. The study evaluates a categorical model and a dimensional model for the recognition of four affective states: Anger, Fear, Joy, and Sadness, which are common emotions across three datasets: SemEval-2007 "Affective Text", ISEAR (International Survey on Emotion Antecedents and Reactions), and children's fairy tales. In the first model, WordNet-Affect is used as a linguistic lexical resource and three dimensionality reduction techniques are evaluated: Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Analysis (PLSA), and Non-negative Matrix Factorization (NMF). In the second model, ANEW (Affective Norms for English Words), a normative database with affective terms, is employed. Experiments show that a categorical model using NMF results in better performance for SemEval and fairy tales, whereas a dimensional model performs better with ISEAR.
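
A small scikit-learn sketch of the dimensionality-reduction step used by the categorical model: factorizing a term-document matrix with NMF. The toy sentences stand in for the datasets, and the mapping of the resulting factors to the four emotions via WordNet-Affect is not reproduced here.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

sentences = [
    "the storm terrified the children",
    "she was furious about the broken promise",
    "the reunion filled everyone with joy",
    "he wept quietly over the loss",
]
X = TfidfVectorizer().fit_transform(sentences)
nmf = NMF(n_components=4, init="nndsvda", random_state=0)
doc_factors = nmf.fit_transform(X)        # sentence loadings on 4 latent factors
print(doc_factors.round(2))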

Journal ArticleDOI
TL;DR: A novel method for automatically classifying consumer video clips based on their soundtracks using a set of 25 overlapping semantic classes, chosen for their usefulness to users, viability of automatic detection and of annotator labeling, and sufficiency of representation in available video collections.
Abstract: This paper presents a novel method for automatically classifying consumer video clips based on their soundtracks. We use a set of 25 overlapping semantic classes, chosen for their usefulness to users, viability of automatic detection and of annotator labeling, and sufficiency of representation in available video collections. A set of 1873 videos from real users has been annotated with these concepts. Starting with a basic representation of each video clip as a sequence of mel-frequency cepstral coefficient (MFCC) frames, we experiment with three clip-level representations: single Gaussian modeling, Gaussian mixture modeling, and probabilistic latent semantic analysis of a Gaussian component histogram. Using such summary features, we produce support vector machine (SVM) classifiers based on the Kullback-Leibler, Bhattacharyya, or Mahalanobis distance measures. Quantitative evaluation shows that our approaches are effective for detecting interesting concepts in a large collection of real-world consumer video clips.
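
A hedged sketch of the simplest of the three clip-level representations, a single Gaussian over MFCC frames, combined with an SVM on a symmetric-KL kernel; the clips below are synthetic noise and the diagonal-covariance KL is a simplification of the full-covariance case described above.

import numpy as np
import librosa
from sklearn.svm import SVC

def clip_summary(y, sr=22050):
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # (13, frames)
    return mfcc.mean(axis=1), mfcc.var(axis=1) + 1e-6    # diagonal single-Gaussian summary

def sym_kl(p, q):
    # Symmetric KL divergence between two diagonal Gaussians.
    (mu1, v1), (mu2, v2) = p, q
    kl = lambda m1, s1, m2, s2: 0.5 * np.sum(s1 / s2 + (m2 - m1) ** 2 / s2 - 1 + np.log(s2 / s1))
    return kl(mu1, v1, mu2, v2) + kl(mu2, v2, mu1, v1)

rng = np.random.default_rng(0)
clips = [rng.standard_normal(22050) * (1 + 0.5 * i) for i in range(8)]  # toy "soundtracks"
labels = [0, 0, 0, 0, 1, 1, 1, 1]

summaries = [clip_summary(y) for y in clips]
D = np.array([[sym_kl(a, b) for b in summaries] for a in summaries])
K = np.exp(-0.1 * D)                                      # distance -> precomputed kernel
clf = SVC(kernel="precomputed").fit(K, labels)
print("training accuracy on toy clips:", clf.score(K, labels))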

Proceedings ArticleDOI
05 Jan 2010
TL;DR: This paper sheds light on the theory that underlies text mining methods and provides guidance for researchers who seek to apply these methods.
Abstract: The amount of textual data that is available for researchers and businesses to analyze is increasing at a dramatic rate. This reality has led IS researchers to investigate various text mining techniques. This essay examines four text mining methods that are frequently used in order to identify their advantages and limitations. The four methods that we examine are (1) latent semantic analysis, (2) probabilistic latent semantic analysis, (3) latent Dirichlet allocation, and (4) the correlated topic model. We compare these four methods and highlight the optimal conditions under which to apply the various methods. Our paper sheds light on the theory that underlies text mining methods and provides guidance for researchers who seek to apply these methods.
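
A brief scikit-learn sketch contrasting two of the four methods discussed above on the same toy corpus: LSA via truncated SVD of a TF-IDF matrix, and LDA on raw counts. pLSA and the correlated topic model have no standard scikit-learn implementation and are omitted.

from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.decomposition import TruncatedSVD, LatentDirichletAllocation

docs = [
    "stock markets fell on inflation fears",
    "the central bank raised interest rates",
    "the team won the championship final",
    "the striker scored twice in the match",
]

lsa_X = TfidfVectorizer().fit_transform(docs)
lsa = TruncatedSVD(n_components=2, random_state=0).fit_transform(lsa_X)

lda_X = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(lda_X)

print("LSA document coordinates:\n", lsa.round(2))
print("LDA document-topic proportions:\n", lda.round(2))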

Proceedings Article
11 Jul 2010
TL;DR: A new topic model called Probabilistic Cross-Lingual Latent Semantic Analysis (PCLSA) is proposed which extends the probabilistic LatentSemantic Analysis model by regularizing its likelihood function with soft constraints defined based on a bilingual dictionary.
Abstract: Probabilistic latent topic models have recently enjoyed much success in extracting and analyzing latent topics in text in an unsupervised way. One common deficiency of existing topic models, though, is that they would not work well for extracting cross-lingual latent topics simply because words in different languages generally do not co-occur with each other. In this paper, we propose a way to incorporate a bilingual dictionary into a probabilistic topic model so that we can apply topic models to extract shared latent topics in text data of different languages. Specifically, we propose a new topic model called Probabilistic Cross-Lingual Latent Semantic Analysis (PCLSA) which extends the Probabilistic Latent Semantic Analysis (PLSA) model by regularizing its likelihood function with soft constraints defined based on a bilingual dictionary. Both qualitative and quantitative experimental results show that the PCLSA model can effectively extract cross-lingual latent topics from multilingual text data.

Proceedings Article
23 Aug 2010
TL;DR: Experimental results on the Robocup sportscasting corpora in both English and Korean indicate that the probabilistic generative model presented produces more accurate semantic alignments than existing methods and also produces competitive semantic parsers and improved language generators.
Abstract: We present a probabilistic generative model for learning semantic parsers from ambiguous supervision. Our approach learns from natural language sentences paired with world states consisting of multiple potential logical meaning representations. It disambiguates the meaning of each sentence while simultaneously learning a semantic parser that maps sentences into logical form. Compared to a previous generative model for semantic alignment, it also supports full semantic parsing. Experimental results on the Robocup sportscasting corpora in both English and Korean indicate that our approach produces more accurate semantic alignments than existing methods and also produces competitive semantic parsers and improved language generators.

Book ChapterDOI
05 Sep 2010
TL;DR: A novel probabilistic inference algorithm for 3D shape estimation is proposed, based on maximum likelihood estimates of the GPLVM latent variables and the camera parameters that best fit the generated 3D shapes to the given silhouettes.
Abstract: In this paper we propose a probabilistic framework that models shape variations and infers dense and detailed 3D shapes from a single silhouette. We model two types of shape variations, the object phenotype variation and its pose variation, using two independent Gaussian Process Latent Variable Models (GPLVMs). The proposed shape variation models are learnt from 3D samples without prior knowledge about the object class, e.g. object parts and skeletons, and are combined to fully span the 3D shape space. A novel probabilistic inference algorithm for 3D shape estimation is proposed, based on maximum likelihood estimates of the GPLVM latent variables and the camera parameters that best fit the generated 3D shapes to the given silhouettes. The proposed inference involves a small number of latent variables and is computationally efficient. Experiments on both human body and shark data demonstrate the efficacy of our new approach.

Proceedings Article
02 Jun 2010
TL;DR: It is shown that the 'problem' of high-frequency words can be dealt with more elegantly, and in a way that to the authors' knowledge has not been considered in LDA, through the use of appropriate weighting schemes comparable to those sometimes used in Latent Semantic Indexing (LSI).
Abstract: Many implementations of Latent Dirichlet Allocation (LDA), including those described in Blei et al. (2003), rely at some point on the removal of stopwords, words which are assumed to contribute little to the meaning of the text. This step is considered necessary because otherwise high-frequency words tend to end up scattered across many of the latent topics without much rhyme or reason. We show, however, that the 'problem' of high-frequency words can be dealt with more elegantly, and in a way that to our knowledge has not been considered in LDA, through the use of appropriate weighting schemes comparable to those sometimes used in Latent Semantic Indexing (LSI). Our proposed weighting methods not only make theoretical sense, but can also be shown to improve precision significantly on a non-trivial cross-language retrieval task.
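
A sketch of one LSI-style weighting scheme of the kind the paper advocates, log-entropy weighting of a term-document count matrix; it illustrates the weighting idea only and is not the paper's modified LDA inference.

import numpy as np

def log_entropy_weight(counts):
    # counts: (docs x terms) array; returns an equally shaped weighted matrix.
    n_docs = counts.shape[0]
    gf = counts.sum(axis=0) + 1e-12                    # global term frequencies
    p = counts / gf                                    # P(doc | term)
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(p > 0, p * np.log(p), 0.0)
    g = 1.0 + plogp.sum(axis=0) / np.log(n_docs)       # global entropy weight per term
    return np.log1p(counts) * g                        # local log weight x global weight

counts = np.array([[10, 0, 1], [8, 1, 0], [9, 0, 2], [0, 7, 5]])
print(log_entropy_weight(counts).round(2))
# High-frequency terms spread evenly across documents receive weights near zero,
# which is how the weighting curbs their influence on the latent topics.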

Journal ArticleDOI
TL;DR: A broad class of semiparametric Bayesian SEMs, which allow mixed categorical and continuous manifest variables while also allowing the latent variables to have unknown distributions, is proposed, based on centered Dirichlet process (CDP) and CDP mixture models.
Abstract: Structural equation models (SEMs) with latent variables are widely useful for sparse covariance structure modeling and for inferring relationships among latent variables. Bayesian SEMs are appealing in allowing for the incorporation of prior information and in providing exact posterior distributions of unknowns, including the latent variables. In this article, we propose a broad class of semiparametric Bayesian SEMs, which allow mixed categorical and continuous manifest variables while also allowing the latent variables to have unknown distributions. In order to include typical identifiability restrictions on the latent variable distributions, we rely on centered Dirichlet process (CDP) and CDP mixture (CDPM) models. The CDP will induce a latent class model with an unknown number of classes, while the CDPM will induce a latent trait model with unknown densities for the latent traits. A simple and efficient Markov chain Monte Carlo algorithm is developed for posterior computation, and the methods are illustrated using simulated examples and several applications.

Proceedings ArticleDOI
25 Jul 2010
TL;DR: A novel topic model using the Pitman-Yor (PY) process is proposed, called the PY topic model, which captures two properties of a document: a power-law word distribution and the presence of multiple topics.
Abstract: One important approach for knowledge discovery and data mining is to estimate unobserved variables, because latent variables can indicate hidden specific properties of observed data. The latent factor model assumes that each item in a record has a latent factor; the co-occurrence of items can then be modeled by latent factors. In document modeling, a record indicates a document represented as a "bag of words," meaning that the order of words is ignored, an item indicates a word, and a latent factor indicates a topic. Latent Dirichlet allocation (LDA) is a widely used Bayesian topic model applying the Dirichlet distribution over the latent topic distribution of a document having multiple topics. LDA assumes that latent topics, i.e., discrete latent variables, are distributed according to a multinomial distribution whose parameters are generated from the Dirichlet distribution. LDA also models a word distribution by using a multinomial distribution whose parameters follow the Dirichlet distribution. This Dirichlet-multinomial setting, however, cannot capture the power-law phenomenon of a word distribution, which is known as Zipf's law in linguistics. We therefore propose a novel topic model using the Pitman-Yor (PY) process, called the PY topic model. The PY topic model captures two properties of a document: a power-law word distribution and the presence of multiple topics. In an experiment using real data, this model outperformed LDA in document modeling in terms of perplexity.

Journal ArticleDOI
TL;DR: The schema theory for SLN, including its concepts, rule-constraint normal forms, and relevant algorithms, is proposed, providing the basis for normalized management of SLN and its applications.

Proceedings Article
23 Aug 2010
TL;DR: Two new LSA-based summarization algorithms are proposed, and their performances are compared using ROUGE-L scores.
Abstract: Text summarization addresses the problem of extracting important information from huge amounts of text data. There are various methods in the literature that aim to produce well-formed summaries. One of the most commonly used methods is Latent Semantic Analysis (LSA). In this paper, different LSA-based summarization algorithms are explained and two new LSA-based summarization algorithms are proposed. The algorithms are evaluated on Turkish documents, and their performances are compared using their ROUGE-L scores. One of our algorithms produces the best scores.
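
For reference, the sketch below shows the classic LSA summarization step that such algorithms build on: SVD of a sentence-term matrix followed by selecting one sentence per leading singular vector (Gong and Liu style). The two proposed algorithms modify this selection step in ways not reproduced here, and the sentences are toy data.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "The committee approved the new budget yesterday.",
    "Spending on infrastructure will rise sharply next year.",
    "A local team celebrated its first championship in decades.",
    "Officials said the budget prioritizes road and rail projects.",
]
A = TfidfVectorizer().fit_transform(sentences).toarray().T   # terms x sentences
U, s, Vt = np.linalg.svd(A, full_matrices=False)

summary_size = 2
chosen = [int(np.argmax(np.abs(Vt[k]))) for k in range(summary_size)]  # one sentence per concept
for idx in sorted(set(chosen)):
    print(sentences[idx])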

Proceedings ArticleDOI
12 Sep 2010
TL;DR: A series of Latent Dirichlet Allocation models with varying topic counts are generated to evaluate the ability of the model to identify related source code blocks, and demonstrate the consequences of choosing too few or too many latent topics.
Abstract: The optimal number of latent topics required to model the most accurate latent substructure for a source code corpus is an open question in source code analysis. Most estimates about the number of latent topics that exist in a software corpus are based on the assumption that the data is similar to natural language, but there is little empirical evidence to support this. In order to help determine the appropriate number of topics needed to accurately represent the source code, we generate a series of Latent Dirichlet Allocation models with varying topic counts. We use a heuristic to evaluate the ability of the model to identify related source code blocks, and demonstrate the consequences of choosing too few or too many latent topics.
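
A hedged sketch of a topic-count sweep in the spirit of the study above, using held-out perplexity from scikit-learn's LDA as the evaluation criterion; the paper uses its own heuristic over related source code blocks and real source code rather than the toy documents below.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.model_selection import train_test_split

documents = [
    "open file read buffer close file",
    "allocate buffer copy bytes free buffer",
    "parse token build tree emit error",
    "lex input produce token stream",
    "connect socket send packet receive ack",
    "bind port listen accept connection",
] * 5                                        # toy stand-in for source code blocks

X = CountVectorizer().fit_transform(documents)
X_train, X_test = train_test_split(X, test_size=0.3, random_state=0)

for k in (2, 4, 8, 16):
    lda = LatentDirichletAllocation(n_components=k, random_state=0).fit(X_train)
    print(f"{k:>3} topics -> held-out perplexity {lda.perplexity(X_test):8.1f}")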

Journal ArticleDOI
TL;DR: The mixture-of-experts framework allows covariates to enter the latent position cluster model in a number of ways, yielding different model interpretations, and is demonstrated through an illustrative example detailing relationships among a group of lawyers in the USA.

Journal ArticleDOI
11 Jun 2010-PLOS ONE
TL;DR: A low-dimensional, context-independent semantic map of natural language that represents simultaneously synonymy and antonymy is constructed, and provides a foundational metric system for the quantitative analysis of word meaning.
Abstract: Metric systems for semantics, or semantic cognitive maps, are allocations of words or other representations in a metric space based on their meaning. Existing methods for semantic mapping, such as Latent Semantic Analysis and Latent Dirichlet Allocation, are based on paradigms involving dissimilarity metrics. They typically do not take into account relations of antonymy and yield a large number of domain-specific semantic dimensions. Here, using a novel self-organization approach, we construct a low-dimensional, context-independent semantic map of natural language that represents simultaneously synonymy and antonymy. Emergent semantics of the map principal components are clearly identifiable: the first three correspond to the meanings of “good/bad” (valence), “calm/excited” (arousal), and “open/closed” (freedom), respectively. The semantic map is sufficiently robust to allow the automated extraction of synonyms and antonyms not originally in the dictionaries used to construct the map and to predict connotation from their coordinates. The map geometric characteristics include a limited number (∼4) of statistically significant dimensions, a bimodal distribution of the first component, increasing kurtosis of subsequent (unimodal) components, and a U-shaped maximum-spread planar projection. Both the semantic content and the main geometric features of the map are consistent between dictionaries (Microsoft Word and Princeton's WordNet), among Western languages (English, French, German, and Spanish), and with previously established psychometric measures. By defining the semantics of its dimensions, the constructed map provides a foundational metric system for the quantitative analysis of word meaning. Language can be viewed as a cumulative product of human experiences. Therefore, the extracted principal semantic dimensions may be useful to characterize the general semantic dimensions of the content of mental states. This is a fundamental step toward a universal metric system for semantics of human experiences, which is necessary for developing a rigorous science of the mind.

Journal ArticleDOI
TL;DR: A model that learns semantic representations from the distributional statistics of language, and infers semantic representations by taking into account the inherent sequential nature of linguistic data is described.
Abstract: In this paper, we describe a model that learns semantic representations from the distributional statistics of language. This model, however, goes beyond the common bag-of-words paradigm, and infers semantic representations by taking into account the inherent sequential nature of linguistic data. The model we describe, which we refer to as a Hidden Markov Topics model, is a natural extension of the current state of the art in Bayesian bag-of-words models, that is, the Topics model of Griffiths, Steyvers, and Tenenbaum (2007), preserving its strengths while extending its scope to incorporate more fine-grained linguistic information.

Journal ArticleDOI
TL;DR: A flexible approach to modeling relations in development among two or more discrete, multidimensional latent variables based on the general framework of loglinear modeling with latent variables called associative latent transition analysis (ALTA).
Abstract: To understand one developmental process, it is often helpful to investigate its relations with other developmental processes. Statistical methods that model development in multiple processes simultaneously over time include latent growth curve models with time-varying covariates, multivariate latent growth curve models, and dual trajectory models. These models are designed for growth represented by continuous, unidimensional trajectories. The purpose of this article is to present a flexible approach to modeling relations in development among two or more discrete, multidimensional latent variables based on the general framework of loglinear modeling with latent variables called associative latent transition analysis (ALTA). Focus is given to the substantive interpretation of different associative latent transition models, and exactly what hypotheses are expressed in each model. An empirical demonstration of ALTA is presented to examine the association between the development of alcohol use and sexual risk ...

Journal ArticleDOI
TL;DR: In this article, a semiparametric latent variable model is developed, in which outcome latent variables are related to explanatory latent variables and covariates through an additive structural equation formulated by a series of unspecified smooth functions.
Abstract: This article aims to develop a semiparametric latent variable model, in which outcome latent variables are related to explanatory latent variables and covariates through an additive structural equation formulated by a series of unspecified smooth functions. The Bayesian P-splines approach, together with a Markov chain Monte Carlo algorithm, is proposed to estimate smooth functions, unknown parameters, and latent variables in the model. The performance of the developed methodology is demonstrated by a simulation study. An illustrative example in analyzing bone mineral density in older men is provided. An Appendix which includes technical details of the proposed MCMC algorithm and an R code in implementing the algorithm are available as the online supplemental materials.