Open Access Proceedings Article

A Sparsity Constraint for Topic Models - Application to Temporal Activity Mining

TLDR
This paper proposes a method that encourages sparsity by adding regularization constraints on the searched distributions; the constraints can be used with most topic models and lead to a simple modified version of the standard EM optimization procedure.
Abstract
We address the mining of sequential activity patterns from document logs given as word-time occurrences. We achieve this using topics that model both the co-occurrence and the temporal order in which words occur within a temporal window. Discovering such topics, which is particularly hard when multiple activities can occur simultaneously, is conducted through the joint inference of the temporal topics and of their starting times, allowing the implicit alignment of occurrences of the same activity within the document. A current issue is that while we would like topic starting times to be represented by sparse distributions, this is not achieved in practice. Thus, in this paper, we propose a method that encourages sparsity by adding regularization constraints on the searched distributions. The constraints can be used with most topic models (e.g. PLSA, LDA) and lead to a simple modified version of the standard EM optimization procedure. The effect of the sparsity constraint on our activity model and the robustness improvement in the presence of different types of noise have been validated on synthetic data. Its effectiveness is also illustrated in video activity analysis, where the discovered topics capture frequent patterns that implicitly represent typical trajectories of scene objects.
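To make the idea concrete, here is a minimal sketch of how a sparsity-encouraging constraint can be folded into an EM procedure. It uses plain PLSA on a document-word count matrix rather than the paper's temporal motif model, and it sharpens the expected counts with an exponent before renormalizing in the M-step. The exponent `sharpness`, the choice of regularizing p(z|d), and all names below are illustrative assumptions, not the paper's exact regularizer (which targets the topic starting-time distributions).

import numpy as np

def plsa_sparse(counts, n_topics, n_iters=50, sharpness=1.3, seed=0):
    """PLSA fit by EM, with a sparsity-encouraging M-step for p(z|d).

    counts: (n_docs, n_words) word-count matrix.
    sharpness > 1 lowers the entropy of p(z|d); 1.0 recovers plain EM.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)   # p(w|z)
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)   # p(z|d)
    for _ in range(n_iters):
        # E-step: posterior responsibilities p(z|d,w), shape (docs, words, topics)
        joint = p_z_d[:, None, :] * p_w_z.T[None, :, :]
        post = joint / joint.sum(axis=-1, keepdims=True)
        expected = counts[:, :, None] * post    # expected topic counts
        # M-step for p(w|z): standard maximum-likelihood update
        p_w_z = expected.sum(axis=0).T
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        # Regularized M-step for p(z|d): raise expected counts to a power > 1
        # and renormalize, which concentrates mass on few topics per document.
        m = expected.sum(axis=1) ** sharpness
        p_z_d = m / m.sum(axis=1, keepdims=True)
    return p_w_z, p_z_d

On a toy matrix, e.g. plsa_sparse(np.array([[5, 0, 1], [0, 4, 2]]), n_topics=2), increasing sharpness drives each row of p(z|d) toward a single dominant topic; in the paper's setting, the analogous modified update would instead act on the distribution over topic starting times.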



Citations
Proceedings Article

Optimization for Machine Learning

TL;DR: This book captures the state of the art of the interaction between optimization and machine learning in a way that is accessible to researchers in both fields, enriching the ongoing cross-fertilization between the machine learning community and the broader optimization community.
Journal ArticleDOI

Additive regularization of topic models

TL;DR: This paper introduces an alternative semi-probabilistic approach, called additive regularization of topic models (ARTM), which regularizes the ill-posed problem of stochastic matrix factorization by maximizing a weighted sum of the log-likelihood and additional criteria.
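Up to notation, the weighted-sum criterion mentioned above takes the form (n_{dw} are word counts, \phi_{wt} = p(w \mid t), \theta_{td} = p(t \mid d), and each R_i is a regularizer with nonnegative weight \tau_i):

\max_{\Phi,\Theta}\; \sum_{d \in D}\sum_{w \in W} n_{dw}\,\ln\!\sum_{t \in T}\phi_{wt}\,\theta_{td} \;+\; \sum_{i}\tau_i\,R_i(\Phi,\Theta)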
Journal ArticleDOI

A Sequential Topic Model for Mining Recurrent Activities from Long Term Video Logs

TL;DR: This paper introduces a novel probabilistic activity modeling approach that mines recurrent sequential patterns, called motifs, from documents given as word-time count matrices, and proposes a general method that favors the recovery of sparse distributions by adding simple regularization constraints on the searched distributions to the data-likelihood optimization criterion.
Book ChapterDOI

Tutorial on Probabilistic Topic Modeling: Additive Regularization for Stochastic Matrix Factorization

TL;DR: Additive Regularization of Topic Models (ARTM) is a non-Bayesian approach that is free of redundant probabilistic assumptions and provides simple inference for many combined and multi-objective topic models.
References
Journal ArticleDOI

Latent Dirichlet Allocation

TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Proceedings Article

Latent Dirichlet Allocation

TL;DR: This paper proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
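For reference, the generative model these two papers introduce factorizes, for a document of N words with topic proportions \theta drawn from a Dirichlet prior:

p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta) \;=\; p(\theta \mid \alpha)\,\prod_{n=1}^{N} p(z_n \mid \theta)\,p(w_n \mid z_n, \beta)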
Journal ArticleDOI

Unsupervised Learning by Probabilistic Latent Semantic Analysis

TL;DR: This paper proposes to make use of a temperature-controlled version of the Expectation-Maximization algorithm for model fitting, which has shown excellent performance in practice and results in a more principled approach with a solid foundation in statistical inference.
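The temperature-controlled (tempered) E-step referred to here raises the likelihood terms to a power \beta \le 1 before normalizing, so that \beta = 1 recovers standard EM (a sketch, up to notation):

P(z \mid d, w) \;=\; \frac{P(z)\,\big[P(d \mid z)\,P(w \mid z)\big]^{\beta}}{\sum_{z'} P(z')\,\big[P(d \mid z')\,P(w \mid z')\big]^{\beta}}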
Proceedings ArticleDOI

Dynamic topic models

TL;DR: A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections; these dynamic topic models provide a qualitative window into the contents of such a collection.
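Concretely, in the dynamic topic model the natural parameters \beta_{t,k} of topic k drift across time slices with Gaussian noise and are mapped to word probabilities through a softmax (a sketch of the state-space component, up to notation):

\beta_{t,k} \mid \beta_{t-1,k} \sim \mathcal{N}\!\big(\beta_{t-1,k},\, \sigma^2 I\big), \qquad p(w \mid \beta_{t,k}) \;=\; \frac{\exp(\beta_{t,k,w})}{\sum_{w'} \exp(\beta_{t,k,w'})}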
Proceedings ArticleDOI

Topics over time: a non-Markov continuous-time model of topical trends

TL;DR: An LDA-style topic model is presented that captures not only the low-dimensional structure of data, but also how the structure changes over time, showing improved topics, better timestamp prediction, and interpretable trends.
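In topics over time, each topic z additionally owns a Beta distribution over normalized timestamps, so a word's timestamp is generated jointly with the word (a sketch, up to notation):

z_{di} \sim \mathrm{Mult}(\theta_d), \qquad w_{di} \sim \mathrm{Mult}(\phi_{z_{di}}), \qquad t_{di} \sim \mathrm{Beta}(\psi_{z_{di}})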