
Showing papers by "Svetha Venkatesh published in 2012"


Posted Content
TL;DR: This paper explores and extends a probabilistic model known as Boltzmann Machine for collaborative filtering tasks, which seamlessly integrates both the similarity and co-occurrence in a principled manner and rivals existing well-known methods on moderate and large-scale movie recommendation.
Abstract: Collaborative filtering is an effective recommendation technique wherein the preference of an individual can potentially be predicted based on preferences of other members. Early algorithms often relied on the strong locality in the preference data, that is, it is enough to predict preference of a user on a particular item based on a small subset of other users with similar tastes or of other items with similar properties. More recently, dimensionality reduction techniques have proved to be equally competitive, and these are based on the co-occurrence patterns rather than locality. This paper explores and extends a probabilistic model known as Boltzmann Machine for collaborative filtering tasks. It seamlessly integrates both the similarity and co-occurrence in a principled manner. In particular, we study parameterisation options to deal with the ordinal nature of the preferences, and propose a joint modelling of both the user-based and item-based processes. Experiments on moderate and large-scale movie recommendation show that our framework rivals existing well-known methods.
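To make the model family concrete, here is a minimal Bernoulli-Bernoulli restricted Boltzmann machine trained with one-step contrastive divergence on toy preference vectors. This is a generic sketch of RBMs, not the paper's ordinal user/item parameterisation; the two prototype "taste" patterns, learning rate, and free-energy check are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli-Bernoulli RBM trained with 1-step contrastive divergence."""
    def __init__(self, n_vis, n_hid):
        self.W = rng.normal(0.0, 0.1, (n_vis, n_hid))
        self.b = np.zeros(n_vis)   # visible biases
        self.c = np.zeros(n_hid)   # hidden biases

    def cd1(self, v0, lr=0.1):
        ph0 = sigmoid(v0 @ self.W + self.c)          # P(h|v) on the data
        h0 = (rng.random(ph0.shape) < ph0) * 1.0     # sample hidden units
        pv1 = sigmoid(h0 @ self.W.T + self.b)        # one-step reconstruction
        ph1 = sigmoid(pv1 @ self.W + self.c)
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)

    def free_energy(self, v):
        # Lower free energy = higher probability under the model.
        return -v @ self.b - np.logaddexp(0.0, v @ self.W + self.c).sum(axis=1)

# Toy "preference" data: users like either the first or the second half of items.
proto = np.array([[1] * 5 + [0] * 5, [0] * 5 + [1] * 5], dtype=float)
data = proto[rng.integers(0, 2, 200)]

rbm = RBM(10, 4)
for _ in range(300):
    rbm.cd1(data)

seen = rbm.free_energy(proto).mean()                           # patterns in the data
unseen = rbm.free_energy(rng.random((50, 10)).round()).mean()  # random patterns
```

After training, the data prototypes should receive lower free energy (higher probability) than random preference vectors, which is the co-occurrence-capturing behaviour the abstract appeals to.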

62 citations


Journal ArticleDOI
TL;DR: An iterative algorithm is proposed for solving the robust CS problem that exploits the power of existing CS solvers and the upper bound on the recovery error in the case of non-Gaussian noise is reduced.
Abstract: Compressed sensing (CS) is a new information sampling theory for acquiring sparse or compressible data with far fewer measurements than otherwise required by the Nyquist/Shannon counterpart. This is particularly important for imaging applications such as magnetic resonance imaging or astronomy. However, in the existing CS formulation, the use of the ℓ2 norm on the residuals is not particularly efficient when the noise is impulsive. This could lead to an increase in the upper bound of the recovery error. To address this problem, we consider a robust formulation for CS to suppress outliers in the residuals. We propose an iterative algorithm for solving the robust CS problem that exploits the power of existing CS solvers. We also show that the upper bound on the recovery error in the case of non-Gaussian noise is reduced, and then demonstrate the efficacy of the method through numerical studies.

45 citations


Proceedings ArticleDOI
16 Jun 2012
TL;DR: A novel approach to improving subspace clustering by exploiting the spatial constraints is presented, which encourages the sparse solution to be consistent with the spatial geometry of the tracked points, by embedding weights into the sparse formulation.
Abstract: We present a novel approach to improving subspace clustering by exploiting spatial constraints. The new method encourages the sparse solution to be consistent with the spatial geometry of the tracked points by embedding weights into the sparse formulation. By doing so, we are able to correct sparse representations in a principled manner without introducing much additional computational cost. We discuss alternative ways to treat missing and corrupted data using the latest theory in robust lasso regression, and suggest numerical algorithms to solve the proposed formulation. Experiments on the benchmark Johns Hopkins 155 dataset demonstrate that exploiting spatial constraints significantly improves motion segmentation.
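One way to picture "embedding weights into the sparse formulation" is a weighted lasso where spatially distant points pay a larger ℓ1 price. The weight function, two toy 1-D subspaces, and point coordinates below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(1)

def spatial_weights(coords, sigma=2.0):
    # Distant tracked points pay a larger l1 price, so the sparse code
    # favours spatially nearby points (assumed weighting, for illustration).
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(d / sigma)

def weighted_lasso(D, y, w, lam=0.05, iters=500):
    # Proximal gradient on: 0.5*||y - Dc||^2 + lam * sum_j w_j * |c_j|
    L = np.linalg.norm(D, 2) ** 2
    c = np.zeros(D.shape[1])
    for _ in range(iters):
        v = c - D.T @ (D @ c - y) / L
        c = np.sign(v) * np.maximum(np.abs(v) - lam * w / L, 0.0)
    return c

# Two 1-D subspaces ("motions") in R^5: three points on the first, two on the second.
u1 = rng.normal(size=5); u1 /= np.linalg.norm(u1)
u2 = rng.normal(size=5); u2 /= np.linalg.norm(u2)
X = np.column_stack([u1, 0.8 * u1, 1.2 * u1, u2, 0.9 * u2])
coords = np.array([[0, 0], [0.1, 0], [0.2, 0], [5, 5], [5.1, 5]], dtype=float)

# Code point 0 over the remaining points, with spatially derived penalty weights.
w = spatial_weights(coords)[0, 1:]
c = weighted_lasso(X[:, 1:], X[:, 0], w)
```

The sparse code for point 0 concentrates on the spatially close points of its own subspace (`c[:2]`), while the distant points (`c[2:]`) are priced out entirely.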

38 citations


Posted Content
TL;DR: More computationally efficient algorithms are proposed by following the latest advances in large-scale convex optimization for non-smooth regularization, which improve robust CS and effectively solve more sophisticated extensions where the original methods simply cannot.
Abstract: Compressed sensing (CS) is an important theory for sub-Nyquist sampling and recovery of compressible data. Recently, it has been extended by Pham and Venkatesh to cope with the case where corruption to the CS data is modeled as impulsive noise. The new formulation, termed robust CS, combines robust statistics and CS into a single framework to suppress outliers in the CS recovery. To solve the newly formulated robust CS problem, Pham and Venkatesh suggested a scheme that iteratively solves a number of CS problems, the solutions from which converge to the true robust compressed sensing solution. However, this scheme is rather inefficient as it has to use existing CS solvers as a proxy. To overcome the limitations of the original robust CS algorithm, we propose to solve the robust CS problem directly in this paper and derive more computationally efficient algorithms by following the latest advances in large-scale convex optimization for non-smooth regularization. Furthermore, we also extend the robust CS formulation to various settings, including additional affine constraints, $\ell_1$-norm loss function, mixed-norm regularization, and multi-tasking, so as to further improve robust CS. We also derive simple but effective algorithms to solve these extensions. We demonstrate that the new algorithms provide a substantial computational advantage over the original robust CS formulation, and effectively solve more sophisticated extensions where the original methods simply cannot. We demonstrate the usefulness of the extensions on several CS imaging tasks.

34 citations


Journal ArticleDOI
TL;DR: This work presents a portable platform for pervasive delivery of early intervention therapy using multi-touch interfaces and principled ways to deliver stimuli of increasing complexity and adapt to a child's performance.

34 citations


Proceedings Article
01 Jan 2012
TL;DR: The model utilizes the hierarchical beta process as a nonparametric prior to automatically infer the number of shared and individual factors and provides a Gibbs sampling scheme using auxiliary variables for posterior inference.
Abstract: Joint analysis of multiple data sources is becoming increasingly popular in transfer learning, multi-task learning and cross-domain data mining. One promising approach to modeling the data jointly is through learning the shared and individual factor subspaces. However, the performance of this approach depends on the subspace dimensionalities and the level of sharing, which need to be specified a priori. To this end, we propose a nonparametric joint factor analysis framework for modeling multiple related data sources. Our model utilizes the hierarchical beta process as a nonparametric prior to automatically infer the number of shared and individual factors. For posterior inference, we provide a Gibbs sampling scheme using auxiliary variables. The effectiveness of the proposed framework is validated through its application to two real-world problems - transfer learning in text and image retrieval.
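The flavour of nonparametric prior at work here can be illustrated with the Indian buffet process (the marginal of a beta-Bernoulli process): the number of latent factors is unbounded and inferred from the data rather than fixed a priori. This is a generic single-source sketch, not the paper's hierarchical beta process construction; the concentration parameter is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(7)

def ibp(n, alpha):
    """One draw of a binary factor-assignment matrix from the Indian buffet
    process: rows are data points, columns are latent factors, and the
    number of columns is unbounded rather than fixed a priori."""
    Z = np.zeros((n, 0))
    for i in range(n):
        if Z.shape[1]:
            # Reuse an existing factor k with probability m_k / (i + 1),
            # where m_k is how many earlier points already use it.
            p = Z[:i].sum(axis=0) / (i + 1)
            Z[i, :] = rng.random(Z.shape[1]) < p
        # Introduce Poisson(alpha / (i + 1)) brand-new factors.
        k_new = rng.poisson(alpha / (i + 1))
        if k_new:
            Z_new = np.zeros((n, k_new))
            Z_new[i, :] = 1.0
            Z = np.hstack([Z, Z_new])
    return Z

Z = ibp(50, alpha=3.0)   # expected number of factors ~ alpha * H_50, i.e. 13-14
```

Popular factors are reused by later points (rich-get-richer), while the Poisson term keeps adding occasional new ones; in a factor model, the columns of `Z` play the role of the automatically inferred factors.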

24 citations


Journal ArticleDOI
TL;DR: An innovative face hallucination approach based on principal component analysis (PCA) and a residue technique is proposed, together with recursive and two-stage methods that improve the results of face image enhancement.
Abstract: In this paper, we propose an innovative face hallucination approach based on principal component analysis (PCA) and a residue technique. First, the relationship between the projection coefficients of high-resolution and low-resolution images under PCA is investigated. Then, based on this analysis, a high-resolution global face image is constructed from a low-resolution one. Next, a high-resolution residue is derived based on the similarity between the projections on the high- and low-resolution residue training sets. Finally, by combining the global face and the residue in high resolution, a high-resolution face image is generated. We also propose recursive and two-stage methods, which improve the results of face image enhancement. Extensive experiments validate the proposed approaches.
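The core step, relating low-resolution and high-resolution PCA projection coefficients, can be sketched on toy data. Here "faces" are smooth 1-D signals and the coefficient relationship is learned by least squares; the signal model and dimensions are illustrative assumptions, and the residue stage of the paper is omitted.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "faces": smooth 1-D signals spanned by 5 sinusoids; low-res = 4x downsampling.
def make_face(n=64):
    t = np.linspace(0.0, 1.0, n)
    return sum(rng.normal() * np.sin((k + 1) * np.pi * t) for k in range(5))

H = np.column_stack([make_face() for _ in range(40)])   # high-res training set
L = H[::4, :]                                           # low-res counterparts

def pca(X, d):
    mean = X.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(X - mean, full_matrices=False)
    return mean, U[:, :d]

mh, Uh = pca(H, 5)
ml, Ul = pca(L, 5)

# Relate low-res and high-res PCA coefficients by a least-squares linear map.
Ch = Uh.T @ (H - mh)
Cl = Ul.T @ (L - ml)
M = Ch @ np.linalg.pinv(Cl)

def hallucinate(low):
    c = Ul.T @ (low[:, None] - ml)        # project the low-res input
    return (mh + Uh @ (M @ c)).ravel()    # reconstruct in high resolution

x = make_face()
x_hat = hallucinate(x[::4])
rel_err = np.linalg.norm(x_hat - x) / np.linalg.norm(x)
```

Because the toy signals live exactly in a 5-dimensional subspace, the coefficient map reconstructs unseen high-resolution signals almost perfectly; real faces only approximately satisfy this, which is what the residue step compensates for.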

21 citations


Proceedings Article
01 Jan 2012
TL;DR: A novel approach to the identification of latent hyper-groups in social communities based on users’ sentiment is presented, showing that a sentiment-based approach can yield useful insights into community formation and metacommunities.
Abstract: Participating in a community exemplifies the aspect of sharing, networking and interacting in a social media system. There has been extensive work on characterising on-line communities by their contents and tags using topic modelling tools. However, the role of sentiment and mood has not been studied. Arguably, mood is an integral feature of a text, and becomes more significant in the context of social media: two communities might discuss precisely the same topics, yet within an entirely different atmosphere. Such sentiment-related distinctions are important for many kinds of analysis and applications, such as community recommendation. We present a novel approach to the identification of latent hyper-groups in social communities based on users’ sentiment. The results show that a sentiment-based approach can yield useful insights into community formation and metacommunities, having potential applications in, for example, mental health—by targeting support or surveillance to communities with negative mood—or in marketing—by targeting customer communities having the same sentiment on similar topics.

16 citations


Proceedings ArticleDOI
16 Dec 2012
TL;DR: This work proposes a modification of the decayed MCMC technique for incremental inference, providing the ability to discover theoretically unlimited patterns in unbounded video streams, and achieves near real-time execution and encouraging performance in abnormal activity detection.
Abstract: We propose a novel framework for large-scale scene understanding in static camera surveillance. Our techniques combine fast rank-1 constrained robust PCA to compute the foreground with non-parametric Bayesian models for inference. Clusters are extracted in foreground patterns using a joint multinomial+Gaussian Dirichlet process model (DPM). Since the multinomial distribution is normalized, the Gaussian mixture distinguishes between similar spatial patterns but different activity levels (e.g. car vs. bike). We propose a modification of the decayed MCMC technique for incremental inference, providing the ability to discover theoretically unlimited patterns in unbounded video streams. A promising by-product of our framework is online abnormal activity detection. A benchmark video and two surveillance videos, the longest being 140 hours long, are used in our experiments. The patterns discovered are as informative as those of existing scene understanding algorithms. However, unlike existing work, we achieve near real-time execution and encouraging performance in abnormal activity detection.

14 citations


Proceedings Article
22 Jul 2012
TL;DR: A novel sequential decision approach to modeling ordinal ratings in collaborative filtering problems that makes novel use of the generalised extreme value distributions, which are found to be particularly suitable for the modeling task and, at the same time, facilitate the inference and learning procedure.
Abstract: We propose a novel sequential decision approach to modeling ordinal ratings in collaborative filtering problems. The rating process is assumed to start from the lowest level, evaluate against the latent utility at the corresponding level, and move up until a suitable ordinal level is found. Crucial to this generative process are the underlying utility random variables that govern the generation of ratings and their modelling choices. To this end, we make novel use of the generalised extreme value distributions, which are found to be particularly suitable for our modeling tasks and, at the same time, facilitate our inference and learning procedure. The proposed approach is flexible enough to incorporate features from both the user and the item. We evaluate the proposed framework on three well-known datasets: MovieLens, Dating Agency and Netflix. In all cases, it is demonstrated that the proposed work is competitive against state-of-the-art collaborative filtering methods.
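The sequential generative story can be sketched as a stopping process: start at the lowest level and keep moving up while a noisy latent utility favours continuing, with the noise drawn from a Gumbel distribution (the type-I generalised extreme value distribution). The per-level thresholds below are illustrative parameters, not the paper's parameterisation.

```python
import numpy as np

rng = np.random.default_rng(8)

def sample_rating(theta, levels=5):
    # Start at the lowest level; keep moving up while a Gumbel-noised
    # latent utility favours continuing past the current level.
    for level in range(1, levels):
        if theta[level - 1] + rng.gumbel() < 0.0:   # not enough utility to pass
            return level
    return levels

theta = np.array([2.0, 1.0, 0.0, -1.0])   # decreasing drive to keep moving up
ratings = np.array([sample_rating(theta) for _ in range(5000)])
```

With the Gumbel CDF `exp(-exp(-x))`, each stop/continue probability is available in closed form, which is the kind of tractability that makes extreme value distributions convenient for inference in such models.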

13 citations


Proceedings Article
01 Jan 2012
TL;DR: This work proposes a nonparametric Bayesian, linear Poisson gamma model for count data and uses it for dictionary learning and presents an auxiliary variable Gibbs sampler, which turns the intractable inference into a tractable one.
Abstract: We propose a nonparametric Bayesian, linear Poisson-gamma model for count data and use it for dictionary learning. A key property of this model is that it captures the parts-based representation similar to nonnegative matrix factorization. We present an auxiliary variable Gibbs sampler, which turns the intractable inference into a tractable one. Combining this inference procedure with the slice sampler of the Indian buffet process, we show that our model can learn the number of factors automatically. Using synthetic and real-world datasets, we show that the proposed model outperforms other state-of-the-art nonparametric factor models.

Proceedings Article
01 Jan 2012
TL;DR: In this paper, Gaussian restricted Boltzmann machines (RBMs) are used to capture the latent opinion profile of citizens around the world, and are competitive against state-of-the-art collaborative filtering techniques on large-scale public datasets.
Abstract: Ordinal data is omnipresent in almost all multiuser-generated feedback - questionnaires, preferences etc. This paper investigates modelling of ordinal data with Gaussian restricted Boltzmann machines (RBMs). In particular, we present the model architecture, learning and inference procedures for both vector-variate and matrix-variate ordinal data. We show that our model is able to capture the latent opinion profile of citizens around the world, and is competitive against state-of-the-art collaborative filtering techniques on large-scale public datasets. The model thus has the potential to extend the application of RBMs to diverse domains such as recommendation systems, product reviews and expert assessments.

Journal ArticleDOI
TL;DR: Evaluation in typical acoustic echo setups shows that the proposed method outperforms other conventional doubletalk detectors in terms of probability of missed detection, even under poor echo-to-noise ratio (ENR) and low echo-to-far-end ratio (EFR) conditions, and under echo path change.
Abstract: A new class of doubletalk detector based on exploiting a spectral slit is proposed. This is achieved by spectrally deleting a frequency band in the far-end signal such that when the near-end signal is present, only the near-end spectral information appears in the slit. The proposed method relies solely on the detection of the speech-activity period in the slit area and, significantly, requires no estimation of the echo path. Evaluation in typical acoustic echo setups shows that the proposed method outperforms other conventional doubletalk detectors in terms of probability of missed detection, even under poor echo-to-noise ratio (ENR) and low echo-to-far-end ratio (EFR) conditions, and under echo path change.
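The spectral-slit idea is easy to demonstrate: delete a band from the far-end signal, so any echo of it is also empty in that band, and flag doubletalk when the microphone shows energy inside the slit. The band edges, signals, echo model (a circular delay plus gain, for simplicity), and detection threshold below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
fs, n = 8000, 4096
slit = (1000.0, 1200.0)   # band deleted from the far-end signal (illustrative)

def remove_band(x, lo, hi):
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1.0 / fs)
    X[(f >= lo) & (f <= hi)] = 0.0     # carve the spectral slit
    return np.fft.irfft(X, len(x))

def slit_energy(x, lo, hi):
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1.0 / fs)
    return np.sum(np.abs(X[(f >= lo) & (f <= hi)]) ** 2) / len(x)

far = remove_band(rng.normal(size=n), *slit)   # far-end signal stand-in
echo = 0.6 * np.roll(far, 32)                  # unknown echo path: delay + gain
near = np.sin(2 * np.pi * 1100.0 * np.arange(n) / fs)   # near-end talker

mic_far_only = echo + 0.01 * rng.normal(size=n)
mic_doubletalk = echo + near + 0.01 * rng.normal(size=n)

# Any echo of the far end is (nearly) empty in the slit, so slit energy
# flags near-end activity without estimating the echo path.
e_far = slit_energy(mic_far_only, *slit)
e_dt = slit_energy(mic_doubletalk, *slit)
doubletalk = e_dt > 10.0 * e_far
```

Note that the detector never models the echo path: whatever the delay and gain, the echo inherits the empty band, so slit energy can only come from the near end (or the noise floor).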

Proceedings ArticleDOI
09 Jul 2012
TL;DR: Temporal Semantic Compression is extended for interactive video browsing, which uses an arbitrary frame-by-frame interest measure to sub-sample video in real time, with user interface elements that visualize these measures and the effect of compression on them.
Abstract: Almost every aspect of how we create, transmit, and consume video has changed, but video interfaces still mimic those from video's inception. We extend Temporal Semantic Compression for interactive video browsing, which uses an arbitrary frame-by-frame interest measure to sub-sample video in real time, with user interface elements that visualize these measures and the effect of compression on them. We experiment with a novel interest measure for popularity, and design novel visualizations for expressing interest measures and the compression interaction. We conduct the first formative evaluation of the TSC paradigm, with 8 subjects, and report design implications arising from it.

Proceedings Article
14 Aug 2012
TL;DR: In this article, a modified hierarchical beta process prior is proposed for hierarchical modeling of multiple data sources, which allows factors to be shared across different sources and enables tractable inference even when the likelihood and the prior over parameters are nonconjugate.
Abstract: Hierarchical beta process has found interesting applications in recent years. In this paper we present a modified hierarchical beta process prior with applications to hierarchical modeling of multiple data sources. The novel use of the prior over a hierarchical factor model allows factors to be shared across different sources. We derive a slice sampler for this model, enabling tractable inference even when the likelihood and the prior over parameters are non-conjugate. This allows the application of the model in much wider contexts without restrictions. We present two different data generative models - a linear Gaussian-Gaussian model for real valued data and a linear Poisson-gamma model for count data. Encouraging transfer learning results are shown for two real world applications - text modeling and content based image retrieval.

Proceedings ArticleDOI
09 Jul 2012
TL;DR: This work introduces a new method for face recognition using a versatile probabilistic model known as Restricted Boltzmann Machine (RBM) to regularise the standard data likelihood learning with an information-theoretic distance metric defined on intra-personal images.
Abstract: We introduce a new method for face recognition using a versatile probabilistic model known as Restricted Boltzmann Machine (RBM). In particular, we propose to regularise the standard data likelihood learning with an information-theoretic distance metric defined on intra-personal images. This results in an effective face representation which captures the regularities in the face space and minimises the intra-personal variations. In addition, our method allows easy incorporation of multiple feature sets with controllable level of sparsity. Our experiments on a high variation dataset show that the proposed method is competitive against other metric learning rivals. We also investigated the RBM method under a variety of settings, including fusing facial parts and utilising localised feature detectors under varying resolutions. In particular, the accuracy is boosted from 71.8% with the standard whole-face pixels to 99.2% with combination of facial parts, localised feature extractors and appropriate resolutions.

Proceedings Article
01 Jan 2012
TL;DR: This paper first employs the recently proposed infinite HMM and collapsed Gibbs inference to automatically infer the data segmentation, and then constructs abnormality detection models localized to each segment.
Abstract: This paper examines a new problem in large-scale stream data: abnormality detection which is localized to a data segmentation process. Unlike traditional abnormality detection methods, which typically build one unified model across the data stream, we propose that building multiple detection models focused on different coherent sections of the video stream results in better detection performance. One key challenge is to segment the data into coherent sections, as the number of segments is not known in advance and can vary greatly across cameras; a principled approach is therefore required. To this end, we first employ the recently proposed infinite HMM and collapsed Gibbs inference to automatically infer the data segmentation, and then construct abnormality detection models localized to each segment. We demonstrate the superior performance of the proposed framework on real-world surveillance camera data spanning 14 days.

Proceedings Article
01 Jan 2012
TL;DR: Here, a probabilistic log-linear model is constructed over the set of ordered subsets; it is competitive against state-of-the-art methods on large-scale collaborative filtering tasks.
Abstract: Ranking over sets arises when users choose between groups of items. For example, a group may be the movies a user deems 5-star, or a customized tour package. It turns out that, to model this data type properly, we need to investigate the general combinatorics problem of partitioning a set and ordering the subsets. Here we construct a probabilistic log-linear model over a set of ordered subsets. Inference in this combinatorial space is highly challenging: the space size approaches (N!/2)/(0.693145)^{N+1} as N approaches infinity. We propose a split-and-merge Metropolis-Hastings procedure that can explore the state space efficiently. For discovering hidden aspects in the data, we enrich the model with latent binary variables so that the posteriors can be efficiently evaluated. Finally, we evaluate the proposed model on large-scale collaborative filtering tasks and demonstrate that it is competitive against state-of-the-art methods.

Proceedings Article
09 Jul 2012
TL;DR: In this article, an Embedded Restricted Boltzmann machine (eRBM) is proposed to represent mixed data using a layer of hidden variables transparent across different types of data.
Abstract: Analysis and fusion of social measurements is important to understand what shapes the public's opinion and the sustainability of the global development. However, modeling data collected from social responses is challenging as the data is typically complex and heterogeneous, which might take the form of stated facts, subjective assessment, choices, preferences or any combination thereof. Model-wise, these responses are a mixture of data types including binary, categorical, multicategorical, continuous, ordinal, count and rank data. The challenge is therefore to effectively handle mixed data in a unified fusion framework in order to perform inference and analysis. To that end, this paper introduces eRBM (Embedded Restricted Boltzmann Machine) — a probabilistic latent variable model that can represent mixed data using a layer of hidden variables transparent across different types of data. The proposed model can comfortably support large-scale data analysis tasks, including distribution modelling, data completion, prediction and visualisation. We demonstrate these versatile features on several moderate and large-scale publicly available social survey datasets.

Proceedings ArticleDOI
10 Dec 2012
TL;DR: An ℓ1-norm optimization formulation is posed to learn the sparse representation of each document, allowing us to characterize the affinity between documents by considering the overall information instead of traditional pairwise similarities.
Abstract: We present a novel method for document clustering using sparse representation of documents in conjunction with spectral clustering. An ℓ1-norm optimization formulation is posed to learn the sparse representation of each document, allowing us to characterize the affinity between documents by considering the overall information instead of traditional pairwise similarities. This document affinity is encoded through a graph on which spectral clustering is performed. The decomposition into multiple subspaces allows documents to be part of a sub-group that shares a smaller set of similar vocabulary, thus allowing for cleaner clusters. Extensive experimental evaluations on two real-world datasets from the Reuters-21578 and 20Newsgroups corpora show that our proposed method consistently outperforms state-of-the-art algorithms. Significantly, the performance improvement over other methods is prominent for these datasets.
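The pipeline — ℓ1 self-representation, a graph built from the codes, then spectral clustering — can be sketched end to end on a toy corpus with two disjoint vocabulary blocks. The corpus, ℓ1 solver, and two-way spectral cut below are illustrative simplifications of the method described.

```python
import numpy as np

rng = np.random.default_rng(4)

def ista_l1(D, y, lam=0.05, iters=400):
    # Proximal gradient for: 0.5*||y - Dc||^2 + lam*||c||_1
    L = np.linalg.norm(D, 2) ** 2 + 1e-9
    c = np.zeros(D.shape[1])
    for _ in range(iters):
        v = c - D.T @ (D @ c - y) / L
        c = np.sign(v) * np.maximum(np.abs(v) - lam / L, 0.0)
    return c

# Toy corpus: two topics with disjoint vocabulary blocks.
docs = np.zeros((20, 10))
docs[:10, :5] = 0.5 + rng.random((10, 5))    # topic A uses words 0-4
docs[10:, 5:] = 0.5 + rng.random((10, 5))    # topic B uses words 5-9
X = (docs / np.linalg.norm(docs, axis=1, keepdims=True)).T   # words x docs

# Sparse self-representation: each document coded over all the others.
n_docs = X.shape[1]
C = np.zeros((n_docs, n_docs))
for i in range(n_docs):
    idx = [j for j in range(n_docs) if j != i]
    C[idx, i] = ista_l1(X[:, idx], X[:, i])

W = np.abs(C) + np.abs(C).T + 1e-6       # symmetric affinity graph
Lap = np.diag(W.sum(axis=1)) - W         # graph Laplacian
vals, vecs = np.linalg.eigh(Lap)
fiedler = vecs[:, 1]                     # eigenvector of 2nd-smallest eigenvalue
labels = (fiedler > np.median(fiedler)).astype(int)
```

Because documents can only be represented by documents sharing their vocabulary, the code matrix `C` is block-diagonal, and the spectral cut recovers the two topics — the "whole-representation" affinity the abstract contrasts with pairwise similarities.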

Proceedings Article
01 Jan 2012
TL;DR: This work proposes an explicit but effective solution using the ℓp norm and explains theoretically and numerically why such a metric norm is able to suppress outliers and thus can significantly improve classification performance, comparable to state-of-the-art algorithms on some challenging datasets.
Abstract: We address the limitation of sparse representation based classification with group information for multi-pose face recognition. First, we observe that the key issue in such a classification problem lies in the choice of the metric norm of the residual vectors, which represent the fitness of each class. Then we point out that a limitation of current sparse representation classification algorithms is the wrong choice of the ℓ2 norm, which does not match the data statistics, as the residual values may be considerably non-Gaussian. We propose an explicit but effective solution using the ℓp norm and explain theoretically and numerically why such a metric norm is able to suppress outliers and can thus significantly improve classification performance, comparable to state-of-the-art algorithms on some challenging datasets.
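The effect of the residual norm is easy to see with two hand-built residual vectors: one from the true class with a single large unexplained entry (an outlier, e.g. occlusion), one from a wrong class with moderate error everywhere. The residual values and p = 0.5 are illustrative assumptions, not drawn from the paper's experiments.

```python
import numpy as np

def lp_residual(r, p):
    # Generalised residual size sum_i |r_i|^p; p < 2 down-weights large
    # entries, so a single outlier no longer dominates the class decision.
    return float(np.sum(np.abs(r) ** p))

dim = 50
# Residual of the true class: tiny everywhere except one outlier entry
# (say, an occluded pixel the class subspace cannot explain).
r_true = np.full(dim, 0.1)
r_true[0] = 10.0
# Residual of a wrong class: moderate error spread over every entry.
r_wrong = np.full(dim, 1.0)

l2_pick = "true" if lp_residual(r_true, 2.0) < lp_residual(r_wrong, 2.0) else "wrong"
lp_pick = "true" if lp_residual(r_true, 0.5) < lp_residual(r_wrong, 0.5) else "wrong"
```

Under ℓ2 the single squared outlier (100) outweighs fifty moderate errors (50), so the wrong class wins; under ℓ0.5 the outlier contributes only √10 ≈ 3.2 and the true class is selected.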

Proceedings ArticleDOI
29 Oct 2012
TL;DR: It is proposed that MIR add to its toolbox a linguistic perspective, and three useful emphases of research are highlighted: genre, emergence, and effect.
Abstract: With Multimedia Information Retrieval frustrated by the seemingly intractable semantic gap, we turn to the related field of linguistics for fresh inspiration and ideas for old problems and new opportunities. An explosion in the amount and ease with which multimedia items are created and shared, courtesy of new devices and Web 2.0, prompts us to consider what happens when those items are viewed not as artefacts, or "built things", but as utterances. These conversations occur in a mixture of mediums, including text, images, audio, and video, and channels, including Twitter, Facebook, Youtube and blogs, and range in scope from one-to-one exchanges to loosely-bounded meta-conversations that cross cultures and span the globe via remixing of shared meanings like memes. We propose that MIR add to its toolbox a linguistic perspective, and highlight three useful emphases of research: genre, emergence, and effect.