Showing papers by "Eric P. Xing published in 2010"

PDF

Open Access

Proceedings Article•DOI•

Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification

[...]

Li-Jia Li¹, Hao Su¹, Li Fei-Fei¹, Eric P. Xing²•Institutions (2)

Stanford University¹, Carnegie Mellon University²

06 Dec 2010

TL;DR: A high-level image representation, called the Object Bank, is proposed, where an image is represented as a scale-invariant response map of a large number of pre-trained generic object detectors, blind to the testing dataset or visual task.

...read moreread less

Abstract: Robust low-level image features have been proven to be effective representations for a variety of visual recognition tasks such as object recognition and scene classification; but pixels, or even local image patches, carry little semantic meanings. For high level visual tasks, such low-level image representations are potentially not enough. In this paper, we propose a high-level image representation, called the Object Bank, where an image is represented as a scale-invariant response map of a large number of pre-trained generic object detectors, blind to the testing dataset or visual task. Leveraging on the Object Bank representation, superior performances on high level visual recognition tasks can be achieved with simple off-the-shelf classifiers such as logistic regression and linear SVM. Sparsity algorithms make our representation more efficient and scalable for large scene datasets, and reveal semantically meaningful feature patterns.

...read moreread less

1,027 citations

Proceedings Article•

A Latent Variable Model for Geographic Lexical Variation

[...]

Jacob Eisenstein¹, Brendan O'Connor¹, Noah A. Smith¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

09 Oct 2010

TL;DR: A multi-level generative model that reasons jointly about latent topics and geographical regions is presented, which recovers coherent topics and their regional variants, while identifying geographic areas of linguistic consistency.

...read moreread less

Abstract: The rapid growth of geotagged social media raises new computational possibilities for investigating geographic linguistic variation. In this paper, we present a multi-level generative model that reasons jointly about latent topics and geographical regions. High-level topics such as "sports" or "entertainment" are rendered differently in each geographic region, revealing topic-specific regional distinctions. Applied to a new dataset of geotagged microblogs, our model recovers coherent topics and their regional variants, while identifying geographic areas of linguistic consistency. The model also enables prediction of an author's geographic location from raw text, outperforming both text regression and supervised topic models.

...read moreread less

691 citations

Journal Article•DOI•

Discrete temporal models of social networks

[...]

Steve Hanneke, Wenjie Fu, Eric P. Xing

01 Jan 2010-Electronic Journal of Statistics

TL;DR: This paper propose a family of statistical models for social network evolution over time, which represent an extension of Exponential Random Graph Models (ERGMs) and give examples of their use for hypothesis testing and classification.

...read moreread less

Abstract: We propose a family of statistical models for social network evolution over time, which represents an extension of Exponential Random Graph Models (ERGMs). Many of the methods for ERGMs are readily adapted for these models, including maximum likelihood estimation algorithms. We discuss models of this type and their properties, and give examples, as well as a demonstration of their use for hypothesis testing and classification. We believe our temporal ERG models represent a useful new framework for modeling time-evolving social networks, and rewiring networks from other domains such as gene regulation circuitry, and communication networks.

...read moreread less

463 citations

Proceedings Article•DOI•

Tree-Guided Group Lasso for Multi-Task Regression with Structured Sparsity

[...]

Seyoung Kim¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

21 Jun 2010

TL;DR: This work considers the problem of learning a sparse multi-task regression, where the structure in the outputs can be represented as a tree with leaf nodes as outputs and internal nodes as clusters of the outputs at multiple granularity, and proposes a structured regularization based on a group-lasso penalty.

...read moreread less

Abstract: We consider the problem of learning a sparse multi-task regression, where the structure in the outputs can be represented as a tree with leaf nodes as outputs and internal nodes as clusters of the outputs at multiple granularity. Our goal is to recover the common set of relevant inputs for each output cluster. Assuming that the tree structure is available as prior knowledge, we formulate this problem as a new multi-task regularized regression called tree-guided group lasso. Our structured regularization is based on a group-lasso penalty, where groups are defined with respect to the tree structure. We describe a systematic weighting scheme for the groups in the penalty such that each output variable is penalized in a balanced manner even if the groups overlap. We present an efficient optimization method that can handle a large-scale problem. Using simulated and yeast datasets, we demonstrate that our method shows a superior performance in terms of both prediction errors and recovery of true sparsity patterns compared to other methods for multi-task learning.

...read moreread less

397 citations

Journal Article•DOI•

Estimating time-varying networks

[...]

Mladen Kolar, Le Song, Amr Ahmed, Eric P. Xing

01 Mar 2010-The Annals of Applied Statistics

TL;DR: In this paper, a temporally smoothed l 1-regularized logistic regression formalism is proposed to estimate time-varying networks from time series of entity attributes, which can be cast as a standard convex optimization problem and solved efficiently using generic solvers scalable to large networks.

...read moreread less

Abstract: Stochastic networks are a plausible representation of the relational information among entities in dynamic systems such as living cells or social communities. While there is a rich literature in estimating a static or temporally invariant network from observation data, little has been done toward estimating time-varying networks from time series of entity attributes. In this paper we present two new machine learning methods for estimating time-varying networks, which both build on a temporally smoothed l1-regularized logistic regression formalism that can be cast as a standard convex-optimization problem and solved efficiently using generic solvers scalable to large networks. We report promising results on recovering simulated time-varying networks. For real data sets, we reverse engineer the latent sequence of temporally rewiring political networks between Senators from the US Senate voting records and the latent evolving regulatory networks underlying 588 genes across the life cycle of Drosophila melanogaster from the microarray time course.

...read moreread less

268 citations

Journal Article•DOI•

A state-space mixed membership blockmodel for dynamic network tomography

[...]

Eric P. Xing, Wenjie Fu, Le Song

01 Jun 2010-The Annals of Applied Statistics

TL;DR: In this article, the authors propose a model-based approach to analyze the dynamic tomography of such time-evolving networks, which allows actors to behave differently over time and carry out different roles/functions when interacting with different peers.

...read moreread less

Abstract: In a dynamic social or biological environment, the interactions between the actors can undergo large and systematic changes. In this paper we propose a model-based approach to analyze what we will refer to as the dynamic tomography of such time-evolving networks. Our approach offers an intuitive but powerful tool to infer the semantic underpinnings of each actor, such as its social roles or biological functions, underlying the observed network topologies. Our model builds on earlier work on a mixed membership stochastic blockmodel for static networks, and the state-space model for tracking object trajectory. It overcomes a major limitation of many current network inference techniques, which assume that each actor plays a unique and invariant role that accounts for all its interactions with other actors; instead, our method models the role of each actor as a time-evolving mixed membership vector that allows actors to behave differently over time and carry out different roles/functions when interacting with different peers, which is closer to reality. We present an efficient algorithm for approximate inference and learning using our model; and we applied our model to analyze a social network between monks (i.e., the Sampson’s network), a dynamic email communication network between the Enron employees, and a rewiring gene interaction network of fruit fly collected during its full life cycle. In all cases, our model reveals interesting patterns of the dynamic roles of the actors.

...read moreread less

223 citations

Journal Article•DOI•

Smoothing proximal gradient method for general structured sparse regression

[...]

Xi Chen, Qihang Lin, Seyoung Kim, Jaime G. Carbonell, Eric P. Xing - Show less +1 more

26 May 2010-arXiv: Machine Learning

TL;DR: This paper proposes a general optimization approach, the smoothing proximal gradient method, which can solve structured sparse regression problems with any smooth convex loss under a wide spectrum of structured sparsity-inducing penalties.

...read moreread less

Abstract: We study the problem of estimating high-dimensional regression models regularized by a structured sparsity-inducing penalty that encodes prior structural information on either the input or output variables. We consider two widely adopted types of penalties of this kind as motivating examples: (1) the general overlapping-group-lasso penalty, generalized from the group-lasso penalty; and (2) the graph-guided-fused-lasso penalty, generalized from the fused-lasso penalty. For both types of penalties, due to their nonseparability and nonsmoothness, developing an efficient optimization method remains a challenging problem. In this paper we propose a general optimization approach, the smoothing proximal gradient (SPG) method, which can solve structured sparse regression problems with any smooth convex loss under a wide spectrum of structured sparsity-inducing penalties. Our approach combines a smoothing technique with an effective proximal gradient method. It achieves a convergence rate significantly faster than the standard first-order methods, subgradient methods, and is much more scalable than the most widely used interior-point methods. The efficiency and scalability of our method are demonstrated on both simulation experiments and real genetic data sets.

...read moreread less

196 citations

Proceedings Article•

Timeline: a dynamic hierarchical dirichlet process model for recovering birth/death and evolution of topics in text stream

[...]

Amr Ahmed¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

08 Jul 2010

TL;DR: In this paper, infinite dynamic topic models, iDTM, are introduced that can accommodate the evolution of all the aforementioned aspects of the latent structure such as the number of topics, the topics' distribution and popularity are time-evolving.

...read moreread less

Abstract: Topic models have proven to be a useful tool for discovering latent structures in document collections. However, most document collections often come as temporal streams and thus several aspects of the latent structure such as the number of topics, the topics' distribution and popularity are time-evolving. Several models exist that model the evolution of some but not all of the above aspects. In this paper we introduce infinite dynamic topic models, iDTM, that can accommodate the evolution of all the aforementioned aspects. Our model assumes that documents are organized into epochs, where the documents within each epoch are exchangeable but the order between the documents is maintained across epochs. iDTM allows for unbounded number of topics: topics can die or be born at any epoch, and the representation of each topic can evolve according to a Markovian dynamics. We use iDTM to analyze the birth and evolution of topics in the NIPS community and evaluated the efficacy of our model on both simulated and real datasets with favorable outcome.

...read moreread less

175 citations

Proceedings Article•

Turbo Parsers: Dependency Parsing by Approximate Variational Inference

[...]

André F. T. Martins¹, Noah A. Smith¹, Eric P. Xing¹, Pedro Aguiar², Mário A. T. Figueiredo² - Show less +1 more•Institutions (2)

Carnegie Mellon University¹, Instituto Superior Técnico²

09 Oct 2010

TL;DR: A unified view of two state-of-the-art non-projective dependency parsers, both approximate, is presented and a new aggressive online algorithm to learn the model parameters is proposed, which makes use of the underlying variational representation.

...read moreread less

Abstract: We present a unified view of two state-of-the-art non-projective dependency parsers, both approximate: the loopy belief propagation parser of Smith and Eisner (2008) and the relaxed linear program of Martins et al. (2009). By representing the model assumptions with a factor graph, we shed light on the optimization problems tackled in each method. We also propose a new aggressive online algorithm to learn the model parameters, which makes use of the underlying variational representation. The algorithm does not require a learning rate parameter and provides a single framework for a wide family of convex loss functions, including CRFs and structured SVMs. Experiments show state-of-the-art performance for 14 languages.

...read moreread less

145 citations

Proceedings Article•DOI•

Predictive Subspace Learning for Multi-view Data: a Large Margin Approach

[...]

Ning Chen¹, Jun Zhu¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

06 Dec 2010

TL;DR: A large-margin learning framework to discover a predictive latent subspace representation shared by multiple views based on an undirected latent space Markov network that fulfills a weak conditional independence assumption that multi-view observations and response variables are independent given a set of latent variables.

...read moreread less

Abstract: Learning from multi-view data is important in many applications, such as image classification and annotation. In this paper, we present a large-margin learning framework to discover a predictive latent subspace representation shared by multiple views. Our approach is based on an undirected latent space Markov network that fulfills a weak conditional independence assumption that multi-view observations and response variables are independent given a set of latent variables. We provide efficient inference and parameter estimation methods for the latent sub-space model. Finally, we demonstrate the advantages of large-margin learning on real video and web image data for discovering predictive latent representations and improving the performance on image classification, annotation and retrieval.

...read moreread less

113 citations

Posted Content•

Graph-Structured Multi-task Regression and an Efficient Optimization Method for General Fused Lasso Manuscript

[...]

Xi Chen, Seyoung Kim, Qihang Lin, Jaime G. Carbonell, Eric P. Xing - Show less +1 more

20 May 2010-arXiv: Machine Learning

TL;DR: This paper proposes graph-guided fused lasso (GFlasso) for structured multi-task regression that exploits the graph structure over the output variables and introduces a novel penalty function based on fusion penalty to encourage highly correlated outputs to share a common set of relevant inputs.

...read moreread less

Abstract: We consider the problem of learning a structured multi-task regression, where the output consists of multiple responses that are related by a graph and the correlated response variables are dependent on the common inputs in a sparse but synergistic manner. Previous methods such as l1/l2-regularized multi-task regression assume that all of the output variables are equally related to the inputs, although in many real-world problems, outputs are related in a complex manner. In this paper, we propose graph-guided fused lasso (GFlasso) for structured multi-task regression that exploits the graph structure over the output variables. We introduce a novel penalty function based on fusion penalty to encourage highly correlated outputs to share a common set of relevant inputs. In addition, we propose a simple yet efficient proximal-gradient method for optimizing GFlasso that can also be applied to any optimization problems with a convex smooth loss and the general class of fusion penalty defined on arbitrary graph structures. By exploiting the structure of the non-smooth ''fusion penalty'', our method achieves a faster convergence rate than the standard first-order method, sub-gradient method, and is significantly more scalable than the widely adopted second-order cone-programming and quadratic-programming formulations. In addition, we provide an analysis of the consistency property of the GFlasso model. Experimental results not only demonstrate the superiority of GFlasso over the standard lasso but also show the efficiency and scalability of our proximal-gradient method.

...read moreread less

Proceedings Article•

Adaptive Multi-Task Lasso: with Application to eQTL Detection

[...]

Seunghak Lee¹, Jun Zhu¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

06 Dec 2010

TL;DR: This paper proposes a novel regularized regression approach for detecting eQTLs which takes into account related traits simultaneously while incorporating many regulatory features and results confirm that the model outperforms previous methods for finding eZTLs.

...read moreread less

Abstract: To understand the relationship between genomic variations among population and complex diseases, it is essential to detect eQTLs which are associated with phenotypic effects. However, detecting eQTLs remains a challenge due to complex underlying mechanisms and the very large number of genetic loci involved compared to the number of samples. Thus, to address the problem, it is desirable to take advantage of the structure of the data and prior information about genomic locations such as conservation scores and transcription factor binding sites. In this paper, we propose a novel regularized regression approach for detecting eQTLs which takes into account related traits simultaneously while incorporating many regulatory features. We first present a Bayesian network for a multi-task learning problem that includes priors on SNPs, making it possible to estimate the significance of each covariate adaptively. Then we find the maximum a posteriori (MAP) estimation of regression coefficients and estimate weights of covariates jointly. This optimization procedure is efficient since it can be achieved by using a projected gradient descent and a coordinate descent procedure iteratively. Experimental results on simulated and real yeast datasets confirm that our model outperforms previous methods for finding eQTLs.

...read moreread less

Proceedings Article•

Staying Informed: Supervised and Semi-Supervised Multi-View Topical Analysis of Ideological Perspective

[...]

Amr Ahmed¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

09 Oct 2010

TL;DR: This paper addresses the problem of modeling ideological perspective on a topical level using a factored topic model, and develops efficient inference algorithms using Collapsed Gibbs sampling for posterior inference, and gives various evaluations and illustrations of the utility of the model on various document collections with promising results.

...read moreread less

Abstract: With the proliferation of user-generated articles over the web, it becomes imperative to develop automated methods that are aware of the ideological-bias implicit in a document collection. While there exist methods that can classify the ideological bias of a given document, little has been done toward understanding the nature of this bias on a topical-level. In this paper we address the problem of modeling ideological perspective on a topical level using a factored topic model. We develop efficient inference algorithms using Collapsed Gibbs sampling for posterior inference, and give various evaluations and illustrations of the utility of our model on various document collections with promising results. Finally we give a Metropolis-Hasting inference algorithm for a semi-supervised extension with decent results.

...read moreread less

Proceedings Article•

Large Margin Learning of Upstream Scene Understanding Models

[...]

Jun Zhu¹, Li-Jia Li¹, Li Fei-Fei², Eric P. Xing²•Institutions (2)

Carnegie Mellon University¹, Stanford University²

06 Dec 2010

TL;DR: A joint max-margin and max-likelihood learning method for upstream scene understanding models, in which latent topic discovery and prediction model estimation are closely coupled and well-balanced.

...read moreread less

Abstract: Upstream supervised topic models have been widely used for complicated scene understanding. However, existing maximum likelihood estimation (MLE) schemes can make the prediction model learning independent of latent topic discovery and result in an unbalanced prediction rule for scene classification. This paper presents a joint max-margin and max-likelihood learning method for upstream scene understanding models, in which latent topic discovery and prediction model estimation are closely coupled and well-balanced. The optimization problem is efficiently solved with a variational EM procedure, which iteratively solves an online loss-augmented SVM. We demonstrate the advantages of the large-margin approach on both an 8-category sports dataset and the 67-class MIT indoor scene dataset for scene categorization.

...read moreread less

Journal Article•DOI•

Multi-population GWA mapping via multi-task regularized regression

[...]

Kriti Puniyani¹, Seyoung Kim¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

01 Jun 2010-Bioinformatics

TL;DR: A multi-population group lasso algorithm using L1/L2-regularized regression for joint association analysis of multiple populations that are stratified either via population survey or computational estimation, which demonstrates the effectiveness of the method on HapMap-simulated and lactase persistence datasets.

...read moreread less

Abstract: Motivation: Population heterogeneity through admixing of different founder populations can produce spurious associations in genome-wide association studies that are linked to the population structure rather than the phenotype. Since samples from the same population generally co-evolve, different populations may or may not share the same genetic underpinnings for the seemingly common phenotype. Our goal is to develop a unified framework for detecting causal genetic markers through a joint association analysis of multiple populations. Results: Based on a multi-task regression principle, we present a multi-population group lasso algorithm using L1/L2-regularized regression for joint association analysis of multiple populations that are stratified either via population survey or computational estimation. Our algorithm combines information from genetic markers across populations, to identify causal markers. It also implicitly accounts for correlations between the genetic markers, thus enabling better control over false positive rates. Joint analysis across populations enables the detection of weak associations common to all populations with greater power than in a separate analysis of each population. At the same time, the regression-based framework allows causal alleles that are unique to a subset of the populations to be correctly identified. We demonstrate the effectiveness of our method on HapMap-simulated and lactase persistence datasets, where we significantly outperform state of the art methods, with greater power for detecting weak associations and reduced spurious associations. Availability: Software will be available at http://www.sailing.cs.cmu.edu/ Contact: epxing@cs.cmu.edu

...read moreread less

Proceedings Article•

Conditional Topic Random Fields

[...]

Jun Zhu¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

21 Jun 2010

TL;DR: An efficient variational inference algorithm that scales linearly in terms of topic numbers, and a maximum likelihood estimation (MLE) procedure for parameter estimation, for the supervised version of CTRF and an arguably more discriminative max-margin learning method.

...read moreread less

Abstract: Generative topic models such as LDA are limited by their inability to utilize nontrivial input features to enhance their performance, and many topic models assume that topic assignments of different words are conditionally independent. Some work exists to address the second limitation but no work exists to address both. This paper presents a conditional topic random field (CTRF) model, which can use arbitrary nonlocal features about words and documents and incorporate the Markov dependency between topic assignments of neighboring words. We develop an efficient variational inference algorithm that scales linearly in terms of topic numbers, and a maximum likelihood estimation (MLE) procedure for parameter estimation. For the supervised version of CTRF, we also develop an arguably more discriminative max-margin learning method. We evaluate CTRF on real review rating data and demonstrate the advantages of CTRF over generative competitors, and we show the advantages of max-margin learning over MLE.

...read moreread less

Book Chapter•DOI•

Image segmentation with topic random field

[...]

Bin Zhao¹, Li Fei-Fei², Eric P. Xing¹•Institutions (2)

Carnegie Mellon University¹, Stanford University²

05 Sep 2010

TL;DR: Topic Random Field (TRF) is proposed, which defines a Markov Random Field over hidden labels of an image, to enforce the spatial coherence between topic labels for neighboring regions and achieves better segmentation performance.

...read moreread less

Abstract: Recently, there has been increasing interests in applying aspect models (e.g., PLSA and LDA) in image segmentation. However, these models ignore spatial relationships among local topic labels in an image and suffers from information loss by representing image feature using the index of its closest match in the codebook. In this paper, we propose Topic Random Field (TRF) to tackle these two problems. Specifically, TRF defines a Markov Random Field over hidden labels of an image, to enforce the spatial coherence between topic labels for neighboring regions. Moreover, TRF utilizes a noise channel to model the generation of local image features, and avoids the off-line process of building visual codebook. We provide details of variational inference and parameter learning for TRF. Experimental evaluations on three image data sets show that TRF achieves better segmentation performance.

...read moreread less

Posted Content•

Estimating Networks With Jumps

[...]

Mladen Kolar¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

17 Dec 2010-arXiv: Machine Learning

TL;DR: In this paper, the authors study the problem of estimating a temporally varying coefficient and varying structure (VCVS) graphical model underlying nonstationary time series data, such as social states of interacting individuals or microarray expression profiles of gene networks, as opposed to an invariant model widely considered in current literature of structural estimation.

...read moreread less

Abstract: We study the problem of estimating a temporally varying coefficient and varying structure (VCVS) graphical model underlying nonstationary time series data, such as social states of interacting individuals or microarray expression profiles of gene networks, as opposed to i.i.d. data from an invariant model widely considered in current literature of structural estimation. In particular, we consider the scenario in which the model evolves in a piece-wise constant fashion. We propose a procedure that minimizes the so-called TESLA loss (i.e., temporally smoothed L1 regularized regression), which allows jointly estimating the partition boundaries of the VCVS model and the coefficient of the sparse precision matrix on each block of the partition. A highly scalable proximal gradient method is proposed to solve the resultant convex optimization problem; and the conditions for sparsistent estimation and the convergence rate of both the partition boundaries and the network structure are established for the first time for such estimators.

...read moreread less

Proceedings Article•DOI•

Grafting-light: fast, incremental feature selection and structure learning of Markov random fields

[...]

Jun Zhu¹, Ni Lao¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

25 Jul 2010

TL;DR: This paper presents a fast algorithm called Grafting-Light, which iteratively performs one-step of orthant-wise gradient descent over free parameters and selects new features and is guaranteed to converge to the global optimum and can effectively select significant features.

...read moreread less

Abstract: Feature selection is an important task in order to achieve better generalizability in high dimensional learning, and structure learning of Markov random fields (MRFs) can automatically discover the inherent structures underlying complex data. Both problems can be cast as solving an l1-norm regularized parameter estimation problem. The existing Grafting method can avoid doing inference on dense graphs in structure learning by incrementally selecting new features. However, Grafting performs a greedy step to optimize over free parameters once new features are included. This greedy strategy results in low efficiency when parameter learning is itself non-trivial, such as in MRFs, in which parameter learning depends on an expensive subroutine to calculate gradients. The complexity of calculating gradients in MRFs is typically exponential to the size of maximal cliques. In this paper, we present a fast algorithm called Grafting-Light to solve the l1-norm regularized maximum likelihood estimation of MRFs for efficient feature selection and structure learning. Grafting-Light iteratively performs one-step of orthant-wise gradient descent over free parameters and selects new features. This lazy strategy is guaranteed to converge to the global optimum and can effectively select significant features. On both synthetic and real data sets, we show that Grafting-Light is much more efficient than Grafting for both feature selection and structure learning, and performs comparably with the optimal batch method that directly optimizes over all the features for feature selection but is much more efficient and accurate for structure learning of MRFs.

...read moreread less

Proceedings Article•

On Sparse Nonparametric Conditional Covariance Selection

[...]

Mladen Kolar¹, Ankur P. Parikh¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

21 Jun 2010

TL;DR: A penalized kernel smoothing method for the problem of selecting nonzero elements of the conditional precision matrix, known as conditional covariance selection, which is derived under minimal assumptions on the underlying probability distribution and works well in the high-dimensional setting.

...read moreread less

Abstract: We develop a penalized kernel smoothing method for the problem of selecting nonzero elements of the conditional precision matrix, known as conditional covariance selection. This problem has a key role in many modern applications such as finance and computational biology. However, it has not been properly addressed. Our estimator is derived under minimal assumptions on the underlying probability distribution and works well in the high-dimensional setting. The efficiency of the algorithm is demonstrated on both simulation studies and the analysis of the stock market.

...read moreread less

Posted Content•

An E-cient Proximal Gradient Method for General Structured Sparse Learning

[...]

Xi Chen¹, Qihang Lin¹, Seyoung Kim¹, Jaime G. Carbonell¹, Eric P. Xing¹ - Show less +1 more•Institutions (1)

Carnegie Mellon University¹

26 May 2010

TL;DR: This paper proposes a general optimization framework, called proximal gradient method, which can solve the structured sparse learning problems with a smooth convex loss and a wide spectrum of non-smooth and non-separable structured-sparsity-inducing penalties, including the overlapping-group-lasso and graph-guided fusion penalties.

...read moreread less

Abstract: We study the problem of learning high dimensional regression models regularized by a structured-sparsity-inducing penalty that encodes prior structural information on either input or output sides. We consider two widely adopted types of such penalties as our motivating examples: 1) overlapping-group-lasso penalty, based on ‘1=‘2 mixed-norm, and 2) graph-guided fusion penalty. For both types of penalties, due to their non-separability, developing an e‐cient optimization method has remained a challenging problem. In this paper, we propose a general optimization framework, called proximal gradient method, which can solve the structured sparse learning problems with a smooth convex loss and a wide spectrum of non-smooth and non-separable structured-sparsity-inducing penalties, including the overlapping-group-lasso and graph-guided fusion penalties. Our method exploits the structure of such penalties, decouples the non-separable penalty function via the dual norm, introduces its smooth approximation, and solves this approximation function. It achieves a convergence rate signiflcantly faster than the standard flrst-order method, subgradient method, and is much more scalable than the most widely used method, namely interior-point method for second-order cone programming and quadratic programming formulations. The e‐ciency and scalability of our method are demonstrated on both simulated and real genetic datasets.

...read moreread less

Journal Article•DOI•

SPEX2: automated concise extraction of spatial gene expression patterns from Fly embryo ISH images

[...]

Kriti Puniyani¹, Christos Faloutsos¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

15 Jun 2010-Bioinformatics

TL;DR: SPEX2 is presented, an automatic system for embryonic ISH image processing, which can extract, transform, compare, classify and cluster spatial gene expression patterns in Drosophila embryos and achieves excellent performance in automatic image annotation.

...read moreread less

Abstract: Motivation: Microarray profiling of mRNA abundance is often ill suited for temporal–spatial analysis of gene expressions in multicellular organisms such as Drosophila. Recent progress in image-based genome-scale profiling of whole-body mRNA patterns via in situ hybridization (ISH) calls for development of accurate and automatic image analysis systems to facilitate efficient mining of complex temporal–spatial mRNA patterns, which will be essential for functional genomics and network inference in higher organisms. Results: We present SPEX2, an automatic system for embryonic ISH image processing, which can extract, transform, compare, classify and cluster spatial gene expression patterns in Drosophila embryos. Our pipeline for gene expression pattern extraction outputs the precise spatial locations and strengths of the gene expression. We performed experiments on the largest publicly available collection of Drosophila ISH images, and show that our method achieves excellent performance in automatic image annotation, and also finds clusters that are significantly enriched, both for gene ontology functional annotations, and for annotation terms from a controlled vocabulary used by human curators to describe these images. Availability: Software will be available at http://www.sailing.cs.cmu.edu/ Contact: ude.umc.sc@gnixpe Supplementary information: Supplementary data are avilable at Bioinformatics online.

...read moreread less

Proceedings Article•

Social Links from Latent Topics in Microblogs

[...]

Kriti Puniyani¹, Jacob Eisenstein¹, Shay B. Cohen¹, Eric P. Xing¹•Institutions (1)

Carnegie Mellon University¹

06 Jun 2010

TL;DR: The long term goal is to develop joint sociolinguistic models that explain the social basis of linguistic variation by combining large linguistic corpora with explicit representations of social network structures.

...read moreread less

Abstract: Language use is overlaid on a network of social connections, which exerts an influence on both the topics of discussion and the ways that these topics can be expressed (Halliday, 1978). In the past, efforts to understand this relationship were stymied by a lack of data, but social media offers exciting new opportunities. By combining large linguistic corpora with explicit representations of social network structures, social media provides a new window into the interaction between language and society. Our long term goal is to develop joint sociolinguistic models that explain the social basis of linguistic variation.

...read moreread less

Posted Content•

An Efficient Proximal-Gradient Method for Single and Multi-task Regression with Structured Sparsity

[...]

Xi Chen, Qihang Lin, Seyoung Kim, Javier Peña, Jaime G. Carbonell¹, Eric P. Xing¹ - Show less +2 more•Institutions (1)

Carnegie Mellon University¹

26 May 2010

TL;DR: An efficient proximal-gradient method is proposed that achieves a faster convergence rate and is much more efficient and scalable than solving the SOCP formulation and can be successfully applied to a very large-scale dataset in genetic association analysis.

...read moreread less

Abstract: We consider the optimization problem of learning regression models with a mixed-norm penalty that is defined over overlapping groups to achieve structured sparsity. It has been previously shown that such penalty can encode prior knowledge on the input or output structure to learn an structuredsparsity pattern in the regression parameters. However, because of the non-separability of the parameters of the overlapping groups, developing an efficient optimization method has remained a challenge. An existing method casts this problem as a second-order cone programming (SOCP) and solves it by interior-point methods. However, this approach is computationally expensive even for problems of moderate size. In this paper, we propose an efficient proximal-gradient method that achieves a faster convergence rate and is much more efficient and scalable than solving the SOCP formulation. Our method exploits the structure of the non-smooth structured-sparsity-inducing norm, introduces its smooth approximation, and solves this approximation function instead of optimizing the original objective function directly. We demonstrate the efficiency and scalability of our method on simulated datasets and show that our method can be successfully applied to a very large-scale dataset in genetic association analysis.

...read moreread less

Book Chapter•DOI•

Modeling and analysis of dynamic behaviors of web image collections

[...]

Gunhee Kim¹, Eric P. Xing¹, Antonio Torralba²•Institutions (2)

Carnegie Mellon University¹, Massachusetts Institute of Technology²

05 Sep 2010

TL;DR: A scalable and parallelizable sequential Monte Carlo based method is developed to construct the similarity network of a large-scale dataset that provides a base representation for wide ranges of dynamics analysis.

...read moreread less

Abstract: Can we model the temporal evolution of topics in Web image collections? If so, can we exploit the understanding of dynamics to solve novel visual problems or improve recognition performance? These two challenging questions are the motivation for this work. We propose a nonparametric approach to modeling and analysis of topical evolution in image sets. A scalable and parallelizable sequential Monte Carlo based method is developed to construct the similarity network of a large-scale dataset that provides a base representation for wide ranges of dynamics analysis. In this paper, we provide several experimental results to support the usefulness of image dynamics with the datasets of 47 topics gathered from Flickr. First, we produce some interesting observations such as tracking of subtopic evolution and outbreak detection, which cannot be achieved with conventional image sets. Second, we also present the complementary benefits that the images can introduce over the associated text analysis. Finally, we show that the training using the temporal association significantly improves the recognition performance.

...read moreread less

Discovering Demographic Language Variation

[...]

Brendan O'Connor¹, Jacob Eisenstein¹, Eric P. Xing¹, Noah A. Smith¹•Institutions (1)

Carnegie Mellon University¹

01 Jan 2010

TL;DR: A Bayesian generative model of how demographic social factors influence lexical choice is proposed for a corpus of geo-tagged Twitter messages originating from mobile phones, cross-referenced against U.S. Census demographic data.

...read moreread less

Abstract: We propose a Bayesian generative model of how demographic social factors influence lexical choice. We apply the method to a corpus of geo-tagged Twitter messages originating from mobile phones, cross-referenced against U.S. Census demographic data. Our method discovers communities jointly defined by linguistic and demographic properties.

...read moreread less

Journal Article•DOI•

Invited paper: Structured literature image finder: Parsing text and figures in biomedical literature

[...]

Amr Ahmed¹, Andrew Arnold¹, Luis Pedro Coelho¹, Joshua D. Kangas¹, Abdul-Saboor Sheikh¹, Eric P. Xing¹, William W. Cohen¹, Robert F. Murphy¹ - Show less +4 more•Institutions (1)

Carnegie Mellon University¹

01 Jul 2010-Journal of Web Semantics

TL;DR: The SLIF project combines text-mining and image processing to extract structured information from biomedical literature to support browsing the collection based on latent topic models which are derived from both the annotated text and the image data.

...read moreread less

Posted Content•

Online Multiple Kernel Learning for Structured Prediction

[...]

André F. T. Martins¹, Noah A. Smith¹, Eric P. Xing¹, Pedro Aguiar², Mário A. T. Figueiredo - Show less +1 more•Institutions (2)

Carnegie Mellon University¹, Technical University of Lisbon²

13 Oct 2010-arXiv: Machine Learning

TL;DR: This work proposes a new family of online proximal algorithms for MKL (as well as for group-lasso and variants thereof), which overcomes that drawback of batch learning and shows regret, convergence, and generalization bounds for the proposed method.

...read moreread less

Abstract: Despite the recent progress towards efficient multiple kernel learning (MKL), the structured output case remains an open research front. Current approaches involve repeatedly solving a batch learning problem, which makes them inadequate for large scale scenarios. We propose a new family of online proximal algorithms for MKL (as well as for group-lasso and variants thereof), which overcomes that drawback. We show regret, convergence, and generalization bounds for the proposed method. Experiments on handwriting recognition and dependency parsing testify for the successfulness of the approach.

...read moreread less

Learning Structured Classifiers with Dual Coordinate Ascent

[...]

André F. T. Martins, Kevin Gimpel, Noah A. Smith, Eric P. Xing, Mário A. T. Figueiredo, Pedro Aguiar - Show less +2 more

01 Jun 2010

TL;DR: A new aggressive online algorithm is introduced that optimizes any loss in this family of convex loss functions, properly including CRFs, structured SVMs, and the structured perceptron, without the need to specify any learning rate parameter.

...read moreread less

Abstract: : We present a unified framework for online learning of structured classifiers that handles a wide family of convex loss functions, properly including CRFs, structured SVMs, and the structured perceptron. We introduce a new aggressive online algorithm that optimizes any loss in this family. For the structured hinge loss, this algorithm reduces to 1-best MIRA; in general, it can be regarded as a dual coordinate ascent algorithm. The approximate inference scenario is also addressed. Our experiments on two NLP problems show that the algorithm converges to accurate models at least as fast as stochastic gradient descent, without the need to specify any learning rate parameter.

...read moreread less

Book Chapter•DOI•

MoGUL: detecting common insertions and deletions in a population

[...]

Seunghak Lee¹, Eric P. Xing², Michael Brudno¹•Institutions (2)

University of Toronto¹, Carnegie Mellon University²

25 Apr 2010

TL;DR: The MoGUL (Mixture of Genotypes Variant Locator) framework is developed, which identifies potential locations with indels by examining mate pairs generated from all sequenced individuals simultaneously, uses a Bayesian network with appropriate priors to explicitly model each individual as homozygous or heterozygous for each locus, and computes the expected Minor Allele Frequency (MAF) for all predicted variants.

...read moreread less

Abstract: While the discovery of structural variants in the human population is ongoing, most methods for this task assume that the genome is sequenced to high coverage (e.g 40x), and use the combined power of the many sequenced reads and mate pairs to identify the variants In contrast, the 1000 Genomes Project hopes to sequence hundreds of human genotypes, but at low coverage (4-6x), and most of the current methods are unable to discover insertion/deletion and structural variants from this data. In order to identify indels from multiple low-coverage individuals we have developed the MoGUL (Mixture of Genotypes Variant Locator) framework, which identifies potential locations with indels by examining mate pairs generated from all sequenced individuals simultaneously, uses a Bayesian network with appropriate priors to explicitly model each individual as homozygous or heterozygous for each locus, and computes the expected Minor Allele Frequency (MAF) for all predicted variants We have used MoGUL to identify variants in 1000 Genomes data, as well as in simulated genotypes, and show good accuracy at predicting indels, especially for MAF > 0.06 and indel size > 20 base pairs.

...read moreread less