Author

Shihao Ji

Other affiliations: Duke University, Yahoo!, Intel
Bio: Shihao Ji is an academic researcher from Georgia State University. The author has contributed to research in topics: Ranking (information retrieval) & Hidden Markov model. The author has an h-index of 21 and has co-authored 63 publications receiving 3,566 citations. Previous affiliations of Shihao Ji include Duke University & Yahoo!.


Papers
Journal ArticleDOI
TL;DR: The underlying theory, an associated algorithm, example results, and comparisons to other compressive-sensing inversion algorithms in the literature are presented.
Abstract: The data of interest are assumed to be represented as N-dimensional real vectors, and these vectors are compressible in some linear basis B, implying that the signal can be reconstructed accurately using only a small number M ≪ N of basis-function coefficients associated with B. Compressive sensing is a framework whereby one does not measure one of the aforementioned N-dimensional signals directly, but rather a set of related measurements, with each new measurement a linear combination of the original underlying N-dimensional signal. The number of required compressive-sensing measurements is typically much smaller than N, offering the potential to simplify the sensing system. Let f denote the unknown underlying N-dimensional signal and let g denote a vector of compressive-sensing measurements; then one may approximate f accurately by utilizing knowledge of the (under-determined) linear relationship between f and g, in addition to knowledge of the fact that f is compressible in B. In this paper we employ a Bayesian formalism for estimating the underlying signal f based on compressive-sensing measurements g. The proposed framework has the following properties: i) in addition to estimating the underlying signal f, "error bars" are also estimated, giving a measure of confidence in the inverted signal; ii) using knowledge of the error bars, a principled means is provided for determining when a sufficient number of compressive-sensing measurements have been performed; iii) this setting lends itself naturally to a framework whereby the compressive-sensing measurements are optimized adaptively, rather than determined randomly; and iv) the framework accounts for additive noise in the compressive-sensing measurements and provides an estimate of the noise variance. We present the underlying theory, an associated algorithm, example results, and comparisons to other compressive-sensing inversion algorithms in the literature.
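
The measurement model and the role of the posterior "error bars" are easy to see in a small simulation. Below is a minimal sketch, using scikit-learn's ARDRegression as a stand-in for the paper's fast RVM-based solver; the signal length, measurement count, and noise level are illustrative assumptions, not the paper's setup.

```python
# Minimal Bayesian CS sketch: recover a sparse signal f from M << N noisy
# random measurements g = Phi @ f + noise. ARDRegression is a stand-in for
# the paper's fast RVM-style algorithm, not the authors' implementation.
import numpy as np
from sklearn.linear_model import ARDRegression

rng = np.random.default_rng(0)
N, M, K = 512, 128, 10                    # signal length, measurements, nonzeros

f = np.zeros(N)                           # signal sparse in the canonical basis
f[rng.choice(N, K, replace=False)] = rng.standard_normal(K)

Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # random projection matrix
g = Phi @ f + 0.01 * rng.standard_normal(M)      # noisy CS measurements

model = ARDRegression(fit_intercept=False)
model.fit(Phi, g)

f_hat = model.coef_                       # posterior mean estimate of f
kept = model.lambda_ < model.threshold_lambda    # coefficients not pruned away
err_bars = np.sqrt(np.diag(model.sigma_))        # posterior std of kept weights
print("relative error:", np.linalg.norm(f - f_hat) / np.linalg.norm(f))
print("retained coefficients:", int(kept.sum()))
print("mean error bar on retained weights:", err_bars.mean())
print("estimated noise variance:", 1.0 / model.alpha_)  # alpha_ = noise precision
```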

2,259 citations

Journal ArticleDOI
TL;DR: It has been demonstrated that with appropriate design of the compressive measurements used to define v, the decompressive mapping v → û may be performed with error having asymptotic properties analogous to those of the best adaptive transform-coding algorithm applied in the basis Ψ.
Abstract: Compressive sensing (CS) is a framework whereby one performs N nonadaptive measurements to constitute a vector v ∈ Rᴺ, used to recover an approximation û ∈ Rᴹ of a desired signal u ∈ Rᴹ with N ≪ M; this is performed under the assumption that u is sparse in the basis represented by the matrix Ψ. It has been demonstrated that with appropriate design of the compressive measurements used to define v, the decompressive mapping v → û may be performed with error having asymptotic properties analogous to those of the best adaptive transform-coding algorithm applied in the basis Ψ. The mapping v → û constitutes an inverse problem, often solved using ℓ1 regularization or related techniques. In most previous research, if L > 1 sets of compressive measurements {vᵢ}, i = 1, …, L, are performed, each of the associated {ûᵢ} is recovered one at a time, independently. In many applications the L "tasks" defined by the mappings vᵢ → ûᵢ are not statistically independent, and it may be possible to improve the performance of the inversion if statistical interrelationships are exploited. In this paper, we address this problem within a multitask learning setting, wherein the mapping vᵢ → ûᵢ for each task corresponds to inferring the parameters (here, wavelet coefficients) associated with the desired signal uᵢ, and a shared prior is placed across all of the L tasks. Under this hierarchical Bayesian modeling, data from all L tasks contribute toward inferring a posterior on the hyperparameters, and once the shared prior is thereby inferred, the data from each of the L individual tasks are then employed to estimate the task-dependent wavelet coefficients. An empirical Bayesian procedure for the estimation of hyperparameters is considered; two fast inference algorithms extending the relevance vector machine (RVM) are developed. Example results on several data sets demonstrate the effectiveness and robustness of the proposed algorithms.
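
The shared-prior mechanism can be sketched compactly. The loop below is a simplified empirical-Bayes variant for illustration (not the paper's two fast RVM extensions): each task keeps its own Gaussian posterior, and a single ARD precision vector is re-estimated from the pooled evidence of all L tasks. The fixed noise variance and iteration count are assumptions.

```python
# Simplified empirical-Bayes sketch of multitask CS: L tasks share one ARD
# precision vector over basis coefficients, re-estimated from all tasks.
import numpy as np

def multitask_ard(Phis, gs, noise_var=1e-2, n_iter=50):
    """Phis: list of L (M_i x N) measurement matrices; gs: list of measurements."""
    N = Phis[0].shape[1]
    alpha = np.ones(N)                                # shared ARD precisions
    mus = []
    for _ in range(n_iter):
        mus, second_moments = [], np.zeros(N)
        for Phi, g in zip(Phis, gs):
            # Per-task Gaussian posterior under the shared prior N(0, 1/alpha).
            Sigma = np.linalg.inv(np.diag(alpha) + Phi.T @ Phi / noise_var)
            mu = Sigma @ Phi.T @ g / noise_var
            mus.append(mu)
            second_moments += mu**2 + np.diag(Sigma)  # E[w^2] under the posterior
        # EM update: pool evidence from all L tasks to re-estimate the prior.
        alpha = np.clip(len(Phis) / (second_moments + 1e-12), 1e-6, 1e6)
    return mus, alpha
```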

467 citations

01 Jan 2007
TL;DR: This paper addresses the problem within a multi-task learning setting, wherein the mapping vᵢ → ûᵢ for each task corresponds to inferring the wavelet coefficients of the desired signal uᵢ, with a shared prior placed across all of the M tasks.
Abstract: Compressive sensing (CS) is a framework whereby one performs n non-adaptive measurements to constitute an n-dimensional vector v, with v used to recover an m-dimensional approximation û to a desired m-dimensional signal u, with n ≪ m; this is performed under the assumption that u is sparse in the basis represented by the matrix Ψ, the columns of which define discrete basis vectors. It has been demonstrated that with appropriate design of the compressive measurements used to define v, the decompressive mapping v → û may be performed with error ‖u − û‖₂ having asymptotic properties (large n, and m > n) analogous to those of the best adaptive transform-coding algorithm applied in the basis Ψ. The mapping v → û constitutes an inverse problem, often solved using ℓ1 regularization or related techniques. In most previous research, if multiple compressive measurements {vᵢ}, i = 1, …, M, are performed, each of the associated {ûᵢ} is recovered one at a time, independently. In many applications the M "tasks" defined by the mappings vᵢ → ûᵢ are not statistically independent, and it may be possible to improve the performance of the inversion if statistical inter-relationships are exploited. In this paper we address this problem within a multi-task learning setting, wherein the mapping vᵢ → ûᵢ for each task corresponds to inferring the parameters (here, wavelet coefficients) associated with the desired signal uᵢ, and a shared prior is placed across all of the M tasks. In this multi-task learning framework, data from all M tasks contribute toward inferring a posterior on the hyperparameters, and once the shared prior is thereby inferred, the data from each of the M individual tasks are then employed to estimate the task-dependent wavelet coefficients. An empirical Bayes procedure and a fast inference algorithm are developed. Example results are presented on several data sets.

134 citations

Journal ArticleDOI
Olivier Chapelle1, Shihao Ji2, Ciya Liao2, Emre Velipasaoglu1, Larry Lai1, Su-Lin Wu1 
TL;DR: This work argues that ERR-IA is a better metric than some previously proposed intent-aware metrics, shows that it has a better correlation with abandonment rate, and proposes an algorithm to rerank web search results by optimizing an objective function corresponding to this metric.
Abstract: We study the problem of web search result diversification in the case where intent-based relevance scores are available. A diversified search result will hopefully satisfy the information needs of users who may have different intents. In this context, we first analyze the properties of an intent-based metric, ERR-IA, which measures relevance and diversity jointly. We argue that it is a better metric than some previously proposed intent-aware metrics and show that it has a better correlation with abandonment rate. We then propose an algorithm to rerank web search results by optimizing an objective function corresponding to this metric, and evaluate it on shopping-related queries.
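
For concreteness, here is a sketch of the ERR-IA computation and a greedy reranker that maximizes its marginal gain at each rank. The intent probabilities p and per-intent relevance grades R are assumed given (the paper presumes intent-based relevance scores are available); the example values are illustrative.

```python
# ERR-IA: expected reciprocal rank, averaged over user intents, and a greedy
# reranker that picks the document with the largest marginal ERR-IA gain.
import numpy as np

def err_ia(ranking, p, R):
    """ranking: list of doc ids; p: (T,) intent probs; R: (T, D) relevance in [0,1]."""
    total = 0.0
    for t, pt in enumerate(p):
        not_satisfied = 1.0                  # prob. intent t not yet satisfied
        for rank, d in enumerate(ranking, start=1):
            total += pt * not_satisfied * R[t, d] / rank
            not_satisfied *= 1.0 - R[t, d]
    return total

def greedy_rerank(docs, p, R, k=10):
    """Pick documents one at a time, each maximizing the marginal ERR-IA gain."""
    chosen, remaining = [], list(docs)
    not_sat = p.copy()                       # p[t] * prod(1 - R) over chosen docs
    while remaining and len(chosen) < k:
        rank = len(chosen) + 1
        gains = [(not_sat * R[:, d]).sum() / rank for d in remaining]
        best = remaining.pop(int(np.argmax(gains)))
        chosen.append(best)
        not_sat *= 1.0 - R[:, best]
    return chosen

# Example: 2 intents, 4 documents with hypothetical per-intent relevance grades.
p = np.array([0.7, 0.3])
R = np.array([[0.9, 0.1, 0.6, 0.0],
              [0.0, 0.8, 0.1, 0.5]])
ranking = greedy_rerank(range(4), p, R)
print(ranking, err_ia(ranking, p, R))
```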

119 citations

Journal ArticleDOI
TL;DR: This work formally defines the cost-sensitive classification problem and solves it as a partially observable Markov decision process (POMDP), using a myopic approach with an adaptive stopping criterion linked to the standard POMDP formulation.
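
The one-line summary compresses a concrete procedure; a toy version of the myopic policy with an adaptive stop is sketched below. A two-class Gaussian naive-Bayes belief stands in for the paper's POMDP/HMM machinery, and the means, variance, and costs are illustrative assumptions.

```python
# Toy sketch of the myopic cost-sensitive policy: greedily acquire the feature
# whose expected value of information exceeds its cost; otherwise stop and classify.
import numpy as np

means = np.array([[0.0, 0.0, 0.0],           # class-0 feature means (assumed)
                  [1.0, 2.0, 0.5]])          # class-1 feature means (assumed)
var, prior = 1.0, np.array([0.5, 0.5])
feature_cost = np.array([0.05, 0.20, 0.02])  # cost of acquiring each feature
misclass_cost = 1.0

def posterior(obs):
    """obs maps feature index -> observed value; returns the class posterior."""
    logp = np.log(prior)
    for j, x in obs.items():
        logp += -0.5 * (x - means[:, j]) ** 2 / var
    p = np.exp(logp - logp.max())
    return p / p.sum()

def bayes_risk(p):
    return misclass_cost * (1.0 - p.max())   # risk of classifying right now

def classify_myopically(x_true, n_mc=25, rng=np.random.default_rng(0)):
    obs = {}
    while True:
        p = posterior(obs)
        best_j, best_gain = None, 0.0
        for j in set(range(means.shape[1])) - obs.keys():
            # Expected risk after observing feature j, by Monte Carlo per class.
            risk_after = sum(
                p[c] * np.mean([bayes_risk(posterior({**obs, j: x}))
                                for x in means[c, j]
                                + np.sqrt(var) * rng.standard_normal(n_mc)])
                for c in (0, 1))
            gain = bayes_risk(p) - risk_after - feature_cost[j]  # myopic VOI
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:                   # adaptive stop: no feature pays off
            return int(np.argmax(p)), obs
        obs[best_j] = x_true[best_j]         # pay the cost, acquire the feature

print(classify_myopically(np.array([1.0, 2.0, 0.5])))
```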

116 citations


Cited by
Book
24 Aug 2012
TL;DR: This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Abstract: Today's Web-enabled deluge of electronic data calls for automated methods of data analysis. Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package--PMTK (probabilistic modeling toolkit)--that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

8,059 citations

01 Jan 2009
TL;DR: This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.
Abstract: The key idea behind active learning is that a machine learning algorithm can achieve greater accuracy with fewer training labels if it is allowed to choose the data from which it learns. An active learner may pose queries, usually in the form of unlabeled data instances to be labeled by an oracle (e.g., a human annotator). Active learning is well-motivated in many modern machine learning problems, where unlabeled data may be abundant or easily obtained, but labels are difficult, time-consuming, or expensive to obtain. This report provides a general introduction to active learning and a survey of the literature. This includes a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date. An analysis of the empirical and theoretical evidence for successful active learning, a summary of problem setting variants and practical issues, and a discussion of related topics in machine learning research are also presented.
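
The pool-based scenario the survey describes is easy to make concrete. Below is a minimal uncertainty-sampling loop; the dataset, learner, seed-set size, and number of query rounds are chosen only for illustration.

```python
# Pool-based active learning with uncertainty sampling: repeatedly query the
# unlabeled instance the current model is least certain about.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
labeled = list(range(10))                    # small seed set of labeled points
pool = [i for i in range(len(X)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(20):                          # 20 rounds of querying the "oracle"
    model.fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[pool])
    # Uncertainty sampling: query the instance closest to the decision boundary.
    query = pool[int(np.argmin(np.abs(proba[:, 1] - 0.5)))]
    labeled.append(query)                    # the oracle reveals y[query]
    pool.remove(query)

print("accuracy after active learning:", model.score(X, y))
```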

5,227 citations

Book
Tie-Yan Liu1
27 Jun 2009
TL;DR: Three major approaches to learning to rank are introduced, i.e., the pointwise, pairwise, and listwise approaches, the relationship between the loss functions used in these approaches and the widely-used IR evaluation measures are analyzed, and the performance of these approaches on the LETOR benchmark datasets is evaluated.
Abstract: This tutorial is concerned with a comprehensive introduction to the research area of learning to rank for information retrieval. In the first part of the tutorial, we will introduce three major approaches to learning to rank, i.e., the pointwise, pairwise, and listwise approaches, analyze the relationship between the loss functions used in these approaches and the widely-used IR evaluation measures, evaluate the performance of these approaches on the LETOR benchmark datasets, and demonstrate how to use these approaches to solve real ranking applications. In the second part of the tutorial, we will discuss some advanced topics regarding learning to rank, such as relational ranking, diverse ranking, semi-supervised ranking, transfer ranking, query-dependent ranking, and training data preprocessing. In the third part, we will briefly mention the recent advances on statistical learning theory for ranking, which explain the generalization ability and statistical consistency of different ranking methods. In the last part, we will conclude the tutorial and show several future research directions.
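
As a concrete instance of the pairwise approach the tutorial introduces, here is a sketch of a linear scorer trained with a RankNet-style logistic loss over preference pairs; the synthetic features, labels, and hyperparameters are assumptions for illustration.

```python
# Pairwise learning to rank: train a linear scorer so that, for each pair,
# the more relevant document gets the higher score (logistic pairwise loss).
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))            # 100 docs, 5 features
rel = (X @ np.array([1.0, -0.5, 0.3, 0.0, 2.0]) > 0).astype(float)  # toy labels

w, lr = np.zeros(5), 0.1
for _ in range(200):
    i, j = rng.integers(0, 100, size=2)
    if rel[i] == rel[j]:
        continue                             # only ordered pairs carry signal
    if rel[i] < rel[j]:
        i, j = j, i                          # ensure doc i is preferred over doc j
    margin = X[i] @ w - X[j] @ w
    # Gradient of log(1 + exp(-margin)) with respect to w.
    g = -(X[i] - X[j]) / (1.0 + np.exp(margin))
    w -= lr * g

order = np.argsort(-(X @ w))                 # rank documents by learned score
print("top-5 docs:", order[:5])
```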

2,515 citations
