Author

Wei Hong

Bio: Wei Hong is an academic researcher from Texas Instruments. The author has contributed to research in topics including image segmentation. The author has an h-index of 16 and has co-authored 51 publications receiving 1,223 citations. Previous affiliations of Wei Hong include Princeton University and Hewlett-Packard.


Papers
Journal ArticleDOI
TL;DR: It is shown that a deterministic segmentation is approximately the (asymptotically) optimal solution for compressing mixed data and can be readily applied to segment real imagery and bioinformatic data.
Abstract: In this paper, based on ideas from lossy data coding and compression, we present a simple but effective technique for segmenting multivariate mixed data that are drawn from a mixture of Gaussian distributions, which are allowed to be almost degenerate. The goal is to find the optimal segmentation that minimizes the overall coding length of the segmented data, subject to a given distortion. By analyzing the coding length/rate of mixed data, we formally establish some strong connections of data segmentation to many fundamental concepts in lossy data compression and rate-distortion theory. We show that a deterministic segmentation is approximately the (asymptotically) optimal solution for compressing mixed data. We propose a very simple and effective algorithm that depends on a single parameter, the allowable distortion. At any given distortion, the algorithm automatically determines the corresponding number and dimension of the groups and does not involve any parameter estimation. Simulation results reveal intriguing phase-transition-like behaviors of the number of segments when changing the level of distortion or the amount of outliers. Finally, we demonstrate how this technique can be readily applied to segment real imagery and bioinformatic data.
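The greedy procedure the abstract outlines is compact enough to sketch directly. Below is a minimal Python rendering, assuming the paper's rate-distortion coding-length expression for a near-degenerate Gaussian group; the function names and the simple pairwise merge search are ours, so treat this as an illustrative sketch for small toy data rather than the authors' implementation.

```python
import numpy as np

def coding_length(W, eps2):
    """Bits to code the columns of W (d x m) up to mean squared distortion
    eps2, via the rate-distortion bound for a (possibly degenerate) Gaussian."""
    d, m = W.shape
    Sigma = np.eye(d) + (d / (eps2 * m)) * (W @ W.T)
    _, logdet = np.linalg.slogdet(Sigma)
    return 0.5 * (m + d) * logdet / np.log(2.0)

def total_coding_length(X, groups, eps2):
    """Total bits for a segmentation: per-group code plus membership cost."""
    m = X.shape[1]
    total = 0.0
    for idx in groups:
        mi = len(idx)
        total += coding_length(X[:, idx], eps2) - mi * np.log2(mi / m)
    return total

def segment(X, eps2):
    """Bottom-up agglomeration: start from singletons and merge the pair of
    groups that most decreases total coding length; stop at a local minimum.
    The number of groups is determined automatically by eps2 alone."""
    groups = [[i] for i in range(X.shape[1])]
    while len(groups) > 1:
        L0 = total_coding_length(X, groups, eps2)
        best, best_drop = None, 0.0
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                rest = [g for k, g in enumerate(groups) if k not in (a, b)]
                drop = L0 - total_coding_length(X, rest + [groups[a] + groups[b]], eps2)
                if drop > best_drop:
                    best, best_drop = (a, b), drop
        if best is None:
            break
        a, b = best
        groups = [g for k, g in enumerate(groups) if k not in (a, b)] \
                 + [groups[a] + groups[b]]
    return groups
```

Sweeping eps2 and plotting len(segment(X, eps2)) on toy Gaussian mixtures reproduces the phase-transition-like behavior in the number of segments that the abstract mentions.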

470 citations

Journal ArticleDOI
TL;DR: Careful and extensive experimental results show that this new model gives more compact representations for a wide variety of natural images under a wide range of signal-to-noise ratios than many existing methods, including wavelets.
Abstract: In this paper, we introduce a simple and efficient representation for natural images. We view an image (in either the spatial domain or the wavelet domain) as a collection of vectors in a high-dimensional space. We then fit a piecewise linear model (i.e., a union of affine subspaces) to the vectors at each downsampling scale. We call this a multiscale hybrid linear model for the image. The model can be effectively estimated via a new algebraic method known as generalized principal component analysis (GPCA). The hybrid and hierarchical structure of this model allows us to effectively extract and exploit multimodal correlations among the imagery data at different scales. It conceptually and computationally remedies limitations of many existing image representation methods that are based on either a fixed linear transformation (e.g., DCT, wavelets), an adaptive unimodal linear transformation (e.g., PCA), or a multimodal model that uses only cluster means (e.g., VQ). We justify both quantitatively and experimentally why and how such a simple multiscale hybrid model is able to simultaneously reduce the model complexity and computational cost. Despite a small overhead of the model, our careful and extensive experimental results show that this new model gives more compact representations for a wide variety of natural images under a wide range of signal-to-noise ratios than many existing methods, including wavelets. We also briefly address how the same (hybrid linear) modeling paradigm can be extended to be potentially useful for other applications, such as image segmentation.
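GPCA itself fits the union of subspaces algebraically, by estimating polynomials that vanish on the data; as a more familiar stand-in, the sketch below fits a union of K affine subspaces to vectorized image data by simple alternation (a "K-subspaces" loop). All names and parameter choices are ours; this illustrates the hybrid linear model being estimated, not the authors' GPCA code.

```python
import numpy as np

def k_subspaces(X, K, dim, n_iter=50, seed=0):
    """Fit K affine subspaces of dimension `dim` to the columns of X (d x N)
    by alternating between per-cluster PCA and nearest-subspace assignment."""
    rng = np.random.default_rng(seed)
    d, N = X.shape
    labels = rng.integers(0, K, N)
    for _ in range(n_iter):
        means, bases = [], []
        for k in range(K):
            Xk = X[:, labels == k]
            if Xk.shape[1] <= dim:                 # re-seed a starved cluster
                Xk = X[:, rng.choice(N, dim + 1, replace=False)]
            mu = Xk.mean(axis=1, keepdims=True)
            U, _, _ = np.linalg.svd(Xk - mu, full_matrices=False)
            means.append(mu)
            bases.append(U[:, :dim])
        resid = np.empty((K, N))
        for k in range(K):                          # distance to each affine subspace
            Y = X - means[k]
            resid[k] = np.linalg.norm(Y - bases[k] @ (bases[k].T @ Y), axis=0)
        new_labels = resid.argmin(axis=0)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, means, bases
```

Applied to block or wavelet-coefficient vectors at each downsampling scale, the recovered (mean, basis) pairs are the piecewise linear pieces the abstract refers to; coding each vector by its cluster's few principal coordinates is what yields the compact representation.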

214 citations

Journal ArticleDOI
TL;DR: Since every symmetric structure admits a “canonical” coordinate frame with respect to which the group action can be naturally represented, the canonical pose between the viewer and this canonical frame can be recovered too, which explains why symmetric objects provide us with overwhelming clues to their orientation and position.
Abstract: In this paper, we provide a principled explanation of how knowledge of global 3-D structural invariants, typically captured by a group action on a symmetric structure, can dramatically facilitate the task of reconstructing a 3-D scene from one or more images. More importantly, since every symmetric structure admits a “canonical” coordinate frame with respect to which the group action can be naturally represented, the canonical pose between the viewer and this canonical frame can be recovered too, which explains why symmetric objects (e.g., buildings) provide us with overwhelming clues to their orientation and position. We give the necessary and sufficient conditions, in terms of the symmetry (group) admitted by a structure, under which this pose can be uniquely determined. We also characterize, when such conditions are not satisfied, to what extent this pose can be recovered. We show how algorithms from conventional multiple-view geometry, after being properly modified and extended, can be directly applied to perform such recovery from all “hidden images” of one image of the symmetric structure. We also apply our results to a wide range of applications in computer vision and image processing such as camera self-calibration, image segmentation and global orientation, large baseline feature matching, image rendering and photo editing, as well as visual illusions (caused by symmetry if incorrectly assumed).
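For the simplest concrete instance of the abstract's claim, consider a planar rectangle (a structure with two reflective symmetries) of known size: its canonical frame reduces single-view pose recovery to a homography decomposition. The numpy sketch below assumes calibrated intrinsics K and correctly ordered corner correspondences; the function names are ours, and this covers only the planar special case, not the paper's general symmetry machinery.

```python
import numpy as np

def homography_dlt(src, dst):
    """Direct linear transform from 4+ point correspondences (Nx2 arrays)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 3)

def pose_from_rectangle(img_pts, width, height, K):
    """Camera pose w.r.t. the canonical frame of a rectangle of known size.
    img_pts: 4x2 image corners, ordered to match the canonical corners."""
    canon = np.array([[0, 0], [width, 0], [width, height], [0, height]], float)
    H = homography_dlt(canon, np.asarray(img_pts, float))
    B = np.linalg.inv(K) @ H
    s = 1.0 / np.linalg.norm(B[:, 0])          # fix the unknown scale of H
    r1, r2, t = s * B[:, 0], s * B[:, 1], s * B[:, 2]
    if t[2] < 0:                               # structure must lie in front of camera
        r1, r2, t = -r1, -r2, -t
    R = np.column_stack([r1, r2, np.cross(r1, r2)])
    U, _, Vt = np.linalg.svd(R)                # project onto the nearest rotation
    return U @ Vt, t
```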

83 citations

Proceedings ArticleDOI
17 Oct 2005
TL;DR: This paper introduces a simple and efficient representation for natural images that gives more compact representations for a wide variety of natural images under a wide range of signal-to-noise ratios than many existing methods, including wavelets.
Abstract: This paper introduces a simple and efficient representation for natural images. We partition an image into blocks and treat the blocks as vectors in a high-dimensional space. We then fit a piecewise linear model (i.e., a union of affine subspaces) to the vectors at each down-sampling scale. We call this a multiscale hybrid linear model of the image. The hybrid and hierarchical structure of this model allows us to effectively extract and exploit multimodal correlations among the imagery data at different scales. It conceptually and computationally remedies limitations of many existing image representation methods that are based on either a fixed linear transformation (e.g., DCT, wavelets), an adaptive unimodal linear transformation (e.g., PCA), or a multimodal model at a single scale. We justify both analytically and experimentally why and how such a simple multiscale hybrid model is able to simultaneously reduce the model complexity and computational cost. Despite a small overhead for the model, our results show that this new model gives more compact representations for a wide variety of natural images under a wide range of signal-to-noise ratios than many existing methods, including wavelets.
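The front end the abstract describes, partitioning into blocks and re-fitting at each down-sampling scale, is mechanical enough to sketch. A minimal version is below (names ours); the per-scale model fit would then be a union-of-affine-subspaces estimation such as the K-subspaces sketch given earlier.

```python
import numpy as np

def image_to_block_vectors(img, b=8):
    """Cut a grayscale image into non-overlapping b x b blocks and stack
    each block as a column vector in R^(b*b)."""
    H, W = img.shape[0] - img.shape[0] % b, img.shape[1] - img.shape[1] % b
    blocks = img[:H, :W].reshape(H // b, b, W // b, b).swapaxes(1, 2)
    return blocks.reshape(-1, b * b).T          # (b*b) x num_blocks

def downsample2(img):
    """Average-pool by a factor of two to move to the next coarser scale."""
    H, W = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    x = img[:H, :W]
    return 0.25 * (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2])

def multiscale_vectors(img, n_scales=3, b=8):
    """Block vectors at each downsampling scale; a hybrid linear model
    (union of affine subspaces) is then fit to each scale's vectors."""
    out = []
    for _ in range(n_scales):
        out.append(image_to_block_vectors(img, b))
        img = downsample2(img)
    return out
```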

42 citations

Patent
10 Aug 2006
TL;DR: A method is presented for automatic detection and segmentation of a target anatomical structure in received 3D volumetric medical images, using a database of expertly delineated anatomical structures.
Abstract: The present invention is directed to a method for automatic detection and segmentation of a target anatomical structure in received three-dimensional (3D) volumetric medical images, using a database of volumetric images with expertly delineated anatomical structures. A 3D anatomical structure detection and segmentation module is trained offline by learning anatomical structure appearance from the set of expertly delineated anatomical structures. A received volumetric image is then searched online for the anatomical structure of interest using the offline-trained 3D anatomical structure detection and segmentation module.
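The patent abstract specifies only the offline-training / online-search split, not the learned model. Purely as an illustration of that workflow, the sketch below uses a random forest over raw 3D patch intensities as a stand-in appearance model; every name, the patch size, and the exhaustive stride search are hypothetical choices of ours, not the patented method.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def patch(vol, center, r=8):
    """Flattened (2r)^3 intensity patch around a voxel (assumed interior)."""
    z, y, x = center
    return vol[z - r:z + r, y - r:y + r, x - r:x + r].ravel()

def train_offline(volumes, centers, n_neg=20, r=8, seed=0):
    """Learn structure appearance from expertly annotated volumes."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for vol, c in zip(volumes, centers):
        X.append(patch(vol, c, r)); y.append(1)          # annotated structure
        for _ in range(n_neg):                            # random background
            c_neg = tuple(rng.integers(r, s - r) for s in vol.shape)
            X.append(patch(vol, c_neg, r)); y.append(0)
    return RandomForestClassifier(n_estimators=100).fit(np.asarray(X), y)

def detect_online(model, vol, r=8, stride=4):
    """Scan a new volume and return the highest-scoring location."""
    best, best_p = None, -1.0
    for z in range(r, vol.shape[0] - r, stride):
        for yy in range(r, vol.shape[1] - r, stride):
            for x in range(r, vol.shape[2] - r, stride):
                p = model.predict_proba([patch(vol, (z, yy, x), r)])[0, 1]
                if p > best_p:
                    best, best_p = (z, yy, x), p
    return best, best_p
```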

41 citations


Cited by
Book
30 Sep 2010
TL;DR: Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images and takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene.
Abstract: Humans perceive the three-dimensional structure of the world with apparent ease. However, despite all of the recent advances in computer vision research, the dream of having a computer interpret an image at the same level as a two-year-old remains elusive. Why is computer vision such a challenging problem, and what is the current state of the art? Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos. More than just a source of recipes, this exceptionally authoritative and comprehensive textbook/reference also takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene. These problems are also analyzed using statistical models and solved using rigorous engineering techniques. Topics and features: structured to support active curricula and project-oriented courses, with tips in the Introduction for using the book in a variety of customized courses; presents exercises at the end of each chapter with a heavy emphasis on testing algorithms and containing numerous suggestions for small mid-term projects; provides additional material and more detailed mathematical topics in the Appendices, which cover linear algebra, numerical techniques, and Bayesian estimation theory; suggests additional reading at the end of each chapter, including the latest research in each sub-field, in addition to a full Bibliography at the end of the book; supplies supplementary course material for students at the associated website, http://szeliski.org/Book/. Suitable for an upper-level undergraduate or graduate-level course in computer science or engineering, this textbook focuses on basic techniques that work under real-world conditions and encourages students to push their creative boundaries. Its design and exposition also make it eminently suitable as a unique reference to the fundamental techniques and current research literature in computer vision.

4,146 citations

Journal ArticleDOI
TL;DR: It is shown that the convex program associated with LRR solves the subspace clustering problem in the following sense: when the data are clean, LRR exactly recovers the true subspace structures; when the data are contaminated by outliers, it is proved that under certain conditions LRR can exactly recover the row space of the original data.
Abstract: In this paper, we address the subspace clustering problem. Given a set of data samples (vectors) approximately drawn from a union of multiple subspaces, our goal is to cluster the samples into their respective subspaces and remove possible outliers as well. To this end, we propose a novel objective function named Low-Rank Representation (LRR), which seeks the lowest rank representation among all the candidates that can represent the data samples as linear combinations of the bases in a given dictionary. It is shown that the convex program associated with LRR solves the subspace clustering problem in the following sense: when the data are clean, we prove that LRR exactly recovers the true subspace structures; when the data are contaminated by outliers, we prove that under certain conditions LRR can exactly recover the row space of the original data and detect the outliers as well; for data corrupted by arbitrary sparse errors, LRR can also approximately recover the row space with theoretical guarantees. Since the subspace membership is provably determined by the row space, these results further imply that LRR can perform robust subspace clustering and error correction in an efficient and effective way.
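The convex program the abstract analyzes is stated compactly enough to transcribe. Below is a minimal cvxpy sketch of LRR with the l2,1 outlier term, using the data matrix itself as the dictionary; the value of lam and the helper name are our choices, and this direct solver form only scales to small toy problems.

```python
import numpy as np
import cvxpy as cp

def lrr(X, lam=0.5):
    """min ||Z||_* + lam * ||E||_{2,1}  s.t.  X = X Z + E.
    Z is the low-rank representation; columns of E with large norm
    flag outlying samples."""
    d, n = X.shape
    Z = cp.Variable((n, n))
    E = cp.Variable((d, n))
    obj = cp.Minimize(cp.normNuc(Z) + lam * cp.sum(cp.norm(E, 2, axis=0)))
    cp.Problem(obj, [X == X @ Z + E]).solve()
    return Z.value, E.value
```

Feeding the symmetrized affinity |Z| + |Z|^T to spectral clustering then yields the subspace segmentation, consistent with the row-space recovery result the abstract states.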

3,085 citations

Journal ArticleDOI
TL;DR: In this article, a sparse subspace clustering algorithm is proposed to cluster high-dimensional data points that lie in a union of low-dimensional subspaces, where a sparse representation corresponds to selecting a few points from the same subspace.
Abstract: Many real-world problems deal with collections of high-dimensional data such as images, videos, text and web documents, and DNA microarray data. Often, such high-dimensional data lie close to low-dimensional structures corresponding to several classes or categories to which the data belong. In this paper, we propose and study an algorithm, called sparse subspace clustering, to cluster data points that lie in a union of low-dimensional subspaces. The key idea is that, among the infinitely many possible representations of a data point in terms of other points, a sparse representation corresponds to selecting a few points from the same subspace. This motivates solving a sparse optimization program whose solution is used in a spectral clustering framework to infer the clustering of the data into subspaces. Since solving the sparse optimization program is in general NP-hard, we consider a convex relaxation and show that, under appropriate conditions on the arrangement of the subspaces and the distribution of the data, the proposed minimization program succeeds in recovering the desired sparse representations. The proposed algorithm is efficient and can handle data points near the intersections of subspaces. Another key advantage of the proposed algorithm with respect to the state of the art is that it can deal directly with data nuisances, such as noise, sparse outlying entries, and missing entries, by incorporating the model of the data into the sparse optimization program. We demonstrate the effectiveness of the proposed algorithm through experiments on synthetic data as well as the two real-world problems of motion segmentation and face clustering.
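The two-stage pipeline described here (sparse self-expression followed by spectral clustering) also fits in a few lines. The sketch below uses the noise-tolerant Lasso form of the self-expression step; the penalty weight gamma and the function name are our choices.

```python
import numpy as np
import cvxpy as cp
from sklearn.cluster import SpectralClustering

def sparse_subspace_clustering(X, n_clusters, gamma=50.0):
    """Each column of X is written as a sparse combination of the others
    (diag(C) == 0 forbids the trivial self-representation); the sparse
    coefficients then define the affinity for spectral clustering."""
    n = X.shape[1]
    C = cp.Variable((n, n))
    obj = cp.Minimize(cp.sum(cp.abs(C)) + gamma * cp.sum_squares(X - X @ C))
    cp.Problem(obj, [cp.diag(C) == 0]).solve()
    W = np.abs(C.value) + np.abs(C.value).T        # symmetric affinity
    return SpectralClustering(n_clusters=n_clusters,
                              affinity='precomputed').fit_predict(W)
```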

2,298 citations

Journal ArticleDOI
The Perception of the Visual World

2,250 citations

Proceedings Article
21 Jun 2010
TL;DR: Both theoretical and experimental results show that low-rank representation is a promising tool for subspace segmentation from corrupted data.
Abstract: We propose low-rank representation (LRR) to segment data drawn from a union of multiple linear (or affine) subspaces. Given a set of data vectors, LRR seeks the lowest-rank representation among all the candidates that represent all vectors as the linear combination of the bases in a dictionary. Unlike the well-known sparse representation (SR), which computes the sparsest representation of each data vector individually, LRR aims at finding the lowest-rank representation of a collection of vectors jointly. LRR better captures the global structure of data, giving a more effective tool for robust subspace segmentation from corrupted data. Both theoretical and experimental results show that LRR is a promising tool for subspace segmentation.
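One clean-data fact worth making concrete: with the data matrix itself as the dictionary and no corruption, the LRR program min ||Z||_* s.t. X = XZ has the closed-form minimizer Z* = V V^T from the skinny SVD X = U S V^T (the shape-interaction matrix). A numpy sketch, with our own rank threshold:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def lrr_clean(X, n_clusters, tol=1e-8):
    """Closed-form LRR for clean data: Z* = V V^T (shape-interaction matrix),
    then spectral clustering on the symmetrized affinity."""
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    r = int((S > tol * S[0]).sum())                 # numerical rank of X
    V = Vt[:r].T
    Z = V @ V.T
    W = np.abs(Z) + np.abs(Z.T)
    return SpectralClustering(n_clusters=n_clusters,
                              affinity='precomputed').fit_predict(W)
```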

1,542 citations