Author

Guo-Xin Zhang

Bio: Guo-Xin Zhang is an academic researcher from Tsinghua University. The author has contributed to research in topics: Image processing & Image segmentation. The author has an h-index of 8 and has co-authored 11 publications receiving 3822 citations.

Papers
Proceedings ArticleDOI
20 Jun 2011
TL;DR: This work proposes a regional contrast based saliency extraction algorithm, which simultaneously evaluates global contrast differences and spatial coherence, and consistently outperformed existing saliency detection methods.
Abstract: Automatic estimation of salient object regions across images, without any prior assumption or knowledge of the contents of the corresponding scenes, enhances many computer vision and computer graphics applications. We introduce a regional contrast based salient object detection algorithm, which simultaneously evaluates global contrast differences and spatial weighted coherence scores. The proposed algorithm is simple, efficient, naturally multi-scale, and produces full-resolution, high-quality saliency maps. These saliency maps are further used to initialize a novel iterative version of GrabCut, namely SaliencyCut, for high quality unsupervised salient object segmentation. We extensively evaluated our algorithm using traditional salient object detection datasets, as well as a more challenging Internet image dataset. Our experimental results demonstrate that our algorithm consistently outperforms 15 existing salient object detection and segmentation methods, yielding higher precision and better recall rates. We also show that our algorithm can be used to efficiently extract salient object masks from Internet images, enabling effective sketch-based image retrieval (SBIR) via simple shape comparisons. Despite such noisy internet images, where the saliency regions are ambiguous, our saliency guided image retrieval achieves a superior retrieval rate compared with state-of-the-art SBIR methods, and additionally provides important target object region information.

3,653 citations

Journal ArticleDOI
TL;DR: A novel image resizing method that attempts to ensure that important local regions undergo a geometric similarity transformation while preserving image edge structure; the method is efficient and offers a closed-form solution.
Abstract: We present a novel image resizing method which attempts to ensure that important local regions undergo a geometric similarity transformation, and at the same time, to preserve image edge structure. To accomplish this, we define handles to describe both local regions and image edges, and assign a weight for each handle based on an importance map for the source image. Inspired by conformal energy, which is widely used in geometry processing, we construct a novel quadratic distortion energy to measure the shape distortion for each handle. The resizing result is obtained by minimizing the weighted sum of the quadratic distortion energies of all handles. Compared to previous methods, our method allows distortion to be diffused better in all directions, and important image edges are well-preserved. The method is efficient, and offers a closed form solution.

228 citations

Journal ArticleDOI
01 Nov 2012
TL;DR: A novel approach for computing high quality point-to-point maps among a collection of related shapes that takes as input a sparse set of imperfect initial maps between pairs of shapes and builds a compact data structure which implicitly encodes an improved set of maps between all pairs of shapes.
Abstract: We introduce a novel approach for computing high quality point-to-point maps among a collection of related shapes. The proposed approach takes as input a sparse set of imperfect initial maps between pairs of shapes and builds a compact data structure which implicitly encodes an improved set of maps between all pairs of shapes. These maps align well with point correspondences selected from initial maps; they map neighboring points to neighboring points; and they provide cycle-consistency, so that map compositions along cycles approximate the identity map. The proposed approach is motivated by the fact that a complete set of maps between all pairs of shapes that admits nearly perfect cycle-consistency is highly redundant and can be represented by compositions of maps through a single base shape. In general, multiple base shapes are needed to adequately cover a diverse collection. Our algorithm sequentially extracts such a small collection of base shapes and creates correspondences from each of these base shapes to all other shapes. These correspondences are found by global optimization on candidate correspondences obtained by diffusing initial maps. These are then used to create a compact graphical data structure from which globally optimal cycle-consistent maps can be extracted using simple graph algorithms. Experimental results on benchmark datasets show that the proposed approach yields significantly better results than state-of-the-art data-driven shape matching methods.

105 citations
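
The redundancy argument in the abstract above can be illustrated with a toy sketch. It assumes, purely for illustration, that every map is a perfect bijection stored as a list of point indices; the shape names, the `base_to` table, and the helper functions are hypothetical and are not taken from the paper. When each pairwise map is built by routing through one base shape, composing maps around any cycle returns the identity.

```python
def invert(mapping):
    """Invert a bijective index map given as a list (source index -> target index)."""
    inverse = [0] * len(mapping)
    for src, dst in enumerate(mapping):
        inverse[dst] = src
    return inverse


def compose(first, second):
    """Return the map that applies `first` and then `second`."""
    return [second[first[i]] for i in range(len(first))]


def map_via_base(base_to, a, b):
    """Map shape `a` to shape `b` by going a -> base -> b."""
    return compose(invert(base_to[a]), base_to[b])


# Hypothetical correspondences from one base shape to three shapes (4 points each).
base_to = {
    "A": [0, 1, 2, 3],
    "B": [2, 0, 3, 1],
    "C": [1, 3, 0, 2],
}

a_to_b = map_via_base(base_to, "A", "B")
b_to_c = map_via_base(base_to, "B", "C")
c_to_a = map_via_base(base_to, "C", "A")

# Composition along the cycle A -> B -> C -> A.
cycle = compose(compose(a_to_b, b_to_c), c_to_a)
identity = list(range(4))
```

Real initial maps are imperfect and partial, which is why the abstract speaks of approximating the identity and of needing several base shapes; the sketch only shows the ideal case.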

Patent
Shi-min Hu, Ming-ming Cheng, Guo-Xin Zhang, Niloy J. Mitra, Xiang Ruan
09 May 2012
TL;DR: An image processing method and image processing device for detecting visual saliency of an image based on regional contrast are presented. The method includes: a segmentation step that segments an input image into a plurality of regions by using an automatic segmentation algorithm; and a computation step that calculates a saliency value of one region of the plurality of segmented regions by using a weighted sum of color differences between the one region and all other regions.
Abstract: The present invention relates to an image processing method and image processing device for detecting visual saliency of an image based on regional contrast. The method includes: a segmentation step that segments an input image into a plurality of regions by using an automatic segmentation algorithm; and a computation step that calculates a saliency value of one region of the plurality of segmented regions by using a weighted sum of color differences between the one region and all other regions. According to the present invention, it is possible to automatically analyze visual saliency regions in an image, and a result of analysis can be used in application areas including significant object segmentation, object recognition, adaptive image compression, content-aware image resizing, and image retrieval.

37 citations
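
The computation step described in the patent abstract above (a saliency value per region as a weighted sum of color differences to all other regions) can be sketched as follows. This is a minimal illustration, not the patented implementation: the segmentation is assumed to be already done, and the choice of weights (region size times a Gaussian falloff in the distance between region centers), the `sigma_sq` value, and the sample data are all assumptions.

```python
import math


def region_saliency(regions, sigma_sq=0.4):
    """Score each region by a weighted sum of color differences to all other regions.

    Each region is a dict with a mean 'color' tuple, a pixel count 'size',
    and a 'center' in normalised image coordinates. The weighting scheme
    is an assumption made for this sketch.
    """
    scores = []
    for k, rk in enumerate(regions):
        total = 0.0
        for i, ri in enumerate(regions):
            if i == k:
                continue
            color_diff = math.dist(rk["color"], ri["color"])
            spatial = math.exp(-math.dist(rk["center"], ri["center"]) ** 2 / sigma_sq)
            total += spatial * ri["size"] * color_diff
        scores.append(total)
    peak = max(scores)
    # Normalise so the most salient region scores 1.0.
    return [s / peak for s in scores] if peak > 0 else scores


# Hypothetical segmentation result: two similar background regions and
# one small region with a clearly different color.
regions = [
    {"color": (50, 0, 0), "size": 400, "center": (0.2, 0.5)},
    {"color": (52, 1, 0), "size": 500, "center": (0.8, 0.5)},
    {"color": (60, 70, 40), "size": 100, "center": (0.5, 0.5)},
]
scores = region_saliency(regions)
```

Weighting by the size of the other region means that contrast against large regions counts for more, so a small region that differs from a large uniform background ends up with the highest score.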

Journal ArticleDOI
TL;DR: This work introduces general Lp norms to shape deformation; the positive parameter p provides the user with a flexible control over the distribution of unavoidable distortions, and demonstrates the effectiveness of the proposed algorithm with various examples.
Abstract: Shape deformation is a fundamental tool in geometric modeling. Existing methods consider preserving local details by minimizing some energy functional measuring local distortions in the L2 norm. This strategy distributes distortions quite uniformly to all the vertices and penalizes outliers. However, there is no unique answer for a natural deformation as it depends on the nature of the objects. Inspired by recent sparse signal reconstruction work with non-L2 norms, we introduce general Lp norms to shape deformation; the positive parameter p provides the user with a flexible control over the distribution of unavoidable distortions. Compared with the traditional L2 norm, using smaller p, distortions tend to be distributed to a sparse set of vertices, typically in feature regions, thus making most areas less distorted and structures better preserved. On the other hand, using larger p tends to distribute distortions more evenly across the whole model. This flexibility is often desirable as it mimics objects made of different materials. By specifying varying p over the shape, more flexible control can be achieved. We demonstrate the effectiveness of the proposed algorithm with various examples.

27 citations
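
The effect of p described in the abstract above can be checked with a small numeric sketch. It assumes the energy is simply the sum of the p-th powers of per-vertex distortion magnitudes, and it compares two hypothetical ways of spreading the same total distortion over four vertices; neither the energy form nor the numbers come from the paper.

```python
def lp_energy(distortions, p):
    """Sum of |d|^p over per-vertex distortion magnitudes (assumed energy form)."""
    return sum(abs(d) ** p for d in distortions)


# Two ways to absorb the same total distortion of 1.0 across four vertices.
sparse = [1.0, 0.0, 0.0, 0.0]    # concentrated on a single vertex
even = [0.25, 0.25, 0.25, 0.25]  # spread uniformly

for p in (0.5, 1.0, 2.0):
    print(p, lp_energy(sparse, p), lp_energy(even, p))
```

Under this assumed energy, a minimizer with p below 1 prefers the sparse distribution, while p = 2 prefers the even one, which matches the qualitative behavior the abstract describes.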


Cited by
Journal ArticleDOI
TL;DR: In this article, a review of deep learning-based object detection frameworks is provided, focusing on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further.
Abstract: Due to object detection’s close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection methods are built on handcrafted features and shallow trainable architectures. Their performance easily stagnates by constructing complex ensembles that combine multiple low-level image features with high-level context from object detectors and scene classifiers. With the rapid development in deep learning, more powerful tools, which are able to learn semantic, high-level, deeper features, are introduced to address the problems existing in traditional architectures. These models behave differently in network architecture, training strategy, and optimization function. In this paper, we provide a review of deep learning-based object detection frameworks. Our review begins with a brief introduction on the history of deep learning and its representative tool, namely, the convolutional neural network. Then, we focus on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further. As distinct specific detection tasks exhibit different characteristics, we also briefly survey several specific tasks, including salient object detection, face detection, and pedestrian detection. Experimental analyses are also provided to compare various methods and draw some meaningful conclusions. Finally, several promising directions and tasks are provided to serve as guidelines for future work in both object detection and relevant neural network-based learning systems.

3,097 citations

Proceedings ArticleDOI
23 Jun 2013
TL;DR: This work considers both foreground and background cues in a different way and ranks the similarity of the image elements with foreground cues or background cues via graph-based manifold ranking, defined based on their relevances to the given seeds or queries.
Abstract: Most existing bottom-up methods measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead of considering the contrast between the salient objects and their surrounding regions, we consider both foreground and background cues in a different way. We rank the similarity of the image elements (pixels or regions) with foreground cues or background cues via graph-based manifold ranking. The saliency of the image elements is defined based on their relevances to the given seeds or queries. We represent the image as a closed-loop graph with superpixels as nodes. These nodes are ranked based on the similarity to background and foreground queries, based on affinity matrices. Saliency detection is carried out in a two-stage scheme to extract background regions and foreground salient objects efficiently. Experimental results on two large benchmark databases demonstrate that the proposed method performs well against state-of-the-art methods in terms of accuracy and speed. We also create a more difficult benchmark database containing 5,172 images to test the proposed saliency model and make this database publicly available with this paper for further studies in the saliency field.

2,278 citations
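
Ranking graph nodes by their relevance to query nodes, as the abstract above describes, can be sketched with the standard manifold-ranking iteration f <- alpha * S * f + (1 - alpha) * y, where S is the symmetrically normalised affinity matrix and y marks the queries. The abstract does not state this formula, so the iteration, the 6-node ring graph, the one-dimensional node features, and the parameter values are all assumptions chosen for illustration.

```python
import math


def manifold_rank(W, y, alpha=0.9, iterations=200):
    """Iterate f <- alpha * S f + (1 - alpha) * y with S = D^-1/2 W D^-1/2."""
    n = len(W)
    degree = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
         for i in range(n)]
    f = list(y)
    for _ in range(iterations):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f


# Hypothetical 1-D features for six nodes on a ring: two clusters of three.
features = [0.0, 0.1, 0.2, 0.9, 1.0, 1.1]
n = len(features)
W = [[0.0] * n for _ in range(n)]
for i in range(n):
    j = (i + 1) % n  # connect each node to its ring neighbour
    w = math.exp(-(features[i] - features[j]) ** 2 / 0.1)
    W[i][j] = W[j][i] = w

query = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # node 0 is the only query
ranks = manifold_rank(W, query)
```

Nodes in the same cluster as the query receive much higher ranking scores than nodes in the other cluster, because the affinities across the cluster boundary are close to zero.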

Proceedings ArticleDOI
16 Jun 2012
TL;DR: A conceptually clear and intuitive algorithm for contrast-based saliency estimation that outperforms all state-of-the-art approaches and can be formulated in a unified way using high-dimensional Gaussian filters.
Abstract: Saliency estimation has become a valuable tool in image processing. Yet, existing approaches exhibit considerable variation in methodology, and it is often difficult to attribute improvements in result quality to specific algorithm properties. In this paper we reconsider some of the design choices of previous methods and propose a conceptually clear and intuitive algorithm for contrast-based saliency estimation. Our algorithm consists of four basic steps. First, our method decomposes a given image into compact, perceptually homogeneous elements that abstract unnecessary detail. Based on this abstraction we compute two measures of contrast that rate the uniqueness and the spatial distribution of these elements. From the element contrast we then derive a saliency measure that produces a pixel-accurate saliency map which uniformly covers the objects of interest and consistently separates fore- and background. We show that the complete contrast and saliency estimation can be formulated in a unified way using high-dimensional Gaussian filters. This contributes to the conceptual simplicity of our method and lends itself to a highly efficient implementation with linear complexity. In a detailed experimental evaluation we analyze the contribution of each individual feature and show that our method outperforms all state-of-the-art approaches.

1,711 citations

Proceedings ArticleDOI
23 Jun 2013
TL;DR: This work tackles saliency detection from a scale point of view and proposes a multi-layer approach to analyze saliency cues, by finding saliency values optimally in a tree model.
Abstract: When dealing with objects with complex structures, saliency detection confronts a critical problem - namely that detection accuracy could be adversely affected if salient foreground or background in an image contains small-scale high-contrast patterns. This issue is common in natural images and forms a fundamental challenge for prior methods. We tackle it from a scale point of view and propose a multi-layer approach to analyze saliency cues. The final saliency map is produced in a hierarchical model. Unlike approaches that vary patch sizes or downsize images, our scale-based region handling finds saliency values optimally in a tree model. Our approach improves saliency detection on many images that cannot be handled well by traditional methods. A new dataset is also constructed.

1,624 citations