Showing papers by "Jian Sun" published in 2014

Book Chapter•DOI•

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Kaiming He¹, Xiangyu Zhang², Shaoqing Ren³, Jian Sun¹•Institutions (3)

Microsoft¹, Xi'an Jiaotong University², University of Science and Technology of China³

06 Sep 2014

TL;DR: This work equips CNNs with a pooling strategy, “spatial pyramid pooling”, that removes the requirement of a fixed-size input image; the resulting network structure, called SPP-net, generates a fixed-length representation regardless of image size/scale.

Abstract: Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g. 224×224) input image. This requirement is “artificial” and may hurt the recognition accuracy for the images or sub-images of an arbitrary size/scale. In this work, we equip the networks with a more principled pooling strategy, “spatial pyramid pooling”, to eliminate the above requirement. The new network structure, called SPP-net, can generate a fixed-length representation regardless of image size/scale. By removing the fixed-size limitation, we can improve all CNN-based image classification methods in general. Our SPP-net achieves state-of-the-art accuracy on the datasets of ImageNet 2012, Pascal VOC 2007, and Caltech101.

3,945 citations

Book Chapter•DOI•

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Kaiming He¹, Xiangyu Zhang², Shaoqing Ren³, Jian Sun¹•Institutions (3)

Microsoft¹, Xi'an Jiaotong University², University of Science and Technology of China³

18 Jun 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper proposes SPP-net, which uses a spatial pyramid pooling strategy to generate a fixed-length representation regardless of image size/scale; in object detection it runs 24-102x faster than R-CNN at test time with better or comparable accuracy on Pascal VOC 2007.

Abstract: Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224x224) input image. This requirement is "artificial" and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale. In this work, we equip the networks with another pooling strategy, "spatial pyramid pooling", to eliminate the above requirement. The new network structure, called SPP-net, can generate a fixed-length representation regardless of image size/scale. Pyramid pooling is also robust to object deformations. With these advantages, SPP-net should in general improve all CNN-based image classification methods. On the ImageNet 2012 dataset, we demonstrate that SPP-net boosts the accuracy of a variety of CNN architectures despite their different designs. On the Pascal VOC 2007 and Caltech101 datasets, SPP-net achieves state-of-the-art classification results using a single full-image representation and no fine-tuning. The power of SPP-net is also significant in object detection. Using SPP-net, we compute the feature maps from the entire image only once, and then pool features in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors. This method avoids repeatedly computing the convolutional features. In processing test images, our method is 24-102x faster than the R-CNN method, while achieving better or comparable accuracy on Pascal VOC 2007. In ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, our methods rank #2 in object detection and #3 in image classification among all 38 teams. This manuscript also introduces the improvement made for this competition.

2,304 citations

Proceedings Article•DOI•

Saliency Optimization from Robust Background Detection

Wangjiang Zhu¹, Shuang Liang², Yichen Wei³, Jian Sun³•Institutions (3)

Tsinghua University¹, Tongji University², Microsoft³

23 Jun 2014

TL;DR: This work proposes a robust background measure, called boundary connectivity, which characterizes the spatial layout of image regions with respect to image boundaries, and a principled optimization framework that integrates multiple low-level cues to obtain clean and uniform saliency maps.

Abstract: Recent progress in salient object detection has exploited the boundary prior, or background information, to assist other saliency cues such as contrast, achieving state-of-the-art results. However, the usage of the boundary prior is very simple and fragile, and the integration with other cues is mostly heuristic. In this work, we present new methods to address these issues. First, we propose a robust background measure, called boundary connectivity. It characterizes the spatial layout of image regions with respect to image boundaries and is much more robust. It has an intuitive geometrical interpretation and presents unique benefits that are absent in previous saliency measures. Second, we propose a principled optimization framework to integrate multiple low-level cues, including our background measure, to obtain clean and uniform saliency maps. Our formulation is intuitive and efficient, and achieves state-of-the-art results on several benchmark datasets.

1,321 citations

Journal Article•DOI•

Face Alignment by Explicit Shape Regression

Xudong Cao¹, Yichen Wei¹, Fang Wen¹, Jian Sun¹•Institutions (1)

Microsoft¹

01 Apr 2014-International Journal of Computer Vision

TL;DR: A very efficient, highly accurate, “Explicit Shape Regression” approach for face alignment that significantly outperforms the state-of-the-art in terms of both accuracy and efficiency.

Abstract: We present a very efficient, highly accurate, "Explicit Shape Regression" approach for face alignment. Unlike previous regression-based approaches, we directly learn a vectorial regression function to infer the whole facial shape (a set of facial landmarks) from the image and explicitly minimize the alignment errors over the training data. The inherent shape constraint is naturally encoded into the regressor in a cascaded learning framework and applied from coarse to fine during the test, without using a fixed parametric shape model as in most previous methods. To make the regression more effective and efficient, we design a two-level boosted regression, shape-indexed features, and a correlation-based feature selection method. This combination enables us to learn accurate models from large training data in a short time (20 min for 2,000 training images), and run regression extremely fast in test (15 ms for an 87-landmark shape). Experiments on challenging data show that our approach significantly outperforms the state-of-the-art in terms of both accuracy and efficiency.

1,239 citations

Proceedings Article•DOI•

Face Alignment at 3000 FPS via Regressing Local Binary Features

Shaoqing Ren¹, Xudong Cao¹, Yichen Wei¹, Jian Sun²•Institutions (2)

University of Science and Technology of China¹, Microsoft²

23 Jun 2014

TL;DR: This paper presents a highly efficient, very accurate regression approach for face alignment that achieves the state-of-the-art results when tested on the current most challenging benchmarks.

Abstract: This paper presents a highly efficient, very accurate regression approach for face alignment. Our approach has two novel components: a set of local binary features, and a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. Our approach achieves the state-of-the-art results when tested on the current most challenging benchmarks. Furthermore, because extracting and regressing local binary features is computationally very cheap, our system is much faster than previous methods. It achieves over 3,000 fps on a desktop or 300 fps on a mobile phone for locating a few dozen landmarks.

974 citations

Proceedings Article•DOI•

Realtime and Robust Hand Tracking from Depth

Chen Qian¹, Xiao Sun², Yichen Wei², Xiaoou Tang¹, Jian Sun²•Institutions (2)

The Chinese University of Hong Kong¹, Microsoft²

23 Jun 2014

TL;DR: A realtime hand tracking system using a depth sensor, built on a hybrid method that combines gradient-based and stochastic optimization to achieve fast convergence and good accuracy; the authors describe it as the first system to achieve such robustness, accuracy, and speed simultaneously.

Abstract: We present a realtime hand tracking system using a depth sensor. It tracks a fully articulated hand under large viewpoints in realtime (25 FPS on a desktop without using a GPU) and with high accuracy (error below 10 mm). To our knowledge, it is the first system that achieves such robustness, accuracy, and speed simultaneously, as verified on challenging real data. Our system is made of several novel techniques. We model a hand simply using a number of spheres and define a fast cost function. Those are critical for realtime performance. We propose a hybrid method that combines gradient based and stochastic optimization methods to achieve fast convergence and good accuracy. We present new finger detection and hand initialization methods that greatly enhance the robustness of tracking.

517 citations

Book Chapter•DOI•

Joint Cascade Face Detection and Alignment

Dong Chen¹, Shaoqing Ren¹, Yichen Wei², Xudong Cao², Jian Sun²•Institutions (2)

University of Science and Technology of China¹, Microsoft²

06 Sep 2014

TL;DR: The key idea is to combine face alignment with detection, observing that aligned face shapes provide better features for face classification; the approach learns the two tasks jointly in the same cascade framework by exploiting recent advances in face alignment.

Abstract: We present a new state-of-the-art approach for face detection. The key idea is to combine face alignment with detection, observing that aligned face shapes provide better features for face classification. To make this combination more effective, our approach learns the two tasks jointly in the same cascade framework, by exploiting recent advances in face alignment. Such joint learning greatly enhances the capability of cascade detection and still retains its realtime performance. Extensive experiments show that our approach achieves the best accuracy on challenging datasets, where all existing solutions are either inaccurate or too slow.

462 citations

Journal Article•DOI•

Optimized Product Quantization

Tiezheng Ge¹, Kaiming He², Qifa Ke², Jian Sun²•Institutions (2)

University of Science and Technology of China¹, Microsoft²

01 Apr 2014-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper optimizes product quantization (PQ) by minimizing quantization distortions with respect to the space decomposition and the quantization codebooks, and evaluates the optimized product quantizers in three applications: compact encoding for exhaustive ranking, inverted multi-indexing for non-exhaustive search, and compacting image representations for image retrieval.

Abstract: Product quantization (PQ) is an effective vector quantization method. A product quantizer can generate an exponentially large codebook at very low memory/time cost. The essence of PQ is to decompose the high-dimensional vector space into the Cartesian product of subspaces and then quantize these subspaces separately. The optimal space decomposition is important for the PQ performance, but still remains an unaddressed issue. In this paper, we optimize PQ by minimizing quantization distortions with respect to the space decomposition and the quantization codebooks. We present two novel solutions to this challenging optimization problem. The first solution iteratively solves two simpler sub-problems. The second solution is based on a Gaussian assumption and provides theoretical analysis of the optimality. We evaluate our optimized product quantizers in three applications: (i) compact encoding for exhaustive ranking [1], (ii) building inverted multi-indexing for non-exhaustive search [2], and (iii) compacting image representations for image retrieval [3]. In all applications our optimized product quantizers outperform existing solutions.

314 citations
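
The basic PQ step summarized in the "Optimized Product Quantization" abstract above (split the vector space into subspaces, quantize each subspace separately) can be sketched as follows. This is a minimal illustration of standard product quantization only; it does not implement the paper's optimization of the space decomposition. The function names `pq_encode` and `pq_decode`, the `(M, k, D // M)` codebook layout, and the use of contiguous coordinate blocks as subspaces are assumptions of this sketch.

```python
import numpy as np

def pq_encode(vectors, codebooks):
    """Encode (N, D) vectors with M sub-codebooks stored as (M, k, D // M).

    Assumes each subspace is a contiguous block of D // M coordinates.
    Returns an (N, M) array of sub-codeword indices.
    """
    num_subspaces, _, sub_dim = codebooks.shape
    codes = np.empty((vectors.shape[0], num_subspaces), dtype=np.int64)
    for m in range(num_subspaces):
        # Slice out the m-th block of coordinates for every vector.
        sub = vectors[:, m * sub_dim:(m + 1) * sub_dim]
        # Squared distance from every sub-vector to every sub-codeword.
        dists = ((sub[:, None, :] - codebooks[m][None, :, :]) ** 2).sum(axis=2)
        codes[:, m] = dists.argmin(axis=1)
    return codes

def pq_decode(codes, codebooks):
    """Reconstruct vectors by concatenating the selected sub-codewords."""
    parts = [codebooks[m][codes[:, m]] for m in range(codebooks.shape[0])]
    return np.concatenate(parts, axis=1)
```

With M sub-codebooks of k codewords each, only M * k sub-codewords are stored, while the implied codebook (their Cartesian product) has k ** M entries, which is the "exponentially large codebook at very low memory/time cost" the abstract refers to.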

Proceedings Article•DOI•

SteadyFlow: Spatially Smooth Optical Flow for Video Stabilization

Shuaicheng Liu¹, Lu Yuan², Ping Tan¹, Jian Sun²•Institutions (2)

National University of Singapore¹, Microsoft²

23 Jun 2014

TL;DR: A novel motion model, SteadyFlow, represents the motion between neighboring video frames for stabilization; it is an optical flow with strong spatial coherence enforced, so that smoothing feature trajectories can be replaced by smoothing pixel profiles, which are motion vectors collected at the same pixel location in the SteadyFlow over time.

Abstract: We propose a novel motion model, SteadyFlow, to represent the motion between neighboring video frames for stabilization. A SteadyFlow is a specific optical flow that enforces strong spatial coherence, such that smoothing feature trajectories can be replaced by smoothing pixel profiles, which are motion vectors collected at the same pixel location in the SteadyFlow over time. In this way, we can avoid brittle feature tracking in a video stabilization system. In addition, SteadyFlow is a more general 2D motion model which can deal with spatially-variant motion. We initialize the SteadyFlow by optical flow and then discard discontinuous motions by a spatial-temporal analysis and fill in missing regions by motion completion. Our experiments demonstrate the effectiveness of our stabilization on challenging real-world videos.

181 citations

Journal Article•DOI•

Image Completion Approaches Using the Statistics of Similar Patches

Kaiming He¹, Jian Sun¹•Institutions (1)

Microsoft¹

11 Jul 2014-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper observes that if the authors match similar patches in the image and obtain their offsets (relative positions), the statistics of these offsets are sparsely distributed and can be incorporated into both matching-based and graph-based methods for image completion.

Abstract: Image completion involves filling missing parts in images. In this paper we address this problem through novel statistics of similar patches. We observe that if we match similar patches in the image and obtain their offsets (relative positions), the statistics of these offsets are sparsely distributed. We further observe that a few dominant offsets provide reliable information for completing the image. Such statistics can be incorporated into both matching-based and graph-based methods for image completion. Experiments show that our method yields better results in various challenging cases, and is faster than existing state-of-the-art methods.

151 citations

Journal Article•DOI•

Fast burst images denoising

Ziwei Liu¹, Lu Yuan², Xiaoou Tang¹, Matthew T. Uyttendaele², Jian Sun²•Institutions (2)

The Chinese University of Hong Kong¹, Microsoft²

19 Nov 2014

TL;DR: A fast denoising method that produces a clean image from a burst of noisy images by introducing a lightweight camera motion representation called homography flow and a mechanism of selecting consistent pixels for temporal fusion to handle scene motion during the capture.

Abstract: This paper presents a fast denoising method that produces a clean image from a burst of noisy images. We accelerate alignment of the images by introducing a lightweight camera motion representation called homography flow. The aligned images are then fused to create a denoised output with rapid per-pixel operations in temporal and spatial domains. To handle scene motion during the capture, a mechanism of selecting consistent pixels for temporal fusion is proposed to "synthesize" a clean, ghost-free image, which can largely reduce the computation of tracking motion between frames. Combined with these efficient solutions, our method runs several orders of magnitude faster than previous work, while the denoising quality is comparable. A smartphone prototype demonstrates that our method is practical and works well on a large variety of real examples.

Proceedings Article•DOI•

Finding Matches in a Haystack: A Max-Pooling Strategy for Graph Matching in the Presence of Outliers

Minsu Cho¹, Jian Sun², Olivier Duchenne³, Jean Ponce⁴•Institutions (4)

French Institute for Research in Computer Science and Automation¹, Xi'an Jiaotong University², Intel³, École Normale Supérieure⁴

23 Jun 2014

TL;DR: A max-pooling approach to graph matching is proposed, which is not only resilient to deformations but also remarkably tolerant to outliers.

Abstract: A major challenge in real-world feature matching problems is to tolerate the numerous outliers arising in typical visual tasks. Variations in object appearance, shape, and structure within the same object class make it harder to distinguish inliers from outliers due to clutters. In this paper, we propose a max-pooling approach to graph matching, which is not only resilient to deformations but also remarkably tolerant to outliers. The proposed algorithm evaluates each candidate match using its most promising neighbors, and gradually propagates the corresponding scores to update the neighbors. As final output, it assigns a reliable score to each match together with its supporting neighbors, thus providing contextual information for further verification. We demonstrate the robustness and utility of our method with synthetic and real image experiments.

Book Chapter•DOI•

Graph Cuts for Supervised Binary Coding

Tiezheng Ge¹, Kaiming He², Jian Sun²•Institutions (2)

University of Science and Technology of China¹, Microsoft²

06 Sep 2014

TL;DR: This paper formulates supervised binary coding as a single optimization problem that involves both the encoding functions and the binary label assignment, and applies the graph cuts algorithm to address the discrete optimization problem involved, with no continuous relaxation.

Abstract: Learning short binary codes is challenged by the inherent discrete nature of the problem. The graph cuts algorithm is a well-studied discrete label assignment solution in computer vision, but has not yet been applied to solve binary coding problems. This is partially because it was unclear how to use it to learn the encoding (hashing) functions for out-of-sample generalization. In this paper, we formulate supervised binary coding as a single optimization problem that involves both the encoding functions and the binary label assignment. Then we apply the graph cuts algorithm to address the discrete optimization problem involved, with no continuous relaxation. This method, named Graph Cuts Coding (GCC), shows competitive results on various datasets.

Posted Content•

Convolutional Neural Networks at Constrained Time Cost

Kaiming He¹, Jian Sun¹•Institutions (1)

Microsoft¹

04 Dec 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, the authors investigate the accuracy of CNNs under constrained time cost and propose a series of controlled comparisons to progressively modify a baseline model while preserving its time complexity, achieving very competitive accuracy in the ImageNet dataset.

Abstract: Though recent advanced convolutional neural networks (CNNs) have been improving the image recognition accuracy, the models are getting more complex and time-consuming. For real-world applications in industrial and commercial scenarios, engineers and developers are often faced with the requirement of a constrained time budget. In this paper, we investigate the accuracy of CNNs under constrained time cost. Under this constraint, the designs of the network architectures should exhibit trade-offs among factors like depth, numbers of filters, filter sizes, etc. With a series of controlled comparisons, we progressively modify a baseline model while preserving its time complexity. This is also helpful for understanding the importance of the factors in network designs. We present an architecture that achieves very competitive accuracy on the ImageNet dataset (11.8% top-5 error, 10-view test), yet is 20% faster than "AlexNet" (16.0% top-5 error, 10-view test).

Posted Content•

A discrete uniformization theorem for polyhedral surfaces II

Xianfeng Gu, Ren Guo, Feng Luo, Jian Sun, Tianqi Wu

18 Jan 2014-arXiv: Geometric Topology

TL;DR: In this paper, a discrete conformality for hyperbolic polyhedral surfaces is introduced and shown to be computable, and it is proved that each hyperbolic polyhedral metric on a closed surface is discrete conformal to a unique hyperbolic polyhedral metric with a given discrete curvature satisfying the Gauss-Bonnet formula, which can be obtained using a discrete Yamabe flow with surgery.

Abstract: A discrete conformality for hyperbolic polyhedral surfaces is introduced in this paper. This discrete conformality is shown to be computable. It is proved that each hyperbolic polyhedral metric on a closed surface is discrete conformal to a unique hyperbolic polyhedral metric with a given discrete curvature satisfying Gauss-Bonnet formula. Furthermore, the hyperbolic polyhedral metric with given curvature can be obtained using a discrete Yamabe flow with surgery. In particular, each hyperbolic polyhedral metric on a closed surface with negative Euler characteristic is discrete conformal to a unique hyperbolic metric.

Patent•

Generic object detection in images

Kaiming He¹, Jian Sun¹, Xiangyu Zhang¹•Institutions (1)

Microsoft¹

09 Oct 2014

TL;DR: A spatial pyramid pooling (SPP) layer is used to generate a fixed-length representation regardless of image size and scale; the feature maps are computed from the entire image once and pooled in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors.

Abstract: Neural networks for object detection in images are used with a spatial pyramid pooling (SPP) layer. Using the SPP network structure, a fixed-length representation is generated regardless of image size and scale. The feature maps are computed from the entire image once, and the features are pooled in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors. Thus, repeated computation of the convolutional features is avoided while accuracy is enhanced.

Patent•

Spatial pyramid pooling networks for image processing

Kaiming He¹, Jian Sun¹, Xiangyu Zhang¹, Shaoqing Ren¹•Institutions (1)

Microsoft¹

09 Oct 2014

TL;DR: Spatial pyramid pooling (SPP) layers are combined with convolutional layers and partition an input image into divisions from finer to coarser levels, and aggregate local features in the divisions.

Abstract: Spatial pyramid pooling (SPP) layers are combined with convolutional layers and partition an input image into divisions from finer to coarser levels, and aggregate local features in the divisions. A fixed-length output may be generated by the SPP layer(s) regardless of the input size. The multi-level spatial bins used by the SPP layer(s) may provide robustness to object deformations. An SPP layer based system may pool features extracted at variable scales due to the flexibility of input scales making it possible to generate a full-image representation for testing. Moreover, SPP networks may enable feeding of images with varying sizes or scales during training, which may increase scale-invariance and reduce the risk of over-fitting.
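
The pooling step described in the abstract above (partition the feature map into bins at several levels, aggregate within each bin, and concatenate) can be sketched as follows. This is an illustrative sketch, not the patented implementation: the function name `spatial_pyramid_pool`, the pyramid levels `(1, 2, 4)`, the use of max as the aggregation, and the floor/ceil rounding of bin boundaries are all assumptions made here.

```python
import numpy as np

def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map into a fixed-length vector.

    For each pyramid level n (an assumed choice), the map is split into
    an n x n grid of bins and every bin is max-pooled per channel.
    The output length is C * sum(n * n for n in levels), independent of H, W.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        # Real-valued bin edges that cover the full height and width.
        row_edges = np.linspace(0, height, n + 1)
        col_edges = np.linspace(0, width, n + 1)
        for i in range(n):
            # Round outward so every bin holds at least one row;
            # neighbouring bins may therefore overlap by one row.
            r0 = int(np.floor(row_edges[i]))
            r1 = int(np.ceil(row_edges[i + 1]))
            for j in range(n):
                c0 = int(np.floor(col_edges[j]))
                c1 = int(np.ceil(col_edges[j + 1]))
                pooled.append(feature_map[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)
```

Feature maps of shape `(8, 13, 17)` and `(8, 24, 24)` both yield a vector of length 8 * (1 + 4 + 16) = 168 with the default levels, which illustrates why a fixed-length output is produced regardless of input size.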

Proceedings Article•DOI•

Gromov-Hausdorff Approximation of Filament Structure Using Reeb-type Graph

Frédéric Chazal¹, Jian Sun²•Institutions (2)

French Institute for Research in Computer Science and Automation¹, Tsinghua University²

08 Jun 2014

TL;DR: It is proved that filamentary structures that can be seen as topological metric graphs can be approximated, with respect to the Gromov-Hausdorff distance by well-chosen Reeb graphs (and some of their variants) and an efficient and easy to implement algorithm to compute such approximations is provided.

Abstract: In many real-world applications data appear to be sampled around 1-dimensional filamentary structures that can be seen as topological metric graphs. In this paper we address the metric reconstruction problem of such filamentary structures from data sampled around them. We prove that they can be approximated, with respect to the Gromov-Hausdorff distance by well-chosen Reeb graphs (and some of their variants) and we provide an efficient and easy to implement algorithm to compute such approximations in almost linear time. We illustrate the performances of our algorithm on a few data sets.

Book Chapter•DOI•

Well Begun Is Half Done: Generating High-Quality Seeds for Automatic Image Dataset Construction from Web

Yan Xia¹, Xudong Cao², Fang Wen², Jian Sun²•Institutions (2)

University of Science and Technology of China¹, Microsoft²

06 Sep 2014

TL;DR: This work proposes a density score based on rank-order distance to identify positive seed images and guarantees the selected seeds are as numerous and accurate as possible through adaptive thresholding.

Abstract: We present a fully automatic approach to construct a large-scale, high-precision dataset from noisy web images. Within the entire pipeline, we focus on generating high-quality seed images for subsequent dataset growing. High-quality seeds are essential, as we show, but how to generate them automatically has received relatively little attention in previous works. In this work, we propose a density score based on rank-order distance to identify positive seed images. The basic idea is that images relevant to a concept are typically tightly clustered, while the outliers are widely scattered. Through adaptive thresholding, we ensure that the selected seeds are as numerous and accurate as possible. Starting with the high-quality seeds, we grow a high-quality dataset by dividing seeds and conducting iterative negative and positive mining. Our system can automatically collect thousands of images for one concept/class, with a precision rate of 95% or more. Comparisons with recent state-of-the-art methods also demonstrate our method’s superior performance.

Proceedings Article•DOI•

Product Sparse Coding

Tiezheng Ge¹, Kaiming He², Jian Sun²•Institutions (2)

University of Science and Technology of China¹, Microsoft²

23 Jun 2014

TL;DR: This paper studies a special case of sparse coding in which the codebook is a Cartesian product of two subcodebooks, and presents algorithms to decompose this sparse coding problem into smaller subproblems, which can be separately solved.

Abstract: Sparse coding is a widely used technique in computer vision. However, the expensive computational cost can hamper its applications, typically when the codebook size must be limited due to concerns on running time. In this paper, we study a special case of sparse coding in which the codebook is a Cartesian product of two subcodebooks. We present algorithms to decompose this sparse coding problem into smaller subproblems, which can be separately solved. Our solution, named Product Sparse Coding (PSC), reduces the time complexity from O(K) to O(√K) in the codebook size K. In practice, this can be 20-100x faster than standard sparse coding. In experiments we demonstrate the efficiency and quality of this method on the applications of image classification and image retrieval.

Posted Content•

Convergence of the Point Integral method for Poisson equation on point cloud

Zuoqiang Shi, Jian Sun

10 Mar 2014-arXiv: Numerical Analysis

TL;DR: The convergence of Point Integral method (PIM) for Poisson equation with Neumann boundary condition on submanifolds isometrically embedded in Euclidean spaces is analyzed.

Abstract: The Laplace-Beltrami operator (LBO) is a fundamental object associated to Riemannian manifolds, which encodes all intrinsic geometry of the manifolds and has many desirable properties. Recently, we proposed a novel numerical method, the Point Integral method (PIM), to discretize the Laplace-Beltrami operator on point clouds [LSS]. In this paper, we analyze the convergence of PIM for the Poisson equation with Neumann boundary condition on submanifolds isometrically embedded in Euclidean spaces.

Patent•

Face alignment with shape regression

Xudong Cao¹, Yichen Wei¹, Jian Sun¹, Shaoqing Ren¹•Institutions (1)

Microsoft¹

22 Aug 2014

TL;DR: In this paper, a method, computer storage medium, and system are provided for face alignment via shape regression, which comprises receiving an image including a face; and performing shape regression to estimate a facial shape in the image.

Abstract: The subject matter described herein relates to face alignment via shape regression. A method, computer storage medium, and system are provided. The method comprises receiving an image including a face; and performing shape regression to estimate a facial shape in the image. For each stage in the shape regression, a local feature is extracted from a local region around each facial landmark in the image independently; and a joint projection is performed based on local features of multiple facial landmarks to predict a facial shape increment. Then, a facial shape of a current stage is generated based on the predicted facial shape increment and a facial shape of a previous stage.
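
The stage-wise update described in the abstract above (independent local features per landmark, a joint projection to a shape increment, then current shape = previous shape + increment) can be sketched as follows. This is a hedged illustration only: the function name `run_cascade`, the callable `extract_local_feature`, and the representation of the joint projection as a single linear matrix `W` per stage are hypothetical choices made for this sketch and are not taken from the patent.

```python
import numpy as np

def run_cascade(image, initial_shape, stages, extract_local_feature):
    """Refine an (L, 2) array of landmark coordinates stage by stage.

    stages: list of projection matrices W, each of shape (D_total, 2 * L),
            where D_total is the length of all local features concatenated
            (a hypothetical linear form of the "joint projection").
    extract_local_feature: hypothetical callable (image, point) -> 1-D array.
    """
    shape = initial_shape.astype(float)  # copy, so the input is not modified
    for W in stages:
        # 1. Local feature around each landmark, computed independently.
        feats = [extract_local_feature(image, point) for point in shape]
        # 2. Joint projection of all local features to a shape increment.
        increment = np.concatenate(feats) @ W
        # 3. Current-stage shape = previous-stage shape + increment.
        shape = shape + increment.reshape(shape.shape)
    return shape
```

In a trained system each `W` and the feature extractor would be learned from data; here they are supplied by the caller so that the control flow of the cascade stays visible.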

Journal Article•DOI•

Rigidity of infinite hexagonal triangulation of the plane

Tianqi Wu¹, Xianfeng Gu², Jian Sun¹•Institutions (2)

Tsinghua University¹, Stony Brook University²

12 Nov 2014-Transactions of the American Mathematical Society

Abstract: In the paper, we consider the rigidity problem of the infinite hexagonal triangulation of the plane under the piecewise linear conformal changes introduced by Luo in [5]. Our result shows that if a geometric hexagonal triangulation of the plane is PL conformal to the regular hexagonal triangulation and all inner angles are in [δ, π/2 − δ] for any constant δ > 0, then it is the regular hexagonal triangulation. This partially solves a conjecture of Luo [4]. The proof uses the concept of quasi-harmonic functions to unfold the properties of the mesh. Preprint: arXiv:1306.3630v1 [math.GT], 16 Jun 2013.

Posted Content•

Point Integral Method for Solving Poisson-type Equations on Manifolds from Point Clouds with Convergence Guarantees

Zhen Li¹, Zuoqiang Shi¹, Jian Sun¹•Institutions (1)

Tsinghua University¹

09 Sep 2014-arXiv: Numerical Analysis

TL;DR: In this article, a point integral method (PIM) is proposed to solve Poisson-type equations from point clouds with convergence guarantees; the key idea is to derive integral equations that approximate the Poisson-type equations and contain no derivatives, only the values of the unknown function.

Abstract: Partial differential equations (PDEs) on manifolds arise in many areas, including mathematics and many applied fields. Among all kinds of PDEs, the Poisson-type equations, including the standard Poisson equation and the related eigenproblem of the Laplace-Beltrami operator, are among the most important. Due to the complicated geometrical structure of the manifold, it is difficult to obtain efficient numerical methods for solving PDEs on manifolds. In this paper, we propose a method called the point integral method (PIM) to solve Poisson-type equations from point clouds with convergence guarantees. In PIM, the key idea is to derive integral equations that approximate the Poisson-type equations and contain no derivatives, only the values of the unknown function. The absence of derivatives makes the integral equations easy to approximate from point clouds. We explain the derivation of the integral equations, describe the point integral method and its implementation, and present numerical experiments that demonstrate the convergence of PIM.

Patent•

Content-aware image rotation

Kaiming He¹, Huiwen Chang¹, Jian Sun¹•Institutions (1)

Microsoft¹

24 Nov 2014

TL;DR: A mesh is formed over an image and image lines in the image content are identified. The image is then warped using an energy function that rotates a subset of the lines by a predetermined rotation angle, while rotating other lines by an angle other than the predetermined rotation angle.

Abstract: According to implementations of this disclosure, image content is rotated in a content-aware fashion. In one implementation, a mesh is formed over an image and image lines in the image content are identified. The image is warped using an energy function that rotates a subset of the lines by a predetermined rotation angle, while rotating other lines by an angle other than the predetermined rotation angle. In one example, lines that are intended to be horizontal or vertical after correction are rotated by a rotation angle that will make them horizontal or vertical, whereas oblique lines are rotated by an angle other than the rotation angle.

Journal Article•DOI•

A Fourier-theoretic approach for inferring symmetries

Xiaoye Jiang¹, Jian Sun², Leonidas J. Guibas¹•Institutions (2)

Stanford University¹, Tsinghua University²

01 Feb 2014-Computational Geometry: Theory and Applications

TL;DR: Using the Fourier-theoretic approach, an efficient marginal-based search strategy is constructed, which can recover the symmetry group G effectively and can fully determine the symmetries of various geometric objects.

Abstract: In this paper, we propose a novel Fourier-theoretic approach for estimating the symmetry group G of a geometric object X. Our approach takes as input a geometric similarity matrix between low-order combinations of features of X and then searches within the tree of all feature permutations to detect the sparse subset that defines the symmetry group G of X. Using the Fourier-theoretic approach, we construct an efficient marginal-based search strategy, which can recover the symmetry group G effectively. The framework introduced in this paper can be used to discover symmetries of more abstract geometric spaces and is robust to deformation noise. Experimental results show that our approach can fully determine the symmetries of various geometric objects.

Journal Article•DOI•

Optimal mass transport for geometric modeling based on variational principles in convex geometry

Zhengyu Su¹, Jian Sun², Xianfeng Gu¹, Feng Luo², Shing-Tung Yau³•Institutions (3)

Stony Brook University¹, Tsinghua University², Harvard University³

01 Oct 2014-Engineering With Computers

TL;DR: This work proposes a novel area-preserving parameterization method, which is based on an optimal mass transport theory and convex geometry and formulates the solution to the optimal transport map as the unique optimum of a convex energy.

Abstract: In geometric modeling, surface parameterization plays an important role in converting triangle meshes to spline surfaces. Parameterization introduces distortions. Conventional parameterization methods emphasize angle preservation, which may induce huge area distortions, cause large spline fitting errors, and trigger numerical instabilities. To overcome this difficulty, this work proposes a novel area-preserving parameterization method based on optimal mass transport theory and convex geometry. An optimal mass transport mapping is measure-preserving and minimizes the transportation cost. According to Brenier's theorem, for quadratic distance transportation costs, the optimal mass transport map is the gradient of a convex function. The graph of the convex function is a convex polyhedron with prescribed normals and areas. The existence and uniqueness of such a polyhedron are established by the Minkowski-Alexandrov theorem in convex geometry. This work gives an explicit method to construct such a polyhedron based on the variational principle, and formulates the solution to the optimal transport map as the unique optimum of a convex energy. In practice, the energy optimization can be carried out using Newton's method, and each iteration constructs a power Voronoi diagram dynamically. We tested the proposed algorithms on 3D surfaces scanned from real life. Experimental results demonstrate the efficiency and efficacy of the proposed variational approach for the optimal transport map.
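
The variational idea in this abstract can be illustrated with a small planar sketch. It makes several assumptions that are not from the paper: the source measure is uniform on the unit square and approximated by sample points, plain gradient descent replaces Newton's method, and power cell areas are estimated by counting samples rather than by constructing an exact power Voronoi diagram. The function names and parameters are hypothetical. The convex function is u_h(x) = max_i(⟨x, p_i⟩ + h_i), and the energy E(h) = ∫ u_h dμ − Σ ν_i h_i has gradient area(W_i) − ν_i, where W_i is the cell on which site p_i attains the maximum.

```python
import numpy as np

def power_cell_areas(samples, sites, h):
    """Fraction of samples in each power cell.

    A sample x belongs to the cell of the site p_i that maximizes
    <x, p_i> + h_i, i.e. the piece of the convex function
    u_h(x) = max_i(<x, p_i> + h_i) that is active at x.
    """
    owner = np.argmax(samples @ sites.T + h, axis=1)
    return np.bincount(owner, minlength=len(sites)) / len(samples)

def solve_heights(samples, sites, target, step=0.1, iters=500):
    """Gradient descent on the convex energy (a stand-in for Newton's method).

    The gradient with respect to h_i is (cell area - target area), so
    cells that are too large have their height lowered, and vice versa.
    """
    h = -0.5 * np.sum(sites ** 2, axis=1)   # start from the ordinary Voronoi diagram
    for _ in range(iters):
        h -= step * (power_cell_areas(samples, sites, h) - target)
    return h

rng = np.random.default_rng(0)
samples = rng.random((20000, 2))             # uniform source on the unit square
sites = np.array([[0.25, 0.25], [0.75, 0.25], [0.25, 0.75], [0.75, 0.75]])
target = np.array([0.4, 0.3, 0.2, 0.1])      # prescribed cell areas
h = solve_heights(samples, sites, target)
areas = power_cell_areas(samples, sites, h)  # should approach the target areas
```

Once the heights are found, the transport map sends every point of cell W_i to its site p_i, which is the gradient of u_h on that cell.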

Patent•

Visual based identity tracking

Tommer Leyvand¹, Mitchell Stephen Dernis¹, Jinyu Li¹, Yichen Wei¹, Jian Sun¹, Meekhof Casey Leon¹, Timothy Milton Keosababian¹•Institutions (1)

Microsoft¹

20 Oct 2014

TL;DR: A video game system (or other data processing system) can visually identify a person entering a field of view of the system and determine whether the person has been previously interacting with the system.

Abstract: A video game system (or other data processing system) can visually identify a person entering a field of view of the system and determine whether the person has been previously interacting with the system. In one embodiment, the system establishes thresholds, enrolls players, performs the video game (or other application) including interacting with a subset of the players based on the enrolling, determines that a person has become detectable in the field of view of the system, automatically determines whether the person is one of the enrolled players, maps the person to an enrolled player and interacts with the person based on the mapping if it is determined that the person is one of the enrolled players, and assigns a new identification to the person and interacts with the person based on the new identification if it is determined that the person is not one of the enrolled players.

Posted Content•

Discrete Conformal Deformation: Algorithm and Experiments

Jian Sun, Tianqi Wu, Xianfeng Gu, Feng Luo

22 Dec 2014-arXiv: Computational Geometry

TL;DR: In this paper, the authors introduce a definition of discrete conformality for triangulated surfaces with flat cone metrics and describe an algorithm for solving the problem of prescribing curvature, that is, deforming the metric discrete conformally so that the curvature of the resulting metric coincides with the prescribed curvature.

Abstract: In this paper, we introduce a definition of discrete conformality for triangulated surfaces with flat cone metrics and describe an algorithm for solving the problem of prescribing curvature, that is, deforming the metric discrete conformally so that the curvature of the resulting metric coincides with the prescribed curvature. We explicitly construct a discrete conformal map between the input triangulated surface and the deformed triangulated surface. Our algorithm can handle surfaces of any topology, with or without boundary, and can find a deformed metric for any prescribed curvature satisfying the Gauss-Bonnet formula. In addition, we present numerical examples to show the convergence of our discrete conformality and to demonstrate the efficiency and robustness of our algorithm.
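
The curvature and Gauss-Bonnet condition mentioned in this abstract can be made concrete with a short sketch. It assumes the standard angle-defect curvature of a flat cone metric on a closed triangulated surface (2π minus the sum of corner angles at a vertex). The function names and the edge-length dictionary layout are hypothetical, and the sketch does not implement the paper's deformation algorithm.

```python
import math

def corner_angle(a, b, c):
    """Angle opposite side a in a Euclidean triangle with side lengths a, b, c
    (law of cosines)."""
    return math.acos((b * b + c * c - a * a) / (2 * b * c))

def discrete_curvature(triangles, length):
    """Angle-defect curvature at every vertex of a closed triangulated surface.

    triangles: list of vertex triples (i, j, k)
    length:    dict mapping frozenset({i, j}) to the length of edge ij
    """
    K = {}
    for tri in triangles:
        for n in range(3):
            v, p, q = tri[n], tri[(n + 1) % 3], tri[(n + 2) % 3]
            a = length[frozenset((p, q))]   # side opposite v
            b = length[frozenset((v, p))]
            c = length[frozenset((v, q))]
            K[v] = K.get(v, 2 * math.pi) - corner_angle(a, b, c)
    return K

# Regular tetrahedron with unit edges: a triangulated sphere (Euler characteristic 2).
tris = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
length = {frozenset(e): 1.0 for e in edges}
K = discrete_curvature(tris, length)
total = sum(K.values())   # Gauss-Bonnet: equals 2*pi*2 = 4*pi up to rounding
```

A prescribed curvature is admissible in the Gauss-Bonnet sense only if its total equals 2π times the Euler characteristic, which is the check the last line performs for the tetrahedron.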
