Author

Yinting Wang

Bio: Yinting Wang is an academic researcher from Zhejiang University. The author has contributed to research in the topics of iterative reconstruction and video tracking, has an h-index of 8, and has co-authored 8 publications receiving 213 citations. Previous affiliations of Yinting Wang include the National University of Singapore.

Papers
Proceedings ArticleDOI
16 Jun 2012
TL;DR: Though the depth image is noisy, incomplete, and low-resolution, it facilitates both camera motion estimation and frame warping, which make video stabilization a much better-posed problem.
Abstract: Previous video stabilization methods often employ homographies to model transitions between consecutive frames, or require robust long feature tracks. However, the homography model is invalid for scenes with significant depth variations, and feature point tracking is fragile in videos with textureless objects, severe occlusion, or camera rotation. To address these challenging cases, we propose to solve video stabilization with an additional depth sensor such as the Kinect camera. Though the depth image is noisy, incomplete, and low-resolution, it facilitates both camera motion estimation and frame warping, which make video stabilization a much better-posed problem. The experiments demonstrate the effectiveness of our algorithm.
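
As a rough illustration of how a depth map aids frame warping, here is a minimal sketch that backward-warps a frame into a stabilized camera pose using per-pixel depth. The intrinsics K, the pose (R, t), and the depth approximation are illustrative assumptions, not the paper's pipeline.

```python
# Minimal sketch of depth-aided frame warping, assuming intrinsics K and a
# rotation/translation (R, t) from the original to the stabilized camera.
import numpy as np
import cv2

def warp_with_depth(frame, depth, K, R, t):
    """Backward-warp `frame` into a stabilized camera pose.

    Approximation: the depth of each output pixel is read from the input
    depth map at the same location, which is tolerable for the small pose
    corrections stabilization applies. Pixels with depth 0 (no sensor
    reading) would need inpainting in a real system.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w, dtype=np.float32),
                       np.arange(h, dtype=np.float32))
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w, np.float32)])  # 3xN
    rays = np.linalg.inv(K) @ pix                    # back-project pixels
    pts = rays * depth.ravel()                       # 3D points (new pose)
    pts_src = R.T @ (pts - t.reshape(3, 1))          # into original camera
    z = np.maximum(pts_src[2], 1e-6)
    proj = K @ (pts_src / z)                         # source pixel coords
    map_x = proj[0].reshape(h, w).astype(np.float32)
    map_y = proj[1].reshape(h, w).astype(np.float32)
    return cv2.remap(frame, map_x, map_y, cv2.INTER_LINEAR)
```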

114 citations

Journal ArticleDOI
TL;DR: This paper proposes the bright channel prior, based on the statistics of well-exposed images, to estimate the relative exposure in local image regions and shows that the results generated by the exposure correction method are preferred over those of existing methods.
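
By analogy with the dark channel prior, the bright channel of an image can be computed as the local-patch maximum intensity over all color channels, and then read as a rough exposure estimate. The sketch below assumes a patch size and a clipping step; the paper's exact estimator may differ.

```python
# Minimal sketch of a bright channel computation; patch size is an assumption.
import numpy as np
from scipy.ndimage import maximum_filter

def bright_channel(img, patch=15):
    """img: HxWx3 float image in [0, 1]. Returns the HxW bright channel:
    the maximum intensity over all channels within a local patch."""
    per_pixel_max = img.max(axis=2)          # max over R, G, B at each pixel
    return maximum_filter(per_pixel_max, size=patch)

def exposure_map(img, patch=15, eps=1e-3):
    """Read the bright channel as a rough local-exposure estimate:
    well-exposed regions sit near 1, underexposed regions well below."""
    return np.clip(bright_channel(img, patch), eps, 1.0)
```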

56 citations

Journal ArticleDOI
TL;DR: This paper proposes a patch-based method to remove the out-of-focus blur of a video and build an all-in-focus video, and employs the idea of a bilateral filter to temporally smooth the reconstructed video.
Abstract: Amateur videos often contain focusing issues. A focusing mistake may produce out-of-focus blur, which seriously degrades the expressive force of the video. In this paper, we propose a patch-based method to remove the out-of-focus blur of a video and build an all-in-focus video. We assume that an out-of-focus blurry region in one frame will be clear in a portion of other frames; thus, the clear corresponding regions can be used to reconstruct the blurry one. We divide each video frame into a grid of patches and track each patch in the surrounding frames. We independently reconstruct each video frame by building a Markov random field model to identify the optimal target patches that are sharp, similar to the original patches, and coherent with their neighboring patches within the overlapped regions. To recover an all-in-focus video, an iterative framework is utilized, in which the reconstructed video of each iteration is substituted in the next iteration. Finally, we employ the idea of a bilateral filter to temporally smooth the reconstructed video. The experimental results and comparisons with previous works demonstrate the effectiveness of our method.
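
The Markov random field needs a data term that says how sharp each candidate patch is. The exact measure is not given in this summary; the variance of the Laplacian is a common stand-in, sketched below on a grid of patches.

```python
# Hypothetical per-patch sharpness score (variance of the Laplacian); the
# paper's actual MRF data term may use a different measure.
import numpy as np
import cv2

def patch_sharpness(frame_gray, patch=32):
    """Divide a grayscale frame into a grid of patches and return the
    variance-of-Laplacian sharpness of each patch (higher = sharper)."""
    lap = cv2.Laplacian(frame_gray.astype(np.float64), cv2.CV_64F)
    gh, gw = frame_gray.shape[0] // patch, frame_gray.shape[1] // patch
    scores = np.empty((gh, gw))
    for i in range(gh):
        for j in range(gw):
            block = lap[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            scores[i, j] = block.var()
    return scores
```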

19 citations

Proceedings Article
01 Jan 2012
TL;DR: A manifold embedding algorithm is developed to transform different-sized graphlets into equal-length feature vectors and integrate these feature vectors into a kernel, which is used to train an SVM classifier for aerial image category recognition.
Abstract: This paper presents a method for recognizing aerial image categories based on matching graphlets (i.e., small connected subgraphs) extracted from aerial images. By constructing a Region Adjacency Graph (RAG) to encode the geometric property and the color distribution of each aerial image, we cast aerial image category recognition as RAG-to-RAG matching. Based on graph theory, RAG-to-RAG matching is conducted by matching all their respective graphlets. Towards an effective graphlet matching process, we develop a manifold embedding algorithm to transform different-sized graphlets into equal-length feature vectors and further integrate these feature vectors into a kernel. This kernel is used to train an SVM [8] classifier for aerial image category recognition. Experimental results demonstrate that our method outperforms several state-of-the-art object/scene recognition models.
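
To make the RAG construction step concrete, the sketch below builds a region adjacency graph from a segmentation label map, with mean color as the node attribute and 4-connectivity as the adjacency rule; both choices are illustrative assumptions.

```python
# Illustrative RAG builder: nodes carry mean color, edges link segments
# that touch under 4-connectivity. The paper's RAG also encodes geometry.
import numpy as np

def build_rag(labels, img):
    """labels: HxW integer segment ids; img: HxWx3 image.
    Returns (nodes, edges)."""
    nodes = {int(s): img[labels == s].mean(axis=0) for s in np.unique(labels)}

    edges = set()
    # Two segments are adjacent if their labels differ across a pixel border.
    horiz = labels[:, :-1] != labels[:, 1:]
    vert = labels[:-1, :] != labels[1:, :]
    for a, b in zip(labels[:, :-1][horiz], labels[:, 1:][horiz]):
        edges.add((min(int(a), int(b)), max(int(a), int(b))))
    for a, b in zip(labels[:-1, :][vert], labels[1:, :][vert]):
        edges.add((min(int(a), int(b)), max(int(a), int(b))))
    return nodes, edges
```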

17 citations

Journal ArticleDOI
TL;DR: This article investigates human-scenery positional relationships and constructs a photographic assistance system to optimize the position of human subjects in a given background scene, thereby assisting the user in capturing high-quality souvenir photos.
Abstract: People often take photographs at tourist sites, and these pictures usually have two main elements: a person in the foreground and scenery in the background. This type of “souvenir photo” is one of the most common photos taken by tourists. Although algorithms that aid a user-photographer in taking a well-composed picture of a scene exist [Ni et al. 2013], few studies have addressed the issue of properly positioning human subjects in photographs. In photography, there are common guidelines for composing portrait images. However, these rules usually do not consider the background scene. Therefore, in this article, we investigate human-scenery positional relationships and construct a photographic assistance system to optimize the position of human subjects in a given background scene, thereby assisting the user in capturing high-quality souvenir photos. We collect thousands of well-composed portrait photographs to learn human-scenery aesthetic composition rules. In addition, we define a set of negative rules to exclude undesirable compositions. Recommendation results are achieved by combining the learned positive rules with our proposed negative rules. We implement the proposed system as an Android application on a smartphone. The system demonstrates its efficacy by producing well-composed souvenir photos.
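
To make the rule combination concrete, the sketch below scores candidate subject positions with one illustrative positive rule (proximity to a rule-of-thirds power point) and one negative rule (a penalty for occluding salient scenery); both are simplified stand-ins for the learned rules.

```python
# Illustrative position scoring: positive rule minus negative rule.
import numpy as np

def recommend_position(scene_saliency, step=10):
    """scene_saliency: HxW map of scenery importance in [0, 1].
    Returns the (x, y) subject position with the best composition score."""
    h, w = scene_saliency.shape
    thirds = [(w / 3, 2 * h / 3), (2 * w / 3, 2 * h / 3)]  # lower power points
    best, best_score = None, -np.inf
    for y in range(0, h, step):
        for x in range(0, w, step):
            # Positive: proximity to a rule-of-thirds power point.
            d = min(np.hypot(x - tx, y - ty) for tx, ty in thirds)
            positive = np.exp(-d ** 2 / (2 * (0.1 * w) ** 2))
            # Negative: standing in front of salient scenery.
            score = positive - scene_saliency[y, x]
            if score > best_score:
                best, best_score = (x, y), score
    return best
```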

14 citations


Cited by
Journal ArticleDOI
TL;DR: A fusion-based method for enhancing various weakly illuminated images that requires only a single input image and strikes a trade-off among detail enhancement, local contrast improvement, and preservation of the image's natural feel.
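
A minimal version of single-input fusion is sketched below: several differently exposed variants are derived from one weakly illuminated image and blended with per-pixel weights. The gamma values and the well-exposedness weight are assumptions; the paper's derived inputs and weights differ.

```python
# Minimal single-input fusion sketch; gammas and weights are assumptions.
import numpy as np

def fuse_single_image(img, gammas=(0.4, 0.7, 1.0)):
    """img: HxWx3 float image in [0, 1]. Returns the fused enhancement."""
    inputs = [img ** g for g in gammas]            # brightened variants
    weights = []
    for x in inputs:
        lum = x.mean(axis=2)
        # Well-exposedness: favor pixels near mid-gray in each variant.
        weights.append(np.exp(-((lum - 0.5) ** 2) / (2 * 0.2 ** 2)))
    weights = np.stack(weights)
    weights /= weights.sum(axis=0, keepdims=True) + 1e-8
    fused = sum(wi[..., None] * xi for wi, xi in zip(weights, inputs))
    return np.clip(fused, 0, 1)
```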

464 citations

Journal ArticleDOI
21 Jul 2013
TL;DR: A novel video stabilization method that models camera motion with a bundle of (multiple) camera paths, built on a mesh-based, spatially variant motion representation and an adaptive space-time path optimization, and that introduces the 'as-similar-as-possible' idea to make motion estimation more robust.
Abstract: We present a novel video stabilization method which models camera motion with a bundle of (multiple) camera paths. The proposed model is based on a mesh-based, spatially-variant motion representation and an adaptive, space-time path optimization. Our motion representation allows us to fundamentally handle parallax and rolling shutter effects while it does not require long feature trajectories or sparse 3D reconstruction. We introduce the 'as-similar-as-possible' idea to make motion estimation more robust. Our space-time path smoothing adaptively adjusts smoothness strength by considering discontinuities, cropping size and geometrical distortion in a unified optimization framework. The evaluation on a large variety of consumer videos demonstrates the merits of our method.
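
The core of path smoothing can be written as a small optimization: keep the smoothed path P close to the estimated camera path C while penalizing frame-to-frame change. The sketch below solves this for a single path with a fixed smoothness weight, a stand-in for the paper's adaptive, discontinuity-aware weighting over a bundle of paths.

```python
# Single-path smoothing sketch: minimize sum|P_t - C_t|^2
# + lam * sum|P_t - P_{t-1}|^2 by Jacobi iterations. The fixed `lam`
# replaces the paper's adaptive, per-frame smoothness weights.
import numpy as np

def smooth_path(C, lam=50.0, iters=200):
    """C: Tx2 array of per-frame camera path positions."""
    P = C.copy().astype(np.float64)
    for _ in range(iters):
        neighbors = np.zeros_like(P)
        counts = np.zeros(len(P))
        neighbors[:-1] += P[1:];  counts[:-1] += 1
        neighbors[1:]  += P[:-1]; counts[1:]  += 1
        # Closed-form per-frame update from setting the gradient to zero.
        P = (C + lam * neighbors) / (1 + lam * counts)[:, None]
    return P
```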

315 citations

Journal Article
TL;DR: In this article, a generalized equation is proposed to represent a continuum of surface reconstruction solutions of a given non-integrable gradient field, where the range of solutions is related to the degree of anisotropy in applying weights to the gradients in the integration process.
Abstract: We propose a generalized equation to represent a continuum of surface reconstruction solutions of a given non-integrable gradient field. We show that common approaches such as the Poisson solver and the Frankot-Chellappa algorithm are special cases of this generalized equation. For an N × N pixel grid, the subspace of all integrable gradient fields is of dimension N² − 1. Our framework can be applied to derive a range of meaningful surface reconstructions from this high-dimensional space. The key observation is that the range of solutions is related to the degree of anisotropy in applying weights to the gradients in the integration process. While common approaches use isotropic weights, we show that by using a progression of spatially varying anisotropic weights, we can achieve significant improvement in reconstructions. We propose (a) α-surfaces using binary weights, where the parameter α allows a trade-off between smoothness and robustness, (b) M-estimators and edge-preserving regularization using continuous weights, and (c) diffusion using affine transformation of gradients. We provide results on photometric stereo, compare with previous approaches, and show that anisotropic treatment discounts noise while recovering salient features in reconstructions.
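
As a concrete reference point for the isotropic special case, the sketch below implements Frankot-Chellappa integration: the (possibly non-integrable) gradient field is projected onto the integrable subspace in the Fourier domain and inverted. The anisotropic variants in the paper replace this uniform treatment of the gradients.

```python
# Frankot-Chellappa integration of a gradient field (the isotropic special
# case named in the abstract); assumes periodic boundaries via the FFT.
import numpy as np

def frankot_chellappa(p, q):
    """p, q: HxW estimates of dz/dx and dz/dy. Returns the surface z."""
    h, w = p.shape
    wx = 2 * np.pi * np.fft.fftfreq(w)[None, :]   # angular frequency in x
    wy = 2 * np.pi * np.fft.fftfreq(h)[:, None]   # angular frequency in y
    P, Q = np.fft.fft2(p), np.fft.fft2(q)
    denom = wx ** 2 + wy ** 2
    denom[0, 0] = 1.0                 # avoid divide-by-zero at the DC term
    Z = (-1j * wx * P - 1j * wy * Q) / denom
    Z[0, 0] = 0.0                     # height is recovered up to a constant
    return np.real(np.fft.ifft2(Z))
```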

313 citations

Proceedings ArticleDOI
10 Jul 2014
TL;DR: This paper proposes a saliency detection method that uses additional depth information, with saliency cues designed to follow the behavior of visually salient stimuli in both color and depth spaces.
Abstract: The human vision system understands the environment through 3D perception. However, most existing saliency detection algorithms detect the salient foreground based on 2D image information. In this paper, we propose a saliency detection method using additional depth information. In our method, saliency cues are provided to follow the laws of visually salient stimuli in both color and depth spaces. Simultaneously, the 'center bias' is extended to a 'spatial bias' to represent the natural advantage in a 3D image. In addition, we build a dataset to test our method, and the experiments demonstrate that the depth information is useful for extracting salient objects from complex scenes.
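
A minimal sketch of combining the cues is shown below: local color contrast and depth contrast are blended with a center bias and a nearness prior. The specific contrast measures and the multiplicative blend are assumptions standing in for the paper's cues.

```python
# Illustrative RGB-D saliency: local contrast in color and depth, weighted
# by a center bias and a "closer is more salient" prior.
import numpy as np
from scipy.ndimage import uniform_filter

def rgbd_saliency(img, depth, win=31):
    """img: HxWx3 in [0, 1]; depth: HxW normalized to [0, 1]."""
    # Local contrast: deviation of each pixel from its neighborhood mean.
    color_c = np.abs(img - uniform_filter(img, size=(win, win, 1))).mean(axis=2)
    depth_c = np.abs(depth - uniform_filter(depth, size=win))
    h, w = depth.shape
    yy, xx = np.mgrid[0:h, 0:w]
    center = np.exp(-(((xx - w / 2) / (0.4 * w)) ** 2 +
                      ((yy - h / 2) / (0.4 * h)) ** 2))
    s = (color_c + depth_c) * center * (1.0 - depth)
    return s / (s.max() + 1e-8)
```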

293 citations

Journal ArticleDOI
TL;DR: Experimental results indicate that this method outperforms several state-of-the-art object/scene recognition models, and the visualized graphlets show that discriminative patterns are discovered by the proposed approach.
Abstract: Recognizing aerial image categories is useful for scene annotation and surveillance. Local features have been demonstrated to be robust to image transformations, including occlusions and clutter. However, the geometric property of an aerial image (i.e., the topology and relative displacement of local features), which is key to discriminating aerial image categories, cannot be effectively represented by state-of-the-art generic visual descriptors. To solve this problem, we propose a recognition model that mines graphlets from aerial images, where graphlets are small connected subgraphs reflecting both the geometric property and the color/texture distribution of an aerial image. More specifically, each aerial image is decomposed into a set of basic components (e.g., road and playground) and a region adjacency graph (RAG) is accordingly constructed to model their spatial interactions. Aerial image category recognition can subsequently be cast as RAG-to-RAG matching. Based on graph theory, RAG-to-RAG matching is conducted by comparing all their respective graphlets. Because the number of graphlets is huge, we derive a manifold embedding algorithm to measure different-sized graphlets, after which we select graphlets that have highly discriminative and low-redundancy topologies. By quantizing the selected graphlets from each aerial image into a feature vector, we use a support vector machine to discriminate aerial image categories. Experimental results indicate that our method outperforms several state-of-the-art object/scene recognition models, and the visualized graphlets indicate that discriminative patterns are discovered by our proposed approach.
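
Once each image is summarized by a fixed-length graphlet feature vector, the final stage reduces to a kernel SVM. The sketch below assumes such vectors are already computed and trains scikit-learn's SVC on a precomputed RBF kernel; the kernel choice is illustrative, not the paper's learned kernel.

```python
# Final classification stage: precomputed kernel + SVM. The graphlet
# feature vectors (X_train, X_test) are assumed to be given.
import numpy as np
from sklearn.svm import SVC

def classify_with_graphlet_kernel(X_train, y_train, X_test, gamma=1.0):
    """X_*: (N, D) arrays of per-image graphlet feature vectors."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    clf = SVC(kernel="precomputed")
    clf.fit(rbf(X_train, X_train), y_train)      # Gram matrix: train x train
    return clf.predict(rbf(X_test, X_train))     # test x train kernel
```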

207 citations