
Showing papers on "Real image published in 2013"


Patent
18 Dec 2013
TL;DR: An augmented reality image display system may be implemented together with a surgical robot system, as discussed by the authors, comprising a slave system that performs a surgical operation, a master system that controls the surgical operation of the slave system, and an imaging system that generates a virtual image of the inside of a patient's body.
Abstract: An augmented reality image display system may be implemented together with a surgical robot system. The surgical robot system may include a slave system performing a surgical operation, a master system controlling the surgical operation of the slave system, an imaging system generating a virtual image of the inside of a patient's body, and an augmented reality image display system including a camera capturing a real image having a plurality of markers attached to the patient's body or a human body model. The augmented reality image system may include an augmented reality image generator which detects the plurality of markers in the real image, estimates the position and gaze direction of the camera using the detected markers, and generates an augmented reality image by overlaying a region of the virtual image over the real image, and a display which displays the augmented reality image.

142 citations


Journal ArticleDOI
TL;DR: This work proposes a new method for automatic radial distortion estimation based on the plumb-line approach that works from a single image and does not require a special calibration pattern, and performs an extensive empirical study of the method on synthetic images.
Abstract: Many computer vision algorithms rely on the assumptions of the pinhole camera model, but lens distortion with off-the-shelf cameras is usually significant enough to violate this assumption. Many methods for radial distortion estimation have been proposed, but they all have limitations. Robust automatic radial distortion estimation from a single natural image would be extremely useful for many applications, particularly those in human-made environments containing abundant lines. For example, it could be used in place of an extensive calibration procedure to get a mobile robot or quadrotor experiment up and running quickly in an indoor environment. We propose a new method for automatic radial distortion estimation based on the plumb-line approach. The method works from a single image and does not require a special calibration pattern. It is based on Fitzgibbon's division model, robust estimation of circular arcs, and robust estimation of distortion parameters. We perform an extensive empirical study of the method on synthetic images. We include a comparative statistical analysis of how different circle fitting methods contribute to accurate distortion parameter estimation. We finally provide qualitative results on a wide variety of challenging real images. The experiments demonstrate the method's ability to accurately identify distortion parameters and remove distortion from images.
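Fitzgibbon's division model, on which the method rests, can be illustrated in a few lines. The sketch below undistorts pixel coordinates with a single radial parameter; the function name, distortion value, and image center are illustrative assumptions, not values from the paper.

```python
import numpy as np

def undistort_division(points, lam, center):
    """Invert radial distortion under the division model: a distorted
    point x_d maps to x_u = c + (x_d - c) / (1 + lam * r^2), with
    r = |x_d - c|. `lam` is the single parameter the plumb-line method
    estimates; the value used below is made up for illustration."""
    pts = np.asarray(points, dtype=float) - center
    r2 = np.sum(pts**2, axis=1, keepdims=True)
    return center + pts / (1.0 + lam * r2)

# A point at the distortion center is unchanged; off-axis points move
# outward for barrel distortion (lam < 0).
center = np.array([320.0, 240.0])
moved = undistort_division([[320.0, 240.0], [420.0, 240.0]], -1e-6, center)
```

With one parameter, straightening the detected circular arcs into lines fixes `lam`; that is the essence of the plumb-line estimation described above.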

132 citations


Journal ArticleDOI
TL;DR: Experimental results on real images demonstrate that the proposed novel filter for edge-preserving decomposition of an image is especially effective at preserving or enhancing local details.
Abstract: A novel filter is proposed for edge-preserving decomposition of an image. It is different from previous filters in its locally adaptive property. The filtered image contains local means everywhere and preserves local salient edges. Comparisons are made between our filtered result and the results of three other methods. A detailed analysis is also made on the behavior of the filter. A multiscale decomposition with this filter is proposed for manipulating a high dynamic range image, which has three detail layers and one base layer. The multiscale decomposition with the filter addresses three assumptions: 1) the base layer preserves local means everywhere; 2) every scale's salient edges are relatively large gradients in a local window; and 3) all of the nonzero gradient information belongs to the detail layer. An effective function is also proposed for compressing the detail layers. The reproduced image gives a good visualization. Experimental results on real images demonstrate that our algorithm is especially effective at preserving or enhancing local details.
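The base/detail split behind the multiscale decomposition can be sketched generically. A plain box-filter local mean below stands in for the paper's edge-preserving, locally adaptive filter (which it is not); the point is only that the base layer carries local means, the detail layer is the residual, and reconstruction is exact by construction.

```python
import numpy as np

def local_mean(img, k=3):
    """Box-filter local mean over k x k windows with edge padding.
    (Illustrative stand-in for the paper's edge-preserving filter.)"""
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

# One decomposition level: base holds local means, detail the residual.
# The paper stacks three such detail layers plus a base layer for HDR
# tone manipulation.
img = np.arange(16, dtype=float).reshape(4, 4)
base = local_mean(img)
detail = img - base
recon = base + detail  # exact reconstruction
```

The paper's contribution is precisely that its filter, unlike the box mean here, keeps salient edges in the base layer so the detail layers can be compressed without halos.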

129 citations


Proceedings ArticleDOI
23 Jun 2013
TL;DR: This paper addresses the problem of restoring images subjected to unknown and spatially varying blur caused by defocus or linear motion using a robust (non-uniform) deblurring algorithm based on sparse regularization with global image statistics.
Abstract: This paper addresses the problem of restoring images subjected to unknown and spatially varying blur caused by defocus or linear (say, horizontal) motion. The estimation of the global (non-uniform) image blur is cast as a multi-label energy minimization problem. The energy is the sum of unary terms corresponding to learned local blur estimators, and binary ones corresponding to blur smoothness. Its global minimum is found using Ishikawa's method by exploiting the natural order of discretized blur values for linear motions and defocus. Once the blur has been estimated, the image is restored using a robust (non-uniform) deblurring algorithm based on sparse regularization with global image statistics. The proposed algorithm outputs both a segmentation of the image into uniform-blur layers and an estimate of the corresponding sharp image. We present qualitative results on real images, and use synthetic data to quantitatively compare our approach to the publicly available implementation of Chakrabarti et al.

104 citations


Journal ArticleDOI
TL;DR: A fractional-order total variation (TV) regularization functional for image super-resolution is presented, which better handles the texture details of an image and efficiently preserves discontinuities and image structures.

104 citations


Journal ArticleDOI
Lianru Gao, Qian Du, Bing Zhang, Wei Yang, Yuanfeng Wu
TL;DR: A comparative study of linear regression-based noise estimation algorithms for hyperspectral images, using simulated images with different signal-to-noise ratios (SNRs) and real images with different land cover types; instructive guidance for their practical applications is drawn from the results.
Abstract: In the traditional signal model, signal is assumed to be deterministic, and noise is assumed to be random, additive and uncorrelated to the signal component. A hyperspectral image has high spatial and spectral correlation, and a pixel can be well predicted using its spatial and/or spectral neighbors; any prediction error can be considered from noise. Using this concept, several algorithms have been developed for noise estimation for hyperspectral images. However, these algorithms have not been rigorously analyzed with a unified scheme. In this paper, we conduct a comparative study for such linear regression-based algorithms using simulated images with different signal-to-noise ratio (SNR) and real images with different land cover types. Based on experimental results, instructive guidance is concluded for their practical applications.
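The prediction idea described above is easy to demonstrate: regress each band on its spectral neighbors and read the noise level off the residual. The sketch below is a simplified version of that concept, not any of the authors' exact estimators; the synthetic cube and noise level are assumptions for illustration.

```python
import numpy as np

def residual_noise_std(cube):
    """Estimate per-band noise in an (h, w, bands) cube by predicting
    each band from its two spectral neighbors via least squares; the
    residual standard deviation approximates the noise level. A
    simplified sketch of the regression-based concept in the abstract."""
    h, w, b = cube.shape
    X2d = cube.reshape(-1, b)
    stds = []
    for k in range(1, b - 1):
        A = np.column_stack([X2d[:, k - 1], X2d[:, k + 1], np.ones(h * w)])
        coef, *_ = np.linalg.lstsq(A, X2d[:, k], rcond=None)
        resid = X2d[:, k] - A @ coef
        stds.append(resid.std())
    return np.array(stds)

# Synthetic cube: a spectrally smooth signal plus Gaussian noise of
# known standard deviation 0.05; the residual-based estimates land near
# that level (inflated slightly because the neighbor bands are noisy too).
rng = np.random.default_rng(0)
signal = np.linspace(0, 1, 8)[None, None, :] * np.ones((16, 16, 1))
noisy = signal + rng.normal(0, 0.05, size=(16, 16, 8))
est = residual_noise_std(noisy)
```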

95 citations


Journal ArticleDOI
Xinxing Xia1, Xu Liu1, Haifeng Li1, Zhenrong Zheng1, Han Wang1, Yifan Peng1, Shen Weidong1 
TL;DR: Using light field reconstruction technique, a floating 3D scene in the air is displayed, which is 360-degree surrounding viewable with correct occlusion effect, and the experimental results verified the representability of this method.
Abstract: Using light field reconstruction technique, we can display a floating 3D scene in the air, which is 360-degree surrounding viewable with correct occlusion effect. A high-frame-rate color projector and flat light field scanning screen are used in the system to create the light field of real 3D scene in the air above the spinning screen. The principle and display performance of this approach are investigated in this paper. The image synthesis method for all the surrounding viewpoints is analyzed, and the 3D spatial resolution and angular resolution of the common display zone are employed to evaluate display performance. The prototype is achieved and the real 3D color animation image has been presented vividly. The experimental results verified the representability of this method.

88 citations


Journal ArticleDOI
TL;DR: It is proved that such nonlinearity can cause large errors around edges when directly applying deconvolution to a motion blurred image without CRF correction, and introduced two methods to estimate the CRF directly from one or more blurred images when the PSF is known or unknown.
Abstract: This paper investigates the role that nonlinear camera response functions (CRFs) have on image deblurring. We present a comprehensive study to analyze the effects of CRFs on motion deblurring. In particular, we show how nonlinear CRFs can cause a spatially invariant blur to behave as a spatially varying blur. We prove that such nonlinearity can cause large errors around edges when directly applying deconvolution to a motion blurred image without CRF correction. These errors are inevitable even with a known point spread function (PSF) and with state-of-the-art regularization-based deconvolution algorithms. In addition, we show how CRFs can adversely affect PSF estimation algorithms in the case of blind deconvolution. To help counter these effects, we introduce two methods to estimate the CRF directly from one or more blurred images when the PSF is known or unknown. Our experimental results on synthetic and real images validate our analysis and demonstrate the robustness and accuracy of our approaches.
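The core observation, that a nonlinear CRF does not commute with convolution, can be demonstrated in one dimension. The gamma curve, kernel size, and step signal below are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def crf(x, gamma=2.2):
    """A gamma curve as a stand-in nonlinear camera response function."""
    return np.clip(x, 0, 1) ** (1.0 / gamma)

def box_blur_1d(x, k=5):
    """Spatially invariant 1-D motion blur (uniform kernel)."""
    kernel = np.ones(k) / k
    return np.convolve(x, kernel, mode='same')

# A step edge: blurring the irradiance and then applying the CRF (what
# the camera records) differs from applying the CRF and then blurring
# (the model naive deconvolution assumes). The gap is concentrated at
# the edge, which is why deconvolution without CRF correction leaves
# edge artifacts.
edge = np.zeros(64)
edge[32:] = 1.0
observed = crf(box_blur_1d(edge))   # blur happens in irradiance space
naive = box_blur_1d(crf(edge))      # blur assumed in intensity space
mismatch = np.abs(observed - naive).max()
```

Away from the edge the two pipelines agree exactly, which matches the paper's point that the induced blur becomes spatially varying even though the physical blur is uniform.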

77 citations


Proceedings ArticleDOI
01 Dec 2013
TL;DR: This work uses linear programming (LP) to identify a minimal set of least-violated connectivity constraints that are sufficient to unambiguously reconstruct the 3D lines.
Abstract: We propose a novel and efficient method for reconstructing the 3D arrangement of lines extracted from a single image, using vanishing points, orthogonal structure, and an optimization procedure that considers all plausible connectivity constraints between lines. Line detection identifies a large number of salient lines that intersect or nearly intersect in an image, but relatively few of these apparent junctions correspond to real intersections in the 3D scene. We use linear programming (LP) to identify a minimal set of least-violated connectivity constraints that are sufficient to unambiguously reconstruct the 3D lines. In contrast to prior solutions that primarily focused on well-behaved synthetic line drawings with severely restricting assumptions, we develop an algorithm that can work on real images. The algorithm produces line reconstructions by identifying 95% of the correct connectivity constraints in the York Urban database, with a total computation time of 1 second per image.

77 citations


Journal ArticleDOI
TL;DR: A new supervised learning-based template matching approach for segmenting cell nuclei from microscopy images presents increased robustness: it better handles variations in illumination and in texture across imaging modalities, provides smoother and more accurate segmentation borders, and copes better with cluttered nuclei.
Abstract: We describe a new supervised learning-based template matching approach for segmenting cell nuclei from microscopy images. The method uses examples selected by a user for building a statistical model that captures the texture and shape variations of the nuclear structures from a given dataset to be segmented. Segmentation of subsequent, unlabeled, images is then performed by finding the model instance that best matches (in the normalized cross correlation sense) local neighborhood in the input image. We demonstrate the application of our method to segmenting nuclei from a variety of imaging modalities, and quantitatively compare our results to several other methods. Quantitative results using both simulated and real image data show that, while certain methods may work well for certain imaging modalities, our software is able to obtain high accuracy across several imaging modalities studied. Results also demonstrate that, relative to several existing methods, the template-based method we propose presents increased robustness: it better handles variations in illumination and in texture from different imaging modalities, provides smoother and more accurate segmentation borders, and copes better with cluttered nuclei. © 2013 International Society for Advancement of Cytometry

74 citations


Journal ArticleDOI
TL;DR: In this paper, the radial orientation of identical plasmonic dipoles is controlled by an ultrathin flat lens such that a beam can be focused into a 3D spot in either a real or a virtual focal plane, which can be reversed via manipulation of the circular polarization state of the incident light.
Abstract: Metasurfaces with interfacial phase discontinuities provide a unique platform for manipulating light propagation both in free space and along a surface. Three-dimensional focusing of visible light is experimentally exhibited as a bi-functional phenomenon by controlling the radial orientation of identical plasmonic dipoles, generating a desired phase profile along the interface. With this technique, the in-plane and out-of-plane refractions are manipulated by an ultrathin flat lens such that a beam can be focused into a 3D spot in either a real or a virtual focal plane, which can be reversed via manipulation of the circular polarization state of the incident light. Both the inverted real image and the upright virtual image of an arbitrary object are experimentally demonstrated using the same flat lens in the visible range, which paves the way towards robust application of phase-discontinuity devices.

Journal ArticleDOI
TL;DR: In this paper, the spectral information from subpixel shifted remote sensing images (SSRSI) is incorporated into the likelihood energy function of MRF to provide multiple spectral constraints, which can generate the most accurate SPM results among these methods.
Abstract: Subpixel mapping (SPM) is a promising technique to increase the spatial resolution of land cover maps. Markov random field (MRF)-based SPM has the advantages of considering spatial and spectral constraints simultaneously. In the conventional MRF, only the spectral information of one observed coarse spatial resolution image is utilized, which limits the SPM accuracy. In this letter, supplementary information from subpixel shifted remote sensing images (SSRSI) is used with MRF to produce more accurate SPM results. That is, spectral information from SSRSI is incorporated into the likelihood energy function of MRF to provide multiple spectral constraints. Simulated and real images were tested with the subpixel/pixel spatial attraction model, Hopfield neural networks (HNNs), HNN with SSRSI, image interpolation then hard classification, conventional MRF, and proposed MRF with SSRSI based SPM methods. Results showed that the proposed method can generate the most accurate SPM results among these methods.

Patent
27 Nov 2013
TL;DR: A user activation of an image capture function of a mobile device is received, two or more camera lenses are approximately concurrently activated, and a front-side image from a first camera lens and a rear-side image from a second camera lens are optically captured.
Abstract: A user activation of an image capture function of a mobile device is received. The image capture function is for a surround image mode. Two or more camera lenses are approximately concurrently activated. Responsive to activating the lenses, a front-side image from a first camera lens and a rear-side image from a second camera lens are optically captured. Content from the front-side image and content from the rear-side image are recorded in a non-transitory storage medium of the mobile device within a single file for a surround mode image.

Proceedings ArticleDOI
01 Dec 2013
TL;DR: This paper studies the geometry problems of estimating camera pose with unknown focal length using combination of geometric primitives, and develops efficient polynomial solvers for each of the derived cases with different combinations of primitives.
Abstract: In this paper, we study the geometry problems of estimating camera pose with unknown focal length using combination of geometric primitives. We consider points, lines and also rich features such as quivers, i.e. points with one or more directions. We formulate the problems as polynomial systems where the constraints for different primitives are handled in a unified way. We develop efficient polynomial solvers for each of the derived cases with different combinations of primitives. The availability of these solvers enables robust pose estimation with unknown focal length for wider classes of features. Such rich features allow for fewer feature correspondences and generate larger inlier sets with higher probability. We demonstrate in synthetic experiments that our solvers are fast and numerically stable. For real images, we show that our solvers can be used in RANSAC loops to provide good initial solutions.

Journal ArticleDOI
TL;DR: This paper presents an approach to extract curvilinear structures (lines) and their widths from two-dimensional images with high accuracy and shows that very accurate results can be achieved on real data if the appropriate line model is used.

Proceedings ArticleDOI
01 Dec 2013
TL;DR: This work presents a framework to super-resolve planar regions found in urban scenes and other man-made environments by taking into account their 3D geometry by using recently developed tools based on convex optimization to learn a transform that maps the image to a domain where its gradient has a simple group-sparse structure.
Abstract: We present a framework to super-resolve planar regions found in urban scenes and other man-made environments by taking into account their 3D geometry. Such regions have highly structured straight edges, but this prior is challenging to exploit due to deformations induced by the projection onto the imaging plane. Our method factors out such deformations by using recently developed tools based on convex optimization to learn a transform that maps the image to a domain where its gradient has a simple group-sparse structure. This allows us to obtain a novel convex regularizer that enforces global consistency constraints between the edges of the image. Computational experiments with real images show that this data-driven approach to the design of regularizers promoting transform-invariant group sparsity is very effective at high super-resolution factors. We view our approach as complementary to most recent super-resolution methods, which tend to focus on hallucinating high-frequency textures.

Journal ArticleDOI
TL;DR: A method to estimate camera spectral sensitivities and white balance setting jointly from images with sky regions is introduced, which uses sky images without additional hardware, assuming the geolocation of the captured sky is known.
Abstract: Photometric camera calibration is often required in physics-based computer vision. There have been a number of studies to estimate camera response functions (gamma functions) and vignetting effects from images. However, less attention has been paid to camera spectral sensitivities and white balance settings. This is unfortunate, since those two properties significantly affect image colors. Motivated by this, a method to estimate camera spectral sensitivities and white balance setting jointly from images with sky regions is introduced. The basic idea is to use the sky regions to infer the sky spectra. Given sky images as the input and assuming the sun direction with respect to the camera viewing direction can be extracted, the proposed method estimates the turbidity of the sky by fitting the image intensities to a sky model. Subsequently, it calculates the sky spectra from the estimated turbidity. Having the sky RGB values and their corresponding spectra, the method estimates the camera spectral sensitivities together with the white balance setting. Precomputed basis functions of camera spectral sensitivities are used in the method for robust estimation. The whole method is novel and practical since, unlike existing methods, it uses sky images without additional hardware, assuming the geolocation of the captured sky is known. Experimental results using various real images show the effectiveness of the method.

Proceedings ArticleDOI
23 Jun 2013
TL;DR: A simple algorithm of auto-calibration from separable homogeneous specular reflection of real images is developed, which takes a holistic approach to exploiting reflectance symmetry and produces superior results.
Abstract: Under unknown directional lighting, the uncalibrated Lambertian photometric stereo algorithm recovers the shape of a smooth surface up to the generalized bas-relief (GBR) ambiguity. We resolve this ambiguity from the half vector symmetry, which is observed in many isotropic materials. Under this symmetry, a 2D BRDF slice with low-rank structure can be obtained from an image, if the surface normals and light directions are correctly recovered. In general, this structure is destroyed by the GBR ambiguity. As a result, we can resolve the ambiguity by restoring this structure. We develop a simple algorithm of auto-calibration from separable homogeneous specular reflection of real images. Compared with previous methods, this method takes a holistic approach to exploiting reflectance symmetry and produces superior results.

Proceedings ArticleDOI
23 Dec 2013
TL;DR: This study addresses the problem of geometric consistency between displayed images and real scenes in augmented reality using a video see-through hand-held display or tablet using approximated user-perspective images rendered by homography transformation of camera images.
Abstract: This study addresses the problem of geometric consistency between displayed images and real scenes in augmented reality using a video see-through hand-held display or tablet. To solve this problem, we present approximated user-perspective images rendered by homography transformation of camera images. Homography approximation has major advantages not only in terms of computational costs, but also in the quality of image rendering. However, it can lead to an inconsistency between the real image and virtual objects. This study also introduces a variety of rendering methods for virtual objects and discusses the differences between them. We implemented two prototypes and designed three types of user studies on matching tasks between real scenes and displayed images. We have confirmed that the proposed method works in real time on an off-the-shelf tablet. Our pilot tests show the potential to improve users' visibility, even in real environments, by using our method.
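The homography approximation at the heart of this rendering step amounts to mapping camera pixels through a single 3x3 matrix. The sketch below applies such a map to points; the matrix is a made-up example, not a calibrated device-to-eye transform.

```python
import numpy as np

def apply_homography(H, pts):
    """Map 2-D pixel coordinates through a 3x3 homography in homogeneous
    coordinates: x' ~ H x. This is the transformation class used to
    approximate a user-perspective view from the tablet's camera image."""
    pts = np.asarray(pts, dtype=float)
    ones = np.ones((pts.shape[0], 1))
    homog = np.hstack([pts, ones]) @ H.T
    return homog[:, :2] / homog[:, 2:3]  # divide out the projective scale

# A pure translation expressed as a homography shifts every pixel
# equally; a general H (perspective row not [0, 0, 1]) warps the image
# non-uniformly, which is what models the change of viewpoint.
H = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, -5.0],
              [0.0, 0.0, 1.0]])
warped = apply_homography(H, [[0.0, 0.0], [100.0, 50.0]])
```

Warping the whole camera frame with one such matrix is cheap, which is the computational advantage the study exploits; the residual inconsistency with virtual objects is the trade-off it analyzes.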

Journal ArticleDOI
TL;DR: A new method for determining the intrinsic dimension of a hyperspectral image using recent advances in random matrix theory is discussed; it is entirely unsupervised, free from any user-determined parameters, and allows spectrally correlated noise in the data.
Abstract: Determining the intrinsic dimension of a hyperspectral image is an important step in the spectral unmixing process and under- or overestimation of this number may lead to incorrect unmixing in unsupervised methods. In this paper, we discuss a new method for determining the intrinsic dimension using recent advances in random matrix theory. This method is entirely unsupervised, free from any user-determined parameters and allows spectrally correlated noise in the data. Robustness tests are run on synthetic data, to determine how the results were affected by noise levels, noise variability, noise approximation, and spectral characteristics of the end-members. Success rates are determined for many different synthetic images, and the method is tested on two pairs of real images, namely a Cuprite scene taken from Airborne Visible InfraRed Imaging Spectrometer (AVIRIS) and SpecTIR sensors, and a Lunar Lakes scene taken from AVIRIS and Hyperion, with good results.
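A common random-matrix criterion in the spirit of this abstract counts how many sample-covariance eigenvalues exceed the Marchenko-Pastur upper edge for pure noise. The sketch below is that generic criterion, not the authors' exact algorithm; the known noise variance, the 5% safety margin, and the synthetic three-endmember mixture are all assumptions.

```python
import numpy as np

def rmt_dimension(X, sigma2):
    """Count eigenvalues of the sample covariance of X (n samples, p
    bands) that exceed the Marchenko-Pastur upper edge
    sigma2 * (1 + sqrt(p/n))^2 for i.i.d. noise of known variance
    sigma2. A 5% margin guards against finite-sample fluctuation of the
    top noise eigenvalue. Generic RMT criterion, not the paper's method."""
    n, p = X.shape
    eig = np.linalg.eigvalsh(np.cov(X, rowvar=False))
    edge = 1.05 * sigma2 * (1 + np.sqrt(p / n)) ** 2
    return int(np.sum(eig > edge))

# Synthetic "hyperspectral" data: 3 random endmembers mixed with random
# abundances plus white noise; the criterion recovers dimension 3.
rng = np.random.default_rng(1)
n, p, k = 2000, 20, 3
endmembers = rng.normal(size=(k, p))
abundances = rng.uniform(size=(n, k))
X = abundances @ endmembers + rng.normal(0, 0.1, size=(n, p))
d = rmt_dimension(X, sigma2=0.01)
```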

Journal ArticleDOI
TL;DR: A novel fuzzy energy minimization method for simultaneous segmentation and bias field estimation of medical images, which reduces the impact of noise by incorporating spatial information into the membership function via a spatial function defined as the summation of the membership functions in the neighborhood of each pixel under consideration.
Abstract: This paper presents a novel fuzzy energy minimization method for simultaneous segmentation and bias field estimation of medical images. We first define an objective function based on a localized fuzzy c-means (FCM) clustering for the image intensities in a neighborhood around each point. Then, this objective function is integrated with respect to the neighborhood center over the entire image domain to formulate a global fuzzy energy, which depends on membership functions, a bias field that accounts for the intensity inhomogeneity, and the constants that approximate the true intensities of the corresponding tissues. Therefore, segmentation and bias field estimation are simultaneously achieved by minimizing the global fuzzy energy. Besides, to reduce the impact of noise, the proposed algorithm incorporates spatial information into the membership function using the spatial function which is the summation of the membership functions in the neighborhood of each pixel under consideration. Experimental results on synthetic and real images are given to demonstrate the desirable performance of the proposed algorithm.
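The classical FCM core that this method localizes and augments can be sketched in one update. The code below is the standard membership formula for 1-D intensities only; the bias-field and spatial-neighborhood terms that distinguish the paper are deliberately omitted, and the sample intensities and centers are made up.

```python
import numpy as np

def fcm_memberships(x, centers, m=2.0):
    """Standard fuzzy c-means membership update for 1-D intensities:
    u_ik = 1 / sum_j (|x_i - c_k| / |x_i - c_j|)^(2/(m-1)).
    This is the plain FCM building block; the paper adds a bias field
    and a spatial membership summation on top of it."""
    d = np.abs(x[:, None] - centers[None, :]) + 1e-12  # avoid /0 at centers
    power = 2.0 / (m - 1.0)
    # ratio[i, k, j] = d_ik / d_ij; summing over j gives the denominator.
    u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** power, axis=2)
    return u

x = np.array([0.1, 0.2, 0.8, 0.9])
centers = np.array([0.15, 0.85])
u = fcm_memberships(x, centers)  # rows sum to 1; dark pixels favor c=0.15
```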

Patent
22 Nov 2013
TL;DR: A video display method with which visible-light-communication signals can be appropriately transmitted includes: a step (SL21) in which a striped-pattern image is generated as a visible-light-communication image; a step (SL22) in which the average brightness of the visible-light-communication image is calculated; and a step in which the image is superimposed on only a divided image displayed in a specific subframe from among at least one subframe in which the divided images making up a video-signal image are respectively displayed.
Abstract: A video display method with which visible-light-communication signals can be appropriately transmitted includes: a step (SL21) in which a striped pattern image is generated as a visible-light-communication image; a step (SL22) in which the average brightness of the visible-light-communication image is calculated; a step (SL23) in which a visible-light superimposed image is generated by superimposing the visible-light-communication image on only a divided image displayed in a specific subframe from among at least one subframe in which divided images configuring a video signal image displayed in a frame are respectively displayed, said specific subframe being for expressing the calculated average brightness; and a step (SL24) in which the visible-light superimposed image is displayed in the specific subframe.

Patent
07 May 2013
TL;DR: In this paper, a computer-based method is proposed to compare a 3D model of a planned layout of the environment to the received image data, and determine whether results of the comparison reach a first particular threshold, and if so, then output a pertinent indication, alert, or the like.
Abstract: In one embodiment, a computer-based method includes receiving image data of a real world layout. The image data reflects the real world layout across three dimensions (e.g., vertical, horizontal, and orthogonal). Each of the three dimensions has an image range with a beginning and an end. The real world layout has inventory of products distributed across the three dimensions. The computer-based method further includes comparing a 3D model of a planned layout of the environment to the received image data. The 3D model represents the planned layout across the three dimensions. The computer-based method further includes determining whether results of the comparison reach a first particular threshold, and if so, then output a pertinent indication, alert, or the like.

Patent
28 Jun 2013
TL;DR: The authors describe systems and methods for detecting defective camera arrays, optic arrays, and/or sensors: a camera array is flagged as defective when the number of localized defects in a specific set of image regions exceeds a predetermined threshold, where the specific set is formed by a common corresponding image region from at least a subset of the captured images.
Abstract: Systems and methods for detecting defective camera arrays, optic arrays and/or sensors are described. One embodiment includes capturing image data using a camera array; dividing the captured images into a plurality of corresponding image regions; identifying the presence of localized defects in any of the cameras by evaluating the image regions in the captured images; and detecting a defective camera array using the image processing system when the number of localized defects in a specific set of image regions exceeds a predetermined threshold, where the specific set of image regions is formed by: a common corresponding image region from at least a subset of the captured images; and any additional image region in a given image that contains at least one pixel located within a predetermined maximum parallax shift distance along an epipolar line from a pixel within said common corresponding image region within the given image.
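The region-counting claim can be reduced to a few lines. The sketch below is a simplified reading: it counts localized defects per region position across cameras and compares against a threshold, omitting the epipolar parallax-shift grouping of the full claim; the array shape and defect placement are made-up examples.

```python
import numpy as np

def flag_defective(defect_map, threshold):
    """Given a boolean array of shape (num_cameras, rows, cols) marking
    localized defects per image region, flag the camera array as
    defective when any region position accumulates more defects across
    cameras than `threshold`. Simplified: the claim's extra regions
    within the parallax shift distance along epipolar lines are ignored."""
    per_region = defect_map.sum(axis=0)  # defect count per region position
    return bool((per_region > threshold).any())

# Four cameras, each image divided into a 3x3 grid of regions; three
# cameras show a defect in the same (0, 0) region, exceeding threshold 2.
defects = np.zeros((4, 3, 3), dtype=bool)
defects[0, 0, 0] = defects[1, 0, 0] = defects[2, 0, 0] = True
result = flag_defective(defects, threshold=2)
```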

Journal ArticleDOI
01 Jan 2013
TL;DR: A regularization approach based on the second-order derivative is proposed for an inverse reconstruction problem in Magnetic Resonance Imaging with few acquired body-scanner samples, obtaining good reconstruction accuracy on both simulated and real images with highly undersampled data.
Abstract: In this paper we investigate an inverse reconstruction problem of Magnetic Resonance Imaging with few acquired body scanner samples. The missing information in the Fourier domain causes image artefacts, therefore iterative computationally expensive recovery techniques are needed. We propose a regularization approach based on second order derivative of both simulated and real images with highly undersampled data, obtaining a good reconstruction accuracy. Moreover, an accelerated regularization algorithm, by using a projection technique combined with an implementation on Graphics Processing Unit (GPU) computing environment, is presented. The numerical experiments give clinically-feasible reconstruction runtimes with an increase in speed and accuracy of the MRI dataset reconstructions.
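The shape of the regularized problem can be shown with a toy real-space analogue: recover a signal from a subset of its samples by penalizing the second-order derivative. This is only an illustration of the penalty; the paper's actual problem subsamples Fourier coefficients and uses an accelerated, GPU-based projection algorithm, both omitted here. The mask, signal, and regularization weight are assumptions.

```python
import numpy as np

def second_order_recover(y, mask, lam):
    """Recover a 1-D signal from subsampled measurements y = x[mask] by
    minimizing ||S x - y||^2 + lam * ||D2 x||^2, where S selects the
    observed entries and D2 is the second-order difference operator.
    Solved exactly via the normal equations (toy analogue of the MRI
    reconstruction problem described in the abstract)."""
    n = len(mask)
    S = np.eye(n)[mask]                       # sampling operator
    D2 = (np.diag(np.full(n, -2.0))           # second difference matrix
          + np.diag(np.ones(n - 1), 1)
          + np.diag(np.ones(n - 1), -1))[1:-1]
    A = S.T @ S + lam * D2.T @ D2
    return np.linalg.solve(A, S.T @ y)

# Recover a ramp from half of its samples: the curvature penalty fills
# in the missing entries, and since a ramp has zero second difference
# the recovery is exact.
x_true = np.linspace(0.0, 1.0, 20)
keep = np.arange(20) % 2 == 0
x_hat = second_order_recover(x_true[keep], keep, lam=1.0)
```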

Journal ArticleDOI
TL;DR: Two enhanced Fuzzy C-Means (FCM) clustering algorithms with spatial constraints are introduced for noisy color image segmentation, showing robustness and effectiveness both in the presence and in the absence of noise.

Journal ArticleDOI
TL;DR: The R-DRLSE model is a variational level set approach that utilizes region information to find image contours by minimizing the presented energy functional, with a distance regularization term that avoids the time-consuming re-initialization step.
Abstract: In this paper, a novel active contour model (R-DRLSE model) based on level set method is proposed for image segmentation. The R-DRLSE model is a variational level set approach that utilizes the region information to find image contours by minimizing the presented energy functional. To avoid the time-consuming re-initialization step, the distance regularization term is used to penalize the deviation of the level set function from a signed distance function. The numerical implementation scheme of the model can significantly reduce the iteration number and computation time. The results of experiments performed on some synthetic and real images show that the R-DRLSE model is effective and efficient. In particular, our method has been applied to MR kidney image segmentation with desirable results.

Journal ArticleDOI
01 Jul 2013
TL;DR: This paper proposes a highly optimized approach to histogram calculation that uses histogram replication for eliminating position conflicts, padding to reduce bank conflicts, and an improved access to input data called interleaved read access.
Abstract: A histogram is a compact representation of the distribution of data in an image, with a full range of applications in diverse fields. Histogram generation is an inherently sequential operation where every pixel votes in a reduced set of bins. This makes finding efficient parallel implementations very desirable but challenging, because on graphics processing units thousands of threads may be atomically updating a small number of histogram bins. Under these circumstances, collisions among threads will be very frequent, and such collisions will serialize thread execution, seriously damaging performance. In this paper we propose a highly optimized approach to histogram calculation that tackles these performance bottlenecks. It uses histogram replication to eliminate position conflicts, padding to reduce bank conflicts, and an improved access to input data called interleaved read access. Our so-called R-per-block approach to histogram calculation has been successfully compared to the main state-of-the-art works using four histogram-based image processing kernels and two real image databases. Results show that our proposal is between 1.4 and 15.7 times faster than every previous implementation for histograms of up to 4,096 bins.
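The replication and interleaved-read ideas can be simulated on the CPU (hypothetical `replicated_histogram`; the actual performance benefit only materializes with thousands of concurrent GPU threads contending on atomic updates):

```python
import numpy as np

def replicated_histogram(data, nbins, nreplicas=8):
    """Simulate histogram replication: each 'thread group' accumulates into
    its own private copy of the histogram, and the copies are reduced at the
    end, trading memory for fewer collisions on shared bins."""
    subs = np.zeros((nreplicas, nbins), dtype=np.int64)
    for r in range(nreplicas):
        # interleaved read access: replica r processes elements r, r+R, r+2R, ...
        chunk = data[r::nreplicas]
        np.add.at(subs[r], chunk, 1)   # votes go to the private sub-histogram
    return subs.sum(axis=0)            # final reduction over all replicas
```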

Journal ArticleDOI
TL;DR: This work proposes a method to simultaneously estimate a person’s clothed and naked shapes from a single image of that person wearing clothing, using a deformable model of clothed human shape.
Abstract: Estimation of human shape from images has numerous applications ranging from graphics to surveillance. A single image provides insufficient constraints (e.g. due to clothing), making human shape estimation challenging. We propose a method to simultaneously estimate a person's clothed and naked shapes from a single image of that person wearing clothing. The key component of our method is a deformable model of clothed human shape. We learn our deformable model, which spans variations in pose, body, and clothes, from a training dataset. These variations are derived by non-rigid surface deformation and encoded in low-dimensional parameters. Our deformable model can produce clothed 3D meshes for different people in different poses, none of which need appear in the training dataset. Given an input image, the deformable model is initialized with a few user-specified 2D joints and contours of the person. We then optimize the parameters of the deformable model by iterating between pose fitting and body fitting, so the clothed and naked 3D shapes of the person are obtained simultaneously. We illustrate our method for texture mapping and animation. Experimental results on real images demonstrate the effectiveness of our method.
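The low-dimensional deformable model idea can be sketched with a PCA-style shape space fitted to a few observed coordinates (hypothetical helpers `learn_shape_space` and `fit_coefficients`; the paper's model additionally separates pose, body, and clothing variation):

```python
import numpy as np

def learn_shape_space(meshes, k=3):
    """Learn a low-dimensional deformation space from training shapes:
    meshes is (n_samples, n_coords); returns mean shape and k basis rows."""
    mean = meshes.mean(axis=0)
    u, s, vt = np.linalg.svd(meshes - mean, full_matrices=False)
    return mean, vt[:k]

def fit_coefficients(mean, basis, observed, idx):
    """Least-squares fit of basis coefficients to a few observed coordinates
    (standing in for user-specified joints/contours), indexed by idx;
    returns the reconstructed full shape."""
    A = basis[:, idx].T                 # (n_obs, k) restricted basis
    b = observed - mean[idx]
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mean + coef @ basis
```

Given sparse observations that lie in the learned subspace, the full shape is recovered by solving for the handful of coefficients rather than for every vertex.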

Journal ArticleDOI
TL;DR: This paper presents a new algorithm that combines the benefits of appearance-based and geometry-based methods with a mathematical guarantee of global optimality, and generalizes the framework to data correspondence/grouping under an unknown parametric model for certain classes of computer vision problems.
Abstract: Data correspondence/grouping under an unknown parametric model is a fundamental topic in computer vision. Finding feature correspondences between two images is probably the most popular application of this research field, and is the main motivation of our work. It is a key ingredient for a wide range of vision tasks, including three-dimensional reconstruction and object recognition. Existing feature correspondence methods are based on either local appearance similarity or global geometric consistency or a combination of both in some heuristic manner. None of these methods is fully satisfactory, especially in the presence of repetitive image textures or mismatches. In this paper, we present a new algorithm that combines the benefits of both appearance-based and geometry-based methods and mathematically guarantees a global optimization. Our algorithm accepts the two sets of features extracted from two images as input, and outputs the feature correspondences with the largest number of inliers, which verify both the appearance similarity and geometric constraints. Specifically, we formulate the problem as a mixed integer program and solve it efficiently by a series of linear programs via a branch-and-bound procedure. We subsequently generalize our framework in the context of data correspondence/grouping under an unknown parametric model and show it can be applied to certain classes of computer vision problems. Our algorithm has been validated successfully on synthesized data and challenging real images.
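For tiny feature sets, the combined appearance-plus-geometry objective can be illustrated by brute-force enumeration (hypothetical `best_correspondence`; the paper instead solves a mixed integer program via linear-programming relaxations in a branch-and-bound procedure, which scales to realistic feature counts):

```python
import itertools
import numpy as np

def best_correspondence(feat1, feat2, pts1, pts2, w=1.0):
    """Toy stand-in for the correspondence objective: enumerate one-to-one
    assignments and score each by appearance similarity plus a pairwise
    geometric term that rewards preserving inter-point distances."""
    n = len(feat1)
    app = -np.linalg.norm(feat1[:, None] - feat2[None, :], axis=2)  # similarity
    best, best_score = None, -np.inf
    for perm in itertools.permutations(range(n)):
        score = sum(app[i, perm[i]] for i in range(n))
        for i in range(n):            # geometric consistency over all pairs
            for j in range(i + 1, n):
                d1 = np.linalg.norm(pts1[i] - pts1[j])
                d2 = np.linalg.norm(pts2[perm[i]] - pts2[perm[j]])
                score -= w * abs(d1 - d2)
        if score > best_score:
            best, best_score = perm, score
    return best
```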