
Showing papers on "Real image published in 2015"


Journal ArticleDOI
TL;DR: Experimental results show that the proposed AB-SIFT matching method is more robust and accurate than state-of-the-art methods, including SIFT, DAISY, the gradient location and orientation histogram (GLOH), the local intensity order pattern (LIOP), and binary robust invariant scalable keypoints (BRISK).
Abstract: Image matching based on local invariant features is crucial for many photogrammetric and remote sensing applications such as image registration and image mosaicking. In this paper, a novel local feature descriptor named adaptive binning scale-invariant feature transform (AB-SIFT) for fully automatic remote sensing image matching that is robust to local geometric distortions is proposed. The main idea of the proposed method is an adaptive binning strategy for computing the local feature descriptor. The proposed descriptor is computed on a normalized region defined by an improved version of the prominent Hessian affine feature extraction algorithm called the uniform robust Hessian affine algorithm. Unlike common distribution-based descriptors, the proposed descriptor uses an adaptive histogram quantization strategy for both locations and gradient orientations, which is resistant to local viewpoint distortions and markedly increases the discriminability and robustness of the final AB-SIFT descriptor. Beyond the SIFT descriptor, the proposed adaptive quantization strategy can easily be extended to other distribution-based descriptors. Experimental results on both synthetic and real image pairs show that the proposed AB-SIFT matching method is more robust and accurate than state-of-the-art methods, including SIFT, DAISY, the gradient location and orientation histogram (GLOH), the local intensity order pattern (LIOP), and binary robust invariant scalable keypoints (BRISK).
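The adaptive quantization idea can be illustrated compactly. The following is a minimal Python sketch, assuming NumPy only, in which histogram bin edges follow the empirical quantiles of the observed gradient orientations rather than a fixed uniform grid; it shows the general principle of adaptive binning, not the exact AB-SIFT scheme.

    import numpy as np

    def adaptive_orientation_histogram(patch, n_bins=8):
        # Gradients of the (already normalized) local region.
        gy, gx = np.gradient(patch.astype(float))
        mag = np.hypot(gx, gy)
        ori = np.arctan2(gy, gx)  # orientations in [-pi, pi]

        # Adaptive bin edges: quantiles of the observed orientations, so
        # densely populated orientation ranges get finer quantization.
        edges = np.quantile(ori, np.linspace(0.0, 1.0, n_bins + 1))
        hist, _ = np.histogram(ori, bins=edges, weights=mag)

        # L2-normalize, as is customary for SIFT-like descriptors.
        norm = np.linalg.norm(hist)
        return hist / norm if norm > 0 else hist

    patch = np.random.rand(16, 16)  # stand-in for a normalized region
    print(adaptive_orientation_histogram(patch))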

174 citations


Proceedings ArticleDOI
07 Dec 2015
TL;DR: The strategy is to learn a convolutional neural network that directly predicts output albedo and shading channels from an input RGB image patch, which outperforms all prior work, including methods that rely on RGB+Depth input.
Abstract: We introduce a new approach to intrinsic image decomposition, the task of decomposing a single image into albedo and shading components. Our strategy, which we term direct intrinsics, is to learn a convolutional neural network (CNN) that directly predicts output albedo and shading channels from an input RGB image patch. Direct intrinsics is a departure from classical techniques for intrinsic image decomposition, which typically rely on physically-motivated priors and graph-based inference algorithms. The large-scale synthetic ground-truth of the MPI Sintel dataset plays the key role in training direct intrinsics. We demonstrate results on both the synthetic images of Sintel and the real images of the classic MIT intrinsic image dataset. On Sintel, direct intrinsics, using only RGB input, outperforms all prior work, including methods that rely on RGB+Depth input. Direct intrinsics also generalizes across modalities: our Sintel-trained CNN produces quite reasonable decompositions on the real images of the MIT dataset. Our results indicate that the marriage of CNNs with synthetic training data may be a powerful new technique for tackling classic problems in computer vision.
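The direct-intrinsics idea, one convolutional trunk with two regression heads, can be sketched briefly. Below is a minimal PyTorch example; the layer sizes and two-head layout are illustrative assumptions, not the architecture from the paper.

    import torch
    import torch.nn as nn

    class DirectIntrinsics(nn.Module):
        def __init__(self):
            super().__init__()
            self.trunk = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            )
            # Separate 3-channel heads for albedo and shading.
            self.albedo = nn.Conv2d(64, 3, 3, padding=1)
            self.shading = nn.Conv2d(64, 3, 3, padding=1)

        def forward(self, rgb):
            feat = self.trunk(rgb)
            return self.albedo(feat), self.shading(feat)

    net = DirectIntrinsics()
    albedo, shading = net(torch.rand(1, 3, 64, 64))  # one RGB patch
    # Training would regress both heads against synthetic ground truth,
    # e.g. loss = mse(albedo, gt_albedo) + mse(shading, gt_shading).
    print(albedo.shape, shading.shape)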

140 citations


Proceedings ArticleDOI
07 Jun 2015
TL;DR: This work develops an improved technique for local shape estimation from defocus and correspondence cues, and shows how shading can be used to further refine the depth, and proposes a new framework that uses angular coherence to optimize depth and shading.
Abstract: Light-field cameras are now used in consumer and industrial applications. Recent papers and products have demonstrated practical depth recovery algorithms from a passive single-shot capture. However, current light-field capture devices have narrow baselines and constrained spatial resolution; therefore, the accuracy of depth recovery is limited, requiring heavy regularization and producing planar depths that do not resemble the actual geometry. Using shading information is essential to improve the shape estimation. We develop an improved technique for local shape estimation from defocus and correspondence cues, and show how shading can be used to further refine the depth. Light-field cameras are able to capture both spatial and angular data, suitable for refocusing. By locally refocusing each spatial pixel to its respective estimated depth, we produce an all-in-focus image where all viewpoints converge onto a point in the scene. Therefore, the angular pixels have angular coherence, which exhibits three properties: photo consistency, depth consistency, and shading consistency. We propose a new framework that uses angular coherence to optimize depth and shading. The optimization framework estimates both general lighting in natural scenes and shading to improve depth regularization. Our method outperforms current state-of-the-art light-field depth estimation algorithms in multiple scenarios, including real images.
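The photo-consistency part of angular coherence has a simple core: once a pixel is refocused to the correct depth, its angular samples should agree. The toy NumPy sketch below picks, for one spatial pixel, the candidate depth whose angular samples have the least variance; the sample array is a hypothetical stand-in for real light-field refocusing.

    import numpy as np

    def best_depth(angular_samples_per_depth):
        # Rows: candidate depths; columns: angular views of one pixel
        # refocused to that depth.
        variances = angular_samples_per_depth.var(axis=1)
        return int(np.argmin(variances))  # most photo-consistent depth

    samples = np.random.rand(32, 49)  # 32 depths, 7x7 angular views
    print(best_depth(samples))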

126 citations


Posted Content
TL;DR: In this article, a convolutional neural network (CNN) is used to directly predict output albedo and shading channels from an input RGB image patch, which is a departure from classical techniques for intrinsic image decomposition.
Abstract: We introduce a new approach to intrinsic image decomposition, the task of decomposing a single image into albedo and shading components. Our strategy, which we term direct intrinsics, is to learn a convolutional neural network (CNN) that directly predicts output albedo and shading channels from an input RGB image patch. Direct intrinsics is a departure from classical techniques for intrinsic image decomposition, which typically rely on physically-motivated priors and graph-based inference algorithms. The large-scale synthetic ground-truth of the MPI Sintel dataset plays a key role in training direct intrinsics. We demonstrate results on both the synthetic images of Sintel and the real images of the classic MIT intrinsic image dataset. On Sintel, direct intrinsics, using only RGB input, outperforms all prior work, including methods that rely on RGB+Depth input. Direct intrinsics also generalizes across modalities; it produces quite reasonable decompositions on the real images of the MIT dataset. Our results indicate that the marriage of CNNs with synthetic training data may be a powerful new technique for tackling classic problems in computer vision.

122 citations


Book ChapterDOI
13 Jan 2015
TL;DR: An entirely non-recursive 2D variational mode decomposition (2D-VMD) model, where the modes are extracted concurrently and the model looks for a number of 2D modes and their respective center frequencies, such that the bandlimited modes reproduce the input image.
Abstract: In this paper we propose a variational method to adaptively decompose an image into a few different modes of separate spectral bands, which are unknown beforehand. A popular method for recursive one-dimensional signal decomposition is the Empirical Mode Decomposition algorithm, introduced by Huang in the nineties. This algorithm, as well as its 2D extension, though extensively used, suffers from the lack of an exact mathematical model, from its dependence on interpolation choices, and from sensitivity to both noise and sampling. Other state-of-the-art models include synchrosqueezing, the empirical wavelet transform, and recursive variational decomposition into smooth signals and residuals. Here, we have created an entirely non-recursive 2D variational mode decomposition (2D-VMD) model, where the modes are extracted concurrently. The model looks for a number of 2D modes and their respective center frequencies, such that the bandlimited modes reproduce the input image (exactly or in a least-squares sense). Preliminary results show excellent performance on both synthetic and real images. Running this algorithm on a peptide microscopy image yields accurate, timely, and autonomous segmentation, which is pertinent to the fields of biochemistry and nanoscience.
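The heart of the concurrent extraction is a spectral-domain update in which each mode is a Wiener filter of the residual around its current center frequency. The NumPy sketch below is a simplified illustration of that scheme (no Lagrange multiplier, no half-plane constraint, fixed sweep count), not a faithful 2D-VMD implementation.

    import numpy as np

    def vmd2d(image, n_modes=3, alpha=1000.0, n_iter=50):
        f_hat = np.fft.fft2(image)
        h, w = image.shape
        wy = np.fft.fftfreq(h)[:, None]
        wx = np.fft.fftfreq(w)[None, :]
        u_hat = np.zeros((n_modes, h, w), dtype=complex)
        omega = np.random.rand(n_modes, 2) * 0.25  # initial centers

        for _ in range(n_iter):
            for k in range(n_modes):
                # Wiener-filter the residual around mode k's center.
                residual = f_hat - u_hat.sum(axis=0) + u_hat[k]
                dist2 = (wy - omega[k, 0])**2 + (wx - omega[k, 1])**2
                u_hat[k] = residual / (1.0 + 2.0 * alpha * dist2)
                # Center frequency: power-weighted mean frequency.
                power = np.abs(u_hat[k])**2
                total = power.sum() + 1e-12
                omega[k, 0] = (wy * power).sum() / total
                omega[k, 1] = (wx * power).sum() / total
        return np.real(np.fft.ifft2(u_hat, axes=(-2, -1)))

    modes = vmd2d(np.random.rand(64, 64))
    print(modes.shape)  # (3, 64, 64): one bandlimited mode per slice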

120 citations


Journal ArticleDOI
TL;DR: It is shown in the context of drone, plane, and car detection that using such synthetically generated images yields significantly better performance than simply perturbing real images or even synthesizing images in such a way that they look very realistic, as is often done when only limited amounts of training data are available.

120 citations


Journal ArticleDOI
TL;DR: An approach for relative spectral alignment between optical cross-sensor acquisitions, with a completely automatic strategy for selecting the hyperparameters of the system as well as the dimensionality of the transformed (latent) space.
Abstract: In this paper we present an approach to perform relative spectral alignment between optical cross-sensor acquisitions. The proposed method aims at projecting the images from two different and possibly disjoint input spaces into a common latent space, in which standard change detection algorithms can be applied. The system relies on the regularized kernel canonical correlation analysis transformation (kCCA), which can accommodate nonlinear dependencies between pixels by means of kernel functions. To learn the projections, the method employs a subset of samples belonging to the unchanged areas or to uninteresting radiometric differences. Since the availability of ground truth information to perform model selection is limited, we propose a completely automatic strategy to select the hyperparameters of the system as well as the dimensionality of the transformed (latent) space. The proposed scheme is fully automatic and allows the use of any change detection algorithm in the transformed latent space. A synthetic problem built from real images and a case study involving a real cross-sensor change detection problem illustrate the capabilities of the proposed method. Results show that the proposed system outperforms the linear baseline and provides accuracies close to those obtained with a fully supervised strategy. We provide a MATLAB implementation of the proposed method as well as the real cross-sensor data we prepared and employed at https://sites.google.com/site/michelevolpiresearch/codes/cross-sensor.
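A compact sketch of the kCCA projection step is given below, assuming RBF kernels, a simplified ridge regularization, and SciPy's generalized symmetric eigensolver; samples from unchanged areas play the role of the training set, and the automatic model selection is omitted.

    import numpy as np
    from scipy.linalg import eigh
    from sklearn.metrics.pairwise import rbf_kernel

    def kcca(X, Y, n_comp=2, reg=1e-3, gamma=0.1):
        n = X.shape[0]
        Kx, Ky = rbf_kernel(X, gamma=gamma), rbf_kernel(Y, gamma=gamma)
        Z = np.zeros((n, n))
        # Generalized eigenproblem A v = lam B v for dual coefficients.
        A = np.block([[Z, Kx @ Ky], [Ky @ Kx, Z]])
        B = np.block([[Kx @ Kx + reg * np.eye(n), Z],
                      [Z, Ky @ Ky + reg * np.eye(n)]])
        vals, vecs = eigh(A, B)
        top = vecs[:, np.argsort(vals)[::-1][:n_comp]]
        a, b = top[:n], top[n:]
        return Kx @ a, Ky @ b  # projections into the common latent space

    X = np.random.rand(100, 4)  # sensor-1 pixels (unchanged areas)
    Y = np.random.rand(100, 6)  # sensor-2 pixels, disjoint input space
    Zx, Zy = kcca(X, Y)
    print(Zx.shape, Zy.shape)   # both live in a 2-D latent space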

104 citations


Journal ArticleDOI
TL;DR: An efficient iterative algorithm is proposed for energy minimization, via which the image segmentation and bias field correction are simultaneously achieved and the smoothness of the obtained optimal bias field is ensured by the normalized convolutions without extra cost.
Abstract: This paper presents a novel variational approach for simultaneous estimation of bias field and segmentation of images with intensity inhomogeneity. We model intensity of inhomogeneous objects to be Gaussian distributed with different means and variances, and then introduce a sliding window to map the original image intensity onto another domain, where the intensity distribution of each object is still Gaussian but can be better separated. The means of the Gaussian distributions in the transformed domain can be adaptively estimated by multiplying the bias field with a piecewise constant signal within the sliding window. A maximum likelihood energy functional is then defined on each local region, which combines the bias field, the membership function of the object region, and the constant approximating the true signal from its corresponding object. The energy functional is then extended to the whole image domain by the Bayesian learning approach. An efficient iterative algorithm is proposed for energy minimization, via which the image segmentation and bias field correction are simultaneously achieved. Furthermore, the smoothness of the obtained optimal bias field is ensured by the normalized convolutions without extra cost. Experiments on real images demonstrated the superiority of the proposed algorithm to other state-of-the-art representative methods.
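One step of such a scheme, estimating the bias field given current memberships and then smoothing it with a normalized convolution, can be sketched as follows; this is a toy illustration under assumed Gaussian smoothing weights, not the paper's exact update.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def update_bias(image, memberships, constants, sigma=8.0):
        # Piecewise-constant approximation of the true signal.
        approx = sum(m * c for m, c in zip(memberships, constants))
        bias = image / np.maximum(approx, 1e-6)
        # Normalized convolution: weight each pixel by a confidence so
        # unreliable pixels contribute less to the smoothed bias field.
        w = approx**2
        return gaussian_filter(bias * w, sigma) / np.maximum(
            gaussian_filter(w, sigma), 1e-12)

    img = np.random.rand(64, 64) + 1.0
    m = [np.full((64, 64), 0.5), np.full((64, 64), 0.5)]  # two classes
    print(update_bias(img, m, [1.0, 2.0]).shape)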

101 citations


Journal ArticleDOI
TL;DR: A cascaded collaborative regression algorithm is proposed, which generates a cascaded shape updater that has the ability to overcome the difficulties caused by pose variations, as well as achieving better accuracy when applied to real faces.
Abstract: A large amount of training data is usually crucial for successful supervised learning. However, the task of providing training samples is often time-consuming, involving a considerable amount of tedious manual work. In addition, the amount of training data available is often limited. As an alternative, in this paper, we discuss how best to augment the available data for the application of automatic facial landmark detection. We propose the use of a 3D morphable face model to generate synthesized faces for a regression-based detector training. Benefiting from the large synthetic training data, the learned detector is shown to exhibit a better capability to detect the landmarks of a face with pose variations. Furthermore, the synthesized training data set provides accurate and consistent landmarks automatically as compared to the landmarks annotated manually, especially for occluded facial parts. The synthetic data and real data are from different domains; hence the detector trained using only synthesized faces does not generalize well to real faces. To deal with this problem, we propose a cascaded collaborative regression algorithm, which generates a cascaded shape updater that has the ability to overcome the difficulties caused by pose variations, as well as achieving better accuracy when applied to real faces. The training is based on a mix of synthetic and real image data with the mixing controlled by a dynamic mixture weighting schedule. Initially, the training uses heavily the synthetic data, as this can model the gross variations between the various poses. As the training proceeds, progressively more of the natural images are incorporated, as these can model finer detail. To improve the performance of the proposed algorithm further, we designed a dynamic multi-scale local feature extraction method, which captures more informative local features for detector training. An extensive evaluation on both controlled and uncontrolled face data sets demonstrates the merit of the proposed algorithm.
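The dynamic mixture weighting schedule is easy to make concrete. The sketch below assumes a simple linear schedule from mostly synthetic to mostly real data across cascade stages; the exact schedule in the paper may differ.

    import numpy as np

    def synthetic_fraction(stage, n_stages, w_start=0.9, w_end=0.1):
        # Fraction of training samples drawn from synthetic data.
        t = stage / max(n_stages - 1, 1)
        return (1 - t) * w_start + t * w_end

    def stage_batch(synthetic, real, stage, n_stages, batch=256,
                    rng=np.random.default_rng(0)):
        n_syn = int(round(synthetic_fraction(stage, n_stages) * batch))
        idx_s = rng.integers(0, len(synthetic), n_syn)
        idx_r = rng.integers(0, len(real), batch - n_syn)
        return np.concatenate([synthetic[idx_s], real[idx_r]])

    synthetic = np.random.rand(1000, 10)  # synthesized-face features
    real = np.random.rand(200, 10)        # real-face features
    for s in range(5):
        print(s, synthetic_fraction(s, 5), stage_batch(
            synthetic, real, s, 5).shape)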

97 citations


Journal ArticleDOI
TL;DR: A novel level set method for complex image segmentation, where local statistical analysis and global similarity measurement are both incorporated into the construction of the energy functional to avoid the time-consuming re-initialization step.

90 citations


01 Jan 2015
TL;DR: A new state-of-the-art performance for cell counting on the standard synthetic image benchmarks is set and the potential of the FCRNs for providing cell detections for overlapping cells is shown.
Abstract: This paper concerns automated cell counting in microscopy images. The approach we take is to adapt Convolutional Neural Networks (CNNs) to regress a cell spatial density map across the image. This is applicable to situations where traditional single-cell segmentation based methods do not work well due to cell clumping or overlap. We make the following contributions: (i) we develop and compare architectures for two Fully Convolutional Regression Networks (FCRNs) for this task; (ii) since the networks are fully convolutional, they can predict a density map for an input image of arbitrary size, and we exploit this to improve efficiency at training time by training end-to-end on image patches; and (iii) we show that FCRNs trained entirely on synthetic data are able to give excellent predictions on real microscopy images without fine-tuning, and that the performance can be further improved by fine-tuning on the real images. We set a new state-of-the-art performance for cell counting on the standard synthetic image benchmarks and, as a side benefit, show the potential of the FCRNs for providing cell detections for overlapping cells.
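A toy fully convolutional regression network makes the density-map idea concrete: any input size works, and the count is the integral of the output. The PyTorch sketch below uses illustrative layer widths, not the FCRN-A/FCRN-B architectures compared in the paper.

    import torch
    import torch.nn as nn

    class TinyFCRN(nn.Module):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                nn.Upsample(scale_factor=2, mode='bilinear',
                            align_corners=False),
                nn.Conv2d(64, 1, 3, padding=1),  # density map
            )

        def forward(self, x):
            return self.net(x)

    net = TinyFCRN()
    density = net(torch.rand(1, 1, 100, 100))   # arbitrary even size
    print(density.shape, float(density.sum()))  # sum = estimated count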

Journal ArticleDOI
TL;DR: Tunable focusing LC optical elements are promising developments in the thriving field of AR applications, and the optical principle could be further extended to other tunable focusing lenses, even those with lower lens power.
Abstract: An augmented reality (AR) system involving the electrically tunable location of a projected image is implemented using a liquid-crystal (LC) lens. The projected image is either real or virtual. By effectively doubling the LC lens power following light reflection, the position of a projected virtual image can be made to vary from 42 to 360 cm, while the tunable range for a projected real image is from 27 to 52 cm on the opposite side. The optical principle of the AR system is introduced and could be further developed for other tunable focusing lenses, even those with a lower lens power. The benefits of this study could be extended to head-mounted display systems for vision correction or vision compensation. We believe that tunable focusing LC optical elements are promising developments in the thriving field of AR applications.
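The reported shift of the projected image with lens power follows directly from the thin-lens relation 1/v = P + 1/u (distances in meters, power P in diopters). The numbers below are assumed values chosen to show the virtual-to-real transition, not the paper's measurements.

    # Thin-lens sweep: how the image position moves with tunable power.
    u = -0.20                    # object 20 cm in front of the lens
    for P in [2.0, 4.0, 6.0]:    # candidate effective powers (diopters)
        v = 1.0 / (P + 1.0 / u)  # image distance from 1/v = P + 1/u
        kind = "real" if v > 0 else "virtual"
        print(f"P = {P:.1f} D -> image at {100 * v:+.1f} cm ({kind})")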

Journal ArticleDOI
TL;DR: This paper addresses the problem of extracting a subspace representing the difference components between class subspaces generated from each set of object images independently of each other, and demonstrates validity through shape analysis on synthetic and real images of 3D objects as well as extensive comparison of performance on classification tests.
Abstract: Subspace-based methods are known to provide a practical solution for image set-based object recognition. Based on the insight that local shape differences between objects offer a sensitive cue for recognition, this paper addresses the problem of extracting a subspace representing the difference components between class subspaces generated from each set of object images independently of each other. We first introduce the difference subspace (DS), a novel geometric concept between two subspaces as an extension of a difference vector between two vectors, and describe its effectiveness in analyzing shape differences. We then generalize it to the generalized difference subspace (GDS) for multi-class subspaces, and show the benefit of applying this to subspace and mutual subspace methods, in terms of recognition capability. Furthermore, we extend these methods to kernel DS (KDS) and kernel GDS (KGDS) by a nonlinear kernel mapping to deal with cases involving larger changes in viewing direction. In summary, the contributions of this paper are as follows: 1) a DS/KDS between two class subspaces characterizes shape differences between the two respectively corresponding objects, 2) the projection of an input vector onto a DS/KDS realizes selective visualization of shape differences between objects, and 3) the projection of an input vector or subspace onto a GDS/KGDS is extremely effective at extracting differences between multiple subspaces, and therefore improves object recognition performance. We demonstrate validity through shape analysis on synthetic and real images of 3D objects as well as extensive comparison of performance on classification tests with several related methods; we study the performance in face image classification on the Yale face database B+ and the CMU Multi-PIE database, and hand shape classification of multi-view images.
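The GDS construction itself is short: sum the orthogonal projectors of the class subspaces and keep the eigenvectors with the smallest eigenvalues, which carry the components that differ between classes rather than the common ones. The NumPy sketch below uses illustrative dimensions.

    import numpy as np

    def class_subspace(X, dim):
        # Orthonormal basis of one class subspace via SVD of its samples.
        U, _, _ = np.linalg.svd(X.T, full_matrices=False)
        return U[:, :dim]

    def gds(class_samples, sub_dim=3, gds_dim=5):
        d = class_samples[0].shape[1]
        G = np.zeros((d, d))
        for X in class_samples:
            B = class_subspace(X, sub_dim)
            G += B @ B.T  # projector onto this class subspace
        vals, vecs = np.linalg.eigh(G)  # ascending eigenvalues
        return vecs[:, :gds_dim]        # difference directions

    classes = [np.random.rand(40, 20) + i for i in range(4)]
    print(gds(classes).shape)  # (20, 5): basis of the GDS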

Proceedings ArticleDOI
07 Dec 2015
TL;DR: This work takes the collection of already reconstructed cameras as a generalized camera and determines the absolute pose of a candidate pinhole camera from pure 2D correspondences, which is called the semi-generalized camera pose problem.
Abstract: This paper proposes a new incremental structure from motion (SfM) algorithm based on a novel structure-less camera resection technique. Traditional methods rely on 2D-3D correspondences to compute the pose of candidate cameras using PnP. In this work, we take the collection of already reconstructed cameras as a generalized camera, and determine the absolute pose of a candidate pinhole camera from pure 2D correspondences, which we call the semi-generalized camera pose problem. We present the minimal solvers of the new problem for both calibrated and partially calibrated (unknown focal length) pinhole cameras. By integrating these new algorithms in an incremental SfM system, we go beyond the state-of-the-art methods with the capability of reconstructing cameras without 2D-3D correspondences. Large-scale real image experiments show that our new SfM system significantly improves the completeness of 3D reconstruction over the standard approach.

Proceedings ArticleDOI
26 Aug 2015
TL;DR: This work uses a convolutional network to build a deep image representation, with an additional fully-connected single layer with softmax regression for classification in iris spoofing detection, achieving a 30% performance gain over the state-of-the-art approach (SOTA) on one of two public iris image databases for contact lens detection and comparable results on the other.
Abstract: Spoofing detection is a challenging task in biometric systems, when differentiating illegitimate users from genuine ones. Although iris scans are far more inclusive than fingerprints, and also more precise for person authentication, iris recognition systems are vulnerable to spoofing via textured cosmetic contact lenses. Iris spoofing detection is also referred to as liveness detection (binary classification of fake and real images). In this work, we focus on a three-class detection problem: images with textured (colored) contact lenses, soft contact lenses, and no lenses. Our approach uses a convolutional network to build a deep image representation and an additional fully-connected single layer with softmax regression for classification. Experiments are conducted in comparison with a state-of-the-art approach (SOTA) on two public iris image databases for contact lens detection: 2013 Notre Dame and IIIT-Delhi. Our approach can achieve a 30% performance gain over SOTA on the former database (from 80% to 86%) and comparable results on the latter. Since IIIT-Delhi does not provide segmented iris images and, differently from SOTA, our approach does not segment the iris yet, we conclude that these are very promising results.

Journal ArticleDOI
TL;DR: It is proved that the value of the unique global minimizer for the energy functional lies within the interval [-1, 1] for any image, and equals 1 in the object and -1 in the background for an ideal binary image.

Journal ArticleDOI
TL;DR: A novel imaging system is proposed that can simultaneously capture the red, green, blue (RGB) and NIR images with different exposure times and reconstruct a latent color image sequence using an adaptive smoothness condition based on gradient and color correlations.
Abstract: We propose a novel method to synthesize a noise- and blur-free color image sequence using near-infrared (NIR) images captured in extremely low light conditions. In extremely low light scenes, heavy noise and motion blur are simultaneously produced in the captured images. Our goal is to enhance the color image sequence of an extremely low light scene. In this paper, we augment the imaging system as well as enhancing the image synthesis scheme. We propose a novel imaging system that can simultaneously capture the red, green, blue (RGB) and the NIR images with different exposure times. An RGB image is taken with a long exposure time to acquire sufficient color information and to mitigate the effects of heavy noise. By contrast, the NIR images are captured with a short exposure time to measure the structure of the scenes. Our imaging system using different exposure times allows us to ensure sufficient information to reconstruct a clear color image sequence. Using the captured image pairs, we reconstruct a latent color image sequence using an adaptive smoothness condition based on gradient and color correlations. Our experiments using both synthetic images and real image sequences show that our method outperforms other state-of-the-art methods.

Journal ArticleDOI
TL;DR: Comparisons with the recently popular local binary fitting (LBF) model and local Chan-Vese (LCV) model show that the proposed method clearly outperforms traditional local region based methods.

Journal ArticleDOI
TL;DR: A multiple hypotheses tracker is derived which retrieves the potential poses of the camera from the observations in the image and it is shown how these candidate poses can be integrated into a particle filtering framework to guide the particle set toward the peaks of the distribution.
Abstract: This paper proposes a novel model-based tracking approach for 3-D localization. One main difficulty of standard model-based approach lies in the presence of low-level ambiguities between different edges. In this paper, given a 3-D model of the edges of the environment, we derive a multiple hypotheses tracker which retrieves the potential poses of the camera from the observations in the image. We also show how these candidate poses can be integrated into a particle filtering framework to guide the particle set toward the peaks of the distribution. Motivated by the UAV indoor localization problem where GPS signal is not available, we validate the algorithm on real image sequences from UAV flights.
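Mixing tracker hypotheses into the particle set can be sketched simply: a fraction of the particles is re-seeded around the candidate poses so the set concentrates near the posterior modes. All dimensions and fractions below are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def inject_hypotheses(particles, candidates, frac=0.2, noise=0.05):
        # Replace a random subset of particles with jittered copies of
        # the candidate poses from the multiple hypotheses tracker.
        n_inject = int(frac * len(particles))
        idx = rng.choice(len(particles), n_inject, replace=False)
        picks = candidates[rng.integers(0, len(candidates), n_inject)]
        particles[idx] = picks + noise * rng.standard_normal(picks.shape)
        return particles

    particles = rng.standard_normal((500, 6))  # 6-DoF pose particles
    candidates = rng.standard_normal((3, 6))   # tracker's candidate poses
    print(inject_hypotheses(particles, candidates).shape)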

Journal ArticleDOI
TL;DR: This work presents a randomized iterative workflow, which exploits geometrical properties of isophotes in the image to select the most meaningful edge pixels and to classify them into subsets of equal isophote curvature.
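The curvature used for this classification is the standard isophote curvature k = -(Ix^2 Iyy - 2 Ix Iy Ixy + Iy^2 Ixx) / (Ix^2 + Iy^2)^(3/2); the NumPy sketch below computes it densely so that edge pixels can then be grouped by (approximately) equal curvature.

    import numpy as np

    def isophote_curvature(img):
        Iy, Ix = np.gradient(img.astype(float))
        Iyy, _ = np.gradient(Iy)
        Ixy, Ixx = np.gradient(Ix)
        num = Ix**2 * Iyy - 2.0 * Ix * Iy * Ixy + Iy**2 * Ixx
        den = (Ix**2 + Iy**2)**1.5 + 1e-12  # avoid division by zero
        return -num / den

    print(isophote_curvature(np.random.rand(64, 64)).shape)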

Journal ArticleDOI
TL;DR: CytoSpectre, a versatile, easy-to-use software tool for spectral analysis of microscopy images, was developed; it was found to be tolerant to noise and blurring and superior to FibrilTool when analyzing realistic targets with degraded image quality.
Abstract: Orientation and the degree of isotropy are important in many biological systems such as the sarcomeres of cardiomyocytes and other fibrillar structures of the cytoskeleton. Image based analysis of such structures is often limited to qualitative evaluation by human experts, hampering the throughput, repeatability and reliability of the analyses. Software tools are not readily available for this purpose and the existing methods typically rely at least partly on manual operation. We developed CytoSpectre, an automated tool based on spectral analysis, allowing the quantification of orientation and also size distributions of structures in microscopy images. CytoSpectre utilizes the Fourier transform to estimate the power spectrum of an image and based on the spectrum, computes parameter values describing, among others, the mean orientation, isotropy and size of target structures. The analysis can be further tuned to focus on targets of particular size at cellular or subcellular scales. The software can be operated via a graphical user interface without any programming expertise. We analyzed the performance of CytoSpectre by extensive simulations using artificial images, by benchmarking against FibrilTool and by comparisons with manual measurements performed for real images by a panel of human experts. The software was found to be tolerant to noise and blurring and superior to FibrilTool when analyzing realistic targets with degraded image quality. The analysis of real images indicated general good agreement between computational and manual results while also revealing notable expert-to-expert variation. Moreover, the experiment showed that CytoSpectre can handle images obtained of different cell types using different microscopy techniques. Finally, we studied the effect of mechanical stretching on cardiomyocytes to demonstrate the software in an actual experiment and observed changes in cellular orientation in response to stretching. CytoSpectre, a versatile, easy-to-use software tool for spectral analysis of microscopy images was developed. The tool is compatible with most 2D images and can be used to analyze targets at different scales. We expect the tool to be useful in diverse applications dealing with structures whose orientation and size distributions are of interest. While designed for the biological field, the software could also be useful in non-biological applications.
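The spectral core of such a tool fits in a few lines: the dominant orientation of image structures can be read off the angular distribution of the 2D power spectrum. The sketch below is a bare-bones illustration (no windowing, no scale selection), not CytoSpectre itself.

    import numpy as np

    def dominant_orientation(img):
        F = np.fft.fftshift(np.fft.fft2(img - img.mean()))
        power = np.abs(F)**2
        h, w = img.shape
        y, x = np.mgrid[-(h // 2):h - h // 2, -(w // 2):w - w // 2]
        theta = np.arctan2(y, x) % np.pi  # spectral angle in [0, pi)
        # Power-weighted circular mean of doubled angles (period pi).
        c = (power * np.cos(2 * theta)).sum()
        s = (power * np.sin(2 * theta)).sum()
        spectral = 0.5 * np.arctan2(s, c) % np.pi
        # Oriented structures put their spectral energy perpendicular
        # to themselves, hence the 90-degree shift.
        return (spectral + np.pi / 2) % np.pi

    print(np.degrees(dominant_orientation(np.random.rand(128, 128))))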

Proceedings ArticleDOI
Mi Zhang, Jian Yao, Menghan Xia, Kai Li, Yi Zhang, Yaping Liu
07 Jun 2015
TL;DR: This paper proposes an easily implemented fisheye image rectification algorithm with line constraints in the undistorted perspective image plane that outperforms the existing approaches and the commercial software in most cases.
Abstract: Fisheye image rectification and estimation of intrinsic parameters for real scenes have been addressed in the literature by using line information on the distorted images. In this paper, we propose an easily implemented fisheye image rectification algorithm with line constraints in the undistorted perspective image plane. A novel Multi-Label Energy Optimization (MLEO) method is adopted to merge short circular arcs sharing the same or approximately the same circular parameters and to select long circular arcs for camera rectification. Furthermore, we propose an efficient method to estimate intrinsic parameters of the fisheye camera by automatically selecting three properly arranged long circular arcs from the previously obtained circular arcs in the calibration procedure. Experimental results on a number of real images and simulated data show that the proposed method can achieve good results and outperforms the existing approaches and the commercial software in most cases.
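For intuition, undistortion under a simple radial model can be sketched in a few lines; the single-parameter division model below (under which straight lines image as circular arcs) is a common stand-in, and lambda is an assumed value rather than one estimated by the paper's MLEO procedure.

    import numpy as np

    def undistort_division(points, center, lam):
        # points: (n, 2) distorted pixel coordinates.
        p = points - center
        r2 = (p**2).sum(axis=1, keepdims=True)
        return center + p / (1.0 + lam * r2)

    pts = np.array([[320.0, 120.0], [400.0, 250.0]])
    print(undistort_division(pts, np.array([320.0, 240.0]), lam=-1e-6))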

Proceedings ArticleDOI
07 Dec 2015
TL;DR: This work proposes a complete approach that links the detection of curved reflection symmetries to the production of symmetry-constrained segments of structures/regions in cluttered real images, enforcing global symmetry consistency in the final segmentation.
Abstract: Symmetry, as one of the key components of Gestalt theory, provides an important mid-level cue that serves as input to higher visual processes such as segmentation. In this work, we propose a complete approach that links the detection of curved reflection symmetries to produce symmetry-constrained segments of structures/regions in real images with clutter. For curved reflection symmetry detection, we leverage patch-based symmetric features to train a Structured Random Forest classifier that detects multiscale curved symmetries in 2D images. Next, using these curved symmetries, we modulate a novel symmetry-constrained foreground-background segmentation by their symmetry scores so that we enforce global symmetrical consistency in the final segmentation. This is achieved by imposing a pairwise symmetry prior that encourages symmetric pixels to have the same labels over a MRF-based representation of the input image edges, and the final segmentation is obtained via graph-cuts. Experimental results over four publicly available datasets containing annotated symmetric structures: 1) SYMMAX-300 [38], 2) BSD-Parts, 3) Weizmann Horse (both from [18]) and 4) NY-roads [35] demonstrate the approach's applicability to different environments with state-of-the-art performance.

Journal ArticleDOI
TL;DR: A new adaptive active contour model is proposed for image segmentation, built on fractional-order differentiation, the level set method, and curve evolution; a penalty term is added to the proposed model to ensure stable evolution of the level set function.

Journal ArticleDOI
TL;DR: This paper introduces two effective iterative global optimization algorithms initiated with noisy auxiliary information to improve structure-from-motion solving, including a robust scene reconstruction algorithm that deals with noisy GPS data for camera center initialization.
Abstract: One of the potentially effective means for large-scale 3D scene reconstruction is to reconstruct the scene in a global manner, rather than incrementally, by fully exploiting available auxiliary information on the imaging condition, such as camera location by Global Positioning System (GPS), orientation by inertial measurement unit (or compass), focal length from EXIF, and so on. However, such auxiliary information, though informative and valuable, is usually too noisy to be directly usable. In this paper, we present an approach that takes advantage of such noisy auxiliary information to improve structure from motion solving. More specifically, we introduce two effective iterative global optimization algorithms initiated with such noisy auxiliary information. One is a robust rotation averaging algorithm to deal with a contaminated epipolar graph; the other is a robust scene reconstruction algorithm to deal with noisy GPS data for camera center initialization. We found that by exclusively focusing on the estimated inliers at the current iteration, the optimization process initialized by such noisy auxiliary information could converge well and efficiently. Our proposed method is evaluated on real images captured by unmanned aerial vehicle, StreetView car, and conventional digital cameras. Extensive experimental results show that our method performs similarly to or better than many state-of-the-art reconstruction approaches, in terms of reconstruction accuracy and completeness, but is more efficient and scalable for large-scale image data sets.
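The inlier-focused iteration can be illustrated on a deliberately simplified problem: many noisy measurements of a single rotation, alternately averaged and re-classified by angular residual. The paper operates on a full epipolar graph; the threshold and data here are toy assumptions. The sketch uses SciPy's Rotation class.

    import numpy as np
    from scipy.spatial.transform import Rotation as R

    rng = np.random.default_rng(0)
    true = R.from_rotvec([0.3, -0.2, 0.5])
    meas = R.concatenate(
        [true * R.from_rotvec(0.02 * rng.standard_normal(3))
         for _ in range(40)] +                            # inliers
        [R.random(random_state=rng) for _ in range(10)])  # outliers

    est = meas.mean()  # initialization from all measurements
    for _ in range(5):
        resid = np.degrees(np.linalg.norm(
            (est.inv() * meas).as_rotvec(), axis=1))
        inliers = np.flatnonzero(resid < 10.0)  # consistent measurements
        est = meas[inliers].mean()              # re-average inliers only

    print(np.degrees(np.linalg.norm((est.inv() * true).as_rotvec())))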

Journal ArticleDOI
TL;DR: In this article, the authors combined the optimal transport and the metamorphosis perspectives for a pair of given input images and defined geodesic paths in the space of images as minimizers of a resulting path energy.
Abstract: In this paper the optimal transport and the metamorphosis perspectives are combined. For a pair of given input images, geodesic paths in the space of images are defined as minimizers of a resulting path energy. To this end, the underlying Riemannian metric measures the rate of transport cost and the rate of viscous dissipation. Furthermore, the model is capable of dealing with strongly varying image contrast and explicitly allows for sources and sinks in the transport equations, which are incorporated in the metric related to the metamorphosis approach by Trouve and Younes. In the non-viscous case with a source term, the existence of geodesic paths is proven in the space of measures. The proposed model is explored on the range from merely optimal transport to strongly dissipative dynamics. For this model a robust and effective variational time discretization of geodesic paths is proposed. This requires minimizing a discrete path energy consisting of a sum of consecutive image matching functionals. These functionals are defined on corresponding pairs of intensity functions and on associated pairwise matching deformations. Existence of time discrete geodesics is demonstrated. Furthermore, a finite element implementation is proposed and applied to instructive test cases and to real images. In the non-viscous case this is compared to the algorithm proposed by Benamou and Brenier, including a discretization of the source term. Finally, the model is generalized to define discrete weighted barycentres with applications to textures and objects.
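In generic form, the variational time discretization described above minimizes a sum of consecutive matching terms. The display below is a hedged reconstruction of that structure, where the matching functional W is defined on consecutive intensity pairs and their matching deformations (its precise form is given in the paper):

    E[u_0, \dots, u_K] \;=\; K \sum_{k=1}^{K} \mathcal{W}[u_{k-1}, u_k]

This is minimized over the intermediate images u_1, ..., u_{K-1}, with u_0 and u_K fixed to the given input pair.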

Journal ArticleDOI
02 Apr 2015 - PLOS ONE
TL;DR: A modified model is developed for segmenting images with intensity inhomogeneity and estimating the bias field simultaneously, with a clustering criterion energy function defined by considering the difference between the measured image and the estimated image in a local region.
Abstract: Intensity inhomogeneity causes many difficulties in image segmentation and the understanding of magnetic resonance (MR) images. Bias correction is an important method for addressing the intensity inhomogeneity of MR images before quantitative analysis. In this paper, a modified model is developed for segmenting images with intensity inhomogeneity and estimating the bias field simultaneously. In the modified model, a clustering criterion energy function is defined by considering the difference between the measured image and the estimated image in a local region. By using this difference in a local region, the modified method can obtain accurate segmentation results and an accurate estimation of the bias field. The energy function is incorporated into a level set formulation with a level set regularization term, and the energy minimization is conducted by a level set evolution process. The proposed model is first presented as a two-phase model and then extended to a multi-phase one. The experimental results demonstrate the advantages of our model in terms of accuracy and insensitivity to the location of the initial contours. In particular, our method has been applied to various synthetic and real images with desirable results.

Journal ArticleDOI
TL;DR: A decremental Sparse Modeling Representative Selection (D-SMRS) method is proposed, in which the selection of representatives is broken down into several nested processes, yielding a novel framework for prototype selection in subspaces.

Patent
18 Feb 2015
TL;DR: This patent describes systems and methods for rendering an image stream combining real and virtual elements for display to a user equipped with a head mounted display as an augmented reality system.
Abstract: Systems and methods are described for rendering an image stream combining real and virtual elements for display to a user equipped with a head mounted display as an augmented reality. The head mounted display comprises: an image camera to capture a physical image stream of the physical environment surrounding the user; a depth camera for capturing depth information for the physical environment; a processor for receiving the depth information and the physical image stream and associating the depth information with regions in the physical image stream; a graphics processing unit having a graphics engine to render a virtual image stream comprising virtual features alongside the physical image stream; and a display to display the virtual image stream to the user. In use, the processor calls the graphics engine to incorporate the physical image stream such that the relative depths of the virtual and physical features are represented.
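The depth-aware compositing the claims describe reduces to a per-pixel z-test between the depth camera's measurements and the virtual scene's depth buffer. The NumPy sketch below uses toy arrays as stand-ins for the camera streams.

    import numpy as np

    def composite(phys_rgb, phys_depth, virt_rgb, virt_depth):
        nearer = virt_depth < phys_depth  # per-pixel z-test
        out = phys_rgb.copy()
        out[nearer] = virt_rgb[nearer]    # virtual wins where nearer
        return out

    h, w = 4, 4
    phys, virt = np.random.rand(h, w, 3), np.random.rand(h, w, 3)
    phys_d = np.full((h, w), 2.0)  # physical scene 2 m away
    virt_d = np.full((h, w), 3.0)
    virt_d[:2, :] = 1.0            # top half of virtual object nearer
    print(composite(phys, phys_d, virt, virt_d).shape)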

Proceedings ArticleDOI
01 Aug 2015
TL;DR: This paper proposes a power line recognition and tracking method for sequences of aerial images, using the Hough transform with a parallel constraint for power line recognition; the tracking reduces computation time and makes the recognition robust.
Abstract: Automatic power line recognition is a challenging task for an unmanned aerial vehicle (UAV) power line inspection system. In this paper, we propose a power line recognition and tracking method for sequences of aerial images. First, the power lines are enhanced by a kind of double-side filter; then, the Hough transform with a parallel constraint is used for power line recognition; finally, in the image sequence, the states of the power lines are estimated, and the power lines are tracked using those estimates. The tracking reduces the computation time and makes the recognition robust. Our experiments on real image data captured from a UAV demonstrate that our method is effective for power line recognition in image sequences.
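The parallel-constrained Hough step can be sketched with OpenCV: detect lines, then keep only those belonging to a sufficiently large bundle of near-parallel lines, since power lines appear as such a bundle. The thresholds are illustrative, and angle wrap-around near 0/pi is ignored for brevity.

    import cv2
    import numpy as np

    def parallel_lines(edge_img, angle_tol_deg=3.0, min_group=3):
        lines = cv2.HoughLines(edge_img, 1, np.pi / 180, threshold=100)
        if lines is None:
            return []
        thetas = lines[:, 0, 1]
        keep = []
        for rho, theta in lines[:, 0]:
            # Count near-parallel partners (the line itself included).
            mates = np.abs(np.degrees(thetas - theta)) < angle_tol_deg
            if mates.sum() >= min_group:
                keep.append((rho, theta))
        return keep

    img = (np.random.rand(200, 200) * 255).astype(np.uint8)
    edges = cv2.Canny(img, 50, 150)
    print(len(parallel_lines(edges)))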