
Showing papers on "Image processing published in 2007"


Journal ArticleDOI
TL;DR: An image denoising algorithm based on an enhanced sparse representation in transform domain, refined by a specially developed collaborative Wiener filtering, achieves state-of-the-art denoising performance in terms of both peak signal-to-noise ratio and subjective visual quality.
Abstract: We propose a novel image denoising strategy based on an enhanced sparse representation in transform domain. The enhancement of the sparsity is achieved by grouping similar 2D image fragments (e.g., blocks) into 3D data arrays which we call "groups." Collaborative filtering is a special procedure developed to deal with these 3D groups. We realize it using three successive steps: 3D transformation of a group, shrinkage of the transform spectrum, and inverse 3D transformation. The result is a 3D estimate that consists of the jointly filtered grouped image blocks. By attenuating the noise, the collaborative filtering reveals even the finest details shared by grouped blocks and, at the same time, it preserves the essential unique features of each individual block. The filtered blocks are then returned to their original positions. Because these blocks are overlapping, for each pixel, we obtain many different estimates which need to be combined. Aggregation is a particular averaging procedure which is exploited to take advantage of this redundancy. A significant improvement is obtained by a specially developed collaborative Wiener filtering. An algorithm based on this novel denoising strategy and its efficient implementation are presented in full detail; an extension to color-image denoising is also developed. The experimental results demonstrate that this computationally scalable algorithm achieves state-of-the-art denoising performance in terms of both peak signal-to-noise ratio and subjective visual quality.
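For readers who want to experiment with the grouping-and-collaborative-filtering idea described above, the following minimal Python/NumPy sketch processes a single reference block: it gathers similar blocks from a local search window, applies a 3D DCT, hard-thresholds the spectrum, and returns the filtered blocks with their positions for later aggregation. Block size, search window, group size, and threshold are illustrative assumptions; the full BM3D pipeline (two stages, Wiener refinement, weighted aggregation) is not reproduced.

```python
import numpy as np
from scipy.fft import dctn, idctn

def collaborative_filter(noisy, ref_y, ref_x, bsize=8, search=16, n_blocks=16, thr=60.0):
    """Simplified one-group sketch of grouping + collaborative hard thresholding."""
    H, W = noisy.shape
    ref = noisy[ref_y:ref_y + bsize, ref_x:ref_x + bsize]

    # 1. Grouping: collect blocks in a local search window, ranked by similarity.
    candidates = []
    for y in range(max(0, ref_y - search), min(H - bsize, ref_y + search) + 1):
        for x in range(max(0, ref_x - search), min(W - bsize, ref_x + search) + 1):
            blk = noisy[y:y + bsize, x:x + bsize]
            candidates.append((np.sum((blk - ref) ** 2), y, x))
    candidates.sort(key=lambda t: t[0])
    group = np.stack([noisy[y:y + bsize, x:x + bsize]
                      for _, y, x in candidates[:n_blocks]])

    # 2. 3D transform, hard shrinkage of the spectrum, inverse 3D transform.
    spec = dctn(group, norm='ortho')
    spec[np.abs(spec) < thr] = 0.0
    filtered = idctn(spec, norm='ortho')

    # 3. Return filtered blocks and their positions so overlapping estimates
    #    can later be aggregated by (weighted) averaging.
    positions = [(y, x) for _, y, x in candidates[:n_blocks]]
    return filtered, positions
```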

7,912 citations


Journal ArticleDOI
TL;DR: EMAN2 has been under development for the last two years, with a completely refactored image processing library, and a wide range of features to make it much more flexible and extensible than EMAN1.

2,852 citations


Journal ArticleDOI
TL;DR: This work formulates stitching as a multi-image matching problem and uses invariant local features to find matches between all of the images, making the method insensitive to the ordering, orientation, scale and illumination of the input images.
Abstract: This paper concerns the problem of fully automated panoramic image stitching. Though the 1D problem (single axis of rotation) is well studied, 2D or multi-row stitching is more difficult. Previous approaches have used human input or restrictions on the image sequence in order to establish matching images. In this work, we formulate stitching as a multi-image matching problem, and use invariant local features to find matches between all of the images. Because of this our method is insensitive to the ordering, orientation, scale and illumination of the input images. It is also insensitive to noise images that are not part of a panorama, and can recognise multiple panoramas in an unordered image dataset. In addition to providing more detail, this paper extends our previous work in the area (Brown and Lowe, 2003) by introducing gain compensation and automatic straightening steps.

2,550 citations


Book ChapterDOI
12 Sep 2007
TL;DR: This work presents a novel approach to solve the TV-L1 formulation, which is based on a dual formulation of the TV energy and employs an efficient point-wise thresholding step.
Abstract: Variational methods are among the most successful approaches to calculate the optical flow between two image frames. A particularly appealing formulation is based on total variation (TV) regularization and the robust L1 norm in the data fidelity term. This formulation can preserve discontinuities in the flow field and offers an increased robustness against illumination changes, occlusions and noise. In this work we present a novel approach to solve the TV-L1 formulation. Our method results in a very efficient numerical scheme, which is based on a dual formulation of the TV energy and employs an efficient point-wise thresholding step. Additionally, our approach can be accelerated by modern graphics processing units. We demonstrate the real-time performance (30 fps) of our approach for video inputs at a resolution of 320 × 240 pixels.
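The point-wise thresholding step mentioned in the abstract admits a compact closed form. The sketch below follows the standard TV-L1 linearization of the data term; the variable names and the surrounding warping and dual (TV) updates are assumptions for illustration, not the authors' exact code.

```python
import numpy as np

def tv_l1_threshold_step(u, u0, rho0, Ix, Iy, lam, theta):
    """Point-wise thresholding of the TV-L1 data term (sketch).
    rho0 is the linearized residual at u0: I1(x + u0) - I0(x);
    Ix, Iy are the warped image gradients; lam and theta are the
    data weight and the coupling parameter."""
    grad_sq = Ix**2 + Iy**2 + 1e-12
    rho = rho0 + (u[..., 0] - u0[..., 0]) * Ix + (u[..., 1] - u0[..., 1]) * Iy

    v = u.copy()
    case1 = rho < -lam * theta * grad_sq           # large negative residual
    case2 = rho >  lam * theta * grad_sq           # large positive residual
    case3 = ~(case1 | case2)                       # small residual: exact step

    v[..., 0][case1] += lam * theta * Ix[case1]
    v[..., 1][case1] += lam * theta * Iy[case1]
    v[..., 0][case2] -= lam * theta * Ix[case2]
    v[..., 1][case2] -= lam * theta * Iy[case2]
    v[..., 0][case3] -= (rho[case3] / grad_sq[case3]) * Ix[case3]
    v[..., 1][case3] -= (rho[case3] / grad_sq[case3]) * Iy[case3]
    return v
```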

1,759 citations


Proceedings ArticleDOI
29 Jul 2007
TL;DR: In this article, seam carving is proposed for content-aware image resizing for both reduction and expansion; a seam is an optimal 8-connected path of pixels on a single image from top to bottom, or left to right, where optimality is defined by an image energy function.
Abstract: Effective resizing of images should not only use geometric constraints, but consider the image content as well. We present a simple image operator called seam carving that supports content-aware image resizing for both reduction and expansion. A seam is an optimal 8-connected path of pixels on a single image from top to bottom, or left to right, where optimality is defined by an image energy function. By repeatedly carving out or inserting seams in one direction we can change the aspect ratio of an image. By applying these operators in both directions we can retarget the image to a new size. The selection and order of seams protect the content of the image, as defined by the energy function. Seam carving can also be used for image content enhancement and object removal. We support various visual saliency measures for defining the energy of an image, and can also include user input to guide the process. By storing the order of seams in an image we create multi-size images, that are able to continuously change in real time to fit a given size.
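The optimal seam described above can be found with a simple dynamic program over a cumulative energy map. The sketch below assumes a precomputed energy image (e.g., gradient magnitude) and finds and removes one vertical seam; horizontal seams and seam insertion follow the same pattern.

```python
import numpy as np

def find_vertical_seam(energy):
    """Dynamic-programming search for the minimum-energy 8-connected
    vertical seam (one pixel per row)."""
    h, w = energy.shape
    cost = energy.astype(float).copy()
    back = np.zeros((h, w), dtype=int)
    for i in range(1, h):
        for j in range(w):
            lo, hi = max(0, j - 1), min(w, j + 2)
            k = np.argmin(cost[i - 1, lo:hi]) + lo   # best parent among 3 neighbours
            back[i, j] = k
            cost[i, j] += cost[i - 1, k]
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for i in range(h - 2, -1, -1):                   # trace the seam back up
        seam[i] = back[i + 1, seam[i + 1]]
    return seam

def remove_vertical_seam(img, seam):
    """Carve one seam out, reducing the width by one pixel."""
    h, w = img.shape[:2]
    mask = np.ones((h, w), dtype=bool)
    mask[np.arange(h), seam] = False
    if img.ndim == 3:
        return img[mask].reshape(h, w - 1, img.shape[2])
    return img[mask].reshape(h, w - 1)
```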

1,652 citations


Journal ArticleDOI
TL;DR: This paper adapts and expands kernel regression ideas for use in image denoising, upscaling, interpolation, fusion, and more, establishes key relationships with some popular existing methods, and shows how several of these algorithms are special cases of the proposed framework.
Abstract: In this paper, we make contact with the field of nonparametric statistics and present a development and generalization of tools and results for use in image processing and reconstruction. In particular, we adapt and expand kernel regression ideas for use in image denoising, upscaling, interpolation, fusion, and more. Furthermore, we establish key relationships with some popular existing methods and show how several of these algorithms, including the recently popularized bilateral filter, are special cases of the proposed framework. The resulting algorithms and analyses are amply illustrated with practical examples
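As a concrete illustration of the framework's simplest instance, the following sketch implements zeroth-order (Nadaraya-Watson) kernel regression with a joint spatial/radiometric Gaussian kernel, which reduces to the bilateral filter mentioned in the abstract. The kernel widths are illustrative assumptions, and the paper's data-adaptive (steering) kernels are not reproduced.

```python
import numpy as np

def kernel_regression_denoise(img, radius=3, h_s=2.0, h_r=0.1):
    """Zeroth-order kernel regression estimate at each pixel.
    Assumes a float grayscale image with values in [0, 1]; unoptimized."""
    H, W = img.shape
    pad = np.pad(img, radius, mode='reflect')
    out = np.zeros_like(img, dtype=float)

    # Spatial part of the kernel, shared by all pixels.
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    w_spatial = np.exp(-(xs**2 + ys**2) / (2 * h_s**2))

    for i in range(H):
        for j in range(W):
            patch = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # Radiometric part: penalize intensity differences (bilateral case).
            w_range = np.exp(-((patch - img[i, j])**2) / (2 * h_r**2))
            w = w_spatial * w_range
            out[i, j] = np.sum(w * patch) / np.sum(w)
    return out
```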

1,457 citations


Journal ArticleDOI
TL;DR: The proposed VSNR metric is generally competitive with current metrics of visual fidelity; it is efficient both in terms of its low computational complexity and in terms of its low memory requirements; and it operates based on physical luminances and visual angle (rather than on digital pixel values and pixel-based dimensions) to accommodate different viewing conditions.
Abstract: This paper presents an efficient metric for quantifying the visual fidelity of natural images based on near-threshold and suprathreshold properties of human vision. The proposed metric, the visual signal-to-noise ratio (VSNR), operates via a two-stage approach. In the first stage, contrast thresholds for detection of distortions in the presence of natural images are computed via wavelet-based models of visual masking and visual summation in order to determine whether the distortions in the distorted image are visible. If the distortions are below the threshold of detection, the distorted image is deemed to be of perfect visual fidelity (VSNR = ∞) and no further analysis is required. If the distortions are suprathreshold, a second stage is applied which operates based on the low-level visual property of perceived contrast, and the mid-level visual property of global precedence. These two properties are modeled as Euclidean distances in distortion-contrast space of a multiscale wavelet decomposition, and VSNR is computed based on a simple linear sum of these distances. The proposed VSNR metric is generally competitive with current metrics of visual fidelity; it is efficient both in terms of its low computational complexity and in terms of its low memory requirements; and it operates based on physical luminances and visual angle (rather than on digital pixel values and pixel-based dimensions) to accommodate different viewing conditions.

1,153 citations


Journal ArticleDOI
TL;DR: By incorporating local spatial and gray information together, a novel fast and robust FCM framework for image segmentation, i.e., fast generalized fuzzy c-means (FGFCM) clustering, is proposed; it mitigates the disadvantages of FCM_S while enhancing clustering performance.

1,021 citations


Proceedings ArticleDOI
29 Jul 2007
TL;DR: A new image completion algorithm powered by a huge database of photographs gathered from the Web requires no annotations or labelling by the user, generates a diverse set of results for each input image, and allows users to select among them.
Abstract: What can you do with a million images? In this paper we present a new image completion algorithm powered by a huge database of photographs gathered from the Web. The algorithm patches up holes in images by finding similar image regions in the database that are not only seamless but also semantically valid. Our chief insight is that while the space of images is effectively infinite, the space of semantically differentiable scenes is actually not that large. For many image completion tasks we are able to find similar scenes which contain image fragments that will convincingly complete the image. Our algorithm is entirely data-driven, requiring no annotations or labelling by the user. Unlike existing image completion methods, our algorithm can generate a diverse set of results for each input image and we allow users to select among them. We demonstrate the superiority of our algorithm over existing image completion approaches.

1,005 citations


Journal ArticleDOI
TL;DR: Enhanced image resolution and lower noise have been achieved, concurrently with the reduction of helical cone-beam artifacts, as demonstrated by phantom studies; clinical results illustrate the capabilities of the algorithm on real patient data.
Abstract: Multislice helical computed tomography scanning offers the advantages of faster acquisition and wide organ coverage for routine clinical diagnostic purposes. However, image reconstruction is faced with the challenges of three-dimensional cone-beam geometry, data completeness issues, and low dosage. Of all available reconstruction methods, statistical iterative reconstruction (IR) techniques appear particularly promising since they provide the flexibility of accurate physical noise modeling and geometric system description. In this paper, we present the application of Bayesian iterative algorithms to real 3D multislice helical data to demonstrate significant image quality improvement over conventional techniques. We also introduce a novel prior distribution designed to provide flexibility in its parameters to fine-tune image quality. Specifically, enhanced image resolution and lower noise have been achieved, concurrently with the reduction of helical cone-beam artifacts, as demonstrated by phantom studies. Clinical results also illustrate the capabilities of the algorithm on real patient data. Although computational load remains a significant challenge for practical development, superior image quality combined with advancements in computing technology make IR techniques a legitimate candidate for future clinical applications.

987 citations


Proceedings ArticleDOI
29 Jul 2007
TL;DR: This paper shows how to produce, by combining information extracted from both blurred and noisy images, a high-quality image that cannot be obtained by simply denoising the noisy image or deblurring the blurred image alone.
Abstract: Taking satisfactory photos under dim lighting conditions using a hand-held camera is challenging. If the camera is set to a long exposure time, the image is blurred due to camera shake. On the other hand, the image is dark and noisy if it is taken with a short exposure time but with a high camera gain. By combining information extracted from both blurred and noisy images, however, we show in this paper how to produce a high quality image that cannot be obtained by simply denoising the noisy image, or deblurring the blurred image alone. Our approach is image deblurring with the help of the noisy image. First, both images are used to estimate an accurate blur kernel, which otherwise is difficult to obtain from a single blurred image. Second, and again using both images, a residual deconvolution is proposed to significantly reduce ringing artifacts inherent to image deconvolution. Third, the remaining ringing artifacts in smooth image regions are further suppressed by a gain-controlled deconvolution process. We demonstrate the effectiveness of our approach using a number of indoor and outdoor images taken by off-the-shelf hand-held cameras in poor lighting environments.

Journal ArticleDOI
01 May 2007
TL;DR: This dynamic histogram equalization (DHE) technique takes control over the effect of traditional HE so that it enhances an image without losing any of its details.
Abstract: In this paper, a smart contrast enhancement technique based on the conventional histogram equalization (HE) algorithm is proposed. This dynamic histogram equalization (DHE) technique takes control over the effect of traditional HE so that it performs the enhancement of an image without any loss of details in it. DHE partitions the image histogram based on local minima and assigns specific gray level ranges for each partition before equalizing them separately. These partitions further go through a repartitioning test to ensure the absence of any dominating portions. This method outperforms other present approaches by enhancing the contrast well without introducing severe side effects, such as a washed-out appearance, checkerboard effects, or other undesirable artifacts.
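A rough sketch of the partition-then-equalize idea follows: the histogram is split at smoothed local minima and each partition is equalized into its own output range. The smoothing width, the proportional range allocation, and the omission of the repartitioning test are simplifying assumptions, not the paper's exact procedure.

```python
import numpy as np

def dynamic_hist_eq(img, smooth=5):
    """Sketch of DHE for an 8-bit grayscale image: split the histogram at
    (smoothed) local minima, then equalize each partition into its own
    gray-level range."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))

    # Light smoothing before searching for local minima.
    h = np.convolve(hist, np.ones(smooth) / smooth, mode='same')
    minima = [0] + [i for i in range(1, 255)
                    if h[i] < h[i - 1] and h[i] <= h[i + 1]] + [256]

    lut = np.arange(256, dtype=float)
    for lo, hi in zip(minima[:-1], minima[1:]):
        span = hist[lo:hi].astype(float)
        total = span.sum()
        if total == 0:
            continue  # empty partition: keep the identity mapping
        # Equalize within the partition; the output range here is the
        # partition's own input span (a simplifying assumption).
        cdf = np.cumsum(span) / total
        lut[lo:hi] = lo + cdf * (hi - 1 - lo)

    return lut[img.astype(int)].astype(img.dtype)
```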

Journal ArticleDOI
TL;DR: The use of the open-source software, CellProfiler, to automatically identify and measure a variety of biological objects in images is described, enabling biologists to comprehensively and quantitatively address many questions that previously would have required custom programming.
Abstract: Careful visual examination of biological samples is quite powerful, but many visual analysis tasks done in the laboratory are repetitive, tedious, and subjective. Here we describe the use of the open-source software, CellProfiler, to automatically identify and measure a variety of biological objects in images. The applications demonstrated here include yeast colony counting and classifying, cell microarray annotation, yeast patch assays, mouse tumor quantification, wound healing assays, and tissue topology measurement. The software automatically identifies objects in digital images, counts them, and records a full spectrum of measurements for each object, including location within the image, size, shape, color intensity, degree of correlation between colors, texture (smoothness), and number of neighbors. Small numbers of images can be processed automatically on a personal computer and hundreds of thousands can be analyzed using a computing cluster. This free, easy-to-use software enables biologists to comprehensively and quantitatively address many questions that previously would have required custom programming, thereby facilitating discovery in a variety of biological fields of study.

Journal ArticleDOI
TL;DR: It is demonstrated how a recently proposed measure of similarity, the normalized probabilistic Rand (NPR) index, can be used to perform a quantitative comparison between image segmentation algorithms using a hand-labeled set of ground-truth segmentations.
Abstract: Unsupervised image segmentation is an important component in many image understanding algorithms and practical vision systems. However, evaluation of segmentation algorithms thus far has been largely subjective, leaving a system designer to judge the effectiveness of a technique based only on intuition and results in the form of a few example segmented images. This is largely due to image segmentation being an ill-defined problem: there is no unique ground-truth segmentation of an image against which the output of an algorithm may be compared. This paper demonstrates how a recently proposed measure of similarity, the normalized probabilistic Rand (NPR) index, can be used to perform a quantitative comparison between image segmentation algorithms using a hand-labeled set of ground-truth segmentations. We show that the measure allows principled comparisons between segmentations created by different algorithms, as well as segmentations on different images. We outline a procedure for algorithm evaluation through an example evaluation of some familiar algorithms: the mean-shift-based algorithm, an efficient graph-based segmentation algorithm, a hybrid algorithm that combines the strengths of both methods, and expectation maximization. Results are presented on the 300 images in the publicly available Berkeley segmentation data set.
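For reference, the plain (unnormalized) Rand index between two label maps can be computed from their contingency table as below; the NPR index additionally averages over multiple ground-truth segmentations and normalizes by the expected index, steps omitted in this sketch.

```python
import numpy as np

def rand_index(seg_a, seg_b):
    """Fraction of pixel pairs on which two segmentations agree
    (same region in both, or different regions in both).
    Assumes non-negative integer label maps of equal shape."""
    a, b = seg_a.ravel(), seg_b.ravel()
    n = a.size

    # Contingency table between the two labelings.
    cont = np.zeros((a.max() + 1, b.max() + 1), dtype=np.int64)
    np.add.at(cont, (a, b), 1)

    pairs = n * (n - 1) / 2
    same_both = (np.sum(cont.astype(float) ** 2) - n) / 2          # same in A and B
    same_a = (np.sum(cont.sum(axis=1).astype(float) ** 2) - n) / 2  # same in A
    same_b = (np.sum(cont.sum(axis=0).astype(float) ** 2) - n) / 2  # same in B
    diff_both = pairs - same_a - same_b + same_both                  # different in both
    return (same_both + diff_both) / pairs
```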

Journal ArticleDOI
TL;DR: The proposed technique would also allow precise coregistration of images for the measurement of surface displacements due to ice-flow or geomorphic processes, or for any other change detection applications.
Abstract: We describe a procedure to accurately measure ground deformations from optical satellite images. Precise orthorectification is obtained owing to an optimized model of the imaging system, where look directions are linearly corrected to compensate for attitude drifts, and sensor orientation uncertainties are accounted for. We introduce a new computation of the inverse projection matrices for which a rigorous resampling is proposed. The irregular resampling problem is explicitly addressed to avoid introducing aliasing in the ortho-rectified images. Image registration and correlation is achieved with a new iterative unbiased processor that estimates the phase plane in the Fourier domain for subpixel shift detection. Without using supplementary data, raw images are wrapped onto the digital elevation model and coregistered with a 1/50 pixel accuracy. The procedure applies to images from any pushbroom imaging system. We analyze its performance using Satellite pour l'Observation de la Terre (SPOT) images in the case of a null test (no coseismic deformation) and in the case of large coseismic deformations due to the Mw 7.1 Hector Mine, California, earthquake of 1999. The proposed technique would also allow precise coregistration of images for the measurement of surface displacements due to ice-flow or geomorphic processes, or for any other change detection applications. A complete software package, the Coregistration of Optically Sensed Images and Correlation, is available for download from the Caltech Tectonics Observatory website
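Subpixel shift estimation between two coregistered patches is commonly done by correlation in the Fourier domain. The sketch below uses generic phase correlation with a parabolic peak refinement rather than the paper's iterative phase-plane estimator, so it should be read as an illustration of the idea only.

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate the translation (dy, dx) such that np.roll(b, (round(dy),
    round(dx)), axis=(0, 1)) approximately matches a. Integer part from the
    phase-correlation peak, subpixel part from a parabolic fit."""
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    R = A * np.conj(B)
    R /= np.abs(R) + 1e-12                  # normalized cross-power spectrum
    corr = np.fft.ifft2(R).real

    # Integer-pixel peak.
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)

    # Parabolic refinement through the peak and its wrap-around neighbours.
    def parabolic(cm, c0, cp):
        denom = cm - 2 * c0 + cp
        return 0.0 if denom == 0 else 0.5 * (cm - cp) / denom

    h, w = corr.shape
    sub_y = parabolic(corr[(dy - 1) % h, dx], corr[dy, dx], corr[(dy + 1) % h, dx])
    sub_x = parabolic(corr[dy, (dx - 1) % w], corr[dy, dx], corr[dy, (dx + 1) % w])

    shift_y, shift_x = dy + sub_y, dx + sub_x
    if shift_y > h / 2: shift_y -= h        # convert to a signed shift
    if shift_x > w / 2: shift_x -= w
    return shift_y, shift_x
```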

Journal ArticleDOI
TL;DR: This paper presents a new framework for the completion of missing information based on local structures that poses the task of completion as a global optimization problem with a well-defined objective function and derives a new algorithm to optimize it.
Abstract: This paper presents a new framework for the completion of missing information based on local structures. It poses the task of completion as a global optimization problem with a well-defined objective function and derives a new algorithm to optimize it. Missing values are constrained to form coherent structures with respect to reference examples. We apply this method to space-time completion of large space-time "holes" in video sequences of complex dynamic scenes. The missing portions are filled in by sampling spatio-temporal patches from the available parts of the video, while enforcing global spatio-temporal consistency between all patches in and around the hole. The consistent completion of static scene parts simultaneously with dynamic behaviors leads to realistic looking video sequences and images. Space-time video completion is useful for a variety of tasks, including, but not limited to: 1) sophisticated video removal (of undesired static or dynamic objects) by completing the appropriate static or dynamic background information. 2) Correction of missing/corrupted video frames in old movies. 3) Modifying a visual story by replacing unwanted elements. 4) Creation of video textures by extending smaller ones. 5) Creation of complete field-of-view stabilized video. 6) As images are one-frame videos, we apply the method to this special case as well

Journal ArticleDOI
TL;DR: This paper takes the first step towards constructing the surface layout, a labeling of the image into geometric classes, by learning appearance-based models of these geometric classes, which coarsely describe the 3D scene orientation of each image region.
Abstract: Humans have an amazing ability to instantly grasp the overall 3D structure of a scene--ground orientation, relative positions of major landmarks, etc.--even from a single image. This ability is completely missing in most popular recognition algorithms, which pretend that the world is flat and/or view it through a patch-sized peephole. Yet it seems very likely that having a grasp of this "surface layout" of a scene should be of great assistance for many tasks, including recognition, navigation, and novel view synthesis. In this paper, we take the first step towards constructing the surface layout, a labeling of the image into geometric classes. Our main insight is to learn appearance-based models of these geometric classes, which coarsely describe the 3D scene orientation of each image region. Our multiple segmentation framework provides robust spatial support, allowing a wide variety of cues (e.g., color, texture, and perspective) to contribute to the confidence in each geometric label. In experiments on a large set of outdoor images, we evaluate the impact of the individual cues and design choices in our algorithm. We further demonstrate the applicability of our method to indoor images, describe potential applications, and discuss extensions to a more complete notion of surface layout.

Journal ArticleDOI
TL;DR: A novel approach to image filtering based on the shape-adaptive discrete cosine transform is presented, addressing in particular image denoising and image deblocking and deringing from block-DCT compression; a special structural constraint in luminance-chrominance space is also proposed to enable accurate filtering of color images.
Abstract: The shape-adaptive discrete cosine transform (SA-DCT) can be computed on a support of arbitrary shape, but retains a computational complexity comparable to that of the usual separable block-DCT (B-DCT). Despite the near-optimal decorrelation and energy compaction properties, application of the SA-DCT has been rather limited, targeted nearly exclusively to video compression. In this paper, we present a novel approach to image filtering based on the SA-DCT. We use the SA-DCT in conjunction with the Anisotropic Local Polynomial Approximation-Intersection of Confidence Intervals technique, which defines the shape of the transform's support in a pointwise adaptive manner. The thresholded or attenuated SA-DCT coefficients are used to reconstruct a local estimate of the signal within the adaptive-shape support. Since supports corresponding to different points are in general overlapping, the local estimates are averaged together using adaptive weights that depend on the region's statistics. This approach can be used for various image-processing tasks. In this paper, we consider, in particular, image denoising and image deblocking and deringing from block-DCT compression. A special structural constraint in luminance-chrominance space is also proposed to enable an accurate filtering of color images. Simulation experiments show a state-of-the-art quality of the final estimate, both in terms of objective criteria and visual appearance. Thanks to the adaptive support, reconstructed edges are clean, and no unpleasant ringing artifacts are introduced by the fitted transform.

Journal ArticleDOI
TL;DR: An interscale orthonormal wavelet thresholding algorithm based on this new approach is described, and its near-optimal performance is demonstrated by comparing it with three state-of-the-art nonredundant denoising algorithms on a large set of test images.
Abstract: This paper introduces a new approach to orthonormal wavelet image denoising. Instead of postulating a statistical model for the wavelet coefficients, we directly parametrize the denoising process as a sum of elementary nonlinear processes with unknown weights. We then minimize an estimate of the mean square error between the clean image and the denoised one. The key point is that we have at our disposal a very accurate, statistically unbiased, MSE estimate-Stein's unbiased risk estimate-that depends on the noisy image alone, not on the clean one. Like the MSE, this estimate is quadratic in the unknown weights, and its minimization amounts to solving a linear system of equations. The existence of this a priori estimate makes it unnecessary to devise a specific statistical model for the wavelet coefficients. Instead, and contrary to the custom in the literature, these coefficients are not considered random any more. We describe an interscale orthonormal wavelet thresholding algorithm based on this new approach and show its near-optimal performance-both regarding quality and CPU requirement-by comparing it with the results of three state-of-the-art nonredundant denoising algorithms on a large set of test images. An interesting fallout of this study is the development of a new, group-delay-based, parent-child prediction in a wavelet dyadic tree
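The key computational point of the abstract, that minimizing Stein's unbiased risk estimate (SURE) over the weights of a linear combination of elementary processes reduces to a small linear system, can be shown in a few lines. The elementary processes chosen below (identity plus a few soft-thresholds of the noisy coefficients) are an illustrative assumption, not the paper's interscale point functions.

```python
import numpy as np

def sure_let_weights(y, sigma, thresholds=(1.0, 2.0, 3.0)):
    """Solve the linear system that minimizes SURE for a denoiser written
    as sum_k a_k * F_k(y). Assumes y = x + Gaussian noise of std sigma and
    that the chosen thresholds leave no F_k identically zero."""
    def soft(y, t):
        return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)

    # Elementary processes F_k(y) and their divergences sum_n dF_k/dy_n.
    F = [y] + [soft(y, t * sigma) for t in thresholds]
    div = [y.size] + [np.count_nonzero(np.abs(y) > t * sigma) for t in thresholds]

    K = len(F)
    M = np.array([[np.dot(F[k].ravel(), F[l].ravel()) for l in range(K)]
                  for k in range(K)])
    c = np.array([np.dot(F[k].ravel(), y.ravel()) - sigma**2 * div[k]
                  for k in range(K)])
    a = np.linalg.solve(M, c)          # SURE is quadratic in a -> linear system

    denoised = sum(a_k * F_k for a_k, F_k in zip(a, F))
    return denoised, a
```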

Journal ArticleDOI
TL;DR: In this paper, an algorithm to construct an optimized reference PSF image from a set of reference images is presented. This image is built as a linear combination of the available reference images, and the coefficients of the combination are optimized independently inside multiple subsections of the image to minimize the residual noise within each subsection.
Abstract: Direct imaging of exoplanets is limited by bright quasi-static speckles in the point-spread function (PSF) of the central star. This limitation can be reduced by subtraction of reference PSF images. We have developed an algorithm to construct an optimized reference PSF image from a set of reference images. This image is built as a linear combination of the reference images available, and the coefficients of the combination are optimized inside multiple subsections of the image independently to minimize the residual noise within each subsection. The algorithm developed can be used with many high-contrast imaging observing strategies relying on PSF subtraction, such as angular differential imaging (ADI), roll subtraction, spectral differential imaging, and reference star observations. The performance of the algorithm is demonstrated for ADI data. It is shown that for this type of data the new algorithm provides a gain in sensitivity by up to a factor of 3 at small separation over the algorithm previously used by Marois and colleagues.
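The core of the algorithm, a least-squares combination of reference images that minimizes the subtraction residual within one subsection, can be sketched as follows; the geometry of the subsections, reference-frame selection, and ADI-specific handling are omitted.

```python
import numpy as np

def loci_reference(target, refs):
    """Build an optimized reference PSF for one image subsection as the
    least-squares linear combination of reference subsections (sketch of
    the locally optimized combination-of-images idea)."""
    # refs: list of 2D reference subsections; target: the 2D science subsection.
    A = np.stack([r.ravel() for r in refs], axis=1)   # pixels x references
    b = target.ravel()
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)    # minimize ||A c - b||
    optimized = (A @ coeffs).reshape(target.shape)
    residual = target - optimized                     # speckle-subtracted subsection
    return optimized, residual, coeffs
```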

Journal ArticleDOI
TL;DR: A novel discriminative learning method over sets is proposed for set classification that maximizes the canonical correlations of within-class sets and minimizes the canonical correlations of between-class sets.
Abstract: We address the problem of comparing sets of images for object recognition, where the sets may represent variations in an object's appearance due to changing camera pose and lighting conditions. Canonical correlations (also known as principal or canonical angles), which can be thought of as the angles between two d-dimensional subspaces, have recently attracted attention for image set matching. Canonical correlations offer many benefits in accuracy, efficiency, and robustness compared to the two main classical methods: parametric distribution-based and nonparametric sample-based matching of sets. Here, this is first demonstrated experimentally for reasonably sized data sets using existing methods exploiting canonical correlations. Motivated by their proven effectiveness, a novel discriminative learning method over sets is proposed for set classification. Specifically, inspired by classical linear discriminant analysis (LDA), we develop a linear discriminant function that maximizes the canonical correlations of within-class sets and minimizes the canonical correlations of between-class sets. Image sets transformed by the discriminant function are then compared by the canonical correlations. The classical orthogonal subspace method (OSM) is also investigated for a similar purpose and compared with the proposed method. The proposed method is evaluated on various object recognition problems using face image sets with arbitrary motion captured under different illuminations and image sets of 500 general objects taken at different views. The method is also applied to object category recognition using the ETH-80 database. The proposed method is shown to outperform the state-of-the-art methods in terms of accuracy and efficiency.
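Canonical correlations between two image-set subspaces are simply the singular values of the product of their orthonormal bases. The sketch below uses a QR basis of the raw image matrices and an assumed subspace dimension d; the paper instead uses principal components and additionally learns a discriminative transform, which is not reproduced here.

```python
import numpy as np

def canonical_correlations(X1, X2, d=10):
    """Canonical correlations (cosines of principal angles) between the
    subspaces spanned by two image sets. Columns of X1 and X2 are
    vectorized images; d <= number of images in each set is assumed."""
    # Orthonormal bases of the two spans (the first d orthonormalized
    # images stand in for a PCA basis in this sketch).
    Q1, _ = np.linalg.qr(X1)
    Q2, _ = np.linalg.qr(X2)
    Q1, Q2 = Q1[:, :d], Q2[:, :d]

    # Singular values of Q1^T Q2 are the canonical correlations.
    corr = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    return np.clip(corr, 0.0, 1.0)
```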

Journal ArticleDOI
TL;DR: This paper presents a multi-cue vision system for the real-time detection and tracking of pedestrians from a moving vehicle; results from extensive field tests in difficult urban traffic conditions suggest that system performance is at the leading edge.
Abstract: This paper presents a multi-cue vision system for the real-time detection and tracking of pedestrians from a moving vehicle. The detection component involves a cascade of modules, each utilizing complementary visual criteria to successively narrow down the image search space, balancing robustness and efficiency considerations. Novel is the tight integration of the consecutive modules: (sparse) stereo-based ROI generation, shape-based detection, texture-based classification and (dense) stereo-based verification. For example, shape-based detection activates a weighted combination of texture-based classifiers, each attuned to a particular body pose. Performance of individual modules and their interaction is analyzed by means of Receiver Operator Characteristics (ROCs). A sequential optimization technique allows the successive combination of individual ROCs, providing optimized system parameter settings in a systematic fashion, avoiding ad-hoc parameter tuning. Application-dependent processing constraints can be incorporated in the optimization procedure. Results from extensive field tests in difficult urban traffic conditions suggest system performance is at the leading edge.

Journal ArticleDOI
TL;DR: An active near infrared (NIR) imaging system is presented that is able to produce face images of good condition regardless of visible lights in the environment, and it is shown that the resulting face images encode intrinsic information of the face, subject only to a monotonic transform in the gray tone.
Abstract: Most current face recognition systems are designed for indoor, cooperative-user applications. However, even in thus-constrained applications, most existing systems, academic and commercial, are compromised in accuracy by changes in environmental illumination. In this paper, we present a novel solution for illumination invariant face recognition for indoor, cooperative-user applications. First, we present an active near infrared (NIR) imaging system that is able to produce face images of good condition regardless of visible lights in the environment. Second, we show that the resulting face images encode intrinsic information of the face, subject only to a monotonic transform in the gray tone; based on this, we use local binary pattern (LBP) features to compensate for the monotonic transform, thus deriving an illumination invariant face representation. Then, we present methods for face recognition using NIR images; statistical learning algorithms are used to extract most discriminative features from a large pool of invariant LBP features and construct a highly accurate face matching engine. Finally, we present a system that is able to achieve accurate and fast face recognition in practice, in which a method is provided to deal with specular reflections of active NIR lights on eyeglasses, a critical issue in active NIR image-based face recognition. Extensive, comparative results are provided to evaluate the imaging hardware, the face and eye detection algorithms, and the face recognition algorithms and systems, with respect to various factors, including illumination, eyeglasses, time lapse, and ethnic groups
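The local binary pattern features referred to above can be computed with a few array operations. The basic 8-neighbour, radius-1 operator is sketched below; the paper's feature selection and matching engine are beyond this illustration.

```python
import numpy as np

def lbp_8_1(img):
    """Basic 8-neighbour local binary pattern code for each interior pixel:
    threshold the neighbourhood at the centre value and read the bits as a
    byte. LBP is invariant to any monotonic gray-tone transform, which is
    why it suits the NIR images described above."""
    img = img.astype(np.int32)
    H, W = img.shape
    c = img[1:-1, 1:-1]
    # Neighbours in a fixed clockwise order starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:H - 1 + dy, 1 + dx:W - 1 + dx]
        code |= ((nb >= c).astype(np.int32) << bit)
    return code
```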

Proceedings ArticleDOI
29 Jul 2007
TL;DR: A new data structure, the bilateral grid, enables fast edge-aware image processing; the algorithms are parallelized on modern GPUs to achieve real-time frame rates on high-definition video.
Abstract: We present a new data structure, the bilateral grid, that enables fast edge-aware image processing. By working in the bilateral grid, algorithms such as bilateral filtering, edge-aware painting, and local histogram equalization become simple manipulations that are both local and independent. We parallelize our algorithms on modern GPUs to achieve real-time frame rates on high-definition video. We demonstrate our method on a variety of applications such as image editing, transfer of photographic look, and contrast enhancement of medical images.
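A minimal grayscale version of the splat-blur-slice pipeline is sketched below to make the data structure concrete. Grid resolution, the Gaussian blur width, and the restriction to single-channel images in [0, 1] are illustrative assumptions; the paper's GPU parallelization is not shown.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def bilateral_grid_filter(img, sigma_s=16, sigma_r=0.1):
    """Bilateral filtering via a coarse (x, y, intensity) grid:
    splat pixels, blur the grid, then slice it back at each pixel."""
    H, W = img.shape
    gh = int(np.ceil(H / sigma_s)) + 3
    gw = int(np.ceil(W / sigma_s)) + 3
    gr = int(np.ceil(1.0 / sigma_r)) + 3

    data = np.zeros((gh, gw, gr))     # accumulated intensity
    weight = np.zeros((gh, gw, gr))   # homogeneous weight (pixel count)

    yy, xx = np.mgrid[0:H, 0:W]
    gy = np.round(yy / sigma_s).astype(int) + 1
    gx = np.round(xx / sigma_s).astype(int) + 1
    gz = np.round(img / sigma_r).astype(int) + 1

    np.add.at(data, (gy, gx, gz), img)       # splat
    np.add.at(weight, (gy, gx, gz), 1.0)

    # Low-pass filter the 3D grid; this is where the edge-aware smoothing happens.
    data = gaussian_filter(data, sigma=1.0)
    weight = gaussian_filter(weight, sigma=1.0)

    # Slice: trilinear interpolation of the blurred grid at each pixel.
    coords = np.stack([yy / sigma_s + 1, xx / sigma_s + 1, img / sigma_r + 1])
    num = map_coordinates(data, coords, order=1)
    den = map_coordinates(weight, coords, order=1)
    return np.where(den > 1e-8, num / den, img)
```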

Proceedings ArticleDOI
26 Dec 2007
TL;DR: An interactive framework for soft segmentation and matting of natural images and videos is presented, based on the optimal, linear time, computation of weighted geodesic distances to the user-provided scribbles, from which the whole data is automatically segmented.
Abstract: An interactive framework for soft segmentation and matting of natural images and videos is presented in this paper. The proposed technique is based on the optimal, linear time, computation of weighted geodesic distances to the user-provided scribbles, from which the whole data is automatically segmented. The weights are based on spatial and/or temporal gradients, without explicit optical flow or any advanced and often computationally expensive feature detectors. These could be naturally added to the proposed framework as well if desired, in the form of weights in the geodesic distances. A localized refinement step follows this fast segmentation in order to accurately compute the corresponding matte function. Additional constraints into the distance definition permit to efficiently handle occlusions such as people or objects crossing each other in a video sequence. The presentation of the framework is complemented with numerous and diverse examples, including extraction of moving foreground from dynamic background, and comparisons with the recent literature.

Journal ArticleDOI
TL;DR: Image fusion is the process of combining information from two or more images of a scene into a single composite image that is more informative and is more suitable for visual perception or computer processing.

Journal ArticleDOI
TL;DR: This work addresses the problem of detecting irregularities in visual data, e.g., detecting suspicious behaviors in video sequences, or identifying salient patterns in images, using a probabilistic graphical model.
Abstract: We address the problem of detecting irregularities in visual data, e.g., detecting suspicious behaviors in video sequences, or identifying salient patterns in images. The term "irregular" depends on the context in which the "regular" or "valid" are defined. Yet, it is not realistic to expect explicit definition of all possible valid configurations for a given context. We pose the problem of determining the validity of visual data as a process of constructing a puzzle: We try to compose a new observed image region or a new video segment ("the query") using chunks of data ("pieces of puzzle") extracted from previous visual examples ("the database"). Regions in the observed data which can be composed using large contiguous chunks of data from the database are considered very likely, whereas regions in the observed data which cannot be composed from the database (or can be composed, but only using small fragmented pieces) are regarded as unlikely/suspicious. The problem is posed as an inference process in a probabilistic graphical model. We show applications of this approach to identifying saliency in images and video, for detecting suspicious behaviors and for automatic visual inspection for quality assurance.

Journal ArticleDOI
TL;DR: Bsoft is a software package for image processing of electron micrographs, interpretation of reconstructions, molecular modeling, and general image processing; it supports shell scripting of processes and allows subtasks to be distributed across multiple computers for concurrent processing.

Journal ArticleDOI
TL;DR: Experiments show that sum-modified-Laplacian (SML) can provide better performance than other focus measures, when the execution time is not included in the evaluation.

Journal ArticleDOI
TL;DR: Non-Doppler, 2-dimensional (2D) strain imaging is a new echocardiographic technique for obtaining strain and strain rate measurements; it analyzes motion by tracking speckles in the ultrasonic image in two dimensions, enabling rapid and accurate assessment of global and segmental myocardial function.
Abstract: During the past several years, strain and strain rate imaging have emerged as a quantitative technique to accurately estimate myocardial function and contractility. Non-Doppler, 2-dimensional (2D) strain imaging is a new echocardiographic technique for obtaining strain and strain rate measurements. It analyzes motion by tracking speckles in the ultrasonic image in two dimensions. Currently available software allows spatial and temporal image processing with recognition and selection of such elements on the ultrasound image. The geometric shift of each speckle represents local tissue movement. By tracking these speckles, 2D tissue velocity, strain, and strain rate can be calculated. Non-Doppler 2D strain imaging is simple to perform. It requires only one cardiac cycle to be acquired; further processing and interpretation can be done after image data acquisition. Because it is not based on tissue Doppler measurements, it is angle independent. Data regarding accuracy, validity, and clinical application of non-Doppler 2D strain imaging are rapidly accumulating. This technique may prove to be of significant clinical value, enabling rapid and accurate assessment of global and segmental myocardial function.
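Speckle tracking of the kind described above is, at its core, block matching between consecutive frames. The following generic sum-of-squared-differences sketch tracks one block; differentiating the resulting displacement field in space yields strain. It illustrates the principle only and is not the commercial software's algorithm.

```python
import numpy as np

def track_block(frame0, frame1, y, x, bsize=16, search=8):
    """Track one speckle block between two consecutive frames by exhaustive
    block matching (sum of squared differences within a search window)."""
    ref = frame0[y:y + bsize, x:x + bsize].astype(float)
    H, W = frame1.shape
    best, best_dy, best_dx = np.inf, 0, 0
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + bsize > H or xx + bsize > W:
                continue  # candidate block would leave the frame
            cand = frame1[yy:yy + bsize, xx:xx + bsize].astype(float)
            ssd = np.sum((cand - ref) ** 2)
            if ssd < best:
                best, best_dy, best_dx = ssd, dy, dx
    return best_dy, best_dx   # local tissue displacement of this speckle block
```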