Showing papers on "Image processing" published in 2013


Journal ArticleDOI
TL;DR: This work has recently derived a blind IQA model that only makes use of measurable deviations from statistical regularities observed in natural images, without training on human-rated distorted images, and, indeed, without any exposure to distorted images.
Abstract: An important aim of research on the blind image quality assessment (IQA) problem is to devise perceptual models that can predict the quality of distorted images with as little prior knowledge of the images or their distortions as possible. Current state-of-the-art “general purpose” no reference (NR) IQA algorithms require knowledge about anticipated distortions in the form of training examples and corresponding human opinion scores. However, we have recently derived a blind IQA model that only makes use of measurable deviations from statistical regularities observed in natural images, without training on human-rated distorted images, and, indeed, without any exposure to distorted images. Thus, it is “completely blind.” The new IQA model, which we call the Natural Image Quality Evaluator (NIQE), is based on the construction of a “quality aware” collection of statistical features based on a simple and successful space domain natural scene statistic (NSS) model. These features are derived from a corpus of natural, undistorted images. Experimental results show that the new index delivers performance comparable to top performing NR IQA models that require training on large databases of human opinions of distorted images. A software release is available at http://live.ece.utexas.edu/research/quality/niqe_release.zip.

3,722 citations
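
Illustrative sketch (not part of the listing): spatial-domain NSS models of the kind the abstract mentions are commonly built on mean-subtracted, contrast-normalized (MSCN) coefficients. The code below is a minimal sketch of that normalization only. The box-shaped window (instead of a Gaussian one), the `radius` and `c` values, and the function names are assumptions made for illustration, and the feature fitting that follows is not shown.

```python
import numpy as np

def local_mean(img, radius=3):
    """Mean over a (2*radius+1)^2 box window, with edge padding."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    # Integral image: ii[i, j] holds the sum of padded[:i, :j].
    ii = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    window_sum = (ii[k:k + h, k:k + w] - ii[:h, k:k + w]
                  - ii[k:k + h, :w] + ii[:h, :w])
    return window_sum / (k * k)

def mscn(img, radius=3, c=1.0):
    """Subtract the local mean and divide by the local standard deviation."""
    img = np.asarray(img, dtype=np.float64)
    mu = local_mean(img, radius)
    var = local_mean(img * img, radius) - mu * mu
    sigma = np.sqrt(np.maximum(var, 0.0))  # clamp tiny negative round-off
    return (img - mu) / (sigma + c)        # c keeps flat regions finite
```

On a perfectly flat image every coefficient is zero, which is why deviations in the distribution of these values can serve as a quality cue.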


Journal ArticleDOI
TL;DR: This paper attempts to give an overview of deformable registration methods, putting emphasis on the most recent advances in the domain, and provides an extensive account of registration techniques in a systematic manner.
Abstract: Deformable image registration is a fundamental task in medical image processing. Among its most important applications, one may cite: 1) multi-modality fusion, where information acquired by different imaging devices or protocols is fused to facilitate diagnosis and treatment planning; 2) longitudinal studies, where temporal structural or anatomical changes are investigated; and 3) population modeling and statistical atlases used to study normal anatomical variability. In this paper, we attempt to give an overview of deformable registration methods, putting emphasis on the most recent advances in the domain. Additional emphasis has been given to techniques applied to medical images. In order to study image registration methods in depth, their main components are identified and studied independently. The most recent techniques are presented in a systematic fashion. The contribution of this paper is to provide an extensive account of registration techniques in a systematic manner.

1,434 citations


Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed method can obtain state-of-the-art performance for fusion of multispectral, multifocus, multimodal, and multiexposure images.
Abstract: A fast and effective image fusion method is proposed for creating a highly informative fused image through merging multiple images. The proposed method is based on a two-scale decomposition of an image into a base layer containing large scale variations in intensity, and a detail layer capturing small scale details. A novel guided filtering-based weighted average technique is proposed to make full use of spatial consistency for fusion of the base and detail layers. Experimental results demonstrate that the proposed method can obtain state-of-the-art performance for fusion of multispectral, multifocus, multimodal, and multiexposure images.

1,300 citations
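
Illustrative sketch (not part of the listing): the two-scale decomposition described in the abstract can be pictured as a smoothing filter that yields the base layer, with the detail layer as the residual. The sketch below assumes a plain box filter and takes the per-pixel weight maps as given inputs; the guided-filtering step that the paper uses to refine those weights is not reproduced, and all names are hypothetical.

```python
import numpy as np

def box_mean(img, radius):
    """Average over a (2*radius+1)^2 window with edge padding."""
    k = 2 * radius + 1
    p = np.pad(img, radius, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + h, dx:dx + w]
    return out / (k * k)

def two_scale(img, radius=15):
    """Split an image into a smooth base layer and a residual detail layer."""
    img = np.asarray(img, dtype=np.float64)
    base = box_mean(img, radius)
    return base, img - base

def fuse(images, base_weights, detail_weights, radius=15):
    """Weighted average of base and detail layers (weights assumed to sum to 1 per pixel)."""
    fused_base = 0.0
    fused_detail = 0.0
    for img, wb, wd in zip(images, base_weights, detail_weights):
        base, detail = two_scale(img, radius)
        fused_base = fused_base + wb * base
        fused_detail = fused_detail + wd * detail
    return fused_base + fused_detail
```

Keeping separate weights for the two layers lets smooth regions and fine details be blended differently, which is the reason for decomposing in the first place.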


Book
26 Mar 2013
TL;DR: The updated 2nd edition of this book presents a variety of image analysis applications, reviews their precise mathematics and shows how to discretize them, and provides programming tools for creating simulations with minimal effort.
Abstract: The updated 2nd edition of this book presents a variety of image analysis applications, reviews their precise mathematics and shows how to discretize them. For the mathematical community, the book shows the contribution of mathematics to this domain, and highlights unsolved theoretical questions. For the computer vision community, it presents a clear, self-contained and global overview of the mathematics involved in image processing problems. The second edition offers a review of progress in image processing applications covered by the PDE framework, and updates the existing material. The book also provides programming tools for creating simulations with minimal effort.

1,279 citations


Journal ArticleDOI
Masroor Hussain, Dongmei Chen, Angela Cheng, Hui Wei, David Stanley
TL;DR: This paper begins with a discussion of the traditionally pixel-based and (mostly) statistics-oriented change detection techniques which focus mainly on the spectral values and mostly ignore the spatial context, followed by a review of object-based change detection techniques.
Abstract: The appetite for up-to-date information about earth’s surface is ever increasing, as such information provides a base for a large number of applications, including local, regional and global resources monitoring, land-cover and land-use change monitoring, and environmental studies. The data from remote sensing satellites provide opportunities to acquire information about land at varying resolutions and have been widely used for change detection studies. A large number of change detection methodologies and techniques, utilizing remotely sensed data, have been developed, and newer techniques are still emerging. This paper begins with a discussion of the traditionally pixel-based and (mostly) statistics-oriented change detection techniques which focus mainly on the spectral values and mostly ignore the spatial context. This is succeeded by a review of object-based change detection techniques. Finally there is a brief discussion of spatial data mining techniques in image processing and change detection from remote sensing data. The merits and issues of different techniques are compared. The importance of the exponential increase in the image data volume and multiple sensors and associated challenges on the development of change detection techniques are highlighted. With the wide use of very-high-resolution (VHR) remotely sensed images, object-based methods and data mining techniques may have more potential in change detection.

1,159 citations


Proceedings ArticleDOI
16 Jun 2013
TL;DR: A systematic model of the tradeoff space fundamental to stencil pipelines, a schedule representation which describes concrete points in this space for each stage in an image processing pipeline, and an optimizing compiler for the Halide image processing language that synthesizes high performance implementations from a Halide algorithm and a schedule are presented.
Abstract: Image processing pipelines combine the challenges of stencil computations and stream programs. They are composed of large graphs of different stencil stages, as well as complex reductions, and stages with global or data-dependent access patterns. Because of their complex structure, the performance difference between a naive implementation of a pipeline and an optimized one is often an order of magnitude. Efficient implementations require optimization of both parallelism and locality, but due to the nature of stencils, there is a fundamental tension between parallelism, locality, and introducing redundant recomputation of shared values. We present a systematic model of the tradeoff space fundamental to stencil pipelines, a schedule representation which describes concrete points in this space for each stage in an image processing pipeline, and an optimizing compiler for the Halide image processing language that synthesizes high performance implementations from a Halide algorithm and a schedule. Combining this compiler with stochastic search over the space of schedules enables terse, composable programs to achieve state-of-the-art performance on a wide range of real image processing pipelines, and across different hardware architectures, including multicores with SIMD, and heterogeneous CPU+GPU execution. From simple Halide programs written in a few hours, we demonstrate performance up to 5x faster than hand-tuned C, intrinsics, and CUDA implementations optimized by experts over weeks or months, for image processing applications beyond the reach of past automatic compilers.

1,074 citations


Proceedings ArticleDOI
01 Dec 2013
TL;DR: An efficient regularization method is proposed that removes haze from a single input image and can restore a high-quality haze-free image with faithful colors and fine image details.
Abstract: Images captured in foggy weather conditions often suffer from bad visibility. In this paper, we propose an efficient regularization method to remove hazes from a single input image. Our method benefits much from an exploration on the inherent boundary constraint on the transmission function. This constraint, combined with a weighted L1-norm based contextual regularization, is modeled into an optimization problem to estimate the unknown scene transmission. A quite efficient algorithm based on variable splitting is also presented to solve the problem. The proposed method requires only a few general assumptions and can restore a high-quality haze-free image with faithful colors and fine image details. Experimental results on a variety of haze images demonstrate the effectiveness and efficiency of the proposed method.

923 citations
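
Illustrative sketch (not part of the listing): the "transmission function" in the abstract usually refers to the standard haze imaging model I = J·t + A·(1 − t), where I is the observed image, J the haze-free scene, t the transmission and A the global airlight. The abstract does not state this model, so it is an assumption here. The sketch shows only the final recovery step once a transmission estimate is available; the boundary constraint and the weighted L1 regularization used to estimate t are not reproduced. Function names and the floor `t_min` are hypothetical.

```python
import numpy as np

def add_haze(scene, transmission, airlight):
    """Forward model: I = J * t + A * (1 - t), for a grayscale image."""
    return scene * transmission + airlight * (1.0 - transmission)

def recover_scene(hazy, transmission, airlight, t_min=0.1):
    """Invert the model; t is floored at t_min to avoid amplifying noise."""
    t = np.maximum(transmission, t_min)
    return (hazy - airlight) / t + airlight
```

When the transmission is known exactly and stays above `t_min`, the recovery inverts the forward model exactly; in practice the quality of the result depends on how well t is estimated.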


Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed enhancement algorithm can not only enhance the details but also preserve the naturalness for non-uniform illumination images.
Abstract: Image enhancement plays an important role in image processing and analysis. Among various enhancement algorithms, Retinex-based algorithms can efficiently enhance details and have been widely adopted. Since Retinex-based algorithms regard illumination removal as a default preference and fail to limit the range of reflectance, the naturalness of non-uniform illumination images cannot be effectively preserved. However, naturalness is essential for image enhancement to achieve pleasing perceptual quality. In order to preserve naturalness while enhancing details, we propose an enhancement algorithm for non-uniform illumination images. In general, this paper makes the following three major contributions. First, a lightness-order-error measure is proposed to assess naturalness preservation objectively. Second, a bright-pass filter is proposed to decompose an image into reflectance and illumination, which, respectively, determine the details and the naturalness of the image. Third, we propose a bi-log transformation, which is utilized to map the illumination to make a balance between details and naturalness. Experimental results demonstrate that the proposed algorithm can not only enhance the details but also preserve the naturalness for non-uniform illumination images.

918 citations


Proceedings ArticleDOI
23 Jun 2013
TL;DR: This work addresses the problem of inferring the pose of an RGB-D camera relative to a known 3D scene, given only a single acquired image, and employs a regression forest that is capable of inferring an estimate of each pixel's correspondence to 3D points in the scene's world coordinate frame.
Abstract: We address the problem of inferring the pose of an RGB-D camera relative to a known 3D scene, given only a single acquired image. Our approach employs a regression forest that is capable of inferring an estimate of each pixel's correspondence to 3D points in the scene's world coordinate frame. The forest uses only simple depth and RGB pixel comparison features, and does not require the computation of feature descriptors. The forest is trained to be capable of predicting correspondences at any pixel, so no interest point detectors are required. The camera pose is inferred using a robust optimization scheme. This starts with an initial set of hypothesized camera poses, constructed by applying the forest at a small fraction of image pixels. Preemptive RANSAC then iterates sampling more pixels at which to evaluate the forest, counting inliers, and refining the hypothesized poses. We evaluate on several varied scenes captured with an RGB-D camera and observe that the proposed technique achieves highly accurate relocalization and substantially out-performs two state of the art baselines.

796 citations


Journal ArticleDOI
TL;DR: An automatic transformation technique that improves the brightness of dimmed images via the gamma correction and probability distribution of luminance pixels and uses temporal information regarding the differences between each frame to reduce computational complexity is presented.
Abstract: This paper proposes an efficient method to modify histograms and enhance contrast in digital images. Enhancement plays a significant role in digital image processing, computer vision, and pattern recognition. We present an automatic transformation technique that improves the brightness of dimmed images via the gamma correction and probability distribution of luminance pixels. To enhance video, the proposed image-enhancement method uses temporal information regarding the differences between each frame to reduce computational complexity. Experimental results demonstrate that the proposed method produces enhanced images of comparable or higher quality than those produced using previous state-of-the-art methods.

795 citations
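
Illustrative sketch (not part of the listing): one simple way to combine gamma correction with the probability distribution of luminance values, as the abstract describes, is to let the exponent at each intensity level depend on the cumulative distribution of the image. The rule used below (gamma = 1 − cdf) is an assumption chosen for illustration; the paper's actual weighting of the distribution is not reproduced, and the function name is hypothetical.

```python
import numpy as np

def cdf_gamma_correct(img):
    """Brighten an 8-bit image with a per-level gamma derived from its CDF."""
    img = np.asarray(img, dtype=np.uint8)
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    cdf = np.cumsum(hist) / hist.sum()
    levels = np.arange(256) / 255.0
    # Exponent lies in [0, 1], so each level is mapped to itself or brighter.
    lut = 255.0 * levels ** (1.0 - cdf)
    lut[0] = 0.0  # keep pure black at black (0 ** 0 would otherwise give 1)
    return np.round(lut).astype(np.uint8)[img]
```

Because the mapping is a 256-entry lookup table, applying it costs one table lookup per pixel regardless of image content.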


Journal ArticleDOI
TL;DR: Applying this procedure to cryoEM images of beta-galactosidase shows how overfitting varies greatly depending on the procedure, but in the best case shows no overfitting and a resolution of ~6 Å.

Proceedings ArticleDOI
03 Jul 2013
TL;DR: The PH2 database includes the manual segmentation, the clinical diagnosis, and the identification of several dermoscopic structures, performed by expert dermatologists, in a set of 200 dermoscopic images.
Abstract: The increasing incidence of melanoma has recently promoted the development of computer-aided diagnosis systems for the classification of dermoscopic images. Unfortunately, the performance of such systems cannot be compared since they are evaluated in different sets of images by their authors and there are no public databases available to perform a fair evaluation of multiple systems. In this paper, a dermoscopic image database, called PH2, is presented. The PH2 database includes the manual segmentation, the clinical diagnosis, and the identification of several dermoscopic structures, performed by expert dermatologists, in a set of 200 dermoscopic images. The PH2 database will be made freely available for research and benchmarking purposes.

Journal ArticleDOI
TL;DR: This paper is the first to demonstrate the utility and effectiveness of a fusion-based technique for dehazing based on a single degraded image, deriving two inputs from the original hazy image by applying a white balance and a contrast enhancing procedure.
Abstract: Haze is an atmospheric phenomenon that significantly degrades the visibility of outdoor scenes. This is mainly due to the atmospheric particles that absorb and scatter the light. This paper introduces a novel single image approach that enhances the visibility of such degraded images. Our method is a fusion-based strategy that derives two inputs from the original hazy image by applying a white balance and a contrast enhancing procedure. To blend effectively the information of the derived inputs to preserve the regions with good visibility, we filter their important features by computing three measures (weight maps): luminance, chromaticity, and saliency. To minimize artifacts introduced by the weight maps, our approach is designed in a multiscale fashion, using a Laplacian pyramid representation. We are the first to demonstrate the utility and effectiveness of a fusion-based technique for dehazing based on a single degraded image. The method performs in a per-pixel fashion, which is straightforward to implement. The experimental results demonstrate that the method yields results comparable to and even better than the more complex state-of-the-art techniques, having the advantage of being appropriate for real-time applications.

Journal ArticleDOI
Zaixu Cui, Suyu Zhong, Pengfei Xu, Yong He, Gaolang Gong
TL;DR: A MATLAB toolbox named “Pipeline for Analyzing braiN Diffusion imAges” (PANDA) is developed, expected to substantially simplify the image processing of dMRI datasets and facilitate human structural connectome studies.
Abstract: Diffusion magnetic resonance imaging (dMRI) is widely used in both scientific research and clinical practice in in-vivo studies of the human brain. While a number of post-processing packages have been developed, fully automated processing of dMRI datasets remains challenging. Here, we developed a MATLAB toolbox named “Pipeline for Analyzing braiN Diffusion imAges” (PANDA) for fully automated processing of brain diffusion images. The processing modules of a few established packages, including FMRIB Software Library (FSL), Pipeline System for Octave and Matlab (PSOM), Diffusion Toolkit and MRIcron, were employed in PANDA. Using any number of raw dMRI datasets from different subjects, in either DICOM or NIfTI format, PANDA can automatically perform a series of steps to process DICOM/NIfTI to diffusion metrics (e.g., FA and MD) that are ready for statistical analysis at the voxel-level, the atlas-level and the Tract-Based Spatial Statistics (TBSS)-level and can finish the construction of anatomical brain networks for all subjects. In particular, PANDA can process different subjects in parallel, using multiple cores either in a single computer or in a distributed computing environment, thus greatly reducing the time cost when dealing with a large number of datasets. In addition, PANDA has a friendly graphical user interface (GUI), allowing the user to be interactive and to adjust the input/output settings, as well as the processing parameters. As an open-source package, PANDA is freely available at http://www.nitrc.org/projects/panda/. This novel toolbox is expected to substantially simplify the image processing of dMRI datasets and facilitate human structural connectome studies.

Journal ArticleDOI
TL;DR: The goal is to explore and record the variability of the computed effective properties as a function of using different tools and workflows, and benchmarking is the topic of the two present companion papers.

Journal ArticleDOI
TL;DR: Performance comparisons with FRQI reveal that NEQR can achieve a quadratic speedup in quantum image preparation, increase the compression ratio of quantum images by approximately 1.5X, and retrieve digital images from quantum images accurately.
Abstract: Quantum computation is becoming an important and effective tool to overcome the high real-time computational requirements of classical digital image processing. In this paper, based on analysis of existing quantum image representations, a novel enhanced quantum representation (NEQR) for digital images is proposed, which improves the latest flexible representation of quantum images (FRQI). The newly proposed quantum image representation uses the basis state of a qubit sequence to store the gray-scale value of each pixel in the image for the first time, instead of the probability amplitude of a qubit, as in FRQI. Because different basis states of qubit sequence are orthogonal, different gray scales in the NEQR quantum image can be distinguished. Performance comparisons with FRQI reveal that NEQR can achieve a quadratic speedup in quantum image preparation, increase the compression ratio of quantum images by approximately 1.5X, and retrieve digital images from quantum images accurately. Meanwhile, more quantum image operations related to gray-scale information in the image can be performed conveniently based on NEQR, for example partial color operations and statistical color operations. Therefore, the proposed NEQR quantum image model is more flexible and better suited for quantum image representation than other models in the literature.
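
Illustrative sketch (not part of the listing): the abstract says NEQR stores each pixel's gray value in the basis state of a qubit sequence rather than in an amplitude. A classical way to picture this is to list, for a 2^n × 2^n image with q-bit gray levels, one bit string per pixel made of the gray value followed by the row and column indices, all sharing the same amplitude 1/2^n. The code below only enumerates those labels classically; it does not simulate a quantum state, and the function names and bit ordering are assumptions.

```python
def neqr_basis_states(image, q=8):
    """Return the shared amplitude and one basis-state label per pixel."""
    size = len(image)
    n = size.bit_length() - 1
    # Require a square image whose side is a power of two (at least 2).
    assert size >= 2 and size == 1 << n
    assert all(len(row) == size for row in image)
    states = []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            assert 0 <= value < (1 << q)
            # |gray value> |row> |column>, written as q + 2n bits.
            states.append(format(value, f"0{q}b")
                          + format(y, f"0{n}b")
                          + format(x, f"0{n}b"))
    return 1.0 / size, states

def neqr_decode(states, q, n):
    """Read the gray values back from the basis-state labels."""
    size = 1 << n
    image = [[0] * size for _ in range(size)]
    for s in states:
        value = int(s[:q], 2)
        y = int(s[q:q + n], 2)
        x = int(s[q + n:], 2)
        image[y][x] = value
    return image
```

Because every pixel maps to a distinct bit string, the labels are mutually distinguishable, which mirrors the abstract's point that orthogonal basis states allow gray levels to be retrieved accurately.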

Journal ArticleDOI
TL;DR: The accelerated registration tool elastix is employed in a study on diagnostic classification of Alzheimer's disease and cognitively normal controls based on T1-weighted MRI and has nearly identical results to the non-optimized version.
Abstract: Nonrigid image registration is an important, but time-consuming task in medical image analysis. In typical neuroimaging studies, multiple image registrations are performed, i.e., for atlas-based segmentation or template construction. Faster image registration routines would therefore be beneficial. In this paper we explore acceleration of the image registration package elastix by a combination of several techniques: (i) parallelization on the CPU, to speed up the cost function derivative calculation; (ii) parallelization on the GPU building on and extending the OpenCL framework from ITKv4, to speed up the Gaussian pyramid computation and the image resampling step; (iii) exploitation of certain properties of the B-spline transformation model; (iv) further software optimizations. The accelerated registration tool is employed in a study on diagnostic classification of Alzheimer's disease and cognitively normal controls based on T1-weighted MRI. We selected 299 participants from the publicly available Alzheimer's Disease Neuroimaging Initiative database. Classification is performed with a support vector machine based on gray matter volumes as a marker for atrophy. We evaluated two types of strategies (voxel-wise and region-wise) that heavily rely on nonrigid image registration. Parallelization and optimization resulted in an acceleration factor of 4-5x on an 8-core machine. Using OpenCL a speedup factor of 2 was realized for computation of the Gaussian pyramids, and 15-60 for the resampling step, for larger images. The voxel-wise and the region-wise classification methods had an area under the receiver operator characteristic curve of 88 and 90%, respectively, both for standard and accelerated registration. We conclude that the image registration package elastix was substantially accelerated, with nearly identical results to the non-optimized version. The new functionality will become available in the next release of elastix as open source under the BSD license.

Journal ArticleDOI
TL;DR: This Commemorative Review presents an overview of literature on physical principles and applications of integral imaging, reviewing applications including 3D underwater imaging, 3D imaging in photon-starved environments, 3D tracking of occluded objects, 3D optical microscopy, and 3D polarimetric imaging.
Abstract: Three-dimensional (3D) sensing and imaging technologies have been extensively researched for many applications in the fields of entertainment, medicine, robotics, manufacturing, industrial inspection, security, surveillance, and defense due to their diverse and significant benefits. Integral imaging is a passive multiperspective imaging technique, which records multiple two-dimensional images of a scene from different perspectives. Unlike holography, it can capture a scene such as outdoor events with incoherent or ambient light. Integral imaging can display a true 3D color image with full parallax and continuous viewing angles by incoherent light; thus it does not suffer from speckle degradation. Because of its unique properties, integral imaging has been revived over the past decade or so as a promising approach for massive 3D commercialization. A series of key articles on this topic have appeared in the OSA journals, including Applied Optics. Thus, it is fitting that this Commemorative Review presents an overview of literature on physical principles and applications of integral imaging. Several data capture configurations, reconstruction, and display methods are overviewed. In addition, applications including 3D underwater imaging, 3D imaging in photon-starved environments, 3D tracking of occluded objects, 3D optical microscopy, and 3D polarimetric imaging are reviewed.

Journal ArticleDOI
23 Jun 2013
TL;DR: This work investigates projective estimation under model inadequacies, i.e., when the underpinning assumptions of the projective model are not fully satisfied by the data, and proposes as-projective-as-possible warps that aim to be globally projective, yet allow local non-projective deviations to account for violations to the assumed imaging conditions.
Abstract: The success of commercial image stitching tools often leads to the impression that image stitching is a “solved problem”. The reality, however, is that many tools give unconvincing results when the input photos violate fairly restrictive imaging assumptions; the main two being that the photos correspond to views that differ purely by rotation, or that the imaged scene is effectively planar. Such assumptions underpin the usage of 2D projective transforms or homographies to align photos. In the hands of the casual user, such conditions are often violated, yielding misalignment artifacts or “ghosting” in the results. Accordingly, many existing image stitching tools depend critically on post-processing routines to conceal ghosting. In this paper, we propose a novel estimation technique called Moving Direct Linear Transformation (Moving DLT) that is able to tweak or fine-tune the projective warp to accommodate the deviations of the input data from the idealized conditions. This produces as-projective-as-possible image alignment that significantly reduces ghosting without compromising the geometric realism of perspective image stitching. Our technique thus lessens the dependency on potentially expensive postprocessing algorithms. In addition, we describe how multiple as-projective-as-possible warps can be simultaneously refined via bundle adjustment to accurately align multiple images for large panorama creation.
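
Illustrative sketch (not part of the listing): one way to read "Moving DLT" is as an ordinary Direct Linear Transformation in which each point correspondence is weighted by its distance to the image location currently being warped, so that nearby matches dominate the local homography. The Gaussian weighting with a floor `gamma`, the parameter values, and the function names below are assumptions for illustration; coordinate normalization and the bundle adjustment mentioned in the abstract are omitted.

```python
import numpy as np

def dlt_rows(src, dst):
    """Two linear constraints on the 9 homography entries per correspondence."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
    return np.asarray(rows, dtype=np.float64)

def moving_dlt(src, dst, center, sigma=8.0, gamma=0.05):
    """Estimate a homography with correspondences weighted by distance to center."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    A = dlt_rows(src, dst)
    d2 = np.sum((src - np.asarray(center, dtype=np.float64)) ** 2, axis=1)
    # The floor gamma keeps far-away matches from being ignored entirely.
    w = np.maximum(np.exp(-d2 / sigma ** 2), gamma)
    weighted = np.repeat(w, 2)[:, None] * A
    # The right singular vector of the smallest singular value minimizes ||W A h||.
    _, _, vt = np.linalg.svd(weighted)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]
```

Evaluating `moving_dlt` at a grid of `center` positions yields a spatially varying warp; when the data fit a single homography exactly, every location returns that same homography.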

Proceedings ArticleDOI
01 Dec 2013
TL;DR: This work presents a post-capture image processing solution that can remove localized rain and dirt artifacts from a single image, and demonstrates effective removal of dirt and rain in outdoor test conditions.
Abstract: Photographs taken through a window are often compromised by dirt or rain present on the window surface. Common cases of this include pictures taken from inside a vehicle, or outdoor security cameras mounted inside a protective enclosure. At capture time, defocus can be used to remove the artifacts, but this relies on achieving a shallow depth-of-field and placement of the camera close to the window. Instead, we present a post-capture image processing solution that can remove localized rain and dirt artifacts from a single image. We collect a dataset of clean/corrupted image pairs which are then used to train a specialized form of convolutional neural network. This learns how to map corrupted image patches to clean ones, implicitly capturing the characteristic appearance of dirt and water droplets in natural images. Our models demonstrate effective removal of dirt and rain in outdoor test conditions.

Journal ArticleDOI
TL;DR: An approach that simultaneously utilizes both visual and textual information to estimate the relevance of user tagged images is proposed, and the relevance estimation is determined with a hypergraph learning approach.
Abstract: Due to the popularity of social media websites, extensive research efforts have been dedicated to tag-based social image search. Both visual information and tags have been investigated in the research field. However, most existing methods use tags and visual characteristics either separately or sequentially in order to estimate the relevance of images. In this paper, we propose an approach that simultaneously utilizes both visual and textual information to estimate the relevance of user tagged images. The relevance estimation is determined with a hypergraph learning approach. In this method, a social image hypergraph is constructed, where vertices represent images and hyperedges represent visual or textual terms. Learning is achieved with use of a set of pseudo-positive images, where the weights of hyperedges are updated throughout the learning process. In this way, the impact of different tags and visual words can be automatically modulated. Comparative results of the experiments conducted on a dataset including 370+images are presented, which demonstrate the effectiveness of the proposed approach.

Journal ArticleDOI
TL;DR: This paper proposes to consider every two adjacent prediction-errors jointly to generate a sequence consisting of prediction-error pairs, and based on the sequence and the resulting 2D prediction-error histogram, a more efficient embedding strategy, namely, pairwise PEE, can be designed to achieve an improved performance.
Abstract: In prediction-error expansion (PEE) based reversible data hiding, better exploiting image redundancy usually leads to a superior performance. However, the correlations among prediction-errors are not considered and utilized in current PEE based methods. Specifically, in PEE, the prediction-errors are modified individually in data embedding. In this paper, to better exploit these correlations, instead of utilizing prediction-errors individually, we propose to consider every two adjacent prediction-errors jointly to generate a sequence consisting of prediction-error pairs. Then, based on the sequence and the resulting 2D prediction-error histogram, a more efficient embedding strategy, namely, pairwise PEE, can be designed to achieve an improved performance. The superiority of our method is verified through extensive experiments.
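
Illustrative sketch (not part of the listing): the pairing step in the abstract can be pictured as grouping consecutive prediction errors two at a time and counting how often each pair occurs, which gives the 2D prediction-error histogram. The left-neighbour predictor below is a placeholder chosen for illustration, not the predictor used in the paper, and the embedding rule built on top of the histogram is not reproduced. Function names are hypothetical.

```python
from collections import Counter

def prediction_errors(row):
    """Error of predicting each pixel from its left neighbour (placeholder predictor)."""
    return [row[i] - row[i - 1] for i in range(1, len(row))]

def pair_histogram(errors):
    """Count occurrences of adjacent, non-overlapping prediction-error pairs."""
    # A trailing unpaired error, if any, is dropped by zip.
    pairs = zip(errors[0::2], errors[1::2])
    return Counter(pairs)
```

For example, the pixel row `[10, 11, 11, 12, 12, 13, 13]` gives errors `[1, 0, 1, 0, 1, 0]`, so the histogram holds the single pair `(1, 0)` with count 3.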


Journal ArticleDOI
TL;DR: This work proposes novel problem formulations for learning sparsifying transforms from data, along with alternating minimization algorithms that give rise to well-conditioned square transforms, and shows the superiority of this approach over analytical sparsifying transforms such as the DCT for signal and image representation.
Abstract: The sparsity of signals and images in a certain transform domain or dictionary has been exploited in many applications in signal and image processing. Analytical sparsifying transforms such as Wavelets and DCT have been widely used in compression standards. Recently, synthesis sparsifying dictionaries that are directly adapted to the data have become popular especially in applications such as image denoising, inpainting, and medical image reconstruction. While there has been extensive research on learning synthesis dictionaries and some recent work on learning analysis dictionaries, the idea of learning sparsifying transforms has received no attention. In this work, we propose novel problem formulations for learning sparsifying transforms from data. The proposed alternating minimization algorithms give rise to well-conditioned square transforms. We show the superiority of our approach over analytical sparsifying transforms such as the DCT for signal and image representation. We also show promising performance in signal denoising using the learnt sparsifying transforms. The proposed approach is much faster than previous approaches involving learnt synthesis, or analysis dictionaries.

Book
17 Jan 2013
TL;DR: Algorithms, Complexity Analysis and VLSI Architectures for MPEG-4 Motion Estimation is an important introduction to numerous algorithmic, architectural and system design aspects of the multimedia standard MPEG-4.
Abstract: MPEG-4 is the multimedia standard for combining interactivity, natural and synthetic digital video, audio and computer-graphics. Typical applications are: internet, video conferencing, mobile videophones, multimedia cooperative work, teleteaching and games. With MPEG-4 the next step from block-based video (ISO/IEC MPEG-1, MPEG-2, CCITT H.261, ITU-T H.263) to arbitrarily-shaped visual objects is taken. This significant step demands a new methodology for system analysis and design to meet the considerably higher flexibility of MPEG-4. Motion estimation is a central part of MPEG-1/2/4 and H.261/H.263 video compression standards and has attracted much attention in research and industry, for the following reasons: it is computationally the most demanding algorithm of a video encoder (about 60-80% of the total computation time), it has a high impact on the visual quality of a video encoder, and it is not standardized, thus being open to competition. Algorithms, Complexity Analysis, and VLSI Architectures for MPEG-4 Motion Estimation covers in detail every single step in the design of a MPEG-1/2/4 or H.261/H.263 compliant video encoder: fast motion estimation algorithms; complexity analysis tools; detailed complexity analysis of a software implementation of MPEG-4 video; complexity and visual quality analysis of fast motion estimation algorithms within MPEG-4; design space on motion estimation VLSI architectures; and detailed VLSI design examples of (1) a high throughput and (2) a low-power MPEG-4 motion estimator. Algorithms, Complexity Analysis and VLSI Architectures for MPEG-4 Motion Estimation is an important introduction to numerous algorithmic, architectural and system design aspects of the multimedia standard MPEG-4. As such, all researchers, students and practitioners working in image processing, video coding or system and VLSI design will find this book of interest.

Journal ArticleDOI
TL;DR: A fully integrated system for the automatic detection and characterization of cracks in road flexible pavement surfaces, which does not require manually labeled samples, is proposed to minimize the human subjectivity resulting from traditional visual surveys.
Abstract: A fully integrated system for the automatic detection and characterization of cracks in road flexible pavement surfaces, which does not require manually labeled samples, is proposed to minimize the human subjectivity resulting from traditional visual surveys. The first task addressed, i.e., crack detection, is based on a learning-from-samples paradigm, where a subset of the available image database is automatically selected and used for unsupervised training of the system. The system classifies nonoverlapping image blocks as either containing crack pixels or not. The second task deals with crack type characterization, for which another classification system is constructed, to characterize the detected cracks' connected components. Cracks are labeled according to the types defined in the Portuguese Distress Catalog, with each different crack present in a given image receiving the appropriate label. Moreover, a novel methodology for the assignment of crack severity levels is introduced, computing an estimate for the width of each detected crack. Experimental crack detection and characterization results are presented based on images captured during a visual road pavement surface survey over Portuguese roads, with promising results. This is shown by the quantitative evaluation methodology introduced for the evaluation of this type of system, including a comparison with human experts' manual labeling results.

Proceedings ArticleDOI
23 Jun 2013
TL;DR: The proposed QAC based BIQA method not only has comparable accuracy to those methods using human scored images in learning, but also has merits such as high linearity to human perception of image quality, real-time implementation and availability of image local quality map.
Abstract: General purpose blind image quality assessment (BIQA) has been recently attracting significant attention in the fields of image processing, vision and machine learning. State-of-the-art BIQA methods usually learn to evaluate the image quality by regression from human subjective scores of the training samples. However, these methods need a large number of human scored images for training, and lack an explicit explanation of how the image quality is affected by image local features. An interesting question is then: can we learn for effective BIQA without using human scored images? This paper makes a good effort to answer this question. We partition the distorted images into overlapped patches, and use a percentile pooling strategy to estimate the local quality of each patch. Then a quality-aware clustering (QAC) method is proposed to learn a set of centroids on each quality level. These centroids are then used as a codebook to infer the quality of each patch in a given image, and subsequently a perceptual quality score of the whole image can be obtained. The proposed QAC based BIQA method is simple yet effective. It not only has comparable accuracy to those methods using human scored images in learning, but also has merits such as high linearity to human perception of image quality, real-time implementation and availability of image local quality map.
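
The codebook step described above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the names `nearest_level` and `image_quality`, the representation of the codebook as `(quality_level, centroid)` pairs, the nearest-centroid lookup and the plain averaging of patch scores are all assumptions made for the example.

```python
def nearest_level(feature, codebook):
    """Return the quality level attached to the centroid closest to
    `feature` (squared Euclidean distance)."""
    best_level, best_dist = None, float("inf")
    for level, centroid in codebook:
        dist = sum((f - c) ** 2 for f, c in zip(feature, centroid))
        if dist < best_dist:
            best_level, best_dist = level, dist
    return best_level


def image_quality(patch_features, codebook):
    """Infer a per-patch quality map, then pool it into one image score.
    Averaging is an assumed pooling rule chosen for simplicity."""
    local_map = [nearest_level(f, codebook) for f in patch_features]
    return local_map, sum(local_map) / len(local_map)


# Hypothetical two-level codebook and four patch feature vectors.
codebook = [(0.2, [0.0, 0.0]), (0.8, [1.0, 1.0])]
patches = [[0.1, 0.0], [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]
local_map, score = image_quality(patches, codebook)
print(local_map, score)
```

The per-patch list returned here plays the role of the local quality map mentioned in the abstract, and the lookup involves no human opinion scores at inference time.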

Journal ArticleDOI
TL;DR: This review presents the past and present work on GPU accelerated medical image processing, and is meant to serve as an overview and introduction to existing GPU implementations.

Book
01 Feb 2013
TL;DR: Most materials covered in this book can be used in conjunction with the author’s first book, Hyperspectral Imaging: Techniques for Spectral Detection and Classification, without much overlap.
Abstract: Hyperspectral Data Processing: Algorithm Design and Analysis is a culmination of the research conducted in the Remote Sensing Signal and Image Processing Laboratory (RSSIPL) at the University of Maryland, Baltimore County. Specifically, it treats hyperspectral image processing and hyperspectral signal processing as separate subjects in two different categories. Most materials covered in this book can be used in conjunction with the author’s first book, Hyperspectral Imaging: Techniques for Spectral Detection and Classification, without much overlap.

Journal ArticleDOI
TL;DR: This paper incorporates the image nonlocal self-similarity into SRM for image interpolation, and shows that the NARM-induced sampling matrix is less coherent with the representation dictionary, and consequently makes SRM more effective for image interpolation.
Abstract: Sparse representation is proven to be a promising approach to image super-resolution, where the low-resolution (LR) image is usually modeled as the down-sampled version of its high-resolution (HR) counterpart after blurring. When the blurring kernel is the Dirac delta function, i.e., the LR image is directly down-sampled from its HR counterpart without blurring, the super-resolution problem becomes an image interpolation problem. In such cases, however, the conventional sparse representation models (SRM) become less effective, because the data fidelity term fails to constrain the image local structures. In natural images, fortunately, many nonlocal similar patches to a given patch could provide nonlocal constraint to the local structure. In this paper, we incorporate the image nonlocal self-similarity into SRM for image interpolation. More specifically, a nonlocal autoregressive model (NARM) is proposed and taken as the data fidelity term in SRM. We show that the NARM-induced sampling matrix is less coherent with the representation dictionary, and consequently makes SRM more effective for image interpolation. Our extensive experimental results demonstrate that the proposed NARM-based image interpolation method can effectively reconstruct the edge structures and suppress the jaggy/ringing artifacts, achieving the best image interpolation results so far in terms of PSNR as well as perceptual quality metrics such as SSIM and FSIM.
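
Two pieces of the setup above are easy to make concrete: the interpolation setting, in which the LR image is obtained from the HR image by direct down-sampling with no blur, and the PSNR metric used for evaluation. The sketch below is illustrative only and is not the NARM method itself; the helper names `downsample` and `psnr` and the 8-bit peak value of 255 are assumptions.

```python
import math


def downsample(hr, factor):
    """Direct down-sampling without blurring: keep every `factor`-th
    pixel in each dimension (the Dirac-delta blur kernel case)."""
    return [row[::factor] for row in hr[::factor]]


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images
    given as lists of rows; returns infinity for identical images."""
    count = 0
    squared_error = 0.0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            squared_error += (r - e) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)


hr = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(downsample(hr, 2))  # → [[1, 3], [9, 11]]
```

An interpolation method is then judged by how closely its reconstruction from the down-sampled image matches the original HR image, for example with `psnr(hr, reconstruction)`.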