
Showing papers on "Image processing" published in 2015


Journal ArticleDOI
TL;DR: This work equips the networks with another pooling strategy, "spatial pyramid pooling", to eliminate the above requirement, and develops a new network structure, called SPP-net, which can generate a fixed-length representation regardless of image size/scale.
Abstract: Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224 $\times$ 224) input image. This requirement is “artificial” and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale. In this work, we equip the networks with another pooling strategy, “spatial pyramid pooling”, to eliminate the above requirement. The new network structure, called SPP-net, can generate a fixed-length representation regardless of image size/scale. Pyramid pooling is also robust to object deformations. With these advantages, SPP-net should in general improve all CNN-based image classification methods. On the ImageNet 2012 dataset, we demonstrate that SPP-net boosts the accuracy of a variety of CNN architectures despite their different designs. On the Pascal VOC 2007 and Caltech101 datasets, SPP-net achieves state-of-the-art classification results using a single full-image representation and no fine-tuning. The power of SPP-net is also significant in object detection. Using SPP-net, we compute the feature maps from the entire image only once, and then pool features in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors. This method avoids repeatedly computing the convolutional features. In processing test images, our method is 24-102 $\times$ faster than the R-CNN method, while achieving better or comparable accuracy on Pascal VOC 2007. In ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, our methods rank #2 in object detection and #3 in image classification among all 38 teams. This manuscript also introduces the improvement made for this competition.
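
As an illustration of the pooling idea, here is a minimal NumPy sketch of a spatial pyramid pooling layer: whatever the spatial size of the convolutional feature map, max-pooling over a fixed pyramid of grids yields a fixed-length vector. The pyramid levels and array shapes below are illustrative, not the paper's exact configuration.

```python
# A minimal sketch of spatial pyramid pooling over one feature map of
# shape (C, H, W); the pyramid levels are assumptions for illustration.
import numpy as np

def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool into a fixed set of grids and concatenate, yielding a
    vector of length C * sum(n*n for n in levels) for any H, W."""
    C, H, W = feature_map.shape
    pooled = []
    for n in levels:
        # Bin edges chosen so the n x n grid covers the whole map.
        h_edges = np.linspace(0, H, n + 1).astype(int)
        w_edges = np.linspace(0, W, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                patch = feature_map[:, h_edges[i]:h_edges[i + 1],
                                       w_edges[j]:w_edges[j + 1]]
                pooled.append(patch.max(axis=(1, 2)))
    return np.concatenate(pooled)

fmap = np.random.rand(256, 13, 17)           # arbitrary spatial size
vec = spatial_pyramid_pool(fmap)
assert vec.shape == (256 * (1 + 4 + 16),)    # fixed length regardless of H, W
```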

5,919 citations


Journal ArticleDOI
TL;DR: An efficient evaluation tool for 3D medical image segmentation is proposed using 20 evaluation metrics based on a comprehensive literature review and guidelines for selecting a subset of these metrics that is suitable for the data and the segmentation task are provided.
Abstract: Medical image segmentation is an important image processing step. Comparing images to evaluate the quality of segmentation is an essential part of measuring progress in this research area. Some of the challenges in evaluating medical segmentation are: metric selection, the use in the literature of multiple definitions for certain metrics, inefficient metric calculation implementations that lead to difficulties with large volumes, and lack of support for fuzzy segmentation by existing metrics. First, we present an overview of 20 evaluation metrics selected based on a comprehensive literature review. For fuzzy segmentation, which shows the level of membership of each voxel to multiple classes, fuzzy definitions of all metrics are provided. We then discuss metric properties to provide a guide for selecting evaluation metrics. Finally, we propose an efficient evaluation tool implementing the 20 selected metrics. The tool is optimized to perform efficiently in terms of speed and required memory, even when the image size is extremely large, as in the case of whole-body MRI or CT volume segmentation. An implementation of this tool is available as an open-source project. In summary, we propose an efficient evaluation tool for 3D medical image segmentation using 20 evaluation metrics and provide guidelines for selecting a subset of these metrics that is suitable for the data and the segmentation task.
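
For concreteness, here is a sketch of one metric from such a set, the Dice coefficient, in crisp and fuzzy form. The min-based fuzzy intersection is one common generalization and is only assumed, not taken from the tool's actual definitions.

```python
# Crisp and fuzzy Dice overlap; the fuzzy variant uses min as the fuzzy
# intersection, a common (assumed) choice for membership maps in [0, 1].
import numpy as np

def dice_crisp(a, b):
    """Dice overlap of two binary volumes of the same shape."""
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def dice_fuzzy(a, b):
    """Fuzzy Dice for per-voxel membership maps in [0, 1]."""
    return 2.0 * np.minimum(a, b).sum() / (a.sum() + b.sum())
```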

1,561 citations


Journal ArticleDOI
TL;DR: A protocol for advanced CUBIC (Clear, Unobstructed Brain/Body Imaging Cocktails and Computational analysis) is described in this paper, which enables simple and efficient organ clearing, rapid imaging by light-sheet microscopy and quantitative imaging analysis of multiple samples.
Abstract: Here we describe a protocol for advanced CUBIC (Clear, Unobstructed Brain/Body Imaging Cocktails and Computational analysis). The CUBIC protocol enables simple and efficient organ clearing, rapid imaging by light-sheet microscopy and quantitative imaging analysis of multiple samples. The organ or body is cleared by immersion for 1-14 d, with the exact time required dependent on the sample type and the experimental purposes. A single imaging set can be completed in 30-60 min. Image processing and analysis can take <1 d, but it is dependent on the number of samples in the data set. The CUBIC clearing protocol can process multiple samples simultaneously. We previously used CUBIC to image whole-brain neural activities at single-cell resolution using Arc-dVenus transgenic (Tg) mice. CUBIC informatics calculated the Venus signal subtraction, comparing different brains at a whole-organ scale. These protocols provide a platform for organism-level systems biology by comprehensively detecting cells in a whole organ or body.

554 citations


Journal ArticleDOI
TL;DR: The results show that the proposed designs accomplish significant reductions in power dissipation, delay and transistor count compared to an exact design; moreover, two of the proposed multiplier designs provide excellent capabilities for image multiplication with respect to average normalized error distance and peak signal-to-noise ratio.
Abstract: Inexact (or approximate) computing is an attractive paradigm for digital processing at nanometric scales, and it is particularly interesting for computer arithmetic designs. This paper deals with the analysis and design of two new approximate 4-2 compressors for utilization in a multiplier. These designs rely on different features of compression, such that imprecision in computation (as measured by the error rate and the so-called normalized error distance) can be traded against circuit-based figures of merit of a design (number of transistors, delay and power consumption). Four different schemes for utilizing the proposed approximate compressors in a Dadda multiplier are proposed and analyzed. Extensive simulation results are provided and an application of the approximate multipliers to image processing is presented. The results show that the proposed designs accomplish significant reductions in power dissipation, delay and transistor count compared to an exact design; moreover, two of the proposed multiplier designs provide excellent capabilities for image multiplication with respect to average normalized error distance and peak signal-to-noise ratio (more than 50 dB for the considered image examples).
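
A behavioral sketch may help: an exact 4-2 compressor satisfies sum + 2·(carry + cout) = x1 + x2 + x3 + x4 + cin, and an approximate design trades violations of this identity (the error rate) against gate count. The approximation below is a hypothetical simplification for illustration, not one of the paper's two designs.

```python
# Behavioral model of a 4-2 compressor and a HYPOTHETICAL approximation
# (not the paper's design); error rate measured over all 32 input patterns.
from itertools import product

def exact_compressor(x1, x2, x3, x4, cin):
    total = x1 + x2 + x3 + x4 + cin      # 0..5
    s = total & 1
    rest = total >> 1                    # 0..2, shared by carry and cout
    return s, int(rest >= 1), int(rest == 2)

def approx_compressor(x1, x2, x3, x4, cin):
    # Hypothetical simplification: ignore cin, drop cout, and
    # approximate the carry with two ANDs and an OR.
    return x1 ^ x2 ^ x3 ^ x4, (x1 & x2) | (x3 & x4), 0

errors = sum((s + 2 * (c + co)) != sum(bits)
             for bits in product((0, 1), repeat=5)
             for s, c, co in [approx_compressor(*bits)])
print(f"error rate: {errors}/32 input patterns")
```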

447 citations


Journal ArticleDOI
TL;DR: This study attempted to simplify the image analysis pipeline of conventional CAD with deep learning techniques and introduced models of a deep belief network and a convolutional neural network in the context of nodule classification in computed tomography images.
Abstract: Lung cancer has a poor prognosis when not diagnosed early and unresectable lesions are present. The management of small lung nodules noted on computed tomography scan is controversial due to uncertain tumor characteristics. A conventional computer-aided diagnosis (CAD) scheme requires several image processing and pattern recognition steps to accomplish a quantitative tumor differentiation result. In such an ad hoc image analysis pipeline, every step depends heavily on the performance of the previous step. Accordingly, tuning of classification performance in a conventional CAD scheme is very complicated and arduous. Deep learning techniques, on the other hand, have the intrinsic advantage of automatic feature exploitation and seamless performance tuning. In this study, we attempted to simplify the image analysis pipeline of conventional CAD with deep learning techniques. Specifically, we introduced models of a deep belief network and a convolutional neural network in the context of nodule classification in computed tomography images. Two baseline methods with feature computing steps were implemented for comparison. The experimental results suggest that deep learning methods could achieve better discriminative results and hold promise in the CAD application domain.

444 citations


Proceedings ArticleDOI
26 Feb 2015
TL;DR: This paper discusses the methods used for detecting plant diseases from images of their leaves, along with some segmentation and feature extraction algorithms used in plant disease detection.
Abstract: Identification of plant diseases is the key to preventing losses in the yield and quantity of the agricultural product. The study of plant diseases means the study of visually observable patterns seen on the plant. Health monitoring and disease detection on plants are critical for sustainable agriculture. It is very difficult to monitor plant diseases manually: doing so requires a tremendous amount of work and expertise in plant diseases, as well as excessive processing time. Hence, image processing is used for the detection of plant diseases. Disease detection involves steps such as image acquisition, image pre-processing, image segmentation, feature extraction and classification. This paper discusses the methods used for the detection of plant diseases using their leaf images, and also discusses some segmentation and feature extraction algorithms used in plant disease detection.
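
A minimal sketch of that pipeline follows. The HSV lesion thresholds and colour-moment features are hypothetical illustrations; real systems tune both per crop and disease.

```python
# Acquisition -> pre-processing -> segmentation -> features -> classifier.
# Threshold values and feature choices below are illustrative assumptions.
import cv2
import numpy as np
from sklearn.svm import SVC

def leaf_features(path):
    img = cv2.imread(path)                          # image acquisition
    img = cv2.GaussianBlur(img, (5, 5), 0)          # pre-processing
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    # Hypothetical hue range for brown/yellow lesions.
    mask = cv2.inRange(hsv, (10, 50, 50), (30, 255, 255))   # segmentation
    vals = [img[..., c][mask > 0] for c in range(3)]        # feature extraction
    return np.array([v.mean() if v.size else 0.0 for v in vals] +
                    [v.std() if v.size else 0.0 for v in vals])

# Classification (assuming labelled training images are available):
# X = np.stack([leaf_features(p) for p in train_paths])
# clf = SVC(kernel="rbf").fit(X, train_labels)
```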

412 citations


Proceedings ArticleDOI
07 Jun 2015
TL;DR: The approach, which has a much lower computational complexity than prior methods, achieved state-of-the-art performance on the Stanford Background and SIFT Flow datasets, and the ability to visualize feature maps from each layer supports the hypothesis that LSTM networks are overall suited for image processing tasks.
Abstract: This paper addresses the problem of pixel-level segmentation and classification of scene images with an entirely learning-based approach using Long Short Term Memory (LSTM) recurrent neural networks, which are commonly used for sequence classification. We investigate two-dimensional (2D) LSTM networks for natural scene images, taking into account the complex spatial dependencies of labels. Prior methods generally have required separate classification and image segmentation stages and/or pre- and post-processing. In our approach, classification, segmentation, and context integration are all carried out by 2D LSTM networks, allowing texture and spatial model parameters to be learned within a single model. The networks efficiently capture local and global contextual information over raw RGB values and adapt well to complex scene images. Our approach, which has a much lower computational complexity than prior methods, achieved state-of-the-art performance on the Stanford Background and SIFT Flow datasets. In fact, if no pre- or post-processing is applied, LSTM networks outperform other state-of-the-art approaches. Hence, even on a single-core Central Processing Unit (CPU), the running time of our approach is equivalent to or better than that of the compared state-of-the-art approaches, which use a Graphics Processing Unit (GPU). Finally, our networks' ability to visualize feature maps from each layer supports the hypothesis that LSTM networks are overall suited for image processing tasks.

397 citations


Journal ArticleDOI
TL;DR: Results show that the DMAS beamformer outperforms DAS in both simulated and experimental trials, and that the main improvement brought about by this new method is a significantly higher contrast resolution, which translates into an increased dynamic range and better quality of B-mode images.
Abstract: Most ultrasound medical imaging systems currently on the market implement standard Delay and Sum (DAS) beamforming to form B-mode images. However, the image resolution and contrast achievable with DAS are limited by the aperture size and by the operating frequency. For this reason, different beamformers have been presented in the literature, mainly based on adaptive algorithms, which allow achieving higher performance at the cost of an increased computational complexity. In this paper, we propose the use of an alternative nonlinear beamforming algorithm for medical ultrasound imaging, called Delay Multiply and Sum (DMAS), which was originally conceived for a RADAR microwave system for breast cancer detection. We modify the DMAS beamformer and test its performance on both simulated and experimentally collected linear-scan data, by comparing the Point Spread Functions, beampatterns, synthetic phantom and in vivo carotid artery images obtained with standard DAS and with the proposed algorithm. Results show that the DMAS beamformer outperforms DAS in both simulated and experimental trials and that the main improvement brought about by this new method is a significantly higher contrast resolution (i.e., narrower main lobe and lower side lobes), which translates into an increased dynamic range and better quality of B-mode images.
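
The core DMAS combination can be sketched as follows, assuming the channel signals have already been delayed (focused); the signed square root keeps the pairwise products dimensionally consistent with the input signals. In practice the output is typically band-pass filtered around the second harmonic (the Filtered-DMAS variant), which is omitted here.

```python
# DAS vs. the DMAS combination step; `delayed` has shape
# (n_channels, n_samples) and is assumed already delay-focused.
import numpy as np

def das(delayed):
    return delayed.sum(axis=0)

def dmas(delayed):
    n = delayed.shape[0]
    out = np.zeros(delayed.shape[1])
    for i in range(n - 1):
        prod = delayed[i] * delayed[i + 1:]               # all pairs (i, j > i)
        out += (np.sign(prod) * np.sqrt(np.abs(prod))).sum(axis=0)
    return out
```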

376 citations


Journal ArticleDOI
TL;DR: This paper presents a segmentation framework based on Voronoi tessellation constructed from the coordinates of localized molecules, implemented in freely available and open-source SR-Tesseler software, which allows precise, robust and automatic quantification of protein organization at different scales.
Abstract: Localization-based super-resolution techniques open the door to unprecedented analysis of molecular organization. This task often involves complex image processing adapted to the specific topology and quality of the image to be analyzed. Here we present a segmentation framework based on Voronoi tessellation constructed from the coordinates of localized molecules, implemented in freely available and open-source SR-Tesseler software. This method allows precise, robust and automatic quantification of protein organization at different scales, from the cellular level down to clusters of a few fluorescent markers. We validated our method on simulated data and on various biological experimental data of proteins labeled with genetically encoded fluorescent proteins or organic fluorophores. In addition to providing insight into complex protein organization, this polygon-based method should serve as a reference for the development of new types of quantifications, as well as for the optimization of existing ones.
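
A sketch of the underlying computation, assuming 2D localization coordinates: each molecule's local density can be estimated as the inverse of its Voronoi cell area, and clusters segmented by thresholding that density. SR-Tesseler's actual criteria are richer than this stand-in.

```python
# Inverse Voronoi-cell area as a local-density estimate for 2D
# localizations in `points` (n, 2); border cells are left undefined.
import numpy as np
from scipy.spatial import ConvexHull, Voronoi

def local_density(points):
    vor = Voronoi(points)
    density = np.full(len(points), np.nan)
    for i, region_idx in enumerate(vor.point_region):
        verts = vor.regions[region_idx]
        if len(verts) == 0 or -1 in verts:
            continue                                   # open cell at the border
        area = ConvexHull(vor.vertices[verts]).volume  # in 2D, .volume is area
        density[i] = 1.0 / area
    return density
```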

358 citations


Patent
10 Mar 2015
TL;DR: An adaptive strobe illumination control process for use in a digital image capture and processing system is described, in which captured images are analyzed in real time and, based on the results of this analysis, the exposure time (i.e. photonic integration time interval) or strobe energy level is automatically adjusted during subsequent image frames (i.e. image acquisition cycles).
Abstract: An adaptive strobe illumination control process for use in a digital image capture and processing system. In general, the process involves: (i) illuminating an object in the field of view (FOV) with several different pulses of strobe (i.e. stroboscopic) illumination over a pair of consecutive video image frames; (ii) detecting digital images of the illuminated object over these consecutive image frames; and (iii) decode processing the digital images in an effort to read a code symbol graphically encoded therein. In a first illustrative embodiment, upon failure to read a code symbol graphically encoded in one of the first and second images, these digital images are analyzed in real-time, and based on the results of this real-time image analysis, the exposure time (i.e. photonic integration time interval) is automatically adjusted during subsequent image frames (i.e. image acquisition cycles) according to the principles of the present disclosure. In a second illustrative embodiment, upon failure to read a code symbol graphically encoded in one of the first and second images, these digital images are analyzed in real-time, and based on the results of this real-time image analysis, the energy level of the strobe illumination is automatically adjusted during subsequent image frames (i.e. image acquisition cycles) according to the principles of the present disclosure.

352 citations


Journal ArticleDOI
TL;DR: This paper presents a novel multi-focus image fusion method in spatial domain that utilizes a dictionary which is learned from local patches of source images and outperforms existing state-of-the-art methods, in terms of visual and quantitative evaluations.

Journal ArticleDOI
TL;DR: An image processing toolbox is presented that generates images that are linear with respect to radiance from the RAW files of numerous camera brands and can combine image channels from multispectral cameras, including additional ultraviolet photographs, enabling objective measures of reflectance and colour using a wide range of consumer cameras.
Abstract: Quantitative measurements of colour, pattern and morphology are vital to a growing range of disciplines. Digital cameras are readily available and already widely used for making these measurements, having numerous advantages over other techniques, such as spectrometry. However, off-the-shelf consumer cameras are designed to produce images for human viewing, meaning that their uncalibrated photographs cannot be used for making reliable, quantitative measurements. Many studies still fail to appreciate this, and of those scientists who are aware of such issues, many are hindered by a lack of usable tools for making objective measurements from photographs. We have developed an image processing toolbox that generates images that are linear with respect to radiance from the RAW files of numerous camera brands and can combine image channels from multispectral cameras, including additional ultraviolet photographs. Images are then normalised using one or more grey standards to control for lighting conditions. This enables objective measures of reflectance and colour using a wide range of consumer cameras. Furthermore, if the camera's spectral sensitivities are known, the software can convert images to correspond to the visual system (cone-catch values) of a wide range of animals, enabling human and non-human visual systems to be modelled. The toolbox also provides image analysis tools that can extract luminance (lightness), colour and pattern information. Furthermore, all processing is performed on 32-bit floating point images rather than commonly used 8-bit images. This increases precision and reduces the likelihood of data loss through rounding error or saturation of pixels, while also facilitating the measurement of objects with shiny or fluorescent properties. All cameras tested using this software were found to demonstrate a linear response within each image and across a range of exposure times. Cone-catch mapping functions were highly robust, converting images to several animal visual systems and yielding data that agreed closely with spectrometer-based estimates. Our imaging toolbox is freely available as an addition to the open source ImageJ software. We believe that it will considerably enhance the appropriate use of digital cameras across multiple areas of biology, in particular researchers aiming to quantify animal and plant visual signals.
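
The grey-standard normalization step can be sketched as follows, assuming a linearized image and a mask over a standard of known reflectance; the 20% value below is an assumption for illustration.

```python
# Scale each channel so the grey standard's mean value maps to its known
# reflectance, making images comparable across lighting conditions.
import numpy as np

def normalise(linear_img, standard_mask, standard_reflectance=0.20):
    out = np.empty_like(linear_img, dtype=np.float32)
    for c in range(linear_img.shape[2]):
        measured = linear_img[..., c][standard_mask].mean()
        out[..., c] = linear_img[..., c] * (standard_reflectance / measured)
    return out   # pixel values are now estimated reflectances
```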

Patent
29 May 2015
TL;DR: In this article, the authors described an imaging apparatus having an imaging assembly that includes an image sensor, which can capture a frame or image data having image data corresponding to a second set of pixels of the image sensor.
Abstract: There is described an imaging apparatus having an imaging assembly that includes an image sensor. The imaging apparatus can capture a frame of image data having image data corresponding to a first set of pixels of the image sensor. The imaging apparatus can capture a frame or image data having image data corresponding to a second set of pixels of the image sensor.

Journal ArticleDOI
TL;DR: The proposed infinite active contour model takes advantage of using different types of region information, such as the combination of intensity information and a local phase based enhancement map, and outperforms its competitors when compared with other widely used unsupervised and supervised methods.
Abstract: Automated detection of blood vessel structures is becoming of crucial interest for better management of vascular disease. In this paper, we propose a new infinite active contour model that uses hybrid region information of the image to approach this problem. More specifically, an infinite perimeter regularizer, provided by using the ${\cal L}^{2}$ Lebesgue measure of the $\gamma$-neighborhood of boundaries, allows for better detection of small oscillatory (branching) structures than the traditional models based on the length of a feature's boundaries (i.e., the ${\cal H}^{1}$ Hausdorff measure). Moreover, for better general segmentation performance, the proposed model takes advantage of different types of region information, such as the combination of intensity information and a local phase based enhancement map. The local phase based enhancement map is used for its superiority in preserving vessel edges, while the image intensity information guarantees correct segmentation of features. We evaluate the performance of the proposed model by applying it to three public retinal image datasets (two datasets of color fundus photography and one fluorescein angiography dataset). The proposed model outperforms its competitors when compared with other widely used unsupervised and supervised methods. For example, the sensitivity (0.742), specificity (0.982) and accuracy (0.954) achieved on the DRIVE dataset are very close to those of the second observer's annotations.

Journal ArticleDOI
TL;DR: A comparative study of the basic block-based image segmentation techniques is presented, which shows how these techniques have to be combined with domain knowledge in order to effectively solve an image segmentation problem for a problem domain.

Journal ArticleDOI
TL;DR: UAV-SFM remote sensing was used to produce 3D multispectral point clouds of temperate deciduous forests at different levels of UAV altitude, image overlap, weather, and image processing, with accurate estimates of canopy height.
Abstract: Ecological remote sensing is being transformed by three-dimensional (3D), multispectral measurements of forest canopies by unmanned aerial vehicles (UAV) and computer vision structure from motion (SFM) algorithms. Yet applications of this technology have outpaced understanding of the relationship between collection method and data quality. Here, UAV-SFM remote sensing was used to produce 3D multispectral point clouds of temperate deciduous forests at different levels of UAV altitude, image overlap, weather, and image processing. Error in canopy height estimates was explained by the alignment of the canopy height model to the digital terrain model ($R^{2}$ = 0.81) due to differences in lighting and image overlap. Accounting for this, no significant differences were observed in height error at different levels of lighting, altitude, and side overlap. Overall, accurate estimates of canopy height compared to field measurements ($R^{2}$ = 0.86, RMSE = 3.6 m) and LIDAR ($R^{2}$ = 0.99, RMSE = 3.0 m) were obtained under optimal conditions of clear lighting and high image overlap (>80%). Variation in point cloud quality appeared related to the behavior of SFM ‘image features’. Future research should consider the role of image features as the fundamental unit of SFM remote sensing, akin to the pixel of optical imaging and the laser pulse of LIDAR.

Patent
17 Dec 2015
TL;DR: In this paper, an imaging apparatus and method for discriminating between sky and non-sky regions are presented, comprising a UV-passing image forming element and a UV-sensing image capturing sensor.
Abstract: An imaging apparatus and method are provided, configured for discriminating between sky and non-sky regions. The apparatus comprises a UV-passing image forming element and a UV-sensing image capturing sensor for capturing an image projected from the UV-passing image forming element. The image comprises at least one of sky and non-sky image regions, and the image captured by the sensor is provided to an image processing unit for performing discrimination between sky and non-sky regions of the captured image using primarily UV wavelengths.

Journal ArticleDOI
TL;DR: Experimental results on multi-focus and multi-modal image sets demonstrate that the ASR-based fusion method can outperform the conventional SR-based method in terms of both visual quality and objective assessment.
Abstract: In this study, a novel adaptive sparse representation (ASR) model is presented for simultaneous image fusion and denoising. As a powerful signal modelling technique, sparse representation (SR) has been successfully employed in many image processing applications such as denoising and fusion. In traditional SR-based applications, a highly redundant dictionary is always needed to satisfy signal reconstruction requirement since the structures vary significantly across different image patches. However, it may result in potential visual artefacts as well as high computational cost. In the proposed ASR model, instead of learning a single redundant dictionary, a set of more compact sub-dictionaries are learned from numerous high-quality image patches which have been pre-classified into several corresponding categories based on their gradient information. At the fusion and denoising processes, one of the sub-dictionaries is adaptively selected for a given set of source image patches. Experimental results on multi-focus and multi-modal image sets demonstrate that the ASR-based fusion method can outperform the conventional SR-based method in terms of both visual quality and objective assessment.
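
A rough sketch of the adaptive selection idea, under stated assumptions: patches are grouped by a simple dominant-gradient-orientation rule (the paper's classification uses gradient information, but not necessarily this exact rule), one compact sub-dictionary is learned per group, and each source patch would then be sparsely coded only against its own group's sub-dictionary.

```python
# Illustrative stand-in for ASR's gradient-based patch classification
# and per-class sub-dictionary learning; rules here are assumptions.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def gradient_class(patch, n_classes=6):
    gy, gx = np.gradient(patch.astype(float))
    angles = np.arctan2(gy, gx).ravel()
    hist, _ = np.histogram(angles, bins=n_classes, range=(-np.pi, np.pi))
    return int(hist.argmax())            # quantized dominant orientation

def learn_subdictionaries(patches, n_classes=6, n_atoms=64):
    groups = {}
    for p in patches:                    # pre-classify high-quality patches
        groups.setdefault(gradient_class(p, n_classes), []).append(p.ravel())
    return {c: MiniBatchDictionaryLearning(n_components=n_atoms).fit(np.array(v))
            for c, v in groups.items()}

# At fusion/denoising time, a patch is coded only against the
# sub-dictionary of its own class, not one large redundant dictionary.
```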

Journal ArticleDOI
TL;DR: The proposed algorithm is less dependent on training data, requires less segmentation time and achieves consistent vessel segmentation accuracy on normal images as well as images with pathology when compared to existing supervised segmentation methods.
Abstract: This paper presents a novel three-stage blood vessel segmentation algorithm using fundus photographs. In the first stage, the green plane of a fundus image is preprocessed to extract a binary image after high-pass filtering, and another binary image from the morphologically reconstructed enhanced image for the vessel regions. Next, the regions common to both binary images are extracted as the major vessels. In the second stage, all remaining pixels in the two binary images are classified by a Gaussian mixture model (GMM) classifier using a set of eight features extracted based on pixel neighborhood and first- and second-order gradient images. In the third postprocessing stage, the major portions of the blood vessels are combined with the classified vessel pixels. The proposed algorithm is less dependent on training data, requires less segmentation time and achieves consistent vessel segmentation accuracy on normal images as well as images with pathology when compared to existing supervised segmentation methods. The proposed algorithm achieves a vessel segmentation accuracy of 95.2%, 95.15%, and 95.3% in an average of 3.1, 6.7, and 11.7 s on the three public datasets DRIVE, STARE, and CHASE_DB1, respectively.

Journal ArticleDOI
TL;DR: ClearVolume makes live imaging truly live by enabling direct real-time inspection of the specimen imaged in light-sheet microscopes, and makes sample-drift information available through an interface for stabilization of the sample.
Abstract: Further, we demonstrated the use of ClearVolume for long-term time-lapse imaging with an OpenSPIM microscope. ClearVolume was readily integrated into the μManager (www.micro-manager.org/) plug-in to allow live 3D visualization. We used ClearVolume to remotely monitor a developing Drosophila melanogaster embryo to check on sample drift, photobleaching and other artifacts (Supplementary Video 5). Time-shifting allows inspection of the data at any given point in time during acquisition (Supplementary Video 6). In addition, ClearVolume can be used for aligning and calibrating light-sheet microscopes. The overall 3D point spread function of the optical system and the full 3D structure of the beam (for example, Gaussian or Bessel beam) can be visualized (Supplementary Videos 7 and 8). To aid manual alignment of the microscope, ClearVolume computes image sharpness in real time and provides audiovisual feedback (Supplementary Video 9). Finally, ClearVolume computes and visualizes sample drift trajectories and makes that information available through an interface for stabilization of the sample (Supplementary Video 10). ClearVolume can also be used on other types of microscopes such as confocal microscopes (Supplementary Video 11). Going beyond microscopy, ClearVolume is integrated with the popular Fiji (http://fiji.sc) and KNIME (http://www.knime.org/) software packages, bringing real-time 3D+t multichannel volume visualization to a larger community (Supplementary Videos 12–14). Furthermore, ClearVolume's modularity allows any user to implement additional CPU- or GPU-based modules for image analysis and visualization. In summary, ClearVolume makes live imaging truly live by enabling direct real-time inspection of the specimen imaged in light-sheet microscopes. The source code of ClearVolume is available at http://clearvolume.github.io.

Journal ArticleDOI
TL;DR: In this article, the authors review the various quality metrics available in the literature for assessing the quality of a fused image, covering two variants of evaluation: with a reference image and without a reference image.

Journal ArticleDOI
TL;DR: A novel stopping criterion is presented that terminates the iterative process leading to higher vessel segmentation accuracy and is robust to the rate of new vessel pixel addition.
Abstract: This paper presents a novel unsupervised iterative blood vessel segmentation algorithm using fundus images. First, a vessel enhanced image is generated by tophat reconstruction of the negative green plane image. An initial estimate of the segmented vasculature is extracted by global thresholding the vessel enhanced image. Next, new vessel pixels are identified iteratively by adaptive thresholding of the residual image generated by masking out the existing segmented vessel estimate from the vessel enhanced image. The new vessel pixels are, then, region grown into the existing vessel, thereby resulting in an iterative enhancement of the segmented vessel structure. As the iterations progress, the number of false edge pixels identified as new vessel pixels increases compared to the number of actual vessel pixels. A key contribution of this paper is a novel stopping criterion that terminates the iterative process leading to higher vessel segmentation accuracy. This iterative algorithm is robust to the rate of new vessel pixel addition since it achieves 93.2–95.35% vessel segmentation accuracy with 0.9577–0.9638 area under ROC curve (AUC) on abnormal retinal images from the STARE dataset. The proposed algorithm is computationally efficient and consistent in vessel segmentation performance for retinal images with variations due to pathology, uneven illumination, pigmentation, and fields of view since it achieves a vessel segmentation accuracy of about 95% in an average time of 2.45, 3.95, and 8 s on images from three public datasets DRIVE, STARE, and CHASE_DB1, respectively. Additionally, the proposed algorithm has more than 90% segmentation accuracy for segmenting peripapillary blood vessels in the images from the DRIVE and CHASE_DB1 datasets.
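
A condensed sketch of the iterative scheme follows, with simplified stand-ins: a plain morphological top-hat instead of top-hat by reconstruction, fixed percentile thresholds instead of the paper's adaptive thresholding, and the novel stopping criterion reduced to a fixed iteration count.

```python
# Simplified stand-in for the iterative vessel segmentation loop;
# kernel sizes and percentiles are illustrative assumptions.
import cv2
import numpy as np

def segment_vessels(green_plane, n_iter=5):
    inv = 255 - green_plane                                   # vessels bright
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
    enhanced = cv2.morphologyEx(inv, cv2.MORPH_TOPHAT, kernel)
    vessels = enhanced > np.percentile(enhanced, 99)          # initial estimate
    for _ in range(n_iter):
        residual = enhanced.copy()
        residual[vessels] = 0                                 # mask out estimate
        new_pix = residual > np.percentile(residual, 99.5)
        # "Region grow": keep only new pixels touching the current vessels.
        grown = cv2.dilate(vessels.astype(np.uint8), np.ones((3, 3), np.uint8))
        vessels |= new_pix & (grown > 0)
        # The paper instead stops when false edge pixels start to dominate
        # newly added vessel pixels; that criterion is omitted here.
    return vessels
```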

Journal ArticleDOI
TL;DR: Four transfer classifiers are presented that can train a classification scheme with only a small amount of representative training data, in addition to a larger amount of other training data with slightly different characteristics; this may improve performance over supervised learning for segmentation across scanners and scan protocols.
Abstract: The variation between images obtained with different scanners or different imaging protocols presents a major challenge in automatic segmentation of biomedical images. This variation especially hampers the application of otherwise successful supervised-learning techniques which, in order to perform well, often require a large amount of labeled training data that is exactly representative of the target data. We therefore propose to use transfer learning for image segmentation. Transfer-learning techniques can cope with differences in distributions between training and target data, and therefore may improve performance over supervised learning for segmentation across scanners and scan protocols. We present four transfer classifiers that can train a classification scheme with only a small amount of representative training data, in addition to a larger amount of other training data with slightly different characteristics. The performance of the four transfer classifiers was compared to that of standard supervised classification on two magnetic resonance imaging brain-segmentation tasks with multi-site data: white matter, gray matter, and cerebrospinal fluid segmentation; and white-matter-/MS-lesion segmentation. The experiments showed that when there is only a small amount of representative training data available, transfer learning can greatly outperform common supervised-learning approaches, minimizing classification errors by up to 60%.
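
The flavour of the simplest such transfer classifier can be sketched as follows: pool both training sets and up-weight the small representative set. The paper's four classifiers (e.g. weighted SVMs) are more elaborate, and the weight of 10 here is purely illustrative.

```python
# Instance-weighted training: small same-distribution set weighted above
# the large differently-distributed set; weights are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def weighted_transfer(X_same, y_same, X_other, y_other, w_same=10.0):
    X = np.vstack([X_same, X_other])
    y = np.concatenate([y_same, y_other])
    w = np.concatenate([np.full(len(y_same), w_same),
                        np.ones(len(y_other))])
    return LogisticRegression(max_iter=1000).fit(X, y, sample_weight=w)
```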

15 Oct 2015
TL;DR: In this article, Where-CNN is used to learn a feature representation in which matching views are near one another and mismatched views are far apart, which achieves significant improvements over traditional hand-crafted features and existing deep features learned from other large-scale databases.
Abstract: The recent availability of geo-tagged images and rich geospatial data has inspired a number of algorithms for image-based geolocalization. Most approaches predict the location of a query image by matching it to ground-level images with known locations (e.g., street-view data). However, most of the Earth does not have ground-level reference photos available. Fortunately, more complete coverage is provided by oblique aerial or bird's-eye imagery. In this work, we localize a ground-level query image by matching it to a reference database of aerial imagery. We use publicly available data to build a dataset of 78K aligned cross-view image pairs. The primary challenge for this task is that traditional computer vision approaches cannot handle the wide baseline and appearance variation of these cross-view pairs. We use our dataset to learn a feature representation in which matching views are near one another and mismatched views are far apart. Our proposed approach, Where-CNN, is inspired by deep learning success in face verification and achieves significant improvements over traditional hand-crafted features and existing deep features learned from other large-scale databases. We show the effectiveness of Where-CNN in finding matches between street-view and aerial-view imagery and demonstrate the ability of our learned features to generalize to novel locations.

Journal ArticleDOI
TL;DR: It is demonstrated that Random Forest regression-voting can be used to generate high quality response images quickly and leads to fast and accurate shape model matching when applied in the Constrained Local Model framework.
Abstract: A widely used approach for locating points on deformable objects in images is to generate feature response images for each point, and then to fit a shape model to these response images. We demonstrate that Random Forest regression-voting can be used to generate high quality response images quickly. Rather than using a generative or a discriminative model to evaluate each pixel, a regressor is used to cast votes for the optimal position of each point. We show that this leads to fast and accurate shape model matching when applied in the Constrained Local Model framework. We evaluate the technique in detail, and compare it with a range of commonly used alternatives across application areas: the annotation of the joints of the hands in radiographs and the detection of feature points in facial images. We show that our approach outperforms alternative techniques, achieving what we believe to be the most accurate results yet published for hand joint annotation and state-of-the-art performance for facial feature point detection.
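
A toy sketch of the vote-accumulation step, assuming a regressor has already predicted, for each sampled patch, the offset from the patch centre to the target point; the accumulated votes form the response image to which the shape model is fitted.

```python
# Accumulate regression votes into a response image; inputs are assumed
# integer (y, x) arrays produced by a trained Random Forest regressor.
import numpy as np

def response_image(image_shape, patch_centers, predicted_offsets):
    votes = np.zeros(image_shape)
    targets = patch_centers + predicted_offsets
    for y, x in targets:
        if 0 <= y < image_shape[0] and 0 <= x < image_shape[1]:
            votes[y, x] += 1          # could be weighted by forest confidence
    return votes                      # the peak approximates the point position
```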

Journal ArticleDOI
TL;DR: A systematic comparison of retrieval accuracy and processing speed of a multitude of parametric, non-parametric and physically-based retrieval methods using simulated S2 data concludes that the family of kernel-based MLRAs (e.g. GPR) is the most promising processing approach.
Abstract: Given the forthcoming availability of Sentinel-2 (S2) images, this paper provides a systematic comparison of retrieval accuracy and processing speed of a multitude of parametric, non-parametric and physically-based retrieval methods using simulated S2 data. An experimental field dataset (SPARC), collected at the agricultural site of Barrax (Spain), was used to evaluate different retrieval methods on their ability to estimate leaf area index (LAI). With regard to parametric methods, all possible band combinations for several two-band and three-band index formulations and a linear regression fitting function have been evaluated. From a set of over ten thousand indices evaluated, the best performing one was an optimized three-band combination according to $(\rho_{560}-\rho_{1610}-\rho_{2190})/(\rho_{560}+\rho_{1610}+\rho_{2190})$, with a 10-fold cross-validation $R^{2}_{CV}$ of 0.82 ($RMSE_{CV}$: 0.62). This family of methods excels for its fast processing speed, e.g., 0.05 s to calibrate and validate the regression function, and 3.8 s to map a simulated S2 image. With regard to non-parametric methods, 11 machine learning regression algorithms (MLRAs) have been evaluated. This methodological family has the advantage of making use of the full optical spectrum as well as flexible, nonlinear fitting. Particularly kernel-based MLRAs lead to excellent results, with variational heteroscedastic (VH) Gaussian Processes regression (GPR) as the best performing method, with an $R^{2}_{CV}$ of 0.90 ($RMSE_{CV}$: 0.44). Additionally, the model is trained and validated relatively fast (1.70 s) and the processed image (taking 73.88 s) includes associated uncertainty estimates. More challenging is the inversion of a PROSAIL-based radiative transfer model (RTM). After the generation of a look-up table (LUT), a multitude of cost functions and regularization options were evaluated. The best performing cost function is Pearson's $\chi^{2}$. It led to an $R^{2}$ of 0.74 (RMSE: 0.80) against the validation dataset. While its validation went fast (0.33 s), due to per-pixel LUT solving with a cost function, image processing took considerably more time (01:01:47). Summarizing, when it comes to accurate and sufficiently fast processing of imagery to generate vegetation attributes, this paper concludes that the family of kernel-based MLRAs (e.g. GPR) is the most promising processing approach.
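
The winning three-band index translates directly to code; the LAI regression coefficients themselves are calibrated on the SPARC data and are not given in the text, so they are left symbolic here.

```python
# The best-performing three-band index from the text; rho_560, rho_1610
# and rho_2190 are reflectances in the S2 bands centred near those
# wavelengths (nm). Arrays or scalars both work.
import numpy as np

def three_band_index(rho560, rho1610, rho2190):
    return (rho560 - rho1610 - rho2190) / (rho560 + rho1610 + rho2190)

# LAI is then estimated via the fitted linear regression
# LAI = a * index + b, with a and b calibrated on the field data.
```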

Journal ArticleDOI
Peter Erwin
TL;DR: In this article, the authors describe an open-source astronomical image-fitting program called IMFIT, specialized for galaxies but potentially useful for other sources, which is fast, flexible, and highly extensible.
Abstract: I describe a new, open-source astronomical image-fitting program called IMFIT, specialized for galaxies but potentially useful for other sources, which is fast, flexible, and highly extensible. A key characteristic of the program is an object-oriented design that allows new types of image components (two-dimensional surface-brightness functions) to be easily written and added to the program. Image functions provided with IMFIT include the usual suspects for galaxy decompositions (Sérsic, exponential, Gaussian), along with Core-Sérsic and broken-exponential profiles, elliptical rings, and three components that perform line-of-sight integration through three-dimensional luminosity-density models of disks and rings seen at arbitrary inclinations. Available minimization algorithms include Levenberg-Marquardt, Nelder-Mead simplex, and Differential Evolution, allowing trade-offs between speed and decreased sensitivity to local minima in the fit landscape. Minimization can be done using the standard $\chi^{2}$ statistic (using either data or model values to estimate per-pixel Gaussian errors, or else user-supplied error images) or Poisson-based maximum-likelihood statistics; the latter approach is particularly appropriate for cases of Poisson data in the low-count regime. I show that fitting low-signal-to-noise ratio galaxy images using $\chi^{2}$ minimization and individual-pixel Gaussian uncertainties can lead to significant biases in fitted parameter values, which are avoided if a Poisson-based statistic is used; this is true even when Gaussian read noise is present.
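
Not IMFIT itself, but a sketch of the kind of component it fits: a Sérsic profile, here fitted in 1D with SciPy (whose default unbounded `curve_fit` is Levenberg-Marquardt) using the common approximation b_n ≈ 2n − 1/3. IMFIT's components are two-dimensional and its statistics richer.

```python
# 1D Sersic profile fit as an illustration of image-component fitting;
# synthetic data and starting guesses are assumptions.
import numpy as np
from scipy.optimize import curve_fit

def sersic(r, i_e, r_e, n):
    b_n = 2.0 * n - 1.0 / 3.0            # common approximation to b_n
    return i_e * np.exp(-b_n * ((r / r_e) ** (1.0 / n) - 1.0))

r = np.linspace(0.1, 50.0, 200)
data = sersic(r, 100.0, 10.0, 4.0) + np.random.normal(0.0, 1.0, r.size)
popt, _ = curve_fit(sersic, r, data, p0=(50.0, 5.0, 2.0))
```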

Posted Content
TL;DR: It is shown that the quality of the results improves significantly with better loss functions, even when the network architecture is left unchanged, and a novel, differentiable error function is proposed.
Abstract: Neural networks are becoming central in several areas of computer vision and image processing and different architectures have been proposed to solve specific problems. The impact of the loss layer of neural networks, however, has not received much attention in the context of image processing: the default and virtually only choice is L2. In this paper, we bring attention to alternative choices for image restoration. In particular, we show the importance of perceptually-motivated losses when the resulting image is to be evaluated by a human observer. We compare the performance of several losses, and propose a novel, differentiable error function. We show that the quality of the results improves significantly with better loss functions, even when the network architecture is left unchanged.
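
The paper's point in miniature: compare the default L2 against a perceptually-motivated alternative such as an SSIM/L1 mix. The sketch below evaluates the terms with skimage as plain image-quality measures; an actual training loss would need a differentiable SSIM, and the 0.84 mixing weight is illustrative rather than taken from the text.

```python
# L2 vs. a perceptually-motivated SSIM/L1 mix, evaluated (not trained)
# on float images in [0, 1]; the alpha weight is an assumption.
import numpy as np
from skimage.metrics import structural_similarity as ssim

def l2_loss(pred, target):
    return np.mean((pred - target) ** 2)

def mixed_loss(pred, target, alpha=0.84):
    structural = 1.0 - ssim(pred, target, data_range=1.0)
    return alpha * structural + (1.0 - alpha) * np.mean(np.abs(pred - target))
```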

Journal ArticleDOI
TL;DR: The potential of image fusion is demonstrated through 'sharpening' of IMS images, which uses microscopy measurements to predict ion distributions at a spatial resolution that exceeds that of measured ion images by ten times or more, and prediction of ion distributions in tissue areas that were not measured by IMS.
Abstract: We describe a predictive imaging modality created by 'fusing' two distinct technologies: imaging mass spectrometry (IMS) and microscopy. IMS-generated molecular maps, rich in chemical information but having coarse spatial resolution, are combined with optical microscopy maps, which have relatively low chemical specificity but high spatial information. The resulting images combine the advantages of both technologies, enabling prediction of a molecular distribution both at high spatial resolution and with high chemical specificity. Multivariate regression is used to model variables in one technology, using variables from the other technology. We demonstrate the potential of image fusion through several applications: (i) 'sharpening' of IMS images, which uses microscopy measurements to predict ion distributions at a spatial resolution that exceeds that of measured ion images by ten times or more; (ii) prediction of ion distributions in tissue areas that were not measured by IMS; and (iii) enrichment of biological signals and attenuation of instrumental artifacts, revealing insights not easily extracted from either microscopy or IMS individually.
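
A toy sketch of the regression step, under stated assumptions about array shapes: fit a per-ion model on microscopy features aggregated to IMS resolution, then predict from full-resolution features to "sharpen" the ion map. The paper's multivariate modelling is considerably more sophisticated than this linear stand-in.

```python
# micro_lowres: (n_ims_pixels, n_features) microscopy features averaged
# into IMS pixels; ion: (n_ims_pixels,) measured ion intensities;
# micro_highres: (n_micro_pixels, n_features) full-resolution features.
from sklearn.linear_model import LinearRegression

def sharpen(micro_lowres, ion, micro_highres):
    model = LinearRegression().fit(micro_lowres, ion)   # cross-modality model
    return model.predict(micro_highres)                 # predicted high-res map
```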

Proceedings ArticleDOI
07 Dec 2015
TL;DR: An automated method for image colorization that learns to colorize from examples that exploits a LEARCH framework to train a quadratic objective function in the chromaticity maps, comparable to a Gaussian random field.
Abstract: We describe an automated method for image colorization that learns to colorize from examples. Our method exploits a LEARCH framework to train a quadratic objective function in the chromaticity maps, comparable to a Gaussian random field. The coefficients of the objective function are conditioned on image features, using a random forest. The objective function admits correlations on long spatial scales, and can control spatial error in the colorization of the image. Images are then colorized by minimizing this objective function. We demonstrate that our method strongly outperforms a natural baseline on large-scale experiments with images of real scenes using a demanding loss function. We demonstrate that learning a model that is conditioned on scene produces improved results. We show how to incorporate a desired color histogram into the objective function, and that doing so can lead to further improvements in results.