
Showing papers on "Image processing" published in 2019


Proceedings ArticleDOI
Ekin D. Cubuk1, Barret Zoph1, Dandelion Mane, Vijay K. Vasudevan1, Quoc V. Le1 
15 Jun 2019
TL;DR: This paper describes a simple procedure called AutoAugment to automatically search for improved data augmentation policies, which achieves state-of-the-art accuracy on CIFAR-10, CIFAR-100, SVHN, and ImageNet (without additional data).
Abstract: Data augmentation is an effective technique for improving the accuracy of modern image classifiers. However, current data augmentation implementations are manually designed. In this paper, we describe a simple procedure called AutoAugment to automatically search for improved data augmentation policies. In our implementation, we have designed a search space where a policy consists of many sub-policies, one of which is randomly chosen for each image in each mini-batch. A sub-policy consists of two operations, each operation being an image processing function such as translation, rotation, or shearing, and the probabilities and magnitudes with which the functions are applied. We use a search algorithm to find the best policy such that the neural network yields the highest validation accuracy on a target dataset. Our method achieves state-of-the-art accuracy on CIFAR-10, CIFAR-100, SVHN, and ImageNet (without additional data). On ImageNet, we attain a Top-1 accuracy of 83.5% which is 0.4% better than the previous record of 83.1%. On CIFAR-10, we achieve an error rate of 1.5%, which is 0.6% better than the previous state-of-the-art. Augmentation policies we find are transferable between datasets. The policy learned on ImageNet transfers well to achieve significant improvements on other datasets, such as Oxford Flowers, Caltech-101, Oxford-IIT Pets, FGVC Aircraft, and Stanford Cars.
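As a rough illustration of the policy structure described in the abstract, here is a minimal sketch that applies one randomly chosen sub-policy of two probabilistic operations to a PIL image. The operations, probabilities, and magnitudes are placeholders, not a policy found by the paper's search.

```python
import random
from PIL import Image, ImageOps

# Hypothetical policy: each sub-policy is two (operation, probability, magnitude) steps.
POLICY = [
    [("rotate", 0.7, 15), ("shear_x", 0.3, 0.2)],
    [("translate_x", 0.5, 10), ("autocontrast", 0.9, None)],
]

def _apply_op(img, op, magnitude):
    if op == "rotate":
        return img.rotate(magnitude)
    if op == "shear_x":  # horizontal shear via an affine transform
        return img.transform(img.size, Image.AFFINE, (1, magnitude, 0, 0, 1, 0))
    if op == "translate_x":  # horizontal translation in pixels
        return img.transform(img.size, Image.AFFINE, (1, 0, magnitude, 0, 1, 0))
    if op == "autocontrast":
        return ImageOps.autocontrast(img)
    return img

def augment(img):
    # One sub-policy is drawn at random for each image in each mini-batch.
    for op, prob, magnitude in random.choice(POLICY):
        if random.random() < prob:
            img = _apply_op(img, op, magnitude)
    return img
```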

1,902 citations


Journal ArticleDOI
TL;DR: Comprehensive results show that the proposed CE-Net method outperforms the original U-Net method and other state-of-the-art methods for optic disc segmentation, vessel detection, lung segmentation, cell contour segmentation, and retinal optical coherence tomography layer segmentation.
Abstract: Medical image segmentation is an important step in medical image analysis. With the rapid development of convolutional neural networks in image processing, deep learning has been used for medical image segmentation, such as optic disc segmentation, blood vessel detection, lung segmentation, cell segmentation, and so on. Previously, U-Net-based approaches have been proposed. However, the consecutive pooling and strided convolutional operations lead to the loss of some spatial information. In this paper, we propose a context encoder network (CE-Net) to capture more high-level information and preserve spatial information for 2D medical image segmentation. CE-Net mainly contains three major components: a feature encoder module, a context extractor, and a feature decoder module. We use a pretrained ResNet block as the fixed feature extractor. The context extractor module is formed by a newly proposed dense atrous convolution block and a residual multi-kernel pooling block. We applied the proposed CE-Net to different 2D medical image segmentation tasks. Comprehensive results show that the proposed method outperforms the original U-Net method and other state-of-the-art methods for optic disc segmentation, vessel detection, lung segmentation, cell contour segmentation, and retinal optical coherence tomography layer segmentation.
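To make the residual multi-kernel pooling idea concrete, here is a minimal PyTorch sketch of such a block: pool the feature map at several scales, project each pooled map to one channel, upsample, and concatenate with the input. The pooling sizes and 1x1 projections follow the common formulation of this kind of block; treat the details as assumptions rather than the paper's exact code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualMultiKernelPooling(nn.Module):
    """Sketch of an RMP-style block (pool sizes are illustrative)."""
    def __init__(self, in_channels, pool_sizes=(2, 3, 5, 6)):
        super().__init__()
        self.pool_sizes = pool_sizes
        self.convs = nn.ModuleList(
            [nn.Conv2d(in_channels, 1, kernel_size=1) for _ in pool_sizes]
        )

    def forward(self, x):
        h, w = x.shape[2:]
        branches = [x]  # residual path keeps the original features
        for k, conv in zip(self.pool_sizes, self.convs):
            y = F.max_pool2d(x, kernel_size=k, stride=k)
            y = F.interpolate(conv(y), size=(h, w),
                              mode="bilinear", align_corners=False)
            branches.append(y)
        # Output has in_channels + len(pool_sizes) channels.
        return torch.cat(branches, dim=1)
```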

906 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present a systematic review of the deep learning-based hyperspectral image classification literature and compare several strategies, which can provide guidelines for future studies on this topic.
Abstract: Hyperspectral image (HSI) classification has become a hot topic in the field of remote sensing. In general, the complex characteristics of hyperspectral data make the accurate classification of such data challenging for traditional machine learning methods. In addition, hyperspectral imaging often deals with an inherently nonlinear relation between the captured spectral information and the corresponding materials. In recent years, deep learning has been recognized as a powerful feature-extraction tool for effectively addressing nonlinear problems, and it has been widely used in a number of image processing tasks. Motivated by those successful applications, deep learning has also been introduced to classify HSIs and has demonstrated good performance. This survey paper presents a systematic review of the deep learning-based HSI classification literature and compares several strategies for this topic. Specifically, we first summarize the main challenges of HSI classification that cannot be effectively overcome by traditional machine learning methods, and we also introduce the advantages of deep learning in handling these problems. Then, we build a framework that divides the corresponding works into spectral-feature networks, spatial-feature networks, and spectral-spatial-feature networks to systematically review the recent achievements in deep learning-based HSI classification. In addition, considering that available training samples in the remote sensing field are usually very limited and that training deep networks requires a large number of samples, we include some strategies to improve classification performance, which can provide guidelines for future studies on this topic. Finally, we conduct experiments with several representative deep learning-based classification methods on real HSIs.
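To make the taxonomy concrete, the toy PyTorch model below is a spectral-feature network, the simplest of the three categories: a 1D CNN over a single pixel's spectral vector, ignoring spatial context. Band count, layer widths, and class count are illustrative only.

```python
import torch.nn as nn

class SpectralFeatureNet(nn.Module):
    """Toy spectral-feature network for per-pixel HSI classification."""
    def __init__(self, num_bands=200, num_classes=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):  # x: (batch, 1, num_bands)
        return self.classifier(self.features(x).squeeze(-1))
```

A spectral-spatial-feature network would instead take a small patch around each pixel and use 2D or 3D convolutions.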

761 citations


Posted ContentDOI
15 Feb 2019-bioRxiv
TL;DR: A high-level overview of the features of the MRtrix3 framework and general-purpose image processing applications provided with the software is provided.
Abstract: MRtrix3 is an open-source, cross-platform software package for medical image processing, analysis and visualization, with a particular emphasis on the investigation of the brain using diffusion MRI. It is implemented using a fast, modular and flexible general-purpose code framework for image data access and manipulation, enabling efficient development of new applications, whilst retaining high computational performance and a consistent command-line interface between applications. In this article, we provide a high-level overview of the features of the MRtrix3 framework and general-purpose image processing applications provided with the software.

728 citations


Journal ArticleDOI
TL;DR: The intersection between deep learning and cellular image analysis is reviewed and an overview of both the mathematical mechanics and the programming frameworks of deep learning that are pertinent to life scientists are provided.
Abstract: Recent advances in computer vision and machine learning underpin a collection of algorithms with an impressive ability to decipher the content of images. These deep learning algorithms are being applied to biological images and are transforming the analysis and interpretation of imaging data. These advances are positioned to render difficult analyses routine and to enable researchers to carry out new, previously impossible experiments. Here we review the intersection between deep learning and cellular image analysis and provide an overview of both the mathematical mechanics and the programming frameworks of deep learning that are pertinent to life scientists. We survey the field's progress in four key applications: image classification, image segmentation, object tracking, and augmented microscopy. Last, we relay our labs' experience with three key aspects of implementing deep learning in the laboratory: annotating training data, selecting and training a range of neural network architectures, and deploying solutions. We also highlight existing datasets and implementations for each surveyed application.

714 citations


Journal ArticleDOI
TL;DR: Warp is described, software that automates all preprocessing steps of cryo-EM data acquisition and enables real-time evaluation, and includes deep-learning-based models for accurate particle picking and image denoising.
Abstract: The acquisition of cryo-electron microscopy (cryo-EM) data from biological specimens must be tightly coupled to data preprocessing to ensure the best data quality and microscope usage. Here we describe Warp, software that automates all preprocessing steps of cryo-EM data acquisition and enables real-time evaluation. Warp corrects micrographs for global and local motion, estimates the local defocus, and monitors key parameters for each recorded micrograph or tomographic tilt series in real time. The software further includes deep-learning-based models for accurate particle picking and image denoising. The output from Warp can be fed into established programs for particle classification and 3D-map refinement. Our benchmarks show improvement in the nominal resolution of a published cryo-EM data set for influenza virus hemagglutinin, from 3.9 Å to 3.2 Å. Warp is easy to install from http://github.com/cramerlab/warp, computationally inexpensive, and has an intuitive, streamlined user interface. The user-friendly software tool Warp enables automated, on-the-fly preprocessing of cryo-EM data, including motion correction, defocus estimation, particle picking and image denoising.

655 citations


Journal ArticleDOI
TL;DR: This paper provides an analysis of 8 different evaluation metrics and their properties, and makes recommendations for metric selections under specific assumptions and for specific applications.
Abstract: How best to evaluate a saliency model's ability to predict where humans look in images is an open research question. The choice of evaluation metric depends on how saliency is defined and how the ground truth is represented. Metrics differ in how they rank saliency models, and this results from how false positives and false negatives are treated, whether viewing biases are accounted for, whether spatial deviations are factored in, and how the saliency maps are pre-processed. In this paper, we provide an analysis of 8 different evaluation metrics and their properties. With the help of systematic experiments and visualizations of metric computations, we add interpretability to saliency scores and more transparency to the evaluation of saliency models. Building off the differences in metric properties and behaviors, we make recommendations for metric selections under specific assumptions and for specific applications.
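One of the simpler metrics in this family, normalized scanpath saliency (NSS), fits in a few lines; the sketch below follows the standard definition (standardize the saliency map, then average it at fixation locations) rather than code from the paper.

```python
import numpy as np

def nss(saliency_map, fixation_map):
    """NSS: mean standardized saliency value at human fixations.
    fixation_map is a binary array marking fixated pixels."""
    s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-8)
    return float(s[fixation_map.astype(bool)].mean())
```

Higher is better, and a value of 0 corresponds to chance-level prediction; metrics like this treat the ground truth as discrete fixation points, which is one way the choice of metric depends on how the ground truth is represented.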

526 citations


Journal ArticleDOI
TL;DR: The 2018 Data Science Bowl attracted 3,891 teams worldwide to make the first attempt to build a segmentation method that could be applied to any two-dimensional light microscopy image of stained nuclei across experiments, with no human interaction.
Abstract: Segmenting the nuclei of cells in microscopy images is often the first step in the quantitative analysis of imaging data for biological and biomedical applications. Many bioimage analysis tools can segment nuclei in images but need to be selected and configured for every experiment. The 2018 Data Science Bowl attracted 3,891 teams worldwide to make the first attempt to build a segmentation method that could be applied to any two-dimensional light microscopy image of stained nuclei across experiments, with no human interaction. Top participants in the challenge succeeded in this task, developing deep-learning-based models that identified cell nuclei across many image types and experimental conditions without the need to manually adjust segmentation parameters. This represents an important step toward configuration-free bioimage analysis software tools.

400 citations


Posted Content
TL;DR: The increasing popularity of unrolled deep networks is due, in part, to their potential in developing efficient, high-performance (yet interpretable) network architectures from reasonably sized training sets.
Abstract: Deep neural networks provide unprecedented performance gains in many real-world problems in signal and image processing. Despite these gains, the future development and practical deployment of deep networks are hindered by their black-box nature, i.e., lack of interpretability, and by the need for very large training sets. An emerging technique called algorithm unrolling, or unfolding, offers promise in eliminating these issues by providing a concrete and systematic connection between iterative algorithms that are widely used in signal processing and deep neural networks. Unrolling methods were first proposed to develop fast neural network approximations for sparse coding. More recently, this direction has attracted enormous attention and is rapidly growing in both theoretical investigations and practical applications. The growing popularity of unrolled deep networks is due in part to their potential for developing efficient, high-performance, and yet interpretable network architectures from reasonably sized training sets. In this article, we review algorithm unrolling for signal and image processing. We extensively cover popular techniques for algorithm unrolling in various domains of signal and image processing, including imaging, vision and recognition, and speech processing. By reviewing previous works, we reveal the connections between iterative algorithms and neural networks and present recent theoretical results. Finally, we discuss current limitations of unrolling and suggest possible future research directions.
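A minimal sketch of the sparse-coding case that started this line of work: each soft-thresholded ISTA iteration becomes a network layer with learnable weights and a learnable threshold, in the spirit of LISTA. Dimensions and layer count are illustrative, not taken from the article.

```python
import torch
import torch.nn as nn

class UnrolledISTA(nn.Module):
    """LISTA-style unrolling: a fixed number of ISTA iterations as layers."""
    def __init__(self, signal_dim, measurement_dim, num_layers=5):
        super().__init__()
        self.B = nn.Linear(measurement_dim, signal_dim, bias=False)  # plays A^T
        self.S = nn.Linear(signal_dim, signal_dim, bias=False)       # plays I - A^T A
        self.theta = nn.Parameter(torch.full((num_layers,), 0.1))    # per-layer threshold

    @staticmethod
    def soft(z, t):  # soft-thresholding, the proximal step of ISTA
        return torch.sign(z) * torch.clamp(z.abs() - t, min=0.0)

    def forward(self, y):  # y: (batch, measurement_dim)
        x = self.soft(self.B(y), self.theta[0])
        for t in range(1, len(self.theta)):
            x = self.soft(self.B(y) + self.S(x), self.theta[t])
        return x
```

Trained end-to-end on (y, x) pairs, such a network typically reaches a given accuracy in far fewer layers than ISTA needs iterations, which is the efficiency argument made above.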

Journal ArticleDOI
TL;DR: The estimation error shows that the presented algorithm is comparable to the minimum mean square error (MMSE) with full knowledge of the channel statistics, and it is better than an approximation to linear MMSE.
Abstract: In this letter, we present a deep learning algorithm for channel estimation in communication systems. We consider the time–frequency response of a fast fading communication channel as a 2D image. The aim is to find the unknown values of the channel response using some known values at the pilot locations. To this end, a general pipeline using deep image processing techniques, image super-resolution (SR), and image restoration (IR) is proposed. This scheme considers the pilot values, altogether, as a low-resolution image and uses an SR network cascaded with a denoising IR network to estimate the channel. Moreover, the implementation of the proposed pipeline is presented. The estimation error shows that the presented algorithm is comparable to the minimum mean square error (MMSE) with full knowledge of the channel statistics, and it is better than an approximation to linear MMSE. The results confirm that this pipeline can be used efficiently in channel estimation.
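The pipeline can be sketched compactly: treat the pilot observations as a low-resolution image, interpolate up to the full time-frequency grid, then refine with an SR-style network followed by a denoising network. The small CNNs below are placeholders standing in for the SR and IR networks the letter describes.

```python
import torch.nn as nn
import torch.nn.functional as F

class ChannelEstimator(nn.Module):
    """Sketch: pilot grid -> upsample -> SR refinement -> denoising."""
    def __init__(self):
        super().__init__()
        self.sr = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))
        self.denoise = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, pilots, grid_size):
        # pilots: (batch, 1, n_pilot_symbols, n_pilot_subcarriers)
        x = F.interpolate(pilots, size=grid_size,
                          mode="bilinear", align_corners=False)
        x = x + self.sr(x)          # super-resolution stage (residual)
        return x + self.denoise(x)  # restoration / denoising stage
```

A real implementation must also decide how to represent the complex channel response, e.g., real and imaginary parts as separate image channels; that detail is omitted here.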

Proceedings ArticleDOI
15 Jun 2019
TL;DR: In this paper, the authors propose a technique to "unprocess" images by inverting each step of an image processing pipeline, thereby allowing them to synthesize realistic raw sensor measurements from commonly available Internet photos.
Abstract: Machine learning techniques work best when the data used for training resembles the data used for evaluation. This holds true for learned single-image denoising algorithms, which are applied to real raw camera sensor readings but, due to practical constraints, are often trained on synthetic image data. Though it is understood that generalizing from synthetic to real images requires careful consideration of the noise properties of camera sensors, the other aspects of an image processing pipeline (such as gain, color correction, and tone mapping) are often overlooked, despite their significant effect on how raw measurements are transformed into finished images. To address this, we present a technique to “unprocess” images by inverting each step of an image processing pipeline, thereby allowing us to synthesize realistic raw sensor measurements from commonly available Internet photos. We additionally model the relevant components of an image processing pipeline when evaluating our loss function, which allows training to be aware of all relevant photometric processing that will occur after denoising. By unprocessing and processing training data and model outputs in this way, we are able to train a simple convolutional neural network that has 14%-38% lower error rates and is 9×-18× faster than the previous state of the art on the Darmstadt Noise Dataset, and generalizes to sensors outside of that dataset as well.
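As a toy illustration of the inversion idea, the snippet below undoes three late pipeline stages (a smoothstep tone curve, display gamma, and white-balance gains) on an sRGB-like image in [0, 1]. The specific curve and gains are assumptions for illustration; the full method also inverts color correction and other stages.

```python
import numpy as np

def unprocess(srgb, red_gain=2.0, blue_gain=1.6):
    img = np.clip(np.asarray(srgb, dtype=np.float64), 0.0, 1.0)
    # Invert a smoothstep tone curve y = 3x^2 - 2x^3.
    img = 0.5 - np.sin(np.arcsin(1.0 - 2.0 * img) / 3.0)
    # Invert display gamma (approximated here as 2.2).
    img = np.clip(img, 0.0, 1.0) ** 2.2
    # Invert per-channel white-balance gains (illustrative values).
    img[..., 0] /= red_gain
    img[..., 2] /= blue_gain
    return img
```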

Journal ArticleDOI
TL;DR: In this article, a conditional generative adversarial network (GAN) was proposed to preserve intermediate-to-high frequency details via an adversarial loss, and it offers enhanced synthesis performance via pixel-wise and perceptual losses for registered multi-contrast images and a cycle-consistency loss for unregistered images.
Abstract: Acquiring images of the same anatomy with multiple different contrasts increases the diversity of diagnostic information available in an MR exam. Yet, scan time limitations may prohibit the acquisition of certain contrasts, and some contrasts may be corrupted by noise and artifacts. In such cases, the ability to synthesize unacquired or corrupted contrasts can improve diagnostic utility. For multi-contrast synthesis, current methods learn a nonlinear intensity transformation between the source and target images, either via nonlinear regression or deterministic neural networks. These methods can, in turn, suffer from the loss of structural details in synthesized images. In this paper, we propose a new approach for multi-contrast MRI synthesis based on conditional generative adversarial networks. The proposed approach preserves intermediate-to-high frequency details via an adversarial loss, and it offers enhanced synthesis performance via pixel-wise and perceptual losses for registered multi-contrast images and a cycle-consistency loss for unregistered images. Information from neighboring cross-sections is utilized to further improve synthesis quality. Demonstrations on T1- and T2-weighted images from healthy subjects and patients clearly indicate the superior performance of the proposed approach compared to previous state-of-the-art methods. Our synthesis approach can help improve the quality and versatility of multi-contrast MRI exams without the need for prolonged or repeated examinations.
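A compressed sketch of a composite generator objective for registered pairs is shown below; the weights are placeholders, and `vgg_feats` stands for a hypothetical fixed feature extractor for the perceptual term. For unregistered images, the pixel-wise term would be replaced by a cycle-consistency loss.

```python
import torch
import torch.nn.functional as F

def generator_loss(disc_fake, fake, target, vgg_feats,
                   lam_pix=100.0, lam_perc=10.0):
    # Adversarial term: push the discriminator toward labeling fakes as real.
    adv = F.binary_cross_entropy_with_logits(
        disc_fake, torch.ones_like(disc_fake))
    pix = F.l1_loss(fake, target)                         # pixel-wise term
    perc = F.l1_loss(vgg_feats(fake), vgg_feats(target))  # perceptual term
    return adv + lam_pix * pix + lam_perc * perc
```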

Journal ArticleDOI
TL;DR: A gentle introduction to deep learning in medical image processing is given, proceeding from theoretical foundations to applications, and covering general reasons for the popularity of deep learning, including several major breakthroughs in computer science.
Abstract: This paper gives a gentle introduction to deep learning in medical image processing, proceeding from theoretical foundations to applications. We first discuss general reasons for the popularity of deep learning, including several major breakthroughs in computer science. Next, we review the fundamental basics of the perceptron and neural networks, along with some fundamental theory that is often omitted. Doing so allows us to understand the reasons for the rise of deep learning in many application domains. Obviously, medical image processing is one of the areas that has been largely affected by this rapid progress, in particular in image detection and recognition, image segmentation, image registration, and computer-aided diagnosis. There are also recent trends in physical simulation, modeling, and reconstruction that have led to astonishing results. Yet, some of these approaches neglect prior knowledge and hence bear the risk of producing implausible results. These apparent weaknesses highlight current limitations of deep learning. However, we also briefly discuss promising approaches that might be able to resolve these problems in the future.

Journal ArticleDOI
TL;DR: An Enhanced Convolutional Neural Network (ECNN), with loss function optimization by the BAT algorithm, is proposed for automatic brain tumor segmentation, and experimental results show better performance compared with existing methods.
Abstract: In medical image processing, brain tumor segmentation plays an important role. Early detection of these tumors is essential for treatment and improves a patient's chances of survival. Physicians normally diagnose brain tumors by segmenting images manually, which is time-consuming and difficult. To solve these problems, an Enhanced Convolutional Neural Network (ECNN), with loss function optimization by the BAT algorithm, is proposed for automatic segmentation. The primary aim is to present an optimization-based segmentation method for MRI images. Small kernels allow a deep architecture design and, because fewer weights are assigned to the network, have a positive effect against overfitting. Skull stripping and image enhancement algorithms are used for pre-processing. Experimental results show better performance compared with existing methods on precision, recall, and accuracy. In future work, different selection schemes can be adopted to improve accuracy.

Journal ArticleDOI
TL;DR: Key questions are discussed, including how to obtain training data, whether discovery of unknown structures is possible, and the danger of inferring unsubstantiated image details.
Abstract: Deep learning is becoming an increasingly important tool for image reconstruction in fluorescence microscopy. We review state-of-the-art applications such as image restoration and super-resolution imaging, and discuss how the latest deep learning research could be applied to other image reconstruction tasks. Despite its successes, deep learning also poses substantial challenges and has limits. We discuss key questions, including how to obtain training data, whether discovery of unknown structures is possible, and the danger of inferring unsubstantiated image details.

Journal ArticleDOI
TL;DR: This paper provides a comprehensive survey of texture feature extraction methods and identifies two classes of methods that deserve attention in the future, as their performance seems promising but has not yet been thoroughly studied.
Abstract: Texture analysis is used in a very broad range of fields and applications, from texture classification (e.g., for remote sensing) to segmentation (e.g., in biomedical imaging), passing through image synthesis and pattern recognition (e.g., for image inpainting). For each of these image processing procedures, it is first necessary to extract, from raw images, meaningful features that describe the texture properties. Various feature extraction methods have been proposed in recent decades. Each of them has its advantages and limitations: the performance of some is invariant to translation, rotation, affine, and perspective transforms; others have low computational complexity; still others are easy to implement; and so on. This paper provides a comprehensive survey of texture feature extraction methods. These are categorized into seven classes: statistical approaches, structural approaches, transform-based approaches, model-based approaches, graph-based approaches, learning-based approaches, and entropy-based approaches. For each method in these seven classes, we present the concept, the advantages, and the drawbacks, and we give examples of application. This survey allows us to identify two classes of methods that particularly deserve attention in the future, as their performance seems promising but has not yet been thoroughly studied.
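As one concrete example from the statistical class, a gray-level co-occurrence matrix (GLCM) and a derived contrast feature can be computed as below; the bin count and pixel offset are arbitrary choices for illustration.

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Normalized co-occurrence matrix for offset (dx, dy) after
    quantizing the image to `levels` gray levels."""
    q = (img.astype(np.float64) / (img.max() + 1e-9) * (levels - 1)).astype(int)
    mat = np.zeros((levels, levels))
    h, w = q.shape
    for i in range(max(0, -dy), h - max(0, dy)):
        for j in range(max(0, -dx), w - max(0, dx)):
            mat[q[i, j], q[i + dy, j + dx]] += 1
    return mat / mat.sum()

def contrast(p):
    """GLCM contrast: expected squared gray-level difference."""
    i, j = np.indices(p.shape)
    return float(((i - j) ** 2 * p).sum())
```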

Journal ArticleDOI
TL;DR: In this article, the authors divide semantic image segmentation methods into two categories, traditional and recent DNN-based methods, and comprehensively investigate recent DNN-based methods along eight aspects: fully convolutional networks, up-sampling approaches, FCNs joint with CRF methods, dilated convolution approaches, progress in backbone networks, pyramid methods, multi-level feature and multi-stage methods, and supervised, weakly-supervised, and unsupervised methods.
Abstract: Semantic image segmentation, which has become one of the key applications in the image processing and computer vision domain, is used in multiple areas such as medicine and intelligent transportation. Many benchmark datasets have been released for researchers to verify their algorithms. Semantic segmentation has been studied for many years. Since the emergence of the deep neural network (DNN), segmentation has made tremendous progress. In this paper, we divide semantic image segmentation methods into two categories: traditional and recent DNN-based methods. First, we briefly summarize the traditional methods as well as the datasets released for segmentation; then we comprehensively investigate recent DNN-based methods, which are described along eight aspects: fully convolutional networks, up-sampling approaches, FCNs joint with CRF methods, dilated convolution approaches, progress in backbone networks, pyramid methods, multi-level feature and multi-stage methods, and supervised, weakly-supervised, and unsupervised methods. Finally, a conclusion for this area is drawn.
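Of the ingredients listed above, dilated convolution is the easiest to show in isolation: it enlarges the receptive field without pooling and without extra parameters. A minimal PyTorch check (tensor sizes are arbitrary):

```python
import torch
import torch.nn as nn

conv_dense   = nn.Conv2d(64, 64, kernel_size=3, padding=1, dilation=1)
conv_dilated = nn.Conv2d(64, 64, kernel_size=3, padding=2, dilation=2)

x = torch.randn(1, 64, 128, 128)
# Same parameter count and same output size, but the dilated kernel
# covers a 5x5 neighborhood with only 3x3 weights.
assert conv_dilated(x).shape == conv_dense(x).shape
```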

Journal ArticleDOI
TL;DR: This paper proposes a simple approach to implicitly select skin tissues based on their distinct pulsatility feature and shows that this method outperforms state-of-the-art algorithms, without any critical face or skin detection.

Journal ArticleDOI
TL;DR: This work proposes a simple convolutional neural network optimized for the problem of tuberculosis diagnosis, which is faster and more efficient than previous models but preserves their accuracy.
Abstract: Automated diagnosis of tuberculosis (TB) from chest X-rays (CXR) has been tackled with either hand-crafted algorithms or machine learning approaches such as support vector machines (SVMs) and convolutional neural networks (CNNs). Most deep neural networks applied to the task of tuberculosis diagnosis have been adapted from natural image classification. These models have a large number of parameters as well as high hardware requirements, which makes them prone to overfitting and harder to deploy in mobile settings. We propose a simple convolutional neural network optimized for the problem, which is faster and more efficient than previous models but preserves their accuracy. Moreover, the visualization capabilities of CNNs have not been fully investigated. We test saliency maps and grad-CAMs as tuberculosis visualization methods, and we discuss them from a radiological perspective.
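Grad-CAM, one of the two visualization methods tested, admits a compact generic implementation. The sketch below follows the standard formulation (spatially pool the class-score gradients, weight the feature maps, apply ReLU) and is not the paper's code; `feature_layer` is whatever convolutional module of the model you want to visualize.

```python
import torch.nn.functional as F

def grad_cam(model, feature_layer, image, class_idx):
    feats, grads = [], []
    h1 = feature_layer.register_forward_hook(
        lambda mod, inp, out: feats.append(out))
    h2 = feature_layer.register_full_backward_hook(
        lambda mod, gin, gout: grads.append(gout[0]))
    model(image)[0, class_idx].backward()  # image: (1, C, H, W)
    h1.remove(); h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)  # pooled gradients
    cam = F.relu((weights * feats[0]).sum(dim=1))      # weighted feature maps
    return cam / (cam.max() + 1e-8)                    # normalize to [0, 1]
```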

Journal ArticleDOI
TL;DR: A deep attention-based spatially recursive model that can learn to attend to critical object parts and encode them into spatially expressive representations and is end-to-end trainable to serve as the part detector and feature extractor.
Abstract: Fine-grained visual recognition is an important problem in pattern recognition applications. However, it is a challenging task due to the subtle interclass differences and large intraclass variation. Recent visual attention models are able to automatically locate critical object parts and represent them against appearance variations. However, without consideration of spatial dependencies in discriminative feature learning, these methods underperform in classifying fine-grained objects. In this paper, we present a deep attention-based spatially recursive model that can learn to attend to critical object parts and encode them into spatially expressive representations. Our network is technically premised on bilinear pooling, enabling local pairwise feature interactions between outputs from two different convolutional neural networks (CNNs) that correspond to distinct region detection and relevant feature extraction. Then, spatial long short-term memory (LSTM) units are introduced to generate spatially meaningful hidden representations via the long-range dependency on all features in two dimensions. The attention model is leveraged between bilinear outcomes and spatial LSTMs for dynamic selection on varied inputs. Our model, which is composed of two-stream CNN layers, bilinear pooling, and spatial recursive encoding with attention, is end-to-end trainable to serve as the part detector and feature extractor whereby relevant features are localized, extracted, and encoded spatially for recognition purposes. We demonstrate the superiority of our method on two typical fine-grained recognition tasks: fine-grained image classification and person re-identification.
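The bilinear pooling the model is premised on reduces to an outer product of two feature maps averaged over spatial locations; the signed square-root and L2 normalization at the end are common practice for such features, assumed here rather than quoted from the paper.

```python
import torch
import torch.nn.functional as F

def bilinear_pool(fa, fb):
    """fa: (b, ca, h, w), fb: (b, cb, h, w) from two CNN streams."""
    b, ca, h, w = fa.shape
    cb = fb.shape[1]
    fa = fa.reshape(b, ca, h * w)
    fb = fb.reshape(b, cb, h * w)
    phi = torch.bmm(fa, fb.transpose(1, 2)) / (h * w)     # (b, ca, cb)
    phi = phi.reshape(b, ca * cb)
    phi = torch.sign(phi) * torch.sqrt(phi.abs() + 1e-8)  # signed sqrt
    return F.normalize(phi)                               # L2 normalization
```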

Journal ArticleDOI
TL;DR: A joint model is built to integrate the nonlocal self-similarity of video/hyperspectral frames and the rank minimization approach with the SCI sensing process, and an alternating minimization algorithm is developed to solve the non-convex problem of SCI reconstruction.
Abstract: Snapshot compressive imaging (SCI) refers to compressive imaging systems where multiple frames are mapped into a single measurement, with video compressive imaging and hyperspectral compressive imaging as two representative applications. Though exciting results of high-speed videos and hyperspectral images have been demonstrated, the poor reconstruction quality precludes SCI from wide applications. This paper aims to boost the reconstruction quality of SCI via exploiting the high-dimensional structure in the desired signal. We build a joint model to integrate the nonlocal self-similarity of video/hyperspectral frames and the rank minimization approach with the SCI sensing process. Following this, an alternating minimization algorithm is developed to solve this non-convex problem. We further investigate the special structure of the sampling process in SCI to tackle the computational workload and memory issues in SCI reconstruction. Both simulation and real data (captured by four different SCI cameras) results demonstrate that our proposed algorithm leads to significant improvements compared with current state-of-the-art algorithms. We hope our results will encourage the researchers and engineers to pursue further in compressive imaging for real applications.
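The SCI sensing process itself is compact enough to state directly: B mask-modulated frames collapse into a single snapshot measurement. A toy NumPy version with illustrative sizes and random binary masks:

```python
import numpy as np

B, H, W = 8, 64, 64                              # frames and spatial size
frames = np.random.rand(B, H, W)                 # video/hyperspectral cube
masks = np.random.randint(0, 2, size=(B, H, W))  # per-frame binary masks
measurement = (masks * frames).sum(axis=0)       # one 2D snapshot encodes B frames
```

Reconstruction then amounts to inverting this heavily underdetermined map, which is where the nonlocal self-similarity and rank-minimization priors come in.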

Journal ArticleDOI
TL;DR: A novel algorithm using synergetic neural networks for robustness and security of digital image watermarking is proposed, which obtains an optimal peak signal-to-noise ratio (PSNR) and can complete certain image processing operations with improved performance.

Journal ArticleDOI
TL;DR: Wasserstein convolutional neural network (WCNN) as discussed by the authors was proposed to learn invariant features between near-infrared (NIR) and visual (VIS) face images, and the Wasserstein distance was introduced into the NIR-VIS shared layer to measure the dissimilarity between heterogeneous feature distributions.
Abstract: Heterogeneous face recognition (HFR) aims at matching facial images acquired from different sensing modalities, with mission-critical applications in forensics, security, and commercial sectors. However, HFR presents more challenging issues than traditional face recognition because of the large intra-class variation among heterogeneous face images and the limited availability of training samples of cross-modality face image pairs. This paper proposes the novel Wasserstein convolutional neural network (WCNN) approach for learning invariant features between near-infrared (NIR) and visual (VIS) face images (i.e., NIR-VIS face recognition). The low-level layers of the WCNN are trained with widely available face images in the VIS spectrum, and the high-level layer is divided into three parts: the NIR layer, the VIS layer, and the NIR-VIS shared layer. The first two layers aim at learning modality-specific features, and the NIR-VIS shared layer is designed to learn a modality-invariant feature subspace. The Wasserstein distance is introduced into the NIR-VIS shared layer to measure the dissimilarity between heterogeneous feature distributions. WCNN learning is performed to minimize the Wasserstein distance between the NIR distribution and the VIS distribution for invariant deep feature representations of heterogeneous face images. To avoid the over-fitting problem on small-scale heterogeneous face data, a correlation prior is introduced on the fully connected WCNN layers to reduce the size of the parameter space. This prior is implemented by a low-rank constraint in an end-to-end network. The joint formulation leads to an alternating minimization for deep feature representation at the training stage and an efficient computation for heterogeneous data at the testing stage. Extensive experiments using three challenging NIR-VIS face recognition databases demonstrate the superiority of the WCNN method over state-of-the-art methods.
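To illustrate the role of the shared-layer distance, the toy function below computes a squared 2-Wasserstein distance between two batches of features under a diagonal-Gaussian assumption. This is a simplification for illustration, not the paper's exact formulation.

```python
import torch

def w2_gaussian_sq(feat_nir: torch.Tensor, feat_vis: torch.Tensor) -> torch.Tensor:
    """Squared 2-Wasserstein distance between diagonal Gaussians fitted
    to two feature batches of shape (batch, feature_dim)."""
    m1, m2 = feat_nir.mean(dim=0), feat_vis.mean(dim=0)
    s1, s2 = feat_nir.std(dim=0), feat_vis.std(dim=0)
    return ((m1 - m2) ** 2).sum() + ((s1 - s2) ** 2).sum()
```

Minimizing this quantity over the shared-layer parameters pulls the NIR and VIS feature distributions together, which is the modality-invariance objective described above.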

Journal ArticleDOI
TL;DR: The effectiveness of the machine learning method for on-line detection of defects due to process non-conformities is confirmed, providing the basis for adaptive SLM process control and part quality assurance.

Proceedings ArticleDOI
01 Jun 2019
TL;DR: A new loss function that incorporates area and size information, integrated into a dense deep learning model, is proposed; it outperforms the mainstream cross-entropy loss function on two common segmentation networks.
Abstract: Image segmentation is an important step in medical image processing and has been widely studied and developed for the refinement of clinical analysis and applications. New models based on deep learning have improved results but are restricted to pixel-wise fitting of the segmentation map. Our aim was to tackle this limitation by developing a new model based on deep learning that takes into account the area inside as well as outside the region of interest, together with the size of boundaries, during learning. Specifically, we propose a new loss function that incorporates area and size information and integrate it into a dense deep learning model. We evaluated our approach on a dataset of more than 2,000 cardiac MRI scans. Our results show that the proposed loss function outperforms the mainstream cross-entropy loss function on two common segmentation networks. Our loss function is also robust across different values of the hyperparameter lambda.
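As a hypothetical flavor of such a loss, the sketch below combines pixel-wise binary cross-entropy with a penalty on the mismatch between predicted and true region area; the paper's actual terms and weighting differ, and `lam` stands in for the lambda hyperparameter mentioned above.

```python
import torch.nn.functional as F

def area_aware_loss(pred, target, lam=1.0):
    """pred: sigmoid probabilities, target: binary masks, both (b, 1, h, w)."""
    bce = F.binary_cross_entropy(pred, target)
    # Fraction of the image covered by the region, per sample.
    area_pred = pred.mean(dim=(1, 2, 3))
    area_true = target.mean(dim=(1, 2, 3))
    return bce + lam * ((area_pred - area_true) ** 2).mean()
```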

Journal ArticleDOI
01 Jan 2019
TL;DR: An algorithmic architecture for supervised multimodal image analysis with cross-modality fusion at the feature learning level, classifier level, and decision-making level is proposed, and an image segmentation system based on deep convolutional neural networks is designed to contour the lesions of soft tissue sarcomas using multimodal images.
Abstract: Multimodality medical imaging techniques have been increasingly applied in clinical practice and research studies. Corresponding multimodal image analysis and ensemble learning schemes have seen rapid growth and bring unique value to medical applications. Motivated by the recent success of applying deep learning methods to medical image processing, we first propose an algorithmic architecture for supervised multimodal image analysis with cross-modality fusion at the feature learning level, classifier level, and decision-making level. We then design and implement an image segmentation system based on deep convolutional neural networks to contour the lesions of soft tissue sarcomas using multimodal images, including those from magnetic resonance imaging, computed tomography, and positron emission tomography. The network trained with multimodal images shows superior performance compared to networks trained with single-modal images. For the task of tumor segmentation, performing image fusion within the network (i.e., fusing at convolutional or fully connected layers) is generally better than fusing images at the network output (i.e., voting). This paper provides empirical guidance for the design and application of multimodal image analysis.
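Fusion at the feature-learning level, the first of the three levels above, might look like the sketch below: one encoder per modality, with feature maps concatenated before a shared segmentation head. Decision-making-level fusion would instead combine the per-modality outputs, e.g., by voting. All module shapes here are assumptions.

```python
import torch
import torch.nn as nn

class FeatureLevelFusion(nn.Module):
    def __init__(self, encoders, fused_channels, num_classes=2):
        super().__init__()
        self.encoders = nn.ModuleList(encoders)  # e.g., one per MRI/CT/PET input
        self.head = nn.Conv2d(fused_channels, num_classes, kernel_size=1)

    def forward(self, inputs):  # one tensor per modality, same spatial size
        feats = [enc(x) for enc, x in zip(self.encoders, inputs)]
        return self.head(torch.cat(feats, dim=1))  # fuse at the feature level
```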

Journal ArticleDOI
29 Mar 2019-PLOS ONE
TL;DR: A novel convolutional neural network, which includes a convolutional layer, a small SE-ResNet module, and a fully connected layer, is designed; it achieves similar performance with fewer parameters, and a new learning rate scheduler obtains excellent performance without complicated fine-tuning of the learning rate.
Abstract: Although the successful detection of malignant tumors from histopathological images largely depends on the long-term experience of radiologists, experts sometimes disagree with their decisions. Computer-aided diagnosis provides a second option for image diagnosis, which can improve the reliability of experts' decision-making. Automatic and precise classification of breast cancer histopathological images is of great importance in clinical applications for identifying malignant tumors. Advanced convolutional neural network technology has achieved great success in natural image classification, and it has been used widely in biomedical image processing. In this paper, we design a novel convolutional neural network, which includes a convolutional layer, a small SE-ResNet module, and a fully connected layer. We propose a small SE-ResNet module, an improvement on the combination of the residual module and the Squeeze-and-Excitation block, which achieves similar performance with fewer parameters. In addition, we propose a new learning rate scheduler that obtains excellent performance without complicated fine-tuning of the learning rate. We use our model for the automatic classification of breast cancer histology images (the BreakHis dataset) into benign and malignant classes and into eight subtypes. The results show that our model achieves an accuracy between 98.87% and 99.34% for the binary classification and between 90.66% and 93.81% for the multi-class classification.
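For reference, the Squeeze-and-Excitation mechanism that the small SE-ResNet module builds on can be sketched in a few lines of PyTorch; the reduction ratio of 16 is the conventional default, not necessarily the paper's setting.

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze (global average pool) then excite (per-channel gates)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        gates = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * gates  # reweight channels by learned importance
```

In an SE-ResNet module, a block like this is applied to the residual branch before the skip connection is added.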

Journal ArticleDOI
TL;DR: This is the first in-depth review of the applications of deep learning algorithms for segmentation in WSI analysis and provides quick guidance for implementing deep learning into pathology image analysis.
Abstract: With the rapid development of image scanning techniques and visualization software, whole slide imaging (WSI) is becoming a routine diagnostic method. Accelerating clinical diagnosis from pathology images and automating image analysis efficiently and accurately remain significant challenges. Recently, deep learning algorithms have shown great promise in pathology image analysis, such as in tumor region identification, metastasis detection, and patient prognosis. Many machine learning algorithms, including convolutional neural networks, have been proposed to automatically segment pathology images. Among these algorithms, segmentation deep learning algorithms such as fully convolutional networks stand out for their accuracy, computational efficiency, and generalizability. Thus, deep learning-based pathology image segmentation has become an important tool in WSI analysis. In this review, the pathology image segmentation process using deep learning algorithms is described in detail. The goals are to provide quick guidance for implementing deep learning into pathology image analysis and to provide some potential ways of further improving segmentation performance. Although there have been previous reviews on using machine learning methods in digital pathology image analysis, this is the first in-depth review of the applications of deep learning algorithms for segmentation in WSI analysis.