
Showing papers in "IEEE Transactions on Medical Imaging in 2016"


Journal ArticleDOI
TL;DR: Two specific computer-aided detection problems, namely thoraco-abdominal lymph node (LN) detection and interstitial lung disease (ILD) classification, are studied; state-of-the-art performance is achieved on mediastinal LN detection, and the first five-fold cross-validation classification results for ILD are reported.
Abstract: Remarkable progress has been made in image recognition, primarily due to the availability of large-scale annotated datasets and deep convolutional neural networks (CNNs). CNNs enable learning data-driven, highly representative, hierarchical image features from sufficient training data. However, obtaining datasets as comprehensively annotated as ImageNet in the medical imaging domain remains a challenge. There are currently three major techniques that successfully apply CNNs to medical image classification: training the CNN from scratch, using off-the-shelf pre-trained CNN features, and conducting unsupervised CNN pre-training with supervised fine-tuning. Another effective method is transfer learning, i.e., fine-tuning CNN models pre-trained on natural image datasets for medical image tasks. In this paper, we exploit three important, but previously understudied, factors of employing deep convolutional neural networks for computer-aided detection problems. We first explore and evaluate different CNN architectures. The studied models contain 5 thousand to 160 million parameters and vary in number of layers. We then evaluate the influence of dataset scale and spatial image context on performance. Finally, we examine when and why transfer learning from pre-trained ImageNet (via fine-tuning) can be useful. We study two specific computer-aided detection (CADe) problems, namely thoraco-abdominal lymph node (LN) detection and interstitial lung disease (ILD) classification. We achieve state-of-the-art performance on mediastinal LN detection, and report the first five-fold cross-validation classification results on predicting axial CT slices with ILD categories. Our extensive empirical evaluation, CNN model analysis, and valuable insights can be extended to the design of high-performance CAD systems for other medical imaging tasks.
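To make the transfer-learning option concrete, below is a minimal sketch of fine-tuning an ImageNet-pretrained CNN for a two-class CADe candidate classifier; the AlexNet backbone, learning rates, and binary label setup are illustrative assumptions rather than the configuration reported above.

```python
# Minimal sketch of ImageNet-pretrained fine-tuning for a CADe classifier
# (illustrative; model choice and hyperparameters are assumptions).
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 2  # e.g., candidate is a true lymph node vs. a false positive

model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
# Replace the 1000-way ImageNet classifier with a task-specific head.
model.classifier[6] = nn.Linear(model.classifier[6].in_features, NUM_CLASSES)

# Fine-tune all layers, with a smaller learning rate for the pretrained features.
optimizer = torch.optim.SGD([
    {"params": model.features.parameters(), "lr": 1e-4},
    {"params": model.classifier.parameters(), "lr": 1e-3},
], momentum=0.9)
criterion = nn.CrossEntropyLoss()

def training_step(images, labels):
    """One fine-tuning step on a batch of candidate image patches."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```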

4,249 citations


Journal ArticleDOI
TL;DR: This paper considered four distinct medical imaging applications in three specialties involving classification, detection, and segmentation from three different imaging modalities, and investigated how the performance of deep CNNs trained from scratch compares with that of pre-trained CNNs fine-tuned in a layer-wise manner.
Abstract: Training a deep convolutional neural network (CNN) from scratch is difficult because it requires a large amount of labeled training data and a great deal of expertise to ensure proper convergence. A promising alternative is to fine-tune a CNN that has been pre-trained using, for instance, a large set of labeled natural images. However, the substantial differences between natural and medical images may advise against such knowledge transfer. In this paper, we seek to answer the following central question in the context of medical image analysis: Can the use of pre-trained deep CNNs with sufficient fine-tuning eliminate the need for training a deep CNN from scratch? To address this question, we considered four distinct medical imaging applications in three specialties (radiology, cardiology, and gastroenterology) involving classification, detection, and segmentation from three different imaging modalities, and investigated how the performance of deep CNNs trained from scratch compares with that of pre-trained CNNs fine-tuned in a layer-wise manner. Our experiments consistently demonstrated that 1) the use of a pre-trained CNN with adequate fine-tuning outperformed or, in the worst case, performed as well as a CNN trained from scratch; 2) fine-tuned CNNs were more robust to the size of training sets than CNNs trained from scratch; 3) neither shallow tuning nor deep tuning was the optimal choice for a particular application; and 4) our layer-wise fine-tuning scheme could offer a practical way to reach the best performance for the application at hand based on the amount of available data.
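A layer-wise fine-tuning scheme of this kind can be sketched as progressively unfreezing deeper portions of a pretrained network and keeping the tuning depth that validates best; the VGG16 backbone, layer grouping, and candidate depths below are assumptions for illustration, not the authors' exact protocol.

```python
# Sketch of layer-wise ("shallow" to "deep") fine-tuning of a pretrained CNN.
import torch.nn as nn
from torchvision import models

def unfreeze_last_k_convs(model: nn.Module, k: int) -> None:
    """Freeze all weights, then make the classifier and the last k conv layers trainable."""
    for p in model.parameters():
        p.requires_grad = False
    for p in model.classifier.parameters():
        p.requires_grad = True
    convs = [m for m in model.features if isinstance(m, nn.Conv2d)]
    for conv in (convs[-k:] if k > 0 else []):
        for p in conv.parameters():
            p.requires_grad = True

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
for k in (0, 2, 4, 8):   # candidate tuning depths (illustrative)
    unfreeze_last_k_convs(model, k)
    # ... fine-tune for a few epochs and record validation performance,
    # then keep the depth that performs best for the task at hand ...
```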

2,294 citations


Journal ArticleDOI
TL;DR: This paper proposes an automatic segmentation method based on Convolutional Neural Networks (CNN), exploring small 3×3 kernels, which allow a deeper architecture to be designed and also help against overfitting, given the smaller number of weights in the network.
Abstract: Among brain tumors, gliomas are the most common and aggressive, leading to a very short life expectancy in their highest grade. Thus, treatment planning is a key stage to improve the quality of life of oncological patients. Magnetic resonance imaging (MRI) is a widely used imaging technique to assess these tumors, but the large amount of data produced by MRI prevents manual segmentation in a reasonable time, limiting the use of precise quantitative measurements in clinical practice. So, automatic and reliable segmentation methods are required; however, the large spatial and structural variability among brain tumors makes automatic segmentation a challenging problem. In this paper, we propose an automatic segmentation method based on Convolutional Neural Networks (CNN), exploring small 3×3 kernels. The use of small kernels allows designing a deeper architecture, besides having a positive effect against overfitting, given the smaller number of weights in the network. We also investigated the use of intensity normalization as a pre-processing step, which, though not common in CNN-based segmentation methods, proved together with data augmentation to be very effective for brain tumor segmentation in MRI images. Our proposal was validated in the Brain Tumor Segmentation Challenge 2013 database (BRATS 2013), obtaining simultaneously the first position for the complete, core, and enhancing regions in the Dice Similarity Coefficient metric (0.88, 0.83, 0.77) for the Challenge data set. Also, it obtained the overall first position by the online evaluation platform. We also participated in the on-site BRATS 2015 Challenge using the same model, obtaining the second place, with Dice Similarity Coefficient metrics of 0.78, 0.65, and 0.75 for the complete, core, and enhancing regions, respectively.
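A patch-wise classifier built entirely from small 3×3 convolutions, in the spirit of the architecture described above, might look like the sketch below; the layer counts, channel widths, patch size, and five-class labeling are assumptions, not the published configuration.

```python
# Sketch of a patch-based tumor segmentation CNN built from small 3x3 kernels.
import torch.nn as nn

def conv3x3(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                         nn.LeakyReLU(0.01))

class SmallKernelPatchCNN(nn.Module):
    """Classifies the central voxel of an MRI patch into one of 5 tumor classes."""
    def __init__(self, in_channels=4, num_classes=5):   # 4 MRI modalities assumed
        super().__init__()
        self.features = nn.Sequential(
            conv3x3(in_channels, 64), conv3x3(64, 64), conv3x3(64, 64),
            nn.MaxPool2d(2),
            conv3x3(64, 128), conv3x3(128, 128), conv3x3(128, 128),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(128 * 8 * 8, 256), nn.LeakyReLU(0.01),
            nn.Dropout(0.5), nn.Linear(256, num_classes),
        )

    def forward(self, x):   # x: (batch, 4, 32, 32) intensity-normalized patches
        return self.classifier(self.features(x))
```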

1,894 citations


Journal ArticleDOI
TL;DR: The papers in this special section focus on the technology and applications supported by deep learning, which have proven to be powerful tools for a broad range of computer vision tasks.
Abstract: The papers in this special section focus on the technology and applications supported by deep learning. Deep learning is a growing trend in general data analysis and has been termed one of the 10 breakthrough technologies of 2013. Deep learning is an improvement of artificial neural networks, consisting of more layers that permit higher levels of abstraction and improved predictions from data. To date, it is emerging as the leading machine-learning tool in the general imaging and computer vision domains. In particular, convolutional neural networks (CNNs) have proven to be powerful tools for a broad range of computer vision tasks. Deep CNNs automatically learn mid-level and high-level abstractions obtained from raw data (e.g., images). Recent results indicate that the generic descriptors extracted from CNNs are extremely effective in object recognition and localization in natural images. Medical image analysis groups across the world are quickly entering the field and applying CNNs and other deep learning methodologies to a wide variety of applications.

1,428 citations


Journal ArticleDOI
TL;DR: A comparative analysis proved the effectiveness of the proposed CNN against previous methods in a challenging dataset, and demonstrated the potential of CNNs in analyzing lung patterns.
Abstract: Automated tissue characterization is one of the most crucial components of a computer aided diagnosis (CAD) system for interstitial lung diseases (ILDs). Although much research has been conducted in this field, the problem remains challenging. Deep learning techniques have recently achieved impressive results in a variety of computer vision problems, raising expectations that they might be applied in other domains, such as medical image analysis. In this paper, we propose and evaluate a convolutional neural network (CNN), designed for the classification of ILD patterns. The proposed network consists of 5 convolutional layers with 2×2 kernels and LeakyReLU activations, followed by average pooling with size equal to the size of the final feature maps and three dense layers. The last dense layer has 7 outputs, equivalent to the classes considered: healthy, ground glass opacity (GGO), micronodules, consolidation, reticulation, honeycombing, and a combination of GGO/reticulation. To train and evaluate the CNN, we used a dataset of 14696 image patches, derived from 120 CT scans from different scanners and hospitals. To the best of our knowledge, this is the first deep CNN designed for this specific problem. A comparative analysis proved the effectiveness of the proposed CNN against previous methods on a challenging dataset. The classification performance (~85.5%) demonstrated the potential of CNNs in analyzing lung patterns. Future work includes extending the CNN to three-dimensional data provided by CT volume scans and integrating the proposed method into a CAD system that aims to provide differential diagnosis for ILDs as a supportive tool for radiologists.
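Since the architecture is spelled out above (five 2×2 convolutions with LeakyReLU, average pooling over the final feature maps, and three dense layers ending in 7 outputs), a rough PyTorch rendering is given below; the channel widths, negative slope, and input patch size are assumptions.

```python
# Sketch of the described ILD patch classifier (channel widths assumed).
import torch.nn as nn

class ILDPatchCNN(nn.Module):
    def __init__(self, num_classes=7):
        super().__init__()
        chans = [1, 16, 36, 64, 100, 144]   # assumed channel progression
        convs = []
        for i in range(5):
            convs += [nn.Conv2d(chans[i], chans[i + 1], kernel_size=2),
                      nn.LeakyReLU(0.3)]
        self.features = nn.Sequential(*convs)
        self.pool = nn.AdaptiveAvgPool2d(1)   # average pooling over the whole map
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(chans[-1], 128), nn.LeakyReLU(0.3),
            nn.Linear(128, 64), nn.LeakyReLU(0.3),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):   # x: (batch, 1, 32, 32) CT image patches
        return self.classifier(self.pool(self.features(x)))
```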

1,053 citations


Journal ArticleDOI
TL;DR: A Spatially Constrained Convolutional Neural Network (SC-CNN) for nucleus detection is proposed, together with a novel Neighboring Ensemble Predictor (NEP) coupled with the CNN to more accurately predict the class label of detected cell nuclei.
Abstract: Detection and classification of cell nuclei in histopathology images of cancerous tissue stained with the standard hematoxylin and eosin stain is a challenging task due to cellular heterogeneity. Deep learning approaches have been shown to produce encouraging results on histopathology images in various studies. In this paper, we propose a Spatially Constrained Convolutional Neural Network (SC-CNN) to perform nucleus detection. SC-CNN regresses the likelihood of a pixel being the center of a nucleus, where high probability values are spatially constrained to lie in the vicinity of nucleus centers. For classification of nuclei, we propose a novel Neighboring Ensemble Predictor (NEP) coupled with CNN to more accurately predict the class label of detected cell nuclei. The proposed approaches for detection and classification do not require segmentation of nuclei. We have evaluated them on a large dataset of colorectal adenocarcinoma images, consisting of more than 20,000 annotated nuclei belonging to four different classes. Our results show that the joint detection and classification of the proposed SC-CNN and NEP produces the highest average F1 score as compared to other recently published approaches. Prospectively, the proposed methods could offer benefit to pathology practice in terms of quantitative analysis of tissue constituents in whole-slide images, and potentially lead to a better understanding of cancer.

1,043 citations


Journal ArticleDOI
TL;DR: It is shown that the proposed multi-view ConvNets are highly suited for false positive reduction in a CAD system.
Abstract: We propose a novel Computer-Aided Detection (CAD) system for pulmonary nodules using multi-view convolutional networks (ConvNets), for which discriminative features are automatically learnt from the training data. The network is fed with nodule candidates obtained by combining three candidate detectors specifically designed for solid, subsolid, and large nodules. For each candidate, a set of 2-D patches from differently oriented planes is extracted. The proposed architecture comprises multiple streams of 2-D ConvNets, whose outputs are combined using a dedicated fusion method to obtain the final classification. Data augmentation and dropout are applied to avoid overfitting. On 888 scans of the publicly available LIDC-IDRI dataset, our method reaches high detection sensitivities of 85.4% and 90.1% at 1 and 4 false positives per scan, respectively. An additional evaluation on independent datasets from the ANODE09 challenge and DLCST is performed. We show that the proposed multi-view ConvNets are highly suited for false positive reduction in a CAD system.
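The multi-stream design can be sketched as one 2-D ConvNet per extracted view with a late-fusion classifier on the concatenated features; the stream architecture, number of views, and patch size below are illustrative assumptions rather than the published network.

```python
# Sketch of a multi-view false-positive reduction network with late fusion.
import torch
import torch.nn as nn

class ViewStream(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 24, 5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(24, 32, 3), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 48, 3), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )

    def forward(self, x):
        return self.net(x)          # (batch, 48) feature vector per view

class MultiViewConvNet(nn.Module):
    def __init__(self, num_views=9):
        super().__init__()
        self.streams = nn.ModuleList(ViewStream() for _ in range(num_views))
        self.fusion = nn.Sequential(nn.Linear(48 * num_views, 64), nn.ReLU(),
                                    nn.Dropout(0.5), nn.Linear(64, 2))

    def forward(self, views):       # views: (batch, num_views, 1, 64, 64) patches
        feats = [s(views[:, i]) for i, s in enumerate(self.streams)]
        return self.fusion(torch.cat(feats, dim=1))  # nodule vs. non-nodule logits
```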

1,030 citations


Journal ArticleDOI
TL;DR: A supervised segmentation technique that uses a deep neural network trained on a large sample of examples preprocessed with global contrast normalization and zero-phase whitening, and augmented using geometric transformations and gamma corrections; it significantly outperforms previous algorithms on the area under the ROC curve.
Abstract: The condition of the vascular network of the human eye is an important diagnostic factor in ophthalmology. Its segmentation in fundus imaging is a nontrivial task due to the variable size of vessels, relatively low contrast, and potential presence of pathologies like microaneurysms and hemorrhages. Many algorithms, both unsupervised and supervised, have been proposed for this purpose in the past. We propose a supervised segmentation technique that uses a deep neural network trained on a large (up to 400,000) sample of examples preprocessed with global contrast normalization and zero-phase whitening, and augmented using geometric transformations and gamma corrections. Several variants of the method are considered, including structured prediction, where a network classifies multiple pixels simultaneously. When applied to standard benchmarks of fundus imaging, the DRIVE, STARE, and CHASE databases, the networks significantly outperform the previous algorithms on the area under the ROC curve (up to >0.99) and classification accuracy (up to >0.97). The method is also resistant to the phenomenon of central vessel reflex, sensitive in the detection of fine vessels (sensitivity >0.87), and fares well on pathological cases.
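The preprocessing and augmentation pipeline described above (global contrast normalization, zero-phase/ZCA whitening, gamma corrections) can be sketched roughly as follows; the patch size, whitening regularization, and gamma range are assumptions.

```python
# Sketch of patch preprocessing for vessel segmentation (parameters illustrative).
import numpy as np

def global_contrast_normalize(patches, eps=1e-8):
    """Zero-mean, unit-norm scaling of each flattened patch (rows of `patches`)."""
    patches = patches - patches.mean(axis=1, keepdims=True)
    norms = np.sqrt((patches ** 2).sum(axis=1, keepdims=True)) + eps
    return patches / norms

def zca_whiten(patches, eps=1e-2):
    """Zero-phase whitening computed from the training patches."""
    cov = np.cov(patches, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    zca = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return patches @ zca

def augment_gamma(patch_img, rng, low=0.7, high=1.5):
    """Random gamma correction of a patch scaled to [0, 1]."""
    return patch_img ** rng.uniform(low, high)

rng = np.random.default_rng(0)
train = rng.random((1000, 27 * 27))       # e.g., 27x27 fundus patches, flattened
train = zca_whiten(global_contrast_normalize(train))
```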

807 citations


Journal ArticleDOI
TL;DR: A Stacked Sparse Autoencoder, an instance of a deep learning strategy, is presented for efficient nuclei detection on high-resolution histopathological images of breast cancer and outperformed nine other state-of-the-art nuclear detection strategies.
Abstract: Automated nuclear detection is a critical step for a number of computer-assisted pathology image analysis algorithms, such as automated grading of breast cancer tissue specimens. The Nottingham Histologic Score system is highly correlated with the shape and appearance of breast cancer nuclei in histopathological images. However, automated nucleus detection is complicated by 1) the large number of nuclei and the size of high-resolution digitized pathology images, and 2) the variability in size, shape, appearance, and texture of the individual nuclei. Recently there has been interest in the application of “Deep Learning” strategies for classification and analysis of big image data. Histopathology, given its size and complexity, represents an excellent use case for application of deep learning strategies. In this paper, a Stacked Sparse Autoencoder (SSAE), an instance of a deep learning strategy, is presented for efficient nuclei detection on high-resolution histopathological images of breast cancer. The SSAE learns high-level features from pixel intensities alone in order to identify distinguishing features of nuclei. A sliding window operation is applied to each image in order to represent image patches via the high-level features obtained from the autoencoder, which are then fed to a classifier that categorizes each image patch as nuclear or non-nuclear. Across a cohort of 500 histopathological images (2200×2200 pixels) and approximately 3500 manually segmented individual nuclei serving as the ground truth, the SSAE achieved an F-measure of 84.49% and an average area under the precision-recall curve (AveP) of 78.83%. The SSAE approach also outperformed nine other state-of-the-art nuclear detection strategies.
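A single sparse autoencoder layer of the kind stacked in an SSAE can be sketched with a KL-divergence sparsity penalty on the mean hidden activations; the layer sizes, sparsity target rho, and penalty weight beta below are assumptions.

```python
# Sketch of one sparse autoencoder layer (sizes and penalty weights assumed).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, n_in=34 * 34, n_hidden=400, rho=0.05, beta=3.0):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_in, n_hidden), nn.Sigmoid())
        self.decoder = nn.Sequential(nn.Linear(n_hidden, n_in), nn.Sigmoid())
        self.rho, self.beta = rho, beta

    def loss(self, x):
        h = self.encoder(x)
        recon = self.decoder(h)
        rho_hat = h.mean(dim=0).clamp(1e-6, 1 - 1e-6)   # mean activation per unit
        kl = (self.rho * torch.log(self.rho / rho_hat)
              + (1 - self.rho) * torch.log((1 - self.rho) / (1 - rho_hat))).sum()
        return nn.functional.mse_loss(recon, x) + self.beta * kl
```

In a stacked configuration, a second autoencoder of the same form would be trained on the hidden activations of the first, and a small classifier on top would then label sliding-window patches as nuclear or non-nuclear.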

735 citations


Journal ArticleDOI
TL;DR: This paper proposes a novel automatic method to detect CMBs from magnetic resonance (MR) images by exploiting the 3D convolutional neural network (CNN), outperforming previous methods using low-level descriptors or 2D CNNs by a significant margin.
Abstract: Cerebral microbleeds (CMBs) are small haemorrhages near blood vessels. They have been recognized as important diagnostic biomarkers for many cerebrovascular diseases and cognitive dysfunctions. In current clinical routine, CMBs are manually labelled by radiologists, but this procedure is laborious, time-consuming, and error-prone. In this paper, we propose a novel automatic method to detect CMBs from magnetic resonance (MR) images by exploiting a 3D convolutional neural network (CNN). Compared with previous methods that employed either low-level hand-crafted descriptors or 2D CNNs, our method can take full advantage of spatial contextual information in MR volumes to extract more representative high-level features for CMBs, and hence achieve a much better detection accuracy. To further improve the detection performance while reducing the computational cost, we propose a cascaded framework under 3D CNNs for the task of CMB detection. We first exploit a 3D fully convolutional network (FCN) strategy to retrieve the candidates with high probabilities of being CMBs, and then apply a well-trained 3D CNN discrimination model to distinguish CMBs from hard mimics. Compared with the traditional sliding window strategy, the proposed 3D FCN strategy can remove massive redundant computations and dramatically speed up the detection process. We constructed a large dataset with 320 volumetric MR scans and performed extensive experiments to validate the proposed method, which achieved a high sensitivity of 93.16% with an average of 2.74 false positives per subject, outperforming previous methods using low-level descriptors or 2D CNNs by a significant margin. The proposed method, in principle, can be adapted to other biomarker detection tasks from volumetric medical data.

567 citations


Journal ArticleDOI
TL;DR: This paper presents a method for the automatic segmentation of MR brain images into a number of tissue classes using a convolutional neural network, and demonstrates its robustness to differences in age and acquisition protocol.
Abstract: Automatic segmentation in MR brain images is important for quantitative analysis in large-scale studies with images acquired at all ages. This paper presents a method for the automatic segmentation of MR brain images into a number of tissue classes using a convolutional neural network. To ensure that the method obtains accurate segmentation details as well as spatial consistency, the network uses multiple patch sizes and multiple convolution kernel sizes to acquire multi-scale information about each voxel. The method is not dependent on explicit features, but learns to recognise the information that is important for the classification based on training data. The method requires a single anatomical MR image only. The segmentation method is applied to five different data sets: coronal T2-weighted images of preterm infants acquired at 30 weeks postmenstrual age (PMA) and 40 weeks PMA, axial T2-weighted images of preterm infants acquired at 40 weeks PMA, axial T1-weighted images of ageing adults acquired at an average age of 70 years, and T1-weighted images of young adults acquired at an average age of 23 years. The method obtained the following average Dice coefficients over all segmented tissue classes for each data set, respectively: 0.87, 0.82, 0.84, 0.86, and 0.91. The results demonstrate that the method obtains accurate segmentations in all five sets, and hence demonstrates its robustness to differences in age and acquisition protocol.

Journal ArticleDOI
Qiaoliang Li, Bowei Feng, Linpei Xie, Ping Liang, Huisheng Zhang, Tianfu Wang
TL;DR: A wide and deep neural network with strong induction ability is proposed to model the transformation, and an efficient training strategy is presented, where instead of a single label of the center pixel, the network can output the label map of all pixels for a given image patch.
Abstract: This paper presents a new supervised method for vessel segmentation in retinal images. This method remolds the task of segmentation as a problem of cross-modality data transformation from retinal image to vessel map. A wide and deep neural network with strong induction ability is proposed to model the transformation, and an efficient training strategy is presented. Instead of a single label of the center pixel, the network can output the label map of all pixels for a given image patch. Our approach outperforms reported state-of-the-art methods in terms of sensitivity, specificity and accuracy. The result of cross-training evaluation indicates its robustness to the training set. The approach needs no artificially designed feature and no preprocessing step, reducing the impact of subjective factors. The proposed method has the potential for application in image diagnosis of ophthalmologic diseases, and it may provide a new, general, high-performance computing framework for image segmentation.

Journal ArticleDOI
TL;DR: An experimental study on learning from crowds that handles data aggregation directly as part of the learning process of the convolutional neural network (CNN) via an additional crowdsourcing layer (AggNet), which gives valuable insights into the functionality of deep CNN learning from crowd annotations and demonstrates the necessity of integrating data aggregation.
Abstract: The lack of publicly available ground-truth data has been identified as the major challenge for transferring recent developments in deep learning to the biomedical imaging domain. Though crowdsourcing has enabled annotation of large-scale databases for real-world images, its application for biomedical purposes requires a deeper understanding and hence a more precise definition of the actual annotation task. The fact that expert tasks are being outsourced to non-expert users may lead to noisy annotations introducing disagreement between users. Despite crowdsourcing being a valuable resource for learning annotation models, conventional machine-learning methods may have difficulties dealing with noisy annotations during training. In this manuscript, we present a new concept for learning from crowds that handles data aggregation directly as part of the learning process of the convolutional neural network (CNN) via an additional crowdsourcing layer (AggNet). In addition, we present an experimental study on learning from crowds designed to answer the following questions. 1) Can a deep CNN be trained with data collected from crowdsourcing? 2) How can the CNN be adapted to train on multiple types of annotation datasets (ground truth and crowd-based)? 3) How does the choice of annotation and aggregation affect the accuracy? Our experimental setup involved Annot8, a self-implemented web platform based on the Crowdflower API realizing image annotation tasks for a publicly available biomedical image database. Our results give valuable insights into the functionality of deep CNN learning from crowd annotations and demonstrate the necessity of integrating data aggregation.

Journal ArticleDOI
TL;DR: In this paper, a coarse-to-fine cascade framework is proposed to generate 2D or 2.5D views via sampling through scale transformations, random translations and rotations, which are used to train deep convolutional neural network (ConvNet) classifiers.
Abstract: Automated computer-aided detection (CADe) has been an important tool in clinical practice and research. State-of-the-art methods often show high sensitivities at the cost of high false-positive (FP) rates per patient. We design a two-tiered coarse-to-fine cascade framework that first operates a candidate generation system at sensitivities of ~100% but at high FP levels. By leveraging existing CADe systems, coordinates of regions or volumes of interest (ROI or VOI) are generated and serve as input for a second tier, which is our focus in this study. In this second stage, we generate 2D (two-dimensional) or 2.5D views via sampling through scale transformations, random translations and rotations. These random views are used to train deep convolutional neural network (ConvNet) classifiers. In testing, the ConvNets assign class (e.g., lesion, pathology) probabilities for a new set of random views that are then averaged to compute a final per-candidate classification probability. This second tier behaves as a highly selective process to reject difficult false positives while preserving high sensitivities. The methods are evaluated on three data sets: 59 patients for sclerotic metastasis detection, 176 patients for lymph node detection, and 1,186 patients for colonic polyp detection. Experimental results show the ability of ConvNets to generalize well to different medical imaging CADe applications and scale elegantly to various data sets. Our proposed methods improve performance markedly in all cases. Sensitivities improved from 57% to 70%, 43% to 77%, and 58% to 75% at 3 FPs per patient for sclerotic metastases, lymph nodes and colonic polyps, respectively.
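The second-tier random-view idea can be sketched as sampling several randomly scaled, translated, and rotated 2-D views of each candidate and averaging the ConvNet class probabilities at test time; the transformation ranges and number of views below are illustrative assumptions.

```python
# Sketch of random-view sampling and per-candidate probability averaging.
import numpy as np
import torch
from scipy.ndimage import affine_transform

def random_view(roi, rng):
    """One randomly scaled, translated and rotated 2-D view of a candidate ROI."""
    angle, scale = rng.uniform(0, 2 * np.pi), rng.uniform(0.9, 1.1)
    rot = np.array([[np.cos(angle), -np.sin(angle)],
                    [np.sin(angle),  np.cos(angle)]]) / scale
    center = np.array(roi.shape) / 2.0
    offset = center - rot @ center + rng.uniform(-3, 3, size=2)
    return affine_transform(roi, rot, offset=offset, order=1)

@torch.no_grad()
def candidate_probability(convnet, roi, num_views=50, seed=0):
    """Average ConvNet class probabilities over random views of one candidate."""
    rng = np.random.default_rng(seed)
    views = np.stack([random_view(roi, rng) for _ in range(num_views)])
    batch = torch.from_numpy(views).float().unsqueeze(1)     # (views, 1, H, W)
    return torch.softmax(convnet(batch), dim=1).mean(dim=0)
```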

Journal ArticleDOI
TL;DR: This paper presents the culmination of the research in designing a system for computer-aided detection of polyps in colonoscopy videos based on a hybrid context-shape approach, which utilizes context information to remove non-polyp structures and shape information to reliably localize polyps.
Abstract: This paper presents the culmination of our research in designing a system for computer-aided detection (CAD) of polyps in colonoscopy videos. Our system is based on a hybrid context-shape approach, which utilizes context information to remove non-polyp structures and shape information to reliably localize polyps. Specifically, given a colonoscopy image, we first obtain a crude edge map. Second, we remove non-polyp edges from the edge map using our unique feature extraction and edge classification scheme. Third, we localize polyp candidates with probabilistic confidence scores in the refined edge maps using our novel voting scheme. The suggested CAD system has been tested using two public polyp databases, CVC-ColonDB, containing 300 colonoscopy images with a total of 300 polyp instances from 15 unique polyps, and ASU-Mayo database, which is our collection of colonoscopy videos containing 19,400 frames and a total of 5,200 polyp instances from 10 unique polyps. We have evaluated our system using free-response receiver operating characteristic (FROC) analysis. At 0.1 false positives per frame, our system achieves a sensitivity of 88.0% for CVC-ColonDB and a sensitivity of 48% for the ASU-Mayo database. In addition, we have evaluated our system using a new detection latency analysis where latency is defined as the time from the first appearance of a polyp in the colonoscopy video to the time of its first detection by our system. At 0.05 false positives per frame, our system yields a polyp detection latency of 0.3 seconds.

Journal ArticleDOI
TL;DR: Stain density correlation with ground truth and preference by pathologists were higher for images normalized using the method when compared to other alternatives, and a computationally faster extension of this technique is proposed for large whole-slide images that selects an appropriate patch sample instead of using the entire image to compute the stain color basis.
Abstract: Staining and scanning of tissue samples for microscopic examination is fraught with undesirable color variations arising from differences in raw materials and manufacturing techniques of stain vendors, staining protocols of labs, and color responses of digital scanners. When comparing tissue samples, color normalization and stain separation of the tissue images can be helpful for both pathologists and software. Techniques that are used for natural images fail to utilize structural properties of stained tissue samples and produce undesirable color distortions. The stain concentration cannot be negative. Tissue samples are stained with only a few stains and most tissue regions are characterized by at most one effective stain. We model these physical phenomena that define the tissue structure by first decomposing images in an unsupervised manner into stain density maps that are sparse and non-negative. For a given image, we combine its stain density maps with stain color basis of a pathologist-preferred target image, thus altering only its color while preserving its structure described by the maps. Stain density correlation with ground truth and preference by pathologists were higher for images normalized using our method when compared to other alternatives. We also propose a computationally faster extension of this technique for large whole-slide images that selects an appropriate patch sample instead of using the entire image to compute the stain color basis.
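The core idea, factoring optical density into a stain color basis and non-negative stain density maps and then recombining the source densities with a target image's basis, can be roughly sketched with plain NMF standing in for the paper's sparse non-negative factorization; stain-channel matching between the two factorizations is omitted, and all function names here are illustrative.

```python
# Sketch of stain-density-based color normalization (NMF as a stand-in).
import numpy as np
from sklearn.decomposition import NMF

def optical_density(rgb, io=255.0):
    """Convert RGB pixels to optical density (non-negative)."""
    return -np.log(np.clip(rgb.reshape(-1, 3).astype(float), 1, io) / io)

def stain_factorize(rgb, n_stains=2):
    od = optical_density(rgb)
    model = NMF(n_components=n_stains, init="random", random_state=0, max_iter=500)
    density = model.fit_transform(od)     # (pixels, stains) stain density maps
    color_basis = model.components_       # (stains, 3) stain color basis
    return density, color_basis

def normalize_to_target(source_rgb, target_rgb, io=255.0):
    """Keep the source structure (densities) but adopt the target's stain colors.
    Note: matching stain channels between the two factorizations is omitted here."""
    density, _ = stain_factorize(source_rgb)
    _, target_basis = stain_factorize(target_rgb)
    od_norm = density @ target_basis
    rgb = io * np.exp(-od_norm)
    return np.clip(rgb, 0, 255).reshape(source_rgb.shape).astype(np.uint8)
```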

Journal ArticleDOI
TL;DR: A novel segmentation approach based on deep 3D convolutional encoder networks with shortcut connections; results show that the method performs comparably to the top-ranked state-of-the-art methods, even when only relatively small data sets are available for training.
Abstract: We propose a novel segmentation approach based on deep 3D convolutional encoder networks with shortcut connections and apply it to the segmentation of multiple sclerosis (MS) lesions in magnetic resonance images. Our model is a neural network that consists of two interconnected pathways, a convolutional pathway, which learns increasingly more abstract and higher-level image features, and a deconvolutional pathway, which predicts the final segmentation at the voxel level. The joint training of the feature extraction and prediction pathways allows for the automatic learning of features at different scales that are optimized for accuracy for any given combination of image types and segmentation task. In addition, shortcut connections between the two pathways allow high- and low-level features to be integrated, which enables the segmentation of lesions across a wide range of sizes. We have evaluated our method on two publicly available data sets (MICCAI 2008 and ISBI 2015 challenges) with the results showing that our method performs comparably to the top-ranked state-of-the-art methods, even when only relatively small data sets are available for training. In addition, we have compared our method with five freely available and widely used MS lesion segmentation methods (EMS, LST-LPA, LST-LGA, Lesion-TOADS, and SLS) on a large data set from an MS clinical trial. The results show that our method consistently outperforms these other methods across a wide range of lesion sizes.

Journal ArticleDOI
TL;DR: The proposed CNN regression approach has been quantitatively evaluated on 3 potential clinical applications, demonstrating its significant advantage in providing highly accurate real-time 2-D/3-D registration with a significantly enlarged capture range when compared to intensity-based methods.
Abstract: In this paper, we present a Convolutional Neural Network (CNN) regression approach to address the two major limitations of existing intensity-based 2-D/3-D registration technology: 1) slow computation and 2) small capture range. Different from optimization-based methods, which iteratively optimize the transformation parameters over a scalar-valued metric function representing the quality of the registration, the proposed method exploits the information embedded in the appearances of the digitally reconstructed radiograph and X-ray images, and employs CNN regressors to directly estimate the transformation parameters. An automatic feature extraction step is introduced to calculate 3-D pose-indexed features that are sensitive to the variables to be regressed while robust to other factors. The CNN regressors are then trained for local zones and applied in a hierarchical manner to break down the complex regression task into multiple simpler sub-tasks that can be learned separately. Weight sharing is furthermore employed in the CNN regression model to reduce the memory footprint. The proposed approach has been quantitatively evaluated on 3 potential clinical applications, demonstrating its significant advantage in providing highly accurate real-time 2-D/3-D registration with a significantly enlarged capture range when compared to intensity-based methods.

Journal ArticleDOI
TL;DR: The state-of-the-art results show that the learned breast density scores have a very strong positive relationship with manual ones, and that the learned texture scores are predictive of breast cancer.
Abstract: Mammographic risk scoring has commonly been automated by extracting a set of handcrafted features from mammograms, and relating the responses directly or indirectly to breast cancer risk. We present a method that learns a feature hierarchy from unlabeled data. When the learned features are used as the input to a simple classifier, two different tasks can be addressed: i) breast density segmentation, and ii) scoring of mammographic texture. The proposed model learns features at multiple scales. To control the model's capacity, a novel sparsity regularizer is introduced that incorporates both lifetime and population sparsity. We evaluated our method on three different clinical datasets. Our state-of-the-art results show that the learned breast density scores have a very strong positive relationship with manual ones, and that the learned texture scores are predictive of breast cancer. The model is easy to apply and generalizes to many other segmentation and scoring problems.
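A sparsity penalty that touches both notions, population sparsity (few active units per example) and lifetime sparsity (each unit active for few examples), can be sketched as below; the quadratic form and target rate are assumptions, not the paper's exact regularizer.

```python
# Sketch of a combined population/lifetime sparsity penalty (illustrative form).
import torch

def sparsity_penalty(activations, target=0.05, lam_pop=1.0, lam_life=1.0):
    """activations: (batch, units) non-negative hidden activations."""
    pop_rate = activations.mean(dim=1)    # mean activation per example
    life_rate = activations.mean(dim=0)   # mean activation per unit over the batch
    pop_term = ((pop_rate - target) ** 2).mean()
    life_term = ((life_rate - target) ** 2).mean()
    return lam_pop * pop_term + lam_life * life_term
```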

Journal ArticleDOI
TL;DR: This paper proposes a method to improve and speed-up the CNN training for medical image analysis tasks by dynamically selecting misclassified negative samples during training using the selective sampling method.
Abstract: Convolutional neural networks (CNNs) are deep learning network architectures that have pushed forward the state-of-the-art in a range of computer vision applications and are increasingly popular in medical image analysis. However, training of CNNs is time-consuming and challenging. In medical image analysis tasks, the majority of training examples are easy to classify and therefore contribute little to the CNN learning process. In this paper, we propose a method to improve and speed up CNN training for medical image analysis tasks by dynamically selecting misclassified negative samples during training. Training samples are heuristically sampled based on classification by the current status of the CNN. Weights are assigned to the training samples, and informative samples are more likely to be included in the next CNN training iteration. We evaluated and compared our proposed method by training a CNN with (SeS) and without (NSeS) the selective sampling method. We focus on the detection of hemorrhages in color fundus images. Training time decreased from 170 epochs to 60 epochs while performance increased to a level on par with two human experts, with areas under the receiver operating characteristic curve of 0.894 and 0.972 on two data sets. The SeS CNN statistically outperformed the NSeS CNN on an independent test set.
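Selective sampling of this kind can be sketched by re-weighting negatives according to how strongly the current CNN misclassifies them and drawing the next round's samples from those weights; the weighting rule and data interface below are assumptions.

```python
# Sketch of selective sampling driven by the current CNN's predictions.
import numpy as np
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

@torch.no_grad()
def update_sample_weights(model, dataset, labels, batch_size=256):
    """Weight each negative by the predicted probability of the positive class;
    positives keep weight 1. `dataset` is assumed to yield (patch, label) pairs."""
    model.eval()
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
    probs = torch.cat([torch.softmax(model(x), dim=1)[:, 1] for x, _ in loader])
    weights = np.where(labels == 0, probs.cpu().numpy(), 1.0)
    return np.clip(weights, 1e-3, None)

# Usage per training round (illustrative):
# weights = update_sample_weights(cnn, train_set, train_labels)
# sampler = WeightedRandomSampler(weights.tolist(), num_samples=len(weights))
# train_loader = DataLoader(train_set, batch_size=64, sampler=sampler)
```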

Journal ArticleDOI
TL;DR: The proposed algorithm of applying the LAD filter on orientation scores (LAD-OS) outperforms most of the state-of-the-art methods and is capable of dealing with typically difficult cases like crossings, central arterial reflex, closely parallel and tiny vessels.
Abstract: This paper presents a robust and fully automatic filter-based approach for retinal vessel segmentation. We propose new filters based on 3D rotating frames in so-called orientation scores, which are functions on the Lie-group domain of positions and orientations $\mathbb {R}^{2} \rtimes S^{1}$ . By means of a wavelet-type transform, a 2D image is lifted to a 3D orientation score, where elongated structures are disentangled into their corresponding orientation planes. In the lifted domain $\mathbb {R}^{2} \rtimes S^{1}$ , vessels are enhanced by means of multi-scale second-order Gaussian derivatives perpendicular to the line structures. More precisely, we use a left-invariant rotating derivative (LID) frame, and a locally adaptive derivative (LAD) frame. The LAD is adaptive to the local line structures and is found by eigensystem analysis of the left-invariant Hessian matrix (computed with the LID). After multi-scale filtering via the LID or LAD in the orientation score domain, the results are projected back to the 2D image plane giving us the enhanced vessels. Then a binary segmentation is obtained through thresholding. The proposed methods are validated on six retinal image datasets with different image types, on which competitive segmentation performances are achieved. In particular, the proposed algorithm of applying the LAD filter on orientation scores (LAD-OS) outperforms most of the state-of-the-art methods. The LAD-OS is capable of dealing with typically difficult cases like crossings, central arterial reflex, closely parallel and tiny vessels. The high computational speed of the proposed methods allows processing of large datasets in a screening setting.

Journal ArticleDOI
TL;DR: A learning-based framework for robust and automatic nucleus segmentation with shape preservation is proposed that is applicable to different staining histopathology images and general enough to perform well across multiple scenarios.
Abstract: Computer-aided image analysis of histopathology specimens could potentially provide support for early detection and improved characterization of diseases such as brain tumor, pancreatic neuroendocrine tumor (NET), and breast cancer. Automated nucleus segmentation is a prerequisite for various quantitative analyses including automatic morphological feature computation. However, it remains a challenging problem due to the complex nature of histopathology images. In this paper, we propose a learning-based framework for robust and automatic nucleus segmentation with shape preservation. Given a nucleus image, it begins with a deep convolutional neural network (CNN) model to generate a probability map, on which an iterative region merging approach is performed for shape initializations. Next, a novel segmentation algorithm is exploited to separate individual nuclei, combining a robust selection-based sparse shape model and a local repulsive deformable model. One of the significant benefits of the proposed framework is that it is applicable to histopathology images with different stainings. Due to the feature learning characteristic of the deep CNN and the high-level shape prior modeling, the proposed method is general enough to perform well across multiple scenarios. We have tested the proposed algorithm on three large-scale pathology image datasets using a range of different tissue and stain preparations, and the comparative experiments with recent state-of-the-art methods demonstrate the superior performance of the proposed approach.

Journal ArticleDOI
TL;DR: It is demonstrated how deep learning, a group of algorithms based on recent advances in the field of artificial neural networks, can be applied to reduce diffusion MRI data processing to a single optimized step and how classical data processing can be streamlined by means of deep learning.
Abstract: Numerous scientific fields rely on elaborate but partly suboptimal data processing pipelines. An example is diffusion magnetic resonance imaging (diffusion MRI), a non-invasive microstructure assessment method with a prominent application in neuroimaging. Advanced diffusion models providing accurate microstructural characterization so far have required long acquisition times and thus have been inapplicable for children and adults who are uncooperative, uncomfortable, or unwell. We show that the long scan time requirements are mainly due to disadvantages of classical data processing. We demonstrate how deep learning, a group of algorithms based on recent advances in the field of artificial neural networks, can be applied to reduce diffusion MRI data processing to a single optimized step. This modification allows obtaining scalar measures from advanced models at twelve-fold reduced scan time and detecting abnormalities without using diffusion models. We set a new state of the art by estimating diffusion kurtosis measures from only 12 data points and neurite orientation dispersion and density measures from only 8 data points. This allows unprecedentedly fast and robust protocols facilitating clinical routine and demonstrates how classical data processing can be streamlined by means of deep learning.
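The "single optimized step" amounts to a small network that regresses scalar microstructure measures directly from a few diffusion-weighted measurements per voxel; the sketch below uses the 12-measurement input quoted above, while the hidden-layer sizes and number of outputs are assumptions.

```python
# Sketch of per-voxel regression from q-space measurements to scalar measures.
import torch.nn as nn

class QSpaceRegressor(nn.Module):
    def __init__(self, n_measurements=12, n_outputs=3):   # e.g., kurtosis measures
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_measurements, 150), nn.ReLU(),
            nn.Linear(150, 150), nn.ReLU(),
            nn.Linear(150, n_outputs),
        )

    def forward(self, signals):    # signals: (voxels, 12) normalized DWI values
        return self.net(signals)
```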

Journal ArticleDOI
TL;DR: The results of the empirical evaluations collectively demonstrate the potential contribution of the proposed standardization algorithm to improved diagnostic accuracy and consistency in computer-aided diagnosis for histopathology data.
Abstract: Variations in the color and intensity of hematoxylin and eosin (H&E) stained histological slides can potentially hamper the effectiveness of quantitative image analysis. This paper presents a fully automated algorithm for standardization of whole-slide histopathological images to reduce the effect of these variations. The proposed algorithm, called whole-slide image color standardizer (WSICS), utilizes color and spatial information to classify the image pixels into different stain components. The chromatic and density distributions for each of the stain components in the hue-saturation-density color model are aligned to match the corresponding distributions from a template whole-slide image (WSI). The performance of the WSICS algorithm was evaluated on two datasets. The first originated from 125 H&E stained WSIs of lymph nodes, sampled from 3 patients, and stained in 5 different laboratories on different days of the week. The second comprised 30 H&E stained WSIs of rat liver sections. The result of qualitative and quantitative evaluations using the first dataset demonstrate that the WSICS algorithm outperforms competing methods in terms of achieving color constancy. The WSICS algorithm consistently yields the smallest standard deviation and coefficient of variation of the normalized median intensity measure. Using the second dataset, we evaluated the impact of our algorithm on the performance of an already published necrosis quantification system. The performance of this system was significantly improved by utilizing the WSICS algorithm. The results of the empirical evaluations collectively demonstrate the potential contribution of the proposed standardization algorithm to improved diagnostic accuracy and consistency in computer-aided diagnosis for histopathology data.

Journal ArticleDOI
TL;DR: A novel method for automatic detection of both microaneurysms and hemorrhages in color fundus images is described and validated, and it proves to be robust with respect to variability in image resolution, quality, and acquisition system.
Abstract: The development of an automatic telemedicine system for computer-aided screening and grading of diabetic retinopathy depends on reliable detection of retinal lesions in fundus images. In this paper, a novel method for automatic detection of both microaneurysms and hemorrhages in color fundus images is described and validated. The main contribution is a new set of shape features, called Dynamic Shape Features, that do not require precise segmentation of the regions to be classified. These features represent the evolution of the shape during image flooding and allow discrimination between lesions and vessel segments. The method is validated per-lesion and per-image using six databases, four of which are publicly available. It proves to be robust with respect to variability in image resolution, quality and acquisition system. On the Retinopathy Online Challenge's database, the method achieves a FROC score of 0.420, which ranks it fourth. On the Messidor database, when detecting images with diabetic retinopathy, the proposed method achieves an area under the ROC curve of 0.899, comparable to the score of human experts, and it outperforms state-of-the-art approaches.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed learning-based method can accurately predict CT images in various scenarios, even for images undergoing large shape variation, and also outperforms two state-of-the-art methods.
Abstract: Computed tomography (CT) imaging is an essential tool in various clinical diagnoses and radiotherapy treatment planning. Since CT image intensities are directly related to positron emission tomography (PET) attenuation coefficients, they are indispensable for attenuation correction (AC) of the PET images. However, due to the relatively high dose of radiation exposure in CT scans, it is advised to limit the acquisition of CT images. In addition, in new combined PET and magnetic resonance (MR) imaging scanners, only MR images are available, which are unfortunately not directly applicable to AC. These issues greatly motivate the development of methods for reliable estimation of the CT image from the corresponding MR image of the same subject. In this paper, we propose a learning-based method to tackle this challenging problem. Specifically, we first partition a given MR image into a set of patches. Then, for each patch, we use a structured random forest to directly predict a CT patch as a structured output, where a new ensemble model is also used to ensure robust prediction. Image features are innovatively crafted to achieve multi-level sensitivity, with spatial information integrated through only rigid-body alignment to help avoid the error-prone inter-subject deformable registration. Moreover, we use an auto-context model to iteratively refine the prediction. Finally, we combine all of the predicted CT patches to obtain the final prediction for the given MR image. We demonstrate the efficacy of our method on two datasets: human brain and prostate images. Experimental results show that our method can accurately predict CT images in various scenarios, even for images undergoing large shape variation, and also outperforms two state-of-the-art methods.

Journal ArticleDOI
TL;DR: A multi-stage deep learning framework for image classification, applied to bodypart recognition, achieves better performance than state-of-the-art approaches, including the standard deep CNN.
Abstract: In general image recognition problems, discriminative information often lies in local image patches. For example, most human identity information exists in the image patches containing human faces. The same holds for medical images. The “bodypart identity” of a transversal slice (which bodypart the slice comes from) is often indicated by local image information; e.g., a cardiac slice and an aortic arch slice are only differentiated by the mediastinum region. In this work, we design a multi-stage deep learning framework for image classification and apply it to bodypart recognition. Specifically, the proposed framework aims to: 1) discover the local regions that are discriminative or non-informative for the image classification problem, and 2) learn an image-level classifier based on these local regions. We achieve these two tasks with a two-stage learning scheme. In the pre-training stage, a convolutional neural network (CNN) is learned in a multi-instance learning fashion to extract the most discriminative and non-informative local patches from the training slices. In the boosting stage, the pre-learned CNN is further boosted by these local patches for image classification. The CNN learned by exploiting the discriminative local appearances becomes more accurate than those learned from global image context. The key hallmark of our method is that it automatically discovers the discriminative and non-informative local patches through multi-instance deep learning; thus, no manual annotation is required. Our method is validated on a synthetic dataset and a large-scale CT dataset. It achieves better performance than state-of-the-art approaches, including the standard deep CNN.

Journal ArticleDOI
TL;DR: A new deformable MR prostate segmentation method is proposed by unifying deep feature learning with sparse patch matching, showing superior performance over other state-of-the-art segmentation methods.
Abstract: Automatic and reliable segmentation of the prostate is an important but difficult task for various clinical applications such as prostate cancer radiotherapy. The main challenges for accurate MR prostate localization lie in two aspects: (1) inhomogeneous and inconsistent appearance around the prostate boundary, and (2) the large shape variation across different patients. To tackle these two problems, we propose a new deformable MR prostate segmentation method by unifying deep feature learning with sparse patch matching. First, instead of directly using handcrafted features, we propose to learn the latent feature representation from prostate MR images by a stacked sparse auto-encoder (SSAE). Since the deep learning algorithm learns the feature hierarchy from the data, the learned features are often more concise and effective than handcrafted features in describing the underlying data. To improve the discriminability of the learned features, we further refine the feature representation in a supervised fashion. Second, based on the learned features, a sparse patch matching method is proposed to infer a prostate likelihood map by transferring the prostate labels from multiple atlases to the new prostate MR image. Finally, a deformable segmentation is used to integrate a sparse shape model with the prostate likelihood map to achieve the final segmentation. The proposed method has been extensively evaluated on a dataset containing 66 T2-weighted prostate MR images. Experimental results show that the deep-learned features are more effective than handcrafted features in guiding MR prostate segmentation. Moreover, our method shows superior performance compared to other state-of-the-art segmentation methods.

Journal ArticleDOI
TL;DR: A novel enhancement function is proposed that overcomes the deficiencies of established functions; the novel and established enhancement functions are quantitatively evaluated and compared on clinical image datasets of the lung, cerebral, and fundus vasculatures.
Abstract: A number of imaging techniques are being used for the diagnosis and treatment of vascular pathologies like stenoses, aneurysms, embolisms, malformations and remodelings, which may affect a wide range of anatomical sites. For computer-aided detection and highlighting of potential sites of pathology, or to improve visualization and segmentation, angiographic images are often enhanced by Hessian-based filters. These filters aim to indicate elongated and/or rounded structures by an enhancement function based on Hessian eigenvalues. However, established enhancement functions generally produce a response that exhibits deficiencies such as poor and non-uniform response for vessels of different sizes and varying contrast, at bifurcations and aneurysms. This may compromise subsequent analysis of the enhanced images. This paper makes three important contributions: i) it reviews several established enhancement functions and elaborates on their deficiencies, ii) it proposes a novel enhancement function that overcomes the deficiencies of the established functions, and iii) it quantitatively evaluates and compares the novel and the established enhancement functions on clinical image datasets of the lung, cerebral and fundus vasculatures.
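For reference, the established Hessian-eigenvalue enhancement family that the paper reviews can be tried directly via the Frangi vesselness filter in scikit-image; the paper's proposed enhancement function itself is not reproduced here, and the scales below are illustrative.

```python
# Sketch of an established Hessian-eigenvalue enhancement (Frangi vesselness).
import numpy as np
from skimage.filters import frangi

def enhance_vessels(image, scales=(1, 2, 3, 4, 5)):
    """Multi-scale vesselness response for a 2-D angiographic image in [0, 1]."""
    return frangi(image, sigmas=scales, black_ridges=False)

# Example: enhance a synthetic bright line on a dark background.
img = np.zeros((128, 128))
img[60:64, 20:108] = 1.0
response = enhance_vessels(img)
```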

Journal ArticleDOI
TL;DR: A cloud-based evaluation framework is presented in this paper including results of benchmarking current state-of-the-art medical imaging algorithms for anatomical structure segmentation and landmark detection: the VISCERAL Anatomy benchmarks.
Abstract: Variations in the shape and appearance of anatomical structures in medical images are often relevant radiological signs of disease. Automatic tools can help automate parts of this manual process. A cloud-based evaluation framework is presented in this paper including results of benchmarking current state-of-the-art medical imaging algorithms for anatomical structure segmentation and landmark detection: the VISCERAL Anatomy benchmarks. The algorithms are implemented in virtual machines in the cloud where participants can only access the training data and can be run privately by the benchmark administrators to objectively compare their performance in an unseen common test set. Overall, 120 computed tomography and magnetic resonance patient volumes were manually annotated to create a standard Gold Corpus containing a total of 1295 structures and 1760 landmarks. Ten participants contributed with automatic algorithms for the organ segmentation task, and three for the landmark localization task. Different algorithms obtained the best scores in the four available imaging modalities and for subsets of anatomical structures. The annotation framework, resulting data set, evaluation setup, results and performance analysis from the three VISCERAL Anatomy benchmarks are presented in this article. Both the VISCERAL data set and Silver Corpus generated with the fusion of the participant algorithms on a larger set of non-manually-annotated medical images are available to the research community.