Proceedings ArticleDOI

Deeply-supervised CNN for prostate segmentation

14 May 2017-pp 178-184
TL;DR: With additional deeply supervised layers, the proposed model can effectively detect the prostate region, and significant segmentation accuracy improvement has been achieved compared to other reported approaches.
Abstract: Prostate segmentation from Magnetic Resonance (MR) images plays an important role in image-guided intervention. However, the lack of a clear boundary, especially at the apex and base, and the large variation in shape and texture between images from different patients make the task very challenging. To overcome these problems, in this paper we propose a deeply supervised convolutional neural network (CNN) that utilizes the convolutional information to accurately segment the prostate from MR images. With its additional deeply supervised layers, the proposed model can detect the prostate region more effectively than other approaches. Since some information is discarded after convolution, it is necessary to pass the features extracted at early stages to later stages. The experimental results show that a significant improvement in segmentation accuracy has been achieved by our proposed method compared to other reported approaches.
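
The paper itself does not include an implementation, so the following PyTorch sketch is only an illustration of the two ideas the abstract names: auxiliary (deeply supervised) outputs attached to intermediate stages, and forwarding early-stage features to later stages so that information discarded by convolution is retained. All layer sizes, module names, and the auxiliary loss weight are assumptions, not the authors' design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DeeplySupervisedSegNet(nn.Module):
    """Illustrative 2D encoder with auxiliary (deeply supervised) heads.
    Hypothetical layout; not the architecture from the paper."""
    def __init__(self, in_ch=1, num_classes=2):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        # enc3 also receives the (downsampled) early features f1.
        self.enc3 = nn.Sequential(nn.Conv2d(64 + 32, 128, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        # One main head plus auxiliary heads on the earlier stages.
        self.head_main = nn.Conv2d(128, num_classes, 1)
        self.head_aux1 = nn.Conv2d(32, num_classes, 1)
        self.head_aux2 = nn.Conv2d(64, num_classes, 1)

    def forward(self, x):
        size = x.shape[2:]
        f1 = self.enc1(x)
        f2 = self.enc2(self.pool(f1))
        f1_small = self.pool(self.pool(f1))           # match pooled-f2 resolution
        f3 = self.enc3(torch.cat([self.pool(f2), f1_small], dim=1))
        up = lambda t: F.interpolate(t, size=size, mode='bilinear',
                                     align_corners=False)
        # Every head predicts at full resolution so each can be supervised.
        return up(self.head_main(f3)), up(self.head_aux1(f1)), up(self.head_aux2(f2))

def deep_supervision_loss(outputs, target, aux_weight=0.4):
    """Main loss plus down-weighted auxiliary losses (the weight is a guess)."""
    main, *aux = outputs
    loss = F.cross_entropy(main, target)
    for a in aux:
        loss = loss + aux_weight * F.cross_entropy(a, target)
    return loss
```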
Citations
Journal ArticleDOI
TL;DR: UNet++ as mentioned in this paper is an efficient ensemble of U-Nets of varying depths, which partially share an encoder and co-learn simultaneously using deep supervision, with redesigned skip connections that lead to a highly flexible feature fusion scheme.
Abstract: The state-of-the-art models for medical image segmentation are variants of U-Net and fully convolutional networks (FCN). Despite their success, these models have two limitations: (1) their optimal depth is apriori unknown, requiring extensive architecture search or inefficient ensemble of models of varying depths; and (2) their skip connections impose an unnecessarily restrictive fusion scheme, forcing aggregation only at the same-scale feature maps of the encoder and decoder sub-networks. To overcome these two limitations, we propose UNet++, a new neural architecture for semantic and instance segmentation, by (1) alleviating the unknown network depth with an efficient ensemble of U-Nets of varying depths, which partially share an encoder and co-learn simultaneously using deep supervision; (2) redesigning skip connections to aggregate features of varying semantic scales at the decoder sub-networks, leading to a highly flexible feature fusion scheme; and (3) devising a pruning scheme to accelerate the inference speed of UNet++. We have evaluated UNet++ using six different medical image segmentation datasets, covering multiple imaging modalities such as computed tomography (CT), magnetic resonance imaging (MRI), and electron microscopy (EM), and demonstrating that (1) UNet++ consistently outperforms the baseline models for the task of semantic segmentation across different datasets and backbone architectures; (2) UNet++ enhances segmentation quality of varying-size objects—an improvement over the fixed-depth U-Net; (3) Mask RCNN++ (Mask R-CNN with UNet++ design) outperforms the original Mask R-CNN for the task of instance segmentation; and (4) pruned UNet++ models achieve significant speedup while showing only modest performance degradation. Our implementation and pre-trained models are available at https://github.com/MrGiovanni/UNetPlusPlus .
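
For readers who want the nested skip pathway in code, here is a heavily trimmed depth-3 sketch, not the released implementation linked above; channel widths and node names are illustrative. Each decoder node X(i,j) fuses all same-scale predecessors X(i,0..j-1) with the upsampled X(i+1,j-1), and each decoder column gets its own supervised head, which is what makes both the ensemble view and the pruning scheme possible.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU())

class TinyUNetPlusPlus(nn.Module):
    """Depth-3 sketch of UNet++'s nested, densely connected skip pathways."""
    def __init__(self, in_ch=1, num_classes=2, c=(16, 32, 64)):
        super().__init__()
        self.pool = nn.MaxPool2d(2)
        # Backbone (column j=0): a plain encoder shared by all nested U-Nets.
        self.x00 = conv_block(in_ch, c[0])
        self.x10 = conv_block(c[0], c[1])
        self.x20 = conv_block(c[1], c[2])
        # Decoder nodes: X(i,j) sees all X(i,0..j-1) plus upsampled X(i+1,j-1).
        self.x01 = conv_block(c[0] + c[1], c[0])
        self.x11 = conv_block(c[1] + c[2], c[1])
        self.x02 = conv_block(c[0] * 2 + c[1], c[0])
        # Deep supervision: one 1x1 head per decoder column, full resolution.
        self.head1 = nn.Conv2d(c[0], num_classes, 1)
        self.head2 = nn.Conv2d(c[0], num_classes, 1)

    def up(self, x):
        return F.interpolate(x, scale_factor=2, mode='bilinear',
                             align_corners=False)

    def forward(self, x):
        x00 = self.x00(x)
        x10 = self.x10(self.pool(x00))
        x20 = self.x20(self.pool(x10))
        x01 = self.x01(torch.cat([x00, self.up(x10)], 1))
        x11 = self.x11(torch.cat([x10, self.up(x20)], 1))
        x02 = self.x02(torch.cat([x00, x01, self.up(x11)], 1))
        # Averaging the heads gives the ensemble; keeping only head1
        # corresponds to a pruned, faster model.
        return self.head1(x01), self.head2(x02)
```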

1,487 citations

Journal ArticleDOI
Kai Hu, Zhenzhen Zhang, Xiaorui Niu, Yuan Zhang, Chunhong Cao, Fen Xiao, Xieping Gao
TL;DR: A novel retinal vessel segmentation method of the fundus images based on convolutional neural network (CNN) and fully connected conditional random fields (CRFs) which allows for detection of more tiny blood vessels and more precise locating of the edges.

233 citations

Proceedings ArticleDOI
01 Jan 2022
TL;DR: UNETR as discussed by the authors utilizes a transformer encoder to learn sequence representations of the input volume and effectively capture the global multi-scale information, while also following the successful U-shaped network design for the encoder and decoder.
Abstract: Fully Convolutional Neural Networks (FCNNs) with contracting and expanding paths have shown prominence for the majority of medical image segmentation applications since the past decade. In FCNNs, the encoder plays an integral role by learning both global and local features and contextual representations which can be utilized for semantic output prediction by the decoder. Despite their success, the locality of convolutional layers in FCNNs, limits the capability of learning long-range spatial dependencies. Inspired by the recent success of transformers for Natural Language Processing (NLP) in long-range sequence learning, we reformulate the task of volumetric (3D) medical image segmentation as a sequence-to-sequence prediction problem. We introduce a novel architecture, dubbed as UNEt TRansformers (UNETR), that utilizes a transformer as the encoder to learn sequence representations of the input volume and effectively capture the global multi-scale information, while also following the successful "U-shaped" network design for the encoder and decoder. The transformer encoder is directly connected to a decoder via skip connections at different resolutions to compute the final semantic segmentation output. We have validated the performance of our method on the Multi Atlas Labeling Beyond The Cranial Vault (BTCV) dataset for multi-organ segmentation and the Medical Segmentation Decathlon (MSD) dataset for brain tumor and spleen segmentation tasks. Our benchmarks demonstrate new state-of-the-art performance on the BTCV leaderboard.
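
A compressed sketch of the encoder idea, assuming ViT-style non-overlapping 3D patch embedding; the sizes and depth here are arbitrary, and the real UNETR taps features at several transformer depths for its skip connections:

```python
import torch
import torch.nn as nn

class TinyViT3DEncoder(nn.Module):
    """Sketch: 3D volume -> patch tokens -> transformer -> coarse 3D grid."""
    def __init__(self, in_ch=1, img=64, patch=16, dim=96, depth=4, heads=4):
        super().__init__()
        self.grid = img // patch                       # tokens per axis
        n_tokens = self.grid ** 3
        # Non-overlapping 3D patch embedding, ViT-style.
        self.embed = nn.Conv3d(in_ch, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, n_tokens, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):                              # x: (B, C, D, H, W)
        t = self.embed(x).flatten(2).transpose(1, 2)   # (B, N, dim) sequence
        t = self.encoder(t + self.pos)
        # Fold the sequence back into a 3D grid; UNETR taps features like
        # this at several depths and feeds them to a convolutional decoder
        # through skip connections.
        b, n, d = t.shape
        return t.transpose(1, 2).reshape(b, d, self.grid, self.grid, self.grid)

vol = torch.randn(1, 1, 64, 64, 64)
feat = TinyViT3DEncoder()(vol)                         # -> (1, 96, 4, 4, 4)
```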

219 citations

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper proposed a multi-site network (MS-Net) for improving prostate segmentation by learning robust representations, leveraging multiple sources of data, and developed Domain-Specific Batch Normalization layers in the network backbone, enabling the network to estimate statistics and perform feature normalization for each site separately.
Abstract: Automated prostate segmentation in MRI is highly demanded for computer-assisted diagnosis. Recently, a variety of deep learning methods have achieved remarkable progress in this task, usually relying on large amounts of training data. Due to the nature of scarcity for medical images, it is important to effectively aggregate data from multiple sites for robust model training, to alleviate the insufficiency of single-site samples. However, the prostate MRIs from different sites present heterogeneity due to the differences in scanners and imaging protocols, raising challenges for effective ways of aggregating multi-site data for network training. In this paper, we propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations, leveraging multiple sources of data. To compensate for the inter-site heterogeneity of different MRI datasets, we develop Domain-Specific Batch Normalization layers in the network backbone, enabling the network to estimate statistics and perform feature normalization for each site separately. Considering the difficulty of capturing the shared knowledge from multiple datasets, a novel learning paradigm, i.e. , Multi-site-guided Knowledge Transfer, is proposed to enhance the kernels to extract more generic representations from multi-site data. Extensive experiments on three heterogeneous prostate MRI datasets demonstrate that our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
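
The Domain-Specific Batch Normalization idea is compact enough to sketch directly. In this minimal PyTorch version (the class name and routing interface are assumptions, not the MS-Net code), each site keeps its own normalization statistics and affine parameters while every other weight in the network stays shared:

```python
import torch
import torch.nn as nn

class DomainSpecificBatchNorm2d(nn.Module):
    """One BatchNorm per acquisition site: separate running statistics and
    affine parameters per domain, while all other weights stay shared."""
    def __init__(self, num_features, num_domains):
        super().__init__()
        self.bns = nn.ModuleList(nn.BatchNorm2d(num_features)
                                 for _ in range(num_domains))

    def forward(self, x, domain_idx):
        # Route the batch through the BN belonging to its site.
        return self.bns[domain_idx](x)

# Usage: each mini-batch must come from a single site.
dsbn = DomainSpecificBatchNorm2d(num_features=64, num_domains=3)
x = torch.randn(4, 64, 32, 32)   # a batch of feature maps from site 1
y = dsbn(x, domain_idx=1)
```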

191 citations

Journal ArticleDOI
TL;DR: A novel deeply supervised deep learning-based approach with group dilated convolution is developed to automatically segment the prostate on MRI; its clinical feasibility is demonstrated and its accuracy validated against manual segmentation.
Abstract: Purpose Reliable automated segmentation of the prostate is indispensable for image-guided prostate interventions. However, the segmentation task is challenging due to inhomogeneous intensity distributions and variation in prostate anatomy, among other problems. Manual segmentation can be time-consuming and is subject to inter- and intraobserver variation. We developed an automated deep learning-based method to address this technical challenge. Methods We propose a three-dimensional (3D) fully convolutional network (FCN) with deep supervision and group dilated convolution to segment the prostate on magnetic resonance imaging (MRI). In this method, a deeply supervised mechanism was introduced into a 3D FCN to effectively alleviate the common exploding or vanishing gradient problems in training deep models, which forces the update process of the hidden-layer filters to favor highly discriminative features. A group dilated convolution, which aggregates multiscale contextual information for dense prediction, was proposed to enlarge the effective receptive field of the network and improve the prediction accuracy of the prostate boundary. In addition, we introduced a combined loss function including cosine and cross entropy, which measures similarity and dissimilarity between segmented and manual contours, to further improve the segmentation accuracy. Prostate volumes manually segmented by experienced physicians were used as a gold standard against which our segmentation accuracy was measured. Results The proposed method was evaluated on an internal dataset comprising 40 T2-weighted prostate MR volumes. Our method achieved a Dice similarity coefficient (DSC) of 0.86 ± 0.04, a mean surface distance (MSD) of 1.79 ± 0.46 mm, a 95% Hausdorff distance (95%HD) of 7.98 ± 2.91 mm, and an absolute relative volume difference (aRVD) of 15.65 ± 10.82. A public dataset (PROMISE12) including 50 T2-weighted prostate MR volumes was also employed to evaluate our approach. Our method yielded a DSC of 0.88 ± 0.05, an MSD of 1.02 ± 0.35 mm, a 95% HD of 9.50 ± 5.11 mm, and an aRVD of 8.93 ± 7.56. Conclusion We developed a novel deeply supervised deep learning-based approach with group dilated convolution to automatically segment the prostate on MRI, demonstrated its clinical feasibility, and validated its accuracy against manual segmentation. The proposed technique could be a useful tool for image-guided interventions in prostate cancer.
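
As a rough illustration of the two components named in the Methods, a group dilated convolution block and a combined cosine/cross-entropy loss might be sketched as follows; the dilation rates, branch split, and equal loss weighting are assumptions rather than the paper's exact choices:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupDilatedBlock(nn.Module):
    """Parallel 3D convolutions with different dilation rates; concatenating
    the branches aggregates multiscale context and enlarges the receptive
    field without extra pooling. The rates here are hypothetical."""
    def __init__(self, in_ch, out_ch, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv3d(in_ch, out_ch // len(rates), 3, padding=r, dilation=r)
            for r in rates)

    def forward(self, x):
        return torch.cat([b(x) for b in self.branches], dim=1)

def combined_loss(logits, target):
    """Cross entropy plus a cosine dissimilarity term between the predicted
    foreground probability map and the binary reference mask."""
    ce = F.cross_entropy(logits, target)
    prob = torch.softmax(logits, dim=1)[:, 1].flatten(1)  # foreground prob.
    ref = target.float().flatten(1)
    cos = F.cosine_similarity(prob, ref, dim=1).mean()
    return ce + (1.0 - cos)
```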

172 citations


Cites methods from "Deeply-supervised CNN for prostate ..."

  • ...Notably, our 3D DSD-FCN segmentation method differs from previous work on MR prostate segmentation with respect to network architecture, convolutional kernels construction, and loss function.(40,41) A keyword combination including “MRI & prostate & segmentation & deeply supervised & dilated convolution” was used to search for any relevant publications on PubMed....


References
Proceedings Article
03 Dec 2012
TL;DR: A large, deep convolutional neural network, as discussed by the authors, achieved state-of-the-art ImageNet classification performance; it consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
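
The classifier head described here, with dropout regularizing the fully-connected layers, translates almost directly into code; a PyTorch sketch (the 256*6*6 flattened input size follows the standard AlexNet layout):

```python
import torch.nn as nn

# Three fully-connected layers ending in 1000-way classification, with
# dropout applied to reduce overfitting, as described in the abstract.
classifier = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),   # logits; the softmax is folded into the loss
)
```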

73,978 citations

Proceedings Article
04 Sep 2014
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
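
The core design point is easy to show in code: two stacked 3x3 convolutions cover the same 5x5 receptive field as one 5x5 layer with fewer weights (2 * 9C^2 vs. 25C^2 for C channels) and an extra non-linearity in between. A generic block in this style (a sketch, not the paper's exact configuration):

```python
import torch.nn as nn

def vgg_block(in_ch, out_ch, n_convs=2):
    """Stack of small 3x3 convolutions followed by 2x2 max-pooling."""
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(2))   # halve the resolution between blocks
    return nn.Sequential(*layers)

# Chaining blocks of increasing width yields a VGG-style deep network.
features = nn.Sequential(vgg_block(3, 64), vgg_block(64, 128),
                         vgg_block(128, 256, n_convs=3))
```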

55,235 citations


"Deeply-supervised CNN for prostate ..." refers background in this paper

  • ...[7], [8] and object detection [9], [10]....

  • ...As VGGNet [9] has demonstrated that the representation depth is beneficial for classification accuracy....

  • ...Classification is the most common application in CNNs, such as GoogLeNet [3], VGG-Net [9], where the output is a class label for the image....

Book ChapterDOI
05 Oct 2015
TL;DR: Ronneberger et al. as discussed by the authors proposed a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently; the network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Abstract: There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
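
A two-level sketch of the layout described here (channel widths are illustrative, and unlike the original, which uses unpadded convolutions and cropping, this version pads so that shapes align by construction):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyUNet(nn.Module):
    """Two-level sketch of the U-Net layout: a contracting path for context,
    a symmetric expanding path for localization, joined by skip connections."""
    def __init__(self, in_ch=1, num_classes=2):
        super().__init__()
        self.down1 = nn.Sequential(nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU())
        self.down2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.bottom = nn.Sequential(nn.Conv2d(64, 128, 3, padding=1), nn.ReLU())
        self.up2 = nn.Sequential(nn.Conv2d(128 + 64, 64, 3, padding=1), nn.ReLU())
        self.up1 = nn.Sequential(nn.Conv2d(64 + 32, 32, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(32, num_classes, 1)
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        d1 = self.down1(x)
        d2 = self.down2(self.pool(d1))
        b = self.bottom(self.pool(d2))
        # Skip connections: concatenate encoder features with upsampled maps.
        u2 = self.up2(torch.cat([F.interpolate(b, scale_factor=2), d2], 1))
        u1 = self.up1(torch.cat([F.interpolate(u2, scale_factor=2), d1], 1))
        return self.head(u1)
```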

49,590 citations

Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.
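
The Inception module itself is compact enough to sketch. The configuration below uses the "inception (3a)" channel split from the GoogLeNet paper; the surrounding 22-layer network and auxiliary classifiers are omitted:

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Parallel 1x1, 3x3, 5x5 and pooled branches concatenated along channels;
    1x1 convolutions reduce dimensionality before the expensive filters."""
    def __init__(self, in_ch, c1, c3r, c3, c5r, c5, cp):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, c1, 1)
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, c3r, 1), nn.ReLU(),
                                nn.Conv2d(c3r, c3, 3, padding=1))
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, c5r, 1), nn.ReLU(),
                                nn.Conv2d(c5r, c5, 5, padding=2))
        self.bp = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, cp, 1))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], 1)

# The "inception (3a)" configuration from the paper: 256 output channels.
m = InceptionModule(192, 64, 96, 128, 16, 32, 32)
y = m(torch.randn(1, 192, 28, 28))   # -> (1, 256, 28, 28)
```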

40,257 citations