Proceedings ArticleDOI

Stack-U-Net: refinement network for improved optic disc and cup image segmentation

TL;DR: A cascade network for image segmentation is proposed, built from U-Net networks as building blocks and the idea of iterative refinement; it outperforms other methods on multiple benchmarks without the need to increase the volume of training data.
Abstract: In this work, we propose a special cascade network for image segmentation, based on U-Net networks as building blocks and the idea of iterative refinement. The model was mainly applied to achieve higher recognition quality on the task of finding the borders of the optic disc and cup, which are relevant to the presence of glaucoma. Compared to a single U-Net and the state-of-the-art methods for the investigated tasks, the presented method outperforms them on multiple benchmarks without increasing the volume of the datasets. Our experiments include comparison with the best-known methods on the publicly available databases DRIONS-DB, RIM-ONE v.3, and DRISHTI-GS, and evaluation on a private data set collected in collaboration with the University of California San Francisco Medical School. An analysis of the architecture details is presented. It is argued that the model can be employed for a broad scope of image segmentation problems of a similar nature.
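The cascade idea above can be sketched in a few lines: each block receives the original image together with the previous block's prediction and emits a refined prediction. The snippet below is a toy numpy sketch, not the authors' implementation; `unet_block` is a hypothetical stand-in (a box blur) for a full encoder-decoder CNN, and the fusion of image and prediction is simplified to an addition in place of channel concatenation.

```python
import numpy as np

def unet_block(x):
    """Stand-in for one U-Net building block: here a 3x3 box blur;
    in the real model this is a full encoder-decoder CNN."""
    padded = np.pad(x, 1, mode="edge")
    h, w = x.shape
    return sum(padded[i:i + h, j:j + w]
               for i in range(3) for j in range(3)) / 9.0

def stack_u_net(image, n_blocks=3):
    """Cascade of blocks: each block sees the image fused with the
    previous block's prediction and refines that prediction."""
    pred = np.zeros_like(image)
    for _ in range(n_blocks):
        # addition is a placeholder for channel-wise concatenation
        pred = unet_block(image + pred)
    return pred
```

The key point is that every block works on the same input image plus an increasingly refined prediction, so depth is added by stacking whole encoder-decoder units rather than by enlarging a single network.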
Citations
Journal ArticleDOI
TL;DR: A novel document binarization method called Cascading Modular U-Nets (CMU-Net), consisting of pre-trained modular U-Nets that help overcome the shortage of training images, together with a novel cascading scheme that improves overall cascade model performance.

39 citations

Journal ArticleDOI
TL;DR: The proposed two-stage learning framework based on two connected Stacked U-Nets outperforms many current segmentation methods, and its consistently good segmentation performance on images from different organs indicates the generalized adaptability of the approach.
Abstract: Nuclei segmentation is a fundamental but challenging task in histopathological image analysis. One of the main problems is the existence of overlapping regions, which increases the difficulty of separating nuclei independently. In this study, to solve the segmentation of nuclei and overlapping regions, we introduce a nuclei segmentation method based on a two-stage learning framework consisting of two connected Stacked U-Nets (SUNets). The proposed SUNets consist of four parallel backbone nets, which are merged by the attention generation model. In the first stage, a Stacked U-Net is utilized to predict pixel-wise segmentation of nuclei. The output binary map together with the RGB values of the original images are concatenated as the input of the second stage of SUNets. Due to the sizable imbalance between overlapping and background regions, the first network is trained with cross-entropy loss, while the second network is trained with focal loss. We applied the method on two publicly available datasets and achieved state-of-the-art performance for nuclei segmentation: mean Aggregated Jaccard Index (AJI) results were 0.5965 and 0.6210, and F1 scores were 0.8247 and 0.8060, respectively; our method also segmented the overlapping regions between nuclei, with average AJI = 0.3254. The proposed two-stage learning framework outperforms many current segmentation methods, and the consistently good segmentation performance on images from different organs indicates the generalized adaptability of our approach.
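The loss choice in the second stage can be illustrated with a small sketch: focal loss down-weights pixels the model already classifies confidently, so the rare overlapping-region class dominates the gradient. Below is a hedged numpy version of binary focal loss next to plain cross-entropy; `gamma=2` follows the common default, and the exact form used by the authors may differ.

```python
import numpy as np

def cross_entropy(p, y, eps=1e-7):
    """Mean binary cross-entropy for predicted probabilities p, labels y."""
    p = np.clip(p, eps, 1 - eps)
    pt = np.where(y == 1, p, 1 - p)        # probability of the true class
    return float(np.mean(-np.log(pt)))

def focal_loss(p, y, gamma=2.0, eps=1e-7):
    """Binary focal loss: the (1 - pt)^gamma factor shrinks the
    contribution of easy (high-pt) pixels toward zero."""
    p = np.clip(p, eps, 1 - eps)
    pt = np.where(y == 1, p, 1 - p)
    return float(np.mean(-((1 - pt) ** gamma) * np.log(pt)))
```

Since the modulating factor is at most 1, focal loss never exceeds cross-entropy on the same predictions; it simply reallocates the training signal toward hard pixels.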

25 citations

Journal ArticleDOI
TL;DR: A unified convolutional neural network, named ResFPN-Net, that learns the boundary features and the inner relation between OD and OC for automatic segmentation; it is effective in analysing fundus images for glaucoma screening and can be applied to other related biomedical image segmentation applications.
Abstract: Automatic segmentation of optic disc (OD) and optic cup (OC) is an essential task for analysing colour fundus images. In clinical practice, accurate OD and OC segmentation assist ophthalmologists in diagnosing glaucoma. In this paper, we propose a unified convolutional neural network, named ResFPN-Net, which learns the boundary feature and the inner relation between OD and OC for automatic segmentation. The proposed ResFPN-Net is mainly composed of multi-scale feature extractor, multi-scale segmentation transition and attention pyramid architecture. The multi-scale feature extractor achieved the feature encoding of fundus images and captured the boundary representations. The multi-scale segmentation transition is employed to retain the features of different scales. Moreover, an attention pyramid architecture is proposed to learn rich representations and the mutual connection in the OD and OC. To verify the effectiveness of the proposed method, we conducted extensive experiments on two public datasets. On the Drishti-GS database, we achieved a Dice coefficient of 97.59%, 89.87%, the accuracy of 99.21%, 98.77%, and the Averaged Hausdorff distance of 0.099, 0.882 on the OD and OC segmentation, respectively. We achieved a Dice coefficient of 96.41%, 83.91%, the accuracy of 99.30%, 99.24%, and the Averaged Hausdorff distance of 0.166, 1.210 on the RIM-ONE database for OD and OC segmentation, respectively. Comprehensive results show that the proposed method outperforms other competitive OD and OC segmentation methods and appears more adaptable in cross-dataset scenarios. The introduced multi-scale loss function achieved significantly lower training loss and higher accuracy compared with other loss functions. Furthermore, the proposed method is further validated in OC to OD ratio calculation task and achieved the best MAE of 0.0499 and 0.0630 on the Drishti-GS and RIM-ONE datasets, respectively. 
Finally, we evaluated the effectiveness of glaucoma screening on the Drishti-GS and RIM-ONE datasets, achieving AUCs of 0.8947 and 0.7964. These results prove that the proposed ResFPN-Net is effective in analysing fundus images for glaucoma screening and can be applied to other related biomedical image segmentation applications.
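The Dice coefficient reported throughout these results has a simple definition: twice the intersection of the predicted and ground-truth masks divided by the sum of their areas. A minimal numpy sketch, assuming binary masks (not any paper's evaluation code):

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient between two binary masks:
    2 * |pred ∩ gt| / (|pred| + |gt|), in [0, 1]."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())
```

Identical masks score 1.0 and disjoint masks score 0.0, which is why scores like 0.9759 (OD) versus 0.8987 (OC) indicate that cup boundaries are substantially harder to match than disc boundaries.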

12 citations

Journal ArticleDOI
TL;DR: This review verifies whether deep learning techniques can be helpful in performing accurate and low-cost measurements related to glaucoma, which may promote patient empowerment and help medical doctors better monitor patients.
Abstract: Artificial intelligence techniques are now being applied in different medical solutions ranging from disease screening to activity recognition and computer-aided diagnosis. The combination of computer science methods and medical knowledge facilitates and improves the accuracy of the different processes and tools. Inspired by these advances, this paper performs a literature review focused on state-of-the-art glaucoma screening, segmentation, and classification based on images of the papilla and excavation using deep learning techniques. These techniques have been shown to have high sensitivity and specificity in glaucoma screening based on papilla and excavation images. The automatic segmentation of the contours of the optic disc and the excavation then allows the identification and assessment of the glaucomatous disease’s progression. As a result, we verified whether deep learning techniques may be helpful in performing accurate and low-cost measurements related to glaucoma, which may promote patient empowerment and help medical doctors better monitor patients.

10 citations

Journal ArticleDOI
TL;DR: DDSC-Net, a two-stage method in which the optic disc is first located and the optic disc and cup are then segmented jointly within the located region of interest.
Abstract: Glaucoma is an eye disease that causes vision loss and even blindness. The cup-to-disc ratio (CDR) is an important indicator for glaucoma screening and diagnosis, and accurate segmentation of the optic disc and cup helps obtain the CDR. Although many deep learning-based methods have been proposed to segment the disc and cup in fundus images, achieving highly accurate segmentation is still a great challenge due to the heavy overlap between the optic disc and cup. In this paper, we propose a two-stage method in which the optic disc is first located and the optic disc and cup are then segmented jointly within the located region of interest. We also treat the joint optic disc and cup segmentation task as a multi-category semantic segmentation task, for which a deep learning-based model named DDSC-Net (densely connected depthwise separable convolution network) is proposed. Specifically, we employ depthwise separable convolutional layers and an image pyramid input to form a deeper and wider network that improves segmentation performance. Finally, we evaluate our method on two publicly available datasets, Drishti-GS and REFUGE. The experimental results show that the proposed method outperforms state-of-the-art methods, such as pOSAL, GL-Net, M-Net and Stack-U-Net, in terms of Dice coefficients, with scores of 0.9780 (optic disc) and 0.9123 (optic cup) on the Drishti-GS dataset, and scores of 0.9601 (optic disc) and 0.8903 (optic cup) on the REFUGE dataset. In particular, on the more challenging optic cup segmentation task, our method outperforms GL-Net by 0.7% in Dice coefficient on the Drishti-GS dataset and pOSAL by 0.79% on the REFUGE dataset. The promising segmentation performance shows that our method has potential for assisting the screening and diagnosis of glaucoma.
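The CDR mentioned above can be computed directly from the two segmentation masks once they are available. The sketch below uses the vertical CDR (ratio of the vertical extents of cup and disc), one common clinical convention; the exact CDR definition used in any given paper may differ.

```python
import numpy as np

def vertical_cdr(disc_mask, cup_mask):
    """Vertical cup-to-disc ratio from two binary masks:
    vertical extent of the cup divided by that of the disc."""
    def height(mask):
        rows = np.where(mask.any(axis=1))[0]   # row indices containing mask pixels
        return rows.max() - rows.min() + 1
    return height(cup_mask) / height(disc_mask)
```

A larger CDR (an enlarged cup relative to the disc) is the glaucoma indicator these segmentation pipelines ultimately feed into, which is why small gains in cup Dice matter clinically.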

9 citations

References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers, 8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to the ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
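The core reformulation, y = x + F(x), is easy to sketch: the block learns only the residual F, and the identity mapping is recovered for free when the residual weights are zero, which is what makes very deep stacks trainable. A toy numpy illustration, where a single linear + ReLU "layer" stands in for the real convolutional stack:

```python
import numpy as np

def residual_block(x, weight):
    """y = x + F(x): the shortcut passes x through unchanged, so the
    block only has to learn the residual F. With zero weights, F(x) = 0
    and the block reduces exactly to the identity mapping."""
    fx = np.maximum(0.0, x @ weight)   # toy residual branch: linear + ReLU
    return x + fx
```

Stacking many such blocks cannot make the network worse than a shallower one, since each extra block can fall back to the identity, which is the intuition behind training 152-layer networks.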

123,388 citations

Proceedings Article
01 Jan 2015
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.

49,914 citations

Book ChapterDOI
05 Oct 2015
TL;DR: Ronneberger et al. propose a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently; the network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Abstract: There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
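The contracting/expanding structure with skip connections can be sketched at its smallest scale: pool down to capture context, upsample back, and fuse with the feature map saved at the same resolution for precise localization. The toy numpy version below is hypothetical (max pooling and nearest-neighbour upsampling stand in for learned convolutions) and assumes even spatial sizes.

```python
import numpy as np

def down(x):
    """2x2 max pooling: one step of the contracting path."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def up(x):
    """Nearest-neighbour 2x upsampling: one step of the expanding path."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def tiny_unet(x):
    """One-level toy U-Net: save a skip connection, pool to a
    bottleneck, upsample, and fuse with the skip at full resolution.
    Assumes x has even height and width."""
    skip = x                      # feature map kept from the contracting path
    bottleneck = down(x)          # coarse context
    return (up(bottleneck) + skip) / 2.0   # fusion stands in for concat+conv
```

The real architecture repeats this pattern over several resolutions with learned convolutions at every level; the skip connections are what let the output recover fine boundaries that pooling alone would destroy.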

49,590 citations

Journal ArticleDOI
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images; it has been run annually from 2010 to the present, attracting participation from more than fifty institutions.
Abstract: The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the 5 years of the challenge, and propose future directions and improvements.

30,811 citations

Posted Content
TL;DR: It is shown that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Abstract: There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at this http URL .

19,534 citations


"Stack-U-Net: refinement network for..." refers methods in this paper

  • ...In this work we intend to provide a new end-to-end approach to the medical segmentation task of optic disc and cup borders localization, which is based on well-known and highly-performing U-Net [6] convolutional neural network (CNN) of encoder-decoder style....

    [...]

  • ...It consists of basic blocks, and each of them follows the encoder-decoder architecture similar to U-Net [6], depicted on Fig....

    [...]
