Proceedings ArticleDOI

Stack-U-Net: refinement network for improved optic disc and cup image segmentation

TL;DR: A cascade network for image segmentation is proposed, built from U-Net networks as building blocks and the idea of iterative refinement; it outperforms other methods on multiple benchmarks without the need to increase the volume of training data.
Abstract: In this work, we propose a special cascade network for image segmentation, based on U-Net networks as building blocks and the idea of iterative refinement. The model was mainly applied to achieve higher recognition quality on the task of finding the borders of the optic disc and cup, which are relevant to the presence of glaucoma. Compared to a single U-Net and the state-of-the-art methods for the investigated tasks, the presented method outperforms them on multiple benchmarks without increasing the volume of the datasets. Our experiments include comparison with the best-known methods on the publicly available databases DRIONS-DB, RIM-ONE v.3, and DRISHTI-GS, and evaluation on a private data set collected in collaboration with the University of California San Francisco Medical School. An analysis of the architecture details is presented. It is argued that the model can be employed for a broad scope of image segmentation problems of a similar nature.
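The cascade idea above can be sketched in a few lines: each block receives the original image together with the previous block's prediction and emits a refined prediction. The snippet below is a toy numpy sketch, not the authors' implementation; `unet_block` is a hypothetical stand-in (a box blur) for a full encoder-decoder CNN, and the fusion of image and prediction is simplified to an addition in place of channel concatenation.

```python
import numpy as np

def unet_block(x):
    """Stand-in for one U-Net building block: here a 3x3 box blur;
    in the real model this is a full encoder-decoder CNN."""
    padded = np.pad(x, 1, mode="edge")
    h, w = x.shape
    return sum(padded[i:i + h, j:j + w]
               for i in range(3) for j in range(3)) / 9.0

def stack_u_net(image, n_blocks=3):
    """Cascade of blocks: each block sees the image fused with the
    previous block's prediction and refines that prediction."""
    pred = np.zeros_like(image)
    for _ in range(n_blocks):
        # addition is a placeholder for channel-wise concatenation
        pred = unet_block(image + pred)
    return pred
```

The key point is that every block works on the same input image plus an increasingly refined prediction, so depth is added by stacking whole encoder-decoder units rather than by enlarging a single network.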
Citations
Journal ArticleDOI
TL;DR: A novel document binarization method called Cascading Modular U-Nets (CMU-Net), consisting of pre-trained modular U-Nets that help overcome the shortage of training images, together with a novel cascading scheme that improves overall cascade model performance.

39 citations

Journal ArticleDOI
TL;DR: The proposed two-stage learning framework based on two connected Stacked U-Nets outperforms many current segmentation methods, and its consistently good segmentation performance on images from different organs indicates the generalized adaptability of the approach.
Abstract: Nuclei segmentation is a fundamental but challenging task in histopathological image analysis. One of the main problems is the existence of overlapping regions, which increases the difficulty of separating nuclei independently. In this study, to solve the segmentation of nuclei and overlapping regions, we introduce a nuclei segmentation method based on a two-stage learning framework consisting of two connected Stacked U-Nets (SUNets). The proposed SUNets consist of four parallel backbone nets, which are merged by the attention generation model. In the first stage, a Stacked U-Net is utilized to predict pixel-wise segmentation of nuclei. The output binary map together with the RGB values of the original images are concatenated as the input of the second stage of SUNets. Due to the sizable imbalance between overlapping and background regions, the first network is trained with cross-entropy loss, while the second network is trained with focal loss. We applied the method on two publicly available datasets and achieved state-of-the-art performance for nuclei segmentation: mean Aggregated Jaccard Index (AJI) results were 0.5965 and 0.6210, and F1 scores were 0.8247 and 0.8060, respectively; our method also segmented the overlapping regions between nuclei, with average AJI = 0.3254. The proposed two-stage learning framework outperforms many current segmentation methods, and the consistently good segmentation performance on images from different organs indicates the generalized adaptability of our approach.
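The loss choice in the second stage can be illustrated with a small sketch: focal loss down-weights pixels the model already classifies confidently, so the rare overlapping-region class dominates the gradient. Below is a hedged numpy version of binary focal loss next to plain cross-entropy; `gamma=2` follows the common default, and the exact form used by the authors may differ.

```python
import numpy as np

def cross_entropy(p, y, eps=1e-7):
    """Mean binary cross-entropy for predicted probabilities p, labels y."""
    p = np.clip(p, eps, 1 - eps)
    pt = np.where(y == 1, p, 1 - p)        # probability of the true class
    return float(np.mean(-np.log(pt)))

def focal_loss(p, y, gamma=2.0, eps=1e-7):
    """Binary focal loss: the (1 - pt)^gamma factor shrinks the
    contribution of easy (high-pt) pixels toward zero."""
    p = np.clip(p, eps, 1 - eps)
    pt = np.where(y == 1, p, 1 - p)
    return float(np.mean(-((1 - pt) ** gamma) * np.log(pt)))
```

Since the modulating factor is at most 1, focal loss never exceeds cross-entropy on the same predictions; it simply reallocates the training signal toward hard pixels.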

25 citations

Journal ArticleDOI
TL;DR: A unified convolutional neural network, named ResFPN-Net, that learns the boundary features and the inner relation between OD and OC for automatic segmentation; it is effective in analysing fundus images for glaucoma screening and can be applied to other related biomedical image segmentation applications.
Abstract: Automatic segmentation of optic disc (OD) and optic cup (OC) is an essential task for analysing colour fundus images. In clinical practice, accurate OD and OC segmentation assist ophthalmologists in diagnosing glaucoma. In this paper, we propose a unified convolutional neural network, named ResFPN-Net, which learns the boundary feature and the inner relation between OD and OC for automatic segmentation. The proposed ResFPN-Net is mainly composed of multi-scale feature extractor, multi-scale segmentation transition and attention pyramid architecture. The multi-scale feature extractor achieved the feature encoding of fundus images and captured the boundary representations. The multi-scale segmentation transition is employed to retain the features of different scales. Moreover, an attention pyramid architecture is proposed to learn rich representations and the mutual connection in the OD and OC. To verify the effectiveness of the proposed method, we conducted extensive experiments on two public datasets. On the Drishti-GS database, we achieved a Dice coefficient of 97.59%, 89.87%, the accuracy of 99.21%, 98.77%, and the Averaged Hausdorff distance of 0.099, 0.882 on the OD and OC segmentation, respectively. We achieved a Dice coefficient of 96.41%, 83.91%, the accuracy of 99.30%, 99.24%, and the Averaged Hausdorff distance of 0.166, 1.210 on the RIM-ONE database for OD and OC segmentation, respectively. Comprehensive results show that the proposed method outperforms other competitive OD and OC segmentation methods and appears more adaptable in cross-dataset scenarios. The introduced multi-scale loss function achieved significantly lower training loss and higher accuracy compared with other loss functions. Furthermore, the proposed method is further validated in OC to OD ratio calculation task and achieved the best MAE of 0.0499 and 0.0630 on the Drishti-GS and RIM-ONE datasets, respectively. 
Finally, we evaluated the effectiveness of glaucoma screening on the Drishti-GS and RIM-ONE datasets, achieving AUCs of 0.8947 and 0.7964. These results prove that the proposed ResFPN-Net is effective in analysing fundus images for glaucoma screening and can be applied to other related biomedical image segmentation applications.
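The Dice coefficient reported throughout these results has a simple definition: twice the intersection of the predicted and ground-truth masks divided by the sum of their areas. A minimal numpy sketch, assuming binary masks (not any paper's evaluation code):

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient between two binary masks:
    2 * |pred ∩ gt| / (|pred| + |gt|), in [0, 1]."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())
```

Identical masks score 1.0 and disjoint masks score 0.0, which is why scores like 0.9759 (OD) versus 0.8987 (OC) indicate that cup boundaries are substantially harder to match than disc boundaries.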

12 citations

Journal ArticleDOI
TL;DR: This review verifies whether deep learning techniques can be helpful in performing accurate and low-cost measurements related to glaucoma, which may promote patient empowerment and help medical doctors better monitor patients.
Abstract: Artificial intelligence techniques are now being applied in different medical solutions ranging from disease screening to activity recognition and computer-aided diagnosis. The combination of computer science methods and medical knowledge facilitates and improves the accuracy of the different processes and tools. Inspired by these advances, this paper performs a literature review focused on state-of-the-art glaucoma screening, segmentation, and classification based on images of the papilla and excavation using deep learning techniques. These techniques have been shown to have high sensitivity and specificity in glaucoma screening based on papilla and excavation images. The automatic segmentation of the contours of the optic disc and the excavation then allows the identification and assessment of the glaucomatous disease’s progression. As a result, we verified whether deep learning techniques may be helpful in performing accurate and low-cost measurements related to glaucoma, which may promote patient empowerment and help medical doctors better monitor patients.

10 citations

Journal ArticleDOI
TL;DR: DDSC-Net, a two-stage method in which the optic disc is first located and the optic disc and cup are then segmented jointly within the located region of interest.
Abstract: Glaucoma is an eye disease that causes vision loss and even blindness. The cup-to-disc ratio (CDR) is an important indicator for glaucoma screening and diagnosis, and accurate segmentation of the optic disc and cup helps obtain the CDR. Although many deep learning-based methods have been proposed to segment the disc and cup in fundus images, achieving highly accurate segmentation is still a great challenge due to the heavy overlap between the optic disc and cup. In this paper, we propose a two-stage method in which the optic disc is first located and the optic disc and cup are then segmented jointly within the located region of interest. We also treat the joint optic disc and cup segmentation task as a multi-category semantic segmentation task, for which a deep learning-based model named DDSC-Net (densely connected depthwise separable convolution network) is proposed. Specifically, we employ depthwise separable convolutional layers and an image pyramid input to form a deeper and wider network that improves segmentation performance. Finally, we evaluate our method on two publicly available datasets, Drishti-GS and REFUGE. The experimental results show that the proposed method outperforms state-of-the-art methods, such as pOSAL, GL-Net, M-Net and Stack-U-Net, in terms of Dice coefficients, with scores of 0.9780 (optic disc) and 0.9123 (optic cup) on the Drishti-GS dataset, and scores of 0.9601 (optic disc) and 0.8903 (optic cup) on the REFUGE dataset. In particular, on the more challenging optic cup segmentation task, our method outperforms GL-Net by 0.7% in Dice coefficient on the Drishti-GS dataset and pOSAL by 0.79% on the REFUGE dataset. The promising segmentation performance shows that our method has potential for assisting the screening and diagnosis of glaucoma.
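The CDR mentioned above can be computed directly from the two segmentation masks once they are available. The sketch below uses the vertical CDR (ratio of the vertical extents of cup and disc), one common clinical convention; the exact CDR definition used in any given paper may differ.

```python
import numpy as np

def vertical_cdr(disc_mask, cup_mask):
    """Vertical cup-to-disc ratio from two binary masks:
    vertical extent of the cup divided by that of the disc."""
    def height(mask):
        rows = np.where(mask.any(axis=1))[0]   # row indices containing mask pixels
        return rows.max() - rows.min() + 1
    return height(cup_mask) / height(disc_mask)
```

A larger CDR (an enlarged cup relative to the disc) is the glaucoma indicator these segmentation pipelines ultimately feed into, which is why small gains in cup Dice matter clinically.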

9 citations

References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers, 8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to the ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
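The core reformulation, y = x + F(x), is easy to sketch: the block learns only the residual F, and the identity mapping is recovered for free when the residual weights are zero, which is what makes very deep stacks trainable. A toy numpy illustration, where a single linear + ReLU "layer" stands in for the real convolutional stack:

```python
import numpy as np

def residual_block(x, weight):
    """y = x + F(x): the shortcut passes x through unchanged, so the
    block only has to learn the residual F. With zero weights, F(x) = 0
    and the block reduces exactly to the identity mapping."""
    fx = np.maximum(0.0, x @ weight)   # toy residual branch: linear + ReLU
    return x + fx
```

Stacking many such blocks cannot make the network worse than a shallower one, since each extra block can fall back to the identity, which is the intuition behind training 152-layer networks.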

123,388 citations

Proceedings Article
01 Jan 2015
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.

49,914 citations

Book ChapterDOI
05 Oct 2015
TL;DR: Ronneberger et al. propose a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently; the network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Abstract: There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
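The contracting/expanding structure with skip connections can be sketched at its smallest scale: pool down to capture context, upsample back, and fuse with the feature map saved at the same resolution for precise localization. The toy numpy version below is hypothetical (max pooling and nearest-neighbour upsampling stand in for learned convolutions) and assumes even spatial sizes.

```python
import numpy as np

def down(x):
    """2x2 max pooling: one step of the contracting path."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def up(x):
    """Nearest-neighbour 2x upsampling: one step of the expanding path."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def tiny_unet(x):
    """One-level toy U-Net: save a skip connection, pool to a
    bottleneck, upsample, and fuse with the skip at full resolution.
    Assumes x has even height and width."""
    skip = x                      # feature map kept from the contracting path
    bottleneck = down(x)          # coarse context
    return (up(bottleneck) + skip) / 2.0   # fusion stands in for concat+conv
```

The real architecture repeats this pattern over several resolutions with learned convolutions at every level; the skip connections are what let the output recover fine boundaries that pooling alone would destroy.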

49,590 citations

Journal ArticleDOI
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images; it has been run annually from 2010 to the present, attracting participation from more than fifty institutions.
Abstract: The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the 5 years of the challenge, and propose future directions and improvements.

30,811 citations

Posted Content
TL;DR: It is shown that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Abstract: There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at this http URL .

19,534 citations


"Stack-U-Net: refinement network for..." refers methods in this paper

  • ...In this work we intend to provide a new end-to-end approach to the medical segmentation task of optic disc and cup borders localization, which is based on well-known and highly-performing U-Net [6] convolutional neural network (CNN) of encoder-decoder style....

    [...]

  • ...It consists of basic blocks, and each of them follows the encoder-decoder architecture similar to U-Net [6], depicted on Fig....

    [...]
