Book ChapterDOI

U-Net: Convolutional Networks for Biomedical Image Segmentation

05 Oct 2015 - pp. 234-241
TL;DR: Ronneberger et al. as discussed by the authors proposed a network and training strategy that relies on strong use of data augmentation to use the available annotated samples more efficiently; the network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Abstract: There is broad consensus that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
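The contracting/expanding design described above is easy to see in code. Below is a minimal PyTorch sketch of the idea only; it is not the authors' Caffe implementation (linked above), it uses padded convolutions so output size matches input (the original uses unpadded convolutions with an overlap-tile strategy), and the layer widths are illustrative.

```python
# Minimal PyTorch sketch of the contracting/expanding U-Net idea
# (illustrative only -- not the authors' Caffe implementation).
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # Two 3x3 convolutions + ReLU: the basic U-Net building block.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, in_ch=1, n_classes=2):
        super().__init__()
        self.down1, self.down2 = double_conv(in_ch, 64), double_conv(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.bottom = double_conv(128, 256)
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.dec2 = double_conv(256, 128)
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = double_conv(128, 64)
        self.head = nn.Conv2d(64, n_classes, 1)  # per-pixel class scores

    def forward(self, x):
        d1 = self.down1(x)                  # contracting path: capture context
        d2 = self.down2(self.pool(d1))
        b = self.bottom(self.pool(d2))
        u2 = self.dec2(torch.cat([self.up2(b), d2], 1))   # skip connection
        u1 = self.dec1(torch.cat([self.up1(u2), d1], 1))  # skip connection
        return self.head(u1)                # expanding path: localization

out = TinyUNet()(torch.randn(1, 1, 64, 64))  # -> shape (1, 2, 64, 64)
```

The two torch.cat calls are the skip connections that carry high-resolution features from the contracting path into the expanding path; this is what gives the architecture its precise localization.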


Citations
Proceedings ArticleDOI
21 Jul 2017
TL;DR: This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.
Abstract: Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But pyramid representations have been avoided in recent object detectors that are based on deep convolutional networks, partially because they are slow to compute and memory intensive. In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost. A top-down architecture with lateral connections is developed for building high-level semantic feature maps at all scales. This architecture, called a Feature Pyramid Network (FPN), shows significant improvement as a generic feature extractor in several applications. Using a basic Faster R-CNN system, our method achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles, surpassing all existing single-model entries including those from the COCO 2016 challenge winners. In addition, our method can run at 5 FPS on a GPU and thus is a practical and accurate solution to multi-scale object detection. Code will be made publicly available.
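The top-down pathway with lateral connections reduces to a few lines. The following is a hedged PyTorch sketch, not the paper's reference code; channel widths and the stand-in backbone tensors are illustrative, while the 1x1 lateral projections, nearest-neighbor upsampling, and 3x3 smoothing convolutions follow the paper's description.

```python
# Hedged PyTorch sketch of the FPN top-down pathway with lateral connections.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownFPN(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024), out_ch=256):
        super().__init__()
        # 1x1 lateral convs project each backbone stage to a common width.
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_ch, 1) for c in in_channels)
        # 3x3 convs smooth each merged map.
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_ch, out_ch, 3, padding=1) for _ in in_channels)

    def forward(self, feats):
        # feats: backbone maps ordered fine-to-coarse (e.g. strides 8, 16, 32).
        maps = [lat(f) for lat, f in zip(self.lateral, feats)]
        # Walk top-down: upsample the coarser map and add the lateral one.
        for i in range(len(maps) - 2, -1, -1):
            maps[i] = maps[i] + F.interpolate(
                maps[i + 1], size=maps[i].shape[-2:], mode="nearest")
        return [s(m) for s, m in zip(self.smooth, maps)]

c3, c4, c5 = (torch.randn(1, c, s, s) for c, s in
              [(256, 64), (512, 32), (1024, 16)])
p3, p4, p5 = TopDownFPN()([c3, c4, c5])  # high-level semantics at all scales
```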

16,727 citations

Journal ArticleDOI
TL;DR: Quantitative assessments show that SegNet provides good performance with competitive inference time and most efficient inference memory-wise as compared to other architectures, including FCN and DeconvNet.
Abstract: We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network [1]. The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification. The novelty of SegNet lies in the manner in which the decoder upsamples its lower resolution input feature map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling. This eliminates the need for learning to upsample. The upsampled maps are sparse and are then convolved with trainable filters to produce dense feature maps. We compare our proposed architecture with the widely adopted FCN [2] and also with the well known DeepLab-LargeFOV [3] and DeconvNet [4] architectures. This comparison reveals the memory versus accuracy trade-off involved in achieving good segmentation performance. SegNet was primarily motivated by scene understanding applications. Hence, it is designed to be efficient both in terms of memory and computational time during inference. It is also significantly smaller in the number of trainable parameters than other competing architectures and can be trained end-to-end using stochastic gradient descent. We also performed a controlled benchmark of SegNet and other architectures on both road scenes and SUN RGB-D indoor scene segmentation tasks. These quantitative assessments show that SegNet provides good performance with competitive inference time and the most efficient inference memory usage compared to other architectures. We also provide a Caffe implementation of SegNet and a web demo at http://mi.eng.cam.ac.uk/projects/segnet/ .
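SegNet's distinguishing mechanism, unpooling with stored max-pooling indices, maps directly onto standard framework primitives. A small PyTorch illustration follows (SegNet itself is a Caffe network; tensor sizes here are arbitrary):

```python
# PyTorch illustration of SegNet's key trick: the decoder unpools with the
# max-pooling indices saved by the encoder, so upsampling itself needs no
# learned weights.
import torch
import torch.nn.functional as F

x = torch.randn(1, 64, 8, 8)                    # an encoder feature map
pooled, indices = F.max_pool2d(x, 2, return_indices=True)
# ... at the matching decoder stage, reverse the pooling non-linearly:
sparse = F.max_unpool2d(pooled, indices, 2)     # same size as x, mostly zeros
# A (here untrained) 3x3 convolution stands in for the decoder's trainable
# filters that turn the sparse unpooled map into a dense feature map.
dense = F.conv2d(sparse, torch.randn(64, 64, 3, 3), padding=1)
```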

13,468 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
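The "learned loss" the abstract refers to is the conditional GAN objective: the discriminator scores (input, output) pairs, and pix2pix adds an L1 term toward the ground truth. Below is a minimal PyTorch sketch of the two losses, where D, x, y_real, and y_fake are placeholder names for the discriminator, the conditioning image, the target, and the generator output.

```python
# Hedged sketch of the conditional-GAN objective. The L1 weight of 100
# follows the paper's reported setting; all names are placeholders.
import torch
import torch.nn.functional as F

def discriminator_loss(D, x, y_real, y_fake):
    # D scores (input, output) pairs: real pairs toward 1, fake toward 0.
    real = D(torch.cat([x, y_real], dim=1))
    fake = D(torch.cat([x, y_fake.detach()], dim=1))
    return (F.binary_cross_entropy_with_logits(real, torch.ones_like(real)) +
            F.binary_cross_entropy_with_logits(fake, torch.zeros_like(fake)))

def generator_loss(D, x, y_fake, y_real, l1_weight=100.0):
    # G's "learned loss": fool D on the pair, plus an L1 term that keeps
    # the output close to the ground truth.
    fake = D(torch.cat([x, y_fake], dim=1))
    adv = F.binary_cross_entropy_with_logits(fake, torch.ones_like(fake))
    return adv + l1_weight * F.l1_loss(y_fake, y_real)
```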

11,958 citations

Posted Content
TL;DR: Conditional adversarial networks as discussed by the authors are a general-purpose solution to image-to-image translation problems that can be used to synthesize photos from label maps, reconstruct objects from edge maps, and colorize images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,127 citations

Posted Content
TL;DR: DeepLab as discussed by the authors proposes atrous spatial pyramid pooling (ASPP) to segment objects at multiple scales by probing an incoming convolutional feature layer with filters at multiple sampling rates and effective fields of view.
Abstract: In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or 'atrous convolution', as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields of view, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but takes a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed "DeepLab" system sets the new state of the art on the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7% mIOU on the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.
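Atrous convolution is an ordinary convolution with a dilation factor, and ASPP is several such convolutions run in parallel at different rates. A hedged PyTorch sketch follows; the sampling rates match the ASPP-L variant reported in the paper, channel widths are illustrative, and summing branch outputs only approximates the paper's fusion of per-branch classifier score maps.

```python
# Hedged PyTorch sketch of atrous convolution and a minimal ASPP head.
import torch
import torch.nn as nn

class MiniASPP(nn.Module):
    def __init__(self, in_ch=256, out_ch=256, rates=(6, 12, 18, 24)):
        super().__init__()
        # Parallel 3x3 atrous convs: dilation enlarges the field of view
        # without adding parameters or computation per filter tap.
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates)

    def forward(self, x):
        # Each branch sees the same features at a different effective scale.
        return sum(b(x) for b in self.branches)

feats = torch.randn(1, 256, 32, 32)
out = MiniASPP()(feats)  # spatial size preserved: padding == dilation for 3x3
```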

10,120 citations

References
Journal ArticleDOI
TL;DR: Six algorithms covering a variety of segmentation and tracking paradigms have been compared and ranked based on their performance on both synthetic and real datasets in the Cell Tracking Challenge.
Abstract: Motivation: Automatic tracking of cells in multidimensional time-lapse fluorescence microscopy is an important task in many biomedical applications. A novel framework for objective evaluation of cell tracking algorithms has been established under the auspices of the IEEE International Symposium on Biomedical Imaging 2013 Cell Tracking Challenge. In this paper, we present the logistics, datasets, methods and results of the challenge and lay down the principles for future uses of this benchmark. Results: The main contributions of the challenge include the creation of a comprehensive video dataset repository and the definition of objective measures for comparison and ranking of the algorithms. With this benchmark, six algorithms covering a variety of segmentation and tracking paradigms have been compared and ranked based on their performance on both synthetic and real datasets. Given the diversity of the datasets, we do not declare a single winner of the challenge. Instead, we present and discuss the results for each individual dataset separately. Availability and implementation: The challenge website (http://www.codesolorzano.com/celltrackingchallenge) provides access to the training and competition datasets, along with the ground truth of the training videos. It also provides access to Windows and Linux executable files of the evaluation software and most of the algorithms that competed in the challenge.

412 citations

Journal ArticleDOI
TL;DR: This study applies the software package TrakEM2 to reconstruct neuronal microcircuitry from TEM sections of a small brain, the early larval brain of Drosophila melanogaster, and describes the network motifs most frequently encountered in the Drosophila neuropile.
Abstract: The analysis of microcircuitry (the connectivity at the level of individual neuronal processes and synapses), which is indispensable for our understanding of brain function, is based on serial transmission electron microscopy (TEM) or one of its modern variants. Due to technical limitations, most previous studies that used serial TEM recorded relatively small stacks of individual neurons. As a result, our knowledge of microcircuitry in any nervous system is very limited. We applied the software package TrakEM2 to reconstruct neuronal microcircuitry from TEM sections of a small brain, the early larval brain of Drosophila melanogaster. TrakEM2 enables us to embed the analysis of the TEM image volumes at the microcircuit level into a light microscopically derived neuro-anatomical framework, by registering confocal stacks containing sparsely labeled neural structures with the TEM image volume. We imaged two sets of serial TEM sections of the Drosophila first instar larval brain neuropile and one ventral nerve cord segment, and here report our first results pertaining to Drosophila brain microcircuitry. Terminal neurites fall into a small number of generic classes termed globular, varicose, axiform, and dendritiform. Globular and varicose neurites have large diameter segments that carry almost exclusively presynaptic sites. Dendritiform neurites are thin, highly branched processes that are almost exclusively postsynaptic. Due to the high branching density of dendritiform fibers and the fact that synapses are polyadic, neurites are highly interconnected even within small neuropile volumes. We describe the network motifs most frequently encountered in the Drosophila neuropile. Our study introduces an approach towards a comprehensive anatomical reconstruction of neuronal microcircuitry and delivers microcircuitry comparisons between vertebrate and insect neuropile.

395 citations

Proceedings ArticleDOI
01 Dec 2013
TL;DR: This work proposes a multi-resolution contextual framework, called cascaded hierarchical model (CHM), which learns contextual information in a hierarchical framework for image segmentation, and introduces a novel classification scheme, called logistic disjunctive normal networks (LDNN), which outperforms state-of-the-art classifiers and can be used in the CHM to improve object segmentation performance.
Abstract: Contextual information plays an important role in solving vision problems such as image segmentation. However, extracting contextual information and using it in an effective way remains a difficult problem. To address this challenge, we propose a multi-resolution contextual framework, called cascaded hierarchical model (CHM), which learns contextual information in a hierarchical framework for image segmentation. At each level of the hierarchy, a classifier is trained based on downsampled input images and outputs of previous levels. Our model then incorporates the resulting multi-resolution contextual information into a classifier to segment the input image at original resolution. We repeat this procedure by cascading the hierarchical framework to improve the segmentation accuracy. Multiple classifiers are learned in the CHM; therefore, a fast and accurate classifier is required to make the training tractable. The classifier also needs to be robust against overfitting due to the large number of parameters learned during training. We introduce a novel classification scheme, called logistic disjunctive normal networks (LDNN), which consists of one adaptive layer of feature detectors implemented by logistic sigmoid functions followed by two fixed layers of logical units that compute conjunctions and disjunctions, respectively. We demonstrate that LDNN outperforms state-of-the-art classifiers and can be used in the CHM to improve object segmentation performance.
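The LDNN structure, sigmoid discriminants followed by fixed conjunctions and a disjunction, can be written as a differentiable disjunctive normal form. A hedged PyTorch sketch with illustrative group counts and sizes: the soft AND is a product of sigmoids, and the soft OR is one minus the product of complements, per the description above.

```python
# Hedged sketch of the LDNN structure: an adaptive layer of logistic
# sigmoid discriminants, a fixed conjunction layer (products), and a
# fixed disjunction (noisy-OR style). Shapes are illustrative.
import torch
import torch.nn as nn

class TinyLDNN(nn.Module):
    def __init__(self, in_dim, n_groups=4, group_size=3):
        super().__init__()
        self.linear = nn.Linear(in_dim, n_groups * group_size)
        self.n_groups, self.group_size = n_groups, group_size

    def forward(self, x):
        h = torch.sigmoid(self.linear(x))          # adaptive feature detectors
        h = h.view(-1, self.n_groups, self.group_size)
        conj = h.prod(dim=2)                       # soft AND within each group
        return 1 - (1 - conj).prod(dim=1)          # soft OR across groups

p = TinyLDNN(in_dim=16)(torch.randn(8, 16))  # per-sample probabilities in (0,1)
```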

90 citations