
1562 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 37, NO. 7, JULY 2018
Interactive Medical Image Segmentation Using
Deep Learning With Image-Specific Fine Tuning
Guotai Wang, Wenqi Li, Maria A. Zuluaga, Rosalind Pratt, Premal A. Patel, Michael Aertsen,
Tom Doel, Anna L. David, Jan Deprest, Sébastien Ourselin, and Tom Vercauteren
Abstract
Convolutional neural networks (CNNs) have achieved state-of-the-art performance for automatic medical image segmentation. However, they have not demonstrated sufficiently accurate and robust results for clinical use. In addition, they are limited by the lack of image-specific adaptation and the lack of generalizability to previously unseen object classes (a.k.a. zero-shot learning). To address these problems, we propose a novel deep learning-based interactive segmentation framework by incorporating CNNs into a bounding box and scribble-based segmentation pipeline. We propose image-specific fine tuning to make a CNN model adaptive to a specific test image, which can be either unsupervised (without additional user interactions) or supervised (with additional scribbles). We also propose a weighted loss function considering network and interaction-based uncertainty for the fine tuning. We applied this framework to two applications: 2-D segmentation of multiple organs from fetal magnetic resonance (MR) slices, where only two types of these organs were annotated for training, and 3-D segmentation of brain tumor core (excluding edema) and whole brain tumor (including edema) from different MR sequences, where only the tumor core in one MR sequence was annotated for training. Experimental results show that: 1) our model is more robust to segment previously unseen objects than state-of-the-art CNNs; 2) image-specific fine tuning with the proposed weighted loss function significantly improves segmentation accuracy; and 3) our method leads to accurate results with fewer user interactions and less user time than traditional interactive segmentation methods.

Index Terms
Interactive image segmentation, convolutional neural network, fine-tuning, fetal MRI, brain tumor.

Manuscript received October 11, 2017; revised January 4, 2018; accepted January 5, 2018. Date of publication January 26, 2018; date of current version June 30, 2018. This work was supported in part by the Wellcome Trust under Grant WT101957, Grant WT97914, and Grant HICF-T4-275, in part by the EPSRC under Grant NS/A000027/1, Grant EP/H046410/1, Grant EP/J020990/1, Grant EP/K005278, and Grant NS/A000050/1, in part by the Wellcome/EPSRC under Grant 203145Z/16/Z, in part by the Royal Society under Grant RG160569, in part by the National Institute for Health Research University College London (UCL) Hospitals Biomedical Research Centre, in part by the Great Ormond Street Hospital Charity, in part by UCL ORS and GRS, in part by NVIDIA, and in part by Emerald, a GPU-accelerated High Performance Computer, made available by the Science and Engineering South Consortium operated in partnership with the STFC Rutherford-Appleton Laboratory. (Corresponding author: Guotai Wang.)

G. Wang, W. Li, R. Pratt, P. A. Patel, T. Doel, and S. Ourselin are with the Wellcome EPSRC Centre for Interventional and Surgical Sciences, Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, U.K. (e-mail: guotai.wang.14@ucl.ac.uk).

M. A. Zuluaga is with the Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, U.K., with the Facultad de Medicina, Universidad Nacional de Colombia, Bogotá 111321, Colombia, and also with Amadeus S.A.S., 06560 Sophia-Antipolis, France.

M. Aertsen is with the Department of Radiology, University Hospitals KU Leuven, 3000 Leuven, Belgium.

A. L. David and J. Deprest are with the Wellcome EPSRC Centre for Interventional and Surgical Sciences, Institute for Women’s Health, University College London, London WC1E 6BT, U.K., and also with the Department of Obstetrics and Gynaecology, KU Leuven, 3000 Leuven, Belgium.

T. Vercauteren is with the Wellcome EPSRC Centre for Interventional and Surgical Sciences, Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, U.K., and also with KU Leuven, 3000 Leuven, Belgium.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TMI.2018.2791721

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/
I. INTRODUCTION
Deep learning with convolutional neural networks (CNNs) has achieved state-of-the-art performance for
automated medical image segmentation [1]. However, auto-
matic segmentation methods have not demonstrated suffi-
ciently accurate and robust results for clinical use due to
the inherent challenges of medical images, such as poor
image quality, different imaging and segmentation protocols,
and variations among patients [2]. Alternatively, interactive
segmentation methods are widely adopted, as they integrate
the user’s knowledge and take into account the application
requirements for more robust segmentation performance [2].
As such, interactive segmentation remains the state of the art
for existing commercial surgical planning and navigation prod-
ucts. Though leveraging user interactions often leads to more
robust segmentations, an interactive method should require as little user time as possible to reduce the burden on users.
Motivated by these observations, we investigate combining
CNNs with user interactions for medical image segmentation
to achieve higher segmentation accuracy and robustness with
fewer user interactions and less user time. However, there are
very few studies on using CNNs for interactive segmenta-
tion [3]–[5]. This is mainly due to the requirement of large
amounts of annotated images for training, the lack of image-
specific adaptation and the demanding balance among model
complexity, inference time and memory space efficiency.
The first challenge of using CNNs for interactive segmenta-
tion is that current CNNs do not generalize well to previously
unseen object classes that are not present in the training set.
As a result, they require labeled instances of each object
class to be present in the training set. For medical images,
annotations are often expensive to acquire as both expertise
and time are needed to produce accurate annotations. This
limits the performance of CNNs to segment objects for which
annotations are not available in the training stage.
Second, interactive segmentation often requires image-
specific learning to deal with large context variations among
different images, but current CNNs are not adaptive to dif-
ferent test images, as parameters of the model are learned
from training images and then fixed in the testing stage,
without image-specific adaptation. It has been shown that
image-specific adaptation of a pre-trained Gaussian Mixture
Model (GMM) helps to improve segmentation accuracy [6].
However, transitioning from simple GMMs to powerful but
complex CNNs in this context has not yet been demonstrated.
Third, fast inference and memory efficiency are required for interactive segmentation. They can be relatively easily
achieved for 2D images, but become much more problematic
for 3D images. For example, DeepMedic [7] works on 3D
local patches to reduce memory requirements but results in a
slow inference. HighRes3DNet [8] works on 3D whole images
with relatively fast inference but needs a large amount of GPU
memory, leading to high hardware requirements. To make a
CNN-based interactive segmentation method efficient to use,
enabling CNNs to respond quickly to user interactions and
to work on a machine with limited GPU resources (e.g.,
a standard desktop PC or a laptop) is desirable. DeepIGeoS [5]
combines CNNs with user interactions and has demonstrated
good interactivity. However, it lacks adaptability to unseen image contexts.
This paper presents a new framework to address these
challenges for deep learning-based interactive segmentation.
To generalize to previously unseen objects, we propose a
bounding-box-based segmentation pipeline that extracts the
foreground from a given region of interest, and design a 2D
and a 3D CNN with good compactness to avoid over-fitting.
To make CNNs adaptive to different test images, we propose
image-specific fine-tuning. In addition, our networks consider
a balance among receptive field, inference time and memory
efficiency so as to be responsive to user interactions and have
low requirements in terms of GPU resources.
A. Contributions
The contributions of this work are four-fold. First, we pro-
pose a novel deep learning-based framework for interactive 2D
and 3D medical image segmentation by incorporating CNNs
into a bounding box and scribble-based binary segmentation
pipeline. Second, we propose image-specific fine-tuning to
adapt a CNN model to each test image independently. The
fine-tuning can be either unsupervised (without additional user
interactions) or supervised by user-provided scribbles. Third,
we propose a weighted loss function considering network
and interaction-based uncertainty during the image-specific
fine-tuning. Fourth, we present the first attempt to employ
CNNs to deal with previously unseen objects (a.k.a. zero-
shot learning) in the context of image segmentation. The
proposed framework does not require all the object classes
to be annotated for training. Thus, it can be applied to new
organs or new segmentation protocols directly.
B. Related Works
1) CNNs for Image Segmentation:
For natural image
segmentation, FCN [9] and DeepLab [10] are among the
state-of-the-art performing methods. For 2D biomedical
image segmentation, efficient networks such as U-Net [11],
DCAN [12] and Nabla-net [13] have been proposed. For
3D volumes, patch-based CNNs have been proposed for
segmentation of the brain tumor [7] and pancreas [14], and
more powerful end-to-end 3D CNNs include V-Net [15],
HighRes3DNet [8], and 3D deeply supervised network [16].
2) Interactive Segmentation Methods:
A wide range of
interactive segmentation methods have been proposed [2].
Representative methods include Graph Cuts [17], Random
Walks [18] and GeoS [19]. Machine learning has been popu-
larly used to achieve high accuracy and interaction efficiency.
For example, GMMs are used by GrabCut [20] to segment
color images. Online Random Forests (ORFs) are employed
by Slic-Seg [21] for placenta segmentation from fetal Magnetic
Resonance images (MRI). In [22], active learning is used
to segment 3D Computed Tomography (CT) images. They
have achieved more accurate segmentations with fewer user
interactions than traditional interactive segmentation methods.
To combine user interactions with CNNs, DeepCut [3] and
ScribbleSup [23] propose to leverage user-provided bounding
boxes or scribbles, but they employ user interactions as sparse
annotations for the training set rather than as guidance for
dealing with test images. 3D U-Net [24] learns from anno-
tations of some slices in a volume and produces a dense
3D segmentation, but is not responsive to user interactions.
In [4], an FCN is combined with user interactions for 2D RGB
image segmentation, without adaptation for medical images.
DeepIGeoS [5] uses geodesic distance transforms of scribbles
as additional channels of CNNs for interactive segmentation,
but cannot deal with previously unseen object classes.
3) Model Adaptation:
Previous learning-based interactive
segmentation methods often employ image-specific models.
For example, GrabCut [20] and Slic-Seg [21] learn from the
target image with GMMs and ORFs, respectively, so that they
can be well adapted to the specific target image. Learning a
model from a training set with image-specific adaptation in the
testing stage has also been used to improve the segmentation
performance. For example, an adaptive GMM has been used to
address the distribution mismatch between the training and test
images [6]. For CNNs, fine-tuning [25] is used for domain-
wise model adaptation to address the distribution mismatch
between different training sets. However, to the best of our
knowledge, this paper is the first work to propose image-
specific model adaptation for CNNs.
II. METHOD

The proposed interactive framework with Bounding box and Image-specific Fine-tuning-based Segmentation (BIFSeg) is depicted in Fig. 1. To deal with different (including previously unseen) objects in a unified framework, we propose to use a CNN that takes as input the content of a bounding box of one instance and gives a binary segmentation for that instance. In the testing stage, the user provides a bounding box, and BIFSeg extracts the region inside the bounding box and feeds it into the pre-trained CNN with a forward pass to obtain an initial segmentation. This is based on the fact that our CNNs are designed and trained to learn some common features, such as saliency, contrast and hyper-intensity, across different objects, which helps to generalize to unseen objects. Then we use unsupervised (without additional user interactions) or supervised (with user-provided scribbles) image-specific fine-tuning to further refine the segmentation. This is because there is likely a mismatch between the common features learned from the training set and those in (previously unseen) test objects. Therefore, we use fine-tuning to leverage image-specific features and make our CNNs adaptive to a specific test image for better segmentation. Our framework is general, flexible and can handle both 2D and 3D segmentations with few assumptions of network structures. In this paper, we choose to use the state-of-the-art network structures proposed in [5] for their compactness and efficiency. The contribution of BIFSeg is nonetheless largely different from [5] as BIFSeg focuses on segmentation of previously unseen object classes and fine-tunes the CNN model on the fly for image-wise adaptation that can be guided by user interactions.

Fig. 1. The proposed Bounding box and Image-specific Fine-tuning-based Segmentation (BIFSeg). 2D images are shown as examples. During training, each instance is cropped with its bounding box, and the CNN is trained for binary segmentation. In the testing stage, image-specific fine-tuning with optional scribbles and a weighted loss function is used. Note that the object class (e.g. maternal kidneys) for testing may have not been present in the training set.

Fig. 2. Our resolution-preserving networks with dilated convolution for 2D segmentation (a) and 3D segmentation (b). The numbers in dark blue boxes denote convolution kernel sizes and numbers of output channels, and the numbers on the top of these boxes denote dilation parameters.
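As a concrete illustration of this test-time workflow, the sketch below (Python, using PyTorch) crops the user-provided bounding box, normalises the region, and obtains the initial segmentation with a single forward pass of the pre-trained CNN. The helper name, the normalisation, and the two-channel softmax output are illustrative assumptions rather than the authors' released implementation; the subsequent image-specific fine-tuning is described in Sections II-A to II-C.

```python
import torch

def bifseg_initial_segmentation(image, bbox, model):
    """Sketch of the BIFSeg test-time entry point (hypothetical helper).

    image: full 2D image tensor; bbox: tuple of slices from the user-drawn
    box; model: pre-trained binary segmentation CNN (e.g. a P-Net-like net).
    """
    # Extract and normalise the region of interest inside the bounding box.
    roi = image[bbox].float()
    roi = (roi - roi.mean()) / (roi.std() + 1e-6)

    # One forward pass of the pre-trained CNN gives foreground probabilities.
    with torch.no_grad():
        logits = model(roi[None, None])             # add batch and channel dims
        prob_fg = torch.softmax(logits, dim=1)[:, 1]

    # Initial binary segmentation, to be refined by image-specific fine-tuning.
    return (prob_fg > 0.5).squeeze(0)
```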
A. CNN Models
For 2D images, we adopt the P-Net [5] for bounding
box-based binary segmentation. The network is resolution-
preserving using dilated convolution [10]. As shown in
Fig. 2(a), it consists of six blocks with a receptive field
of 181×181. The first five blocks have dilation parameters
of 1, 2, 4, 8 and 16, respectively, so they capture features at
different scales. Features from these five blocks are concate-
nated and fed into block6 that serves as a classifier. A softmax
layer is used to obtain probability-like outputs. In the testing
stage, we update the model based on image-specific fine-
tuning. To ensure efficient fine-tuning and fast response to
user interactions, we only fine-tune parameters of the classifier
(block6). Thus, features in the concatenation layer for the test
image can be stored before the fine-tuning.
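To make this classifier-only fine-tuning concrete, here is a minimal PyTorch sketch of a P-Net-like 2D network; the layer and channel counts are assumptions for illustration and do not reproduce the exact configuration of [5].

```python
import torch
import torch.nn as nn

class PNetLike2D(nn.Module):
    """Illustrative P-Net-style network: five resolution-preserving blocks with
    dilations 1, 2, 4, 8 and 16, whose outputs are concatenated and classified
    by a final 1x1 block (the counterpart of block6)."""

    def __init__(self, in_channels=1, features=64, num_classes=2):
        super().__init__()
        self.blocks = nn.ModuleList()
        channels = in_channels
        for dilation in (1, 2, 4, 8, 16):      # block1-block5: multi-scale context
            self.blocks.append(nn.Sequential(
                nn.Conv2d(channels, features, 3, padding=dilation, dilation=dilation),
                nn.ReLU(inplace=True),
                nn.Conv2d(features, features, 3, padding=dilation, dilation=dilation),
                nn.ReLU(inplace=True)))
            channels = features
        # block6: the classifier, the only part updated during image-specific
        # fine-tuning, so the concatenated features can be cached per image.
        self.classifier = nn.Sequential(
            nn.Conv2d(5 * features, features, 1), nn.ReLU(inplace=True),
            nn.Conv2d(features, num_classes, 1))

    def forward(self, x):
        feats = []
        for block in self.blocks:
            x = block(x)
            feats.append(x)
        return self.classifier(torch.cat(feats, dim=1))  # softmax applied in the loss
```

Because block1 to block5 stay frozen at test time, the concatenated feature map can be computed once per bounding box and cached, and only the classifier needs forward and backward passes during fine-tuning, which keeps the response to user interactions fast.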
For 3D images, we use a network extended from P-Net,
as shown in Fig. 2(b). It considers a trade-off among receptive
field, inference time and memory efficiency. The network has
an anisotropic receptive field of 85×85×9. Compared with
slice-based networks, it employs 3D contexts. Compared with
large isotropic 3D receptive fields [8], it has less memory con-
sumption [26]. Besides, anisotropic acquisition is often used in
Magnetic Resonance (MR) imaging. We use 3×3×3 kernels
in the first two blocks and 3×3×1 kernels in block3 to block5.
Similar to P-Net, we fine-tune the classifier (block6) with pre-
computed concatenated features. To save space for storing
the concatenated features, we use 1×1×1 convolutions to
compress the features in block1 to block5 and then concatenate
them. We refer to this 3D network with feature compression
as PC-Net.
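The feature-compression step of PC-Net can be sketched as follows; the channel numbers (64 compressed to 16 per block) are assumptions for illustration, not the published configuration.

```python
import torch
import torch.nn as nn

# Each block output is reduced with a 1x1x1 convolution before concatenation,
# so the features cached for classifier-only fine-tuning occupy less memory.
block_channels, compressed_channels = 64, 16
compress = nn.ModuleList(
    nn.Conv3d(block_channels, compressed_channels, kernel_size=1) for _ in range(5))

# Dummy outputs of block1-block5 for one anisotropic 3D region of interest.
block_features = [torch.randn(1, block_channels, 9, 64, 64) for _ in range(5)]
cached = torch.cat([c(f) for c, f in zip(compress, block_features)], dim=1)  # 80 channels
```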
B. Training of CNNs
The training stage for 2D/3D segmentation is shown in the
first row of Fig. 1. Consider a $K$-ary segmentation training set $T = \{(X_1, Y_1), (X_2, Y_2), \ldots\}$, where $X_p$ is one training image and $Y_p$ is the corresponding label map. The label set of $T$ is $\{0, 1, 2, \ldots, K-1\}$ with 0 being the background label. Let $N_k$ denote the number of instances of the $k$th object type, so the total number of instances is $\hat{N} = \sum_k N_k$. Each image $X_p$ can have instances of multiple object classes. Suppose the label of the $q$th instance in $X_p$ is $l_{pq}$; $Y_p$ is converted into a binary image $Y_{pq}$ based on whether the value of each pixel in $Y_p$ equals $l_{pq}$. The bounding box $B_{pq}$ of that training instance is automatically calculated based on $Y_{pq}$ and expanded by a random margin in the range of 0 to 10 pixels/voxels. $X_p$ and $Y_{pq}$ are cropped based on $B_{pq}$. Thus, $T$ is converted into a cropped set $\hat{T} = \{(\hat{X}_1, \hat{Y}_1), (\hat{X}_2, \hat{Y}_2), \ldots\}$ with size $\hat{N}$ and label set $\{0, 1\}$, where 1 is the label of the instance foreground and 0 the background. With $\hat{T}$, the CNN model (e.g., P-Net or PC-Net) is trained to extract the target from its bounding box, which is a binary segmentation problem irrespective of the object type. A cross entropy loss function is used for training.
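The instance-cropping step described above can be sketched in a few lines of NumPy; the function name and details such as clipping at the image border are illustrative rather than the authors' code, and the same indexing works for 2D and 3D arrays.

```python
import numpy as np

def crop_instance(image, label_map, instance_label, max_margin=10, rng=np.random):
    """Build one binary training example: compute the tight bounding box B_pq of
    the instance from the label map Y_p, expand it by a random margin of 0-10
    pixels/voxels, and crop both the image X_p and the binary mask Y_pq."""
    mask = (label_map == instance_label).astype(np.uint8)       # binary Y_pq
    coords = np.argwhere(mask)
    lo, hi = coords.min(axis=0), coords.max(axis=0) + 1         # tight box B_pq
    box = []
    for axis in range(mask.ndim):
        start = max(lo[axis] - rng.randint(0, max_margin + 1), 0)
        stop = min(hi[axis] + rng.randint(0, max_margin + 1), mask.shape[axis])
        box.append(slice(start, stop))
    box = tuple(box)
    return image[box], mask[box]                                # one cropped pair of T_hat
```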
C. Unsupervised and Supervised Image-Specific Fine-Tuning

In the testing stage, let $\hat{X}$ denote the sub-image inside a user-provided bounding box and $\hat{Y}$ be the target label of $\hat{X}$. The set of parameters of the trained CNN is $\theta$. With the initial segmentation $\hat{Y}^0$ obtained by the trained CNN, the user may provide (i.e., supervised) or not provide (i.e., unsupervised) a set of scribbles to guide the update of $\hat{Y}^0$. Let $S_f$ and $S_b$ denote the scribbles for foreground and background, respectively, so the entire set of scribbles is $S = S_f \cup S_b$. Let $s_i$ denote the user-provided label of a pixel in the scribbles, then we have $s_i = 1$ if $i \in S_f$ and $s_i = 0$ if $i \in S_b$. We minimize an objective function that is similar to GrabCut [20] but we use P-Net or PC-Net instead of a GMM:

$$\arg\min_{\hat{Y}} \; E(\hat{Y}) = \sum_{i} \phi(\hat{y}_i \mid \hat{X}) + \lambda \sum_{i,j} \psi(\hat{y}_i, \hat{y}_j \mid \hat{X}), \quad \text{subject to } \hat{y}_i = s_i \text{ if } i \in S \tag{1}$$

where $E(\hat{Y})$ is constrained by user interactions if $S$ is not empty. $\phi$ and $\psi$ are the unary and pairwise energy terms, respectively. $\lambda$ is the weight of $\psi$. An unconstrained optimization of an energy similar to $E$ was used in [3] for weakly supervised learning. In that work, the energy was based on the probability and label map of all the images in a training set, which was a different task from ours, as we focus on a single test image. We follow a typical choice of $\psi$ [17]:

$$\psi(\hat{y}_i, \hat{y}_j \mid \hat{X}) = [\hat{y}_i \neq \hat{y}_j] \exp\!\left(-\frac{\big(\hat{X}(i) - \hat{X}(j)\big)^2}{2\sigma^2}\right) \cdot \frac{1}{d_{ij}} \tag{2}$$

where $[\cdot]$ is 1 if $\hat{y}_i \neq \hat{y}_j$ and 0 otherwise. $d_{ij}$ is the Euclidean distance between pixel $i$ and pixel $j$. $\sigma$ controls the effect of intensity difference. $\phi$ is defined as:

$$\phi(\hat{y}_i \mid \hat{X}) = -\log P(\hat{y}_i \mid \hat{X}) = -\big(\hat{y}_i \log p_i + (1 - \hat{y}_i)\log(1 - p_i)\big) \tag{3}$$

where $P(\hat{y}_i \mid \hat{X})$ is the probability given by the softmax output of the CNN, and $p_i = P(\hat{y}_i = 1 \mid \hat{X})$ is the probability of pixel $i$ belonging to the foreground.
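To illustrate how these energy terms map to code, the NumPy sketch below evaluates the unary cost of Eq. (3) from the CNN's foreground probabilities and the pairwise weight of Eq. (2) for a pair of neighbouring pixels; it is a simplified illustration under the notation above, not the authors' implementation.

```python
import numpy as np

def unary_costs(prob_fg, eps=1e-6):
    """Eq. (3): phi(y_i | X) = -log P(y_i | X). Returns an array whose last axis
    holds the cost of assigning background (index 0) and foreground (index 1)."""
    return -np.log(np.stack([1.0 - prob_fg, prob_fg], axis=-1) + eps)

def pairwise_weight(intensity_i, intensity_j, distance_ij, sigma):
    """Eq. (2): weight paid only when the labels of pixels i and j differ."""
    return np.exp(-(intensity_i - intensity_j) ** 2 / (2.0 * sigma ** 2)) / distance_ij
```

In the label update step described below, these terms are assembled into a graph over the cropped image, scribbled pixels receive an infinite unary cost for the disallowed label (Eq. (6)), and the resulting submodular energy can be minimised with an off-the-shelf max-flow/min-cut (Graph Cuts) solver.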
The optimization of Eq. (1) can be decomposed into steps that alternately update the segmentation label $\hat{Y}$ and the network parameters $\theta$ [3], [20]. In the label update step, we fix $\theta$ and solve for $\hat{Y}$, and Eq. (1) becomes a Conditional Random Field (CRF) problem:

$$\arg\min_{\hat{Y}} \; E(\hat{Y}) = \sum_{i} \phi(\hat{y}_i \mid \hat{X}) + \lambda \sum_{i,j} \psi(\hat{y}_i, \hat{y}_j \mid \hat{X}), \quad \text{subject to } \hat{y}_i = s_i \text{ if } i \in S \tag{4}$$

For ease of implementation, the constrained optimization in Eq. (4) is converted to an unconstrained equivalent:

$$\arg\min_{\hat{Y}} \; \sum_{i} \phi'(\hat{y}_i \mid \hat{X}) + \lambda \sum_{i,j} \psi(\hat{y}_i, \hat{y}_j \mid \hat{X}) \tag{5}$$

$$\phi'(\hat{y}_i \mid \hat{X}) = \begin{cases} +\infty & \text{if } i \in S \text{ and } \hat{y}_i \neq s_i \\ 0 & \text{if } i \in S \text{ and } \hat{y}_i = s_i \\ -\log P(\hat{y}_i \mid \hat{X}) & \text{otherwise} \end{cases} \tag{6}$$

Since $\theta$, and therefore $\phi'$, are fixed, and $\psi$ is submodular, Eq. (5) can be solved by Graph Cuts [17]. In the network update step, we fix $\hat{Y}$ and solve for $\theta$:

$$\arg\min_{\theta} \; E(\hat{Y}) = \sum_{i} \phi(\hat{y}_i \mid \hat{X}), \quad \text{subject to } \hat{y}_i = s_i \text{ if } i \in S \tag{7}$$

Thanks to the constrained optimization in Eq. (4), the label update step necessarily leads to $\hat{y}_i = s_i$ for $i \in S$. Eq. (7) can

References

O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Proc. MICCAI, 2015.
J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proc. IEEE CVPR, 2015.
L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, "DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs," IEEE Trans. Pattern Anal. Mach. Intell., 2018.
Y. Jia et al., "Caffe: Convolutional architecture for fast feature embedding," in Proc. ACM Int. Conf. Multimedia, 2014.
G. Litjens et al., "A survey on deep learning in medical image analysis," Medical Image Analysis, 2017.