
Towards Whole Placenta Segmentation At Late Gestation Using Multi-View Ultrasound Images

Veronika A. Zimmer^a, Alberto Gomez^a, Emily Skelton^a, Nicolas Toussaint^a, Tong Zhang^a, Bishesh Khanal^{a,b}, Robert Wright^a, Yohan Noh^{a,c}, Alison Ho^d, Jacqueline Matthew^a, Joseph V. Hajnal^a, and Julia A. Schnabel^a

a School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
b Nepal Applied Mathematics and Informatics Institute for Research (NAAMII)
c Department of Mechanical and Aerospace Engineering, Brunel University London, Uxbridge, UK
d Women's Health Academic Centre, King's College London, London, UK
Abstract. We propose a method to extract the human placenta at late gestation using multi-view 3D US images. This is the first step towards automatic quantification of placental volume and morphology from US images along the whole pregnancy, beyond the early stages (where the entire placenta can be captured with a single 3D US image). Our method uses 3D US images from different views acquired with a multi-probe system. A whole placenta segmentation is obtained from these images by using a novel technique based on 3D convolutional neural networks. We demonstrate the performance of our method on 3D US images of the placenta in the last trimester. We achieve a high Dice overlap of up to 0.8 with respect to manual annotations, and the derived placental volumes are comparable to corresponding volumes extracted from MR.
1 Introduction
Fetal ultrasound (US) is the primary imaging modality to monitor fetal development. While the fetal body, and especially the fetal brain, are subjects of intensive research, only a few methods exist to study the placenta in utero [11]. Placental development and function influence fetal health, yet only the placental site and cord insertion are routinely assessed using US. Limiting factors are the large size of the placenta at late gestation, its high variation in shape and position, and the limited field-of-view (FoV) and lack of contrast in US. Placental magnetic resonance imaging (MRI) overcomes some of those challenges, as it provides a large FoV and excellent soft-tissue contrast. Recently, the first method to assess the placenta in utero in a standardized way was presented using fetal MRI [5]. However, fetal MRI is corrupted by motion artifacts due to fetal motion and maternal breathing, and fetal MRI reconstruction is an active field of research [10]. US remains the standard screening tool because, in contrast to MRI, it is performed in real-time, is relatively inexpensive, and is widely available. MRI is generally only used upon referral from the US clinic to gain insight into specific conditions. US-based placenta segmentation could therefore enable automatic quantification of placental volume, morphology and function in routine clinical scans. In [9], a semi-automatic method based on the random walker algorithm was proposed to segment the placenta. State-of-the-art segmentation methods using convolutional neural networks (CNNs) have been used in [7, 4], and in [13] additionally for the fetus and the gestational sac. These methods focused on early pregnancies between 10 and 14 weeks of gestational age (GA), when the placenta is small enough to fit in the limited FoV of US. Routine anomaly screening is performed at 20 weeks GA, but placental volume at later gestations may be of benefit in predicting and monitoring fetal development. Therefore, a larger FoV is required to capture the whole placenta by US.

Fig. 1. Ultrasound placenta imaging using a multi-probe system. Left: slice of a 3D US image covering only part of the placenta; Middle: slice of a 3D multi-view image of the whole placenta; Right: physical multi-probe holder for two and three US probes.
Multi-view imaging can be used to extend the FoV of a single image. For example, the entire placenta can be captured by acquiring, aligning and fusing multiple 3D US images (Fig. 1). In previous works [12, 6, 1], registration algorithms and/or tracker information were employed to align the images and provide multi-view US. The resulting image has an extended FoV, and view-dependent artifacts such as shadows can be minimized through the additional signal information from multiple views [14]. Aligning US images of the placenta remains challenging, however, due to the lack of salient features to drive the registration process. External tracking, on the other hand, can provide position information of the US probe but is oblivious to maternal and fetal motion.
In this work, we introduce, for the first time, a pipeline to extract the whole placenta at late gestation. The approach consists of three stages: first, multi-view image acquisition; second, multi-view image fusion; and third, multi-view placenta segmentation. The multi-view US images are acquired using a time-interleaved multi-probe US system, without the need for image registration. We present a voxel-wise image fusion method to combine the images and reduce view-dependent artifacts, and compare four CNN-based approaches to extract the whole placenta from the multi-view images.

We test our pipeline on a dataset of 3D US images to estimate placental volume in the last trimester of pregnancy. We successfully fuse multi-view images to obtain an extended FoV and are able to extract the placenta. The derived placental volumes are comparable to corresponding volumes extracted from MR. To the best of our knowledge, this is the first time the placenta has been segmented at late gestation from US imaging.

2 Methods and Materials
The three stages of the proposed pipeline for whole placenta extraction (multi-view acquisition, fusion and segmentation) are described below.
2.1 Multi-probe ultrasound imaging
We acquire multiple US images using an in-house US signal multiplexer, which allows multiple Philips X6-1 probes to be connected to a Philips EPIQ V7 US system. The multiplexer switches rapidly between up to three probes, so that images from each probe are acquired in a time-interleaved fashion. The manual operation of the transducers is very slow compared to the acquisition frame rate. Therefore, for the purpose of data processing, consecutive images are assumed to have been acquired simultaneously over a small time window. We designed a physical device that fixes the probes at an angle of 30° to each other, which ensures a large overlap between the images and allows easy and comfortable operation (see Fig. 1).
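As a concrete illustration, the regrouping of the time-interleaved stream into quasi-simultaneous multi-view sets can be sketched as follows. This is a minimal sketch under the assumption of a fixed round-robin probe order; the function name and data layout are ours, not part of the described system:

```python
from typing import List, Sequence, Tuple

def group_interleaved_frames(frames: Sequence, n_probes: int) -> List[Tuple]:
    """Group a round-robin, time-interleaved frame stream into
    quasi-simultaneous multi-view sets (one frame per probe).

    Assumes probe motion is negligible within one round of the
    multiplexer, as in the acquisition described above.
    """
    n_rounds = len(frames) // n_probes  # drop a trailing incomplete round
    return [tuple(frames[r * n_probes + p] for p in range(n_probes))
            for r in range(n_rounds)]

# Example: 7 frames from a three-probe acquisition -> 2 complete sets.
sets = group_interleaved_frames(list(range(7)), n_probes=3)
# sets == [(0, 1, 2), (3, 4, 5)]
```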
2.2 Multi-view image fusion
In our application, the goal is to combine multiple images of the placenta to extend the FoV while minimizing view-dependent artifacts. The multi-view fusion method proposed in [14] produces high-quality fusion but is computationally expensive. We propose a simplification of that method by replacing the B-spline-based fusion with a voxel-wise fusion, as follows.
Let $I_1, \dots, I_V$, with $I_v : \Omega_v \subset \mathbb{R}^d \to \mathbb{R}$, $v = 1, \dots, V$, be images of the same object taken from $V$ views. Their spatial correspondences are known through spatial transformations $\phi_v : \Omega_v \subset \mathbb{R}^d \to \mathbb{R}^d$. The fused image $I_F : \Omega_F \to \mathbb{R}$, with $\Omega_1, \dots, \Omega_V \subseteq \Omega_F$, at point $x \in \Omega_F$ is obtained by

$$I_F(x) = \frac{\sum_{v=1}^{V} w_v(x) \cdot \mathbb{1}_{x \in \Omega_v} \cdot (I_v \circ \phi_v)(x)}{\sum_{v=1}^{V} w_v(x)}$$

with weight function $w : \mathbb{R}^d \to \mathbb{R}$. In other words, the intensity of a point $x$ of the fused image is calculated as the weighted mean of the corresponding points in the single images. The weight of a (transformed) data point $x$ is formulated as a function of the depth in the US image with respect to the probe position $b \in \mathbb{R}^d$ and the beam angle $\alpha \in [-\frac{\pi}{2}, \frac{\pi}{2}]$, in the same way as in [14]. In effect, image points with a strong signal (to correct for shadow artifacts) and at a position close to the center of the US frustum (where the quality of the image is typically the best) will receive higher weights.
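For illustration, a minimal NumPy sketch of this voxel-wise fusion. It assumes the images have already been resampled onto the fused grid (i.e. $I_v \circ \phi_v$ is precomputed), with NaN marking voxels outside $\Omega_v$, and that the depth- and angle-dependent weights of [14] are supplied as precomputed volumes; the weight model itself is therefore abstracted away here:

```python
import numpy as np

def fuse_views(warped: list[np.ndarray], weights: list[np.ndarray]) -> np.ndarray:
    """Voxel-wise weighted-mean fusion of V co-registered US volumes.

    warped[v]  : image I_v composed with phi_v, sampled on the fused grid,
                 with np.nan outside the view's domain Omega_v.
    weights[v] : non-negative weight w_v(x) on the same grid (e.g. a
                 function of beam depth and angle, as in the text).
    """
    num = np.zeros_like(warped[0], dtype=np.float64)
    den = np.zeros_like(warped[0], dtype=np.float64)
    for img, w in zip(warped, weights):
        inside = ~np.isnan(img)                 # indicator 1_{x in Omega_v}
        num[inside] += (w * np.nan_to_num(img))[inside]
        den[inside] += w[inside]
    out = np.full_like(num, np.nan)             # NaN where no view covers x
    np.divide(num, den, out=out, where=den > 0) # weighted mean where covered
    return out
```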
2.3 Whole placental segmentation
Semantic segmentation using neural networks. In recent years, convolutional neural networks (CNNs) have shown excellent segmentation results, outperforming conventional methods both in segmentation quality and in speed [8, 3]. In a supervised CNN approach, the segmentation of an object is learned purely from the data of a training set $T = \{(I_n, S_n), n = 1, \dots, N\}$, with images $I_n : \mathbb{R}^d \to \mathbb{R}$ and ground truth segmentations $S_n : \mathbb{R}^d \to \{0, 1\}$ (in our case US images with $d = 3$ and manual annotations of the placenta). The model $f$ estimates for an input image $I$ the segmentation map $S$: $S = f(I, \Theta)$. During the training of the model $f(I, \Theta)$ on the training set $T$, the network parameters $\Theta \in \mathbb{R}^P$ are optimized to minimize a loss function $L$, which measures the agreement between the ground truth $S_n$ and the segmentation estimated by the model. The parameters $\Theta$ include the connection weights, biases and convolutional kernel weights.
Our segmentation network is based on the widely used U-net architecture [8], which is a fully convolutional neural network with an encoder-decoder architecture, a bottleneck layer in between, and skip connections from encoder layers to the decoder. We use ReLUs, strided max pooling for down-sampling, and zero padding. We set [32, 64, 128, 256] feature maps per layer, where all convolutional kernels and feature maps are in 3D.
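As an illustration, a minimal PyTorch sketch of such a 3D U-net with [32, 64, 128, 256] encoder feature maps. The number of convolutions per level, the bottleneck width and the transposed-convolution upsampling are our assumptions, since the text does not fully specify the decoder:

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3x3 convolutions with zero padding, each followed by a ReLU."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    )

class UNet3D(nn.Module):
    """3D U-net: encoder, bottleneck, decoder with skip connections."""
    def __init__(self, in_ch: int = 1, out_ch: int = 1,
                 features=(32, 64, 128, 256)):
        super().__init__()
        self.encoders = nn.ModuleList()
        ch = in_ch
        for f in features:                      # encoder path
            self.encoders.append(conv_block(ch, f))
            ch = f
        self.pool = nn.MaxPool3d(kernel_size=2, stride=2)
        self.bottleneck = conv_block(features[-1], 2 * features[-1])
        self.upconvs = nn.ModuleList()
        self.decoders = nn.ModuleList()
        ch = 2 * features[-1]
        for f in reversed(features):            # decoder path
            self.upconvs.append(nn.ConvTranspose3d(ch, f, kernel_size=2, stride=2))
            self.decoders.append(conv_block(2 * f, f))  # input: upsampled + skip
            ch = f
        self.head = nn.Conv3d(features[0], out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        skips = []
        for enc in self.encoders:
            x = enc(x)
            skips.append(x)                     # keep for skip connections
            x = self.pool(x)
        x = self.bottleneck(x)
        for up, dec, skip in zip(self.upconvs, self.decoders, reversed(skips)):
            x = up(x)
            x = dec(torch.cat([x, skip], dim=1))
        return torch.sigmoid(self.head(x))      # voxel-wise placenta probability

# A 128^3 input volume, as used for the single-view images in the paper:
# probs = UNet3D()(torch.zeros(1, 1, 128, 128, 128))  # -> (1, 1, 128, 128, 128)
```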
For training, we resample all images to size 128 × 128 × 128. During training, we minimize the Dice loss using the Adam optimizer with a learning rate of 0.001. We augment our dataset by image flipping along the x- and z-axes (avoiding flipping the image upside down, to keep the correct positioning of the US frustum), intensity rescaling by ±10%, and small random translations (up to five pixels in all directions). Rotations are avoided because they would produce unrealistic, view-direction-dependent image features.
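The Dice loss follows directly from its definition; a minimal sketch, in which the smoothing constant is our addition to avoid division by zero (the paper does not specify one):

```python
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor,
              eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss: 1 - 2|P.S| / (|P| + |S|), averaged over the batch.

    pred   : predicted probabilities in [0, 1], shape (B, 1, D, H, W)
    target : binary ground-truth masks, same shape
    eps    : smoothing term (our choice, not from the paper)
    """
    dims = tuple(range(1, pred.ndim))            # all axes but the batch axis
    inter = (pred * target).sum(dim=dims)
    denom = pred.sum(dim=dims) + target.sum(dim=dims)
    dice = (2 * inter + eps) / (denom + eps)
    return 1 - dice.mean()

# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # as in the text
```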
Multi-view image segmentation At later gestation, the field-of-view of US
is too small to capture the whole placenta. Multiple probes, as described above,
or placenta sweeps using an appropriate registration or tracking method to align
the images, can be used to visualize the whole placenta in one US volume. Those
images differ not only in the view of the placenta, but also in view-dependent
artifacts, such as shadows or attenuation. To provide a consistent segmentation of
the whole placenta across multiple images, we propose four CNN-based variants,
which make use of the multi-view information in different ways (see Fig. 2).
(S1) The model $f_1$ is trained on $N$ single US images $I^s_n$ with manual annotations $S^s_n$, $n = 1, \dots, N$, of the placenta, without using any information about correspondences between different views of the same placenta. The resulting segmentations for the individual images are aligned and fused using maximum intensity voting to obtain the segmentation of the whole placenta (see the sketch below).
(S2) The model $f_2$ is trained on $M < N$ fused multi-view images $I^{mv}_m$ with manual annotations $S^{mv}_m$, $m = 1, \dots, M$. The training set is smaller than in approach (S1), since one image $I^{mv}_m$ is the fusion of two or three images $I^s_n$. The fused images are resampled to size 192 × 128 × 128; the larger size in the first dimension accounts for the larger FoV of the fused images.
(S3) The model $f_1$, trained on individual images, is re-trained as model $f_3$ using fused multi-view images (resampled to the size of the individual images) which have been separately annotated. Using the pre-trained weights of $f_1$ as initialization makes it possible to train on the smaller dataset of fused images.
(S4) The last model, $f_4$, is trained in a similar manner as $f_1$, except that the individual annotations are obtained from the manual segmentations of the fused multi-view images by mapping them from the fused image space back to the single image space. This reduces the number of manual segmentations to carry out for the same amount of training data. The manual segmentation task is also easier, since the fused images have better image quality and a larger FoV.

Fig. 2. Illustration of the strategies to obtain multi-view placenta segmentation using 3D convolutional networks.
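For illustration, one plausible reading of the maximum intensity voting step in (S1), assuming the per-view probability maps have already been mapped onto the fused grid (with NaN outside each view's FoV, as in the fusion sketch above); the threshold value is our assumption:

```python
import numpy as np

def max_vote_fusion(warped_probs: list[np.ndarray],
                    threshold: float = 0.5) -> np.ndarray:
    """Combine aligned per-view probability maps by taking, at each voxel,
    the maximum over all views that cover it (NaN marks voxels outside a
    view's FoV), then threshold to a binary whole-placenta mask."""
    stack = np.stack(warped_probs)                 # (V, D, H, W)
    stack = np.where(np.isnan(stack), 0.0, stack)  # uncovered voxels vote 0
    fused = stack.max(axis=0)                      # maximum "vote" per voxel
    return (fused >= threshold).astype(np.uint8)
```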
3 Experimental Results
3.1 Dataset
We used a dataset of 127 3D US images to test our pipeline, selected from 4D (3D+time) image streams of 30 patients and covering different parts of the placenta. A subset of 94 images was acquired with a two-transducer (64) or a three-transducer (30) holder device; the rest were acquired using a single transducer. Two patients were in the second trimester (24 and 25 weeks GA) and the others in the third trimester (29–34 weeks GA). We split the dataset into a training set (85 images), a validation set (16 images) and a test set (26 images). When training only on multi-view images (approaches (S2) and (S3)), the sets reduce to 27, 3 and 12 images for training, validation and testing, respectively.
3.2 Results
The results for placenta segmentation are shown in Table 1, and a representative example of whole placenta segmentations is shown in Fig. 3. The accuracy of the segmentation during training, validation and inference is calculated using the Dice overlap and the absolute volume difference relative to the ground truth segmentation. Methods (S1) and (S3) both achieve the best results, with a mean Dice of 0.8 and a volume difference of around 16% for the multi-view images.
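Both reported metrics follow directly from the binary masks; a minimal sketch (the physical voxel size cancels in the relative volume difference, so volumes can be compared as voxel counts):

```python
import numpy as np

def evaluate(pred: np.ndarray, gt: np.ndarray) -> tuple[float, float]:
    """Dice overlap and absolute volume difference relative to the ground
    truth, both dimensionless.

    pred, gt : binary segmentation masks of the same shape
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    dice = 2.0 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum())
    rel_vol_diff = abs(int(pred.sum()) - int(gt.sum())) / gt.sum()
    return float(dice), float(rel_vol_diff)   # e.g. (0.80, 0.16) for 0.8 / 16%
```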

References

Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional Networks for Biomedical Image Segmentation. In: MICCAI 2015.
Litjens, G., et al.: A survey on deep learning in medical image analysis. Medical Image Analysis 42, 2017.
Yang, X., et al.: Towards Automated Semantic Segmentation in Prenatal Volumetric Ultrasound. IEEE Transactions on Medical Imaging, 2019.
Looney, P., et al.: Fully automated, real-time 3D ultrasound segmentation to estimate first trimester placental volume using deep learning. JCI Insight, 2018.