
Towards Whole Placenta Segmentation At Late Gestation Using Multi-View Ultrasound Images

Veronika A. Zimmer^a, Alberto Gomez^a, Emily Skelton^a, Nicolas Toussaint^a, Tong Zhang^a, Bishesh Khanal^{a,b}, Robert Wright^a, Yohan Noh^{a,c}, Alison Ho^d, Jacqueline Matthew^a, Joseph V. Hajnal^a, and Julia A. Schnabel^a

a School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK
b Nepal Applied Mathematics and Informatics Institute for Research (NAAMII)
c Department of Mechanical and Aerospace Engineering, Brunel University London, Uxbridge, UK
d Women's Health Academic Centre, King's College London, London, UK
Abstract. We propose a method to extract the human placenta at late gestation using multi-view 3D US images. This is the first step towards automatic quantification of placental volume and morphology from US images along the whole pregnancy, beyond the early stages (where the entire placenta can be captured with a single 3D US image). Our method uses 3D US images from different views acquired with a multi-probe system. A whole placenta segmentation is obtained from these images by using a novel technique based on 3D convolutional neural networks. We demonstrate the performance of our method on 3D US images of the placenta in the last trimester. We achieve a high Dice overlap of up to 0.8 with respect to manual annotations, and the derived placental volumes are comparable to corresponding volumes extracted from MR.
1 Introduction
Fetal ultrasound (US) is the primary imaging modality to monitor fetal development. While the fetal body, and especially the fetal brain, are subjects of intensive research, only a few methods exist to study the placenta in utero [11]. Placental development and function influence fetal health, yet only the placental site and cord insertion are routinely assessed using US. Limiting factors are the large size of the placenta at late gestation, its high variation in shape and position, and the limited field-of-view (FoV) and lack of contrast in US. Placental magnetic resonance imaging (MRI) overcomes some of those challenges, as it provides a large FoV and excellent soft-tissue contrast. Recently, the first method to assess the placenta in utero in a standardized way was presented using fetal MRI [5]. However, fetal MRI is corrupted by motion artifacts due to fetal motion and maternal breathing, and fetal MRI reconstruction is an active field of research [10]. US remains the standard screening tool because, in contrast to MRI, it is performed in real-time, is relatively inexpensive, and is widely available. MRI is generally only used upon referral from the US clinic to gain insight into specific conditions. US-based placenta segmentation could therefore enable automatic quantification of placental volume, morphology and function in routine clinical scans. In [9], a semi-automatic method based on the random walker algorithm was proposed to segment the placenta. State-of-the-art segmentation methods using convolutional neural networks (CNNs) have been used in [7, 4], and in [13] additionally for the fetus and the gestational sac. These methods focused on early pregnancies between 10 and 14 weeks of gestational age (GA), when the placenta is small enough to fit in the limited FoV of US. Routine anomaly screening is performed at 20 weeks GA, but placental volume at later gestations may be of benefit in predicting and monitoring fetal development. Therefore, a larger FoV is required to capture the whole placenta by US.

Fig. 1. Ultrasound placenta imaging using a multi-probe system. Left: slice of a 3D US image covering only part of the placenta; Middle: slice of a 3D multi-view image of the whole placenta; Right: physical multi-probe holder for two and three US probes.
Multi-view imaging can be used to extend the FoV of a single image. For example, the entire placenta can be captured by acquiring, aligning and fusing multiple 3D US images (Fig. 1). In previous works [12, 6, 1], registration algorithms and/or tracker information were employed to align the images and provide multi-view US. The resulting image has an extended FoV, and view-dependent artifacts such as shadows can be minimized through the additional signal information from multiple views [14]. Aligning US images of the placenta remains challenging, however, due to the lack of salient features to drive the registration process. External tracking, on the other hand, can provide position information of the US probe but is oblivious to maternal and fetal motion.
In this work, we introduce, for the first time, a pipeline to extract the whole placenta at late gestation. The approach consists of three stages: first, multi-view image acquisition; second, multi-view image fusion; and third, multi-view placenta segmentation. The multi-view US images are acquired using a time-interleaved multi-probe US system, without the need for image registration. We present a voxel-wise image fusion method to combine the images and reduce view-dependent artifacts, and compare four CNN-based approaches to extract the whole placenta from the multi-view images.

We test our pipeline on a dataset of 3D US images to estimate placental volume in the last trimester of pregnancy. We successfully fuse multi-view images to obtain an extended FoV and are able to extract the placenta. The derived placental volumes are comparable to corresponding volumes extracted from MR. To the best of our knowledge, this is the first time the placenta has been segmented at late gestation from US imaging.

2 Methods and Materials
The three stages of the proposed pipeline for whole placenta extraction (multi-view acquisition, fusion and segmentation) are described below.
2.1 Multi-probe ultrasound imaging
We acquire multiple US images using an in-house US signal multiplexer, which allows multiple Philips X6-1 probes to be connected to a Philips EPIQ V7 US system. The multiplexer switches rapidly between up to three probes, so that images from each probe are acquired in a time-interleaved fashion. The manual operation of the transducers is very slow compared to the acquisition frame rate. Therefore, for the purpose of data processing, consecutive images are assumed to have been acquired simultaneously over a small time window. We designed a physical device that fixes the probes at an angle of 30° to each other, which ensures a large overlap between the images and allows easy and comfortable operation (see Fig. 1).
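As a concrete illustration, the regrouping of the time-interleaved stream into quasi-simultaneous multi-view sets can be sketched as follows. This is a minimal sketch under the assumption of a fixed round-robin probe order; the function name and data layout are ours, not part of the described system:

```python
from typing import List, Sequence, Tuple

def group_interleaved_frames(frames: Sequence, n_probes: int) -> List[Tuple]:
    """Group a round-robin, time-interleaved frame stream into
    quasi-simultaneous multi-view sets (one frame per probe).

    Assumes probe motion is negligible within one round of the
    multiplexer, as in the acquisition described above.
    """
    n_rounds = len(frames) // n_probes  # drop a trailing incomplete round
    return [tuple(frames[r * n_probes + p] for p in range(n_probes))
            for r in range(n_rounds)]

# Example: 7 frames from a three-probe acquisition -> 2 complete sets.
sets = group_interleaved_frames(list(range(7)), n_probes=3)
# sets == [(0, 1, 2), (3, 4, 5)]
```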
2.2 Multi-view image fusion
In our application, the goal is to combine multiple images of the placenta to extend the FoV while minimizing view-dependent artifacts. The multi-view fusion method proposed in [14] produces high-quality fusion but is computationally expensive. We propose a simplification of that method by replacing the B-spline-based fusion with a voxel-wise fusion, as follows.
Let $I_1, \dots, I_V$, with $I_v : \Omega_v \subset \mathbb{R}^d \to \mathbb{R}$, $v = 1, \dots, V$, be images of the same object taken from $V$ views. Their spatial correspondences are known through spatial transformations $\phi_v : \Omega_v \subset \mathbb{R}^d \to \mathbb{R}^d$. The fused image $I_F : \Omega_F \to \mathbb{R}$, with $\Omega_1, \dots, \Omega_V \subseteq \Omega_F$, at point $x \in \Omega_F$ is obtained by

$$I_F(x) = \frac{\sum_{v=1}^{V} w_v(x) \cdot \mathbb{1}_{x \in \Omega_v} \cdot (I_v \circ \phi_v)(x)}{\sum_{v=1}^{V} w_v(x)}$$

with weight function $w : \mathbb{R}^d \to \mathbb{R}$. In other words, the intensity of a point $x$ of the fused image is calculated as the weighted mean of the corresponding points in the single images. The weight of a (transformed) data point $x$ is formulated as a function of the depth in the US image with respect to the probe position $b \in \mathbb{R}^d$ and the beam angle $\alpha \in [-\frac{\pi}{2}, \frac{\pi}{2}]$, in the same way as in [14]. In effect, image points with a strong signal (to correct for shadow artifacts) and at a position close to the center of the US frustum (where the quality of the image is typically the best) will receive higher weights.
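For illustration, a minimal NumPy sketch of this voxel-wise fusion. It assumes the images have already been resampled onto the fused grid (i.e. $I_v \circ \phi_v$ is precomputed), with NaN marking voxels outside $\Omega_v$, and that the depth- and angle-dependent weights of [14] are supplied as precomputed volumes; the weight model itself is therefore abstracted away here:

```python
import numpy as np

def fuse_views(warped: list[np.ndarray], weights: list[np.ndarray]) -> np.ndarray:
    """Voxel-wise weighted-mean fusion of V co-registered US volumes.

    warped[v]  : image I_v composed with phi_v, sampled on the fused grid,
                 with np.nan outside the view's domain Omega_v.
    weights[v] : non-negative weight w_v(x) on the same grid (e.g. a
                 function of beam depth and angle, as in the text).
    """
    num = np.zeros_like(warped[0], dtype=np.float64)
    den = np.zeros_like(warped[0], dtype=np.float64)
    for img, w in zip(warped, weights):
        inside = ~np.isnan(img)                 # indicator 1_{x in Omega_v}
        num[inside] += (w * np.nan_to_num(img))[inside]
        den[inside] += w[inside]
    out = np.full_like(num, np.nan)             # NaN where no view covers x
    np.divide(num, den, out=out, where=den > 0) # weighted mean where covered
    return out
```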
2.3 Whole placental segmentation
Semantic segmentation using neural networks. In recent years, convolutional neural networks (CNNs) have shown excellent segmentation results, outperforming conventional methods both in segmentation quality and in speed [8, 3]. In a supervised CNN approach, the segmentation of an object is learned purely from the data of a training set $T = \{(I_n, S_n), n = 1, \dots, N\}$, with images $I_n : \mathbb{R}^d \to \mathbb{R}$ and ground truth segmentations $S_n : \mathbb{R}^d \to \{0, 1\}$ (in our case US images with $d = 3$ and manual annotations of the placenta). The model $f$ estimates for an input image $I$ the segmentation map $S$: $S = f(I, \Theta)$. During the training of the model $f(I, \Theta)$ on the training set $T$, the network parameters $\Theta \in \mathbb{R}^P$ are optimized to minimize a loss function $L$, which measures the agreement between the ground truth $S_n$ and the segmentation estimated by the model. The parameters $\Theta$ include the connection weights, biases and convolutional kernel weights.
Our segmentation network is based on the widely used U-net architecture [8], which is a fully convolutional neural network with an encoder-decoder architecture, a bottleneck layer in between, and skip connections from encoder layers to the decoder. We use ReLUs, strided max pooling for down-sampling, and zero padding. We set [32, 64, 128, 256] feature maps per layer, where all convolutional kernels and feature maps are in 3D.
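As an illustration, a minimal PyTorch sketch of such a 3D U-net with [32, 64, 128, 256] encoder feature maps. The number of convolutions per level, the bottleneck width and the transposed-convolution upsampling are our assumptions, since the text does not fully specify the decoder:

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3x3 convolutions with zero padding, each followed by a ReLU."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    )

class UNet3D(nn.Module):
    """3D U-net: encoder, bottleneck, decoder with skip connections."""
    def __init__(self, in_ch: int = 1, out_ch: int = 1,
                 features=(32, 64, 128, 256)):
        super().__init__()
        self.encoders = nn.ModuleList()
        ch = in_ch
        for f in features:                      # encoder path
            self.encoders.append(conv_block(ch, f))
            ch = f
        self.pool = nn.MaxPool3d(kernel_size=2, stride=2)
        self.bottleneck = conv_block(features[-1], 2 * features[-1])
        self.upconvs = nn.ModuleList()
        self.decoders = nn.ModuleList()
        ch = 2 * features[-1]
        for f in reversed(features):            # decoder path
            self.upconvs.append(nn.ConvTranspose3d(ch, f, kernel_size=2, stride=2))
            self.decoders.append(conv_block(2 * f, f))  # input: upsampled + skip
            ch = f
        self.head = nn.Conv3d(features[0], out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        skips = []
        for enc in self.encoders:
            x = enc(x)
            skips.append(x)                     # keep for skip connections
            x = self.pool(x)
        x = self.bottleneck(x)
        for up, dec, skip in zip(self.upconvs, self.decoders, reversed(skips)):
            x = up(x)
            x = dec(torch.cat([x, skip], dim=1))
        return torch.sigmoid(self.head(x))      # voxel-wise placenta probability

# A 128^3 input volume, as used for the single-view images in the paper:
# probs = UNet3D()(torch.zeros(1, 1, 128, 128, 128))  # -> (1, 1, 128, 128, 128)
```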
For training, we resample all images to size 128 × 128 × 128. During training, we minimize the Dice loss using the Adam optimizer with a learning rate of 0.001. We augment our dataset by image flipping along the x- and z-axes (avoiding flipping the image upside down, to keep the correct positioning of the US frustum), intensity rescaling by ±10%, and small random translations (up to five pixels in all directions). Rotations are avoided because they would produce unrealistic, view-direction-dependent image features.
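The Dice loss follows directly from its definition; a minimal sketch, in which the smoothing constant is our addition to avoid division by zero (the paper does not specify one):

```python
import torch

def dice_loss(pred: torch.Tensor, target: torch.Tensor,
              eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss: 1 - 2|P.S| / (|P| + |S|), averaged over the batch.

    pred   : predicted probabilities in [0, 1], shape (B, 1, D, H, W)
    target : binary ground-truth masks, same shape
    eps    : smoothing term (our choice, not from the paper)
    """
    dims = tuple(range(1, pred.ndim))            # all axes but the batch axis
    inter = (pred * target).sum(dim=dims)
    denom = pred.sum(dim=dims) + target.sum(dim=dims)
    dice = (2 * inter + eps) / (denom + eps)
    return 1 - dice.mean()

# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # as in the text
```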
Multi-view image segmentation At later gestation, the field-of-view of US
is too small to capture the whole placenta. Multiple probes, as described above,
or placenta sweeps using an appropriate registration or tracking method to align
the images, can be used to visualize the whole placenta in one US volume. Those
images differ not only in the view of the placenta, but also in view-dependent
artifacts, such as shadows or attenuation. To provide a consistent segmentation of
the whole placenta across multiple images, we propose four CNN-based variants,
which make use of the multi-view information in different ways (see Fig. 2).
(S1) The model $f_1$ is trained on $N$ single US images $I^s_n$ with manual annotations $S^s_n$, $n = 1, \dots, N$, of the placenta, without using any information about correspondences between different views of the same placenta. The resulting segmentations for the individual images are aligned and fused using maximum intensity voting to obtain the segmentation of the whole placenta (see the sketch below).
(S2) The model $f_2$ is trained on $M < N$ fused multi-view images $I^{mv}_m$ with manual annotations $S^{mv}_m$, $m = 1, \dots, M$. The training set is smaller than in approach (S1), since one image $I^{mv}_m$ is the fusion of two or three images $I^s_n$. The fused images are resampled to size 192 × 128 × 128; the larger size in the first dimension accounts for the larger FoV of the fused images.
(S3) The model $f_1$, trained on individual images, is re-trained as model $f_3$ using fused multi-view images (resampled to the size of the individual images) which have been separately annotated. Using the pre-trained weights of $f_1$ as initialization makes it possible to train on the smaller dataset of fused images.
(S4) The last model, $f_4$, is trained in a similar manner as $f_1$, except that the individual annotations are obtained from the manual segmentations of the fused multi-view images by mapping them from the fused image space back to the single image space. This reduces the number of manual segmentations to carry out for the same amount of training data. The manual segmentation task is also easier, since the fused images have better image quality and a larger FoV.

Fig. 2. Illustration of the strategies to obtain multi-view placenta segmentation using 3D convolutional networks.
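For illustration, one plausible reading of the maximum intensity voting step in (S1), assuming the per-view probability maps have already been mapped onto the fused grid (with NaN outside each view's FoV, as in the fusion sketch above); the threshold value is our assumption:

```python
import numpy as np

def max_vote_fusion(warped_probs: list[np.ndarray],
                    threshold: float = 0.5) -> np.ndarray:
    """Combine aligned per-view probability maps by taking, at each voxel,
    the maximum over all views that cover it (NaN marks voxels outside a
    view's FoV), then threshold to a binary whole-placenta mask."""
    stack = np.stack(warped_probs)                 # (V, D, H, W)
    stack = np.where(np.isnan(stack), 0.0, stack)  # uncovered voxels vote 0
    fused = stack.max(axis=0)                      # maximum "vote" per voxel
    return (fused >= threshold).astype(np.uint8)
```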
3 Experimental Results
3.1 Dataset
We used a dataset of 127 3D US images to test our pipeline, selected from 4D (3D+time) image streams of 30 patients and covering different parts of the placenta. A subset of 94 images was acquired with a two-transducer (64) or a three-transducer (30) holder device; the rest were acquired using a single transducer. Two patients were in the second trimester (24 and 25 weeks GA) and the others in the third trimester (29–34 weeks GA). We split the dataset into a training set (85 images), a validation set (16 images) and a test set (26 images). When training only on multi-view images (approaches (S2) and (S3)), the sets reduce to 27, 3 and 12 images for training, validation and testing, respectively.
3.2 Results
The results for placenta segmentation are shown in Table 1, and a representative example of whole placenta segmentations is shown in Fig. 3. The accuracy of the segmentation during training, validation and inference is calculated using the Dice overlap and the absolute volume difference relative to the ground truth segmentation. Methods (S1) and (S3) both achieve the best results, with a mean Dice of 0.8 and a volume difference of around 16% for the multi-view images.
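Both reported metrics follow directly from the binary masks; a minimal sketch (the physical voxel size cancels in the relative volume difference, so volumes can be compared as voxel counts):

```python
import numpy as np

def evaluate(pred: np.ndarray, gt: np.ndarray) -> tuple[float, float]:
    """Dice overlap and absolute volume difference relative to the ground
    truth, both dimensionless.

    pred, gt : binary segmentation masks of the same shape
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    dice = 2.0 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum())
    rel_vol_diff = abs(int(pred.sum()) - int(gt.sum())) / gt.sum()
    return float(dice), float(rel_vol_diff)   # e.g. (0.80, 0.16) for 0.8 / 16%
```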

References

Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional Networks for Biomedical Image Segmentation. In: MICCAI 2015.
Litjens, G., et al.: A survey on deep learning in medical image analysis. Medical Image Analysis 42, 2017.
Yang, X., et al.: Towards Automated Semantic Segmentation in Prenatal Volumetric Ultrasound. IEEE Transactions on Medical Imaging, 2019.
Looney, P., et al.: Fully automated, real-time 3D ultrasound segmentation to estimate first trimester placental volume using deep learning. JCI Insight, 2018.