Inter-foetus Membrane Segmentation for TTTS Using Adversarial Networks
Summary (3 min read)
1 Introduction
- Twin-to-Twin Transfusion Syndrome (TTTS) is a pathology with deadly consequences that occurs in 15% of monochorionic pregnancies (75% of twin homozygous pregnancies) [3].
- Fetoscopic Minimally Invasive Surgery (MIS) has largely decreased maternal and foetal morbidity and mortality [35], becoming the recommended first-line treatment for TTTS.
- The surgery consists of a direct interruption of anastomoses that are responsible for TTTS via laser photo-coagulation.
- The selection of the vessels to be treated relies on the location of abnormal vascular formations, at the small branches of normal blood vessels.
- Additional challenges include large variability in the illumination level, which ranges from intense illumination (causing specular reflections) to dim lighting conditions.
2.1 SAN architecture
- As in the original generative framework, the Segmentation Adversarial Network (SAN) implemented in this work consists of two networks: the generator, which here acts as the segmentation network (S), and the discriminator, which here is the critic network (C). The two are trained alternately, S to minimise and C to maximise an objective function.
- Figure 2 shows the overall diagram of the framework.
- Considering the improvements in network training speed and performance reported in the literature [40], Leaky ReLU is chosen over the standard ReLU.
- The last step of the encoding path is made of a convolution layer with ReLU activation.
- The architecture of the C network (Table 2) contains the same encoding path as the segmentor network for feature extraction.
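The difference between the standard ReLU and the Leaky ReLU mentioned above can be illustrated with a minimal sketch. The negative slope of 0.01 is an assumption chosen for illustration; this summary does not state the value used by the authors.

```python
def relu(x: float) -> float:
    """Standard ReLU: negative inputs are clamped to zero."""
    return max(0.0, x)


def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """Leaky ReLU: negative inputs are scaled by a small slope.

    negative_slope=0.01 is a hypothetical default, not a value
    taken from the paper.
    """
    return x if x > 0 else negative_slope * x


if __name__ == "__main__":
    for value in (-2.0, 0.0, 3.0):
        print(value, relu(value), leaky_relu(value))
```

Because the negative branch keeps a small non-zero slope, units that receive negative inputs still pass a gradient, which is the usual motivation for preferring Leaky ReLU.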
2.2 Training strategy
- In the SAN framework there are two loss functions, one for the segmentor S and one for the critic C network.
- The L_L1 loss term is computed from the differences between high-level features that the critic network extracts from the predicted and from the true segmentation.
- The L_L1 loss function forces the segmentor to learn both global and local features that capture long- and short-range spatial relationships between pixels.
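The two-term segmentor loss described in this summary (a Dice-based overlap term plus an L1 term on critic features) could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' Eq. 5: the function names, the per-layer averaging, the smoothing constant `eps` and the weight `l1_weight` are all hypothetical, and the critic features are passed in as plain lists instead of being computed by a network.

```python
from typing import Sequence


def dice_loss(pred: Sequence[float], target: Sequence[float],
              eps: float = 1e-7) -> float:
    """1 - soft Dice coefficient over flattened masks (values in [0, 1])."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def feature_l1(feats_pred: Sequence[Sequence[float]],
               feats_true: Sequence[Sequence[float]]) -> float:
    """Mean absolute difference between critic features, averaged over layers.

    Each inner sequence stands for the flattened features of one critic layer.
    """
    per_layer = [
        sum(abs(a - b) for a, b in zip(fp, ft)) / len(fp)
        for fp, ft in zip(feats_pred, feats_true)
    ]
    return sum(per_layer) / len(per_layer)


def segmentor_loss(pred, target, feats_pred, feats_true,
                   l1_weight: float = 1.0) -> float:
    """Dice term plus weighted critic-feature L1 term (weight is hypothetical)."""
    return dice_loss(pred, target) + l1_weight * feature_l1(feats_pred, feats_true)
```

Comparing features from several critic layers is what lets the L1 term penalise mismatches at different spatial scales, which matches the global and local behaviour described above.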
3.1 Dataset
- To train the proposed framework, the authors built a new dataset in collaboration with the Department of Foetal and Perinatal Medicine, Istituto Giannina Gaslini, Genoa.
- The dataset consisted of 900 frames (frame size: 720 x 576 pixels) extracted from 6 videos (150 frames per video) acquired from patients during normal surgical practice.
- The authors randomly assembled a dataset acquired from patients who received TTTS laser treatments at the same hospital.
- The followed procedures were in accordance with the image data collection and retrospec-.
- The black borders surrounding the FoV do not bring any additional information to segment the membrane but increase the GPU-memory and computational-cost requirements during training.
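One simple way to discard the black borders around the field of view is to crop each frame to the bounding box of its non-dark pixels. The sketch below is an assumption about how such a crop could be done; the summary does not describe the authors' cropping procedure, and the function names and the intensity threshold are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # grey-scale frame as rows of pixel intensities


def fov_bounding_box(frame: Frame, threshold: int = 10) -> Tuple[int, int, int, int]:
    """Return (top, bottom, left, right) of the region brighter than threshold.

    bottom and right are exclusive. threshold=10 is a hypothetical value.
    """
    rows = [r for r, row in enumerate(frame) if any(v > threshold for v in row)]
    cols = [c for c in range(len(frame[0]))
            if any(row[c] > threshold for row in frame)]
    if not rows or not cols:
        raise ValueError("frame contains no pixel above the threshold")
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1


def crop(frame: Frame, box: Tuple[int, int, int, int]) -> Frame:
    """Crop the frame to the given bounding box."""
    top, bottom, left, right = box
    return [row[left:right] for row in frame[top:bottom]]
```

Computing the box once per video and reusing it for every frame would keep the cropped frames the same size, which is convenient for mini-batch training.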
3.2 Training setting and Ablation Study
- To limit memory requirements during training while still promoting convergence, SAN was trained with mini-batches (batch size = 30 frames), minimising L_SAN (Eq. 5) with Adam [19].
- To initialise the weights, the segmentor was first trained without the critic for the first 25 epochs.
- The best model was selected as the one that maximised the DSC on the validation set.
- The framework was originally proposed for skin lesion segmentation for the ISBI International Skin Imaging Collaboration 2017 challenge [41].
- To evaluate inter-annotator variability, the authors asked a second expert to annotate the fetoscopic video used as the test set.
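The Dice similarity coefficient (DSC) used for model selection and for comparing annotators can be computed on binary masks as in the sketch below. The helper names are hypothetical, masks are flattened lists of 0/1 values, and returning 1.0 for two empty masks is a convention assumed here, not one taken from the paper.

```python
from typing import Dict, List, Sequence


def dsc(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def select_best_epoch(val_dsc_per_epoch: Dict[int, List[float]]) -> int:
    """Return the epoch whose mean validation DSC is highest."""
    return max(
        val_dsc_per_epoch,
        key=lambda epoch: sum(val_dsc_per_epoch[epoch]) / len(val_dsc_per_epoch[epoch]),
    )
```

The same `dsc` function applies to inter-annotator variability: it is evaluated between the masks of the first and second expert on each test frame.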
3.3 Performance metrics
- The Lilliefors test was used to assess population normality on DSC.
- The Kruskal-Wallis test on DSC and the Westenberg-Mood test on IQR, both with a significance level of 0.05, were used to assess whether statistically significant differences existed between the tested architectures.
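For readers unfamiliar with the Kruskal-Wallis test, the following sketch computes its H statistic for several groups of DSC values. It is a simplified, standard-library illustration under stated assumptions: ties receive average ranks but no tie correction is applied, the p-value helper is valid only for exactly three groups (chi-square approximation with 2 degrees of freedom), and all function names are hypothetical. In practice a statistics library such as SciPy (`scipy.stats.kruskal`) would normally be used.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)


def p_value_three_groups(h: float) -> float:
    """Chi-square survival function with 2 degrees of freedom: exp(-h / 2)."""
    return math.exp(-h / 2.0)
```

With three architectures compared, a p-value below 0.05 from `p_value_three_groups(kruskal_wallis_h(groups))` would indicate a significant difference in DSC distributions, subject to the chi-square approximation being reasonable for the sample size.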
4 Results
- The processing time of images in the test set was less than a millisecond, on average.
- This confirms that the approach is compatible with real-time applications.
- In (i), all the networks achieved good results despite the presence of spots and specularities; in (ii), all networks achieved good results, although U-Net and the residual architecture produced some spots in the lower area, where the texture could suggest the presence of the membrane.
- This suggests that the critic network enables the segmentor network to better handle poor-quality images (e.g., with laser pointer, light specularities, drops in light intensity, etc.).
5 Discussion
- During TTTS surgery, the identification of the inter-foetal membrane helps the surgeon to remain oriented in the surgical site.
- The complexity of the placental environment, especially in advanced pregnancies, makes this task very challenging even for expert clinicians when performing surgery.
- The authors also compare this framework with state-of-the-art FCNNs for medical-image segmentation.
- For this reason, some kinds of images (e.g., with a small portion of the membrane) are less numerous than others, limiting the network learning capability.
- Further improvements will deal with the exploitation of temporal features, as suggested in [43, 6], considering that the temporal information is naturally encoded in the surgical videos.
5.1 Conclusion
- The authors proposed an adversarial framework for accurate and fast inter-foetal membrane segmentation in fetoscopic MIS images.
- Figure caption: Sample segmentation results on the test set using U-Net (second column), U-Net with the residual implementation (third column) and the proposed SAN (last column), alongside the manual expert-clinician ground truth (first column). Each network was trained with both grey-scale and RGB fetoscopic images. The green, grey and blue contours refer to the ground-truth, grey-scale-based and RGB-based segmentation results, respectively.
Frequently Asked Questions (18)
Q2. What are the future works mentioned in the paper "Inter-foetus membrane segmentation for ttts using adversarial networks" ?
This problem will be addressed in the future by investigating extensions of this framework supported by a broader dataset and more advanced data augmentation techniques. Further improvements will deal with the exploitation of temporal features, as suggested in [43, 6], considering that the temporal information is naturally encoded in the surgical videos.
Q3. What tests were used to assess the performance of the SAN?
The Kruskal-Wallis test on DSC and the Westenberg-Mood test on IQR, both with a significance level of 0.05, were used to assess whether statistically significant differences existed between the tested architectures.
Q4. Why did the second expert annotate only the test set?
The authors asked the second expert to annotate only the test set (150 frames) due to the high time demand needed to perform manual annotation.
Q5. What is the risk of perinatal mortality of one or both foetuses?
The risk of perinatal mortality of one or both foetuses can exceed 90% without any treatment, with an incidence of physical or neurological complications in 50% of the surviving foetuses [31, 32].
Q6. What could be exploited to avoid the processing of uninformative video portions?
Frame selection strategies [26] could also be exploited to avoid the processing of uninformative (e.g., blurred) video portions.
Q7. What is the encoding path of the decoder?
Each step of the decoder is made of a strided deconvolution layer with BN and a ReLU activation layer followed by a residual block.
Q8. What was the first proposed framework for natural images?
Adversarial training was initially proposed by Goodfellow et al. [15] as a generative framework for natural images (i.e., in the context of Generative Adversarial Networks (GANs)) made of a generator and a discriminator network [15].
Q9. What is the median DSC for U-Net and the proposed adversarial network?
In (iv), the presence of low contrast and laser light compromises the detection of the membrane in U-Net and Residual networks, while their framework produces a good segmentation.
Q10. How many frames did the proposed framework achieve?
In this paper, the authors proposed an adversarial framework for accurate and fast inter-foetal membrane segmentation in fetoscopic MIS images achieving a median DSC of 91.91% on a new dataset of 150 images from intraoperative TTTS surgery videos.
Q11. What is the architecture of the segmentor network S?
The architecture of the segmentor network S (Table 1) is based on the U-Net [33] encoder-decoder structure, a fully convolutional network that naturally performs overlap-tile extraction, preserving spatial connectivity between tiles while speeding up network training.
Q12. What other work could be integrated with the proposed approach?
The proposed approach may also be integrated with recent work on vessel segmentation from placenta images [1, 34], stitching of fetoscopy images to build panoramic placental images [12, 44] and classification of TTTS surgical phases [38].
Q13. What is the definition of the segmentor loss in the SAN framework?
The segmentor loss (L^S_SAN) (Eq. 5) in their framework consists of two terms: a common overlap metric based on the Dice similarity coefficient (L_DSC) and an additional term derived from the critic (L_L1).
Q14. What are the main reasons why the fetoscopic images may look different?
The high level of noise, the blurred vision due to amniotic fluid with suspended particulate matter, the wide range of illumination and the variation of the fetoscope pose with respect to the recorded tissues further increase the complexity of segmenting the structures.
Q15. Why is the dataset size a strong limitation of this study?
Despite their efforts, due to the limited amount of available videos and the complexity of the task, the dataset size remains a strong limitation of this study.
Q16. What is the way to train a TTTS dataset?
Building a large dataset, as recommended to avoid overfitting, was difficult because: (i) manual data annotation is a complex and time-consuming task, (ii) data availability is limited, since TTTS is a rare pathology.
Q17. What are the limitations of supervised machine learning?
To tackle some of the limitations of these approaches (e.g., needs for parameter tuning and long processing time), supervised machine learning algorithms have been proposed to provide fast and accurate segmentation [36].
Q18. What is the combination between segmentation performance and robustness?
An ablation study was performed, showing that the S network with 5 encoding-decoding layers was the best combination between segmentation performance and robustness.