Book Chapter

Visual-Assisted Probe Movement Guidance for Obstetric Ultrasound Scanning Using Landmark Retrieval

TLDR
In this paper, a Transformer-VLAD network is proposed to learn a global descriptor to represent each US image, and anchor-positive-negative US image pairs are automatically constructed through a KD-tree search of 3D probe positions.
Abstract
Automated ultrasound (US) probe movement guidance is desirable to assist inexperienced human operators during obstetric US scanning. In this paper, we present a new visual-assisted probe movement technique that uses automated landmark retrieval for assistive obstetric US scanning. In a first step, a set of landmarks is constructed uniformly around a virtual 3D fetal model. Then, during obstetric scanning, a deep neural network (DNN) model locates the nearest landmark through a descriptor search between the current observation and the landmarks. The global position cues are visualised in real time on a monitor to assist the human operator with probe movement. A Transformer-VLAD network is proposed to learn a global descriptor that represents each US image. This retrieval formulation avoids the need for deep parameter regression, which enhances the generalization ability of the network. To avoid prohibitively expensive human annotation, anchor-positive-negative US image pairs are constructed automatically through a KD-tree search of the 3D probe positions. This yields an end-to-end network trained in a self-supervised way through contrastive learning.
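The self-supervised pair construction described in the abstract can be illustrated with a short sketch. The code below is not the authors' implementation: it assumes each US frame comes with the 3D probe position recorded at acquisition time and mines anchor-positive-negative frame triplets with a KD-tree over those positions; the function name and the pos_radius / neg_radius thresholds are illustrative assumptions, not values from the paper.

```python
# Minimal sketch (not the authors' code) of KD-tree triplet mining from
# 3D probe positions, as described in the abstract above.
import numpy as np
from scipy.spatial import cKDTree


def mine_triplets(probe_positions, pos_radius=5.0, neg_radius=50.0, seed=0):
    """Return (anchor, positive, negative) frame-index triplets.

    probe_positions : (N, 3) array of 3D probe positions, one per US frame.
    pos_radius      : frames closer than this to the anchor count as positives.
    neg_radius      : frames farther than this from the anchor count as negatives.
    """
    rng = np.random.default_rng(seed)
    tree = cKDTree(probe_positions)
    triplets = []
    for anchor, pos in enumerate(probe_positions):
        # Spatial neighbours of the anchor frame (excluding itself) -> positives.
        neighbours = [i for i in tree.query_ball_point(pos, r=pos_radius) if i != anchor]
        # Frames acquired far from the anchor -> negatives.
        distances = np.linalg.norm(probe_positions - pos, axis=1)
        negatives = np.flatnonzero(distances > neg_radius)
        if neighbours and negatives.size:
            triplets.append((anchor, int(rng.choice(neighbours)), int(rng.choice(negatives))))
    return triplets
```

Each triplet would then drive a standard triplet/contrastive loss (e.g., torch.nn.TripletMarginLoss) on the global descriptors produced by the Transformer-VLAD network, and at scan time the nearest landmark is retrieved by comparing the current frame's descriptor against the pre-computed landmark descriptors.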


Citations
Journal Article

A Review on Deep-Learning Algorithms for Fetal Ultrasound-Image Analysis

TL;DR: This paper presents a detailed survey of the most recent work in the field, covering a total of 145 research papers published after 2017; each paper is analyzed and commented on from both a methodology and an application perspective.
Journal Article

DATR: Domain-adaptive transformer for multi-domain landmark detection

Heqin Zhu, +2 more
12 Mar 2022
TL;DR: A universal model for multi-domain landmark detection that exploits the Transformer to model long-range dependencies; the resulting domain-adaptive Transformer, named DATR, is trained on multiple mixed datasets from different anatomies and can detect landmarks in any image from those anatomies.
Journal Article

Improving Classification of Tetanus Severity for Patients in Low-Middle Income Countries Wearing ECG Sensors by Using a CNN-Transformer Network

TL;DR: A novel hybrid CNN-Transformer model is proposed to automatically classify tetanus severity from tetanus monitoring with low-cost wearable ECG sensors, capturing local features with the CNN and global features with the Transformer; it outperforms state-of-the-art methods in tetanus classification, although a Random Forest with enough manually selected features can be comparable to the proposed CNN-Transformer model.
References
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting, using an architecture with very small (3x3) convolution filters, and shows that a significant improvement over prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Proceedings Article

Attention is All you Need

TL;DR: This paper proposes the Transformer, a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.
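As a pointer to what "based solely on attention mechanisms" means in practice, here is a minimal scaled dot-product attention sketch (illustrative only, single head, no masking; it is not the cited paper's code):

```python
# Core operation of the Transformer: scaled dot-product attention.
import torch
import torch.nn.functional as F


def scaled_dot_product_attention(q, k, v):
    """q, k, v: (batch, seq_len, d) tensors; returns the attended values."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5   # pairwise similarity (batch, seq, seq)
    weights = F.softmax(scores, dim=-1)           # attention weights sum to 1 per query
    return weights @ v                            # weighted sum of the values
```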
Proceedings Article

NetVLAD: CNN Architecture for Weakly Supervised Place Recognition

TL;DR: Develops a convolutional neural network architecture that is trainable end-to-end directly for the place recognition task, together with an efficient training procedure that can be applied to very large-scale, weakly labelled data.
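The VLAD-style pooling introduced by this reference, which the Transformer-VLAD descriptor of the main paper builds on, can be summarised in a few lines. The sketch below is a simplified NetVLAD pooling head, an illustration rather than the official implementation; num_clusters and dim are arbitrary assumed values.

```python
# Simplified NetVLAD pooling head: aggregates a C x H x W feature map into a
# (K*C)-dimensional global descriptor by soft-assigning each local feature to
# K learned cluster centres and accumulating the residuals.
import torch
import torch.nn as nn
import torch.nn.functional as F


class NetVLAD(nn.Module):
    def __init__(self, num_clusters=64, dim=512):
        super().__init__()
        self.conv = nn.Conv2d(dim, num_clusters, kernel_size=1)      # soft-assignment logits
        self.centroids = nn.Parameter(torch.randn(num_clusters, dim))

    def forward(self, x):                                            # x: (B, C, H, W)
        soft_assign = F.softmax(self.conv(x).flatten(2), dim=1)      # (B, K, H*W)
        feats = x.flatten(2)                                         # (B, C, H*W)
        # Residual of every local feature to every cluster centre: (B, K, C, H*W)
        residual = feats.unsqueeze(1) - self.centroids[None, :, :, None]
        vlad = (residual * soft_assign.unsqueeze(2)).sum(dim=-1)     # (B, K, C)
        vlad = F.normalize(vlad, p=2, dim=2)                         # intra-cluster L2 norm
        return F.normalize(vlad.flatten(1), p=2, dim=1)              # (B, K*C) descriptor
```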
Posted Content

Image Transformer

TL;DR: In this article, the self-attention mechanism is restricted to attend to local neighborhoods, which significantly increases the size of images the model can generate while maintaining larger receptive fields per layer than typical CNNs.
Proceedings Article

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

TL;DR: The Vision Transformer (ViT) applies a pure Transformer directly to sequences of image patches and performs very well on image classification tasks, achieving state-of-the-art results on ImageNet, CIFAR-100, VTAB, and other benchmarks.
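For context on "sequences of image patches", the following is an illustrative sketch of the patch tokenisation step only (not the paper's code; the linear projection, class token, and position embeddings are omitted):

```python
# Split an image batch into the fixed-size patch tokens a ViT-style model consumes.
import torch


def image_to_patches(img, patch=16):
    """Split a (B, C, H, W) image into (B, num_patches, patch*patch*C) tokens."""
    B, C, H, W = img.shape
    patches = img.unfold(2, patch, patch).unfold(3, patch, patch)    # (B, C, H/p, W/p, p, p)
    patches = patches.permute(0, 2, 3, 1, 4, 5).contiguous()
    return patches.view(B, -1, C * patch * patch)
```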