Author

Amy Zhao

Bio: Amy Zhao is an academic researcher from the Massachusetts Institute of Technology. The author has contributed to research on topics including convolutional neural networks and image registration. The author has an h-index of 8 and has co-authored 16 publications receiving 1,881 citations.

Papers
Journal ArticleDOI
TL;DR: VoxelMorph's unsupervised model achieves accuracy comparable to state-of-the-art methods while operating orders of magnitude faster; the framework promises to speed up medical image analysis and processing pipelines while facilitating novel directions in learning-based registration and its applications.
Abstract: We present VoxelMorph, a fast learning-based framework for deformable, pairwise medical image registration. Traditional registration methods optimize an objective function for each pair of images, which can be time-consuming for large datasets or rich deformation models. In contrast to this approach, and building on recent learning-based methods, we formulate registration as a function that maps an input image pair to a deformation field that aligns these images. We parameterize the function via a convolutional neural network (CNN), and optimize the parameters of the neural network on a set of images. Given a new pair of scans, VoxelMorph rapidly computes a deformation field by directly evaluating the function. In this work, we explore two different training strategies. In the first (unsupervised) setting, we train the model to maximize standard image matching objective functions that are based on the image intensities. In the second setting, we leverage auxiliary segmentations available in the training data. We demonstrate that the unsupervised model's accuracy is comparable to state-of-the-art methods, while operating orders of magnitude faster. We also show that VoxelMorph trained with auxiliary data improves registration accuracy at test time, and evaluate the effect of training set size on registration. Our method promises to speed up medical image analysis and processing pipelines, while facilitating novel directions in learning-based registration and its applications. Our code is freely available at https://github.com/voxelmorph/voxelmorph.
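The unsupervised objective described above pairs an intensity-matching term with a smoothness penalty on the predicted deformation field. Below is a minimal PyTorch sketch of such a loss; it illustrates the idea only and is not the released VoxelMorph code, and the regularization weight is a placeholder value.

```python
import torch

def similarity_loss(warped_moving, fixed):
    # Mean squared intensity difference between the warped moving image and the fixed image.
    return torch.mean((warped_moving - fixed) ** 2)

def smoothness_loss(flow):
    # Penalize spatial gradients of the deformation field (finite differences per axis),
    # for a flow of shape (batch, 3, depth, height, width).
    dz = torch.mean((flow[:, :, 1:, :, :] - flow[:, :, :-1, :, :]) ** 2)
    dy = torch.mean((flow[:, :, :, 1:, :] - flow[:, :, :, :-1, :]) ** 2)
    dx = torch.mean((flow[:, :, :, :, 1:] - flow[:, :, :, :, :-1]) ** 2)
    return dz + dy + dx

def unsupervised_loss(warped_moving, fixed, flow, reg_weight=0.01):
    # Image-matching term plus weighted smoothness regularizer; reg_weight is illustrative.
    return similarity_loss(warped_moving, fixed) + reg_weight * smoothness_loss(flow)
```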

860 citations

Proceedings ArticleDOI
07 Feb 2018
TL;DR: The proposed method uses a spatial transform layer to reconstruct one image from another while imposing smoothness constraints on the registration field, and demonstrates registration accuracy comparable to state-of-the-art 3D image registration, while operating orders of magnitude faster in practice.
Abstract: We present a fast learning-based algorithm for deformable, pairwise 3D medical image registration. Current registration methods optimize an objective function independently for each pair of images, which can be time-consuming for large data. We define registration as a parametric function, and optimize its parameters given a set of images from a collection of interest. Given a new pair of scans, we can quickly compute a registration field by directly evaluating the function using the learned parameters. We model this function using a CNN, and use a spatial transform layer to reconstruct one image from another while imposing smoothness constraints on the registration field. The proposed method does not require supervised information such as ground truth registration fields or anatomical landmarks. We demonstrate registration accuracy comparable to state-of-the-art 3D image registration, while operating orders of magnitude faster in practice. Our method promises to significantly speed up medical image analysis and processing pipelines, while facilitating novel directions in learning-based registration and its applications. Our code is available at https://github.com/balakg/voxelmorph.
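The key building block here is the spatial transform layer, which warps the moving image with the predicted displacement field so the result can be compared against the fixed image. The following is a 2D PyTorch sketch of such a layer under those assumptions, not the repository's implementation.

```python
import torch
import torch.nn.functional as F

def warp(moving, flow):
    """Warp a batch of 2D images (B, C, H, W) with a displacement field (B, 2, H, W)."""
    _, _, h, w = moving.shape
    # Identity sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, device=moving.device, dtype=moving.dtype),
        torch.arange(w, device=moving.device, dtype=moving.dtype),
        indexing="ij",
    )
    # Displace the grid and normalize coordinates to [-1, 1] for grid_sample.
    new_x = 2.0 * (xs + flow[:, 0]) / (w - 1) - 1.0
    new_y = 2.0 * (ys + flow[:, 1]) / (h - 1) - 1.0
    sample_grid = torch.stack((new_x, new_y), dim=-1)  # (B, H, W, 2), (x, y) order
    # Bilinear resampling of the moving image at the displaced locations.
    return F.grid_sample(moving, sample_grid, align_corners=True)
```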

549 citations

Journal ArticleDOI
TL;DR: The authors propose VoxelMorph, a fast learning-based framework for deformable, pairwise medical image registration, which parameterizes the registration function via a convolutional neural network and optimizes the network's parameters on a set of images.
Abstract: We present VoxelMorph, a fast learning-based framework for deformable, pairwise medical image registration. Traditional registration methods optimize an objective function for each pair of images, which can be time-consuming for large datasets or rich deformation models. In contrast to this approach and building on recent learning-based methods, we formulate registration as a function that maps an input image pair to a deformation field that aligns these images. We parameterize the function via a convolutional neural network and optimize the parameters of the neural network on a set of images. Given a new pair of scans, VoxelMorph rapidly computes a deformation field by directly evaluating the function. In this paper, we explore two different training strategies. In the first (unsupervised) setting, we train the model to maximize standard image matching objective functions that are based on the image intensities. In the second setting, we leverage auxiliary segmentations available in the training data. We demonstrate that the unsupervised model’s accuracy is comparable to the state-of-the-art methods while operating orders of magnitude faster. We also show that VoxelMorph trained with auxiliary data improves registration accuracy at test time and evaluate the effect of training set size on registration. Our method promises to speed up medical image analysis and processing pipelines while facilitating novel directions in learning-based registration and its applications. Our code is freely available at https://github.com/voxelmorph/voxelmorph .
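In the second training setting, the auxiliary segmentations can enter the loss as a soft Dice overlap between the warped moving segmentation and the fixed segmentation. A minimal sketch follows, assuming one-hot label volumes and an illustrative weight; it is not taken from the paper's code.

```python
import torch

def soft_dice(warped_seg, fixed_seg, eps=1e-5):
    # One-hot segmentation volumes of shape (B, K, D, H, W); higher means better overlap.
    dims = (2, 3, 4)
    intersection = torch.sum(warped_seg * fixed_seg, dim=dims)
    union = torch.sum(warped_seg, dim=dims) + torch.sum(fixed_seg, dim=dims)
    return torch.mean((2.0 * intersection + eps) / (union + eps))

def loss_with_auxiliary_segmentations(unsupervised_term, warped_seg, fixed_seg, gamma=0.01):
    # Subtract the Dice overlap so that better anatomical agreement lowers the loss;
    # gamma is an illustrative weight, not a value from the paper.
    return unsupervised_term - gamma * soft_dice(warped_seg, fixed_seg)
```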

486 citations

Proceedings ArticleDOI
25 Feb 2019
TL;DR: This work learns a model of transformations from the images, and uses the model along with the labeled example to synthesize additional labeled examples, enabling the synthesis of complex effects such as variations in anatomy and image acquisition procedures.
Abstract: Image segmentation is an important task in many medical applications. Methods based on convolutional neural networks attain state-of-the-art accuracy; however, they typically rely on supervised training with large labeled datasets. Labeling medical images requires significant expertise and time, and typical hand-tuned approaches for data augmentation fail to capture the complex variations in such images. We present an automated data augmentation method for synthesizing labeled medical images. We demonstrate our method on the task of segmenting magnetic resonance imaging (MRI) brain scans. Our method requires only a single segmented scan, and leverages other unlabeled scans in a semi-supervised approach. We learn a model of transformations from the images, and use the model along with the labeled example to synthesize additional labeled examples. Each transformation is comprised of a spatial deformation field and an intensity change, enabling the synthesis of complex effects such as variations in anatomy and image acquisition procedures. We show that training a supervised segmenter with these new examples provides significant improvements over state-of-the-art methods for one-shot biomedical image segmentation.
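The augmentation step can be pictured as composing the two learned transform models with the single labeled atlas. In the sketch below, `spatial_model`, `intensity_model`, `warp`, and the interpolation keywords are assumed names for illustration, not the authors' API.

```python
def synthesize_example(atlas_image, atlas_labels, target_scan, style_scan,
                       spatial_model, intensity_model, warp):
    # Sample a spatial deformation that moves the atlas toward one unlabeled scan.
    flow = spatial_model(atlas_image, target_scan)
    # Sample an intensity change that matches the appearance of another unlabeled scan.
    restyled = intensity_model(atlas_image, style_scan)
    # Apply the same deformation to the image and (with nearest-neighbour interpolation)
    # to the label map, so the synthesized pair stays spatially consistent.
    new_image = warp(restyled, flow, mode="bilinear")
    new_labels = warp(atlas_labels, flow, mode="nearest")
    return new_image, new_labels
```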

298 citations

Proceedings ArticleDOI
20 Apr 2018
TL;DR: A modular generative neural network is proposed that synthesizes unseen human poses from training pairs of images and poses taken from human action videos; results are demonstrated on three action classes: golf, yoga/workouts, and tennis.
Abstract: We address the computational problem of novel human pose synthesis. Given an image of a person and a desired pose, we produce a depiction of that person in that pose, retaining the appearance of both the person and background. We present a modular generative neural network that synthesizes unseen poses using training pairs of images and poses taken from human action videos. Our network separates a scene into different body part and background layers, moves body parts to new locations and refines their appearances, and composites the new foreground with a hole-filled background. These subtasks, implemented with separate modules, are trained jointly using only a single target image as a supervised label. We use an adversarial discriminator to force our network to synthesize realistic details conditioned on pose. We demonstrate image synthesis results on three action classes: golf, yoga/workouts and tennis, and show that our method produces accurate results within action classes as well as across action classes. Given a sequence of desired poses, we also produce coherent videos of actions.
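The final compositing step, in which refined body-part layers are placed over the hole-filled background, can be sketched as simple alpha blending. Shapes and names below are illustrative assumptions, not the paper's exact network.

```python
def composite(foreground_layers, masks, background):
    # foreground_layers: list of refined body-part images, each (B, 3, H, W)
    # masks:             list of soft masks in [0, 1], each (B, 1, H, W)
    # background:        hole-filled background image, (B, 3, H, W)
    out = background
    for fg, mask in zip(foreground_layers, masks):
        # Alpha-blend each refined layer over the running composite.
        out = mask * fg + (1.0 - mask) * out
    return out
```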

263 citations


Cited by
Journal ArticleDOI
TL;DR: In this article, the authors provide a short overview of recent advances and some associated challenges in machine learning applied to medical image processing and image analysis, and provide a starting point for people interested in experimenting and perhaps contributing to the field of machine learning for medical imaging.
Abstract: What has happened in machine learning lately, and what does it mean for the future of medical image analysis? Machine learning has witnessed a tremendous amount of attention over the last few years. The current boom started around 2009, when so-called deep artificial neural networks began outperforming other established models on a number of important benchmarks. Deep neural networks are now the state-of-the-art machine learning models across a variety of areas, from image analysis to natural language processing, and are widely deployed in academia and industry. These developments have huge potential for medical imaging technology, medical data analysis, medical diagnostics, and healthcare in general, a potential that is slowly being realized. We provide a short overview of recent advances and some associated challenges in machine learning applied to medical image processing and image analysis. As this has become a very broad and fast-expanding field, we will not survey the entire landscape of applications, but put particular focus on deep learning in MRI. Our aim is threefold: (i) give a brief introduction to deep learning with pointers to core references; (ii) indicate how deep learning has been applied to the entire MRI processing chain, from acquisition to image retrieval, from segmentation to disease prediction; (iii) provide a starting point for people interested in experimenting and perhaps contributing to the field of machine learning for medical imaging by pointing out good educational resources, state-of-the-art open-source code, and interesting sources of data and problems related to medical imaging.

991 citations

Journal ArticleDOI
TL;DR: VoxelMorph's unsupervised model achieves accuracy comparable to state-of-the-art methods while operating orders of magnitude faster; the framework promises to speed up medical image analysis and processing pipelines while facilitating novel directions in learning-based registration and its applications.
Abstract: We present VoxelMorph, a fast learning-based framework for deformable, pairwise medical image registration. Traditional registration methods optimize an objective function for each pair of images, which can be time-consuming for large datasets or rich deformation models. In contrast to this approach, and building on recent learning-based methods, we formulate registration as a function that maps an input image pair to a deformation field that aligns these images. We parameterize the function via a convolutional neural network (CNN), and optimize the parameters of the neural network on a set of images. Given a new pair of scans, VoxelMorph rapidly computes a deformation field by directly evaluating the function. In this work, we explore two different training strategies. In the first (unsupervised) setting, we train the model to maximize standard image matching objective functions that are based on the image intensities. In the second setting, we leverage auxiliary segmentations available in the training data. We demonstrate that the unsupervised model's accuracy is comparable to state-of-the-art methods, while operating orders of magnitude faster. We also show that VoxelMorph trained with auxiliary data improves registration accuracy at test time, and evaluate the effect of training set size on registration. Our method promises to speed up medical image analysis and processing pipelines, while facilitating novel directions in learning-based registration and its applications. Our code is freely available at https://github.com/voxelmorph/voxelmorph.
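At test time, registration then reduces to a single forward pass with no per-pair optimization. A minimal sketch follows, where `model` and `warp` stand in for a trained registration network and a spatial transform layer; both are placeholders, not the released API.

```python
import torch

@torch.no_grad()
def register_pair(model, warp, moving, fixed):
    # One forward evaluation of the trained network yields the deformation field...
    flow = model(moving, fixed)
    # ...which the spatial transform layer applies to the moving image.
    warped = warp(moving, flow)
    return warped, flow
```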

860 citations

Proceedings ArticleDOI
01 Oct 2019
TL;DR: This paper presents a simple method for “do as I do” motion transfer: given a source video of a person dancing, it is shown that it can transfer that performance to a novel (amateur) target after only a few minutes of the target subject performing standard moves.
Abstract: This paper presents a simple method for “do as I do” motion transfer: given a source video of a person dancing, we can transfer that performance to a novel (amateur) target after only a few minutes of the target subject performing standard moves. We approach this problem as video-to-video translation using pose as an intermediate representation. To transfer the motion, we extract poses from the source subject and apply the learned pose-to-appearance mapping to generate the target subject. We predict two consecutive frames for temporally coherent video results and introduce a separate pipeline for realistic face synthesis. Although our method is quite simple, it produces surprisingly compelling results (see video). This motivates us to also provide a forensics tool for reliable synthetic content detection, which is able to distinguish videos synthesized by our system from real data. In addition, we release a first-of-its-kind open-source dataset of videos that can be legally used for training and motion transfer.
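The per-frame transfer loop can be sketched as below, with `pose_estimator` and `generator` standing in for trained components; the two-frame temporal prediction and the separate face-synthesis pipeline described in the abstract are omitted for brevity.

```python
def transfer_motion(source_frames, pose_estimator, generator):
    # For each source frame, extract the pose and render the target subject in that pose.
    outputs = []
    for frame in source_frames:
        pose = pose_estimator(frame)      # pose as the intermediate representation
        outputs.append(generator(pose))   # learned pose-to-appearance mapping for the target
    return outputs
```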

585 citations

Journal ArticleDOI
TL;DR: This article provides a detailed review of the surveyed solutions, summarizing both their technical novelties and empirical results, compares the benefits and requirements of the surveyed methodologies, and offers recommended solutions.

487 citations