
Showing papers on "Image registration" published in 2017


Book ChapterDOI
10 Sep 2017
TL;DR: Wang et al. as mentioned in this paper trained a fully convolutional network (FCN) to generate a CT image given the MR image, and applied the Auto-Context Model (ACM) to implement a context-aware generative adversarial network.
Abstract: Computed tomography (CT) is critical for various clinical applications, e.g., radiation treatment planning and PET attenuation correction in MRI/PET scanners. However, CT involves radiation exposure during acquisition, which may cause side effects in patients. Compared to CT, magnetic resonance imaging (MRI) is much safer and does not involve radiation. Therefore, researchers have recently been strongly motivated to estimate a CT image from the corresponding MR image of the same subject for radiation treatment planning. In this paper, we propose a data-driven approach to address this challenging problem. Specifically, we train a fully convolutional network (FCN) to generate CT given the MR image. To better model the nonlinear mapping from MRI to CT and produce more realistic images, we propose to use an adversarial training strategy to train the FCN. Moreover, we propose an image-gradient-difference-based loss function to alleviate the blurriness of the generated CT. We further apply the Auto-Context Model (ACM) to implement a context-aware generative adversarial network. Experimental results show that our method is accurate and robust for predicting CT images from MR images, and that it outperforms three state-of-the-art methods under comparison.

555 citations
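
The image-gradient-difference loss lends itself to a compact illustration. Below is a minimal NumPy sketch, assuming 2-D images and forward finite differences (the paper works with 3-D volumes, and its exact formulation may differ):

import numpy as np

def gradient_difference_loss(generated, target):
    # Penalize mismatch in gradient magnitudes to counteract the blurriness
    # that plain voxel-wise losses tend to produce in the generated CT.
    gx_gen, gy_gen = np.diff(generated, axis=0), np.diff(generated, axis=1)
    gx_tgt, gy_tgt = np.diff(target, axis=0), np.diff(target, axis=1)
    # Compare absolute gradients so edge strength matters more than sign.
    return (np.sum((np.abs(gx_gen) - np.abs(gx_tgt)) ** 2) +
            np.sum((np.abs(gy_gen) - np.abs(gy_tgt)) ** 2))

In training, a term like this would be added to the adversarial and reconstruction losses with a weighting factor.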


Journal ArticleDOI
TL;DR: The Therapy Physics Committee of the American Association of Physicists in Medicine commissioned Task Group 132 to review current approaches and solutions for image registration (both rigid and deformable) in radiotherapy and to provide recommendations for quality assurance and quality control of these clinical processes.
Abstract: Image registration and fusion algorithms exist in almost every software system that creates or uses images in radiotherapy. Most treatment planning systems support some form of image registration and fusion to allow the use of multimodality and time-series image data, and even anatomical atlases, to assist in target volume and normal tissue delineation. Treatment delivery systems perform registration and fusion between the planning images and the in-room images acquired during treatment to assist patient positioning. Advanced applications are beginning to support daily dose assessment and enable adaptive radiotherapy, using image registration and fusion to propagate contours and accumulate dose between image data taken over the course of therapy, to provide up-to-date estimates of anatomical changes and delivered dose. This information aids in the detection of anatomical and functional changes that might elicit changes in the treatment plan or prescription. As the output of the image registration process is always used as the input of another process for planning or delivery, it is important to understand and communicate the uncertainty associated with the software in general and with the result of a specific registration. Unfortunately, there is no standard mathematical formalism for doing this in real-world situations where noise, distortion, and complex anatomical variations can occur. Validation of the software systems' performance is also complicated by the lack of documentation available from commercial systems, leading to the use of these systems in an undesirable 'black-box' fashion. In view of this situation, and the central role that image registration and fusion play in treatment planning and delivery, the Therapy Physics Committee of the American Association of Physicists in Medicine commissioned Task Group 132 to review current approaches and solutions for image registration (both rigid and deformable) in radiotherapy and to provide recommendations for quality assurance and quality control of these clinical processes.

501 citations


Journal ArticleDOI
TL;DR: Quicksilver as mentioned in this paper predicts the momentum-parameterization of LDDMM, which facilitates a patch-wise prediction strategy while maintaining the theoretical properties of LDDMM, such as guaranteed diffeomorphic mappings for sufficiently strong regularization.

484 citations


Book ChapterDOI
14 Sep 2017
TL;DR: The results demonstrate that registration with DIRNet is as accurate as a conventional deformable image registration method with short execution times.
Abstract: In this work we propose a deep learning network for deformable image registration (DIRNet). The DIRNet consists of a convolutional neural network (ConvNet) regressor, a spatial transformer, and a resampler. The ConvNet analyzes a pair of fixed and moving images and outputs parameters for the spatial transformer, which generates the displacement vector field that enables the resampler to warp the moving image to the fixed image. The DIRNet is trained end-to-end by unsupervised optimization of a similarity metric between input image pairs. A trained DIRNet can be applied to perform registration on unseen image pairs in one pass, thus non-iteratively. Evaluation was performed with registration of images of handwritten digits (MNIST) and cardiac cine MR scans (Sunnybrook Cardiac Data). The results demonstrate that registration with DIRNet is as accurate as a conventional deformable image registration method with short execution times.

372 citations
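
The DIRNet pipeline (ConvNet regressor, spatial transformer, resampler) can be sketched compactly in PyTorch. The layer sizes, the dense per-pixel field, and the MSE similarity below are illustrative assumptions, not the authors' exact design:

import torch
import torch.nn.functional as F

class TinyDIRNet(torch.nn.Module):
    # Stand-in for the ConvNet regressor: maps a (fixed, moving) pair to a
    # 2-channel displacement field (dx, dy) in normalized [-1, 1] coordinates.
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Conv2d(2, 16, 3, padding=1), torch.nn.ReLU(),
            torch.nn.Conv2d(16, 2, 3, padding=1))

    def forward(self, fixed, moving):
        dvf = self.net(torch.cat([fixed, moving], dim=1))
        n, _, h, w = fixed.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                                torch.linspace(-1, 1, w), indexing="ij")
        identity = torch.stack([xs, ys], dim=-1).expand(n, h, w, 2)
        grid = identity + dvf.permute(0, 2, 3, 1)                  # spatial transformer
        warped = F.grid_sample(moving, grid, align_corners=True)   # resampler
        return warped, dvf

# Unsupervised training step: no ground-truth deformations are needed.
model = TinyDIRNet()
fixed, moving = torch.rand(1, 1, 28, 28), torch.rand(1, 1, 28, 28)
warped, _ = model(fixed, moving)
F.mse_loss(warped, fixed).backward()   # the paper optimizes a similarity metric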


Journal ArticleDOI
TL;DR: The proposed algorithm for fast Non-Rigid Motion Correction (NoRMCorre), based on template matching, can be useful for solving large-scale image registration problems in calcium imaging, especially in the presence of non-rigid deformations.

351 citations
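
The core operation in template-matching motion correction is estimating the shift of each frame (or patch) against a template; here is a minimal sketch using FFT-based cross-correlation, assuming integer-pixel shifts (NoRMCorre itself tiles the field of view and refines patch shifts to subpixel precision):

import numpy as np

def estimate_shift(frame, template):
    # Cross-correlation via the FFT; the peak location gives the displacement.
    xcorr = np.fft.ifft2(np.fft.fft2(frame) * np.conj(np.fft.fft2(template)))
    peak = np.unravel_index(np.argmax(np.abs(xcorr)), xcorr.shape)
    # Wrap peaks in the upper half-range around to negative shifts.
    return tuple(p if p <= s // 2 else p - s for p, s in zip(peak, xcorr.shape))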


Proceedings ArticleDOI
21 Jul 2017
TL;DR: This work proposes a new mesh registration method that uses both 3D geometry and texture information to register all scans in a sequence to a common reference topology, and shows how using geometry alone results in significant errors in alignment when the motions are fast and non-rigid.
Abstract: While the ready availability of 3D scan data has influenced research throughout computer vision, less attention has focused on 4D data, that is 3D scans of moving non-rigid objects, captured over time. To be useful for vision research, such 4D scans need to be registered, or aligned, to a common topology. Consequently, extending mesh registration methods to 4D is important. Unfortunately, no ground-truth datasets are available for quantitative evaluation and comparison of 4D registration methods. To address this we create a novel dataset of high-resolution 4D scans of human subjects in motion, captured at 60 fps. We propose a new mesh registration method that uses both 3D geometry and texture information to register all scans in a sequence to a common reference topology. The approach exploits consistency in texture over both short and long time intervals and deals with temporal offsets between shape and texture capture. We show how using geometry alone results in significant errors in alignment when the motions are fast and non-rigid. We evaluate the accuracy of our registration and provide a dataset of 40,000 raw and aligned meshes. Dynamic FAUST extends the popular FAUST dataset to dynamic 4D data, and is available for research purposes at http://dfaust.is.tue.mpg.de.

330 citations


Book ChapterDOI
10 Sep 2017
TL;DR: The proposed RegNet is trained using a large set of artificially generated DVFs, which greatly simplifies the training problem; it does not explicitly define a dissimilarity metric and integrates image content at multiple scales to equip the network with contextual information.
Abstract: In this paper we propose a method to solve nonrigid image registration through a learning approach, instead of via iterative optimization of a predefined dissimilarity metric. We design a Convolutional Neural Network (CNN) architecture that, in contrast to all other work, directly estimates the displacement vector field (DVF) from a pair of input images. The proposed RegNet is trained using a large set of artificially generated DVFs, does not explicitly define a dissimilarity metric, and integrates image content at multiple scales to equip the network with contextual information. At testing time, nonrigid registration is performed in a single shot, in contrast to current iterative methods. We tested RegNet on 3D chest CT follow-up data. The results show that the accuracy of RegNet is on par with a conventional B-spline registration for anatomy within the capture range. Training RegNet with artificially generated DVFs is therefore a promising approach for obtaining good results on real clinical data, thereby greatly simplifying the training problem. Deformable image registration can thus be successfully cast as a learning problem.

324 citations
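
Generating artificial DVFs for training can be done in a few lines. Below is one common recipe (Gaussian-smoothed random fields); it is an assumption about the flavor of deformation used, not RegNet's exact generator:

import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def random_dvf_pair(image, sigma=8.0, amplitude=4.0, seed=0):
    # Smooth random displacement per axis, scaled to roughly `amplitude` pixels.
    rng = np.random.default_rng(seed)
    dvf = np.stack([gaussian_filter(rng.standard_normal(image.shape), sigma)
                    for _ in range(image.ndim)])
    dvf *= amplitude / (np.abs(dvf).max() + 1e-8)
    coords = np.indices(image.shape) + dvf    # identity grid + displacement
    return map_coordinates(image, coords, order=1), dvf

Each (deformed image, DVF) pair then serves as a training example with a known ground-truth field.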


Book ChapterDOI
10 Sep 2017
TL;DR: An innovative approach for registration based on the deterministic prediction of the parameters from both images, instead of the optimization of an energy criterion, is proposed and shows an important improvement over a state-of-the-art optimization-based algorithm.
Abstract: In this paper, we propose an innovative approach for registration based on the deterministic prediction of the parameters from both images instead of the optimization of an energy criterion. The method relies on a fully convolutional network whose architecture consists of contracting layers to detect relevant features and a symmetric expanding path that matches them together and outputs the transformation parametrization. Whereas convolutional networks have seen widespread adoption and have already been applied to many medical imaging problems such as segmentation and classification, their application to registration has so far faced the challenge of defining ground-truth data on which to train the algorithm. Here, we present a novel training strategy to build reference deformations which relies on the registration of segmented regions of interest. We apply this methodology to the problem of inter-patient heart registration and show an important improvement over a state-of-the-art optimization-based algorithm. Not only is our method more accurate, but it is also faster (registering two 3D images takes less than 30 ms on a GPU) and more robust to outliers.

302 citations


Book ChapterDOI
TL;DR: DIRNet as discussed by the authors consists of a convolutional neural network (ConvNet) regressor, a spatial transformer, and a resampler, which analyzes a pair of fixed and moving images and outputs parameters for the spatial transformer.
Abstract: In this work we propose a deep learning network for deformable image registration (DIRNet). The DIRNet consists of a convolutional neural network (ConvNet) regressor, a spatial transformer, and a resampler. The ConvNet analyzes a pair of fixed and moving images and outputs parameters for the spatial transformer, which generates the displacement vector field that enables the resampler to warp the moving image to the fixed image. The DIRNet is trained end-to-end by unsupervised optimization of a similarity metric between input image pairs. A trained DIRNet can be applied to perform registration on unseen image pairs in one pass, thus non-iteratively. Evaluation was performed with registration of images of handwritten digits (MNIST) and cardiac cine MR scans (Sunnybrook Cardiac Data). The results demonstrate that registration with DIRNet is as accurate as a conventional deformable image registration method with substantially shorter execution times.

249 citations


Journal ArticleDOI
Wenping Ma, Wen Zelian, Yue Wu, Licheng Jiao, Maoguo Gong, Yafei Zheng, Liang Liu
TL;DR: A new gradient definition is introduced to overcome the intensity differences between remote sensing image pairs, and an enhanced feature matching method combining the position, scale, and orientation of each keypoint is introduced to increase the number of correct correspondences.
Abstract: The scale-invariant feature transform algorithm and its many variants are widely used in feature-based remote sensing image registration. However, it may be difficult to find enough correct correspondences for remote sensing image pairs in some cases that exhibit a significant difference in intensity mapping. In this letter, a new gradient definition is introduced to overcome the intensity differences between the image pairs. Then, an enhanced feature matching method that combines the position, scale, and orientation of each keypoint is introduced to increase the number of correct correspondences. The proposed algorithm is tested on multispectral and multisensor remote sensing images. The experimental results show that the proposed method improves the matching performance compared with several state-of-the-art methods in terms of the number of correct correspondences and aligning accuracy.

243 citations
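
The "position, scale, and orientation" idea can be illustrated by filtering putative matches for global consistency of scale ratio and orientation difference. This is a sketch of the general principle, assuming OpenCV-style keypoints with .size and .angle attributes, not the authors' exact criterion:

import numpy as np

def filter_matches(kp1, kp2, matches, scale_tol=0.2, angle_tol=10.0):
    # Keep matches whose scale ratio and orientation difference agree with
    # the dominant (median) values over all putative matches.
    ratios = np.array([kp2[j].size / kp1[i].size for i, j in matches])
    angles = np.array([(kp2[j].angle - kp1[i].angle) % 360 for i, j in matches])
    s0, a0 = np.median(ratios), np.median(angles)
    return [m for m, r, a in zip(matches, ratios, angles)
            if abs(r - s0) < scale_tol * s0
            and min(abs(a - a0), 360 - abs(a - a0)) < angle_tol]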


Journal ArticleDOI
TL;DR: The results show that HOPCncc is robust against complex nonlinear radiometric differences and outperforms state-of-the-art similarity metrics (i.e., NCC and mutual information) in matching performance.
Abstract: Automatic registration of multimodal remote sensing data [e.g., optical, light detection and ranging (LiDAR), and synthetic aperture radar (SAR)] is a challenging task due to the significant nonlinear radiometric differences between these data. To address this problem, this paper proposes a novel feature descriptor named the histogram of orientated phase congruency (HOPC), which is based on the structural properties of images. Furthermore, a similarity metric named HOPCncc is defined, which uses the normalized correlation coefficient (NCC) of the HOPC descriptors for multimodal registration. In the definition of the proposed similarity metric, we first extend the phase congruency model to generate its orientation representation and use the extended model to build HOPCncc. Then, a fast template matching scheme for this metric is designed to detect the control points between images. The proposed HOPCncc aims to capture the structural similarity between images and has been tested with a variety of optical, LiDAR, SAR, and map data. The results show that HOPCncc is robust against complex nonlinear radiometric differences and outperforms state-of-the-art similarity metrics (i.e., NCC and mutual information) in matching performance. Moreover, a robust registration method is also proposed in this paper based on HOPCncc, which is evaluated using six pairs of multimodal remote sensing images. The experimental results demonstrate the effectiveness of the proposed method for multimodal image registration.
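
While the HOPC descriptor itself is the substantial contribution, the similarity metric built on it is the plain normalized correlation coefficient; a minimal sketch of NCC over two descriptor (or patch) vectors:

import numpy as np

def ncc(a, b):
    # Normalized correlation coefficient in [-1, 1]; invariant to linear
    # intensity changes, which is why it pairs well with structural descriptors.
    a, b = a.ravel() - a.mean(), b.ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

HOPCncc evaluates this correlation on HOPC descriptors over a template window, sliding it across the search region to detect control points.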

Book ChapterDOI
10 Sep 2017
TL;DR: This paper investigates how DL could help organ-specific (ROI-specific) deformable registration, for instance to solve motion compensation or atlas-based segmentation problems in prostate diagnosis, and presents a training scheme with a large number of synthetically deformed image pairs requiring only a small number of real inter-subject pairs.
Abstract: Robust image registration in medical imaging is essential for the comparison or fusion of images acquired from various perspectives, modalities, or at different times. Typically, an objective function needs to be minimized, assuming specific a priori deformation models and predefined or learned similarity measures. However, these approaches have difficulty coping with large deformations or a large variability in appearance. Using modern deep learning (DL) methods with automated feature design, these limitations could be resolved by learning the intrinsic mapping solely from experience. We investigate in this paper how DL could help organ-specific (ROI-specific) deformable registration, to solve motion compensation or atlas-based segmentation problems, for instance in prostate diagnosis. An artificial agent is trained to solve the task of non-rigid registration by exploring the parametric space of a statistical deformation model built from training data. Since it is difficult to extract trustworthy ground-truth deformation fields, we present a training scheme with a large number of synthetically deformed image pairs requiring only a small number of real inter-subject pairs. Our approach was tested on inter-subject registration of prostate MR data and reached a median DICE score of 0.88 in 2-D and 0.76 in 3-D, thereby showing improved results compared to state-of-the-art registration algorithms.
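
A statistical deformation model of the kind the agent explores can be built by PCA over training deformation fields. The sketch below is an assumed construction for illustration; the paper's exact model may differ:

import numpy as np

class StatisticalDeformationModel:
    # PCA over flattened training DVFs; a deformation is mean + coeffs @ modes.
    def __init__(self, training_dvfs, n_modes=10):
        X = np.stack([d.ravel() for d in training_dvfs])
        self.mean = X.mean(axis=0)
        _, s, vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.modes = vt[:n_modes]
        self.scales = s[:n_modes] / np.sqrt(len(X))
        self.shape = training_dvfs[0].shape

    def decode(self, coeffs):
        # The agent's actions adjust `coeffs`, one low-dimensional knob each.
        return (self.mean + (coeffs * self.scales) @ self.modes).reshape(self.shape)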

Proceedings ArticleDOI
01 Oct 2017
TL;DR: An algorithm for aligning two colored point clouds is presented to optimize a joint photometric and geometric objective that locks the alignment along both the normal direction and the tangent plane.
Abstract: We present an algorithm for aligning two colored point clouds. The key idea is to optimize a joint photometric and geometric objective that locks the alignment along both the normal direction and the tangent plane. We extend a photometric objective for aligning RGB-D images to point clouds, by locally parameterizing the point cloud with a virtual camera. Experiments demonstrate that our algorithm is more accurate and more robust than prior point cloud registration algorithms, including those that utilize color information. We use the presented algorithms to enhance a state-of-the-art scene reconstruction system. The precision of the resulting system is demonstrated on real-world scenes with accurate ground-truth models.
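
The joint objective can be summarized by the residual it assigns to a single correspondence; a sketch with assumed names and weighting (the paper's formulation, including the local color parameterization via a virtual camera, is more involved):

import numpy as np

def joint_residual(p, q, n_q, c_p, color_at, sigma=0.9):
    # Geometric part: point-to-plane distance, locks motion along the normal.
    r_geo = np.dot(p - q, n_q)
    # Photometric part: color mismatch on q's tangent plane, locks sliding
    # motion within the plane; `color_at` is a locally fitted color function.
    p_proj = p - r_geo * n_q
    r_photo = c_p - color_at(p_proj)
    return sigma * r_geo ** 2 + (1.0 - sigma) * r_photo ** 2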

Proceedings ArticleDOI
21 Jul 2017
TL;DR: An algorithm for registration between a large-scale point cloud and a close-proximity scanned point cloud is presented, providing a localization solution that is fully independent of prior information about the initial positions of the two point cloud coordinate systems.
Abstract: We present an algorithm for registration between a large-scale point cloud and a close-proximity scanned point cloud, providing a localization solution that is fully independent of prior information about the initial positions of the two point cloud coordinate systems. The algorithm, denoted LORAX, selects super-points (local subsets of points) and describes the geometric structure of each with a low-dimensional descriptor. These descriptors are then used to infer potential matching regions for an efficient coarse registration process, followed by a fine-tuning stage. The set of super-points is selected by covering the point clouds with overlapping spheres, and then filtering out those of low-quality or nonsalient regions. The descriptors are computed using state-of-the-art unsupervised machine learning, utilizing the technology of deep neural network based auto-encoders. This novel framework provides a strong alternative to the common practice of using manually designed key-point descriptors for coarse point cloud registration. Utilizing super-points instead of key-points allows the available geometrical data to be better exploited to find the correct transformation. Encoding local 3D geometric structures using a deep neural network auto-encoder instead of traditional descriptors continues the trend seen in other computer vision applications and indeed leads to superior results. The algorithm is tested on challenging point cloud registration datasets, and its advantages over previous approaches as well as its robustness to density changes, noise, and missing data are shown.

Book ChapterDOI
10 Sep 2017
TL;DR: A convolutional neural network (CNN) based regression model to directly learn the complex mapping from the input image pair to their corresponding deformation field, and it is found that the trained CNN model from one dataset can be successfully transferred to another dataset, although brain appearances across datasets are quite variable.
Abstract: Existing deformable registration methods require exhaustively iterative optimization, along with careful parameter tuning, to estimate the deformation field between images. Although some learning-based methods have been proposed for initiating deformation estimation, they are often template-specific and not flexible in practical use. In this paper, we propose a convolutional neural network (CNN) based regression model to directly learn the complex mapping from the input image pair (i.e., a pair of template and subject) to their corresponding deformation field. Specifically, our CNN architecture is designed in a patch-based manner to learn the complex mapping from the input patch pairs to their respective deformation field. First, the equalized active-points guided sampling strategy is introduced to facilitate accurate CNN model learning upon a limited image dataset. Then, the similarity-steered CNN architecture is designed, where we propose to add the auxiliary contextual cue, i.e., the similarity between input patches, to more directly guide the learning process. Experiments on different brain image datasets demonstrate promising registration performance based on our CNN model. Furthermore, it is found that the trained CNN model from one dataset can be successfully transferred to another dataset, although brain appearances across datasets are quite variable.

Journal ArticleDOI
TL;DR: This paper presents a technique based on an auto-context convolutional neural network (CNN), in which intrinsic local and global image features are learned through 2-D patches of different window sizes, and evaluates the performance of the algorithm in the challenging problem of extracting arbitrarily oriented fetal brains in reconstructed fetal brain magnetic resonance imaging (MRI) data sets.
Abstract: Brain extraction or whole brain segmentation is an important first step in many of the neuroimage analysis pipelines. The accuracy and the robustness of brain extraction, therefore, are crucial for the accuracy of the entire brain analysis process. The state-of-the-art brain extraction techniques rely heavily on the accuracy of alignment or registration between brain atlases and query brain anatomy, and/or make assumptions about the image geometry, and therefore have limited success when these assumptions do not hold or image registration fails. With the aim of designing an accurate, learning-based, geometry-independent, and registration-free brain extraction tool, in this paper, we present a technique based on an auto-context convolutional neural network (CNN), in which intrinsic local and global image features are learned through 2-D patches of different window sizes. We consider two different architectures: 1) a voxelwise approach based on three parallel 2-D convolutional pathways for three different directions (axial, coronal, and sagittal) that implicitly learn 3-D image information without the need for computationally expensive 3-D convolutions and 2) a fully convolutional network based on the U-net architecture. Posterior probability maps generated by the networks are used iteratively as context information along with the original image patches to learn the local shape and connectedness of the brain to extract it from non-brain tissue. The brain extraction results we have obtained from our CNNs are superior to the recently reported results in the literature on two publicly available benchmark data sets, namely, LPBA40 and OASIS, in which we obtained the Dice overlap coefficients of 97.73% and 97.62%, respectively. Significant improvement was achieved via our auto-context algorithm. Furthermore, we evaluated the performance of our algorithm in the challenging problem of extracting arbitrarily oriented fetal brains in reconstructed fetal brain magnetic resonance imaging (MRI) data sets. In this application, our voxelwise auto-context CNN performed much better than the other methods (Dice coefficient: 95.97%), where the other methods performed poorly due to the non-standard orientation and geometry of the fetal brain in MRI. Through training, our method can provide accurate brain extraction in challenging applications. This, in turn, may reduce the problems associated with image registration in segmentation tasks.
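
The auto-context mechanism is a simple cascade: each stage receives the image together with the previous stage's posterior map. A minimal sketch, with the model functions as placeholders:

import numpy as np

def auto_context_predict(image, models):
    # `models[k](x)` maps a (channels, H, W) array to an (H, W) posterior map.
    # The first stage sees a flat 0.5 prior as its context channel.
    posterior = np.full(image.shape, 0.5)
    for model in models:
        x = np.stack([image, posterior])   # image + evolving context channel
        posterior = model(x)
    return posterior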

Journal ArticleDOI
TL;DR: Isotropic Total Variation (TV) regularization is used to enable accurate registration near sliding interfaces in breathing-motion databases; the method is robust to parameter selection, allowing the use of the same parameters for all tested databases.
Abstract: Spatial regularization is essential in image registration, which is an ill-posed problem. Regularization can help to avoid both physically implausible displacement fields and local minima during optimization. Tikhonov regularization (squared l2 -norm) is unable to correctly represent non-smooth displacement fields, that can, for example, occur at sliding interfaces in the thorax and abdomen in image time-series during respiration. In this paper, isotropic Total Variation (TV) regularization is used to enable accurate registration near such interfaces. We further develop the TV-regularization for parametric displacement fields and provide an efficient numerical solution scheme using the Alternating Directions Method of Multipliers (ADMM). The proposed method was successfully applied to four clinical databases which capture breathing motion, including CT lung and MR liver images. It provided accurate registration results for the whole volume. A key strength of our proposed method is that it does not depend on organ masks that are conventionally required by many algorithms to avoid errors at sliding interfaces. Furthermore, our method is robust to parameter selection, allowing the use of the same parameters for all tested databases. The average target registration error (TRE) of our method is superior (10% to 40%) to other techniques in the literature. It provides precise motion quantification and sliding detection with sub-pixel accuracy on the publicly available breathing motion databases (mean TREs of 0.95 mm for DIR 4D CT, 0.96 mm for DIR COPDgene, 0.91 mm for POPI databases).
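
The regularizer at the heart of the method is easy to state; a NumPy sketch of isotropic TV for a 2-D displacement field (the paper additionally handles the parametric B-spline representation and solves the resulting problem with ADMM):

import numpy as np

def isotropic_tv(dvf, eps=1e-8):
    # `dvf` has shape (2, H, W). Sum over pixels of the Euclidean norm of all
    # first derivatives of both components: small for smooth fields, yet it
    # does not blow up at sliding discontinuities the way squared (Tikhonov)
    # penalties do.
    grads = [np.diff(c, axis=ax, append=c.take([-1], axis=ax))
             for c in dvf for ax in (0, 1)]
    return np.sum(np.sqrt(sum(g ** 2 for g in grads) + eps))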

Journal ArticleDOI
TL;DR: This paper introduces the first comprehensive survey of the literature on slice-to-volume registration, presenting a categorical study of the algorithms according to an ad-hoc taxonomy and analyzing the advantages and disadvantages of each category.

Posted Content
TL;DR: This paper introduces Quicksilver, a fast deformable image registration method that accurately predicts registrations obtained by numerical optimization, is very fast, achieves state-of-the-art registration results on four standard validation datasets, and can jointly learn an image similarity measure.
Abstract: This paper introduces Quicksilver, a fast deformable image registration method. Quicksilver registration for image-pairs works by patch-wise prediction of a deformation model based directly on image appearance. A deep encoder-decoder network is used as the prediction model. While the prediction strategy is general, we focus on predictions for the Large Deformation Diffeomorphic Metric Mapping (LDDMM) model. Specifically, we predict the momentum-parameterization of LDDMM, which facilitates a patch-wise prediction strategy while maintaining the theoretical properties of LDDMM, such as guaranteed diffeomorphic mappings for sufficiently strong regularization. We also provide a probabilistic version of our prediction network which can be sampled during the testing time to calculate uncertainties in the predicted deformations. Finally, we introduce a new correction network which greatly increases the prediction accuracy of an already existing prediction network. We show experimental results for uni-modal atlas-to-image as well as uni- / multi- modal image-to-image registrations. These experiments demonstrate that our method accurately predicts registrations obtained by numerical optimization, is very fast, achieves state-of-the-art registration results on four standard validation datasets, and can jointly learn an image similarity measure. Quicksilver is freely available as an open-source software.
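
For reference, the momentum parameterization being predicted comes from standard LDDMM shooting, schematically (with K the smoothing kernel of the regularizer and Sim the image similarity; details such as the exact similarity term follow the paper):

E(m_0) = \langle m_0, K m_0 \rangle + \frac{1}{\sigma^2}\,\mathrm{Sim}\big(I_0 \circ \Phi_1^{-1}, I_1\big),
\qquad v_t = K m_t, \quad \dot{\Phi}_t = v_t(\Phi_t), \quad \Phi_0 = \mathrm{id},

where m_t evolves according to the EPDiff equation. Predicting m_0 patch-wise is what preserves the diffeomorphic guarantees: any sufficiently regularized momentum yields a valid deformation.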

Journal ArticleDOI
TL;DR: A multi-viewpoint remote sensing image registration method is proposed, in which a geometric constraint term is introduced into the L2E-based energy function to keep the non-rigid transformation well behaved; the method is compared with five state-of-the-art methods.
Abstract: Remote sensing image registration plays an important role in military and civilian fields, such as natural disaster damage assessment, military damage assessment, and ground target identification. However, due to ground relief variations and imaging viewpoint changes, non-rigid geometric distortion occurs between remote sensing images with different viewpoints, which further increases the difficulty of remote sensing image registration. To address this problem, we propose a multi-viewpoint remote sensing image registration method with the following contributions. (i) A multiple-feature-based finite mixture model is constructed to deal with different types of image features. (ii) Three features are combined and substituted into the mixture model to form a feature complementation: the Euclidean distance and shape context are used to measure the similarity of geometric structure, and the SIFT (scale-invariant feature transform) distance, which carries the intensity information, is used to measure the scale-space extrema. (iii) To prevent the problem from being ill-posed, a geometric constraint term is introduced into the L2E-based energy function to keep the non-rigid transformation well behaved. We evaluated the performance of the proposed method on three series of remote sensing images obtained from an unmanned aerial vehicle (UAV) and Google Earth, and compared it with five state-of-the-art methods; our method shows the best alignment in most cases.

Journal ArticleDOI
TL;DR: Clinical applications, validation, and algorithms of DIR techniques are discussed; the applications include dose accumulation, mathematical modeling, automatic segmentation, and functional imaging, and DIR algorithms are reviewed with respect to two algorithmic components: the similarity index and the deformation model.
Abstract: The number of imaging data sets has significantly increased during radiation treatment after the introduction of a diverse range of advanced techniques into the field of radiation oncology. As a consequence, many studies have proposed meaningful applications of imaging data sets. These applications commonly require a method to align the data sets at a reference. Deformable image registration (DIR) is a process which satisfies this requirement by locally registering image data sets onto a reference image set. DIR identifies the spatial correspondence in order to minimize the differences between two or among multiple sets of images. This article describes clinical applications, validation, and algorithms of DIR techniques. Applications of DIR in radiation treatment include dose accumulation, mathematical modeling, automatic segmentation, and functional imaging. The validation methods discussed are based on anatomical landmarks, physical phantoms, digital phantoms, and the purpose of each application. DIR algorithms are also briefly reviewed with respect to two algorithmic components: the similarity index and the deformation model.

Posted Content
TL;DR: A novel non-rigid image registration algorithm that is built upon fully convolutional networks (FCNs) to optimize and learn spatial transformations between pairs of images to be registered that has been evaluated for registering 3D structural brain magnetic resonance (MR) images and obtained better performance than state-of-the-art image registration algorithms.
Abstract: We propose a novel non-rigid image registration algorithm that is built upon fully convolutional networks (FCNs) to optimize and learn spatial transformations between pairs of images to be registered. Different from most existing deep learning based image registration methods that learn spatial transformations from training data with known corresponding spatial transformations, our method directly estimates spatial transformations between pairs of images by maximizing an image-wise similarity metric between fixed and deformed moving images, similar to conventional image registration algorithms. At the same time, our method learns FCNs that encode the spatial transformations at the same spatial resolution as the images to be registered, rather than learning coarse-grained spatial transformation information. The image registration is implemented in a multi-resolution framework to jointly optimize and learn spatial transformations and FCNs at different resolutions, with deep self-supervision through typical feedforward and backpropagation computation. Since our method simultaneously optimizes and learns spatial transformations for image registration, it can be directly used to register a pair of images; moreover, the registration of a set of images is itself a training procedure for the FCNs, so the trained FCNs can be directly adopted to register new images by feedforward computation alone, without any optimization. The proposed method has been evaluated for registering 3D structural brain magnetic resonance (MR) images and obtained better performance than state-of-the-art image registration algorithms.

Patent
04 May 2017
TL;DR: In this paper, an intelligent artificial agent based registration method is described, in which a current state observation of an artificial agent is determined based on the medical images to be registered and current transformation parameters.
Abstract: Methods and systems for image registration using an intelligent artificial agent are disclosed. In an intelligent artificial agent based registration method, a current state observation of an artificial agent is determined based on the medical images to be registered and current transformation parameters. Action-values are calculated for a plurality of actions available to the artificial agent based on the current state observation using a machine learning based model, such as a trained deep neural network (DNN). The actions correspond to predetermined adjustments of the transformation parameters. An action having a highest action-value is selected from the plurality of actions and the transformation parameters are adjusted by the predetermined adjustment corresponding to the selected action. The determining, calculating, and selecting steps are repeated for a plurality of iterations, and the medical images are registered using final transformation parameters resulting from the plurality of iterations.
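
The claimed registration loop is essentially greedy selection over action-values; a minimal sketch, with the network and action set as assumed placeholders:

import numpy as np

def register(fixed, moving, q_network, actions, n_iters=200):
    # `q_network(fixed, moving, params)` returns one action-value per action;
    # `actions` are predetermined parameter adjustments, e.g. +/-1 mm or degree.
    params = np.zeros(len(actions[0]))   # e.g. rigid: (tx, ty, tz, rx, ry, rz)
    for _ in range(n_iters):
        q = q_network(fixed, moving, params)   # current state observation
        params = params + actions[int(np.argmax(q))]
    return params   # final transformation parameters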

Book ChapterDOI
10 Sep 2017
TL;DR: This paper presents a novel approach for learning highly expressive appearance models from few training samples, and shows that this approach can be used to synthesize huge amounts of realistic ground truth training data for CNN-based medical image registration.
Abstract: Convolutional neural networks (CNNs) have been successfully used for fast and accurate estimation of dense correspondences between images in computer vision applications. However, much of their success is based on the availability of large training datasets with dense ground truth correspondences, which are only rarely available in medical applications. In this paper, we, therefore, address the problem of CNNs learning from few training data for medical image registration. Our contributions are threefold: (1) We present a novel approach for learning highly expressive appearance models from few training samples, (2) we show that this approach can be used to synthesize huge amounts of realistic ground truth training data for CNN-based medical image registration, and (3) we adapt the FlowNet architecture for CNN-based optical flow estimation to the medical image registration problem. This pipeline is applied to two medical data sets with less than 40 training images. We show that CNNs learned from the proposed generative model outperform those trained on random deformations or displacement fields estimated via classical image registration.

Journal ArticleDOI
07 Jul 2017
TL;DR: A dataset of retinal image pairs annotated with ground truth and an evaluation protocol for registration methods are proposed, to enable quantitative and comparative evaluation of retinal registration methods under a variety of conditions and to help select the registration method that is most appropriate for a specific target use.
Abstract: Purpose: Retinal image registration is a useful tool for medical professionals. However, the performance of registration methods has not been consistently assessed in the literature. To address this, a dataset of retinal image pairs annotated with ground truth and an evaluation protocol for registration methods are proposed. Methods: The dataset comprises 134 retinal fundus image pairs. These pairs are classified into three categories according to characteristics that are relevant to indicative registration applications, namely the degree of overlap between images and the presence or absence of anatomical differences. Ground truth in the form of corresponding image points and a protocol to evaluate registration performance are provided. Results: The proposed protocol is shown to enable quantitative and comparative evaluation of retinal registration methods under a variety of conditions. Conclusion: This work enables the fair comparison of retinal registration methods. It also helps researchers to select the registration method that is most appropriate for a specific target use.
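
Evaluation against ground-truth corresponding points typically reduces to a target registration error over the annotated landmarks; a minimal sketch of the computation (the protocol's exact statistics and thresholds are defined in the paper):

import numpy as np

def target_registration_error(points_fixed, points_moving, transform):
    # Mean distance between fixed landmarks and the transformed moving ones.
    mapped = np.array([transform(p) for p in points_moving])
    return float(np.linalg.norm(mapped - np.asarray(points_fixed), axis=1).mean())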

Journal ArticleDOI
TL;DR: Oral and maxillofacial surgery has not benefited from image-guidance techniques, owing to limitations in image registration.
Abstract: Background: Oral and maxillofacial surgery has not benefited from image-guidance techniques, owing to limitations in image registration. Methods: A real-time markerless image registration method is proposed by integrating a shape-matching method into a 2D tracking framework. The registration is performed by matching the patient's teeth model with intraoperative video to obtain its pose. The resulting pose is used to overlay relevant models from the same CT space on the camera video for augmented reality. Results: The proposed system was evaluated on mandible/maxilla phantoms, a volunteer, and clinical data. Experimental results show that the target overlay error is about 1 mm, and that the registration updates at 3–5 frames per second with a 4K camera. Conclusions: The significance of this work lies in its simplicity in the clinical setting and its seamless integration into the current medical procedure with satisfactory response time and overlay accuracy.

Journal ArticleDOI
TL;DR: Imaging is a crucial tool in medicine and biomedical research and is routinely used not only to diagnose disease but also to plan and guide surgical interventions, track disease progression, measure the response of the body to treatment, and understand how genetic and environmental factors relate to anatomical and functional phenotypes.
Abstract: Imaging is a crucial tool in medicine and biomedical research. Magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), and ultrasound are routinely used not only to diagnose disease but also to plan and guide surgical interventions, track disease progression, measure the response of the body to treatment, and understand how genetic and environmental factors relate to anatomical and functional phenotypes.

Journal ArticleDOI
TL;DR: A deep learning-based approach for the geo-localization accuracy improvement of optical satellite images through SAR reference data is investigated and it is confirmed that accurate and reliable matching points can be generated with higher matching accuracy and precision with respect to state-of-the-art approaches.
Abstract: Improving the geo-localization of optical satellite images is an important pre-processing step for many remote sensing tasks like monitoring by image time series or scene analysis after sudden events. These tasks require geo-referenced and precisely co-registered multi-sensor data. Images captured by the high resolution synthetic aperture radar (SAR) satellite TerraSAR-X exhibit an absolute geo-location accuracy within a few decimeters. These images represent therefore a reliable source to improve the geo-location accuracy of optical images, which is in the order of tens of meters. In this paper, a deep learning-based approach for the geo-localization accuracy improvement of optical satellite images through SAR reference data is investigated. Image registration between SAR and optical images requires few, but accurate and reliable matching points. These are derived from a Siamese neural network. The network is trained using TerraSAR-X and PRISM image pairs covering greater urban areas spread over Europe, in order to learn the two-dimensional spatial shifts between optical and SAR image patches. Results confirm that accurate and reliable matching points can be generated with higher matching accuracy and precision with respect to state-of-the-art approaches.
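
A Siamese matching network of this kind processes the two modalities with parallel convolutional branches and regresses their relative shift; a compact PyTorch sketch in which the patch size, branch depth, and modality-specific (rather than weight-shared) branches are assumptions:

import torch

class SiameseShiftNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        def branch():   # one small conv branch per modality
            return torch.nn.Sequential(
                torch.nn.Conv2d(1, 16, 5), torch.nn.ReLU(),
                torch.nn.AdaptiveAvgPool2d(8), torch.nn.Flatten())
        self.sar, self.opt = branch(), branch()
        self.head = torch.nn.Linear(2 * 16 * 8 * 8, 2)   # regress (dx, dy)

    def forward(self, sar_patch, opt_patch):
        feats = torch.cat([self.sar(sar_patch), self.opt(opt_patch)], dim=1)
        return self.head(feats)

Training pairs with known offsets come from the TerraSAR-X/PRISM data, so the supervision target is the two-dimensional spatial shift itself.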

Journal ArticleDOI
TL;DR: The proposed algorithm ranks first in the EMPIRE10 challenge on pulmonary image registration and is the first to reach the inter-observer variability in landmark annotation on this dataset, thereby improving upon the state of the art in accuracy by 15%.
Abstract: We present a novel algorithm for the registration of pulmonary CT scans. Our method is designed for large respiratory motion by integrating sparse keypoint correspondences into a dense continuous optimization framework. The detection of keypoint correspondences enables robustness against large deformations by jointly optimizing over a large number of potential discrete displacements, whereas the dense continuous registration achieves subvoxel alignment with smooth transformations. Both steps are driven by the same normalized gradient fields data term. We employ curvature regularization and a volume change control mechanism to prevent foldings of the deformation grid and restrict the determinant of the Jacobian to physiologically meaningful values. Keypoint correspondences are integrated into the dense registration by a quadratic penalty with adaptively determined weight. Using a parallel matrix-free derivative calculation scheme, a runtime of about 5 min was realized on a standard PC. The proposed algorithm ranks first in the EMPIRE10 challenge on pulmonary image registration. Moreover, it achieves an average landmark distance of 0.82 mm on the DIR-Lab COPD database, thereby improving upon the state of the art in accuracy by 15%. Our algorithm is the first to reach the inter-observer variability in landmark annotation on this dataset.
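
The volume-change control mentioned above hinges on the determinant of the Jacobian of the deformation; a 2-D finite-difference sketch of computing it for a displacement field:

import numpy as np

def jacobian_determinant(dvf):
    # det(J) of phi(x) = x + u(x) for `dvf` = u of shape (2, H, W).
    # Values <= 0 indicate folding; a volume-change control additionally
    # bounds det(J) within a physiologically meaningful range.
    du = [np.gradient(c) for c in dvf]     # du[i][j] = d u_i / d x_j
    return (1 + du[0][0]) * (1 + du[1][1]) - du[0][1] * du[1][0]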

Journal ArticleDOI
TL;DR: A new algorithm built on wavelet transforms and mathematical morphology is presented for detecting the optic disc, and the tubular characteristic of the blood vessels is explored to segment the retinal veins and arteries.