scispace - formally typeset
Search or ask a question
Author

Benjamin Hou

Bio: Benjamin Hou is an academic researcher from Imperial College London. The author has contributed to research in topics: Computer science & Transformation (function). The author has an hindex of 11, co-authored 38 publications receiving 465 citations.

Papers
More filters
Journal ArticleDOI
TL;DR: Novel deep reinforcement learning (RL) strategies to train agents that can precisely and robustly localize target landmarks in medical scans are evaluated and the performance of these agents surpasses state‐of‐the‐art supervised and RL methods.

126 citations

Journal Article
TL;DR: It is shown that Geomstats provides reliable building blocks to foster research in differential geometry and statistics, and to democratize the use of Riemannian geometry in machine learning applications.
Abstract: We introduce Geomstats, an open-source Python toolbox for computations and statistics on nonlinear manifolds, such as hyperbolic spaces, spaces of symmetric positive definite matrices, Lie groups of transformations, and many more. We provide object-oriented and extensively unit-tested implementations. Among others, manifolds come equipped with families of Riemannian metrics, with associated exponential and logarithmic maps, geodesics and parallel transport. Statistics and learning algorithms provide methods for estimation, clustering and dimension reduction on manifolds. All associated operations are vectorized for batch computation and provide support for different execution backends, namely NumPy, PyTorch and TensorFlow, enabling GPU acceleration. This paper presents the package, compares it with related libraries and provides relevant code examples. We show that Geomstats provides reliable building blocks to foster research in differential geometry and statistics, and to democratize the use of Riemannian geometry in machine learning applications. The source code is freely available under the MIT license at http://geomstats.ai.

70 citations

Journal ArticleDOI
TL;DR: A learning-based image registration method capable of predicting 3-D rigid transformations of arbitrarily oriented 2-D image slices, with respect to a learned canonical atlas co-ordinate system is presented.
Abstract: Limited capture range, and the requirement to provide high quality initialization for optimization-based 2-D/3-D image registration methods, can significantly degrade the performance of 3-D image reconstruction and motion compensation pipelines. Challenging clinical imaging scenarios, which contain significant subject motion, such as fetal in-utero imaging, complicate the 3-D image and volume reconstruction process. In this paper, we present a learning-based image registration method capable of predicting 3-D rigid transformations of arbitrarily oriented 2-D image slices, with respect to a learned canonical atlas co-ordinate system. Only image slice intensity information is used to perform registration and canonical alignment, no spatial transform initialization is required. To find image transformations, we utilize a convolutional neural network architecture to learn the regression function capable of mapping 2-D image slices to a 3-D canonical atlas space. We extensively evaluate the effectiveness of our approach quantitatively on simulated magnetic resonance imaging (MRI), fetal brain imagery with synthetic motion and further demonstrate qualitative results on real fetal MRI data where our method is integrated into a full reconstruction and motion compensation pipeline. Our learning based registration achieves an average spatial prediction error of 7 mm on simulated data and produces qualitatively improved reconstructions for heavily moving fetuses with gestational ages of approximately 20 weeks. Our model provides a general and computationally efficient solution to the 2-D/3-D registration initialization problem and is suitable for real-time scenarios.

67 citations

Book ChapterDOI
10 Sep 2017
TL;DR: A regression approach that learns to predict rotations and translations of arbitrary 2D image slices from 3D volumes, with respect to a learned canonical atlas co-ordinate system, which is a general solution to the 2D/3D initialization problem.
Abstract: This paper aims to solve a fundamental problem in intensity-based 2D/3D registration, which concerns the limited capture range and need for very good initialization of state-of-the-art image registration methods. We propose a regression approach that learns to predict rotations and translations of arbitrary 2D image slices from 3D volumes, with respect to a learned canonical atlas co-ordinate system. To this end, we utilize Convolutional Neural Networks (CNNs) to learn the highly complex regression function that maps 2D image slices into their correct position and orientation in 3D space. Our approach is attractive in challenging imaging scenarios, where significant subject motion complicates reconstruction performance of 3D volumes from 2D slice data. We extensively evaluate the effectiveness of our approach quantitatively on simulated MRI brain data with extreme random motion. We further demonstrate qualitative results on fetal MRI where our method is integrated into a full reconstruction and motion compensation pipeline. With our CNN regression approach we obtain an average prediction error of 7 mm on simulated data, and convincing reconstruction quality of images of very young fetuses where previous methods fail. We further discuss applications to Computed Tomography (CT) and X-Ray projections. Our approach is a general solution to the 2D/3D initialization problem. It is computationally efficient, with prediction times per slice of a few milliseconds, making it suitable for real-time scenarios.

60 citations

Book ChapterDOI
16 Sep 2018
TL;DR: In this article, a multi-scale RL agent framework was employed to find standardized view planes in 3D image acquisitions, which can be used to mimic experienced operators and achieve an accuracy of 1.53 mm, 1.98 mm and 4.84 mm.
Abstract: We propose a fully automatic method to find standardized view planes in 3D image acquisitions. Standard view images are important in clinical practice as they provide a means to perform biometric measurements from similar anatomical regions. These views are often constrained to the native orientation of a 3D image acquisition. Navigating through target anatomy to find the required view plane is tedious and operator-dependent. For this task, we employ a multi-scale reinforcement learning (RL) agent framework and extensively evaluate several Deep Q-Network (DQN) based strategies. RL enables a natural learning paradigm by interaction with the environment, which can be used to mimic experienced operators. We evaluate our results using the distance between the anatomical landmarks and detected planes, and the angles between their normal vector and target. The proposed algorithm is assessed on the mid-sagittal and anterior-posterior commissure planes of brain MRI, and the 4-chamber long-axis plane commonly used in cardiac MRI, achieving accuracy of 1.53 mm, 1.98 mm and 4.84 mm, respectively.

49 citations


Cited by
More filters
Posted Content
TL;DR: It is shown that some existing saliency methods are independent both of the model and of the data generating process, and methods that fail the proposed tests are inadequate for tasks that are sensitive to either data or model.
Abstract: Saliency methods have emerged as a popular tool to highlight features in an input deemed relevant for the prediction of a learned model. Several saliency methods have been proposed, often guided by visual appeal on image data. In this work, we propose an actionable methodology to evaluate what kinds of explanations a given method can and cannot provide. We find that reliance, solely, on visual assessment can be misleading. Through extensive experiments we show that some existing saliency methods are independent both of the model and of the data generating process. Consequently, methods that fail the proposed tests are inadequate for tasks that are sensitive to either data or model, such as, finding outliers in the data, explaining the relationship between inputs and outputs that the model learned, and debugging the model. We interpret our findings through an analogy with edge detection in images, a technique that requires neither training data nor model. Theory in the case of a linear model and a single-layer convolutional neural network supports our experimental findings.

927 citations

Journal ArticleDOI
TL;DR: In this article, a review of deep learning-based segmentation methods for cardiac image segmentation is provided, which covers common imaging modalities including magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound.
Abstract: Deep learning has become the most widely used approach for cardiac image segmentation in recent years. In this paper, we provide a review of over 100 cardiac image segmentation papers using deep learning, which covers common imaging modalities including magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound (US) and major anatomical structures of interest (ventricles, atria and vessels). In addition, a summary of publicly available cardiac image datasets and code repositories are included to provide a base for encouraging reproducible research. Finally, we discuss the challenges and limitations with current deep learning-based approaches (scarcity of labels, model generalizability across different domains, interpretability) and suggest potential directions for future research.

254 citations

Posted Content
TL;DR: This survey provides an extensive overview of RL applications in a variety of healthcare domains, ranging from dynamic treatment regimes in chronic diseases and critical care, automated medical diagnosis, and many other control or scheduling problems that have infiltrated every aspect of the healthcare system.
Abstract: As a subfield of machine learning, reinforcement learning (RL) aims at empowering one's capabilities in behavioural decision making by using interaction experience with the world and an evaluative feedback. Unlike traditional supervised learning methods that usually rely on one-shot, exhaustive and supervised reward signals, RL tackles with sequential decision making problems with sampled, evaluative and delayed feedback simultaneously. Such distinctive features make RL technique a suitable candidate for developing powerful solutions in a variety of healthcare domains, where diagnosing decisions or treatment regimes are usually characterized by a prolonged and sequential procedure. This survey discusses the broad applications of RL techniques in healthcare domains, in order to provide the research community with systematic understanding of theoretical foundations, enabling methods and techniques, existing challenges, and new insights of this emerging paradigm. By first briefly examining theoretical foundations and key techniques in RL research from efficient and representational directions, we then provide an overview of RL applications in healthcare domains ranging from dynamic treatment regimes in chronic diseases and critical care, automated medical diagnosis from both unstructured and structured clinical data, as well as many other control or scheduling domains that have infiltrated many aspects of a healthcare system. Finally, we summarize the challenges and open issues in current research, and point out some potential solutions and directions for future research.

245 citations

Journal ArticleDOI
TL;DR: To develop a super‐resolution technique using convolutional neural networks for generating thin‐slice knee MR images from thicker input slices, and compare this method with alternative through‐plane interpolation methods.
Abstract: PURPOSE To develop a super-resolution technique using convolutional neural networks for generating thin-slice knee MR images from thicker input slices, and compare this method with alternative through-plane interpolation methods. METHODS We implemented a 3D convolutional neural network entitled DeepResolve to learn residual-based transformations between high-resolution thin-slice images and lower-resolution thick-slice images at the same center locations. DeepResolve was trained using 124 double echo in steady-state (DESS) data sets with 0.7-mm slice thickness and tested on 17 patients. Ground-truth images were compared with DeepResolve, clinically used tricubic interpolation, and Fourier interpolation methods, along with state-of-the-art single-image sparse-coding super-resolution. Comparisons were performed using structural similarity, peak SNR, and RMS error image quality metrics for a multitude of thin-slice downsampling factors. Two musculoskeletal radiologists ranked the 3 data sets and reviewed the diagnostic quality of the DeepResolve, tricubic interpolation, and ground-truth images for sharpness, contrast, artifacts, SNR, and overall diagnostic quality. Mann-Whitney U tests evaluated differences among the quantitative image metrics, reader scores, and rankings. Cohen's Kappa (κ) evaluated interreader reliability. RESULTS DeepResolve had significantly better structural similarity, peak SNR, and RMS error than tricubic interpolation, Fourier interpolation, and sparse-coding super-resolution for all downsampling factors (p < .05, except 4 × and 8 × sparse-coding super-resolution downsampling factors). In the reader study, DeepResolve significantly outperformed (p < .01) tricubic interpolation in all image quality categories and overall image ranking. Both readers had substantial scoring agreement (κ = 0.73). CONCLUSION DeepResolve was capable of resolving high-resolution thin-slice knee MRI from lower-resolution thicker slices, achieving superior quantitative and qualitative diagnostic performance to both conventionally used and state-of-the-art methods.

243 citations