
Showing papers in "IEEE Transactions on Medical Imaging in 2017"


Journal ArticleDOI
TL;DR: This work combines the autoencoder, deconvolution network, and shortcut connections into the residual encoder–decoder convolutional neural network (RED-CNN) for low-dose CT imaging and achieves competitive performance relative to state-of-the-art methods in both simulated and clinical cases.
Abstract: Given the potential risk of X-ray radiation to the patient, low-dose CT has attracted considerable interest in the medical imaging field. Currently, the mainstream low-dose CT methods include vendor-specific sinogram domain filtration and iterative reconstruction algorithms, but they need access to raw data, whose formats are not transparent to most users. Due to the difficulty of modeling the statistical characteristics in the image domain, the existing methods for directly processing reconstructed images cannot suppress image noise effectively while preserving structural details. Inspired by the idea of deep learning, here we combine the autoencoder, deconvolution network, and shortcut connections into the residual encoder–decoder convolutional neural network (RED-CNN) for low-dose CT imaging. After patch-based training, the proposed RED-CNN achieves competitive performance relative to state-of-the-art methods in both simulated and clinical cases. In particular, our method performs favorably in terms of noise suppression, structural preservation, and lesion detection.
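
To make the architecture concrete, here is a minimal PyTorch sketch of a residual encoder–decoder of the kind described; the depth, kernel size, and channel width are illustrative placeholders, not the paper's exact configuration.

```python
# Minimal sketch of a RED-CNN-style residual encoder-decoder (PyTorch).
# Depth, kernel size, and channel width are illustrative, not the paper's.
import torch
import torch.nn as nn

class REDCNNSketch(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.enc1 = nn.Conv2d(1, ch, 5)            # encoder: plain convolutions
        self.enc2 = nn.Conv2d(ch, ch, 5)
        self.enc3 = nn.Conv2d(ch, ch, 5)
        self.dec3 = nn.ConvTranspose2d(ch, ch, 5)  # decoder: deconvolutions
        self.dec2 = nn.ConvTranspose2d(ch, ch, 5)
        self.dec1 = nn.ConvTranspose2d(ch, 1, 5)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        r0 = x                                     # shortcut from the input patch
        e1 = self.relu(self.enc1(x))
        r1 = e1                                    # shortcut from an encoder feature
        e2 = self.relu(self.enc2(e1))
        e3 = self.relu(self.enc3(e2))
        d3 = self.relu(self.dec3(e3))
        d2 = self.relu(self.dec2(d3) + r1)         # add shortcut, then ReLU
        return self.relu(self.dec1(d2) + r0)       # residual path back to the input

# Patch-based training would pair low-dose patches x with routine-dose
# targets y and minimize, e.g., nn.MSELoss()(REDCNNSketch()(x), y).
```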

1,161 citations


Journal ArticleDOI
TL;DR: This study corroborates that very deep CNNs with effective training mechanisms can be employed to solve complicated medical image analysis tasks, even with limited training data.
Abstract: Automated melanoma recognition in dermoscopy images is a very challenging task due to the low contrast of skin lesions, the huge intraclass variation of melanomas, the high degree of visual similarity between melanoma and non-melanoma lesions, and the existence of many artifacts in the image. In order to meet these challenges, we propose a novel method for melanoma recognition by leveraging very deep convolutional neural networks (CNNs). Compared with existing methods employing either low-level hand-crafted features or CNNs with shallower architectures, our substantially deeper networks (more than 50 layers) can acquire richer and more discriminative features for more accurate recognition. To take full advantage of very deep networks, we propose a set of schemes to ensure effective training and learning under limited training data. First, we apply residual learning to cope with the degradation and overfitting problems that arise when a network goes deeper. This technique ensures that our networks benefit from the performance gains achieved by increasing network depth. Then, we construct a fully convolutional residual network (FCRN) for accurate skin lesion segmentation, and further enhance its capability by incorporating a multi-scale contextual information integration scheme. Finally, we seamlessly integrate the proposed FCRN (for segmentation) and other very deep residual networks (for classification) to form a two-stage framework. This framework enables the classification network to extract more representative and specific features based on segmented results instead of the whole dermoscopy images, further alleviating the insufficiency of training data. The proposed framework is extensively evaluated on the ISBI 2016 Skin Lesion Analysis Towards Melanoma Detection Challenge dataset. Experimental results demonstrate the significant performance gains of the proposed framework, which ranked first in classification among 25 teams and second in segmentation among 28 teams. This study corroborates that very deep CNNs with effective training mechanisms can be employed to solve complicated medical image analysis tasks, even with limited training data.

843 citations


Journal ArticleDOI
TL;DR: Noise reduction improved quantification of low-density calcified inserts in phantom CT images and allowed coronary calcium scoring in low-dose patient CT images with high noise levels.
Abstract: Noise is inherent to low-dose CT acquisition. We propose to train a convolutional neural network (CNN) jointly with an adversarial CNN to estimate routine-dose CT images from low-dose CT images and hence reduce noise. A generator CNN was trained to transform low-dose CT images into routine-dose CT images using voxelwise loss minimization. An adversarial discriminator CNN was simultaneously trained to distinguish the output of the generator from routine-dose CT images. The performance of this discriminator was used as an adversarial loss for the generator. Experiments were performed using CT images of an anthropomorphic phantom containing calcium inserts, as well as patient non-contrast-enhanced cardiac CT images. The phantom and patients were scanned at 20% and 100% of routine clinical dose. Three training strategies were compared: the first used only voxelwise loss, the second combined voxelwise loss and adversarial loss, and the third used only adversarial loss. The results showed that training with only voxelwise loss resulted in the highest peak signal-to-noise ratio with respect to reference routine-dose images. However, CNNs trained with adversarial loss captured image statistics of routine-dose images better. Noise reduction improved quantification of low-density calcified inserts in phantom CT images and allowed coronary calcium scoring in low-dose patient CT images with high noise levels. Testing took less than 10 s per CT volume. CNN-based low-dose CT noise reduction in the image domain is feasible. Training with an adversarial network improves the CNN's ability to generate images with an appearance similar to that of reference routine-dose CT images.
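
The three training strategies differ only in how the generator's loss is assembled. A hedged PyTorch sketch of the combined objective might look as follows, with `generator` and `discriminator` standing in for the two CNNs and `lam` an illustrative weight, not a value from the paper:

```python
# Sketch of voxelwise + adversarial training for CT denoising (PyTorch).
import torch
import torch.nn.functional as F

def generator_loss(generator, discriminator, low_dose, routine_dose, lam=0.1):
    fake = generator(low_dose)
    voxelwise = F.mse_loss(fake, routine_dose)      # voxelwise loss
    pred = discriminator(fake)                      # adversarial loss: fool D
    adversarial = F.binary_cross_entropy_with_logits(pred, torch.ones_like(pred))
    return voxelwise + lam * adversarial            # lam=0 recovers strategy 1

def discriminator_loss(discriminator, fake, routine_dose):
    real = discriminator(routine_dose)
    synth = discriminator(fake.detach())            # do not backprop into G here
    return (F.binary_cross_entropy_with_logits(real, torch.ones_like(real)) +
            F.binary_cross_entropy_with_logits(synth, torch.zeros_like(synth)))
```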

781 citations


Journal ArticleDOI
TL;DR: A large publicly accessible data set of hematoxylin and eosin (H&E)-stained tissue images with more than 21,000 painstakingly annotated nuclear boundaries is introduced, whose quality was validated by a medical doctor.
Abstract: Nuclear segmentation in digital microscopic tissue images can enable extraction of high-quality features for nuclear morphometrics and other analyses in computational pathology. Conventional image processing techniques, such as Otsu thresholding and watershed segmentation, do not work effectively on challenging cases, such as chromatin-sparse and crowded nuclei. In contrast, machine learning-based segmentation can generalize across various nuclear appearances. However, training machine learning algorithms requires data sets of images in which a vast number of nuclei have been annotated. Publicly accessible and annotated data sets, along with widely agreed upon metrics to compare techniques, have catalyzed tremendous innovation and progress on other image classification problems, particularly in object recognition. Inspired by their success, we introduce a large publicly accessible data set of hematoxylin and eosin (H&E)-stained tissue images with more than 21,000 painstakingly annotated nuclear boundaries, whose quality was validated by a medical doctor. Because our data set is taken from multiple hospitals and includes a diversity of nuclear appearances from several patients, disease states, and organs, techniques trained on it are likely to generalize well and work right out-of-the-box on other H&E-stained images. We also propose a new metric to evaluate nuclear segmentation results that penalizes object- and pixel-level errors in a unified manner, unlike previous metrics that penalize only one type of error. We also propose a segmentation technique based on deep learning that lays special emphasis on identifying nuclear boundaries, including those between touching or overlapping nuclei, and works well on a diverse set of test images.

679 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed a novel method for phase recognition that uses a convolutional neural network (CNN) to automatically learn features from cholecystectomy videos and that relies solely on visual information.
Abstract: Surgical workflow recognition has numerous potential medical applications, such as the automatic indexing of surgical video databases and the optimization of real-time operating room scheduling, among others. As a result, surgical phase recognition has been studied in the context of several kinds of surgeries, such as cataract, neurological, and laparoscopic surgeries. In the literature, two types of features are typically used to perform this task: visual features and tool usage signals. However, the visual features used are mostly handcrafted. Furthermore, the tool usage signals are usually collected via a manual annotation process or by using additional equipment. In this paper, we propose a novel method for phase recognition that uses a convolutional neural network (CNN) to automatically learn features from cholecystectomy videos and that relies solely on visual information. In previous studies, it has been shown that the tool usage signals can provide valuable information in performing the phase recognition task. Thus, we present a novel CNN architecture, called EndoNet, that is designed to carry out the phase recognition and tool presence detection tasks in a multi-task manner. To the best of our knowledge, this is the first work proposing to use a CNN for multiple recognition tasks on laparoscopic videos. Experimental comparisons to other methods show that EndoNet yields state-of-the-art results for both tasks.

555 citations


Journal ArticleDOI
TL;DR: A fully automatic method for skin lesion segmentation is presented that leverages a 19-layer deep convolutional neural network trained end-to-end, without relying on prior knowledge of the data, together with a set of strategies to ensure effective and efficient learning with limited training data.
Abstract: Automatic skin lesion segmentation in dermoscopic images is a challenging task due to the low contrast between the lesion and the surrounding skin, the irregular and fuzzy lesion borders, the existence of various artifacts, and varying imaging acquisition conditions. In this paper, we present a fully automatic method for skin lesion segmentation based on a 19-layer deep convolutional neural network that is trained end-to-end and does not rely on prior knowledge of the data. We propose a set of strategies to ensure effective and efficient learning with limited training data. Furthermore, we design a novel loss function based on the Jaccard distance to eliminate the need for sample re-weighting, a typical procedure when using cross entropy as the loss function for image segmentation, owing to the strong imbalance between the number of foreground and background pixels. We evaluated the effectiveness, efficiency, and generalization capability of the proposed framework on two publicly available databases: one from the ISBI 2016 skin lesion analysis towards melanoma detection challenge, and the other the PH2 database. Experimental results showed that the proposed method outperformed other state-of-the-art algorithms on these two databases. Our method is general and needs only minimal pre- and post-processing, which allows its adoption in a variety of medical image segmentation tasks.
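
The Jaccard-distance loss mentioned above can be written in a few lines. A minimal differentiable version is sketched below (the `eps` smoothing term is an assumption, not taken from the paper):

```python
# Soft Jaccard-distance loss for binary segmentation (PyTorch sketch).
import torch

def jaccard_distance_loss(pred, target, eps=1e-7):
    """pred: sigmoid probabilities in [0, 1]; target: binary mask, same shape."""
    intersection = (pred * target).sum()
    union = pred.sum() + target.sum() - intersection
    return 1.0 - (intersection + eps) / (union + eps)
```

Because the loss is normalized by the union of the foreground regions, the large background area contributes nothing to it, which is why no sample re-weighting is needed.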

510 citations


Journal ArticleDOI
TL;DR: Results show that convolutional neural networks are the state of the art in polyp detection, and it is also demonstrated that combining different methodologies can lead to improved overall performance.
Abstract: Colonoscopy is the gold standard for colon cancer screening though some polyps are still missed, thus preventing early disease detection and treatment. Several computational systems have been proposed to assist polyp detection during colonoscopy but so far without consistent evaluation. The lack of publicly available annotated databases has made it difficult to compare methods and to assess if they achieve performance levels acceptable for clinical use. The Automatic Polyp Detection sub-challenge, conducted as part of the Endoscopic Vision Challenge ( http://endovis.grand-challenge.org ) at the international conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2015, was an effort to address this need. In this paper, we report the results of this comparative evaluation of polyp detection methods, as well as describe additional experiments to further explore differences between methods. We define performance metrics and provide evaluation databases that allow comparison of multiple methodologies. Results show that convolutional neural networks are the state of the art. Nevertheless, it is also demonstrated that combining different methodologies can lead to an improved overall performance.

331 citations


Journal ArticleDOI
TL;DR: DeepCut, as discussed by the authors, obtains pixelwise object segmentations given an image dataset labelled with weak annotations, in this case bounding boxes, by training a neural network classifier from the bounding box annotations.
Abstract: In this paper, we propose DeepCut, a method to obtain pixelwise object segmentations given an image dataset labelled with weak annotations, in our case bounding boxes. It extends the approach of the well-known GrabCut [1] method to include machine learning by training a neural network classifier from bounding box annotations. We formulate the problem as an energy minimisation problem over a densely-connected conditional random field and iteratively update the training targets to obtain pixelwise object segmentations. Additionally, we propose variants of the DeepCut method and compare those to a naive approach to CNN training under weak supervision. We test its applicability to solve brain and lung segmentation problems on a challenging fetal magnetic resonance dataset and obtain encouraging results in terms of accuracy.
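
The iterative target-update idea is easy to demonstrate on toy data. The runnable sketch below uses a random forest on raw intensities as a stand-in for the CNN and omits the CRF step entirely, so it illustrates only the loop structure, not the authors' implementation:

```python
# Toy sketch of iterative target updating from bounding-box supervision.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
image = rng.normal(0.0, 0.3, (64, 64))
image[20:44, 20:44] += 1.0                 # bright object to segment
box = np.zeros((64, 64), bool)
box[16:48, 16:48] = True                   # weak annotation: a loose bounding box

target = box.copy()                        # initial targets: box interior = foreground
feats = image.reshape(-1, 1)               # per-pixel features (intensity only, for the toy)
for _ in range(5):
    clf = RandomForestClassifier(n_estimators=25, random_state=0)
    clf.fit(feats, target.ravel())         # learn from the current targets
    prob = clf.predict_proba(feats)[:, 1].reshape(image.shape)
    target = (prob > 0.5) & box            # update targets, clamped to the box
print("foreground pixels:", target.sum())  # shrinks toward the true object
```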

320 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a novel method based on convolutional neural networks, which can automatically detect 13 fetal standard views in freehand 2D ultrasound data as well as provide a localization of the fetal structures via a bounding box.
Abstract: Identifying and interpreting fetal standard scan planes during 2-D ultrasound mid-pregnancy examinations are highly complex tasks, which require years of training. Apart from guiding the probe to the correct location, it can be equally difficult for a non-expert to identify relevant structures within the image. Automatic image processing can provide tools to help experienced as well as inexperienced operators with these tasks. In this paper, we propose a novel method based on convolutional neural networks, which can automatically detect 13 fetal standard views in freehand 2-D ultrasound data as well as provide a localization of the fetal structures via a bounding box. An important contribution is that the network learns to localize the target anatomy using weak supervision based on image-level labels only. The network architecture is designed to operate in real-time while providing optimal output for the localization task. We present results for real-time annotation, retrospective frame retrieval from saved videos, and localization on a very large and challenging dataset consisting of images and video recordings of full clinical anomaly screenings. We found that the proposed method achieved an average F1-score of 0.798 in a realistic classification experiment modeling real-time detection, and obtained a 90.09% accuracy for retrospective frame retrieval. Moreover, an accuracy of 77.8% was achieved on the localization task.

260 citations


Journal ArticleDOI
TL;DR: New border features are proposed that effectively characterize border irregularities on both complete and incomplete lesions, and a network ensemble classifier is designed that combines back-propagation (BP) neural networks with fuzzy neural networks to achieve improved performance.
Abstract: We develop a novel method for classifying melanocytic tumors as benign or malignant by the analysis of digital dermoscopy images. The algorithm follows three steps: first, lesions are extracted using a self-generating neural network (SGNN); second, features descriptive of tumor color, texture and border are extracted; and third, lesion objects are classified using a classifier based on a neural network ensemble model. In clinical situations, lesions occur that are too large to be entirely contained within the dermoscopy image. To deal with this difficult presentation, new border features are proposed, which are able to effectively characterize border irregularities on both complete and incomplete lesions. In our model, a network ensemble classifier is designed that combines back-propagation (BP) neural networks with fuzzy neural networks to achieve improved performance. Experiments are carried out on two diverse dermoscopy databases that include images of both xanthous and Caucasian patients. The results show that classification accuracy is greatly enhanced by the use of the new border features and the proposed classifier model.

212 citations


Journal ArticleDOI
TL;DR: A K-sparse autoencoder was used for unsupervised feature learning; a manifold was learned from normal-dose images, and the distance between the reconstructed image and the manifold was minimized along with data fidelity during reconstruction.
Abstract: Dose reduction in computed tomography (CT) is essential for decreasing radiation risk in clinical applications. Iterative reconstruction algorithms are one of the most promising ways to compensate for the increased noise due to reduction of photon flux. Most iterative reconstruction algorithms incorporate manually designed prior functions of the reconstructed image to suppress noise while maintaining structures of the image. These priors basically rely on smoothness constraints and cannot exploit more complex features of the image. The recent development of artificial neural networks and machine learning has enabled learning of more complex image features, which has the potential to improve reconstruction quality. In this letter, a K-sparse autoencoder was used for unsupervised feature learning. A manifold was learned from normal-dose images, and the distance between the reconstructed image and the manifold was minimized along with data fidelity during reconstruction. Experiments on the 2016 Low-Dose CT Grand Challenge were used for method verification, and results demonstrated the noise reduction and detail preservation abilities of the proposed method.
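
The K-sparse autoencoder itself is compact: after encoding, only the k largest hidden activations are kept and the rest are zeroed. A PyTorch sketch follows; the layer sizes and k are illustrative, not the letter's settings:

```python
# Minimal K-sparse autoencoder sketch (PyTorch).
import torch
import torch.nn as nn

class KSparseAutoencoder(nn.Module):
    def __init__(self, n_in=256, n_hidden=1024, k=40):
        super().__init__()
        self.enc = nn.Linear(n_in, n_hidden)
        self.dec = nn.Linear(n_hidden, n_in)
        self.k = k

    def forward(self, x):
        h = self.enc(x)
        # k-sparse step: zero all but the k largest hidden activations
        topk = torch.topk(h, self.k, dim=1)
        mask = torch.zeros_like(h).scatter_(1, topk.indices, 1.0)
        return self.dec(h * mask)
```

During reconstruction, the distance between the current image estimate and its autoencoder projection (the learned manifold) is then penalized alongside the data-fidelity term.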

Journal ArticleDOI
TL;DR: A learning-based method with robust shape priors is proposed to segment individual cells in Pap smear images, supporting automatic monitoring of changes in cells, which is a vital prerequisite of early detection of cervical cancer.
Abstract: Accurate segmentation of cervical cells in Pap smear images is an important step in automatic pre-cancer identification in the uterine cervix. One of the major segmentation challenges is overlapping of cytoplasm, which has not been well-addressed in previous studies. To tackle the overlapping issue, this paper proposes a learning-based method with robust shape priors to segment individual cells in Pap smear images, supporting automatic monitoring of changes in cells, which is a vital prerequisite of early detection of cervical cancer. We define this splitting problem as a discrete labeling task for multiple cells with a suitable cost function. The labeling results are then fed into our dynamic multi-template deformation model for further boundary refinement. Multi-scale deep convolutional networks are adopted to learn the diverse cell appearance features. We also incorporate high-level shape information to guide segmentation where the cell boundary might be weak or lost due to cell overlapping. An evaluation carried out using two different datasets demonstrates the superiority of our proposed method over the state-of-the-art methods in terms of segmentation accuracy.

Journal ArticleDOI
TL;DR: This paper presents a technique based on an auto-context convolutional neural network (CNN), in which intrinsic local and global image features are learned through 2-D patches of different window sizes, and evaluates the performance of the algorithm in the challenging problem of extracting arbitrarily oriented fetal brains in reconstructed fetal brain magnetic resonance imaging (MRI) data sets.
Abstract: Brain extraction or whole brain segmentation is an important first step in many of the neuroimage analysis pipelines. The accuracy and the robustness of brain extraction, therefore, are crucial for the accuracy of the entire brain analysis process. The state-of-the-art brain extraction techniques rely heavily on the accuracy of alignment or registration between brain atlases and query brain anatomy, and/or make assumptions about the image geometry, and therefore have limited success when these assumptions do not hold or image registration fails. With the aim of designing an accurate, learning-based, geometry-independent, and registration-free brain extraction tool, in this paper, we present a technique based on an auto-context convolutional neural network (CNN), in which intrinsic local and global image features are learned through 2-D patches of different window sizes. We consider two different architectures: 1) a voxelwise approach based on three parallel 2-D convolutional pathways for three different directions (axial, coronal, and sagittal) that implicitly learn 3-D image information without the need for computationally expensive 3-D convolutions and 2) a fully convolutional network based on the U-net architecture. Posterior probability maps generated by the networks are used iteratively as context information along with the original image patches to learn the local shape and connectedness of the brain to extract it from non-brain tissue. The brain extraction results we have obtained from our CNNs are superior to the recently reported results in the literature on two publicly available benchmark data sets, namely, LPBA40 and OASIS, in which we obtained the Dice overlap coefficients of 97.73% and 97.62%, respectively. Significant improvement was achieved via our auto-context algorithm. Furthermore, we evaluated the performance of our algorithm in the challenging problem of extracting arbitrarily oriented fetal brains in reconstructed fetal brain magnetic resonance imaging (MRI) data sets. In this application, our voxelwise auto-context CNN performed much better than the other methods (Dice coefficient: 95.97%), where the other methods performed poorly due to the non-standard orientation and geometry of the fetal brain in MRI. Through training, our method can provide accurate brain extraction in challenging applications. This, in turn, may reduce the problems associated with image registration in segmentation tasks.
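
The auto-context loop is simple to sketch. In the paper a sequence of networks is trained, one per context step; the hedged sketch below reuses a single two-channel-input network `net` for brevity:

```python
# Auto-context inference sketch (PyTorch): the previous posterior map is fed
# back as an extra input channel and refined over a few passes.
import torch

def auto_context_predict(net, image, n_steps=3):
    """image: (B, 1, H, W); net: CNN taking 2 input channels, 1 output map."""
    posterior = torch.full_like(image, 0.5)        # start uninformative
    for _ in range(n_steps):
        x = torch.cat([image, posterior], dim=1)   # image patch + context channel
        posterior = torch.sigmoid(net(x))          # refined brain probability map
    return posterior
```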

Journal ArticleDOI
TL;DR: The results suggest that deep learning can be used effectively to develop an automated system for BAC detection in mammograms to help identify and assess patients with cardiovascular risks.
Abstract: Coronary artery disease is a major cause of death in women. Breast arterial calcifications (BACs), detected in mammograms, can be useful risk markers associated with the disease. We investigate the feasibility of automated and accurate detection of BACs in mammograms for risk assessment of coronary artery disease. We develop a 12-layer convolutional neural network to discriminate BAC from non-BAC and apply a pixelwise, patch-based procedure for BAC detection. To assess the performance of the system, we conduct a reader study to provide ground-truth information using the consensus of human expert radiologists. We evaluate the performance using a set of 840 full-field digital mammograms from 210 cases, using both free-response receiver operating characteristic (FROC) analysis and calcium mass quantification analysis. The FROC analysis shows that the deep learning approach achieves a level of detection similar to the human experts. The calcium mass quantification analysis shows that the inferred calcium mass is close to the ground truth, with a linear regression between them yielding a coefficient of determination of 96.24%. Taken together, these results suggest that deep learning can be used effectively to develop an automated system for BAC detection in mammograms to help identify and assess patients with cardiovascular risks.

Journal ArticleDOI
TL;DR: A novel block-wise adaptive singular value decomposition (SVD)-based clutter filtering technique is presented that, in an in vivo human native kidney study, showed substantial improvement in suppression of depth-dependent background noise and better rejection of near-field tissue clutter.
Abstract: Robust clutter filtering is essential for ultrasound small vessel imaging. Eigen-based clutter filtering techniques have recently shown great improvement in clutter rejection over conventional clutter filters in small animals. However, for in vivo human imaging, eigen-based clutter filtering can be challenging due to the complex spatially-varying tissue and noise characteristics. To address this challenge, we present a novel block-wise adaptive singular value decomposition (SVD)-based clutter filtering technique. The proposed method divides the global plane wave data into overlapped local spatial segments, within which tissue signals are assumed to be locally coherent and noise locally stationary. This, in turn, enables effective separation of tissue, blood, and noise via SVD. For each block, the proposed method adaptively determines the singular value cutoff thresholds based on local data statistics. Processing results from each block are redundantly combined to improve both the signal-to-noise ratio (SNR) and the contrast-to-noise ratio (CNR) of the small vessel perfusion image. Experimental results show that the proposed method achieved more than a two-fold increase in SNR and more than a three-fold increase in CNR (in dB scale) over the conventional global SVD filtering technique in an in vivo human native kidney study. The proposed method also showed substantial improvement in suppression of depth-dependent background noise and better rejection of near-field tissue clutter. The effects of different processing block sizes and block overlap percentages were systematically investigated, as was the tradeoff between imaging quality and computational cost.
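
A toy NumPy sketch of the block-wise filtering pipeline is below. Fixed singular-value cutoffs stand in for the paper's adaptive, statistics-based thresholds, and all sizes are illustrative:

```python
# Block-wise SVD clutter filtering sketch (NumPy).
import numpy as np

def blockwise_svd_filter(data, block=16, step=8, keep=(2, 12)):
    """data: (H, W, T) real-valued ultrasound frame stack; returns blood signal."""
    H, W, T = data.shape
    out = np.zeros_like(data, dtype=float)
    weight = np.zeros((H, W, 1))
    lo, hi = keep                                   # band of singular values kept
    for i in range(0, H - block + 1, step):
        for j in range(0, W - block + 1, step):
            # local Casorati matrix: (pixels x time)
            casorati = data[i:i+block, j:j+block].reshape(-1, T)
            U, s, Vt = np.linalg.svd(casorati, full_matrices=False)
            s[:lo] = 0.0                            # reject tissue (largest values)
            s[hi:] = 0.0                            # reject noise (smallest values)
            out[i:i+block, j:j+block] += ((U * s) @ Vt).reshape(block, block, T)
            weight[i:i+block, j:j+block] += 1.0     # for overlap averaging
    return out / np.maximum(weight, 1.0)
```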

Journal ArticleDOI
TL;DR: This approach is the first to incorporate deep neural networks for tool detection and localization in RAS videos; it applies a region proposal network (RPN) and a multimodal two-stream convolutional network for object detection to jointly predict objectness and localization on a fusion of image and temporal motion cues.
Abstract: Video understanding of robot-assisted surgery (RAS) videos is an active research area. Modeling the gestures and skill level of surgeons presents an interesting problem. The insights drawn may be applied in effective skill acquisition, objective skill assessment, real-time feedback, and human–robot collaborative surgeries. We propose a solution to the open problem of tool detection and localization in RAS video understanding, using a strictly computer vision approach and recent advances in deep learning. We propose an architecture using multimodal convolutional neural networks for fast detection and localization of tools in RAS videos. To the best of our knowledge, this is the first approach to incorporate deep neural networks for tool detection and localization in RAS videos. Our architecture applies a region proposal network (RPN) and a multimodal two-stream convolutional network for object detection to jointly predict objectness and localization on a fusion of image and temporal motion cues. Our results, with an average precision of 91% and a mean computation time of 0.1 s per test frame, indicate that our method is superior to methods conventionally used in medical imaging, while also emphasizing the benefits of using an RPN for precision and efficiency. We also introduce a new data set, ATLAS Dione, for RAS video understanding. Our data set provides video data of ten surgeons from Roswell Park Cancer Institute, Buffalo, NY, USA, performing six different surgical tasks on the daVinci Surgical System (dVSS), with annotations of robotic tools per frame.

Journal ArticleDOI
TL;DR: The results demonstrate that the proposed tensor-based method generally produces superior image quality and leads to more accurate material decomposition than the currently popular methods.
Abstract: Spectral computed tomography (CT) produces an energy-discriminative attenuation map of an object, extending a conventional image volume with a spectral dimension. In spectral CT, an image can be sparsely represented in each of multiple energy channels, and the channel images are highly correlated with one another. Based on these characteristics, we propose a tensor-based dictionary learning method for spectral CT reconstruction. In our method, tensor patches are extracted from an image tensor, which is reconstructed using filtered backprojection (FBP), to form a training dataset. With the Candecomp/Parafac decomposition, a tensor-based dictionary is trained, in which each atom is a rank-one tensor. Then, the trained dictionary is used to sparsely represent image tensor patches during an iterative reconstruction process, and an alternating minimization scheme is adopted for optimization. The effectiveness of our proposed method is validated with both numerically simulated and real preclinical mouse datasets. The results demonstrate that the proposed tensor-based method generally produces superior image quality and leads to more accurate material decomposition than the currently popular methods.

Journal ArticleDOI
TL;DR: A new weakly supervised learning algorithm is developed to segment cancerous regions in histopathology images with fully convolutional networks, along with an effective way to introduce constraints into the neural networks to assist the learning process.
Abstract: In this paper, we develop a new weakly supervised learning algorithm to learn to segment cancerous regions in histopathology images. This work adopts a multiple instance learning (MIL) framework with a new formulation, deep weak supervision (DWS); we also propose an effective way to introduce constraints into our neural networks to assist the learning process. The contributions of our algorithm are threefold: 1) we build an end-to-end learning system that segments cancerous regions with fully convolutional networks (FCNs), in which image-to-image weakly-supervised learning is performed; 2) we develop a DWS formulation to exploit multi-scale learning under weak supervision within FCNs; and 3) constraints about positive instances are introduced to effectively explore additional weakly supervised information that is easy to obtain and provides a significant boost to the learning process. The proposed algorithm, abbreviated as DWS-MIL, is easy to implement and can be trained efficiently. Our system demonstrates state-of-the-art results on large-scale histopathology image data sets and can be applied to various applications in medical imaging beyond histopathology images, such as MRI, CT, and ultrasound images.

Journal ArticleDOI
TL;DR: An automated methodology for the analysis of unregistered cranio-caudal and medio-lateral oblique mammography views in order to estimate the patient’s risk of developing breast cancer is described.
Abstract: We describe an automated methodology for the analysis of unregistered cranio-caudal (CC) and medio-lateral oblique (MLO) mammography views in order to estimate the patient’s risk of developing breast cancer. The main innovation behind this methodology lies in the use of deep learning models for the problem of jointly classifying unregistered mammogram views and respective segmentation maps of breast lesions (i.e., masses and micro-calcifications). This is a holistic methodology that can classify a whole mammographic exam, containing the CC and MLO views and the segmentation maps, as opposed to the classification of individual lesions, which is the dominant approach in the field. We also demonstrate that the proposed system is capable of using the segmentation maps generated by automated mass and micro-calcification detection systems, and still producing accurate results. The semi-automated approach (using manually defined mass and micro-calcification segmentation maps) is tested on two publicly available data sets (INbreast and DDSM), and results show that the volume under ROC surface (VUS) for a 3-class problem (normal tissue, benign, and malignant) is over 0.9, the area under ROC curve (AUC) for the 2-class “benign versus malignant” problem is over 0.9, and for the 2-class breast screening problem (malignancy versus normal/benign) is also over 0.9. For the fully automated approach, the VUS results on INbreast is over 0.7, and the AUC for the 2-class “benign versus malignant” problem is over 0.78, and the AUC for the 2-class breast screening is 0.86.

Journal ArticleDOI
TL;DR: Isotropic Total Variation (TV) regularization is used to enable accurate registration near sliding interfaces in breathing-motion databases; the method is robust to parameter selection, allowing the use of the same parameters for all tested databases.
Abstract: Spatial regularization is essential in image registration, which is an ill-posed problem. Regularization can help to avoid both physically implausible displacement fields and local minima during optimization. Tikhonov regularization (squared l2-norm) is unable to correctly represent non-smooth displacement fields that can, for example, occur at sliding interfaces in the thorax and abdomen in image time-series during respiration. In this paper, isotropic Total Variation (TV) regularization is used to enable accurate registration near such interfaces. We further develop the TV regularization for parametric displacement fields and provide an efficient numerical solution scheme using the Alternating Directions Method of Multipliers (ADMM). The proposed method was successfully applied to four clinical databases which capture breathing motion, including CT lung and MR liver images. It provided accurate registration results for the whole volume. A key strength of our proposed method is that it does not depend on organ masks, which are conventionally required by many algorithms to avoid errors at sliding interfaces. Furthermore, our method is robust to parameter selection, allowing the use of the same parameters for all tested databases. The average target registration error (TRE) of our method is superior (by 10% to 40%) to other techniques in the literature. It provides precise motion quantification and sliding detection with sub-pixel accuracy on the publicly available breathing motion databases (mean TREs of 0.95 mm for the DIR 4D CT, 0.96 mm for the DIR COPDgene, and 0.91 mm for the POPI databases).
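
For reference, one common discrete form of the isotropic TV of a 2-D displacement field, with the Euclidean norm taken jointly over both spatial directions and both displacement components, can be computed as below. This shows the regularizer only; the ADMM solver and the parametric transform are omitted:

```python
# Discrete isotropic total variation of a 2-D displacement field (NumPy).
import numpy as np

def isotropic_tv(u):
    """u: displacement field of shape (2, H, W); returns a scalar TV value."""
    dx = np.diff(u, axis=2)[:, :-1, :]   # horizontal forward differences
    dy = np.diff(u, axis=1)[:, :, :-1]   # vertical forward differences
    # Euclidean norm over components and directions, summed over pixels
    return np.sqrt((dx**2 + dy**2).sum(axis=0)).sum()
```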

Journal ArticleDOI
TL;DR: The proposed residual deconvolutional network consists of two information pathways that capture full-resolution features and contextual information, respectively; the model proved very effective at reconciling the conflicting goals of dense output prediction, namely preserving full-resolution predictions and including sufficient contextual information.
Abstract: Accurate reconstruction of anatomical connections between neurons in the brain using electron microscopy (EM) images is considered to be the gold standard for circuit mapping. A key step in obtaining the reconstruction is the ability to automatically segment neurons with a precision close to human-level performance. Despite recent technical advances in EM image segmentation, most methods rely to some extent on hand-crafted features that are specific to the data, limiting their ability to generalize. Here, we propose a simple yet powerful technique for EM image segmentation that is trained end-to-end and does not rely on prior knowledge of the data. Our proposed residual deconvolutional network consists of two information pathways that capture full-resolution features and contextual information, respectively. We show that the proposed model is very effective in achieving the conflicting goals of dense output prediction, namely preserving full-resolution predictions and including sufficient contextual information. We applied our method to the ongoing open challenge of 3D neurite segmentation in EM images and achieved one of the top results. We demonstrated the generality of our technique by evaluating it on the 2D neurite segmentation challenge dataset, where consistently high performance was obtained. We thus expect our method to generalize well to other dense output prediction problems.

Journal ArticleDOI
TL;DR: The experimental results suggest that the predicted semantic scores from the three MTL schemes are closer to the radiologists’ ratings than the scores from single-task LASSO and elastic net regression methods; the scheme may provide richer quantitative assessments of nodules to better support diagnostic decisions and management.
Abstract: The gap between computational and semantic features is one of the major factors that keeps computer-aided diagnosis (CAD) performance from clinical usage. To bridge this gap, we exploit three multi-task learning (MTL) schemes to leverage heterogeneous computational features derived from deep learning models, a stacked denoising autoencoder (SDAE) and a convolutional neural network (CNN), as well as hand-crafted Haar-like and HoG features, for the description of 9 semantic features of lung nodules in CT images. We posit that relations may exist among semantic features such as “spiculation”, “texture”, and “margin” that can be explored with MTL. The Lung Image Database Consortium (LIDC) data is adopted in this study for its rich annotation resources. The LIDC nodules were quantitatively scored with respect to the 9 semantic features by 12 radiologists from several U.S. institutes. By treating each semantic feature as an individual task, the MTL schemes select and map the heterogeneous computational features toward the radiologists’ ratings, with cross-validation evaluation on 2400 randomly selected nodules from the LIDC dataset. The experimental results suggest that the predicted semantic scores from the three MTL schemes are closer to the radiologists’ ratings than the scores from single-task LASSO and elastic net regression methods. The proposed semantic attribute scoring scheme may provide richer quantitative assessments of nodules for better support of diagnostic decision and management. Meanwhile, the capability of automatically associating medical image contents with clinical semantic terms may also assist the development of medical search engines.

Journal ArticleDOI
TL;DR: A new multi-modality reconstruction framework using second order Total Generalized Variation as a dedicated multi-channel regularization functional that jointly reconstructs images from both modalities is proposed.
Abstract: While current state-of-the-art MR-PET scanners enable simultaneous MR and PET measurements, the acquired data sets are still usually reconstructed separately. We propose a new multi-modality reconstruction framework using second-order Total Generalized Variation (TGV) as a dedicated multi-channel regularization functional that jointly reconstructs images from both modalities. In this way, information about the underlying anatomy is shared during the image reconstruction process while unique differences are preserved. Results from numerical simulations and in-vivo experiments using a range of accelerated MR acquisitions and different MR image contrasts demonstrate improved PET image quality, resolution, and quantitative accuracy.
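
For readers unfamiliar with the regularizer, the second-order TGV functional has the standard form (this is the textbook definition, not a detail specific to this paper):

$$\mathrm{TGV}_{\alpha}^{2}(u) \;=\; \min_{w}\; \alpha_{1}\int_{\Omega}\lvert\nabla u - w\rvert\,dx \;+\; \alpha_{0}\int_{\Omega}\lvert\mathcal{E}(w)\rvert\,dx,$$

where $\mathcal{E}(w) = \tfrac{1}{2}(\nabla w + \nabla w^{\mathsf{T}})$ is the symmetrized derivative. In a multi-channel setting such as joint MR-PET reconstruction, the pointwise norms are typically taken jointly over the channels, which is what couples the two modalities.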

Journal ArticleDOI
TL;DR: This work presents a set of methods and a workflow to enable autonomous MRI-guided ultrasound acquisitions, using a structured-light 3D scanner for patient-to-robot and image-to-patient calibration, which in turn is used to plan 3D ultrasound trajectories.
Abstract: Robotic ultrasound has the potential to assist and guide physicians during interventions. In this work, we present a set of methods and a workflow to enable autonomous MRI-guided ultrasound acquisitions. Our approach uses a structured-light 3D scanner for patient-to-robot and image-to-patient calibration, which in turn is used to plan 3D ultrasound trajectories. These MRI-based trajectories are followed autonomously by the robot and are further refined online using automatic MRI/US registration. Despite the low spatial resolution of structured-light scanners, the initially planned acquisition path can be followed with an accuracy of 2.46 ± 0.96 mm. This leads to a good initialization of the MRI/US registration: the 3D-scan-based alignment for planning and acquisition shows an accuracy (distance between planned ultrasound and MRI) of 4.47 mm, and 0.97 mm after an online update of the calibration based on closed-loop registration.

Journal ArticleDOI
TL;DR: The usefulness of a segmentation step for improving the performance of sparsity-based image reconstruction algorithms is demonstrated through the proposed SSR method for both denoising and interpolation of OCT images.
Abstract: We demonstrate the usefulness of a segmentation step for improving the performance of sparsity-based image reconstruction algorithms. Specifically, we focus on retinal optical coherence tomography (OCT) reconstruction and propose a novel segmentation-based reconstruction framework with sparse representation, termed segmentation-based sparse reconstruction (SSR). The SSR method uses automatically segmented retinal layer information to construct layer-specific structural dictionaries. In addition, the SSR method efficiently exploits patch similarities within each segmented layer to enhance the reconstruction performance. Our experimental results on clinical-grade retinal OCT images demonstrate the effectiveness and efficiency of the proposed SSR method for both denoising and interpolation of OCT images.

Journal ArticleDOI
TL;DR: A method is proposed for automatic localization of one or more anatomical structures in 3-D medical images through detection of their presence in 2-D image slices using a convolutional neural network (ConvNet); the method was more robust and accurate when localizing multiple structures.
Abstract: Localization of anatomical structures is a prerequisite for many tasks in medical image analysis. We propose a method for automatic localization of one or more anatomical structures in 3-D medical images through detection of their presence in 2-D image slices using a convolutional neural network (ConvNet). A single ConvNet is trained to detect the presence of the anatomical structure of interest in axial, coronal, and sagittal slices extracted from a 3-D image. To allow the ConvNet to analyze slices of different sizes, spatial pyramid pooling is applied. After detection, 3-D bounding boxes are created by combining the output of the ConvNet across all slices. In the experiments, 200 chest CT, 100 cardiac CT angiography (CTA), and 100 abdomen CT scans were used. The heart, ascending aorta, aortic arch, and descending aorta were localized in chest CT scans, the left cardiac ventricle in cardiac CTA scans, and the liver in abdomen CT scans. Localization was evaluated using the distances between automatically and manually defined reference bounding box centroids and walls. The best results were achieved in the localization of structures with clearly defined boundaries (e.g., aortic arch) and the worst when the structure boundary was not clearly visible (e.g., liver). The method was more robust and accurate when localizing multiple structures.
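
The final combination step is straightforward: each orthogonal slice direction bounds one axis of the box. A hedged NumPy sketch (argument names are illustrative, not from the paper):

```python
# Combine per-slice presence detections into a 3-D bounding box (NumPy).
import numpy as np

def slices_to_bbox(axial, coronal, sagittal):
    """Each argument: 1-D boolean array, True where the ConvNet detected
    the structure in that slice. Returns per-axis (first, last) extents."""
    def extent(hits):
        idx = np.flatnonzero(hits)
        return (int(idx[0]), int(idx[-1])) if idx.size else None
    # axial slices bound z, coronal slices bound y, sagittal slices bound x
    return {"z": extent(axial), "y": extent(coronal), "x": extent(sagittal)}
```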

Journal ArticleDOI
TL;DR: A novel CNN architecture is designed that takes volumetric images as the inputs and their voxel-wise segmentation maps as the outputs, allowing training and prediction on large microscopy images in an end-to-end manner.
Abstract: Digital reconstruction, or tracing, of 3-D neuron structure from microscopy images is a critical step toward reverse engineering the wiring and anatomy of a brain. Despite a number of prior attempts, this task remains very challenging, especially when images are contaminated by noise or have discontinuous segments of neurite patterns. An approach for addressing such problems is to identify the locations of neuronal voxels using image segmentation methods prior to applying tracing or reconstruction techniques. This preprocessing step is expected to remove noise in the data, thereby leading to improved reconstruction results. In this paper, we propose to use 3-D convolutional neural networks (CNNs) for segmenting the neuronal microscopy images. Specifically, we designed a novel CNN architecture that takes volumetric images as the inputs and their voxel-wise segmentation maps as the outputs. The developed architecture allows us to train and predict on large microscopy images in an end-to-end manner. We evaluated the performance of our model on a variety of challenging 3-D microscopy images from different organisms. Results showed that the proposed method improved the tracing performance significantly when combined with different reconstruction algorithms.

Journal ArticleDOI
TL;DR: In this paper, the authors introduce the concept of reverse classification accuracy (RCA) as a framework for predicting the performance of a segmentation method on new data, in which they take the predicted segmentation from a new image to train a reverse classifier, which is evaluated on a set of reference images with available ground truth.
Abstract: When integrating computational tools, such as automatic segmentation, into clinical practice, it is of utmost importance to be able to assess the level of accuracy on new data and, in particular, to detect when an automatic method fails. However, this is difficult to achieve due to the absence of ground truth. Segmentation accuracy on clinical data might be different from what is found through cross validation, because validation data are often used during incremental method development, which can lead to overfitting and unrealistic performance expectations. Before deployment, performance is quantified using different metrics, for which the predicted segmentation is compared with a reference segmentation, often obtained manually by an expert. But little is known about the real performance after deployment when a reference is unavailable. In this paper, we introduce the concept of reverse classification accuracy (RCA) as a framework for predicting the performance of a segmentation method on new data. In RCA, we take the predicted segmentation from a new image to train a reverse classifier, which is evaluated on a set of reference images with available ground truth. The hypothesis is that if the predicted segmentation is of good quality, then the reverse classifier will perform well on at least some of the reference images. We validate our approach on multi-organ segmentation with different classifiers and segmentation methods. Our results indicate that it is indeed possible to predict the quality of individual segmentations, in the absence of ground truth. Thus, RCA is ideal for integration into automatic processing pipelines in clinical routine and as a part of large-scale image analysis studies.
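
A toy, runnable version of the RCA loop is below, with a random forest on raw intensities standing in for the reverse classifier; the setup is an assumption for illustration, not the paper's pipeline:

```python
# Reverse classification accuracy (RCA) sketch with a stand-in classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def dice(a, b):
    return 2 * np.logical_and(a, b).sum() / max(a.sum() + b.sum(), 1)

def rca_score(new_img, new_seg, ref_imgs, ref_gts):
    """Estimate the quality of `new_seg` without ground truth for `new_img`."""
    clf = RandomForestClassifier(n_estimators=25, random_state=0)
    # reverse classifier: trained on the new image with its predicted
    # segmentation used as if it were the ground truth
    clf.fit(new_img.reshape(-1, 1), new_seg.ravel())
    scores = []
    for img, gt in zip(ref_imgs, ref_gts):
        pred = clf.predict(img.reshape(-1, 1)).reshape(img.shape)
        scores.append(dice(pred, gt))   # evaluate where ground truth exists
    # if the predicted segmentation was good, the reverse classifier should
    # do well on at least some reference images
    return max(scores)
```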

Journal ArticleDOI
TL;DR: A new sinogram restoration approach is presented that works through a 3-D representation-based feature decomposition of the projected attenuation component and the noise component, using a well-designed composite dictionary containing atoms with discriminative features; experiments demonstrate that the S-DFR method offers a sound alternative in LDCT.
Abstract: In low-dose computed tomography (LDCT) imaging, the data inconsistency of measured noisy projections can significantly deteriorate reconstructed images. To deal with this problem, we propose a new sinogram restoration approach, the sinogram-discriminative feature representation (S-DFR) method. Different from other sinogram restoration methods, the proposed method works through a 3-D representation-based feature decomposition of the projected attenuation component and the noise component using a well-designed composite dictionary containing atoms with discriminative features. The method can be easily implemented and is robust in parameter setting. Comparison to other competing methods through experiments on simulated and real data demonstrated that the S-DFR method offers a sound alternative in LDCT.

Journal ArticleDOI
TL;DR: In this paper, the authors applied the tools used in super-resolution optical fluctuation imaging (SOFI) to contrast-enhanced ultrasound (CEUS) plane-wave scans.
Abstract: Ultrasound super-localization microscopy techniques presented in the last few years enable non-invasive imaging of vascular structures at the capillary level by tracking the flow of ultrasound contrast agents (gas microbubbles). However, these techniques are currently limited by low temporal resolution and long acquisition times. Super-resolution optical fluctuation imaging (SOFI) is a fluorescence microscopy technique enabling sub-diffraction-limit imaging with high temporal resolution by calculating high-order statistics of the fluctuating optical signal. The aim of this work is to achieve fast acoustic imaging with enhanced resolution by applying the tools used in SOFI to contrast-enhanced ultrasound (CEUS) plane-wave scans. The proposed method was tested using numerical simulations and evaluated using two in-vivo rabbit models: scans of healthy kidneys and VX-2 tumor xenografts. Improved spatial resolution was observed, with a reduction of up to 50% in the full width at half maximum of the point spread function. In addition, a substantial reduction in the background level was achieved compared to standard mean-amplitude persistence images, revealing small vascular structures within tumors. The scan duration of the proposed method is less than a second, while current super-localization techniques require acquisition durations of several minutes. As a result, the proposed technique may be used to obtain scans with enhanced spatial resolution and high temporal resolution, facilitating flow-dynamics monitoring. Our method can also be applied during a breath-hold, reducing the sensitivity to motion artifacts.
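
At second order, the SOFI-style computation reduces to a pixelwise temporal cumulant of the fluctuating signal. A minimal NumPy sketch follows; higher orders and the ultrasound-specific beamforming are omitted:

```python
# Second-order SOFI-style image from a stack of CEUS frames (NumPy).
import numpy as np

def second_order_sofi(frames):
    """frames: (T, H, W) stack of contrast-enhanced frames; returns 2-D image."""
    fluct = frames - frames.mean(axis=0)   # remove the static (mean) component
    return (fluct ** 2).mean(axis=0)       # pixelwise 2nd-order auto-cumulant
```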