(Open Access) EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos (2017) | Andru Putra Twinanda

Citations

PDF

Open Access

More filters

Journal Article•DOI•

A survey on deep learning in medical image analysis

[...]

Geert Litjens¹, Thijs Kooi¹, Babak Ehteshami Bejnordi¹, Arnaud Arindra Adiyoso Setio¹, Francesco Ciompi¹, Mohsen Ghafoorian¹, Jeroen van der Laak¹, Bram van Ginneken¹, Clara I. Sánchez¹ - Show less +5 more•Institutions (1)

Radboud University Nijmegen¹

01 Dec 2017-Medical Image Analysis

TL;DR: This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year, to survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks.

...read moreread less

8,730 citations

Journal Article•DOI•

Deep learning-enabled medical computer vision.

[...]

Andre Esteva¹, Katherine Chou², Serena Yeung³, Nikhil Naik¹, Ali Madani¹, Ali Mottaghi³, Yun Liu², Eric J. Topol⁴, Jeffrey Dean², Richard Socher¹ - Show less +6 more•Institutions (4)

Salesforce.com¹, Google², Stanford University³, Scripps Health⁴

08 Jan 2021

TL;DR: In this paper, the authors survey recent progress in the development of modern computer vision techniques-powered by deep learning-for medical applications, focusing on medical imaging, medical video, and clinical deployment.

...read moreread less

Abstract: A decade of unprecedented progress in artificial intelligence (AI) has demonstrated the potential for many fields-including medicine-to benefit from the insights that AI techniques can extract from data. Here we survey recent progress in the development of modern computer vision techniques-powered by deep learning-for medical applications, focusing on medical imaging, medical video, and clinical deployment. We start by briefly summarizing a decade of progress in convolutional neural networks, including the vision tasks they enable, in the context of healthcare. Next, we discuss several example medical imaging applications that stand to benefit-including cardiology, pathology, dermatology, ophthalmology-and propose new avenues for continued work. We then expand into general medical video, highlighting ways in which clinical workflows can integrate computer vision to enhance care. Finally, we discuss the challenges and hurdles required for real-world clinical deployment of these technologies.

...read moreread less

296 citations

Journal Article•DOI•

SV-RCNet: Workflow Recognition From Surgical Videos Using Recurrent Convolutional Network

[...]

Yueming Jin¹, Qi Dou¹, Hao Chen¹, Lequan Yu¹, Jing Qin², Chi-Wing Fu¹, Pheng-Ann Heng¹ - Show less +3 more•Institutions (2)

The Chinese University of Hong Kong¹, Hong Kong Polytechnic University²

01 May 2018-IEEE Transactions on Medical Imaging

TL;DR: Based on the phase transition-sensitive predictions from the SV-RCNet, a simple yet effective inference scheme, namely the prior knowledge inference (PKI), by leveraging the natural characteristic of surgical video is proposed, which improves the consistency of results and largely boosts the recognition performance.

...read moreread less

Abstract: We propose an analysis of surgical videos that is based on a novel recurrent convolutional network (SV-RCNet), specifically for automatic workflow recognition from surgical videos online, which is a key component for developing the context-aware computer-assisted intervention systems. Different from previous methods which harness visual and temporal information separately, the proposed SV-RCNet seamlessly integrates a convolutional neural network (CNN) and a recurrent neural network (RNN) to form a novel recurrent convolutional architecture in order to take full advantages of the complementary information of visual and temporal features learned from surgical videos. We effectively train the SV-RCNet in an end-to-end manner so that the visual representations and sequential dynamics can be jointly optimized in the learning process. In order to produce more discriminative spatio-temporal features, we exploit a deep residual network (ResNet) and a long short term memory (LSTM) network, to extract visual features and temporal dependencies, respectively, and integrate them into the SV-RCNet. Moreover, based on the phase transition-sensitive predictions from the SV-RCNet, we propose a simple yet effective inference scheme, namely the prior knowledge inference (PKI), by leveraging the natural characteristic of surgical video. Such a strategy further improves the consistency of results and largely boosts the recognition performance. Extensive experiments have been conducted with the MICCAI 2016 Modeling and Monitoring of Computer Assisted Interventions Workflow Challenge dataset and Cholec80 dataset to validate SV-RCNet. Our approach not only achieves superior performance on these two datasets but also outperforms the state-of-the-art methods by a significant margin.

...read moreread less

206 citations

Proceedings Article•DOI•

Tool Detection and Operative Skill Assessment in Surgical Videos Using Region-Based Convolutional Neural Networks

[...]

Amy Jin¹, Serena Yeung¹, Jeffrey K. Jopling¹, Jonathan Krause¹, Dan E. Azagury¹, Arnold Milstein¹, Li Fei-Fei¹ - Show less +3 more•Institutions (1)

Stanford University¹

12 Mar 2018

TL;DR: This work introduces an approach to automatically assess surgeon performance by tracking and analyzing tool movements in surgical videos, leveraging region-based convolutional neural networks, and is the first to not only detect presence but also spatially localize surgical tools in real-world laparoscopic surgical videos.

...read moreread less

Abstract: Five billion people in the world lack access to quality surgical care. Surgeon skill varies dramatically, and many surgical patients suffer complications and avoidable harm. Improving surgical training and feedback would help to reduce the rate of complications—half of which have been shown to be preventable. To do this, it is essential to assess operative skill, a process that currently requires experts and is manual, time consuming, and subjective. In this work, we introduce an approach to automatically assess surgeon performance by tracking and analyzing tool movements in surgical videos, leveraging region-based convolutional neural networks. In order to study this problem, we also introduce a new dataset, m2cai16-tool-locations, which extends the m2cai16-tool dataset with spatial bounds of tools. While previous methods have addressed tool presence detection, ours is the first to not only detect presence but also spatially localize surgical tools in real-world laparoscopic surgical videos. We show that our method both effectively detects the spatial bounds of tools as well as significantly outperforms existing methods on tool presence detection. We further demonstrate the ability of our method to assess surgical quality through analysis of tool usage patterns, movement range, and economy of motion.

...read moreread less

203 citations

Journal Article•DOI•

Learning laparoscopic video shot classification for gynecological surgery

[...]

Stefan Petscharnig¹, Klaus Schöffmann¹•Institutions (1)

Alpen-Adria-Universität Klagenfurt¹

01 Apr 2018-Multimedia Tools and Applications

TL;DR: The main conclusion of this work is that advances in general image classification methods transfer to the domain of endoscopic surgery videos in gynecology, relevant as this domain is different from natural images, e.g. it is distinguished by smoke, reflections, or a limited amount of colors.

...read moreread less

Abstract: Videos of endoscopic surgery are used for education of medical experts, analysis in medical research, and documentation for everyday clinical life. Hand-crafted image descriptors lack the capabilities of a semantic classification of surgical actions and video shots of anatomical structures. In this work, we investigate how well single-frame convolutional neural networks (CNN) for semantic shot classification in gynecologic surgery work. Together with medical experts, we manually annotate hours of raw endoscopic gynecologic surgery videos showing endometriosis treatment and myoma resection of over 100 patients. The cleaned ground truth dataset comprises 9 h of annotated video material (from 111 different recordings). We use the well-known CNN architectures AlexNet and GoogLeNet and train these architectures for both, surgical actions and anatomy, from scratch. Furthermore, we extract high-level features from AlexNet with weights from a pre-trained model from the Caffe model zoo and feed them to an SVM classifier. Our evaluation shows that we reach an average recall of .697 and .515 for classification of anatomical structures and surgical actions respectively using off-the-shelf CNN features. Using GoogLeNet, we achieve a mean recall of .782 and .617 for classification of anatomical structures and surgical actions respectively. With AlexNet the achieved recall is .615 for anatomical structures and .469 for surgical action classification respectively. The main conclusion of our work is that advances in general image classification methods transfer to the domain of endoscopic surgery videos in gynecology. This is relevant as this domain is different from natural images, e.g. it is distinguished by smoke, reflections, or a limited amount of colors.

...read moreread less

171 citations

Collapse

EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos

Citations

References

Related Papers (5)