Open access · Journal Article · DOI: 10.1109/ACCESS.2021.3063716

Real-Time Polyp Detection, Localization and Segmentation in Colonoscopy Using Deep Learning

04 Mar 2021 · IEEE Access (IEEE) · Vol. 9, pp. 40496-40510
Abstract: Computer-aided detection, localisation, and segmentation methods can help improve colonoscopy procedures. Although many methods have been developed for automatic detection and segmentation of polyps, benchmarking of state-of-the-art methods remains an open problem, owing to the growing number of computer vision methods that can be applied to polyp datasets. Benchmarking of novel methods can provide direction for the development of automated polyp detection and segmentation, and it ensures that results produced in the community are reproducible and fairly comparable. In this paper, we benchmark several recent state-of-the-art methods on Kvasir-SEG, an open-access dataset of colonoscopy images for polyp detection, localisation, and segmentation, evaluating both accuracy and speed. While most methods in the literature achieve competitive accuracy, we show that the proposed ColonSegNet achieves a better trade-off, with an average precision of 0.8000, a mean IoU of 0.8100, and the fastest speed of 180 frames per second on the detection and localisation task. Likewise, ColonSegNet achieves a competitive dice coefficient of 0.8206 and the best average speed of 182.38 frames per second on the segmentation task. Our comprehensive comparison with various state-of-the-art methods reveals the importance of benchmarking deep learning methods for automated real-time polyp identification and delineation, which can potentially transform current clinical practice and minimise miss-detection rates.
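The segmentation scores reported above (mean IoU and dice coefficient) can be illustrated with a minimal sketch; the toy 2x3 masks below are purely illustrative and not taken from Kvasir-SEG:

```python
import numpy as np

def iou(pred, target):
    """Intersection over Union between two binary masks."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union else 1.0

def dice(pred, target):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    inter = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    return 2 * inter / total if total else 1.0

# Toy masks (illustrative only):
pred = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
gt   = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)
# intersection = 2, union = 4 -> IoU = 0.5; 2*2 / (3 + 3) -> Dice = 2/3
```

Note that Dice is always at least as large as IoU for the same pair of masks, which is one reason the two metrics are usually reported together.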


Topics: Image segmentation (57%), Segmentation (52%)
Citations

41 results found



Open access
01 Jan 2017
Abstract: The prevalence of cancer in small and diminutive polyps is relevant to "resect and discard" and CT colonography reporting recommendations. We evaluated a prospectively collected colonoscopy polyp database to identify polyps <10mm and those with cancer or advanced histology (high-grade dysplasia or villous elements). Of 32,790 colonoscopies, 15,558 detected a total of 42,630 polyps <10mm in size. A total of 4790 lesions were excluded because they were not conventional adenomas or serrated-class lesions. There were 23,524 conventional adenomas <10mm, of which 22,952 were tubular adenomas, and 14,316 serrated-class lesions, of which 13,589 were hyperplastic polyps and the remainder sessile serrated polyps. Of all conventional adenomas, 96 had high-grade dysplasia, including 0.3% of adenomas ≤5mm in size and 0.8% of adenomas 6-9mm in size. Of all conventional adenomas, 2.1% of those ≤5mm and 5.6% of those 6-9mm were advanced. Among 36,107 polyps ≤5mm and 6523 polyps 6-9mm, there were no cancers. These results support the safety of resect and discard as well as current CT colonography reporting recommendations for small and diminutive polyps.


Topics: Hyperplastic Polyp (57%), Colonoscopy (53%)

57 Citations


Open access · Posted Content
Abstract: We propose a new convolutional neural network called HarDNet-MSEG for polyp segmentation. It achieves state-of-the-art accuracy and inference speed on five popular datasets: on Kvasir-SEG, HarDNet-MSEG delivers a 0.904 mean Dice while running at 86.7 FPS on a GeForce RTX 2080 Ti GPU. The network consists of a backbone and a decoder. The backbone is HarDNet68, a low-memory-traffic CNN that has been successfully applied to various computer vision tasks, including image classification, object detection, multi-object tracking, and semantic segmentation. The decoder is inspired by the Cascaded Partial Decoder, known for fast and accurate salient object detection. The code and all experiment details are available on GitHub. this https URL


13 Citations


Open access · Journal Article · DOI: 10.1016/J.BSPC.2021.102654
Abstract: Colonic polyp detection remains an unsolved issue because of the wide variation in polyp appearance, texture, color, and size, and the presence of multiple polyp-like imitators during colonoscopy. In this paper, a deep convolutional neural network (CNN) based model for the computerized detection of polyps within colonoscopy images is proposed. The model adopts convolutional kernels of different window sizes within the same hidden layer for deeper feature extraction. A lightweight architecture comprising 16 convolutional layers, 2 fully connected (FC) layers, and a Softmax output layer is implemented. To achieve deeper propagation of information and self-regularized, smooth non-monotonicity, and to avoid saturation during training, the Mish activation function is used in the first 15 layers, followed by the rectified linear unit (ReLU) activation function. Moreover, a generalized intersection over union (GIoU) approach is employed, overcoming scale, rotation, and shape issues encountered with IoU. Data augmentation techniques such as photometric and geometric distortions are applied to compensate for the scarcity of colonic polyp data. Detailed experimental results, benchmarked against the MICCAI 2015 challenge and other publicly available datasets, show better performance in terms of precision, sensitivity, F1-score, F2-score, and Dice coefficient, demonstrating the efficacy of the proposed model.
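The GIoU measure mentioned above extends plain IoU with a penalty based on the smallest box enclosing both boxes, so it remains informative even when the boxes do not overlap. A minimal sketch for axis-aligned boxes (the coordinates in the comments are hypothetical examples, not values from the paper):

```python
def giou(box_a, box_b):
    """Generalized IoU for axis-aligned boxes given as (x1, y1, x2, y2).
    Assumes both boxes have positive area. GIoU = IoU - |C minus (A ∪ B)| / |C|,
    where C is the smallest enclosing box; values range over (-1, 1]."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # overlap (zero if the boxes are disjoint)
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # smallest enclosing box C
    area_c = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (area_c - union) / area_c

# Identical boxes give GIoU = 1; disjoint boxes such as (0,0,1,1) and
# (2,0,3,1) give a negative value (-1/3 here), unlike IoU, which is
# stuck at 0 for every non-overlapping pair.
```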


9 Citations


Journal Article · DOI: 10.1016/J.COMPBIOMED.2021.104519
Ishak Pacal, Dervis Karaboga · Institutions (2)
Abstract: Colorectal cancer (CRC) is globally the third most common type of cancer. Colonoscopy is considered the gold standard in colorectal cancer screening and allows for the removal of polyps before they become cancerous. Computer-aided detection systems (CADs) have been developed to detect polyps but have limited sensitivity and specificity. In contrast, deep learning architectures provide better detection by extracting the different properties of polyps. However, the desired success has not yet been achieved in real-time polyp detection. Here, we propose a new structure for real-time polyp detection by scaling the YOLOv4 algorithm to overcome these obstacles. For this, we first replace the whole structure with Cross Stage Partial Networks (CSPNet), then substitute the Mish activation function for the Leaky ReLU activation function and the Distance Intersection over Union (DIoU) loss for the Complete Intersection over Union (CIoU) loss. We also improved the performance of the YOLOv3 and YOLOv4 architectures using different structures such as ResNet, VGG, DarkNet53, and Transformers. To increase the success of the proposed method, we utilized a variety of data augmentation approaches for preprocessing, an ensemble learning model, and NVIDIA TensorRT for post-processing. To compare our study with others more objectively, we employed only public datasets and followed the MICCAI Sub-Challenge on Automatic Polyp Detection in Colonoscopy. The proposed method differs from other methods in its real-time performance and state-of-the-art detection accuracy. Without ensemble learning, it achieved results higher than those in the literature: precision 91.62%, recall 82.55%, and F1-score 86.85% on the public ETIS-LARIB dataset, and precision 96.04%, recall 96.68%, and F1-score 96.36% on the public CVC-ColonDB dataset.
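The activation swap described above replaces Leaky ReLU with Mish, defined as x · tanh(softplus(x)). A small sketch of both functions for scalar inputs (the 0.1 negative slope is an assumed example value, not taken from the paper):

```python
import math

def mish(x):
    """Mish activation: x * tanh(softplus(x)); smooth, non-monotonic,
    and unbounded above, which helps gradients propagate in deep nets."""
    return x * math.tanh(math.log1p(math.exp(x)))

def leaky_relu(x, slope=0.1):
    """Leaky ReLU baseline (the 0.1 slope is an assumed value)."""
    return x if x >= 0 else slope * x

# mish(0) = 0, and mish dips slightly below zero for negative inputs
# (e.g. mish(-1) ≈ -0.30) before flattening toward zero, unlike
# Leaky ReLU's constant negative slope.
```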


Topics: Ensemble learning (55%)

6 Citations


References

87 results found


Open access · Proceedings Article · DOI: 10.1109/CVPR.2016.90
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun · Institutions (1)
27 Jun 2016
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
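A residual block in the sense described above can be sketched as y = relu(F(x) + x), where F is the learned residual. The two-layer NumPy version below is a simplified illustration, not the paper's exact block (which uses convolutions and batch normalization):

```python
import numpy as np

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), with residual F(x) = W2 @ relu(W1 @ x).
    The block learns a correction to the identity shortcut instead of an
    unreferenced mapping (simplified: dense layers, no conv/BN)."""
    relu = lambda z: np.maximum(z, 0)
    return relu(w2 @ relu(w1 @ x) + x)

# With zero weights the residual vanishes and the block passes a
# non-negative input through unchanged, which is why adding depth
# does not have to hurt optimization.
x = np.array([1.0, 2.0])
w = np.zeros((2, 2))
```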


Topics: Deep learning (53%), Residual (53%), Convolutional neural network (53%)

93,356 Citations


Open access · Proceedings Article
Karen Simonyan, Andrew Zisserman · Institutions (1)
01 Jan 2015
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
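The appeal of very small 3x3 filters can be checked by arithmetic: n stacked 3x3 convolutions (stride 1) have the receptive field of a single (2n+1)x(2n+1) convolution while using fewer weights. A quick sketch (the channel count is an arbitrary example):

```python
def stacked_rf(n, k=3):
    """Receptive field of n stacked k x k convolutions with stride 1."""
    rf = 1
    for _ in range(n):
        rf += k - 1   # each layer widens the field by k - 1
    return rf

def conv_params(k, c):
    """Weights of one k x k conv with c input and c output channels (no bias)."""
    return k * k * c * c

# Two stacked 3x3 layers cover a 5x5 region; three cover 7x7.
# With c channels, three 3x3 layers use 27c^2 weights versus 49c^2
# for a single 7x7 layer, plus two extra nonlinearities in between.
```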


49,857 Citations



Proceedings Article · DOI: 10.1109/CVPR.2009.5206848
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li +2 more · Institutions (1)
20 Jun 2009
Abstract: The explosion of image data on the Internet has the potential to foster more sophisticated and robust models and algorithms to index, retrieve, organize and interact with images and multimedia data. But exactly how such data can be harnessed and organized remains a critical problem. We introduce here a new database called “ImageNet”, a large-scale ontology of images built upon the backbone of the WordNet structure. ImageNet aims to populate the majority of the 80,000 synsets of WordNet with an average of 500-1000 clean and full resolution images. This will result in tens of millions of annotated images organized by the semantic hierarchy of WordNet. This paper offers a detailed analysis of ImageNet in its current state: 12 subtrees with 5247 synsets and 3.2 million images in total. We show that ImageNet is much larger in scale and diversity and much more accurate than the current image datasets. Constructing such a large-scale database is a challenging task. We describe the data collection scheme with Amazon Mechanical Turk. Lastly, we illustrate the usefulness of ImageNet through three simple applications in object recognition, image classification and automatic object clustering. We hope that the scale, accuracy, diversity and hierarchical structure of ImageNet can offer unparalleled opportunities to researchers in the computer vision community and beyond.


Topics: WordNet (57%), Image retrieval (54%)

31,274 Citations


Open access · Journal Article · DOI: 10.3156/JSOFT.29.5_177_2
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu +4 more · Institutions (2)
08 Dec 2014
Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to ½ everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
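The fixed point described above can be stated directly: for a fixed generator, the optimal discriminator is D*(x) = p_data(x) / (p_data(x) + p_g(x)), which equals 1/2 everywhere once the generator matches the data distribution. A one-line sketch of that expression:

```python
def optimal_discriminator(p_data, p_g):
    """Optimal D for a fixed G: D*(x) = p_data(x) / (p_data(x) + p_g(x)).
    Inputs are the two density values at a single point x."""
    return p_data / (p_data + p_g)

# When p_g = p_data at every x, D* = 1/2 everywhere: the discriminator
# can no longer distinguish real samples from generated ones, which is
# the equilibrium of the minimax game.
```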


Topics: Generative model (64%), Discriminative model (54%), Approximate inference (53%)

29,410 Citations


Performance Metrics
Citations received by the paper in previous years:

Year  Citations
2022  2
2021  35
2020  2
2017  1
2002  1
Network Information
Related Papers (5)
U-Net: Convolutional Networks for Biomedical Image Segmentation (05 Oct 2015) · Olaf Ronneberger, Philipp Fischer +1 more · 100% related
Deep Residual Learning for Image Recognition (27 Jun 2016) · Kaiming He, Xiangyu Zhang +2 more · 99% related
Kvasir-SEG: A Segmented Polyp Dataset (05 Jan 2020) · Debesh Jha, Pia H. Smedsrud +5 more · 99% related
ResUNet++: An Advanced Architecture for Medical Image Segmentation (01 Dec 2019) · Debesh Jha, Pia H. Smedsrud +5 more · 98% related