Book Chapter DOI

Detector-SegMentor Network for Skin Lesion Localization and Segmentation

TL;DR: In this article, a Faster Region-based CNN (Faster R-CNN) is used as a preprocessing step to predict bounding boxes of the lesions in the whole image; the boxed regions are then cropped and fed into the segmentation network to obtain the lesion mask.
Abstract: Melanoma is a life-threatening form of skin cancer when left undiagnosed in its early stages. Although non-melanoma skin cancers are more common, melanoma is more deadly. Early detection of melanoma is crucial for timely diagnosis and for preventing its spread to distant body parts. Skin lesion segmentation is a crucial step in classifying melanoma among cancerous lesions in dermoscopic images. Manual segmentation of dermoscopic skin images is time-consuming and error-prone, creating an urgent need for an intelligent and accurate algorithm. In this study, we propose a simple yet novel network-in-network convolutional neural network (CNN) based approach for segmentation of the skin lesion. A Faster Region-based CNN (Faster R-CNN) is first used to predict bounding boxes of the lesions in the whole image; the boxed regions are then cropped and fed into the segmentation network to obtain the lesion mask. The segmentation network is a combination of the UNet and Hourglass networks. We trained and evaluated our models on the ISIC 2018 dataset and also cross-validated on the PH2 and ISBI 2017 datasets. Our proposed method surpassed the state of the art with a Dice Similarity Coefficient of 0.915 and an Accuracy of 0.959 on the ISIC 2018 dataset, and a Dice Similarity Coefficient of 0.947 and an Accuracy of 0.971 on the ISBI 2017 dataset.
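The two-stage pipeline described above is easy to sketch. The snippet below is a minimal, hypothetical rendering, not the authors' code: it uses torchvision's off-the-shelf Faster R-CNN (the COCO-pretrained weights are only a stand-in; the paper fine-tunes on lesion boxes) and a placeholder `segmentor` in place of the paper's UNet/Hourglass hybrid.

```python
# Hypothetical detect-then-segment pipeline: Faster R-CNN proposes lesion boxes,
# each box is cropped, segmented at a fixed resolution, and pasted back.
import torch
import torchvision
from torchvision.transforms.functional import resize

# Stand-in detector; the paper fine-tunes Faster R-CNN on lesion boxes instead.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
detector.eval()

def segment_lesions(image, segmentor, score_thresh=0.5, seg_size=(256, 256)):
    """image: float tensor (3, H, W) in [0, 1]; segmentor: any crop-level model."""
    _, H, W = image.shape
    full_mask = torch.zeros(H, W)
    with torch.no_grad():
        det = detector([image])[0]                 # dict: boxes, labels, scores
        for box, score in zip(det["boxes"], det["scores"]):
            if score < score_thresh:
                continue
            x1, y1, x2, y2 = box.int().tolist()
            crop = resize(image[:, y1:y2, x1:x2], list(seg_size))  # lesion crop
            logits = segmentor(crop.unsqueeze(0))                  # (1, 1, h, w)
            mask = resize(torch.sigmoid(logits)[0], [y2 - y1, x2 - x1])
            full_mask[y1:y2, x1:x2] = (mask[0] > 0.5).float()      # paste back
    return full_mask
```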
Citations
Proceedings Article DOI
08 Apr 2021
TL;DR: In this paper, the authors propose an end-to-end convolutional neural network (CNN) for precise and robust skin lesion localization and segmentation, consisting of three sub-encoders branching out from the main encoder.
Abstract: Melanoma is the most common form of cancer in the world. Early diagnosis of the disease and an accurate estimation of its size and shape are crucial in preventing its spread to other body parts. Manual segmentation of these lesions by a radiologist, however, is time-consuming and error-prone. It is clinically desirable to have an automatic tool that detects malignant skin lesions in dermoscopic skin images. We propose a novel end-to-end convolutional neural network (CNN) for precise and robust skin lesion localization and segmentation. The proposed network has three sub-encoders branching out from the main encoder. The three sub-encoders are inspired by Coordinate Convolution, Hourglass, and Octave Convolution blocks: each sub-encoder summarizes different patterns, yet collectively they aim to achieve a precise segmentation. We trained our segmentation model only on the ISIC 2018 dataset. To demonstrate its generalizability, we evaluated the model on ISIC 2018 and on unseen datasets including ISIC 2017 and PH2. Our approach showed an average 5% improvement in performance across datasets while having less than half the number of parameters of other state-of-the-art segmentation models.
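A sketch of the branching idea, under stated assumptions: the three specialized sub-encoders (Coordinate Convolution, Hourglass, Octave Convolution) are replaced here by plain convolution stacks, since the abstract does not give their internals; only the share-then-branch-then-fuse topology is illustrated.

```python
# Share-then-branch-then-fuse encoder topology (branch internals are stand-ins).
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class ThreeBranchEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.stem = conv_block(3, 32)                    # shared main encoder
        self.branch_a = conv_block(32, 64)               # stand-in for CoordConv branch
        self.branch_b = conv_block(32, 64)               # stand-in for Hourglass branch
        self.branch_c = conv_block(32, 64)               # stand-in for Octave branch
        self.fuse = nn.Conv2d(3 * 64, 64, kernel_size=1) # merge branch features

    def forward(self, x):
        shared = self.stem(x)
        feats = [self.branch_a(shared), self.branch_b(shared), self.branch_c(shared)]
        return self.fuse(torch.cat(feats, dim=1))        # fused encoding for a decoder
```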

6 citations

Journal Article DOI
TL;DR: Wang et al. proposed an improved skin lesion segmentation model based on deformable 3D convolution and ResU-NeXt++ (D3DC-ResU-NeXt++).
Abstract: Melanoma is one of the most dangerous skin cancers. Current melanoma segmentation is mainly based on FCNs (fully convolutional networks) and U-Net. Nevertheless, these two kinds of neural networks are prone to parameter redundancy and to vanishing gradients during backpropagation as the network gets deeper, which reduces the Jaccard index of the skin lesion image segmentation model. To solve these problems and improve the survival rate of melanoma patients, an improved skin lesion segmentation model based on deformable 3D convolution and ResU-NeXt++ (D3DC-ResU-NeXt++) is proposed in this paper. The new modules in D3DC-ResU-NeXt++ can replace ordinary modules in existing 2D convolutional neural networks (CNNs) and can be trained efficiently through standard backpropagation with high segmentation accuracy. In particular, we introduce a new data preprocessing method with dilation, crop operation, resizing, and hair removal (DCRH), which improves the Jaccard index of skin lesion image segmentation. Because rectified Adam (RAdam) does not easily fall into a local optimum and converges quickly in segmentation model training, we also adopt RAdam as the training optimizer. Experiments show that our model performs excellently on the ISIC 2018 Task 1 segmentation dataset, achieving a Jaccard index of 86.84%. The proposed method improves the Jaccard index of skin lesion image segmentation and can also assist dermatologists in determining the types of skin lesions and the boundary between lesions and normal skin, so as to improve the survival rate of skin cancer patients. Overview of the proposed model: D3DC-ResU-NeXt++ has strong spatial geometry processing capabilities and is used to segment the skin lesion sample images; DCRH and transfer learning are used to preprocess the dataset and initialize D3DC-ResU-NeXt++, respectively, which highlights the difference between the lesion area and normal skin and enhances the segmentation efficiency and robustness of the network; RAdam is used to speed up the convergence of the network and improve the efficiency of segmentation.
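The abstract names DCRH's stages but not their implementation. As one plausible reading of the hair-removal and resizing steps, the sketch below uses the common black-hat-plus-inpainting approach from the dermoscopy preprocessing literature; the kernel size and threshold are illustrative guesses, not the paper's values.

```python
# Illustrative hair removal + resize (one common approach, not the paper's code):
# black-hat filtering highlights dark hairs, which are then inpainted away.
import cv2
import numpy as np

def remove_hair_and_resize(bgr_image, size=(256, 256)):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (17, 17))
    blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, kernel)   # dark hairs pop out
    _, hair_mask = cv2.threshold(blackhat, 10, 255, cv2.THRESH_BINARY)
    clean = cv2.inpaint(bgr_image, hair_mask, 3, cv2.INPAINT_TELEA) # fill hair pixels
    return cv2.resize(clean, size)
```

RAdam itself needs no custom code in current frameworks; recent PyTorch releases ship it as torch.optim.RAdam.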

4 citations

Proceedings Article DOI
11 Feb 2022
TL;DR: In this paper, a skin lesion image is fed into the system and examined using novel image processing algorithms to infer the presence of skin cancer, demonstrating how melanoma can be identified using machine learning and technological tools.
Abstract: Skin cancer has become one of the most serious kinds of cancer for people in recent years. Melanoma, basal cell carcinoma, and squamous cell carcinoma are all kinds of skin cancer, with melanoma being the most unpredictable. Melanoma can be cured if detected in its early stages. Computer vision can be useful in medical imaging and has already proven itself in a number of systems. This study examines how to identify melanoma skin cancer using machine learning and technological tools. A skin lesion image is fed into the system and examined using novel image processing algorithms to infer the presence of skin cancer. By segmenting the picture and assessing the texture, size, and form of the tumor, the lesion image analysis tool checks for the presence of melanoma (a kind of skin cancer). Using the derived feature characteristics, the picture is classified as normal skin or a malignant melanoma lesion. In a nutshell, DermaGenics is a web application integrated with the YOLOv5 model that allows users to upload photos of a skin stain; the model processes the image and evaluates whether the stain is cancerous or benign.
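For context on the deployment pattern, here is a hedged sketch of the kind of YOLOv5 call a web app like DermaGenics might wrap. The weights file name (lesion_best.pt) and the returned class names are hypothetical; only the torch.hub loading and the results.pandas() readout are standard YOLOv5 API.

```python
# Hypothetical YOLOv5 inference wrapper for a lesion-screening web app.
import torch

# Custom weights path is an assumption; the paper does not publish one.
model = torch.hub.load("ultralytics/yolov5", "custom", path="lesion_best.pt")

def classify_lesion(image_path):
    results = model(image_path)                  # run detection on the image
    detections = results.pandas().xyxy[0]        # DataFrame: xmin..xmax, confidence, name
    if detections.empty:
        return "no lesion detected"
    best = detections.loc[detections["confidence"].idxmax()]
    return f"{best['name']} (confidence {best['confidence']:.2f})"
```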

1 citation

References
Proceedings Article DOI
27 Jun 2016
TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks substantially deeper than those used previously; an ensemble of these residual nets won 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers, 8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to the ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
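The core idea reduces to a few lines: a block learns a residual function F(x) and an identity shortcut adds the input back, so the block outputs F(x) + x. A minimal PyTorch sketch of the basic two-convolution variant (not the paper's exact 152-layer configuration):

```python
# Basic residual block: learn a residual F(x), add the identity shortcut.
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + x)   # identity shortcut: easier to optimize
```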

123,388 citations

Book Chapter DOI
05 Oct 2015
TL;DR: Ronneberger et al. proposed a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently; the network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopy stacks.
Abstract: There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
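The contracting/expanding structure with skip connections is compact enough to sketch directly. The toy version below keeps one skip level and small channel counts purely for illustration; the published U-Net is much deeper and was originally implemented in Caffe.

```python
# Toy U-Net: contracting path for context, expanding path for localization,
# and a skip connection concatenating matching encoder/decoder levels.
import torch
import torch.nn as nn

def double_conv(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self, num_classes=1):
        super().__init__()
        self.enc1 = double_conv(3, 32)
        self.enc2 = double_conv(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec = double_conv(64, 32)              # 64 = 32 upsampled + 32 skip
        self.head = nn.Conv2d(32, num_classes, 1)

    def forward(self, x):                            # H, W assumed even
        s1 = self.enc1(x)                            # contracting path
        s2 = self.enc2(self.pool(s1))
        d = self.up(s2)                              # expanding path
        d = self.dec(torch.cat([d, s1], dim=1))      # skip connection
        return self.head(d)
```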

49,590 citations

Proceedings Article DOI
27 Jun 2016
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Abstract: We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
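The "detection as regression" framing means the network's output tensor is decoded by plain arithmetic rather than a proposal pipeline. Below is a sketch of that decode step, assuming the paper's S×S×(B·5+C) layout with boxes stored before class probabilities in each cell (implementations order these fields differently).

```python
# Decode a YOLO-style output tensor into scored boxes; layout assumed as
# [B x (x, y, w, h, conf), then C class probs] per grid cell.
import torch

def decode_yolo(pred, S=7, B=2, C=20, conf_thresh=0.2):
    """pred: (S, S, B*5 + C) tensor; returns [(x, y, w, h, score, class_id)]."""
    boxes = []
    for row in range(S):
        for col in range(S):
            cell = pred[row, col]
            class_probs = cell[B * 5:]               # conditional class probabilities
            cls = int(class_probs.argmax())
            for b in range(B):
                x, y, w, h, conf = cell[b * 5: b * 5 + 5].tolist()
                score = conf * class_probs[cls].item()   # class-specific confidence
                if score >= conf_thresh:
                    # x, y are offsets within the cell; w, h are image-relative
                    boxes.append(((col + x) / S, (row + y) / S, w, h, score, cls))
    return boxes
```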

27,256 citations

Journal Article DOI
18 Jun 2018
TL;DR: This work proposes a novel architectural unit, termed the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels; SE blocks produce significant performance improvements for existing state-of-the-art deep architectures at minimal additional computational cost.
Abstract: The central building block of convolutional neural networks (CNNs) is the convolution operator, which enables networks to construct informative features by fusing both spatial and channel-wise information within local receptive fields at each layer. A broad range of prior research has investigated the spatial component of this relationship, seeking to strengthen the representational power of a CNN by enhancing the quality of spatial encodings throughout its feature hierarchy. In this work, we focus instead on the channel relationship and propose a novel architectural unit, which we term the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels. We show that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets. We further demonstrate that SE blocks bring significant improvements in performance for existing state-of-the-art CNNs at slight additional computational cost. Squeeze-and-Excitation Networks formed the foundation of our ILSVRC 2017 classification submission which won first place and reduced the top-5 error to 2.251 percent, surpassing the winning entry of 2016 by a relative improvement of ~25 percent. Models and code are available at https://github.com/hujie-frank/SENet .
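The SE block itself is only a few lines; this is a minimal PyTorch rendering of the squeeze (global average pooling), excitation (bottleneck MLP with sigmoid gates), and channel-rescaling steps the abstract describes:

```python
# Minimal Squeeze-and-Excitation block: pool each channel to one number,
# learn per-channel gates through a bottleneck, rescale the input.
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze: (B, C, 1, 1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                            # per-channel gates in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # recalibrate channel responses
```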

14,807 citations

Proceedings Article DOI
20 Mar 2017
TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation, which extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.
Abstract: We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without tricks, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition. Code will be made available.
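Mask R-CNN is available off the shelf, so a usable sketch needs no custom layers; the snippet below runs torchvision's COCO-pretrained model and reads out the parallel box and mask outputs the abstract describes:

```python
# Run torchvision's pretrained Mask R-CNN and collect boxes plus instance masks.
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect_instances(image, score_thresh=0.5):
    """image: float tensor (3, H, W) in [0, 1]."""
    with torch.no_grad():
        out = model([image])[0]                 # dict: boxes, labels, scores, masks
    keep = out["scores"] >= score_thresh
    boxes = out["boxes"][keep]                  # (N, 4) bounding boxes
    masks = out["masks"][keep] > 0.5            # (N, 1, H, W) binary instance masks
    return boxes, masks
```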

14,299 citations