DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
Citations
10,189 citations
7,113 citations
Cites background or methods from "DeepLab: Semantic Image Segmentatio..."
...We refer interested readers to [9] for more details....
[...]
...Spatial pyramid pooling: Models, such as PSPNet [81] or DeepLab [9, 10], perform spatial pyramid pooling [23, 40] at several grid scales (including image-level pooling [47]) or apply several parallel atrous convolution with different rates (called Atrous Spatial Pyramid Pooling, or ASPP)....
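The parallel-branch idea in this excerpt can be sketched in one dimension. This is only an illustration, not DeepLab's implementation: the function names `atrous_conv1d` and `aspp1d` are hypothetical, the kernel is assumed to have odd length, every branch shares one kernel, and the branches are fused by a plain sum for brevity. In a real network each branch has its own learned weights.

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution with zero padding; output has the input's length.

    Kernel taps are spaced `rate` samples apart, so rate=1 is an ordinary
    convolution (in the cross-correlation form used by deep learning libraries).
    Assumes an odd-length kernel.
    """
    half = (len(kernel) // 2) * rate  # reach of the kernel on each side of the center
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i - half + j * rate
            if 0 <= idx < len(signal):  # samples outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def aspp1d(signal, kernel, rates):
    """Run the same kernel at several rates in parallel and sum the branch outputs."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [sum(vals) for vals in zip(*branches)]


impulse = [0, 0, 0, 1, 0, 0, 0]
print(atrous_conv1d(impulse, [1, 1, 1], rate=1))
print(atrous_conv1d(impulse, [1, 1, 1], rate=2))
print(aspp1d(impulse, [1, 1, 1], rates=[1, 2]))
```

Feeding an impulse makes the effect visible: the response at rate 2 touches samples two positions apart, so the fused output sees a wider context than either branch alone while each branch still uses only three weights.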
[...]
...In this subsection, we evaluate the segmentation accuracy with the trimap experiment [36, 37, 9] to quantify the accuracy of the proposed decoder module near object boundaries....
[...]
..., image pyramid) [18, 16, 58, 44, 11, 9] or those that adopt probabilistic graphical models (such as DenseCRF [37] with efficient inference algorithm [2]) [8, 4, 82, 44, 48, 55, 63, 34, 72, 6, 7, 9]....
[...]
5,709 citations
Cites background or methods from "DeepLab: Semantic Image Segmentatio..."
...The difference in operation between ASPP [5] module and improved SPP module is mainly from the original k×k kernel size, max-pooling of stride equals to 1 to several 3 × 3 kernel size, dilated ratio equals to k, and stride equals to 1 in dilated convolution operation....
[...]
...RFB module is to use several dilated convolutions of k×k kernel, dilated ratio equals to k, and stride equals to 1 to obtain a more comprehensive spatial coverage than ASPP....
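The spatial coverage that these excerpts compare follows from simple arithmetic: a k×k kernel with dilation rate r spans k + (k − 1)(r − 1) pixels per side while still using only k·k weights. A minimal sketch, where the helper name `effective_kernel_size` and the rates shown are chosen purely for illustration:

```python
def effective_kernel_size(kernel_size, rate):
    """Span in pixels, per side, of a kernel whose taps are `rate` pixels apart."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


for rate in (1, 2, 4, 8):
    span = effective_kernel_size(3, rate)
    print(f"3x3 kernel, rate {rate}: covers {span}x{span} pixels with 9 weights")
```

For a 3×3 kernel the span reduces to 2r + 1, which is why raising the rate widens the receptive field without adding parameters.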
[...]
...To sum up, an ordinary object detector is composed of several parts:
• Input: Image, Patches, Image Pyramid
• Backbones: VGG16 [68], ResNet-50 [26], SpineNet [12], EfficientNet-B0/B7 [75], CSPResNeXt50 [81], CSPDarknet53 [81]
• Neck:
  • Additional blocks: SPP [25], ASPP [5], RFB [47], SAM [85]
  • Path-aggregation blocks: FPN [44], PAN [49], NAS-FPN [17], Fully-connected FPN, BiFPN [77], ASFF [48], SFAM [98]
• Heads:
  • Dense Prediction (one-stage):
    ◦ RPN [64], SSD [50], YOLO [61], RetinaNet [45] (anchor based)
    ◦ CornerNet [37], CenterNet [13], MatrixNet [60], FCOS [78] (anchor free)
  • Sparse Prediction (two-stage):
    ◦ Faster R-CNN [64], R-FCN [9], Mask R-CNN [23] (anchor based)
    ◦ RepPoints [87] (anchor free)...
[...]
...Common modules that can be used to enhance receptive field are SPP [25], ASPP [5], and RFB [47]....
[...]
4,327 citations
Cites background or methods from "DeepLab: Semantic Image Segmentatio..."
...We employ a pretrained residual network with the dilated strategy [3] as the backbone....
[...]
...First, Deeplabv2 [3] and Deeplabv3 [4] adopt atrous spatial pyramid pooling to embed contextual information, which consist of parallel dilated convolutions with different dilated rates....
[...]
...For example, some works [3, 4, 29] aggregate multi-scale contexts via combining feature maps generated by different dilated convolutions and pooling operations....
[...]
...Following [3], we adopt multi-loss on the end of the network when both two attention modules are used....
[...]
3,318 citations
References
334 citations
"DeepLab: Semantic Image Segmentatio..." refers background or methods in this paper
...In a different direction, [63] replace the bilateral filtering module used in mean field inference with a faster domain transform module [67], improving the speed and lowering the memory requirements of the overall system, while [18], [68] combine semantic segmentation with edge detection....
[...]
...This direction has been extended by several follow-up papers [17], [40], [58], [59], [60], [61], [62], [63], [65], since the first version of our work was published [38]....
[...]
...Multiple groups have made important advances, significantly raising the bar on the PASCAL VOC 2012 semantic segmentation benchmark, as reflected to the high level of activity in the benchmark’s leaderboard [17], [40], [58], [59], [60], [61], [62], [63]....
[...]
331 citations
"DeepLab: Semantic Image Segmentatio..." refers background in this paper
...We refer the interested reader to [74] for early references from the wavelet literature....
[...]
324 citations
"DeepLab: Semantic Image Segmentatio..." refers background or methods in this paper
...While we employ the CRF as a post-processing method, [40], [59], [62], [64], [65] have successfully pursued joint learning of the DCNN and CRF....
[...]
...This direction has been extended by several follow-up papers [17], [40], [58], [59], [60], [61], [62], [63], [65], since the first version of our work was published [38]....
[...]
...In particular, [59], [65] unroll the CRF mean-field inference steps to convert the whole system into an end-to-end trainable feed-forward network, while [62] approximates one iteration of the dense CRF mean field inference [22] by convolutional layers with learnable filters....
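The phrase "unroll the CRF mean-field inference steps" can be made concrete with a toy example. What follows is a generic mean-field update for a small fully connected CRF, not the method of [59], [62], or [65]. The function names are hypothetical, the pairwise affinities are passed in as a dense matrix (the efficient filtering used in practice is left out), and all sites are updated in parallel.

```python
import math


def mean_field_step(q, unary, kernel, compat):
    """One mean-field update for a fully connected CRF over n sites and L labels.

    q[i][l]      current marginal of label l at site i
    unary[i][l]  unary energy, e.g. a negative log-probability from the DCNN
    kernel[i][j] pairwise affinity between sites i and j (the diagonal is ignored)
    compat[l][m] label compatibility penalty (Potts: 1 if l != m, else 0)
    """
    n, num_labels = len(q), len(q[0])
    new_q = []
    for i in range(n):
        # Message passing: affinity-weighted sum of the other sites' marginals.
        msg = [sum(kernel[i][j] * q[j][m] for j in range(n) if j != i)
               for m in range(num_labels)]
        # Compatibility transform, then add the unary term.
        energy = [unary[i][l] + sum(compat[l][m] * msg[m] for m in range(num_labels))
                  for l in range(num_labels)]
        # Normalise with a softmax over negative energies (shifted for stability).
        shift = min(energy)
        weights = [math.exp(-(e - shift)) for e in energy]
        z = sum(weights)
        new_q.append([w / z for w in weights])
    return new_q


def unrolled_mean_field(unary, kernel, compat, steps=5):
    """Apply a fixed number of identical updates, i.e. a feed-forward chain of steps."""
    # With all-zero marginals the messages vanish, so this initialises from the unaries.
    q = mean_field_step([[0.0] * len(u) for u in unary], unary, kernel, compat)
    for _ in range(steps):
        q = mean_field_step(q, unary, kernel, compat)
    return q


unary = [[0.1, 3.0], [0.8, 0.6], [0.1, 3.0]]   # site 1 weakly prefers label 1
kernel = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]     # every pair of sites is connected
potts = [[0, 1], [1, 0]]
print(unrolled_mean_field(unary, kernel, potts))
```

Because the number of steps is fixed, the whole procedure is a stack of identical layers, which is the property that allows gradients to flow back through the inference into the network that produces the unaries.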
[...]
312 citations
285 citations
"DeepLab: Semantic Image Segmentatio..." refers methods in this paper
...Interestingly, the atrous convolution technique has also been adopted for a broader set of tasks, such as object detection [12], [77], instance-level segmentation [78], visual question answering [79], and optical flow [80]....
[...]