DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
Citations
10,189 citations
7,113 citations
Cites background or methods from "DeepLab: Semantic Image Segmentatio..."
...We refer interested readers to [9] for more details....
[...]
...Spatial pyramid pooling: Models, such as PSPNet [81] or DeepLab [9, 10], perform spatial pyramid pooling [23, 40] at several grid scales (including image-level pooling [47]) or apply several parallel atrous convolutions with different rates (called Atrous Spatial Pyramid Pooling, or ASPP)....
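The idea of parallel atrous branches with different rates can be sketched in a few lines. The following is only a 1-D, pure-Python illustration and not code from any of the cited papers: the function names `atrous_conv1d` and `aspp1d` are hypothetical, zero padding is assumed, and fusing the branches by element-wise summation is an assumption made for brevity. As in most deep learning libraries, the "convolution" here is a cross-correlation (the kernel is not flipped).

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution with zero padding.

    Assumes an odd-length kernel; the output has the same length as the input.
    Kernel taps are spaced `rate` samples apart, so rate=1 is an ordinary
    convolution.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + (j - half) * rate  # tap position in the input
            if 0 <= idx < len(signal):   # positions outside count as zero
                acc += weight * signal[idx]
        out.append(acc)
    return out


def aspp1d(signal, kernel, rates):
    """Run one atrous branch per rate and fuse them by summation (assumption)."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [sum(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    print(atrous_conv1d(impulse, [1, 2, 3], rate=2))
    print(aspp1d(impulse, [1, 2, 3], rates=[1, 2]))
```

Feeding an impulse through a single branch shows the kernel weights landing `rate` samples apart, which is how a larger rate widens the field of view without adding weights.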
[...]
...In this subsection, we evaluate the segmentation accuracy with the trimap experiment [36, 37, 9] to quantify the accuracy of the proposed decoder module near object boundaries....
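A rough sketch of the boundary-focused evaluation mentioned above: restrict scoring to a narrow band of pixels around label boundaries and measure accuracy only there. This is a simplified stand-in, not the protocol of the cited papers. It assumes the band is derived from the ground-truth label map itself (a pixel is in the band if any pixel within `width` steps has a different label), it reports plain pixel accuracy rather than mean IoU, and the names `boundary_band` and `band_accuracy` are hypothetical.

```python
def boundary_band(labels, width):
    """Mark pixels that lie within `width` pixels of a label change."""
    h, w = len(labels), len(labels[0])
    band = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in range(-width, width + 1):
                for dx in range(-width, width + 1):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < h and 0 <= nx < w
                    if inside and labels[ny][nx] != labels[y][x]:
                        band[y][x] = True
    return band


def band_accuracy(pred, labels, width):
    """Pixel accuracy computed only inside the boundary band."""
    band = boundary_band(labels, width)
    total = correct = 0
    for y, row in enumerate(band):
        for x, in_band in enumerate(row):
            if in_band:
                total += 1
                correct += int(pred[y][x] == labels[y][x])
    return correct / total if total else float("nan")
```

Sweeping `width` from narrow to wide bands shows how much of a model's error is concentrated near object boundaries.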
[...]
..., image pyramid) [18, 16, 58, 44, 11, 9] or those that adopt probabilistic graphical models (such as DenseCRF [37] with efficient inference algorithm [2]) [8, 4, 82, 44, 48, 55, 63, 34, 72, 6, 7, 9]....
[...]
5,709 citations
Cites background or methods from "DeepLab: Semantic Image Segmentatio..."
...The ASPP [5] module differs from the improved SPP module mainly in that the original k×k max-pooling with stride 1 is replaced by several 3 × 3 dilated convolutions with dilation rate k and stride 1....
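To make the comparison concrete, the spatial extent covered by a dilated kernel follows the standard identity k_eff = k + (k - 1)(d - 1) for kernel size k and dilation d. The helper below is a minimal sketch of that arithmetic; the name `effective_kernel_size` and the example dilation values are illustrative and not taken from the excerpt.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


if __name__ == "__main__":
    # Illustrative dilation values only.
    for d in (1, 2, 6, 12):
        print(d, effective_kernel_size(3, d))
```

For a 3 × 3 kernel this reduces to 2d + 1, so a dilated 3 × 3 convolution spans a wide window while still using only nine weights, whereas a k×k max-pooling window covers exactly k pixels per axis.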
[...]
...The RFB module uses several dilated convolutions with k×k kernels, dilation rate k, and stride 1 to obtain a more comprehensive spatial coverage than ASPP....
[...]
...To sum up, an ordinary object detector is composed of several parts:
• Input: Image, Patches, Image Pyramid
• Backbones: VGG16 [68], ResNet-50 [26], SpineNet [12], EfficientNet-B0/B7 [75], CSPResNeXt50 [81], CSPDarknet53 [81]
• Neck:
  • Additional blocks: SPP [25], ASPP [5], RFB [47], SAM [85]
  • Path-aggregation blocks: FPN [44], PAN [49], NAS-FPN [17], Fully-connected FPN, BiFPN [77], ASFF [48], SFAM [98]
• Heads:
  • Dense Prediction (one-stage):
    ◦ RPN [64], SSD [50], YOLO [61], RetinaNet [45] (anchor based)
    ◦ CornerNet [37], CenterNet [13], MatrixNet [60], FCOS [78] (anchor free)
  • Sparse Prediction (two-stage):
    ◦ Faster R-CNN [64], R-FCN [9], Mask R-CNN [23] (anchor based)
    ◦ RepPoints [87] (anchor free)...
[...]
...Common modules that can be used to enhance receptive field are SPP [25], ASPP [5], and RFB [47]....
[...]
...• Additional blocks: SPP [25], ASPP [5], RFB [47], SAM [85]...
[...]
4,327 citations
Cites background or methods from "DeepLab: Semantic Image Segmentatio..."
...We employ a pretrained residual network with the dilated strategy [3] as the backbone....
[...]
...First, Deeplabv2 [3] and Deeplabv3 [4] adopt atrous spatial pyramid pooling to embed contextual information, which consists of parallel dilated convolutions with different dilation rates....
[...]
...For example, some works [3, 4, 29] aggregate multi-scale contexts via combining feature maps generated by different dilated convolutions and pooling operations....
[...]
...Following [3], we adopt multi-loss at the end of the network when both attention modules are used....
[...]