Journal ArticleDOI
Matrix Capsule Convolutional Projection for Deep Feature Learning
TLDR
A matrix capsule convolution projection (MCCP) module is proposed by replacing the feature vector with a feature matrix, of which each column represents a local feature, and CapDetNet is designed to explore the structural information encoding of the MCCP module on the object detection task.
Abstract: Capsule projection network (CapProNet) has shown its ability to obtain semantic information and spatial structural information from raw images. However, the vector capsule of CapProNet is limited in representing semantic information because it ignores local information. Besides, the number of trainable parameters also grows greatly with the dimension of the feature vector. To that end, we propose a matrix capsule convolution projection (MCCP) module that replaces the feature vector with a feature matrix, of which each column represents a local feature. The feature matrix is then convolved column by column into capsule subspaces, effectively decreasing the number of trainable parameters. Furthermore, CapDetNet is designed to explore the structural information encoding of the MCCP module on the object detection task. Experimental results demonstrate that the proposed MCCP outperforms the baselines in image classification, and that CapDetNet achieves a 2.3% performance gain in object detection.
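The parameter saving described in the abstract can be illustrated with a toy sketch. This is a simplified reading of the idea, not the paper's implementation: all sizes (`d`, `k`, `m`, `c`, `num_classes`), the shared per-class kernel `W`, and the norm-based readout are assumptions chosen for illustration. A vector projection needs a `d x c` basis per class, while projecting the `m` columns of a `k x m` feature matrix with a shared `k x c` kernel needs only `k x c` per class.

```python
import numpy as np

rng = np.random.default_rng(0)

d, num_classes, c = 256, 10, 4     # feature dim, classes, subspace dim (assumed)

# CapProNet-style vector projection: one d x c subspace basis per class
vec_params = num_classes * d * c

# MCCP-style: reshape the d-dim feature into a k x m matrix whose m columns
# are local features, then project the columns with a shared k x c kernel
k, m = 16, 16                      # d = k * m
mat_params = num_classes * k * c

x = rng.standard_normal(d)
X = x.reshape(k, m)                # columns are local features

W = rng.standard_normal((num_classes, k, c))   # shared per-class kernels

# project every column of X into each class's capsule subspace
# result: (num_classes, c, m) -- m projected local capsules per class
proj = np.einsum('nkc,km->ncm', W, X)

# class score: norm of the pooled projected capsules (one plausible readout)
scores = np.linalg.norm(proj.mean(axis=2), axis=1)

print(vec_params, mat_params)      # parameter counts: 10240 vs 640
```

With these toy sizes the column-wise projection uses 16x fewer trainable parameters than the full vector projection, which is the kind of reduction the abstract claims.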
Citations
Journal ArticleDOI
Robust Federated Averaging via Outlier Pruning
TL;DR: This letter proposes a robust aggregation strategy that first prunes the node-wise outlier updates from the locally trained models and then performs the aggregation on the selected effective weight set at each node, outperforming state-of-the-art methods in terms of communication speedup, test-set performance, and training convergence.
Proceedings ArticleDOI
SPNet: Utilizing Subspace Projection to Achieve Feature Interaction for Click-Through Rate
Xu Zhang, Z. Ou, Meina Song +2 more
TL;DR: Wang et al. as discussed by the authors proposed a novel subspace projection network (SPNet) to make all features interact with each other in one subspace by stacking multiple layers, which achieved state-of-the-art performance.
Posted Content
ASPCNet: A Deep Adaptive Spatial Pattern Capsule Network for Hyperspectral Image Classification.
Abstract: Previous studies have shown the great potential of capsule networks for spatial contextual feature extraction from hyperspectral images (HSIs). However, the sampling locations of the convolutional kernels of capsules are fixed and cannot adaptively change according to the inconsistent semantic information of HSIs. Based on this observation, this paper proposes an adaptive spatial pattern capsule network (ASPCNet) architecture by developing an adaptive spatial pattern (ASP) unit that can rotate the sampling locations of convolutional kernels on the basis of an enlarged receptive field. Note that this unit can learn more discriminative representations of HSIs with fewer parameters. Specifically, two cascaded ASP-based convolution operations (ASPConvs) are applied to input images to learn relatively high-level semantic features, transmitting hierarchical structures among capsules more accurately than the use of the most fundamental features. Furthermore, the semantic features are fed into ASP-based conv-capsule operations (ASPCaps) to explore the shapes of objects among the capsules in an adaptive manner, further exploring the potential of capsule networks. Finally, the class labels of image patches centered on test samples are determined by the fully connected capsule layer. Experiments on three public datasets demonstrate that ASPCNet yields competitive performance with higher accuracies than state-of-the-art methods.
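The rotation of kernel sampling locations described above can be sketched in a minimal form. This is a toy stand-in, not the ASP unit itself: in ASPCNet the rotation is learned, whereas here the angle `theta`, the grid size, and the nearest-neighbor sampling are all assumptions for illustration.

```python
import numpy as np

def rotated_offsets(kernel_size=3, theta=np.pi / 6):
    # regular kernel_size x kernel_size sampling grid centered at (0, 0)
    r = kernel_size // 2
    ys, xs = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1), indexing='ij')
    grid = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    # rotate the grid by theta -- a fixed stand-in for ASP's learned rotation
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return grid @ R.T

offsets = rotated_offsets()

# sample a feature map at the rotated locations around a center pixel,
# using nearest-neighbor lookup for simplicity
fmap = np.arange(49, dtype=float).reshape(7, 7)
cy, cx = 3, 3
samples = np.array([fmap[int(np.rint(cy + dy)), int(np.rint(cx + dx))]
                    for dy, dx in offsets])
print(samples.shape)
```

A convolution kernel applied to `samples` would then aggregate information along the rotated pattern instead of the axis-aligned grid, which is the mechanism the abstract attributes to the ASP unit.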
Book ChapterDOI
Learnable Subspace Orthogonal Projection for Semi-supervised Image Classification
TL;DR: Wang et al. as mentioned in this paper employed an auto-encoder to construct a scalable and learnable subspace orthogonal projection network, thus enjoying lower computational consumption of subspace acquisition and smooth cooperation with deep neural networks.
Posted Content
DPR-CAE: Capsule Autoencoder with Dynamic Part Representation for Image Parsing.
TL;DR: Zhang et al. as discussed by the authors proposed a simple and effective capsule autoencoder to address this issue, called DPR-CAE, which parses the input into a set of part capsules, including pose, intensity, and dynamic vector.
References
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won 1st place on the ILSVRC 2015 classification task.
Proceedings ArticleDOI
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Journal ArticleDOI
Gradient-based learning applied to document recognition
Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
Proceedings ArticleDOI
Going deeper with convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Proceedings ArticleDOI
Densely Connected Convolutional Networks
TL;DR: DenseNet as mentioned in this paper proposes to connect each layer to every other layer in a feed-forward fashion, which can alleviate the vanishing gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters.
Related Papers (5)
Image robust recognition based on feature-entropy-oriented differential fusion capsule network
Deep convolutional neural network-based three-dimensional model retrieval algorithm
An Boqing, Shi Weifeng +1 more