Journal ArticleDOI

Mining Deep Semantic Representations for Scene Classification of High-Resolution Remote Sensing Imagery

Fan Hu, +3 more
01 Sep 2020
Vol. 6, Iss. 3, pp. 522-536
TLDR
This paper proposes to build powerful semantic features using the probabilistic latent semantic analysis (pLSA) model, employing pre-trained deep convolutional neural networks (CNNs) as feature extractors rather than relying on hand-crafted features.
Abstract
Scene classification is one of the most fundamental tasks in the interpretation of high-resolution remote sensing (HRRS) images. Many recent works show that probabilistic topic models, which are capable of mining the latent semantics of images, can be effectively applied to HRRS scene classification. However, existing approaches based on topic models simply use low-level hand-crafted features to form semantic features, which severely limits the representative capability of the semantic features derived from topic models. To alleviate this problem, this paper proposes to build powerful semantic features using the probabilistic latent semantic analysis (pLSA) model, employing pre-trained deep convolutional neural networks (CNNs) as feature extractors rather than relying on hand-crafted features. Specifically, we develop two methods to generate semantic features, called multi-scale deep semantic representation (MSDS) and multi-level deep semantic representation (MLDS), by extracting CNN features from different layers: (1) in MSDS, the final semantic features are learned by the pLSA from multi-scale features extracted from the convolutional layer of a pre-trained CNN; (2) in MLDS, we extract CNN features for densely sampled image patches at different size levels from the fully connected layer of a pre-trained CNN, and concatenate the semantic features learned by the pLSA at each level. We comprehensively evaluate the two methods on two public HRRS scene datasets and achieve significant performance improvements over the state of the art. The outstanding results demonstrate that the pLSA model is capable of discovering considerably discriminative semantic features from deep CNN features.
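To make the pipeline in the abstract concrete, here is a minimal, single-scale sketch (not the authors' code) of the MSDS idea: conv-layer activations from a pre-trained CNN are quantized into visual words, and a plain-EM pLSA turns the word counts into topic-mixture semantic features. The choice of VGG-16, the vocabulary size, the topic count, and the image file names are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of the MSDS-style pipeline described in the abstract
# (single scale only, for brevity). Assumptions: VGG-16 as the pre-trained
# CNN, a 64-word visual vocabulary, 32 pLSA topics, placeholder file names.
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.cluster import MiniBatchKMeans
from PIL import Image

# Pre-trained CNN used only as a fixed feature extractor (no fine-tuning).
cnn = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
preprocess = T.Compose([T.Resize((224, 224)), T.ToTensor(),
                        T.Normalize(mean=[0.485, 0.456, 0.406],
                                    std=[0.229, 0.224, 0.225])])

def conv_descriptors(path):
    """Treat each spatial position of the last conv feature map as a descriptor."""
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        fmap = cnn(x)                                   # (1, C, H, W)
    return fmap.squeeze(0).flatten(1).T.numpy()         # (H*W, C)

def plsa(counts, n_topics=32, n_iter=100, eps=1e-12):
    """Plain EM for pLSA on an (images x visual words) count matrix."""
    n_docs, n_words = counts.shape
    p_z_d = np.random.dirichlet(np.ones(n_topics), n_docs)    # P(z|d)
    p_w_z = np.random.dirichlet(np.ones(n_words), n_topics)   # P(w|z)
    for _ in range(n_iter):
        # E-step: responsibilities P(z|d,w), shape (docs, words, topics).
        joint = p_z_d[:, None, :] * p_w_z.T[None, :, :]
        resp = joint / (joint.sum(axis=2, keepdims=True) + eps)
        # M-step: re-estimate P(z|d) and P(w|z) from weighted counts.
        weighted = counts[:, :, None] * resp
        p_z_d = weighted.sum(axis=1)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
        p_w_z = weighted.sum(axis=0).T
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
    return p_z_d          # topic mixture per image = semantic feature

# Bag of visual words over conv descriptors, then pLSA semantic features.
image_paths = ["scene_0001.jpg", "scene_0002.jpg"]       # placeholder paths
descriptors = [conv_descriptors(p) for p in image_paths]
vocab = MiniBatchKMeans(n_clusters=64, random_state=0).fit(np.vstack(descriptors))
counts = np.zeros((len(image_paths), 64))
for i, d in enumerate(descriptors):
    words, freqs = np.unique(vocab.predict(d), return_counts=True)
    counts[i, words] = freqs
semantic_features = plsa(counts)   # rows can feed a downstream SVM classifier
```

In practice the vocabulary and topic counts would be far larger and learned from many training images; the values above are kept small only so the sketch stays readable.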


Citations
Journal ArticleDOI

Relation Network for Multilabel Aerial Image Classification

TL;DR: An attention-aware label relational reasoning network is proposed, consisting of three modules: a label-wise feature parcel learning module, an attentional region extraction module, and a label relational inference module.
Journal ArticleDOI

SCViT: A Spatial-Channel Feature Preserving Vision Transformer for Remote Sensing Image Scene Classification

TL;DR: A spatial-channel feature preserving vision transformer (SCViT) is proposed, which considers both the detailed geometric information of high-spatial-resolution (HSR) imagery and the contribution of the different channels contained in the classification token.
Journal ArticleDOI

Deep Feature Aggregation Framework Driven by Graph Convolutional Network for Scene Classification in Remote Sensing

TL;DR: A deep feature aggregation framework driven by a graph convolutional network (DFAGCN) is developed for high-spatial-resolution (HSR) scene classification.
Journal ArticleDOI

Object-Scale Adaptive Convolutional Neural Networks for High-Spatial Resolution Remote Sensing Image Classification

TL;DR: A novel method called object-scale adaptive convolutional neural network (OSA-CNN), which combines object-based image analysis (OBIA) with CNNs, is proposed for HSR image classification and effectively improves classification accuracy.
Journal ArticleDOI

Two-stream feature aggregation deep neural network for scene classification of remote sensing images

TL;DR: A novel architecture termed two-stream feature aggregation deep neural network (TFADNN) is developed for HSR scene classification; results indicate that TFADNN achieves satisfactory classification performance compared with state-of-the-art methods.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: The authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; models based on this framework won first place in the ILSVRC 2015 classification task.
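As a hedged illustration of the residual-learning idea in this TL;DR (not the exact configuration from the paper), a basic block learns a residual F(x) and outputs F(x) + x through an identity shortcut; the channel count below is arbitrary.

```python
# Illustrative basic residual block: output = F(x) + x via an identity
# shortcut, which is what eases optimization of very deep networks.
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(out + x)   # identity shortcut: only the residual is learned

block = BasicResidualBlock(64)
y = block(torch.randn(1, 64, 56, 56))   # shape is preserved: (1, 64, 56, 56)
```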
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: A large, deep convolutional neural network, consisting of five convolutional layers (some followed by max-pooling layers) and three fully connected layers with a final 1000-way softmax, achieves state-of-the-art performance on ImageNet classification.
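Since this TL;DR enumerates the architecture, a quick way to inspect the layer layout it describes is to load torchvision's AlexNet implementation, used here only as a convenient stand-in for the original network:

```python
# Inspect the layout the TL;DR describes: five conv layers interleaved with
# max-pooling, then three fully connected layers ending in 1000 outputs.
import torchvision.models as models

alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
print(alexnet.features)     # 5 x Conv2d, 3 x MaxPool2d
print(alexnet.classifier)   # 3 x Linear, the last with out_features=1000
```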
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: The authors investigate the effect of convolutional network depth on accuracy in the large-scale image recognition setting and show that a significant improvement over prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
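The 16-19 layer depths mentioned above count weight layers (convolutional plus fully connected); a small check against torchvision's VGG implementations, used here only as a stand-in for the original models, makes that explicit:

```python
# Count weight layers in the two configurations the TL;DR refers to.
import torch.nn as nn
import torchvision.models as models

for name, net in [("VGG-16", models.vgg16()), ("VGG-19", models.vgg19())]:
    convs = sum(isinstance(m, nn.Conv2d) for m in net.features)
    fcs = sum(isinstance(m, nn.Linear) for m in net.classifier)
    print(f"{name}: {convs} conv + {fcs} fully connected = {convs + fcs} weight layers")
```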
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called "ImageNet" is introduced: a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity, and much more accurate, than current image datasets.
Journal ArticleDOI

Distinctive Image Features from Scale-Invariant Keypoints

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
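As a hedged illustration of the matching use case in this TL;DR, OpenCV's SIFT implementation can be combined with Lowe's ratio test; the file names below are placeholders.

```python
# Match SIFT keypoints between two views and keep only distinctive matches.
import cv2

img1 = cv2.imread("view_a.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder image
img2 = cv2.imread("view_b.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder image

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test: accept a match only if it is clearly better than the
# second-best candidate, which suppresses ambiguous correspondences.
matcher = cv2.BFMatcher()
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]
print(f"{len(good)} reliable keypoint matches")
```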