
Showing papers in "IEEE Transactions on Geoscience and Remote Sensing in 2018"


Journal ArticleDOI
TL;DR: An end-to-end spectral–spatial residual network that takes raw 3-D cubes as input data without feature engineering for hyperspectral image classification and achieves the state-of-the-art HSI classification accuracy in agricultural, rural–urban, and urban data sets.
Abstract: In this paper, we designed an end-to-end spectral–spatial residual network (SSRN) that takes raw 3-D cubes as input data without feature engineering for hyperspectral image classification. In this network, the spectral and spatial residual blocks consecutively learn discriminative features from abundant spectral signatures and spatial contexts in hyperspectral imagery (HSI). The proposed SSRN is a supervised deep learning framework that alleviates the declining-accuracy phenomenon of other deep learning models. Specifically, the residual blocks connect every other 3-D convolutional layer through identity mapping, which facilitates the backpropagation of gradients. Furthermore, we impose batch normalization on every convolutional layer to regularize the learning process and improve the classification performance of trained models. Quantitative and qualitative results demonstrate that the SSRN achieved the state-of-the-art HSI classification accuracy in agricultural, rural–urban, and urban data sets: Indian Pines, Kennedy Space Center, and University of Pavia.
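
As a concrete illustration of the residual-block idea described above, here is a minimal PyTorch sketch of a spectral residual block in the SSRN spirit: two 3-D convolutions with batch normalization, joined to the block input by an identity shortcut. It is not the authors' code; the channel count and kernel depth are illustrative assumptions.

```python
# A minimal sketch (not the authors' released code) of an SSRN-style
# spectral residual block; channel count and kernel depth are assumptions.
import torch
import torch.nn as nn

class SpectralResBlock(nn.Module):
    def __init__(self, channels=24, kernel_depth=7):
        super().__init__()
        pad = (kernel_depth // 2, 0, 0)      # pad only the spectral axis
        self.conv1 = nn.Conv3d(channels, channels, (kernel_depth, 1, 1), padding=pad)
        self.bn1 = nn.BatchNorm3d(channels)  # batch norm on every conv layer
        self.conv2 = nn.Conv3d(channels, channels, (kernel_depth, 1, 1), padding=pad)
        self.bn2 = nn.BatchNorm3d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)            # identity shortcut eases backprop

x = torch.randn(2, 24, 97, 7, 7)             # (batch, features, bands, H, W)
print(SpectralResBlock()(x).shape)           # shape is preserved
```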

1,105 citations


Journal ArticleDOI
TL;DR: This paper proposes a simple but effective method to learn discriminative CNNs (D-CNNs) to boost the performance of remote sensing image scene classification and comprehensively evaluates the proposed method on three publicly available benchmark data sets using three off-the-shelf CNN models.
Abstract: Remote sensing image scene classification is an active and challenging task driven by many applications. More recently, with the advances of deep learning models, especially convolutional neural networks (CNNs), the performance of remote sensing image scene classification has been significantly improved due to the powerful feature representations learnt through CNNs. Although great success has been obtained so far, the problems of within-class diversity and between-class similarity are still two big challenges. To address these problems, in this paper, we propose a simple but effective method to learn discriminative CNNs (D-CNNs) to boost the performance of remote sensing image scene classification. Different from the traditional CNN models that minimize only the cross entropy loss, our proposed D-CNN models are trained by optimizing a new discriminative objective function. To this end, apart from minimizing the classification error, we also explicitly impose a metric learning regularization term on the CNN features. The metric learning regularization encourages the D-CNN models to be more discriminative so that, in the new D-CNN feature spaces, images from the same scene class are mapped closely to each other while images of different classes are mapped as far apart as possible. In the experiments, we comprehensively evaluate the proposed method on three publicly available benchmark data sets using three off-the-shelf CNN models. Experimental results demonstrate that our proposed D-CNN methods outperform the existing baseline methods and achieve state-of-the-art results on all three data sets.
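
The discriminative objective can be sketched compactly: cross entropy plus a metric-learning term that pulls same-class features together and pushes different-class features apart. The following PyTorch snippet is a hedged stand-in for the paper's exact regularizer; the contrastive-style formulation, margin, and weight `lam` are assumptions.

```python
# A hedged sketch of a D-CNN-style objective: cross entropy plus a
# metric-learning regularizer. Margin and weight are assumptions.
import torch
import torch.nn.functional as F

def d_cnn_loss(features, logits, labels, margin=1.0, lam=0.05):
    ce = F.cross_entropy(logits, labels)                # classification term
    dists = torch.cdist(features, features)             # pairwise feature distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)   # same-class mask
    pull = dists[same].pow(2).mean()                    # same class: stay close
    push = F.relu(margin - dists[~same]).pow(2).mean()  # different class: keep apart
    return ce + lam * (pull + push)

features = torch.randn(8, 64, requires_grad=True)
logits = torch.randn(8, 5, requires_grad=True)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 4])
d_cnn_loss(features, logits, labels).backward()
```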

1,001 citations


Journal ArticleDOI
TL;DR: The usefulness and effectiveness of GAN for classification of hyperspectral images (HSIs) are explored for the first time and the proposed models provide competitive results compared to the state-of-the-art methods.
Abstract: A generative adversarial network (GAN) usually contains a generative network and a discriminative network in competition with each other. The GAN has shown its capability in a variety of applications. In this paper, the usefulness and effectiveness of GAN for classification of hyperspectral images (HSIs) are explored for the first time. In the proposed GAN, a convolutional neural network (CNN) is designed to discriminate the inputs and another CNN is used to generate so-called fake inputs. The aforementioned CNNs are trained together: the generative CNN tries to generate fake inputs that are as real as possible, and the discriminative CNN tries to classify the real and fake inputs. This kind of adversarial training improves the generalization capability of the discriminative CNN, which is really important when the training samples are limited. Specifically, we propose two schemes: 1) a well-designed 1D-GAN as a spectral classifier and 2) a robust 3D-GAN as a spectral–spatial classifier. Furthermore, the generated adversarial samples are used with real training samples to fine-tune the discriminative CNN, which improves the final classification performance. The proposed classifiers are evaluated on three widely used hyperspectral data sets: Salinas, Indian Pines, and Kennedy Space Center. The obtained results reveal that the proposed models provide competitive results compared to the state-of-the-art methods. In addition, the proposed GANs open new opportunities in the remote sensing community for the challenging task of HSI classification and also reveal the huge potential of GAN-based methods for the analysis of such complex and inherently nonlinear data.
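
For readers unfamiliar with adversarial training, the following is a compact, illustrative training step in the spirit of the paper's 1D-GAN. Tiny fully connected networks stand in for the paper's CNNs, and all sizes are assumptions.

```python
# An illustrative GAN training step (tiny MLPs stand in for the paper's
# CNNs; band count, noise size, and learning rates are assumptions).
import torch
import torch.nn as nn

bands, z_dim = 103, 32
G = nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, bands), nn.Tanh())
D = nn.Sequential(nn.Linear(bands, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(16, bands)           # placeholder for real spectra
fake = G(torch.randn(16, z_dim))        # generator makes fake spectra

# discriminator step: separate real from fake
loss_d = bce(D(real), torch.ones(16, 1)) + bce(D(fake.detach()), torch.zeros(16, 1))
opt_d.zero_grad()
loss_d.backward()
opt_d.step()

# generator step: make fakes the discriminator scores as real
loss_g = bce(D(fake), torch.ones(16, 1))
opt_g.zero_grad()
loss_g.backward()
opt_g.step()
```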

501 citations


Journal ArticleDOI
TL;DR: A concept of spatial dependency system that involves pixel dependency and label dependency, with two main factors: neighborhood covering and neighborhood importance, is developed, and several representative spectral–spatial classification methods are applied to real-world hyperspectral data.
Abstract: Imaging spectroscopy, also known as hyperspectral imaging, has been transformed in the last four decades from a sparse research tool into a commodity product available to a broad user community. Especially in the last 10 years, a large number of new techniques able to take into account the special properties of hyperspectral data have been introduced for hyperspectral data processing, where hyperspectral image classification, as one of the most active topics, has drawn massive attention. Spectral–spatial hyperspectral image classification can achieve better classification performance than its pixel-wise counterpart, since the former utilizes not only the information of the spectral signature but also that from the spatial domain. In this paper, we provide a comprehensive overview of the methods belonging to the category of spectral–spatial classification in a relatively unified context. First, we develop a concept of spatial dependency system that involves pixel dependency and label dependency, with two main factors: neighborhood covering and neighborhood importance. In terms of the way that the neighborhood information is used, the spatial dependency systems can be classified into fixed, adaptive, and global systems, which can accommodate various kinds of existing spectral–spatial methods. On this basis, the categorizations of single-dependency, bilayer-dependency, and multiple-dependency systems are further introduced. Second, we categorize existing spectral–spatial methods into four paradigms according to the different fusion stages wherein spatial information takes effect, i.e., preprocessing-based, integrated, postprocessing-based, and hybrid classifications. Then, typical methodologies are outlined. Finally, several representative spectral–spatial classification methods are applied to real-world hyperspectral data in our experiments.

470 citations


Journal ArticleDOI
TL;DR: A deep feature fusion network (DFFN) is proposed for HSI classification that fuses the outputs of different hierarchical layers, which can further improve the classification accuracy and outperforms other competitive classifiers.
Abstract: Recently, deep learning has been introduced to classify hyperspectral images (HSIs) and has achieved good performance. In general, deep models adopt a large number of hierarchical layers to extract features. However, excessively increasing network depth will result in some negative effects (e.g., overfitting, gradient vanishing, and accuracy degradation) for conventional convolutional neural networks. In addition, the previous networks used in HSI classification do not consider the strongly complementary yet correlated information among different hierarchical layers. To address the above two issues, a deep feature fusion network (DFFN) is proposed for HSI classification. On the one hand, residual learning is introduced to optimize several convolutional layers as identity mappings, which can ease the training of deep networks and allow them to benefit from increasing depth. As a result, we can build a very deep network to extract more discriminative features of HSIs. On the other hand, the proposed DFFN model fuses the outputs of different hierarchical layers, which can further improve the classification accuracy. Experimental results on three real HSIs demonstrate that the proposed method outperforms other competitive classifiers.

378 citations


Journal ArticleDOI
TL;DR: The classification fusion of hyperspectral imagery (HSI) and data from other sensors, such as light detection and ranging (LiDAR) data, is investigated with a state-of-the-art deep learning model, the two-branch convolutional neural network (CNN).
Abstract: With a growing list of remotely sensed data sources available, how to efficiently exploit useful information from multisource data for better Earth observation becomes an interesting but challenging problem. In this paper, the classification fusion of hyperspectral imagery (HSI) and data from other sensors, such as light detection and ranging (LiDAR) data, is investigated with a state-of-the-art deep learning model, the two-branch convolutional neural network (CNN). More specifically, a two-tunnel CNN framework is first developed to extract spectral-spatial features from HSI; in addition, a CNN with cascade blocks is designed for feature extraction from LiDAR or high-resolution visual images. In the feature fusion stage, the spatial and spectral features of HSI are first integrated in a dual-tunnel branch and then combined with the features extracted from other data by a cascade network. Experimental results based on several multisource data sets demonstrate that the proposed two-branch CNN achieves better classification performance than several existing methods.
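
The two-branch fusion idea can be sketched as two small encoders whose pooled features are concatenated before classification. This is a hedged approximation, not the paper's architecture; the layer sizes, patch sizes, and class count are assumptions.

```python
# A hedged sketch of two-branch HSI + LiDAR feature fusion; all layer
# sizes, patch sizes, and the class count are illustrative assumptions.
import torch
import torch.nn as nn

class TwoBranchCNN(nn.Module):
    def __init__(self, hsi_bands=144, n_classes=15):
        super().__init__()
        self.hsi_branch = nn.Sequential(          # spectral-spatial branch
            nn.Conv2d(hsi_bands, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.lidar_branch = nn.Sequential(        # cascade branch for LiDAR
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.classifier = nn.Linear(64 + 64, n_classes)

    def forward(self, hsi, lidar):
        # concatenate branch features, then classify the fused vector
        fused = torch.cat([self.hsi_branch(hsi), self.lidar_branch(lidar)], dim=1)
        return self.classifier(fused)

logits = TwoBranchCNN()(torch.randn(4, 144, 11, 11), torch.randn(4, 1, 11, 11))
print(logits.shape)   # torch.Size([4, 15])
```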

373 citations


Journal ArticleDOI
TL;DR: The aim of this paper is first to explore the performance of DL architectures for the RS hyperspectral data set classification and second to introduce a new 3-D DL approach that enables a joint spectral and spatial information process.
Abstract: Recently, a variety of approaches have been enriching the field of remote sensing (RS) image processing and analysis. Unfortunately, existing methods remain limited in their ability to exploit the rich spatiospectral content of today’s large data sets. Deep learning (DL)-based approaches are appealing at this stage given their ability to offer accurate semantic interpretation of the data. However, the coexistence of spectral and spatial content in RS data sets widens the scope of the challenges in adapting DL methods to these contexts. Therefore, the aim of this paper is first to explore the performance of DL architectures for RS hyperspectral data set classification and second to introduce a new 3-D DL approach that enables joint processing of spectral and spatial information. A set of 3-D schemes is proposed and evaluated. Experimental results based on well-known hyperspectral data sets demonstrate that the proposed method is able to achieve a better classification rate than state-of-the-art methods with lower computational costs.

365 citations


Journal ArticleDOI
TL;DR: This paper advocates four new deep learning models, namely, the 2-D convolutional neural network (2-D-CNN), 3-D-CNN, recurrent 2-D-CNN (R-2-D-CNN), and recurrent 3-D-CNN (R-3-D-CNN), for hyperspectral image classification.
Abstract: Deep learning has achieved great successes in conventional computer vision tasks. In this paper, we exploit deep learning techniques to address the hyperspectral image classification problem. In contrast to conventional computer vision tasks that only examine the spatial context, our proposed method can exploit both spatial context and spectral correlation to enhance hyperspectral image classification. In particular, we advocate four new deep learning models, namely, 2-D convolutional neural network (2-D-CNN), 3-D-CNN, recurrent 2-D CNN (R-2-D-CNN), and recurrent 3-D-CNN (R-3-D-CNN) for hyperspectral image classification. We conducted rigorous experiments based on six publicly available data sets. Through a comparative evaluation with other state-of-the-art methods, our experimental results confirm the superiority of the proposed deep learning models, especially the R-3-D-CNN and the R-2-D-CNN deep learning models.

307 citations


Journal ArticleDOI
TL;DR: This paper proposes a novel deep-learning-based object detection framework including region proposal network (RPN) and local-contextual feature fusion network designed for remote sensing images that can deal with the multiangle and multiscale characteristics of geospatial objects.
Abstract: Most existing deep-learning-based methods struggle to effectively deal with the challenges of geospatial object detection, such as rotation variations and appearance ambiguity. To address these problems, this paper proposes a novel deep-learning-based object detection framework, including a region proposal network (RPN) and a local-contextual feature fusion network, designed for remote sensing images. Specifically, the RPN includes additional multiangle anchors besides the conventional multiscale and multiaspect-ratio ones, and thus can deal with the multiangle and multiscale characteristics of geospatial objects. To address the appearance ambiguity problem, we propose a double-channel feature fusion network that can learn local and contextual properties along two independent pathways. The two kinds of features are later combined in the final layers of processing in order to form a powerful joint representation. Comprehensive evaluations on a publicly available ten-class object detection data set demonstrate the effectiveness of the proposed method.
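
The multiangle extension of the anchor set is easy to picture: the conventional scale-by-ratio grid gains a third, angular dimension. The snippet below is illustrative; the specific scales, ratios, and angles are assumptions.

```python
# Illustrative multiscale / multiaspect-ratio / multiangle anchor grid;
# the concrete scales, ratios, and angles are assumptions.
import itertools
import numpy as np

scales = [32, 64, 128]                   # anchor sizes in pixels
ratios = [0.5, 1.0, 2.0]                 # height/width aspect ratios
angles = np.deg2rad([0, 45, 90, 135])    # the added angular dimension

anchors = np.array([
    (s * np.sqrt(r), s / np.sqrt(r), a)  # (height, width, angle) per anchor
    for s, r, a in itertools.product(scales, ratios, angles)
])
print(anchors.shape)                     # (36, 3): 3 x 3 x 4 anchors per location
```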

296 citations


Journal ArticleDOI
Qiang Zhang, Qiangqiang Yuan, Chao Zeng, Xinghua Li, Yancong Wei
TL;DR: In this paper, a unified spatial-temporal-spectral framework based on a deep convolutional neural network (CNN) was proposed for missing information reconstruction in remote sensing images.
Abstract: Because of the internal malfunction of satellite sensors and poor atmospheric conditions such as thick cloud, acquired remote sensing data often suffer from missing information, i.e., the data usability is greatly reduced. In this paper, a novel method for missing information reconstruction in remote sensing images is proposed: a unified spatial–temporal–spectral framework that employs a single deep convolutional neural network (CNN) combined with spatial–temporal–spectral supplementary information. In addition, whereas most methods can only deal with a single missing information reconstruction task, the proposed approach can solve three typical missing information reconstruction tasks: 1) dead lines in Aqua Moderate Resolution Imaging Spectroradiometer band 6; 2) the Landsat Enhanced Thematic Mapper Plus scan line corrector-off problem; and 3) thick cloud removal. It should be noted that the proposed model can use multisource data (spatial, spectral, and temporal) as the input of the unified framework. The results of both simulated and real-data experiments demonstrate that the proposed model exhibits high effectiveness in the three missing information reconstruction tasks listed above.

260 citations


Journal ArticleDOI
TL;DR: A band grouping-based long short-term memory model and a multiscale convolutional neural network are proposed as the spectral and spatial feature extractors, respectively, for the hyperspectral image (HSI) classification.
Abstract: In this paper, we propose a spectral–spatial unified network (SSUN) with an end-to-end architecture for the hyperspectral image (HSI) classification. Different from traditional spectral–spatial classification frameworks where the spectral feature extraction (FE), spatial FE, and classifier training are separated, these processes are integrated into a unified network in our model. In this way, both FE and classifier training will share a uniform objective function and all the parameters in the network can be optimized at the same time. In the implementation of the SSUN, we propose a band grouping-based long short-term memory model and a multiscale convolutional neural network as the spectral and spatial feature extractors, respectively. In the experiments, three benchmark HSIs are utilized to evaluate the performance of the proposed method. The experimental results demonstrate that the SSUN can yield a competitive performance compared with existing methods.
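
The band-grouping LSTM idea from the spectral branch can be sketched in a few lines: split the spectrum into contiguous groups and feed them to an LSTM as a sequence. The group count and hidden size below are assumptions, not the paper's settings.

```python
# A minimal sketch of band grouping fed to an LSTM, in the spirit of the
# SSUN spectral branch; group count and hidden size are assumptions.
import torch
import torch.nn as nn

bands, groups = 200, 10                    # 200 bands split into 10 groups
x = torch.randn(8, bands)                  # a batch of pixel spectra
seq = x.view(8, groups, bands // groups)   # (batch, seq_len, features_per_group)

lstm = nn.LSTM(input_size=bands // groups, hidden_size=64, batch_first=True)
out, (h, c) = lstm(seq)
spectral_feature = h[-1]                   # (8, 64): final hidden state per pixel
print(spectral_feature.shape)
```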

Journal ArticleDOI
TL;DR: A simple yet effective method to extract hierarchical deep spatial feature for HSI classification by exploring the power of off-the-shelf CNN models, without any additional retraining or fine-tuning on the target data set is proposed.
Abstract: Hyperspectral image (HSI) classification is an active and important research task driven by many practical applications. To leverage deep learning models, especially convolutional neural networks (CNNs), for HSI classification, this paper proposes a simple yet effective method to extract hierarchical deep spatial features for HSI classification by exploring the power of off-the-shelf CNN models, without any additional retraining or fine-tuning on the target data set. To obtain better classification accuracy, we further propose a unified metric learning-based framework that alternately learns discriminative spectral–spatial features with better representation capability and trains support vector machine (SVM) classifiers. To this end, we design a new objective function that explicitly embeds a metric learning regularization term into SVM training. The metric learning regularization term is used to learn a powerful spectral–spatial feature representation by fusing the spectral feature and the deep spatial feature, which has small intraclass scatter but large between-class separation. By transforming HSI data into a new spectral–spatial feature space through CNN and metric learning, we can pull pixels from the same class closer while pushing pixels from different classes farther away. In the experiments, we comprehensively evaluate the proposed method on three commonly used HSI benchmark data sets. State-of-the-art results are achieved when compared with the existing HSI classification methods.

Journal ArticleDOI
TL;DR: This paper proposes a simple yet surprisingly effective approach, termed guided locality preserving matching, for robust feature matching of remote sensing images, formulates it into a mathematical model, and derives a simple closed-form solution with linearithmic time and linear space complexities.
Abstract: Feature matching, which refers to establishing reliable correspondences between two sets of feature points, is a critical prerequisite in feature-based image registration. This paper proposes a simple yet surprisingly effective approach, termed guided locality preserving matching, for robust feature matching of remote sensing images. The key idea of our approach is merely to preserve the neighborhood structures of potential true matches between two images. We formulate it into a mathematical model and derive a simple closed-form solution with linearithmic time and linear space complexities. This enables our method to accomplish mismatch removal from thousands of putative correspondences in only a few milliseconds. To handle extremely large proportions of outliers, we further design a guided matching strategy based on the proposed method, using the matching result on a small putative set with a high inlier ratio to guide the matching on a large putative set. This strategy can also significantly boost the number of true matches without sacrificing accuracy. Experiments on various real remote sensing image pairs demonstrate the generality of our method for handling both rigid and nonrigid image deformations, and it is more than two orders of magnitude faster than the state-of-the-art methods with better accuracy, making it practical for real-time applications.
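
The key idea, preserving the neighborhood structures of true matches, can be approximated with a simple consistency check (a didactic stand-in, not the paper's closed-form solver): a putative match survives when most of its k nearest neighbors agree between the two images.

```python
# A simplified neighborhood-consistency filter inspired by the paper's
# key idea; this is NOT its closed-form solver, just an illustration.
import numpy as np

def neighborhood_consistency(p1, p2, k=6, tau=0.5):
    """Keep match i when at least a fraction tau of its k nearest
    neighbors among the matched points agree between the two images."""
    def knn(pts):
        d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
        return np.argsort(d, axis=1)[:, 1:k + 1]     # skip self (column 0)
    n1, n2 = knn(p1), knn(p2)
    return np.array([len(set(a) & set(b)) >= tau * k for a, b in zip(n1, n2)])

pts = np.random.rand(100, 2)
rot = np.array([[0.0, -1.0], [1.0, 0.0]])            # a pure rotation
keep = neighborhood_consistency(pts, pts @ rot)
print(keep.mean())   # rotation preserves neighborhoods, so ~1.0
```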

Journal ArticleDOI
TL;DR: A novel network architecture, the fully Conv–Deconv network, for unsupervised spectral–spatial feature learning of hyperspectral images, which can be trained in an end-to-end manner, together with an in-depth investigation of the learned features.
Abstract: Supervised approaches classify input data using a set of representative samples for each class, known as training samples. The collection of such samples is expensive and time demanding. Hence, unsupervised feature learning, which has quick access to arbitrary amounts of unlabeled data, is conceptually of high interest. In this paper, we propose a novel network architecture, the fully Conv–Deconv network, for unsupervised spectral–spatial feature learning of hyperspectral images, which can be trained in an end-to-end manner. Specifically, our network is based on the so-called encoder–decoder paradigm, i.e., the input 3-D hyperspectral patch is first transformed into a typically lower dimensional space via a convolutional subnetwork (encoder) and then expanded to reproduce the initial data by a deconvolutional subnetwork (decoder). However, during the experiments, we found that such a network is not easy to optimize. To address this problem, we refine the proposed network architecture by incorporating: 1) residual learning and 2) a new unpooling operation that can use memorized max-pooling indexes. Moreover, to understand the “black box,” we make an in-depth study of the learned feature maps in the experimental analysis. A very interesting discovery is that some specific “neurons” in the first residual block of the proposed network have good descriptive power for semantic visual patterns at the object level, which provides an opportunity to achieve “free” object detection. This paper, for the first time in the remote sensing community, proposes an end-to-end fully Conv–Deconv network for unsupervised spectral–spatial feature learning. Moreover, this paper also introduces an in-depth investigation of the learned features. Experimental results on two widely used hyperspectral data sets, Indian Pines and Pavia University, demonstrate competitive performance obtained by the proposed methodology compared with other studied approaches.
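
The memorized-index unpooling that the refined decoder relies on is available directly in PyTorch, as the short sketch below shows; the tensor sizes are illustrative.

```python
# Memorized-index unpooling with PyTorch's built-in MaxPool2d/MaxUnpool2d
# pair; tensor sizes are illustrative assumptions.
import torch
import torch.nn as nn

pool = nn.MaxPool2d(2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(2, stride=2)

x = torch.randn(1, 8, 16, 16)
y, idx = pool(x)          # encoder keeps the argmax locations
x_up = unpool(y, idx)     # decoder restores values to those exact locations
print(x_up.shape)         # torch.Size([1, 8, 16, 16])
```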

Journal ArticleDOI
TL;DR: A target-adaptive usage modality that ensures a very good performance also in the presence of a mismatch with respect to the training set and even across different sensors is proposed.
Abstract: We recently proposed a convolutional neural network (CNN) for remote sensing image pansharpening obtaining a significant performance gain over the state of the art. In this paper, we explore a number of architectural and training variations to this baseline, achieving further performance gains with a lightweight network that trains very fast. Leveraging on this latter property, we propose a target-adaptive usage modality that ensures a very good performance also in the presence of a mismatch with respect to the training set and even across different sensors. The proposed method, published online as an off-the-shelf software tool, allows users to perform fast and high-quality CNN-based pansharpening of their own target images on general-purpose hardware.

Journal ArticleDOI
TL;DR: Extensive experiments show that the proposed remote sensing image retrieval approach based on DHNNs can remarkably outperform state-of-the-art methods under both of the examined conditions.
Abstract: As one of the most challenging tasks of remote sensing big data mining, large-scale remote sensing image retrieval has attracted increasing attention from researchers. Existing large-scale remote sensing image retrieval approaches are generally implemented by using hashing learning methods, which take handcrafted features as inputs and map the high-dimensional feature vector to a low-dimensional binary feature vector to reduce feature-searching complexity. As a means of applying the merits of deep learning, this paper proposes a novel large-scale remote sensing image retrieval approach based on deep hashing neural networks (DHNNs). More specifically, DHNNs are composed of deep feature learning neural networks and hashing learning neural networks and can be optimized in an end-to-end manner. Rather than requiring dedicated expertise and effort for the design of feature descriptors, we can automatically learn good feature extraction operations and feature hashing mappings under the supervision of labeled samples. To broaden the application field, DHNNs are evaluated under two representative remote sensing cases: scarce and sufficient labeled samples. To make up for a lack of labeled samples, DHNNs can be trained via transfer learning for the former case. For the latter case, DHNNs can be trained via supervised learning from scratch with the aid of a vast number of labeled samples. Extensive experiments on one public remote sensing image data set with a limited number of labeled samples and on another public data set with plenty of labeled samples show that the proposed remote sensing image retrieval approach based on DHNNs can remarkably outperform state-of-the-art methods under both of the examined conditions.
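
The retrieval side of deep hashing reduces to binarizing learned features and ranking by Hamming distance. In this sketch, a random projection stands in for the trained hashing subnetwork; all sizes are assumptions.

```python
# A sketch of hashing-based retrieval: binary codes + Hamming ranking.
# The random projection W stands in for the trained hashing subnetwork.
import numpy as np

rng = np.random.default_rng(0)
features = rng.standard_normal((1000, 512))         # deep features (assumed)
W = rng.standard_normal((512, 64))                  # stand-in hashing layer

codes = features @ W > 0                            # 64-bit binary codes
query = codes[0]
hamming = np.count_nonzero(codes != query, axis=1)  # distance to every image
top10 = np.argsort(hamming)[:10]                    # nearest codes first
print(top10)
```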

Journal ArticleDOI
TL;DR: A new AL-guided classification model is developed that exploits both the spectral information and the spatial-contextual information in the hyperspectral data, making use of recently developed Bayesian CNNs.
Abstract: Hyperspectral imaging is a widely used technique in remote sensing in which an imaging spectrometer collects hundreds of images (at different wavelength channels) for the same area on the surface of the earth. In the last two decades, several methods (unsupervised, supervised, and semisupervised) have been proposed to deal with the hyperspectral image classification problem. Supervised techniques have been generally more popular, despite the fact that it is difficult to collect labeled samples in real scenarios. In particular, deep neural networks, such as convolutional neural networks (CNNs), have recently shown a great potential to yield high performance in the hyperspectral image classification. However, these techniques require sufficient labeled samples in order to perform properly and generalize well. Obtaining labeled data is expensive and time consuming, and the high dimensionality of hyperspectral data makes it difficult to design classifiers based on limited samples (for instance, CNNs overfit quickly with small training sets). Active learning (AL) can deal with this problem by training the model with a small set of labeled samples that is reinforced by the acquisition of new unlabeled samples. In this paper, we develop a new AL-guided classification model that exploits both the spectral information and the spatial-contextual information in the hyperspectral data. The proposed model makes use of recently developed Bayesian CNNs. Our newly developed technique provides robust classification results when compared with other state-of-the-art techniques for hyperspectral image classification.
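
A common way to obtain the Bayesian-CNN uncertainty that drives acquisition is Monte Carlo dropout; the hedged sketch below selects the unlabeled samples with the highest predictive entropy. The toy model and all sizes are assumptions.

```python
# A hedged sketch of the active-learning acquisition step using
# MC-dropout uncertainty; the toy model and sizes are assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(200, 128), nn.ReLU(),
                      nn.Dropout(0.5), nn.Linear(128, 9))  # toy stand-in

def acquire(model, unlabeled, n_query=10, T=20):
    model.train()                            # keeps dropout stochastic
    with torch.no_grad():
        probs = torch.stack([model(unlabeled).softmax(-1)
                             for _ in range(T)]).mean(0)    # MC average
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)
    return entropy.topk(n_query).indices     # most uncertain samples to label

query_idx = acquire(model, torch.randn(500, 200))
print(query_idx)
```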

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed multilayer stacked covariance pooling method can not only consistently outperform the corresponding single-layer model but also achieve better classification performance than other pretrained CNN-based scene classification methods.
Abstract: This paper proposes a new method, called multilayer stacked covariance pooling (MSCP), for remote sensing scene classification. The innovative contribution of the proposed method is that it is able to naturally combine multilayer feature maps obtained by pretrained convolutional neural network (CNN) models. Specifically, the proposed MSCP-based classification framework consists of the following three steps. First, a pretrained CNN model is used to extract multilayer feature maps. Then, the feature maps are stacked together, and a covariance matrix is calculated for the stacked features. Each entry of the resulting covariance matrix stands for the covariance of two different feature maps, which provides a natural and innovative way to exploit the complementary information provided by feature maps coming from different layers. Finally, the extracted covariance matrices are used as features for classification by a support vector machine. The experimental results, conducted on three challenging data sets, demonstrate that the proposed MSCP method can not only consistently outperform the corresponding single-layer model but also achieve better classification performance than other pretrained CNN-based scene classification methods.
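
The covariance pooling step itself is only a few lines: stack the feature maps, flatten the spatial grid, and compute the channel-by-channel covariance. The map sizes below are assumptions.

```python
# Covariance pooling in brief; the stacked map sizes are assumptions.
import numpy as np

maps = np.random.rand(256, 14, 14)   # stacked CNN feature maps (C, H, W)
X = maps.reshape(256, -1)            # each row: one map over all positions
cov = np.cov(X)                      # (256, 256) covariance feature for the SVM
print(cov.shape)
```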

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed feature extraction method in conjunction with a linear SVM classifier can obtain better classification performance than that of the conventional methods.
Abstract: Hyperspectral image classification has become a research focus in the recent literature. However, the design of effective features remains an open issue that impacts the performance of classifiers. In this paper, a novel supervised deep feature extraction method based on a siamese convolutional neural network (S-CNN) is proposed to improve the performance of hyperspectral image classification. First, a CNN with five layers is designed to directly extract deep features from the hyperspectral cube, where the CNN can be viewed as a nonlinear transformation function. Then, the siamese network composed of two CNNs is trained to learn features that show low intraclass and high interclass variability. The important characteristic of the presented approach is that the S-CNN is supervised with a margin ranking loss function, which can extract more discriminative features for classification tasks. To demonstrate the effectiveness of the proposed feature extraction method, the features extracted from three widely used hyperspectral data sets are fed into a linear support vector machine (SVM) classifier. The experimental results demonstrate that the proposed feature extraction method in conjunction with a linear SVM classifier can obtain better classification performance than the conventional methods.
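
Training with a margin ranking loss can be sketched with PyTorch's built-in nn.MarginRankingLoss: same-class pairs should be closer than different-class pairs by at least the margin. The embedding network and sizes are assumptions.

```python
# A minimal sketch of siamese training with a margin ranking loss;
# the embedding network and all sizes are assumptions.
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Linear(200, 64), nn.ReLU(), nn.Linear(64, 32))
rank_loss = nn.MarginRankingLoss(margin=1.0)

anchor, pos, neg = (torch.randn(16, 200) for _ in range(3))
d_pos = (embed(anchor) - embed(pos)).norm(dim=1)  # same-class distance
d_neg = (embed(anchor) - embed(neg)).norm(dim=1)  # different-class distance
loss = rank_loss(d_neg, d_pos, torch.ones(16))    # want d_neg > d_pos + margin
loss.backward()
```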

Journal ArticleDOI
TL;DR: A large-scale aerial image data set is constructed for remote sensing image captioning, and extensive experiments demonstrate that the content of a remote sensing image can be completely described by generating language descriptions.
Abstract: Inspired by recent developments in satellite technology, remote sensing images have attracted extensive attention. Recently, notable progress has been made in scene classification and target detection. However, it is still not clear how to describe remote sensing image content with accurate and concise sentences. In this paper, we investigate how to describe remote sensing images with accurate and flexible sentences. First, some annotation instructions are presented to better describe remote sensing images, considering their special characteristics. Second, in order to exhaustively exploit the contents of remote sensing images, a large-scale aerial image data set is constructed for remote sensing image captioning. Finally, a comprehensive review is presented on the proposed data set to fully advance the task of remote sensing captioning. Extensive experiments on the proposed data set demonstrate that the content of a remote sensing image can be completely described by generating language descriptions. The data set is available at https://github.com/201528014227051/RSICD_optimal .

Journal ArticleDOI
TL;DR: A new approach to SAR ATR that employs a multiview deep learning framework, achieving superior recognition performance while requiring only a small number of raw SAR images to generate network training samples.
Abstract: It is a feasible and promising way to utilize deep neural networks to learn and extract valuable features from synthetic aperture radar (SAR) images for SAR automatic target recognition (ATR). However, it is difficult to effectively train deep neural networks with limited raw SAR images. In this paper, we propose a new approach to SAR ATR that employs a multiview deep learning framework. Based on the multiview SAR ATR pattern, we first present a flexible means to generate adequate multiview SAR data, which can guarantee a large number of inputs for network training without needing many raw SAR images. Then, a unique deep convolutional neural network containing a parallel network topology with multiple inputs is adopted. The features of input SAR images from different views are learned by the proposed network layer by layer; meanwhile, the learned features from the distinct views are fused in different layers progressively. Therefore, the proposed framework is able to achieve superior recognition performance and requires only a small number of raw SAR images to generate network training samples. Experimental results have shown the superiority of the proposed framework on the Moving and Stationary Target Acquisition and Recognition data set.

Journal ArticleDOI
TL;DR: A new clustering-based band selection framework that overcomes the suboptimal solutions of approximation algorithms toward a specific objective function, obtaining the optimal clustering result for a particular form of objective function under a reasonable constraint.
Abstract: Band selection, by choosing a set of representative bands in a hyperspectral image, is an effective method to reduce redundant information without compromising the original contents. Recently, various unsupervised band selection methods have been proposed, but most of them are based on approximation algorithms that can only obtain suboptimal solutions toward a specific objective function. This paper focuses on clustering-based band selection and proposes a new framework to solve the above dilemma, claiming the following contributions: 1) an optimal clustering framework, which can obtain the optimal clustering result for a particular form of objective function under a reasonable constraint; 2) a rank-on-clusters strategy, which provides an effective criterion to select bands on an existing clustering structure; and 3) an automatic method to determine the number of required bands, which can better evaluate the distinctive information produced by a certain number of bands. In the experiments, the proposed algorithm is compared with some state-of-the-art competitors. According to the experimental results, the proposed algorithm is robust and significantly outperforms the other methods on various data sets.
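
A generic clustering-based selection loop (the baseline the paper improves upon, not its optimal clustering algorithm) can be sketched as follows: cluster the bands and keep the band nearest each centroid. The use of k-means and all sizes here are assumptions.

```python
# A generic clustering-based band selection sketch; k-means stands in
# for the paper's optimal clustering, and all sizes are assumptions.
import numpy as np
from sklearn.cluster import KMeans

cube = np.random.rand(200, 85, 83)                 # (bands, H, W), assumed sizes
X = cube.reshape(200, -1)                          # one row per band
km = KMeans(n_clusters=20, n_init=10, random_state=0).fit(X)

# for each cluster, keep the band nearest its centroid (penalize others)
selected = [np.argmin(((X - c) ** 2).sum(1) + 1e9 * (km.labels_ != i))
            for i, c in enumerate(km.cluster_centers_)]
print(sorted(selected))
```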

Journal ArticleDOI
TL;DR: This paper investigates the use of micro-Doppler signatures retrieved from a low-power radar device to identify a set of persons based on their gait characteristics and proposes a robust feature learning approach based on deep convolutional neural networks.
Abstract: Contemporary surveillance systems mainly use video cameras as their primary sensor. However, video cameras possess fundamental deficiencies, such as the inability to handle low-light environments, poor weather conditions, and concealing clothing. In contrast, radar devices are able to sense in pitch-dark environments and to see through walls. In this paper, we investigate the use of micro-Doppler (MD) signatures retrieved from a low-power radar device to identify a set of persons based on their gait characteristics. To that end, we propose a robust feature learning approach based on deep convolutional neural networks. Given that we aim at providing a solution for a real-world problem, people are allowed to walk around freely in two different rooms. In this setting, the IDentification with Radar data data set is constructed and published, consisting of 150 min of annotated MD data equally spread over five targets. Through experiments, we investigate the effectiveness of both the Doppler and time dimension, showing that our approach achieves a classification error rate of 24.70% on the validation set and 21.54% on the test set for the five targets used. When experimenting with larger time windows, we are able to further lower the error rate.

Journal ArticleDOI
TL;DR: In this paper, a superpixelwise PCA (SuperPCA) approach is proposed to learn the intrinsic low-dimensional features of hyperspectral images (HSIs) for HSI processing and analysis tasks.
Abstract: As an unsupervised dimensionality reduction method, the principal component analysis (PCA) has been widely considered as an efficient and effective preprocessing step for hyperspectral image (HSI) processing and analysis tasks. It takes each band as a whole and globally extracts the most representative bands. However, different homogeneous regions correspond to different objects, whose spectral features are diverse. Therefore, it is inappropriate to carry out dimensionality reduction through a unified projection for an entire HSI. In this paper, a simple but very effective superpixelwise PCA (SuperPCA) approach is proposed to learn the intrinsic low-dimensional features of HSIs. In contrast to classical PCA models, the SuperPCA has four main properties: 1) unlike the traditional PCA method based on a whole image, the SuperPCA takes into account the diversity in different homogeneous regions, that is, different regions should have different projections; 2) most of the conventional feature extraction models cannot directly use the spatial information of HSIs, while the SuperPCA is able to incorporate the spatial context information into the unsupervised dimensionality reduction by superpixel segmentation; 3) since the regions obtained by superpixel segmentation have homogeneity, the SuperPCA can extract potential low-dimensional features even under noise; and 4) although the SuperPCA is an unsupervised method, it can achieve a competitive performance when compared with supervised approaches. The resulting features are discriminative, compact, and noise-resistant, leading to an improved HSI classification performance. Experiments on three public data sets demonstrate that the SuperPCA model significantly outperforms the conventional PCA-based dimensionality reduction baselines for HSI classification, and some state-of-the-art feature extraction approaches. The MATLAB source code is available at https://github.com/junjun-jiang/SuperPCA .
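
The per-superpixel projection is straightforward to sketch: run an independent PCA inside each homogeneous region instead of one global projection. The segmentation labels are assumed to come from any superpixel method; all sizes are illustrative.

```python
# A compact sketch of the per-superpixel PCA idea; segmentation labels
# are assumed given, and all sizes are illustrative.
import numpy as np

def super_pca(X, labels, k=5):
    """X: (n_pixels, n_bands) spectra; labels: (n_pixels,) superpixel ids."""
    out = np.zeros((X.shape[0], k))
    for lbl in np.unique(labels):
        idx = labels == lbl
        Xi = X[idx] - X[idx].mean(0)           # center within the region
        _, _, Vt = np.linalg.svd(Xi, full_matrices=False)
        r = min(k, Vt.shape[0])                # guard very small regions
        out[idx, :r] = Xi @ Vt[:r].T           # region-specific projection
    return out

feats = super_pca(np.random.rand(1000, 200), np.random.randint(0, 30, 1000))
print(feats.shape)   # (1000, 5)
```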

Journal ArticleDOI
TL;DR: A semisupervised graph-theoretic method in the framework of multilabel RS image retrieval problems, which retrieves images similar to a given query image by a subgraph matching strategy and shows effectiveness when compared with state-of-the-art RS content-based image retrieval methods.
Abstract: Conventional supervised content-based remote sensing (RS) image retrieval systems require a large number of already annotated images to train a classifier for obtaining high retrieval accuracy. Most systems assume that each training image is annotated by a single label associated to the most significant semantic content of the image. However, this assumption does not fit well with the complexity of RS images, where an image might have multiple land-cover classes (i.e., multilabels). Moreover, annotating images with multilabels is costly and time consuming. To address these issues, in this paper, we introduce a semisupervised graph-theoretic method in the framework of multilabel RS image retrieval problems. The proposed method is based on four main steps. The first step segments each image in the archive and extracts the features of each region. The second step constructs an image neighborhood graph and uses a correlated label propagation algorithm to automatically assign a set of labels to each image in the archive by exploiting only a small number of training images annotated with multilabels. The third step associates class labels with image regions by a novel region labeling strategy, whereas the final step retrieves the images similar to a given query image by a subgraph matching strategy. Experiments carried out on an archive of aerial images show the effectiveness of the proposed method when compared with the state-of-the-art RS content-based image retrieval methods.

Journal ArticleDOI
TL;DR: A novel local covariance matrix (CM) representation method is proposed to fully characterize the correlation among different spectral bands and the spatial–contextual information in the scene when conducting feature extraction from hyperspectral images (HSIs).
Abstract: In this paper, a novel local covariance matrix (CM) representation method is proposed to fully characterize the correlation among different spectral bands and the spatial–contextual information in the scene when conducting feature extraction (FE) from hyperspectral images (HSIs). Specifically, our method first projects the HSI into a subspace, using the maximum noise fraction method. Then, for each test pixel in the subspace, its most similar neighboring pixels (within a local spatial window) are clustered using the cosine distance measurement. The test pixel and its neighbors are used to calculate a local CM for FE purposes. Each nondiagonal entry in the matrix characterizes the correlation between different spectral bands. Finally, these matrices are used as spatial–spectral features and fed to a support vector machine for classification purposes. The proposed method offers a new strategy to characterize the spatial–spectral information in the HSI prior to classification. Experimental results have been conducted using three publicly available hyperspectral data sets for classification, indicating that the proposed method can outperform several state-of-the-art techniques, especially when the training samples available are limited.

Journal ArticleDOI
TL;DR: A multiscale deep feature learning method for high-resolution satellite image scene classification that warps the original satellite image into multiple different scales and develops a multiple kernel learning method to automatically learn the optimal combination of such features.
Abstract: In this paper, we propose a multiscale deep feature learning method for high-resolution satellite image scene classification. Specifically, we first warp the original satellite image into multiple different scales. The images in each scale are employed to train a deep convolutional neural network (DCNN). However, simultaneously training multiple DCNNs is time-consuming. To address this issue, we explore a DCNN with spatial pyramid pooling (SPP-net). Since different SPP-nets share the same number of parameters with identical initial values, only the parameters in the fully connected layers need to be fine-tuned to ensure the effectiveness of each network, which greatly accelerates the training process. Then, the multiscale satellite images are fed into their corresponding SPP-nets to extract multiscale deep features. Finally, a multiple kernel learning method is developed to automatically learn the optimal combination of such features. Experiments on two difficult data sets show that the proposed method achieves favorable performance compared with other state-of-the-art methods.
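
Spatial pyramid pooling is what lets one network serve every image scale: adaptive pooling at several bin sizes yields a fixed-length vector regardless of input resolution. The bin sizes and feature dimensions below are assumptions.

```python
# Spatial pyramid pooling in brief; bin sizes and feature dimensions
# are illustrative assumptions.
import torch
import torch.nn.functional as F

def spp(feature_map, bins=(1, 2, 4)):
    # adaptive pooling at several grid sizes -> fixed-length vector
    pooled = [F.adaptive_max_pool2d(feature_map, b).flatten(1) for b in bins]
    return torch.cat(pooled, dim=1)            # (batch, C * (1 + 4 + 16))

for size in (64, 96, 128):                     # different input scales
    x = torch.randn(2, 256, size // 16, size // 16)
    print(spp(x).shape)                        # always torch.Size([2, 5376])
```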

Journal ArticleDOI
TL;DR: This paper focuses on anomaly detection in hyperspectral images (HSIs) and proposes a novel detection algorithm based on spectral unmixing and dictionary-based low-rank decomposition, which achieves a high detection rate while maintaining a low false alarm rate regardless of the type of images tested.
Abstract: Anomaly detection has been known to be a challenging problem due to the uncertainty of anomalies and the interference of noise. In this paper, we focus on anomaly detection in hyperspectral images (HSIs) and propose a novel detection algorithm based on spectral unmixing and dictionary-based low-rank decomposition. The innovation is threefold. First, due to the highly mixed nature of pixels in HSI data, instead of using the raw pixels directly for anomaly detection, the proposed algorithm applies spectral unmixing to obtain the abundance vectors and uses these vectors for anomaly detection. We show that the abundance vectors possess more distinctive features for identifying anomalies from the background. Second, to better represent the highly correlated background and the sparse anomalies, we construct a dictionary based on mean shift clustering of the abundance vectors to improve both the discriminative and representative powers of the algorithm. Finally, a low-rank matrix decomposition method based on the constructed dictionary is proposed to encourage the coefficients of the dictionary, instead of the background itself, to be low rank, and the residual matrix to be sparse. Anomalies can then be extracted by summing up the columns of the residual matrix. The proposed algorithm is evaluated on both synthetic and real data sets. Experimental results show that the proposed approach consistently achieves a high detection rate while maintaining a low false alarm rate regardless of the type of images tested.
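
A strongly simplified sketch of the detection principle: represent each abundance vector with a background dictionary and flag pixels with large residuals. Plain least squares stands in for the paper's dictionary-based low-rank/sparse decomposition, and the dictionary construction here is a crude assumption.

```python
# A strongly simplified residual-based anomaly score; least squares
# stands in for the paper's low-rank/sparse decomposition.
import numpy as np

A = np.random.rand(6, 10000)                            # abundance vectors per pixel
D = A[:, np.random.choice(10000, 50, replace=False)]    # crude background dictionary

coef, *_ = np.linalg.lstsq(D, A, rcond=None)            # background representation
residual = A - D @ coef
score = np.linalg.norm(residual, axis=0)                # anomaly score per pixel
print(score.shape)
```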

Journal ArticleDOI
TL;DR: This paper proposes a novel deep feature-based method to detect ships in very high-resolution optical remote sensing images by using a regional proposal network to generate ship candidates from feature maps produced by a deep convolutional neural network.
Abstract: Ship detection is an important and challenging task in remote sensing applications. Most methods utilize specially designed hand-crafted features to detect ships and usually work well only at one scale, which lacks generalization and is impractical for identifying ships of various scales in multiresolution images. In this paper, we propose a novel deep feature-based method to detect ships in very high-resolution optical remote sensing images. In our method, a regional proposal network is used to generate ship candidates from feature maps produced by a deep convolutional neural network. To efficiently detect ships at various scales, a hierarchical selective filtering layer is proposed to map features at different scales to the same scale space. The proposed method is an end-to-end network that can detect both inshore and offshore ships ranging from dozens of pixels to thousands. We test our network on a large ship data set, which will be released in the future, consisting of Google Earth images, GaoFen-2 images, and unmanned aerial vehicle data. Experiments demonstrate the high precision and robustness of our method. Further experiments on aerial images show its good generalization to unseen scenes.

Journal ArticleDOI
TL;DR: A new spectral–spatial weighted sparse unmixing (S2WSU) framework, which uses both spectral and spatial weighting factors, further imposing sparsity on the solution, is developed.
Abstract: Spectral unmixing aims at estimating the fractional abundances of a set of pure spectral materials (endmembers) in each pixel of a hyperspectral image. The wide availability of large spectral libraries has fostered the role of sparse regression techniques in the task of characterizing mixed pixels in remotely sensed hyperspectral images. A general solution for sparse unmixing methods consists of using the $\ell _{1}$ regularizer to control the sparsity, resulting in a very promising performance but also suffering from sensitivity to large and small sparse coefficients. A recent trend to address this issue is to introduce weighting factors to penalize the nonzero coefficients in the unmixing solution. While most methods for this purpose focus on analyzing the hyperspectral data by considering the pixels as independent entities, it is known that there exists a strong spatial correlation among features in hyperspectral images. This information can be naturally exploited in order to improve the representation of pixels in the scene. In order to take advantage of the spatial information for hyperspectral unmixing, in this paper, we develop a new spectral–spatial weighted sparse unmixing (S2WSU) framework, which uses both spectral and spatial weighting factors, further imposing sparsity on the solution. Our experimental results, conducted using both simulated and real hyperspectral data sets, illustrate the good potential of the proposed S2WSU, which can greatly improve the abundance estimation results when compared with other advanced spectral unmixing methods.
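
The weighted-sparsity mechanism can be illustrated with iterative soft thresholding in which each coefficient's threshold is scaled by a weight 1/(|x|+eps). This sketch keeps only the spectral weighting (the paper's spatial weighting is omitted), and all parameters are assumptions.

```python
# An illustrative weighted-l1 solver via iterative soft thresholding;
# spectral weighting only, with all parameters chosen as assumptions.
import numpy as np

def weighted_ista(A, y, lam=0.01, eps=1e-3, n_iter=200):
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1/L, L = squared spectral norm
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        w = 1.0 / (np.abs(x) + eps)            # reweighting: small coeffs pay more
        g = x - step * A.T @ (A @ x - y)       # gradient step on the data term
        x = np.sign(g) * np.maximum(np.abs(g) - step * lam * w, 0.0)
    return x

A = np.random.randn(100, 300)                  # stand-in spectral library
x_true = np.zeros(300)
x_true[[5, 40, 200]] = [0.5, 0.3, 0.2]         # three active endmembers
x_hat = weighted_ista(A, A @ x_true)
print(np.argsort(-np.abs(x_hat))[:3])          # ideally near [5, 40, 200]
```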