
Showing papers in "IEEE Geoscience and Remote Sensing Letters in 2017"


Journal ArticleDOI
TL;DR: A multilevel DL architecture targeting land cover and crop type classification from multitemporal multisource satellite imagery; the ensemble of CNNs outperforms the MLP-based variant, allowing better discrimination of certain summer crop types.
Abstract: Deep learning (DL) is a powerful state-of-the-art technique for image processing, including remote sensing (RS) images. This letter describes a multilevel DL architecture that targets land cover and crop type classification from multitemporal multisource satellite imagery. The pillars of the architecture are an unsupervised neural network (NN), used for optical imagery segmentation and for restoring data missing due to clouds and shadows, and an ensemble of supervised NNs. As basic supervised architectures, we use a traditional fully connected multilayer perceptron (MLP) and random forest, the approach most commonly used in the RS community, and compare them with convolutional NNs (CNNs). Experiments are carried out for the Joint Experiment of Crop Assessment and Monitoring test site in Ukraine for classification of crops in a heterogeneous environment using nineteen multitemporal scenes acquired by the Landsat-8 and Sentinel-1A RS satellites. The architecture with an ensemble of CNNs outperforms the one with MLPs, allowing us to better discriminate certain summer crop types, in particular maize and soybeans, and yielding target accuracies above 85% for all major crops (wheat, maize, sunflower, soybeans, and sugar beet).

1,155 citations


Journal ArticleDOI
Yang Zhan, Kun Fu, Menglong Yan, Xian Sun, Hongqi Wang, Xiaosong Qiu
TL;DR: A novel supervised change detection method based on a deep siamese convolutional network for optical aerial images that is comparable with, and sometimes better than, two state-of-the-art methods in terms of F-measure.
Abstract: In this letter, we propose a novel supervised change detection method based on a deep siamese convolutional network for optical aerial images. We train a siamese convolutional network using the weighted contrastive loss. The novelty of the method is that the siamese network learns to extract features directly from the image pairs. Compared with hand-crafted features used by the conventional change detection method, the extracted features are more abstract and robust. Furthermore, because of the advantage of the weighted contrastive loss function, the features have a unique property: the feature vectors of a changed pixel pair are far away from each other, while those of an unchanged pixel pair are close. Therefore, we use the distance between the feature vectors to detect changes between the image pair. Simple threshold segmentation on the distance map can even obtain good performance. For improvement, we use a $k$-nearest neighbor approach to update the initial result. Experimental results show that the proposed method produces results comparable with, and sometimes better than, those of two state-of-the-art methods in terms of F-measure.

402 citations
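The weighted contrastive loss described in the entry above can be sketched in a few lines; this is an illustrative NumPy version, where the margin and the class weights `margin`, `w_pos`, and `w_neg` are assumed hyperparameters, not values taken from the letter:

```python
import numpy as np

def weighted_contrastive_loss(d, y, margin=2.0, w_pos=1.0, w_neg=1.0):
    """Weighted contrastive loss over feature distances.

    d : Euclidean distances between paired feature vectors
    y : labels, 1 = changed pixel pair, 0 = unchanged pixel pair
    Unchanged pairs are pulled together (squared-distance term), while
    changed pairs are pushed at least `margin` apart (hinge term).
    The weights can counter class imbalance between changed and
    unchanged pixels.
    """
    d = np.asarray(d, dtype=float)
    y = np.asarray(y, dtype=float)
    pull = w_neg * (1.0 - y) * d ** 2                    # unchanged: minimize distance
    push = w_pos * y * np.maximum(margin - d, 0.0) ** 2  # changed: enforce margin
    return float(np.mean(pull + push))
```

As the abstract notes, once features are trained this way, thresholding the per-pixel distance map already yields a usable change mask.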


Journal ArticleDOI
TL;DR: Through both quantitative and visual assessments on a large number of high-quality MS images from various sources, it is confirmed that the proposed model is superior to all the mainstream algorithms included in the comparison, and achieves the highest spatial–spectral unified accuracy.
Abstract: In the field of multispectral (MS) and panchromatic image fusion (pansharpening), the impressive effectiveness of deep neural networks has recently been employed to overcome the drawbacks of the traditional linear models and boost the fusion accuracy. However, the existing methods are mainly based on simple and flat networks with relatively shallow architectures, which severely limits their performance. In this letter, the concept of residual learning is introduced to form a very deep convolutional neural network to make full use of the high nonlinearity of the deep learning models. Through both quantitative and visual assessments on a large number of high-quality MS images from various sources, it is confirmed that the proposed model is superior to all the mainstream algorithms included in the comparison, and achieves the highest spatial–spectral unified accuracy.

393 citations


Journal ArticleDOI
TL;DR: This letter proposes a novel feature representation method for scene classification, named bag of convolutional features (BoCF). Different from the traditional bag of visual words-based methods, in which the visual words are usually obtained by using handcrafted feature descriptors, the proposed BoCF generates visual words from deep convolutional features using off-the-shelf convolutional neural networks.
Abstract: More recently, remote sensing image classification has been moving from pixel-level interpretation to scene-level semantic understanding, which aims to label each scene image with a specific semantic class. While significant efforts have been made in developing various methods for remote sensing image scene classification, most of them rely on handcrafted features. In this letter, we propose a novel feature representation method for scene classification, named bag of convolutional features (BoCF). Different from the traditional bag of visual words-based methods in which the visual words are usually obtained by using handcrafted feature descriptors, the proposed BoCF generates visual words from deep convolutional features using off-the-shelf convolutional neural networks. Extensive evaluations on a publicly available remote sensing image scene classification benchmark and comparison with the state-of-the-art methods demonstrate the effectiveness of the proposed BoCF method for remote sensing image scene classification.

276 citations


Journal ArticleDOI
Qin Zhang, Hui Wang, Junyu Dong, Guoqiang Zhong, Xin Sun
TL;DR: This letter adopts long short-term memory (LSTM) to predict sea surface temperature (SST), and makes short- and long-term prediction, including weekly mean and monthly mean, and the model’s online updated characteristics are presented.
Abstract: This letter adopts long short-term memory (LSTM) to predict sea surface temperature (SST), making both short-term predictions (one day and three days ahead) and long-term predictions (weekly mean and monthly mean). The SST prediction problem is formulated as a time series regression problem. The proposed network architecture is composed of two kinds of layers: an LSTM layer and a fully connected dense layer. The LSTM layer is utilized to model the time series relationship. The fully connected layer is utilized to map the output of the LSTM layer to a final prediction. The optimal setting of this architecture is explored by experiments, and the accuracy over the coastal seas of China is reported to confirm the effectiveness of the proposed method. The prediction accuracy is also tested on the SST anomaly data. In addition, the model's online updated characteristics are presented.

265 citations
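Formulating SST prediction as time series regression, as the entry above does, amounts to a sliding-window supervision scheme before the data reach the LSTM. A minimal sketch (the function name and window parameters are illustrative, not taken from the letter):

```python
import numpy as np

def make_supervised(series, n_lags, horizon=1):
    """Turn an SST series into (X, y) pairs for time-series regression:
    each sample uses `n_lags` past values to predict the value
    `horizon` steps ahead (e.g. one day or three days)."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for t in range(len(series) - n_lags - horizon + 1):
        X.append(series[t:t + n_lags])          # input window of past values
        y.append(series[t + n_lags + horizon - 1])  # target value to predict
    return np.stack(X), np.asarray(y)
```

Weekly- and monthly-mean prediction follows the same scheme after averaging the series into weekly or monthly bins.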


Journal ArticleDOI
TL;DR: In this letter, a method using a 3-D convolutional neural network to fuse together multispectral and hyperspectral images to obtain a high resolution HS image is proposed.
Abstract: In this letter, we propose a method using a 3-D convolutional neural network to fuse together multispectral and hyperspectral (HS) images to obtain a high-resolution HS image. Dimensionality reduction of the HS image is performed prior to fusion in order to significantly reduce the computational time and make the method more robust to noise. Experiments are performed on a data set simulated using a real HS image. The results obtained show that the proposed approach is very promising when compared with conventional methods. This is especially true when the HS image is corrupted by additive noise.

261 citations
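The dimensionality-reduction step applied to the HS cube before fusion is typically PCA-like; below is a hedged NumPy sketch of projecting the spectral bands onto principal components. The letter does not specify this exact implementation, so treat it as one plausible realization:

```python
import numpy as np

def pca_reduce(hs_cube, n_components):
    """Project an HS cube of shape (H, W, B) onto its first principal
    components. Returns the reduced cube (H, W, n_components) plus the
    band mean and loadings needed to map fused components back to full
    spectra after fusion."""
    h, w, b = hs_cube.shape
    flat = hs_cube.reshape(-1, b).astype(float)
    mean = flat.mean(axis=0)
    centered = flat - mean
    # Eigendecomposition of the band covariance matrix
    cov = centered.T @ centered / (flat.shape[0] - 1)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_components]
    components = vecs[:, order]               # (B, n_components) loadings
    reduced = centered @ components
    return reduced.reshape(h, w, n_components), mean, components
```

Working in this reduced space is what makes the 3-D CNN fusion faster and, as the abstract notes, more robust to additive noise.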


Journal ArticleDOI
Wenping Ma, Wen Zelian, Yue Wu, Licheng Jiao, Maoguo Gong, Yafei Zheng, Liang Liu
TL;DR: A new gradient definition is introduced to overcome the difference of image intensity between the remote image pairs, and an enhanced feature matching method combining the position, scale, and orientation of each keypoint is introduced to increase the number of correct correspondences.
Abstract: The scale-invariant feature transform algorithm and its many variants are widely used in feature-based remote sensing image registration. However, it may be difficult to find enough correct correspondences for remote image pairs in some cases that exhibit a significant difference in intensity mapping. In this letter, a new gradient definition is introduced to overcome the difference of image intensity between the remote image pairs. Then, an enhanced feature matching method by combining the position, scale, and orientation of each keypoint is introduced to increase the number of correct correspondences. The proposed algorithm is tested on multispectral and multisensor remote sensing images. The experimental results show that the proposed method improves the matching performance compared with several state-of-the-art methods in terms of the number of correct correspondences and aligning accuracy.

243 citations


Journal ArticleDOI
TL;DR: Experimental performance demonstrates that the proposed anomaly detection framework with a transferred deep convolutional neural network outperforms the classic Reed-Xiaoli detector and the state-of-the-art representation-based detectors, such as the sparse representation-based detector (SRD) and the collaborative representation-based detector.
Abstract: In this letter, a novel anomaly detection framework with transferred deep convolutional neural network (CNN) is proposed. The framework is designed by considering the following facts: 1) a reference data with labeled samples are utilized, because no prior information is available about the image scene for anomaly detection and 2) pixel pairs are generated to enlarge the sample size, since the advantage of CNN can be realized only if the number of training samples is sufficient. A multilayer CNN is trained by using difference between pixel pairs generated from the reference image scene. Then, for each pixel in the image for anomaly detection, difference between pixel pairs, constructed by combining the center pixel and its surrounding pixels, is classified by the trained CNN with the result of similarity measurement. The detection output is simply generated by averaging these similarity scores. Experimental performance demonstrates that the proposed algorithm outperforms the classic Reed-Xiaoli and the state-of-the-art representation-based detectors, such as sparse representation-based detector (SRD) and collaborative representation-based detector.

226 citations
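The pixel-pair scoring scheme from the entry above can be sketched as follows, with a generic `score_fn` standing in for the trained CNN's similarity output — an assumption for illustration, not the paper's actual network:

```python
import numpy as np

def anomaly_scores(img, score_fn, radius=1):
    """For each pixel, form difference vectors with its surrounding
    pixels, score each difference with `score_fn` (a stand-in for the
    trained CNN classifying pixel-pair differences), and average the
    scores into a detection map.

    img : (H, W, B) image; score_fn maps a (B,) difference to a scalar.
    """
    h, w, _ = img.shape
    out = np.zeros((h, w))
    offsets = [(di, dj) for di in range(-radius, radius + 1)
               for dj in range(-radius, radius + 1) if (di, dj) != (0, 0)]
    for i in range(h):
        for j in range(w):
            scores = []
            for di, dj in offsets:
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    # difference between center pixel and a neighbor
                    scores.append(score_fn(img[i, j] - img[ni, nj]))
            out[i, j] = np.mean(scores)  # detection output = averaged scores
    return out
```

A pixel that differs strongly from all of its surroundings accumulates high scores from every pair, which is exactly the averaging behavior the abstract describes.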


Journal ArticleDOI
TL;DR: A new strategy, which combines Gabor filters with convolutional filters, is proposed for hyperspectral image classification to mitigate the problem of overfitting and results reveal that the proposed model provides competitive results in terms of classification accuracy.
Abstract: Recently, the capability of deep learning-based approaches, especially deep convolutional neural networks (CNNs), has been investigated for hyperspectral remote sensing feature extraction (FE) and classification. Due to the large number of learnable parameters in convolutional filters, lots of training samples are needed in deep CNNs to avoid the overfitting problem. On the other hand, Gabor filtering can effectively extract spatial information including edges and textures, which may reduce the FE burden of the CNNs. In this letter, in order to make the most of deep CNN and Gabor filtering, a new strategy, which combines Gabor filters with convolutional filters, is proposed for hyperspectral image classification to mitigate the problem of overfitting. The obtained results reveal that the proposed model provides competitive results in terms of classification accuracy, especially when only a limited number of training samples are available.

205 citations
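A Gabor filter of the kind combined with the convolutional filters above is a Gaussian envelope modulated by a sinusoidal carrier; banks of such kernels at several orientations extract the edge and texture responses that reduce the CNN's feature-extraction burden. A minimal real-valued sketch (all parameter values are illustrative):

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real part of a 2-D Gabor filter: a Gaussian envelope times a
    cosine carrier, oriented at angle `theta` (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # rotate coordinates to the filter orientation
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + gamma**2 * yr**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    return envelope * carrier

# A small bank at four orientations, as one might feed alongside CNN filters
bank = [gabor_kernel(15, wavelength=8, theta=t, sigma=4)
        for t in np.linspace(0, np.pi, 4, endpoint=False)]
```

Because these kernels are fixed rather than learned, they contribute spatial structure without adding trainable parameters, which is how the combination mitigates overfitting with few training samples.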


Journal ArticleDOI
TL;DR: This letter proposes a new single-image super-resolution algorithm named local–global combined networks (LGCNet) for remote sensing images based on the deep CNNs, elaborately designed with its “multifork” structure to learn multilevel representations ofRemote sensing images including both local details and global environmental priors.
Abstract: Super-resolution is an image processing technology that recovers a high-resolution image from a single or sequential low-resolution images. Recently, deep convolutional neural networks (CNNs) have made a huge breakthrough in many tasks including super-resolution. In this letter, we propose a new single-image super-resolution algorithm named local–global combined networks (LGCNet) for remote sensing images based on the deep CNNs. Our LGCNet is elaborately designed with its “multifork” structure to learn multilevel representations of remote sensing images including both local details and global environmental priors. Experimental results on a public remote sensing data set (UC Merced) demonstrate an overall improvement of both accuracy and visual performance over several state-of-the-art algorithms.

203 citations


Journal ArticleDOI
TL;DR: This letter investigates a variety of fusion techniques to blend multiple DCNN land cover classifiers into a single aggregate classifier, and uses DCNN cross-validation results for the input densities of fuzzy integrals followed by evolutionary optimization to produce state-of-the-art classification results.
Abstract: Deep convolutional neural networks (DCNNs) have recently emerged as a dominant paradigm for machine learning in a variety of domains. However, acquiring a suitably large data set for training a DCNN is often a significant challenge. This is a major issue in the remote sensing domain, where we have extremely large collections of satellite and aerial imagery, but lack the rich label information that is often readily available for other image modalities. In this letter, we investigate the use of DCNNs for land-cover classification in high-resolution remote sensing imagery. To overcome the lack of massive labeled remote-sensing image data sets, we employ two techniques in conjunction with DCNNs: transfer learning (TL) with fine-tuning and data augmentation tailored specifically for remote sensing imagery. TL allows one to bootstrap a DCNN while preserving the deep visual feature extraction learned over an image corpus from a different image domain. Data augmentation exploits various aspects of remote sensing imagery to dramatically expand small training image data sets and improve DCNN robustness for remote sensing image data. Here, we apply these techniques to the well-known UC Merced data set to achieve land-cover classification accuracies of 97.8 ± 2.3%, 97.6 ± 2.6%, and 98.5 ± 1.4% with CaffeNet, GoogLeNet, and ResNet, respectively.
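Remote-sensing-specific augmentation of the kind described above exploits the fact that overhead imagery has no canonical "up": the eight dihedral views of a patch are all valid training samples. A sketch (the helper name is ours, and the letter's full augmentation may include more than these transforms):

```python
import numpy as np

def dihedral_augment(patch):
    """Return the eight label-preserving views of an overhead image
    patch: the four 90-degree rotations of the patch and of its mirror
    image. Each view is a valid training sample for land-cover
    classification because aerial scenes have no canonical orientation."""
    views = []
    for base in (patch, np.fliplr(patch)):
        for k in range(4):
            views.append(np.rot90(base, k))
    return views
```

Applied to every labeled patch, this alone expands a small training set eightfold before any photometric augmentation is added.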

Journal ArticleDOI
TL;DR: RNNs are competitive compared with the state-of-the-art classifiers, and may outperform classical approaches in the presence of low represented and/or highly mixed classes, and it is shown that the alternative feature representation generated by LSTM can improve the performances of standard classifiers.
Abstract: Nowadays, modern earth observation programs produce huge volumes of satellite image time series that can be useful to monitor geographical areas through time. How to efficiently analyze such a kind of information is still an open question in the remote sensing field. Recently, deep learning methods proved suitable to deal with remote sensing data mainly for scene classification (i.e., convolutional neural networks on single images), while only very few studies exist involving temporal deep learning approaches [i.e., recurrent neural networks (RNNs)] to deal with remote sensing time series. In this letter, we evaluate the ability of RNNs, in particular, the long short-term memory (LSTM) model, to perform land cover classification considering multitemporal spatial data derived from a time series of satellite images. We carried out experiments on two different data sets considering both pixel-based and object-based classifications. The obtained results show that RNNs are competitive compared with the state-of-the-art classifiers, and may outperform classical approaches in the presence of low represented and/or highly mixed classes. We also show that the alternative feature representation generated by LSTM can improve the performances of standard classifiers.

Journal ArticleDOI
Daoyu Lin, Kun Fu, Yang Wang, Guangluan Xu, Xian Sun
TL;DR: An unsupervised model called multiple-layer feature-matching generative adversarial networks (MARTA GANs) to learn a representation using only unlabeled data to improve the classification performance compared with other state-of-the-art methods.
Abstract: With the development of deep learning, supervised learning has frequently been adopted to classify remotely sensed images using convolutional networks. However, due to the limited amount of labeled data available, supervised learning is often difficult to carry out. Therefore, we propose an unsupervised model called multiple-layer feature-matching generative adversarial networks (MARTA GANs) to learn a representation using only unlabeled data. MARTA GANs consists of both a generative model $G$ and a discriminative model $D$. We treat $D$ as a feature extractor. To fit the complex properties of remote sensing data, we use a fusion layer to merge the mid-level and global features. $G$ can produce numerous images that are similar to the training data; therefore, $D$ can learn better representations of remotely sensed images using the training data provided by $G$. The classification results on two widely used remote sensing image databases show that the proposed method significantly improves the classification performance compared with other state-of-the-art methods.

Journal ArticleDOI
TL;DR: Experimental results on the moving and stationary target acquisition and recognition data set indicate that the branched ensemble model based on the unit architecture can achieve 99% classification accuracy with all training data.
Abstract: The deep convolutional neural network (CNN) has been widely used for target classification, because it can learn highly useful representations from data. However, it is difficult to apply a CNN to synthetic aperture radar (SAR) target classification directly, for it often requires a large volume of labeled training data, which is impractical for SAR applications. The highway network is a newly proposed architecture based on CNN that can be trained with smaller data sets. This letter proposes a novel architecture called the convolutional highway unit to train deeper networks with limited SAR data. The unit architecture is formed by modified convolutional highway layers, a maxpool layer, and a dropout layer. Then, the networks can be flexibly formed by stacking the unit architecture to extract deep feature representations for classification. Experimental results on the moving and stationary target acquisition and recognition data set indicate that the branched ensemble model based on the unit architecture can achieve 99% classification accuracy with all training data. When the training data are reduced to 30%, the classification accuracy of the ensemble model can still reach 94.97%.

Journal ArticleDOI
TL;DR: This letter shows the first study of Transfer Learning between a simulated data set and a set of real SAR images, and shows that a Convolutional Neural Network pretrained on simulated data has a great advantage over a Convnet trained only on real data, especially when real data are sparse.
Abstract: Data-driven classification algorithms have proved to do well for automatic target recognition (ATR) in synthetic aperture radar (SAR) data. Collecting data sets suitable for these algorithms is a challenge in itself as it is difficult and expensive. Due to the lack of labeled data sets with real SAR images of sufficient size, simulated data play a big role in SAR ATR development, but the transferability of knowledge learned on simulated data to real data remains to be studied further. In this letter, we show the first study of Transfer Learning between a simulated data set and a set of real SAR images. The simulated data set is obtained by adding a simulated object radar reflectivity to a terrain model of individual point scatterers, prior to focusing. Our results show that a Convolutional Neural Network (Convnet) pretrained on simulated data has a great advantage over a Convnet trained only on real data, especially when real data are sparse. The advantages of pretraining the models on simulated data show both in terms of faster convergence during the training phase and on the end accuracy when benchmarked on the Moving and Stationary Target Acquisition and Recognition data set. These results encourage SAR ATR development to continue the improvement of simulated data sets of greater size and complex scenarios in order to build robust algorithms for real-life SAR ATR applications.

Journal ArticleDOI
TL;DR: Experimental results demonstrate that the trained RSRCNN model is able to advance the state-of-the-art road extraction for aerial images, in terms of precision, recall, F-score, and accuracy.
Abstract: In this letter, we propose a road structure refined convolutional neural network (RSRCNN) approach for road extraction in aerial images. In order to obtain structured output of road extraction, both deconvolutional and fusion layers are designed in the architecture of RSRCNN. For training RSRCNN, a new loss function is proposed to incorporate the geometric information of road structure in cross-entropy loss, thus called road-structure-based loss function. Experimental results demonstrate that the trained RSRCNN model is able to advance the state-of-the-art road extraction for aerial images, in terms of precision, recall, F-score, and accuracy.

Journal ArticleDOI
TL;DR: A drone classification method based on convolutional neural network (CNN) and micro-Doppler signature (MDS) and GoogLeNet, a CNN structure, is utilized for the proposed image data set because of its high performance and optimized computing resources.
Abstract: We propose a drone classification method based on a convolutional neural network (CNN) and the micro-Doppler signature (MDS). The MDS presents Doppler information only in the time domain. The frequency-domain representation of the MDS is called the cadence-velocity diagram (CVD). To analyze the Doppler information of a drone in both the time and frequency domains, we propose a new image, the merged Doppler image, formed by merging the MDS and CVD. GoogLeNet, a CNN structure, is utilized for the proposed image data set because of its high performance and optimized computing resources. The image data set is generated from the returned Ku-band frequency-modulated continuous-wave radar signal. The proposed approach is tested and verified in two different environments: an anechoic chamber and outdoors. First, we tested our approach with different numbers of operating motors and aspect angles of a drone. The proposed method improved the accuracy from 89.3% to 94.7%. Second, two types of drone at heights of 50 and 100 m are classified with 100% accuracy, owing to the distinct difference in the resulting images.
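The CVD is obtained by taking the Fourier transform of the micro-Doppler spectrogram along the time axis, revealing how often each Doppler (velocity) component repeats, e.g. at rotor blade cadence. A minimal sketch, assuming the spectrogram is stored as (Doppler bins × time frames):

```python
import numpy as np

def cadence_velocity_diagram(spectrogram):
    """Cadence-velocity diagram: magnitude FFT of a micro-Doppler
    spectrogram along the time axis. Rows index Doppler (velocity)
    bins; columns of the result index cadence frequencies."""
    return np.abs(np.fft.fft(spectrogram, axis=1))
```

Stacking the MDS and its CVD side by side is, in spirit, how a merged Doppler image combines time- and frequency-domain views for the CNN.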

Journal ArticleDOI
TL;DR: A novel dimensionality reduction algorithm, locality adaptive discriminant analysis (LADA) for HSI classification that aims to learn a representative subspace of data, and focuses on the data points with close relationship in spectral and spatial domains.
Abstract: Linear discriminant analysis (LDA) is a popular technique for supervised dimensionality reduction, but with less concern about a local data structure. This makes LDA inapplicable to many real-world situations, such as hyperspectral image (HSI) classification. In this letter, we propose a novel dimensionality reduction algorithm, locality adaptive discriminant analysis (LADA) for HSI classification. The proposed algorithm aims to learn a representative subspace of data, and focuses on the data points with close relationship in spectral and spatial domains. An intuitive motivation is that data points of the same class have similar spectral feature and the data points among spatial neighborhood are usually associated with the same class. Compared with traditional LDA and its variants, LADA is able to adaptively exploit the local manifold structure of data. Experiments carried out on several real hyperspectral data sets demonstrate the effectiveness of the proposed method.

Journal ArticleDOI
TL;DR: A new feature fusion framework based on deep neural networks (DNNs) to effectively extract features of multi-/hyperspectral and light detection and ranging data and provides competitive results in terms of classification accuracy.
Abstract: The multisensory fusion of remote sensing data has attracted great attention in recent years. In this letter, we propose a new feature fusion framework based on deep neural networks (DNNs). The proposed framework employs deep convolutional neural networks (CNNs) to effectively extract features of multi-/hyperspectral and light detection and ranging data. Then, a fully connected DNN is designed to fuse the heterogeneous features obtained by the previous CNNs. Through the aforementioned deep networks, one can extract the discriminant and invariant features of remote sensing data, which are useful for further processing. Finally, logistic regression is used to produce the final classification results. Dropout and batch normalization strategies are adopted in the deep fusion framework to further improve classification accuracy. The obtained results reveal that the proposed deep fusion model provides competitive results in terms of classification accuracy. Furthermore, the proposed deep learning idea opens a new window for future remote sensing data fusion.

Journal ArticleDOI
TL;DR: Experiments carried out on a Quickbird image acquired over the city of Dar es Salaam, Tanzania, show that the proposed FCN outperforms state-of-the-art convolutional networks and that the computational cost of the proposed technique is significantly lower than that of standard patch-based architectures.
Abstract: This letter investigates fully convolutional networks (FCNs) for the detection of informal settlements in very high resolution (VHR) satellite images. Informal settlements or slums are proliferating in developing countries and their detection and classification provides vital information for decision making and planning urban upgrading processes. Distinguishing different urban structures in VHR images is challenging because of the abstract semantic definition of the classes as opposed to the separation of standard land-cover classes. This task requires extraction of texture and spatial features. To this aim, we introduce deep FCNs to perform pixel-wise image labeling by automatically learning a higher level representation of the data. Deep FCNs can learn a hierarchy of features associated to increasing levels of abstraction, from raw pixel values to edges and corners up to complex spatial patterns. We present a deep FCN using dilated convolutions of increasing spatial support. It is capable of learning informative features capturing long-range pixel dependencies while keeping a limited number of network parameters. Experiments carried out on a Quickbird image acquired over the city of Dar es Salaam, Tanzania, show that the proposed FCN outperforms state-of-the-art convolutional networks. Moreover, the computational cost of the proposed technique is significantly lower than standard patch-based architectures.

Journal ArticleDOI
TL;DR: In this letter, a GA-SVM algorithm was proposed as a method of classifying multifrequency RADARSAT-2 (RS2) SAR images and Thaichote (THEOS) multispectral images and showed improved classification accuracy and demonstrated the advantages of using the GA- SVM algorithm, which provided the best accuracy using fewer features.
Abstract: Multisource remote sensing data have been widely used to improve land-cover classifications. The combination of synthetic aperture radar (SAR) and optical imagery can detect different land-cover types, and the use of genetic algorithms (GAs) and support vector machines (SVMs) can lead to improved classifications. Moreover, SVM kernel parameters and feature selection affect the classification accuracy. Thus, a GA was implemented for feature selection and parameter optimization. In this letter, a GA-SVM algorithm was proposed as a method of classifying multifrequency RADARSAT-2 (RS2) SAR images and Thaichote (THEOS) multispectral images. The results of the GA-SVM algorithm were compared with those of the grid search algorithm, a traditional method of parameter searching. The results showed that the GA-SVM algorithm outperformed the grid search approach and provided higher classification accuracy using fewer input features. The images obtained by fusing RS2 data and THEOS data provided high classification accuracy at over 95%. The results showed improved classification accuracy and demonstrated the advantages of using the GA-SVM algorithm, which provided the best accuracy using fewer features.
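GA-based feature selection of the kind used above encodes each candidate feature subset as a bit mask and evolves a population toward higher fitness. The sketch below uses truncation selection, one-point crossover, and bit-flip mutation, with a caller-supplied `fitness` standing in for cross-validated SVM accuracy (optionally penalized by feature count); all operator choices are illustrative, not the letter's exact configuration:

```python
import random

def ga_feature_select(n_features, fitness, pop_size=20, generations=30,
                      p_mut=0.05, seed=0):
    """Evolve bit masks over `n_features` inputs toward higher `fitness`.
    Each chromosome is a list of 0/1 flags marking selected features."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[:pop_size // 2]            # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation with probability p_mut per gene
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

In the GA-SVM setting, the same chromosome can also carry encoded SVM kernel parameters, so feature selection and parameter tuning are optimized jointly.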

Journal ArticleDOI
TL;DR: A fully convolutional network is utilized to tackle the problem of inshore ship detection and a task partitioning model in the network is implemented, where layers at different depths are assigned different tasks.
Abstract: Ship detection in optical remote sensing imagery has drawn much attention in recent years, especially with regards to the more challenging inshore ship detection. However, recent work on this subject relies heavily on hand-crafted features that require carefully tuned parameters and on complicated procedures. In this letter, we utilize a fully convolutional network (FCN) to tackle the problem of inshore ship detection and design a ship detection framework that possesses a more simplified procedure and a more robust performance. When tackling the ship detection problem with FCN, there are two major difficulties: 1) the long and thin shape of the ships and their arbitrary direction makes the objects extremely anisotropic and hard to be captured by network features and 2) ships can be closely docked side by side, which makes separating them difficult. Therefore, we implement a task partitioning model in the network, where layers at different depths are assigned different tasks. The deep layer in the network provides detection functionality and the shallow layer supplements with accurate localization. This approach mitigates the tradeoff of FCN between localization accuracy and feature representative ability, which is of importance in the detection of closely docked ships. The experiments demonstrate that this framework, with the advantages of FCN and the task partitioning model, provides robust and reliable inshore ship detection in complex contexts.

Journal ArticleDOI
TL;DR: Experiments on real SAR images validate that the proposed transform does enhance these structures and the whole algorithm is of good performance, especially in the case of low-contrast targets.
Abstract: Synthetic aperture radar (SAR) is an indispensable and extensively used sensor in ship detection. As high-resolution SAR introduces more spatial details into images, this letter proposes an intensity-space (IS) domain constant false alarm rate (CFAR) ship detector to make good use of this information. The method fuses the intensity of each pixel and the correlations between pixels into one characteristic, i.e., the IS index. All the detection procedures center on the calculation and analysis of the IS index. First, a new transform maps an image into a new IS domain. Structures like ships and wakes are enhanced in the IS domain. Second, a CFAR detector picks up high IS index pixels. Third, a chain of target features is checked to screen out false candidate target pixels. Enhanced wakes are also exploited to improve detection results. Experiments on real SAR images validate that the proposed transform does enhance these structures and that the whole algorithm performs well, especially in the case of low-contrast targets.
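A CFAR detector of the cell-averaging family, which the IS-domain detector builds on, declares a detection when a cell exceeds a scaled estimate of the local background taken from surrounding training cells (guard cells excluded). A 1-D illustrative sketch only; the letter's detector operates on its IS index rather than raw intensity, and its threshold is set from the desired false-alarm rate:

```python
import numpy as np

def ca_cfar_1d(x, guard=2, train=8, scale=3.0):
    """Cell-averaging CFAR on a 1-D intensity profile: flag cell i when
    x[i] exceeds `scale` times the mean of the `train` cells on each
    side, skipping `guard` cells adjacent to i so the target itself
    does not contaminate the background estimate."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    hits = np.zeros(n, dtype=bool)
    for i in range(n):
        lo = max(0, i - guard - train)
        hi = min(n, i + guard + train + 1)
        window = np.concatenate([x[lo:max(0, i - guard)],
                                 x[min(n, i + guard + 1):hi]])
        if window.size and x[i] > scale * window.mean():
            hits[i] = True
    return hits
```

The same sliding-window logic extends to 2-D images by using a ring of training pixels around each cell under test.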

Journal ArticleDOI
TL;DR: A deep-learning-based classification method that combines convolutional neural networks (CNNs) and an extreme learning machine (ELM) to improve classification performance is proposed and achieves satisfactory results.
Abstract: One of the challenging issues in high-resolution remote sensing images is classifying land-use scenes with high quality and accuracy. An effective feature extractor and classifier can boost classification accuracy in scene classification. This letter proposes a deep-learning-based classification method, which combines convolutional neural networks (CNNs) and an extreme learning machine (ELM) to improve classification performance. A pretrained CNN is initially used to learn deep and robust features. However, its generalization ability is limited and suboptimal, because the traditional CNN uses fully connected layers as the classifier. We instead use an ELM classifier on the CNN-learned features, in place of the fully connected layers, to obtain excellent results. The effectiveness of the proposed method is tested on the UC-Merced data set, which has 2100 remotely sensed land-use-scene images in 21 categories. Experimental results show that the proposed CNN-ELM classification method achieves satisfactory results.
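The ELM replacement for the fully connected classifier is simple to sketch: hidden-layer weights are random and fixed, and only the output weights are solved in closed form by least squares. A minimal NumPy version, applied here to generic feature vectors standing in for the CNN features, might look like:

```python
import numpy as np

class ELM:
    """Extreme learning machine: a random, untrained hidden layer followed
    by a least-squares fit of the output weights to one-hot targets."""

    def __init__(self, n_hidden=200, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the random projection
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        T = (y[:, None] == self.classes[None, :]).astype(float)  # one-hot targets
        H = self._hidden(X)
        self.beta = np.linalg.pinv(H) @ T  # closed-form output weights
        return self

    def predict(self, X):
        return self.classes[np.argmax(self._hidden(X) @ self.beta, axis=1)]
```

Because training reduces to a single pseudoinverse, fitting is fast, which is the main appeal of swapping the ELM in for the CNN's fully connected layers.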

Journal ArticleDOI
TL;DR: Results indicate that the proposed computer vision system, which tracks soybean foliar diseases in the field using images captured by the low-cost unmanned aerial vehicle model DJI Phantom 3, can help experts and farmers monitor diseases in soybean fields.
Abstract: Soybean has been the main Brazilian agricultural commodity, contributing substantially to the country’s trade balance. However, foliar diseases, usually caused by fungi, bacteria, viruses, and nematodes, are the key factor that can undermine soybean production. This letter proposes a computer vision system to track soybean foliar diseases in the field using images captured by the low-cost unmanned aerial vehicle model DJI Phantom 3. The proposed system is based on the Simple Linear Iterative Clustering segmentation method to detect plant leaves in the images and on visual attributes that describe foliar physical properties, such as color, gradient, texture, and shape. Our methodology evaluated the performance of six classifiers at five different heights: 1, 2, 4, 8, and 16 m. Experimental results showed that color and texture attributes lead to higher classification rates, achieving a precision of 98.34% for heights between 1 and 2 m, with a decrease of about 2% per additional meter of height. Results indicate that our approach can help experts and farmers monitor diseases in soybean fields.
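As a hedged illustration of the per-segment feature step (the letter's actual color, gradient, texture, and shape descriptors are richer), one might compute simple color statistics plus a mean gradient magnitude for each superpixel produced by the segmentation:

```python
import numpy as np

def segment_features(image, labels):
    """Per-segment descriptor: mean and std of each color channel, plus the
    mean gradient magnitude of the grayscale image as a crude texture cue.
    `image` is (H, W, C) float, `labels` is an (H, W) integer segment map."""
    gy, gx = np.gradient(image.mean(axis=2))  # grayscale gradients
    grad = np.hypot(gx, gy)
    feats = []
    for s in np.unique(labels):
        mask = labels == s
        f = []
        for c in range(image.shape[2]):
            f += [image[..., c][mask].mean(), image[..., c][mask].std()]
        f.append(grad[mask].mean())
        feats.append(f)
    return np.asarray(feats)  # one row of 2*C + 1 features per segment
```

The resulting rows can be fed directly to any off-the-shelf classifier, mirroring the segment-then-classify pipeline described above.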

Journal ArticleDOI
TL;DR: This letter describes extensive experiments using actual WV-3 images that demonstrate that some approaches can yield better performance than others, as measured by the proposed blind image quality assessment model of hypersharpened SWIR images.
Abstract: WorldView 3 (WV-3) is the first commercially deployed super-spectral, very high-resolution (HR) satellite. However, the resolution of the short-wave infrared (SWIR) bands is much lower than that of the other bands. In this letter, we describe four different approaches, which are combinations of pansharpening and hypersharpening methods, to generate HR SWIR images. Since there are no ground truth HR SWIR images, we also propose a new picture quality predictor to assess hypersharpening performance, without the need for reference images. We describe extensive experiments using actual WV-3 images that demonstrate that some approaches can yield better performance than others, as measured by the proposed blind image quality assessment model of hypersharpened SWIR images.
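As a sketch of the general ratio-based sharpening idea behind such methods (not the specific pansharpening/hypersharpening combinations evaluated in the letter), a Brovey-style modulation of one low-resolution band by a high-resolution pan image, assuming the pan dimensions are an integer multiple of the band's, can be written as:

```python
import numpy as np

def brovey_sharpen(lr_band, pan, scale):
    """Intensity-ratio sharpening: nearest-neighbor upsample the
    low-resolution band, then modulate it by the ratio of the high-res
    pan image to its block-averaged (spatially degraded) version."""
    up = np.kron(lr_band, np.ones((scale, scale)))   # upsample to pan grid
    h, w = lr_band.shape
    # Degrade pan to the low-res grid by block averaging, then re-upsample
    pan_lr = pan.reshape(h, scale, w, scale).mean(axis=(1, 3))
    pan_lr_up = np.kron(pan_lr, np.ones((scale, scale)))
    return up * (pan / (pan_lr_up + 1e-12))
```

A useful sanity check: if the low-resolution band is exactly the block average of the pan image, the sharpened output reproduces the pan image.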

Journal ArticleDOI
TL;DR: Experimental results demonstrate the advantage of the novel similarity metric DLSC over state-of-the-art similarity metrics (such as NCC and mutual information) and show its superior matching performance.
Abstract: Although image matching techniques have advanced over recent decades, automatic optical-to-synthetic aperture radar (SAR) image matching is still a challenging task due to significant nonlinear intensity differences between such images. This letter addresses this problem by proposing a novel similarity metric for image matching based on shape properties. A shape descriptor named dense local self-similarity (DLSS) is first developed based on self-similarities within images. Then, a similarity metric (named DLSC) is defined using the normalized cross correlation (NCC) of the DLSS descriptors, followed by a template matching strategy to detect correspondences between images. DLSC is robust against significant nonlinear intensity differences because it captures the shape similarity between images, which is independent of intensity patterns. DLSC has been evaluated on four pairs of optical and SAR images. Experimental results demonstrate its advantage over state-of-the-art similarity metrics (such as NCC and mutual information) and show its superior matching performance.
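The template matching step amounts to locating the normalized cross-correlation peak. A minimal sketch, applied here to raw intensities for illustration (DLSC would instead correlate the DLSS descriptor maps, which is what makes it robust to nonlinear intensity differences), is:

```python
import numpy as np

def ncc_match(image, template):
    """Slide a template over an image and return the location and value of
    the zero-normalized cross-correlation peak."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    best, best_pos = -2.0, (0, 0)
    for i in range(image.shape[0] - th + 1):
        for j in range(image.shape[1] - tw + 1):
            patch = image[i:i + th, j:j + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum()) * t_norm
            if denom < 1e-12:
                continue  # flat patch: correlation undefined, skip
            score = (p * t).sum() / denom
            if score > best:
                best, best_pos = score, (i, j)
    return best_pos, best
```

The brute-force scan is quadratic in image size; FFT-based correlation or precomputed running sums would be used at realistic scales.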

Journal ArticleDOI
TL;DR: A specially designed fully convolutional network was introduced to learn deep patterns for cloud and snow detection from multispectral satellite images and greatly outperforms the state-of-the-art methods in both quantitative and qualitative terms.
Abstract: Cloud and snow detection has significant remote sensing applications, yet clouds and snow share similar low-level features due to their consistent color distributions and similar local texture patterns. Thus, accurately distinguishing cloud from snow at the pixel level in satellite images has always been a challenging task for traditional approaches. To address this problem, in this letter, we propose a deep learning system that classifies cloud and snow at the pixel level with fully convolutional neural networks. Specifically, a specially designed fully convolutional network was introduced to learn deep patterns for cloud and snow detection from multispectral satellite images. Then, a multiscale prediction strategy was introduced to integrate low-level spatial information and high-level semantic information simultaneously. Finally, a new and challenging cloud and snow data set was labeled manually to train and further evaluate the proposed method. Extensive experiments demonstrate that the proposed deep model greatly outperforms the state-of-the-art methods in both quantitative and qualitative terms.
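The multiscale prediction strategy can be caricatured as upsampling a coarse, semantically strong score map and blending it with a fine, spatially precise one. The following toy sketch (with a hypothetical fixed blending weight, not the paper's actual fusion) illustrates the idea:

```python
import numpy as np

def fuse_multiscale(coarse_scores, fine_scores, w=0.5):
    """Fuse per-class score maps from two depths of a fully convolutional
    network: nearest-neighbor upsample the coarse (semantic) map to the
    fine (spatial) resolution, then take a weighted average. Both inputs
    are (H, W, C) arrays of class scores."""
    sh = fine_scores.shape[0] // coarse_scores.shape[0]
    sw = fine_scores.shape[1] // coarse_scores.shape[1]
    up = np.kron(coarse_scores, np.ones((sh, sw, 1)))  # upsample H and W only
    return w * up + (1.0 - w) * fine_scores
```

Taking the per-pixel argmax of the fused scores yields the final cloud/snow label map; the fine-scale map can override the coarse prediction wherever it is confident.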

Journal ArticleDOI
TL;DR: An end-to-end model was developed that can directly synthesize the desired images from a known image database; a clutter normalization method improved the speed of convergence by up to 10 times and also improved the quality of the synthesized images.
Abstract: Synthetic aperture radar (SAR) image simulators based on computer-aided drawing models play an important role in SAR applications, such as automatic target recognition and image interpretation. However, the accuracy of such simulators is limited by geometric errors and by simplifications in the electromagnetic calculation. In this letter, an end-to-end model was developed that can directly synthesize the desired images from a known image database. The model is based on generative adversarial nets (GANs), and its feasibility was validated by comparisons with real images and ray-tracing results. As a further step, samples were synthesized at angles outside of the data set. However, the training process of GAN models is difficult, especially for SAR images, which are usually affected by noise interference. The major failure modes were analyzed in experiments, and a clutter normalization method was proposed to ameliorate them. The results showed that the method improved the speed of convergence by up to 10 times and also improved the quality of the synthesized images.

Journal ArticleDOI
TL;DR: This letter considers the spectral and spatial properties of HSI in the anchor graph construction and proposes a novel approach, called fast spectral clustering with anchor graph (FSCAG), to efficiently deal with the large-scale HSI clustering problem.
Abstract: The large-scale hyperspectral image (HSI) clustering problem has attracted significant attention in the field of remote sensing. Most traditional graph-based clustering methods still struggle with the large-scale HSI clustering problem, mainly because of their high computational complexity. In this letter, we propose a novel approach, called fast spectral clustering with anchor graph (FSCAG), to efficiently deal with the large-scale HSI clustering problem. Specifically, we consider the spectral and spatial properties of HSI in the anchor graph construction. The proposed FSCAG algorithm first constructs the anchor graph and then performs spectral analysis on the graph. With this, the computational complexity can be reduced to $O(ndm)$, a significant improvement over conventional graph-based clustering methods, which need at least $O(n^{2}d)$, where $n$, $d$, and $m$ are the number of samples, features, and anchors, respectively. Several experiments are conducted to demonstrate the efficiency and effectiveness of the proposed FSCAG algorithm.
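A hedged sketch of the anchor-graph pipeline follows; random anchor selection and a plain Gaussian kernel stand in for the paper's spectral-spatial construction. The key point is that the spectral analysis runs on the thin $n \times m$ matrix $Z$, never forming the $n \times n$ graph:

```python
import numpy as np

def anchor_graph_cluster(X, n_anchors, n_clusters, k=3, sigma=1.0, seed=0):
    """Anchor-graph spectral clustering: build a sparse sample-to-anchor
    affinity Z in O(n*d*m), embed the samples via the SVD of the normalized
    Z, then run a small k-means on the embedding."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    anchors = X[rng.choice(n, n_anchors, replace=False)]  # random anchors
    # Z: row-normalized Gaussian affinity to each sample's k nearest anchors
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    Z = np.zeros((n, n_anchors))
    Z[rows, idx] = np.exp(-d2[rows, idx] / (2 * sigma ** 2))
    Z /= Z.sum(axis=1, keepdims=True)
    # Implicit graph W = Z diag(lam)^-1 Z^T; its eigenvectors are the left
    # singular vectors of Z diag(lam)^-1/2, computable from the small side
    lam = Z.sum(axis=0)
    U, _, _ = np.linalg.svd(Z / np.sqrt(lam + 1e-12), full_matrices=False)
    E = U[:, :n_clusters]
    # Lloyd's k-means on the embedding, with farthest-point initialization
    centers = E[:1].copy()
    for _ in range(1, n_clusters):
        d = ((E[:, None, :] - centers[None]) ** 2).sum(-1).min(1)
        centers = np.vstack([centers, E[d.argmax()]])
    for _ in range(50):
        labels = ((E[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        centers = np.vstack([E[labels == c].mean(0) if (labels == c).any()
                             else centers[c] for c in range(n_clusters)])
    return labels
```

Because $W = Z \Lambda^{-1} Z^{T}$ is row-stochastic, the leading singular vectors of the normalized $Z$ act as soft cluster indicators, which is what makes the $O(ndm)$ shortcut equivalent to spectral analysis on the full graph.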