Journal ArticleDOI

On combining multiscale deep learning features for the classification of hyperspectral remote sensing imagery

Wenzhi Zhao, Zhou Guo, Jun Yue, Xiuyuan Zhang, Liqun Luo
01 Jul 2015 - International Journal of Remote Sensing (Taylor & Francis) - Vol. 36, Iss. 13, pp. 3368-3379
TL;DR: A technique is proposed that classifies hyperspectral imagery by incorporating deep learning features; the deep learning-based method is found to provide more accurate classification results than traditional ones.
Abstract: In recent years, satellite imagery has greatly improved in both spatial and spectral resolution. One of the major unsolved problems in the analysis of such imagery is the manual selection and combination of appropriate features according to spectral and spatial properties. A deep learning framework can learn global and robust features from the training data set automatically, and it has achieved state-of-the-art classification accuracies on a range of image classification tasks. In this study, a technique is proposed which classifies hyperspectral imagery by incorporating deep learning features. Firstly, deep learning features are extracted by a multiscale convolutional auto-encoder. Then, based on the learned deep learning features, a logistic regression classifier is trained for classification. Finally, the parameters of the deep learning framework are analysed and potential developments are discussed. Experiments are conducted on the well-known Pavia data set, which was acquired by the Reflective Optics System Imaging Spectrometer (ROSIS) sensor. It is found that the deep learning-based method provides a more accurate classification result than traditional ones.
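The pipeline the abstract describes, multiscale convolutional auto-encoder (CAE) features fed to a logistic regression classifier, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' exact configuration: the layer widths, the 64-dimensional code, the common 16 x 16 working patch size, and the use of one shared CAE across scales are all choices made for the sketch.

```python
# Minimal sketch (PyTorch + scikit-learn) of the pipeline: multiscale
# convolutional auto-encoder features + logistic regression.
# All sizes below are illustrative assumptions, not the authors' values.
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.linear_model import LogisticRegression

class ConvAutoEncoder(nn.Module):
    """A small CAE; patches from every scale are resized to 16x16 first."""
    def __init__(self, in_bands, code_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_bands, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                  # 16x16 -> 8x8
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                  # 8x8 -> 4x4
        )
        self.to_code = nn.Linear(16 * 4 * 4, code_dim)
        self.from_code = nn.Linear(code_dim, 16 * 4 * 4)
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2), nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(32, in_bands, 3, padding=1),
        )

    def encode(self, x):
        return self.to_code(self.encoder(x).flatten(1))

    def forward(self, x):                                     # reconstruction
        h = torch.relu(self.from_code(self.encode(x)))
        return self.decoder(h.view(-1, 16, 4, 4))

def multiscale_features(cae, patches_by_scale):
    """Encode patches cropped at several window sizes (e.g. 8/16/24 px),
    each resized to the common input size, and concatenate the codes."""
    codes = []
    with torch.no_grad():
        for p in patches_by_scale:                            # (N, bands, s, s)
            p = F.interpolate(p, size=(16, 16), mode="bilinear",
                              align_corners=False)
            codes.append(cae.encode(p))
    return torch.cat(codes, dim=1).numpy()                    # (N, scales*code_dim)

def train_classifier(features, labels):
    """Logistic regression on the concatenated multiscale codes."""
    return LogisticRegression(max_iter=1000).fit(features, labels)
```

The CAE itself would be trained beforehand by minimizing a reconstruction loss (e.g. mean squared error between forward(x) and x) on unlabeled patches; only the final classifier needs labeled pixels.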
Citations
Journal ArticleDOI
TL;DR: A general framework of DL for RS data is provided, and the state-of-the-art DL methods in RS are regarded as special cases of input-output data combined with various deep networks and tuning tricks.
Abstract: Deep-learning (DL) algorithms, which learn the representative and discriminative features in a hierarchical manner from the data, have recently become a hotspot in the machine-learning area and have been introduced into the geoscience and remote sensing (RS) community for RS big data analysis. Considering the low-level features (e.g., spectral and texture) as the bottom level, the output feature representation from the top level of the network can be directly fed into a subsequent classifier for pixel-based classification. As a matter of fact, by carefully addressing the practical demands in RS applications and designing the input–output levels of the whole network, we have found that DL is actually everywhere in RS data analysis: from the traditional topics of image preprocessing, pixel-based classification, and target recognition, to the recent challenging tasks of high-level semantic feature extraction and RS scene understanding.

1,625 citations


Cites background from "On combining multiscale deep learni..."

  • ...Typical unsupervised feature-learning methods are RBMs, sparse coding, AEs, k-means clustering, and the Gaussian Mixture Model [104]....


  • ...An AE can be directly employed as a feature extractor for RS data analysis [51], and it has been more frequently stacked into SAEs for DL from RS data [52]–[54]. Restricted Boltzmann Machines: An RBM is commonly used as a layer-wise training model in the construction of a DBN....


  • ...In the related literature, both the supervised DL structures (e.g., the CNN [45]) and the unsupervised DL structures (e.g., the AEs [73]–[75], DBNs [29], [76], and other self-defined neurons in each layer [77]) are employed....


  • ...The preferred deep networks in these papers are SAEs and DBNs, respectively....


  • ...Unlike AEs, sparse coding algorithms [42] generate sparse representations from the data themselves from a different perspective, by learning an overcomplete dictionary via self-decomposition....


Journal ArticleDOI
TL;DR: A spectral-spatial feature based classification (SSFC) framework is proposed that jointly uses dimension reduction and deep learning techniques for spectral and spatial feature extraction, respectively.
Abstract: In this paper, we propose a spectral–spatial feature based classification (SSFC) framework that jointly uses dimension reduction and deep learning techniques for spectral and spatial feature extraction, respectively. In this framework, a balanced local discriminant embedding algorithm is proposed for spectral feature extraction from high-dimensional hyperspectral data sets. In the meantime, a convolutional neural network is utilized to automatically find spatial-related features at high levels. Then, the fused feature is extracted by stacking spectral and spatial features together. Finally, a multiple-feature-based classifier is trained for image classification. Experimental results on well-known hyperspectral data sets show that the proposed SSFC method outperforms other commonly used methods for hyperspectral image classification.
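The fusion step of SSFC can be illustrated with a short sketch. PCA stands in here for the paper's balanced local discriminant embedding (which has no off-the-shelf implementation), and any deep spatial feature extractor, such as the CAE above, can supply the second input; both substitutions are assumptions made for illustration.

```python
# Hedged sketch of SSFC-style feature fusion: spectral features from a
# dimension-reduction step are stacked with deep spatial features, and a
# single classifier is trained on the result. PCA is a stand-in for the
# paper's balanced local discriminant embedding.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

def ssfc_fuse_and_train(spectra, spatial_feats, labels, n_spectral=20):
    """spectra: (N, bands) raw pixel spectra; spatial_feats: (N, D) deep
    spatial features for the same N pixels."""
    spectral_feats = PCA(n_components=n_spectral).fit_transform(spectra)
    fused = np.hstack([spectral_feats, spatial_feats])   # simple stacking
    return SVC(kernel="rbf").fit(fused, labels)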

872 citations


Cites result from "On combining multiscale deep learni..."

  • ...However, the configuration of CNN can greatly affect the classification accuracies in terms of spatial feature extraction as we reported in previous works [29]–[31]....


Journal ArticleDOI
TL;DR: An end-to-end framework for the dense, pixelwise classification of satellite imagery with convolutional neural networks (CNNs) is proposed, together with a multiscale neuron module that alleviates the common tradeoff between recognition and precise localization.
Abstract: We propose an end-to-end framework for the dense, pixelwise classification of satellite imagery with convolutional neural networks (CNNs). In our framework, CNNs are directly trained to produce classification maps out of the input images. We first devise a fully convolutional architecture and demonstrate its relevance to the dense classification problem. We then address the issue of imperfect training data through a two-step training approach: CNNs are first initialized by using a large amount of possibly inaccurate reference data, and then refined on a small amount of accurately labeled data. To complete our framework, we design a multiscale neuron module that alleviates the common tradeoff between recognition and precise localization. A series of experiments show that our networks consider a large amount of context to provide fine-grained classification maps.
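A compact sketch of such a fully convolutional, dense-prediction network is given below. The "multiscale neuron module" is rendered here as parallel dilated convolutions, one plausible way to enlarge context without sacrificing localization; the authors' exact module may differ, and all layer sizes are assumptions.

```python
# Sketch of a fully convolutional network for dense, pixelwise
# classification, with a multiscale module realized (as an assumption)
# by parallel dilated convolutions summed together.
import torch
import torch.nn as nn

class MultiScaleModule(nn.Module):
    """Aggregates context at several dilation rates at full resolution."""
    def __init__(self, channels, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
            for r in rates
        )

    def forward(self, x):
        return torch.relu(sum(branch(x) for branch in self.branches))

class DenseFCN(nn.Module):
    """Input (N, in_ch, H, W) -> per-pixel class logits (N, n_classes, H, W)."""
    def __init__(self, in_ch, n_classes):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        self.multiscale = MultiScaleModule(64)
        self.head = nn.Conv2d(64, n_classes, 1)   # 1x1 conv classifier

    def forward(self, x):
        return self.head(self.multiscale(self.backbone(x)))
```

The two-step training the abstract describes would then train such a network first on the large, possibly inaccurate reference data and afterwards fine-tune it on the small, accurately labeled set.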

859 citations


Cites background from "On combining multiscale deep learni..."

  • ...at multiple spatial scales has also been exploited, notably for hyperspectral classification [22], [23] and image segmentation....


Journal ArticleDOI
TL;DR: The potential of DL in environmental remote sensing, including land cover mapping, environmental parameter retrieval, data fusion and downscaling, and information reconstruction and prediction, is analyzed, and a typical network structure is introduced.

631 citations


Cites methods from "On combining multiscale deep learni..."

  • ...Accordingly, DL has been successfully applied to land cover classification and achieved impressive results (Zhang et al., 2018a; Zhao and Du, 2016; Zhao et al., 2015)....


Journal ArticleDOI
TL;DR: A comprehensive review of the current state of the art in DL for HSI classification is provided, analyzing the strengths and weaknesses of the most widely used classifiers in the literature and giving an exhaustive comparison of the discussed techniques.
Abstract: Advances in computing technology have fostered the development of new and powerful deep learning (DL) techniques, which have demonstrated promising results in a wide range of applications. In particular, DL methods have been successfully used to classify remotely sensed data collected by Earth Observation (EO) instruments. Hyperspectral imaging (HSI) is a hot topic in remote sensing data analysis due to the vast amount of information contained in these images, which allows for a better characterization and exploitation of the Earth's surface by combining rich spectral and spatial information. However, HSI poses major challenges for supervised classification methods due to the high dimensionality of the data and the limited availability of training samples. These issues, together with the high intraclass variability (and interclass similarity) often present in HSI data, may hamper the effectiveness of classifiers. In order to overcome these limitations, several DL-based architectures have been developed recently, exhibiting great potential in HSI data interpretation. This paper provides a comprehensive review of the current state of the art in DL for HSI classification, analyzing the strengths and weaknesses of the most widely used classifiers in the literature. For each discussed method, we provide quantitative results using several well-known and widely used HSI scenes, thus providing an exhaustive comparison of the discussed techniques. The paper concludes with some remarks and hints about future challenges in the application of DL techniques to HSI classification. The source codes of the methods discussed in this paper are available from: https://github.com/mhaut/hyperspectral_deeplearning_review .

534 citations

References
Proceedings Article
03 Dec 2012
TL;DR: A large, deep convolutional neural network, consisting of five convolutional layers (some followed by max-pooling layers) and three fully connected layers with a final 1000-way softmax, achieved state-of-the-art performance on ImageNet classification, as discussed by the authors.
Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
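The architecture the abstract describes maps directly onto a short PyTorch definition: five convolutional layers, three of them followed by max-pooling, then three fully connected layers with dropout and a final 1000-way output. The channel widths below follow the commonly reproduced single-tower variant rather than the original two-GPU split.

```python
# The AlexNet-style architecture described in the abstract (single-tower
# channel widths, as in common reimplementations; expects 224x224 input).
import torch.nn as nn

alexnet = nn.Sequential(
    nn.Conv2d(3, 64, 11, stride=4, padding=2), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(64, 192, 5, padding=2), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(192, 384, 3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(),
    nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Flatten(),
    nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),  # dropout reduces
    nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(),         # overfitting here
    nn.Linear(4096, 1000),  # 1000-way logits; softmax is applied in the loss
)
```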

73,978 citations

Journal ArticleDOI
TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Abstract: We show how to use "complementary priors" to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
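The greedy algorithm can be sketched in a few lines of numpy: each layer is a restricted Boltzmann machine trained with one step of contrastive divergence (CD-1, the usual practical approximation), and the hidden activations of one trained layer become the data for the next. Learning rate, epochs, and layer sizes below are illustrative assumptions.

```python
# Greedy layer-wise pre-training sketch: stack RBMs, each trained with
# one-step contrastive divergence (CD-1). Hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=10, lr=0.1):
    """data: (N, n_visible) in [0, 1]. Returns weights and hidden biases."""
    n, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        p_h = sigmoid(data @ W + b_h)                  # positive phase
        h = (rng.random(p_h.shape) < p_h).astype(float)
        p_v = sigmoid(h @ W.T + b_v)                   # one Gibbs step down
        p_h2 = sigmoid(p_v @ W + b_h)                  # ... and back up
        W += lr * (data.T @ p_h - p_v.T @ p_h2) / n    # CD-1 weight update
        b_v += lr * (data - p_v).mean(axis=0)
        b_h += lr * (p_h - p_h2).mean(axis=0)
    return W, b_h

def greedy_stack(data, layer_sizes=(256, 64)):
    """Train RBMs one layer at a time, as in a DBN's pre-training."""
    layers, x = [], data
    for n_hidden in layer_sizes:
        W, b_h = train_rbm(x, n_hidden)
        layers.append((W, b_h))
        x = sigmoid(x @ W + b_h)   # activations become the next layer's data
    return layers
```

As the abstract notes, this greedy pass only initializes the network; a slower fine-tuning procedure (a contrastive version of wake-sleep in the paper) adjusts the weights afterwards.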

15,055 citations


"On combining multiscale deep learni..." refers methods in this paper

  • ...As regards computer vision, convolutional neural networks perform well at recognizing faces and digits (LeCun et al. 1989; Hinton, Osindero, and Teh 2006; Le 2013)....


Journal ArticleDOI
21 Oct 1999 - Nature
TL;DR: An algorithm for non-negative matrix factorization is demonstrated that is able to learn parts of faces and semantic features of text and is in contrast to other methods that learn holistic, not parts-based, representations.
Abstract: Is perception of the whole based on perception of its parts? There is psychological and physiological evidence for parts-based representations in the brain, and certain computational theories of object recognition rely on such representations. But little is known about how brains or computers might learn the parts of objects. Here we demonstrate an algorithm for non-negative matrix factorization that is able to learn parts of faces and semantic features of text. This is in contrast to other methods, such as principal components analysis and vector quantization, that learn holistic, not parts-based, representations. Non-negative matrix factorization is distinguished from the other methods by its use of non-negativity constraints. These constraints lead to a parts-based representation because they allow only additive, not subtractive, combinations. When non-negative matrix factorization is implemented as a neural network, parts-based representations emerge by virtue of two properties: the firing rates of neurons are never negative and synaptic strengths do not change sign.
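The non-negativity constraint the abstract emphasizes is visible directly in Lee and Seung's multiplicative update rules (shown here for the Euclidean objective): every factor is updated by a non-negative ratio, so entries can be rescaled but never driven negative, which is what forces additive, parts-based combinations.

```python
# Lee-Seung multiplicative updates for NMF (Euclidean objective):
# V (n x m, non-negative) is factored as V ~ W @ H with W, H >= 0.
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-10):
    rng = np.random.default_rng(0)
    n, m = V.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # multiplicative: stays >= 0
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

On a matrix of face images (one image per column), the paper shows that the columns of W come out as localized parts, in contrast to the holistic components produced by PCA or vector quantization.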

11,500 citations


"On combining multiscale deep learni..." refers methods in this paper

  • ...Although other spectral feature extraction algorithms can also be used, such as decision boundary feature extraction (Landgrebe 2005) and non-negative matrix factorization (Lee and Seung 1999), the comparison of different spectral extraction algorithms is beyond the scope of this paper....


Journal ArticleDOI
TL;DR: This paper demonstrates how constraints from the task domain can be integrated into a backpropagation network through the architecture of the network, successfully applied to the recognition of handwritten zip code digits provided by the U.S. Postal Service.
Abstract: The ability of learning networks to generalize can be greatly enhanced by providing constraints from the task domain. This paper demonstrates how such constraints can be integrated into a backpropagation network through the architecture of the network. This approach has been successfully applied to the recognition of handwritten zip code digits provided by the U.S. Postal Service. A single network learns the entire recognition operation, going from the normalized image of the character to the final classification.
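The core architectural constraint here, weight sharing via convolution, can be made concrete with a parameter count; the 16 x 16 image size and 5 x 5 kernel below are arbitrary choices for illustration.

```python
# Weight sharing as an architectural constraint: a convolutional layer
# reuses one small kernel everywhere, so it needs far fewer parameters
# than an unconstrained fully connected map over the same 16x16 image.
import torch.nn as nn

conv = nn.Conv2d(1, 1, kernel_size=5, padding=2)   # 5*5 weights + 1 bias = 26
fc = nn.Linear(16 * 16, 16 * 16)                   # 256*256 + 256 = 65,792

n_conv = sum(p.numel() for p in conv.parameters())
n_fc = sum(p.numel() for p in fc.parameters())
print(n_conv, n_fc)                                # 26 vs 65792
```

Far fewer free parameters is what lets the network generalize from limited zip-code training data, which is the enhancement the abstract reports.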

9,775 citations


"On combining multiscale deep learni..." refers methods in this paper

  • ...As regards computer vision, convolutional neural networks perform well at recognizing faces and digits (LeCun et al. 1989; Hinton, Osindero, and Teh 2006; Le 2013)....


