Journal ArticleDOI

Deep Learning-Based Classification of Hyperspectral Data

TL;DR: The concept of deep learning is introduced into hyperspectral data classification for the first time, and a new way of classifying with spatial-dominated information is proposed; the framework is a hybrid of principal component analysis (PCA), a deep learning architecture, and logistic regression.
Abstract: Classification is one of the most popular topics in hyperspectral remote sensing. In the last two decades, a huge number of methods have been proposed to deal with the hyperspectral data classification problem. However, most of them do not hierarchically extract deep features. In this paper, the concept of deep learning is introduced into hyperspectral data classification for the first time. First, we verify the eligibility of stacked autoencoders by following classical spectral information-based classification. Second, a new way of classifying with spatial-dominated information is proposed. We then propose a novel deep learning framework to merge the two kinds of features, which yields the highest classification accuracy. The framework is a hybrid of principal component analysis (PCA), deep learning architecture, and logistic regression. Specifically, as the deep learning architecture, stacked autoencoders are used to extract useful high-level features. Experimental results with widely used hyperspectral data indicate that classifiers built within this deep learning-based framework provide competitive performance. In addition, the proposed joint spectral-spatial deep neural network opens a new window for future research, showcasing the huge potential of deep learning-based methods for accurate hyperspectral data classification.
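
A concrete rendering may help here. The sketch below is a minimal PyTorch version of the joint spectral-spatial scheme the abstract describes: each pixel's spectral vector is concatenated with a flattened, PCA-reduced spatial neighborhood, passed through a greedily pretrained stacked autoencoder, and classified by a logistic-regression output layer. All layer sizes, the window size, and the training details are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch of the SAE + logistic-regression pipeline; sizes are illustrative.
import torch
import torch.nn as nn

class StackedAutoencoder(nn.Module):
    def __init__(self, dims, n_classes):
        super().__init__()
        layers = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            layers += [nn.Linear(d_in, d_out), nn.Sigmoid()]
        self.encoder = nn.Sequential(*layers)
        # logistic-regression output layer used during supervised fine-tuning
        self.classifier = nn.Linear(dims[-1], n_classes)

    def forward(self, x):
        return self.classifier(self.encoder(x))

def pretrain_layer(layer, data, epochs=10, lr=1e-3):
    """Greedy unsupervised pretraining of one encoder layer as an autoencoder."""
    decoder = nn.Linear(layer.out_features, layer.in_features)
    opt = torch.optim.Adam(list(layer.parameters()) + list(decoder.parameters()), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        h = torch.sigmoid(layer(data))
        loss = nn.functional.mse_loss(torch.sigmoid(decoder(h)), data)
        loss.backward()
        opt.step()
    return torch.sigmoid(layer(data)).detach()  # becomes input to the next layer

# Joint input: the spectral vector of each pixel concatenated with the
# flattened, PCA-reduced bands of its spatial neighborhood.
n_bands, n_pca, window, n_classes = 200, 4, 7, 16      # illustrative values
x_spectral = torch.rand(1024, n_bands)                 # stand-in for real data
x_spatial = torch.rand(1024, n_pca * window * window)
x = torch.cat([x_spectral, x_spatial], dim=1)

model = StackedAutoencoder([x.shape[1], 400, 200, 100], n_classes)
h = x
for m in model.encoder:
    if isinstance(m, nn.Linear):
        h = pretrain_layer(m, h)
# ...followed by supervised fine-tuning of the whole network with cross-entropy.
```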
Citations
Journal ArticleDOI
TL;DR: A survey of 40 research efforts that apply deep learning techniques to various agricultural and food production challenges indicates that deep learning provides high accuracy, outperforming existing commonly used image processing techniques.

2,100 citations


Cites background, methods, or results from "Deep Learning-Based Classification ..."

  • ...CNN also had superior performance compared to Penalized Discriminant Analysis (Grinblat, Uzal, Larese, & Granitto, 2016), SVM Regression (Kuwata & Shibasaki, 2015), area-based techniques (Rahnemoonfar & Sheppard, 2017), texture-based regression models (Chen, et al., 2017), LMC classifiers (Xinshao & Cheng, 2015), Gaussian Mixture Models (Santoni, Sensuse, Arymurthy, & Fanany, 2015) and Naïve Bayes classifiers (Yalcin, 2017)....

    [...]

  • ...%) and/or F1 scores (i.e. 0.558 - 0.746), however state-of-the-art work in these particular problems has shown lower CA (i.e. SVM, RF, Naïve Bayes classifier)....

    [...]

  • ...The most popular techniques used for analyzing images include machine learning (ML) (K-means, support vector machines (SVM), artificial neural networks (ANN) amongst others), linear polarizations, wavelet-based filtering, vegetation indices (NDVI) and regression analysis (Saxena & Armstrong, 2014), (Singh, Ganapathysubramanian, Singh, & Sarkar, 2016)....

    [...]

  • ...Some of the CNN approaches combined their model with a classifier at the output layer, such as logistic regression (Chen, Lin, Zhao, Wang, & Gu, 2014), Support Vector Machines (SVM) (Douarre, Schielein, Frindel, Gerth, & Rousseau, 2016), linear regression (Chen, et al., 2017), Large Margin Classifiers (LMC) (Xinshao & Cheng, 2015) and macroscopic cellular automata (Song, et al., 2016)....

    [...]

Journal ArticleDOI
TL;DR: The challenges of using deep learning for remote-sensing data analysis are analyzed, recent advances are reviewed, and resources are provided that, the authors hope, will make deep learning in remote sensing seem ridiculously simple.
Abstract: Central to the looming paradigm shift toward data-intensive science, machine-learning techniques are becoming increasingly important. In particular, deep learning has proven to be both a major breakthrough and an extremely powerful tool in many fields. Shall we embrace deep learning as the key to everything? Or should we resist a black-box solution? These are controversial issues within the remote-sensing community. In this article, we analyze the challenges of using deep learning for remote-sensing data analysis, review recent advances, and provide resources we hope will make deep learning in remote sensing seem ridiculously simple. More importantly, we encourage remote-sensing scientists to bring their expertise into deep learning and use it as an implicit general model to tackle unprecedented, large-scale, influential challenges, such as climate change and urbanization.

2,095 citations


Cites background from "Deep Learning-Based Classification ..."

  • ...SAE for hyperspectral data classification: A first attempt in this direction can be found in [22], where the authors make use of an SAE to extract hierarchical features in the spectral domain....

    [...]

Journal ArticleDOI
TL;DR: This paper proposes a 3-D CNN-based FE model with combined regularization to extract effective spectral-spatial features of hyperspectral imagery, and reveals that the proposed models with sparse constraints provide results competitive with state-of-the-art methods.
Abstract: Due to the advantages of deep learning, in this paper, a regularized deep feature extraction (FE) method is presented for hyperspectral image (HSI) classification using a convolutional neural network (CNN). The proposed approach employs several convolutional and pooling layers to extract deep features from HSIs, which are nonlinear, discriminant, and invariant. These features are useful for image classification and target detection. Furthermore, in order to address the common issue of imbalance between high dimensionality and limited availability of training samples for the classification of HSI, a few strategies such as L2 regularization and dropout are investigated to avoid overfitting in class data modeling. More importantly, we propose a 3-D CNN-based FE model with combined regularization to extract effective spectral-spatial features of hyperspectral imagery. Finally, in order to further improve the performance, a virtual sample enhanced method is proposed. The proposed approaches are carried out on three widely used hyperspectral data sets: Indian Pines, University of Pavia, and Kennedy Space Center. The obtained results reveal that the proposed models with sparse constraints provide results competitive with state-of-the-art methods. In addition, the proposed deep FE opens a new window for further research.
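
As a rough illustration of the idea, here is a minimal PyTorch sketch of a 3-D CNN feature extractor over spectral-spatial patches, with dropout and L2 regularization (expressed as optimizer weight decay) of the kind the abstract mentions. Kernel sizes, channel counts, and the patch geometry are assumptions made for the example, not the paper's configuration.

```python
# Hedged sketch of a 3-D CNN feature extractor for HSI patches.
import torch
import torch.nn as nn

class Cnn3dFE(nn.Module):
    def __init__(self, n_classes, dropout=0.5):
        super().__init__()
        self.features = nn.Sequential(
            # input: (batch, 1, bands, height, width) spectral-spatial patch
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3)), nn.ReLU(),
            nn.MaxPool3d(kernel_size=(2, 1, 1)),
            nn.Conv3d(8, 16, kernel_size=(7, 3, 3)), nn.ReLU(),
            nn.MaxPool3d(kernel_size=(2, 1, 1)),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(dropout),       # dropout against overfitting
            nn.LazyLinear(n_classes),  # infers the flattened feature size
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = Cnn3dFE(n_classes=9)
patch = torch.rand(4, 1, 103, 7, 7)  # e.g. University of Pavia: 103 bands, 7x7 window
logits = model(patch)
# L2 regularization is commonly realized as weight decay in the optimizer:
opt = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
```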

2,059 citations


Cites background from "Deep Learning-Based Classification ..."

  • ..., stacked autoencoder (SAE), was proposed for HSI classification in 2014 [29]....

    [...]

Journal ArticleDOI
TL;DR: A general framework of DL for RS data is provided, and the state-of-the-art DL methods in RS are regarded as special cases of input-output data combined with various deep networks and tuning tricks.
Abstract: Deep-learning (DL) algorithms, which learn the representative and discriminative features in a hierarchical manner from the data, have recently become a hotspot in the machine-learning area and have been introduced into the geoscience and remote sensing (RS) community for RS big data analysis. Considering the low-level features (e.g., spectral and texture) as the bottom level, the output feature representation from the top level of the network can be directly fed into a subsequent classifier for pixel-based classification. As a matter of fact, by carefully addressing the practical demands in RS applications and designing the input–output levels of the whole network, we have found that DL is actually everywhere in RS data analysis: from the traditional topics of image preprocessing, pixel-based classification, and target recognition, to the recent challenging tasks of high-level semantic feature extraction and RS scene understanding.

1,625 citations


Cites background or methods from "Deep Learning-Based Classification ..."

  • ...Straightforwardly, differing from the spectral–spatial classification scheme, the spectral and initial spatial features are combined together into a vector as the input of the DL network in a joint framework, as presented in the works [29], [53]–[55], [73], and [74]....

    [...]

  • ...[74] adopted the stacked AE as the deep network structure....

    [...]

  • ...In general, there are two main styles of classifiers: 1) the hard classifiers, such as SVMs, which directly output an integer number as the class label of each sample [76], and 2) the soft classifiers, such as logistic regression, which can simultaneously fine-tune the whole pretrained network and predict the class label in a probability distribution manner [29], [73], [74], [78]....

    [...]

Journal ArticleDOI
TL;DR: Experimental results based on several hyperspectral image data sets demonstrate that the proposed method can achieve better classification performance than some traditional methods, such as support vector machines and the conventional deep learning-based methods.
Abstract: Recently, convolutional neural networks have demonstrated excellent performance on various visual tasks, including the classification of common two-dimensional images. In this paper, deep convolutional neural networks are employed to classify hyperspectral images directly in the spectral domain. More specifically, the architecture of the proposed classifier contains five layers with weights: the input layer, the convolutional layer, the max pooling layer, the fully connected layer, and the output layer. These five layers are applied to each spectral signature to discriminate it from the others. Experimental results based on several hyperspectral image data sets demonstrate that the proposed method can achieve better classification performance than some traditional methods, such as support vector machines and conventional deep learning-based methods.
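
The five weighted layers described above map naturally onto a 1-D convolutional network applied to each spectral signature. The following PyTorch sketch is one plausible rendering; the filter counts and kernel sizes are illustrative assumptions, not the paper's exact settings.

```python
# Hedged sketch of a five-layer spectral CNN (input, conv, max-pool, FC, output).
import torch
import torch.nn as nn

class SpectralCNN(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.net = nn.Sequential(
            # input layer: each sample is one spectral signature of shape (1, n_bands)
            nn.Conv1d(1, 20, kernel_size=11), nn.Tanh(),  # convolutional layer
            nn.MaxPool1d(kernel_size=3),                  # max pooling layer
            nn.Flatten(),
            nn.LazyLinear(100), nn.Tanh(),                # fully connected layer
            nn.Linear(100, n_classes),                    # output layer
        )

    def forward(self, x):
        return self.net(x)

model = SpectralCNN(n_classes=16)           # e.g. Indian Pines has 16 classes
logits = model(torch.rand(8, 1, 200))       # batch of 200-band signatures
```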

1,316 citations

References
Proceedings Article
03 Dec 2012
TL;DR: A large, deep convolutional neural network, consisting of five convolutional layers (some followed by max-pooling layers) and three fully connected layers with a final 1000-way softmax, achieved state-of-the-art performance on the ImageNet classification task.
Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
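
The network described here is the well-known AlexNet. Assuming torchvision is available, a faithful implementation can be loaded directly rather than rebuilt from scratch:

```python
# AlexNet as shipped with torchvision; no pretrained weights are loaded here.
import torch
from torchvision.models import alexnet

model = alexnet(num_classes=1000)  # five conv layers, three FC layers, 1000-way output
x = torch.rand(1, 3, 224, 224)     # one ImageNet-sized RGB image
logits = model(x)
print(model.classifier)            # dropout sits in the fully connected block
```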

73,978 citations

Journal ArticleDOI
28 Jul 2006 - Science
TL;DR: This article describes an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool for reducing the dimensionality of data.
Abstract: High-dimensional data can be converted to low-dimensional codes by training a multilayer neural network with a small central layer to reconstruct high-dimensional input vectors. Gradient descent can be used for fine-tuning the weights in such "autoencoder" networks, but this works well only if the initial weights are close to a good solution. We describe an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.
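
A minimal PyTorch sketch of such a deep autoencoder appears below, with a small linear code layer standing in for the principal components; the layer sizes loosely follow the paper's MNIST example, while the RBM-based weight initialization the paper actually contributes is omitted for brevity.

```python
# Hedged sketch: deep autoencoder with a small central code layer.
import torch
import torch.nn as nn

dims = [784, 1000, 500, 250, 30]  # 30-dimensional code replaces PCA components

def mlp(sizes, final_act=True):
    layers = []
    for i, (d_in, d_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        layers.append(nn.Linear(d_in, d_out))
        if final_act or i < len(sizes) - 2:
            layers.append(nn.Sigmoid())
    return nn.Sequential(*layers)

encoder = mlp(dims, final_act=False)  # linear code layer
decoder = mlp(list(reversed(dims)))   # mirror image back to the input space

x = torch.rand(64, 784)               # stand-in for flattened images
code = encoder(x)                     # low-dimensional representation
recon = decoder(code)
loss = nn.functional.mse_loss(recon, x)  # the fine-tuning objective
loss.backward()                          # gradient descent fine-tunes all weights
```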

16,717 citations

Journal ArticleDOI
TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Abstract: We show how to use "complementary priors" to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
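
To make the greedy, layer-wise procedure concrete, here is a compact numpy sketch of a restricted Boltzmann machine trained with one-step contrastive divergence (CD-1) and stacked one layer at a time. The sizes, learning rate, and epoch counts are illustrative, and the slower fine-tuning stage is only indicated.

```python
# Hedged sketch of greedy layer-wise RBM pretraining with CD-1 updates.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_vis, n_hid, lr=0.1):
        self.W = 0.01 * rng.standard_normal((n_vis, n_hid))
        self.a = np.zeros(n_vis)  # visible biases
        self.b = np.zeros(n_hid)  # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b)

    def cd1(self, v0):
        # positive phase: hidden activations driven by the data
        ph0 = self.hidden_probs(v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        # negative phase: one Gibbs step gives "reconstruction" statistics
        pv1 = sigmoid(h0 @ self.W.T + self.a)
        ph1 = self.hidden_probs(pv1)
        n = len(v0)
        self.W += self.lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.a += self.lr * (v0 - pv1).mean(axis=0)
        self.b += self.lr * (ph0 - ph1).mean(axis=0)

# Greedy stacking: train one layer, then feed its hidden probabilities upward.
data = (rng.random((256, 784)) < 0.5).astype(float)  # stand-in binary data
sizes = [784, 500, 250]
layers, v = [], data
for n_vis, n_hid in zip(sizes[:-1], sizes[1:]):
    rbm = RBM(n_vis, n_hid)
    for _ in range(10):          # a few CD-1 sweeps per layer
        rbm.cd1(v)
    layers.append(rbm)
    v = rbm.hidden_probs(v)      # input for the next RBM in the stack
# A slower fine-tuning pass (e.g., a contrastive wake-sleep procedure) follows.
```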

15,055 citations


"Deep Learning-Based Classification ..." refers background in this paper

  • ...Typical deep neural network architectures include deep belief networks (DBNs) [38], deep Boltzmann machines (DBMs) [39], SAEs [40], and stacked denoising AEs (SDAEs) [41]....

    [...]

Journal ArticleDOI
TL;DR: Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks.
Abstract: The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks. This motivates longer term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation, and manifold learning.

11,201 citations


"Deep Learning-Based Classification ..." refers background in this paper

  • ...Classifiers like linear SVM and logistic regression can be regarded as single-layer classifiers, whereas decision trees or SVMs with kernels are believed to have two layers [24]....

    [...]

  • ...It is believed that deep architectures can potentially lead to progressively more abstract features at higher layers of the feature hierarchy, and more abstract features are generally invariant to most local changes of the input [24]....

    [...]

Journal ArticleDOI
TL;DR: This paper demonstrates how constraints from the task domain can be integrated into a backpropagation network through the architecture of the network, an approach successfully applied to the recognition of handwritten zip code digits provided by the U.S. Postal Service.
Abstract: The ability of learning networks to generalize can be greatly enhanced by providing constraints from the task domain. This paper demonstrates how such constraints can be integrated into a backpropagation network through the architecture of the network. This approach has been successfully applied to the recognition of handwritten zip code digits provided by the U.S. Postal Service. A single network learns the entire recognition operation, going from the normalized image of the character to the final classification.

9,775 citations


"Deep Learning-Based Classification ..." refers methods in this paper

  • ...The layer-wise training models have a bunch of alternatives such as restricted Boltzmann machines (RBMs) [42], pooling units [43], convolutional neural networks (CNNs) [44], AEs, and denoising AEs (DAE) [40]....

    [...]