
Showing papers in "ISPRS Journal of Photogrammetry and Remote Sensing in 2017"


Journal ArticleDOI
TL;DR: It is found that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework; spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest shows the best performance in object-based classification.
Abstract: Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. 
In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial vehicle) or agricultural sites where it also correlates with the number of targeted classes. More than 95.6% of studies involve an area less than 300 ha, and the spatial resolution of images is predominantly between 0 and 2 m. Furthermore, we identify some methods that may advance supervised object-based image classification. For example, deep learning and type-2 fuzzy techniques may further improve classification accuracy. Lastly, scientists are strongly encouraged to report results of uncertainty studies to further explore the effects of varied factors on supervised object-based image classification.

608 citations


Journal ArticleDOI
Zhe Zhu1
TL;DR: It is observed that the more recent the study, the higher the frequency of Landsat time series used and some of the widely-used change detection algorithms were discussed, including thresholding, differencing, segmentation, trajectory classification, and regression.
Abstract: The free and open access to all archived Landsat images in 2008 has completely changed the way Landsat data are used. Many novel change detection algorithms based on Landsat time series have been developed. We present a comprehensive review of four important aspects of change detection studies based on Landsat time series: frequencies, preprocessing, algorithms, and applications. We observed the trend that the more recent the study, the higher the frequency of the Landsat time series used. We reviewed a series of image preprocessing steps, including atmospheric correction, cloud and cloud shadow detection, and composite/fusion/metrics techniques. We divided all change detection algorithms into six categories: thresholding, differencing, segmentation, trajectory classification, statistical boundary, and regression. Within each category, six major characteristics of the algorithms were analyzed, such as frequency, change index, univariate/multivariate, online/offline, abrupt/gradual change, and sub-pixel/pixel/spatial. Moreover, some of the widely used change detection algorithms were also discussed. Finally, we reviewed different change detection applications by dividing them into two categories: change target and change agent detection.
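Of the six algorithm categories, differencing combined with thresholding is the simplest to illustrate. The following is a generic toy sketch (not any particular algorithm from the review): pixels whose inter-date difference deviates from the scene mean by more than k standard deviations are flagged as change.

```python
import numpy as np

def detect_change(t1, t2, k=2.0):
    """Simple differencing + thresholding change detection.

    Flags pixels whose band difference deviates from the scene mean
    by more than k standard deviations (a common rule of thumb).
    """
    diff = t2.astype(float) - t1.astype(float)
    mu, sigma = diff.mean(), diff.std()
    return np.abs(diff - mu) > k * sigma

# Two toy 5x5 "NDVI" images: one pixel changes sharply.
t1 = np.full((5, 5), 0.6)
t2 = t1.copy()
t2[2, 2] = 0.1          # abrupt loss of greenness
mask = detect_change(t1, t2)
```

Real Landsat-based methods operate on long time series rather than two dates, but the same difference-then-threshold logic underlies the simplest category.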

521 citations


Journal ArticleDOI
TL;DR: A new CNN architecture for the classification of hyperspectral images is presented, a 3-D network that uses both spectral and spatial information and implements a border mirroring strategy to effectively process border areas in the image.
Abstract: Artificial neural networks (ANNs) have been widely used for the analysis of remotely sensed imagery. In particular, convolutional neural networks (CNNs) are gaining more and more attention in this field. CNNs have proved to be very effective in areas such as image recognition and classification, especially for the classification of large sets composed of two-dimensional images. However, their application to multispectral and hyperspectral images faces some challenges, especially related to the processing of the high-dimensional information contained in multidimensional data cubes. This results in a significant increase in computation time. In this paper, we present a new CNN architecture for the classification of hyperspectral images. The proposed CNN is a 3-D network that uses both spectral and spatial information. It also implements a border mirroring strategy to effectively process border areas in the image, and has been efficiently implemented using graphics processing units (GPUs). Our experimental results indicate that the proposed network performs accurately and efficiently, achieving a reduction of the computation time and increasing the accuracy in the classification of hyperspectral images when compared to other traditional ANN techniques.
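The border mirroring idea can be illustrated with NumPy's reflect padding: the two spatial axes of the data cube are mirrored so that border pixels get full-sized patches, while the spectral axis is left untouched. This is a generic sketch, not the paper's GPU implementation; the cube dimensions are arbitrary.

```python
import numpy as np

# Hypothetical hyperspectral cube: 6x6 pixels, 10 spectral bands
# (rows, cols, bands).
cube = np.random.rand(6, 6, 10)

# A 3-D CNN with a 5x5 spatial window needs 2 extra pixels on each
# border; mirror-padding the spatial axes (but not the spectral axis)
# lets border pixels be classified with a full-sized patch.
pad = 2
padded = np.pad(cube, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
```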

446 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigate various methods to deal with semantic labeling of very high-resolution multi-modal remote sensing data and propose an efficient multi-scale approach to leverage both a large spatial context and the high resolution data, and investigate early and late fusion of Lidar and multispectral data.
Abstract: In this work, we investigate various methods for semantic labeling of very high resolution multi-modal remote sensing data. In particular, we study how deep fully convolutional networks can be adapted to deal with multi-modal and multi-scale remote sensing data for semantic labeling. Our contributions are threefold: (a) we present an efficient multi-scale approach to leverage both a large spatial context and the high resolution data, (b) we investigate early and late fusion of Lidar and multispectral data, (c) we validate our methods on two public datasets with state-of-the-art results. Our results indicate that late fusion makes it possible to recover errors stemming from ambiguous data, while early fusion allows for better joint feature learning, but at the cost of higher sensitivity to missing data.
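The early-versus-late-fusion distinction can be sketched in a few lines. The function names and array shapes below are illustrative, not the paper's architecture: early fusion stacks modalities before a single network, late fusion combines the per-class score maps of two modality-specific networks.

```python
import numpy as np

def early_fusion(optical, lidar):
    """Stack modalities channel-wise before the network, so a single
    model learns joint features from the combined input."""
    return np.concatenate([optical, lidar], axis=-1)

def late_fusion(scores_a, scores_b, w=0.5):
    """Average per-class score maps produced by two modality-specific
    networks; errors from ambiguous data in one stream can be
    outvoted by the other."""
    return w * scores_a + (1 - w) * scores_b

optical = np.random.rand(64, 64, 4)   # e.g. multispectral channels
lidar = np.random.rand(64, 64, 1)     # e.g. a normalized DSM
fused_input = early_fusion(optical, lidar)
```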

397 citations


Journal ArticleDOI
TL;DR: In this paper, rice grain yield was predicted with single-stage vegetation indices (VIs) and multi-temporal VIs derived from multispectral (MS) and digital images.
Abstract: Timely and non-destructive assessment of crop yield is an essential part of agricultural remote sensing (RS). The development of unmanned aerial vehicles (UAVs) has provided a novel approach for RS, making it possible to acquire high spatio-temporal resolution imagery on a regional scale. In this study, rice grain yield was predicted with single-stage vegetation indices (VIs) and multi-temporal VIs derived from multispectral (MS) and digital images. The results showed that the booting stage was the optimal stage for grain yield prediction with single-stage VIs for both the digital and MS images. The corresponding optimal color index was VARI, with an R² of 0.71 (logarithmic relationship), while the optimal vegetation index based on MS images, NDVI[800,720], showed a linear relationship with grain yield and a higher R² (0.75) than the color index. The multi-temporal VIs showed a higher correlation with grain yield than the single-stage VIs, and VIs from two growth stages combined in a multiple linear regression function [MLR(VI)] performed best. The highest correlation coefficients were 0.76 with MLR(NDVI[800,720]) at the booting and heading stages (for the MS image) and 0.73 with MLR(VARI) at the jointing and booting stages (for the digital image). In addition, the VIs that correlated strongly with LAI performed well for yield prediction, and the VIs composed of the red-edge band (720 nm) and near-infrared band (800 nm) were more effective in predicting yield and LAI at high levels. In conclusion, this study has demonstrated that both MS and digital sensors mounted on a UAV are reliable platforms for rice growth and grain yield estimation, and determined the best period and optimal VIs for rice grain yield prediction.
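The two best-performing indices have standard band-ratio definitions; a minimal sketch (band reflectances as floats, function names ours):

```python
def vari(green, red, blue):
    """Visible Atmospherically Resistant Index, computable from an
    ordinary digital (RGB) camera: (G - R) / (G + R - B)."""
    return (green - red) / (green + red - blue)

def narrowband_ndvi(r800, r720):
    """NDVI[800,720]: normalized difference of the near-infrared
    (800 nm) and red-edge (720 nm) reflectances of the MS sensor."""
    return (r800 - r720) / (r800 + r720)
```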

353 citations


Journal ArticleDOI
TL;DR: In this article, the authors explored the relationship between land use land cover and land surface temperature (LST) patterns in the context of urbanization and proposed a model applying non-parametric regression to estimate future urban climate patterns using predicted land use and land use change.
Abstract: Exploring changes in land use land cover (LULC) to understand the urban heat island (UHI) effect is valuable for both communities and local governments in cities in developing countries, where urbanization and industrialization often take place rapidly but where coherent planning and control policies have not been applied. This work aims at determining and analyzing the relationship between LULC change and land surface temperature (LST) patterns in the context of urbanization. We first explore the relationship between LST and vegetation, man-made features, and cropland using normalized difference vegetation and built-up indices within each LULC type. Afterwards, we assess the impacts of LULC change and urbanization on UHI using hot spot analysis (Getis-Ord Gi∗ statistics) and urban landscape analysis. Finally, we propose a model applying non-parametric regression to estimate future urban climate patterns using predicted land cover and land use change. Results from this work provide an effective methodology for UHI characterization, showing that (a) LST depends nonlinearly on LULC type; (b) hot spot analysis using Getis-Ord Gi∗ statistics allows the change in LST patterns to be analyzed through time; (c) UHI is influenced by both urban landscape and urban development type; (d) LST patterns and the UHI effect can be forecast by the proposed model using nonlinear regression and simulated LULC change scenarios. We chose an inner city area of Hanoi as a case study, a small and flat plain area where LULC change is significant due to urbanization and industrialization. The methodology presented in this paper can be broadly applied to other cities exhibiting similar dynamic growth. Our findings can serve as a useful tool for policy makers and community awareness by providing a scientific basis for sustainable urban planning and management.
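The Getis-Ord Gi∗ statistic behind the hot spot analysis can be sketched directly from its textbook definition, here with binary weights over a 3x3 neighbourhood (self included) on a toy LST grid. This is a simplified illustration, not the authors' GIS workflow.

```python
import numpy as np

def getis_ord_gi_star(grid, i, j):
    """Getis-Ord Gi* z-score for cell (i, j) on a 2-D grid.

    Binary weights of 1 over the 3x3 neighbourhood (self included);
    positive z-scores indicate hot spots, negative ones cold spots.
    """
    x = grid.ravel().astype(float)
    n = x.size
    xbar = x.mean()
    s = np.sqrt((x ** 2).mean() - xbar ** 2)

    # binary weight of 1 for the 3x3 window around (i, j)
    w = np.zeros_like(grid, dtype=float)
    w[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2] = 1.0
    w = w.ravel()

    sw = w.sum()
    num = (w * x).sum() - xbar * sw
    den = s * np.sqrt((n * (w ** 2).sum() - sw ** 2) / (n - 1))
    return num / den

# Toy LST surface (degrees C) with a warm core at the centre
lst = np.full((7, 7), 25.0)
lst[3, 3] = 35.0
z_hot = getis_ord_gi_star(lst, 3, 3)
z_edge = getis_ord_gi_star(lst, 0, 0)
```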

350 citations


Journal ArticleDOI
TL;DR: In this article, an automated cropland mapping algorithm (ACMA) was used to capture extensive knowledge on the croplands of Africa available through ground-based training samples, very high (sub-meter to five-meter) resolution imagery (VHRI), and local knowledge captured during field visits and/or sourced from country reports and literature.
Abstract: The automation of agricultural mapping using satellite-derived remotely sensed data remains a challenge in Africa because of the heterogeneous and fragmented landscape, complex crop cycles, and limited access to local knowledge. Currently, consistent, continent-wide routine cropland mapping of Africa does not exist, with most studies focused either on certain portions of the continent or, at most, a one-time effort at mapping the continent using coarse-resolution remote sensing. In this research, we addressed these limitations by applying an automated cropland mapping algorithm (ACMA) that captures extensive knowledge on the croplands of Africa available through: (a) ground-based training samples, (b) very high (sub-meter to five-meter) resolution imagery (VHRI), and (c) local knowledge captured during field visits and/or sourced from country reports and literature. The study used 16-day time-series of Moderate Resolution Imaging Spectroradiometer (MODIS) normalized difference vegetation index (NDVI) composited data at 250-m resolution for the entire African continent. Based on these data, the study first produced accurate reference cropland layers or RCLs (cropland extent/areas, irrigation versus rainfed, cropping intensities, crop dominance, and croplands versus cropland fallows) for the year 2014 that provided an overall accuracy of around 90% for crop extent in different agro-ecological zones (AEZs). The RCLs for the year 2014 (RCL2014) were then used in the development of the ACMA algorithm to create ACMA-derived cropland layers for 2014 (ACL2014). ACL2014, when compared pixel-by-pixel with RCL2014, had an overall similarity greater than 95%. Based on the ACL2014, the African continent had 296 Mha of net cropland areas (260 Mha cultivated plus 36 Mha fallows) and 330 Mha of gross cropland areas. Of the 260 Mha of net cropland areas cultivated during 2014, 90.6% (236 Mha) was rainfed and just 9.4% (24 Mha) was irrigated.
Africa has about 15% of the world’s population, but only about 6% of world’s irrigation. Net cropland area distribution was 95 Mha during season 1, 117 Mha during season 2, and 84 Mha continuous. About 58% of the rainfed and 39% of the irrigated were single crops (net cropland area without cropland fallows) cropped during either season 1 (January-May) or season 2 (June-September). The ACMA algorithm was deployed on Google Earth Engine (GEE) cloud computing platform and applied on MODIS time-series data from 2003 through 2014 to obtain ACMA-derived cropland layers for these years (ACL2003 to ACL2014). The results indicated that over these twelve years, on average: (a) croplands increased by 1 Mha/year, and (b) cropland fallows decreased by 1 Mha/year. Cropland areas computed from ACL2014 for the 55 African countries were largely underestimated when compared with an independent source of census-based cropland data, with a root-mean-square error (RMSE) of 3.5 Mha. ACMA demonstrated the ability to hind-cast (past years), now-cast (present year), and forecast (future years) cropland products using MODIS 250-m time-series data rapidly, but currently, insufficient reference data exist to rigorously report trends from these results.
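The 16-day compositing step underlying such MODIS NDVI time series can be sketched as a per-pixel maximum-value composite over cloud-masked observations. This is a generic illustration of compositing, not the ACMA algorithm itself; the toy values are ours.

```python
import numpy as np

# Hypothetical NDVI observations within one 16-day window
# (time, rows, cols); cloudy observations already masked to NaN.
ndvi_obs = np.array([
    [[0.20, np.nan], [0.55, 0.40]],
    [[0.60, 0.30], [np.nan, 0.10]],
    [[0.35, 0.25], [0.50, np.nan]],
])

# Maximum-value compositing keeps the greenest (least cloud- and
# atmosphere-contaminated) observation per pixel.
composite = np.nanmax(ndvi_obs, axis=0)
```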

332 citations


Journal ArticleDOI
TL;DR: This paper reports on the final performance of the TanDEM-X global DEM and presents the acquisition and processing strategy that enabled the final DEM quality.
Abstract: The primary objective of the TanDEM-X mission is the generation of a global, consistent, and high-resolution digital elevation model (DEM) with unprecedented global accuracy. The goal is achieved by exploiting the interferometric capabilities of the two twin SAR satellites TerraSAR-X and TanDEM-X, which fly in a close orbit formation, acting as an X-band single-pass interferometer. Between December 2010 and early 2015 all land surfaces were acquired at least twice, and difficult terrain up to seven or eight times. The acquisition strategy, data processing, and DEM calibration and mosaicking were systematically monitored and optimized throughout the entire mission duration in order to fulfill the specification. The processing of all data was finally completed in September 2016, and this paper reports on the final performance of the TanDEM-X global DEM and presents the acquisition and processing strategy that enabled the final DEM quality. The results confirm the outstanding global accuracy of the delivered product, which can now be utilized for both scientific and commercial applications.

323 citations


Journal ArticleDOI
TL;DR: This work proposes a single patch-based Convolutional Neural Network architecture for extraction of roads and buildings from high-resolution remote sensing data and demonstrates the validity and superior performance of the proposed network architecture for extracting roads and buildings in urban areas.
Abstract: Extraction of man-made objects (e.g., roads and buildings) from remotely sensed imagery plays an important role in many urban applications (e.g., urban land use and land cover assessment, updating geographical databases, change detection, etc.). This task is normally difficult due to heterogeneous appearance with large intra-class and low inter-class variations. In this work, we propose a single patch-based Convolutional Neural Network (CNN) architecture for extraction of roads and buildings from high-resolution remote sensing data. Low-level features of roads and buildings (e.g., asymmetry and compactness) of adjacent regions are integrated with CNN features during the post-processing stage to improve the performance. Experiments are conducted on two challenging datasets of high-resolution images to demonstrate the performance of the proposed network architecture, and the results are compared with those of other patch-based network architectures. The results demonstrate the validity and superior performance of the proposed network architecture for extracting roads and buildings in urban areas.
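Low-level shape cues such as compactness are cheap to compute per region. For instance, compactness = 4πA/P² equals 1 for a circle, about 0.785 for a square building footprint, and much less for road-like strips. The sketch below is a generic illustration of this feature, not the paper's exact feature set.

```python
import math

def compactness(area, perimeter):
    """4*pi*A / P^2: 1.0 for a circle, ~0.785 for a square footprint,
    and much lower for elongated, road-like regions."""
    return 4 * math.pi * area / perimeter ** 2

# A square footprint (10 m x 10 m) versus a road-like 2 m x 50 m strip,
# both with the same area of 100 m^2.
square = compactness(100.0, 40.0)
road = compactness(100.0, 104.0)
```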

297 citations


Journal ArticleDOI
TL;DR: The proposed ensemble classifier MLP-CNN harvests the complementary results acquired from the CNN based on deep spatial feature representation and from the MLP based on spectral discrimination, paving the way to effectively address the complicated problem of VFSR image classification.
Abstract: The contextual-based convolutional neural network (CNN) with deep architecture and pixel-based multilayer perceptron (MLP) with shallow structure are well-recognized neural network algorithms, representing the state-of-the-art deep learning method and the classical non-parametric machine learning approach, respectively. The two algorithms, which have very different behaviours, were integrated in a concise and effective way using a rule-based decision fusion approach for the classification of very fine spatial resolution (VFSR) remotely sensed imagery. The decision fusion rules, designed primarily based on the classification confidence of the CNN, reflect the generally complementary patterns of the individual classifiers. In consequence, the proposed ensemble classifier MLP-CNN harvests the complementary results acquired from the CNN based on deep spatial feature representation and from the MLP based on spectral discrimination. Meanwhile, limitations of the CNN due to the adoption of convolutional filters such as the uncertainty in object boundary partition and loss of useful fine spatial resolution detail were compensated. The effectiveness of the ensemble MLP-CNN classifier was tested in both urban and rural areas using aerial photography together with an additional satellite sensor dataset. The MLP-CNN classifier achieved promising performance, consistently outperforming the pixel-based MLP, spectral and textural-based MLP, and the contextual-based CNN in terms of classification accuracy. This research paves the way to effectively address the complicated problem of VFSR image classification.
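A minimal version of confidence-gated decision fusion can be sketched as follows. The threshold value and array shapes are hypothetical; the paper's decision rules are more elaborate, but the core idea is trusting the CNN where its class probability is high and falling back to the spectral MLP elsewhere.

```python
import numpy as np

def fuse(cnn_probs, mlp_probs, conf_threshold=0.8):
    """Rule-based fusion: use the CNN label where its max class
    probability clears the threshold, else the MLP label."""
    cnn_conf = cnn_probs.max(axis=-1)
    use_cnn = cnn_conf >= conf_threshold
    return np.where(use_cnn,
                    cnn_probs.argmax(axis=-1),
                    mlp_probs.argmax(axis=-1))

# Two example pixels: the CNN is confident on the first one only.
cnn = np.array([[0.90, 0.05, 0.05],
                [0.40, 0.35, 0.25]])
mlp = np.array([[0.10, 0.80, 0.10],
                [0.10, 0.20, 0.70]])
labels = fuse(cnn, mlp)
```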

282 citations


Journal ArticleDOI
TL;DR: In this paper, the Global Urban Footprint (GUF) raster map is used for the analysis of global urbanization and peri-urbanization patterns, population estimation, vulnerability assessment, or the modeling of diseases and phenomena of global change in general.
Abstract: Today, approximately 7.2 billion people inhabit the Earth and by 2050 this number will have risen to around nine billion, of which about 70% will be living in cities. The population growth and the related global urbanization pose one of the major challenges to a sustainable future. Hence, it is essential to understand drivers, dynamics, and impacts of the human settlements development. A key component in this context is the availability of an up-to-date and spatially consistent map of the location and distribution of human settlements. It is here that the Global Urban Footprint (GUF) raster map can make a valuable contribution. The new global GUF binary settlement mask shows an unprecedented spatial resolution of 0.4″ (∼12 m) that provides – for the first time – a complete picture of the entirety of urban and rural settlements. The GUF has been derived by means of a fully automated processing framework – the Urban Footprint Processor (UFP) – that was used to analyze a global coverage of more than 180,000 TanDEM-X and TerraSAR-X radar images with 3 m ground resolution collected in 2011–2012. The UFP consists of five main technical modules for data management, feature extraction, unsupervised classification, mosaicking and post-editing. Various quality assessment studies to determine the absolute GUF accuracy based on ground truth data on the one hand, and the relative accuracies compared to established settlement maps on the other hand, clearly indicate the added value of the new global GUF layer, in particular with respect to the representation of rural settlement patterns. The Kappa coefficient of agreement compared to absolute ground truth data, for instance, shows GUF accuracies which are frequently twice as high as those of established low-resolution maps. Generally, the GUF layer achieves an overall absolute accuracy of about 85%, with observed minima around 65% and maxima around 98%.
The GUF will be provided openly and free of charge for any scientific use at full resolution, and for any non-profit (but also non-scientific) use in a generalized version of 2.8″ (∼84 m). With this, the new GUF layer can be expected to break new ground with respect to the analysis of global urbanization and peri-urbanization patterns, population estimation, vulnerability assessment, and the modeling of diseases and phenomena of global change in general.
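The Kappa coefficient cited in the accuracy assessment corrects observed agreement for chance agreement. From a confusion matrix it is computed with the standard formula kappa = (po − pe)/(1 − pe); the toy numbers below are ours, not GUF validation data.

```python
import numpy as np

def kappa(confusion):
    """Cohen's kappa from a confusion matrix
    (rows: reference classes, columns: mapped classes)."""
    cm = confusion.astype(float)
    n = cm.sum()
    po = np.trace(cm) / n                                  # observed agreement
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2  # chance agreement
    return (po - pe) / (1 - pe)

# Toy settlement / non-settlement confusion matrix
cm = np.array([[40, 10],
               [5, 45]])
```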

Journal ArticleDOI
TL;DR: In this paper, a new classification algorithm was developed using the biophysical characteristics of mangrove forests in China by identifying: greenness, canopy coverage, and tidal inundation from time series Landsat data, and elevation, slope, and intersection-with-sea criterion.
Abstract: Due to rapid losses of mangrove forests caused by anthropogenic disturbances and climate change, accurate and contemporary maps of mangrove forests are needed to understand how mangrove ecosystems are changing and to establish plans for sustainable management. In this study, a new classification algorithm was developed using the biophysical characteristics of mangrove forests in China. More specifically, these forests were mapped by identifying: (1) greenness, canopy coverage, and tidal inundation from time series Landsat data, and (2) elevation, slope, and an intersection-with-sea criterion. The annual mean Normalized Difference Vegetation Index (NDVI) was found to be a key variable in determining the classification thresholds of greenness, canopy coverage, and tidal inundation of mangrove forests, which are greatly affected by tide dynamics. In addition, the integration of the Sentinel-1A VH band and the modified Normalized Difference Water Index (mNDWI) shows great potential in identifying yearlong tidal and fresh water bodies, which are related to mangrove forests. This algorithm was developed using 6 typical Regions of Interest (ROIs) for algorithm training and was run on the Google Earth Engine (GEE) cloud computing platform to process 1941 Landsat images (25 Path/Row) and 586 Sentinel-1A images circa 2015. The resultant mangrove forest map of China at 30 m spatial resolution has overall, user’s, and producer’s accuracies greater than 95% when validated with ground reference data. In 2015, China’s mangrove forests had a total area of 20,303 ha, about 92% of which was in the Guangxi Zhuang Autonomous Region, Guangdong, and Hainan Provinces. This study has demonstrated the potential of using the GEE platform and time series Landsat and Sentinel-1A SAR images to identify and map mangrove forests along coastal zones. The resultant mangrove forest maps are likely to be useful for the sustainable management and ecological assessments of mangrove forests in China.
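The water and greenness variables driving the thresholds have standard band-ratio forms; a minimal sketch with our own function names (band reflectances as floats or arrays):

```python
import numpy as np

def mndwi(green, swir):
    """modified Normalized Difference Water Index,
    (Green - SWIR) / (Green + SWIR); positive over open water."""
    return (green - swir) / (green + swir)

def annual_mean_ndvi(nir_series, red_series):
    """Mean NDVI over a year's valid observations: the key variable
    for thresholding mangrove greenness under tide dynamics."""
    ndvi = (nir_series - red_series) / (nir_series + red_series)
    return np.nanmean(ndvi, axis=0)
```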

Journal ArticleDOI
TL;DR: A multiple-kernel-learning framework, an effective way of integrating features from different modalities, was used for combining the two sets of features for classification; the results are encouraging: while CNN features produced an average classification accuracy of about 91%, the integration of 3D point cloud features led to an additional improvement of about 3%.
Abstract: Oblique aerial images offer views of both building roofs and facades, and thus have been recognized as a potential source to detect severe building damages caused by destructive disaster events such as earthquakes. Therefore, they represent an important source of information for first responders or other stakeholders involved in the post-disaster response process. Several automated methods based on supervised learning have already been demonstrated for damage detection using oblique airborne images. However, they often do not generalize well when data from new unseen sites need to be processed, hampering their practical use. Reasons for this limitation include image and scene characteristics, though the most prominent one relates to the image features being used for training the classifier. Recently features based on deep learning approaches, such as convolutional neural networks (CNNs), have been shown to be more effective than conventional hand-crafted features, and have become the state-of-the-art in many domains, including remote sensing. Moreover, often oblique images are captured with high block overlap, facilitating the generation of dense 3D point clouds – an ideal source to derive geometric characteristics. We hypothesized that the use of CNN features, either independently or in combination with 3D point cloud features, would yield improved performance in damage detection. To this end we used CNN and 3D features, both independently and in combination, using images from manned and unmanned aerial platforms over several geographic locations that vary significantly in terms of image and scene characteristics. A multiple-kernel-learning framework, an effective way for integrating features from different modalities, was used for combining the two sets of features for classification. 
The results are encouraging: while CNN features produced an average classification accuracy of about 91%, the integration of 3D point cloud features led to an additional improvement of about 3% (i.e. an average classification accuracy of 94%). The significance of 3D point cloud features becomes more evident in the model transferability scenario (i.e., training and testing samples from different sites that vary slightly in the aforementioned characteristics), where the integration of CNN and 3D point cloud features significantly improved the model transferability accuracy up to a maximum of 7% compared with the accuracy achieved by CNN features alone. Overall, an average accuracy of 85% was achieved for the model transferability scenario across all experiments. Our main conclusion is that such an approach qualifies for practical use.
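At its core, multiple-kernel learning classifies with a weighted combination of per-modality kernels. The sketch below uses RBF kernels and a fixed weight for illustration; in a real MKL framework the weight would be learned jointly with the classifier, and the feature dimensions here are hypothetical.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gram matrix of the Gaussian (RBF) kernel over the rows of X."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def combined_kernel(K_a, K_b, mu=0.5):
    """Convex combination of two kernels (e.g. one over CNN features,
    one over 3D point cloud features); the result is again a valid
    kernel usable by any kernel classifier such as an SVM."""
    return mu * K_a + (1 - mu) * K_b

X_cnn = np.random.rand(5, 8)   # hypothetical CNN feature vectors
X_3d = np.random.rand(5, 3)    # hypothetical 3D point cloud features
K = combined_kernel(rbf_kernel(X_cnn), rbf_kernel(X_3d), mu=0.7)
```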

Journal ArticleDOI
TL;DR: A novel deep model based on convolutional neural networks (CNNs), i.e., an end-to-end self-cascaded network (ScasNet), is proposed; for confusing manmade objects and intricate fine-structured objects, ScasNet improves the labeling coherence with sequential global-to-local context aggregation.
Abstract: Semantic labeling for very high resolution (VHR) images in urban areas, is of significant importance in a wide range of remote sensing applications. However, many confusing manmade objects and intricate fine-structured objects make it very difficult to obtain both coherent and accurate labeling results. For this challenging task, we propose a novel deep model with convolutional neural networks (CNNs), i.e., an end-to-end self-cascaded network (ScasNet). Specifically, for confusing manmade objects, ScasNet improves the labeling coherence with sequential global-to-local contexts aggregation. Technically, multi-scale contexts are captured on the output of a CNN encoder, and then they are successively aggregated in a self-cascaded manner. Meanwhile, for fine-structured objects, ScasNet boosts the labeling accuracy with a coarse-to-fine refinement strategy. It progressively refines the target objects using the low-level features learned by CNN’s shallow layers. In addition, to correct the latent fitting residual caused by multi-feature fusion inside ScasNet, a dedicated residual correction scheme is proposed. It greatly improves the effectiveness of ScasNet. Extensive experimental results on three public datasets, including two challenging benchmarks, show that ScasNet achieves the state-of-the-art performance.

Journal ArticleDOI
Jonathan P. Dash1, Michael S. Watt1, Grant D. Pearse1, Marie Heaphy1, Heidi S. Dungey1 
TL;DR: In this paper, a disease outbreak in mature Pinus radiata D. Don trees was simulated using targeted application of herbicide, and a non-parametric approach was used to model physiological stress based on spectral indices, which was found to provide good classification accuracy.
Abstract: Research into remote sensing tools for monitoring physiological stress caused by biotic and abiotic factors is critical for maintaining healthy and highly-productive plantation forests. Significant research has focussed on assessing forest health using remotely sensed data from satellites and manned aircraft. Unmanned aerial vehicles (UAVs) may provide new tools for improved forest health monitoring by providing data with very high temporal and spatial resolutions. These platforms also pose unique challenges and methods for health assessments must be validated before use. In this research, we simulated a disease outbreak in mature Pinus radiata D. Don trees using targeted application of herbicide. The objective was to acquire a time-series simulated disease expression dataset to develop methods for monitoring physiological stress from a UAV platform. Time-series multi-spectral imagery was acquired using a UAV flown over a trial at regular intervals. Traditional field-based health assessments of crown health (density) and needle health (discolouration) were carried out simultaneously by experienced forest health experts. Our results showed that multi-spectral imagery collected from a UAV is useful for identifying physiological stress in mature plantation trees even during the early stages of tree stress. We found that physiological stress could be detected earliest in data from the red edge and near infra-red bands. In contrast to previous findings, red edge data did not offer earlier detection of physiological stress than the near infra-red data. A non-parametric approach was used to model physiological stress based on spectral indices and was found to provide good classification accuracy (weighted kappa = 0.694). This model can be used to map physiological stress based on high-resolution multi-spectral data.

Journal ArticleDOI
TL;DR: A small-scale-data method, the multi-grained network (MugNet), is proposed to explore the application of deep learning approaches in hyperspectral image classification; it is built upon a very simple network that does not include many hyperparameters to tune.
Abstract: In recent years, deep learning based methods have attracted broad attention in the field of hyperspectral image classification. However, due to the massive parameters and the complex network structure, deep learning methods may not perform well when only a few training samples are available. In this paper, we propose a small-scale-data method, the multi-grained network (MugNet), to explore the application of deep learning approaches in hyperspectral image classification. MugNet can be considered a simplified deep learning model which mainly targets hyperspectral image classification with limited samples. Three novel strategies are proposed to construct MugNet. First, the spectral relationship among different bands, as well as the spatial correlation within neighboring pixels, are both utilized via a multi-grained scanning approach. The proposed multi-grained scanning strategy can not only extract the joint spectral-spatial information, but also combine the spectral and spatial relationships of different grains. Second, because there are abundant unlabeled pixels available in hyperspectral images, we take full advantage of these samples and adopt a semi-supervised manner in the process of generating convolution kernels. Finally, MugNet is built upon a very simple network which does not include many hyperparameters for tuning. The performance of MugNet is evaluated on one popular and two challenging data sets, and comparison experiments with several state-of-the-art hyperspectral image classification methods are reported.
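One reading of the semi-supervised kernel-generation strategy is that convolution kernels are obtained from unlabeled data by clustering spectral-spatial patches rather than by back-propagation. The sketch below illustrates that idea with assumed dimensions (an 8-band cube, a single 3 × 3 grain, 16 kernels); the paper's actual multi-grained scanning combines several grain sizes.

```python
import numpy as np
from sklearn.cluster import KMeans

def patches_from_cube(cube, size=3):
    """Extract every size x size spatial patch (flattened) from a hyperspectral cube."""
    h, w, b = cube.shape
    out = []
    for i in range(h - size + 1):
        for j in range(w - size + 1):
            out.append(cube[i:i + size, j:j + size, :].ravel())
    return np.array(out)

rng = np.random.default_rng(1)
cube = rng.random((20, 20, 8))          # hypothetical 8-band image
patches = patches_from_cube(cube)

# Kernels come from unlabeled patches via clustering, not gradient descent --
# cluster centres act as the convolution kernels of the first layer.
km = KMeans(n_clusters=16, n_init=4, random_state=0).fit(patches)
kernels = km.cluster_centers_.reshape(16, 3, 3, 8)
print(kernels.shape)
```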

Journal ArticleDOI
TL;DR: In this paper, the power of high-spatial-resolution RGB, multispectral and thermal data fusion was evaluated for estimating soybean (Glycine max) biochemical parameters, including chlorophyll content and nitrogen concentration, and biophysical parameters, including leaf area index (LAI) and above-ground fresh and dry biomass.
Abstract: Estimating crop biophysical and biochemical parameters with high accuracy at low cost is imperative for high-throughput phenotyping in precision agriculture. Although fusion of data from multiple sensors is a common application in remote sensing, less is known about the contribution of low-cost RGB, multispectral and thermal sensors to rapid crop phenotyping. This is due to the fact that (1) simultaneous collection of multi-sensor data using satellites is rare and (2) multi-sensor data collected during a single flight have not been accessible until recent developments in Unmanned Aerial Systems (UASs) and UAS-friendly sensors that allow efficient information fusion. The objective of this study was to evaluate the power of high-spatial-resolution RGB, multispectral and thermal data fusion to estimate soybean (Glycine max) biochemical parameters, including chlorophyll content and nitrogen concentration, and biophysical parameters, including Leaf Area Index (LAI) and above-ground fresh and dry biomass. Multiple low-cost sensors integrated on UASs were used to collect RGB, multispectral, and thermal images throughout the growing season at a site established near Columbia, Missouri, USA. From these images, vegetation indices were extracted, a Crop Surface Model (CSM) was derived, and a model to extract the vegetation fraction was developed. Then, spectral indices/features were combined to model and predict crop biophysical and biochemical parameters using Partial Least Squares Regression (PLSR), Support Vector Regression (SVR), and Extreme Learning Machine based Regression (ELR) techniques.
Results showed that: (1) For biochemical variable estimation, multispectral and thermal data fusion provided the best estimates of nitrogen concentration and chlorophyll (Chl) a content (RMSE of 9.9% and 17.1%, respectively), while fusion of RGB color-based indices and multispectral data exhibited the largest RMSE (22.6%); the highest accuracy for Chl a + b content estimation was obtained by fusing information from all three sensors, with an RMSE of 11.6%. (2) Among the plant biophysical variables, LAI was best predicted by RGB and thermal data fusion, while multispectral and thermal data fusion was found to be best for biomass estimation. (3) For estimating the above-mentioned plant traits of soybean from multi-sensor data fusion, ELR yielded promising results compared to PLSR and SVR in this study. This research indicates that fusion of low-cost multi-sensor data within a machine learning framework can provide relatively accurate estimation of plant traits and valuable insight for precision agriculture and plant stress assessment.

Journal ArticleDOI
TL;DR: Wang et al. compared the similarities and discrepancies of nine global land-cover maps in both area and spatial patterns, analysed their inherent relations to data sources and classification schemes and methods, and built a spatial analysis model to depict their spatial variation in accuracy based on five sets of validation sample units.
Abstract: Land cover (LC) is the vital foundation of Earth science. Up to now, several global LC datasets have arisen through the efforts of many scientific communities. To provide guidelines for data usage over China, nine LC maps from seven global LC datasets (IGBP DISCover, UMD, GLC 2000, MCD12Q1, GLCNMO, CCI-LC, and GlobeLand30) were evaluated in this study. First, we compared their similarities and discrepancies in both area and spatial patterns, and analysed their inherent relations to data sources, classification schemes, and methods. Next, five sets of validation sample units (VSUs) were collected to calculate their accuracy quantitatively. Further, we built a spatial analysis model and depicted their spatial variation in accuracy based on the five sets of VSUs. The results show that there are evident discrepancies among these LC maps in both area and spatial patterns. For LC maps produced by different institutes, GLC 2000 and CCI-LC 2000 have the highest overall spatial agreement (53.8%). For LC maps produced by the same institutes, the overall spatial agreement of CCI-LC 2000 and 2010, and of MCD12Q1 2001 and 2010, reaches 99.8% and 73.2%, respectively; however, more efforts are still needed if these LC maps are to be used as time-series data for model input, since both CCI-LC and MCD12Q1 fail to represent the rapidly changing trend of several key LC classes in the early 21st century, in particular urban and built-up, snow and ice, water bodies, and permanent wetlands. With the highest spatial resolution, the overall accuracy of GlobeLand30 2010 is 82.39%. For the other six LC datasets with coarse resolution, CCI-LC 2010/2000 has the highest overall accuracy, followed in turn by MCD12Q1 2010/2001, GLC 2000, GLCNMO 2008, IGBP DISCover, and UMD. Although all maps exhibit high accuracy in homogeneous regions, local accuracies in other regions are quite different, particularly in the Farming-Pastoral Zone of North China, the mountains of Northeast China, and the Southeast Hills.
Special attention should be paid by data users who are interested in these regions.
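The overall spatial agreement figures quoted above (e.g., 53.8% between GLC 2000 and CCI-LC 2000) are, in essence, the fraction of co-valid pixels on which two maps assign the same class after legend harmonisation. A minimal sketch on synthetic maps — the class codes and nodata convention are assumptions:

```python
import numpy as np

def overall_agreement(map_a, map_b, nodata=0):
    """Fraction of mutually valid pixels where two land-cover maps agree."""
    valid = (map_a != nodata) & (map_b != nodata)
    return np.mean(map_a[valid] == map_b[valid])

rng = np.random.default_rng(3)
a = rng.integers(1, 6, size=(100, 100))      # hypothetical 5-class map
b = a.copy()
flip = rng.random(a.shape) < 0.3             # perturb ~30% of pixels
b[flip] = rng.integers(1, 6, size=flip.sum())

print(f"overall spatial agreement: {overall_agreement(a, b):.1%}")
```

A per-class breakdown (a confusion matrix over the harmonised legend) would reveal which classes drive the disagreement, as the paper does for urban, snow/ice, water, and wetlands.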

Journal ArticleDOI
TL;DR: In this paper, a hierarchical object-based Random Forest (RF) classification approach is proposed for discriminating between different wetland classes in a sub-region located in the northeastern portion of the Avalon Peninsula.
Abstract: Wetlands are important ecosystems around the world, although they are being degraded by both anthropogenic and natural processes. Newfoundland is among the richest Canadian provinces in terms of different wetland classes. Herbaceous wetlands cover extensive areas of the Avalon Peninsula, which are the habitat of a number of animal and plant species. In this study, a novel hierarchical object-based Random Forest (RF) classification approach is proposed for discriminating between different wetland classes in a sub-region located in the northeastern portion of the Avalon Peninsula. In particular, multi-polarization and multi-frequency SAR data, including X-band TerraSAR-X single polarized (HH), L-band ALOS-2 dual polarized (HH/HV), and C-band RADARSAT-2 fully polarized images, were applied at different classification levels. First, a SAR backscatter analysis of different land cover types was performed using training data and used in Level-I classification to separate water from non-water classes. This was followed by Level-II classification, wherein the water class was further divided into shallow- and deep-water classes, and the non-water class was partitioned into herbaceous and non-herbaceous classes. In Level-III classification, the herbaceous class was further divided into bog, fen, and marsh classes, while the non-herbaceous class was subsequently partitioned into urban, upland, and swamp classes. In Level-II and -III classifications, different polarimetric decomposition approaches, including the Cloude-Pottier, Freeman-Durden, and Yamaguchi decompositions, as well as Kennaugh matrix elements, were extracted to aid the RF classifier. The overall accuracy and kappa coefficient were determined at each classification level to evaluate the classification results. The importance of input features was also determined using the variable importance obtained from RF.
It was found that the Kennaugh matrix elements, Yamaguchi, and Freeman-Durden decompositions were the most important parameters for wetland classification in this study. Using this new hierarchical RF classification approach, an overall accuracy of up to 94% was obtained for classifying different land cover types in the study area.
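The hierarchical scheme — a coarse classifier at the top level, with each branch refined by its own Random Forest — can be sketched generically. The features and the four-class label set below are synthetic placeholders; the paper's full hierarchy has three levels and more classes.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
n = 300
X = rng.random((n, 6))           # hypothetical SAR backscatter/decomposition features
fine = rng.integers(0, 4, n)     # 0: deep water, 1: shallow water, 2: bog, 3: fen
water = (fine <= 1).astype(int)  # Level-I labels: water vs non-water

# Level I: one RF separates water from non-water
rf1 = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, water)

# Level II: a dedicated RF per branch refines the coarse class
rf_water = RandomForestClassifier(n_estimators=50, random_state=0).fit(
    X[fine <= 1], fine[fine <= 1])
rf_land = RandomForestClassifier(n_estimators=50, random_state=0).fit(
    X[fine >= 2], fine[fine >= 2])

def predict_hierarchical(x):
    """Route a sample down the hierarchy: Level-I branch, then Level-II refinement."""
    x = x.reshape(1, -1)
    branch = rf1.predict(x)[0]
    return (rf_water if branch == 1 else rf_land).predict(x)[0]

pred = np.array([predict_hierarchical(x) for x in X[:10]])
print(pred)
```

The advantage of the cascade is that each RF sees a simpler, more balanced problem, and different feature subsets (e.g., different decompositions) can feed different levels.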

Journal ArticleDOI
TL;DR: A unified random field model which reasons jointly about 3D scene flow as well as the location, shape and motion of vehicles in the observed scene is proposed, which is the first to provide stereo and optical flow ground truth for dynamic real-world urban scenes at large scale.
Abstract: This work investigates the estimation of dense three-dimensional motion fields, commonly referred to as scene flow. While great progress has been made in recent years, large displacements and adverse imaging conditions as observed in natural outdoor environments are still very challenging for current approaches to reconstruction and motion estimation. In this paper, we propose a unified random field model which reasons jointly about 3D scene flow as well as the location, shape and motion of vehicles in the observed scene. We formulate the problem as the task of decomposing the scene into a small number of rigidly moving objects sharing the same motion parameters. Thus, our formulation effectively introduces long-range spatial dependencies which commonly employed local rigidity priors are lacking. Our inference algorithm then estimates the association of image segments and object hypotheses together with their three-dimensional shape and motion. We demonstrate the potential of the proposed approach by introducing a novel challenging scene flow benchmark which allows for a thorough comparison of the proposed scene flow approach with respect to various baseline models. In contrast to previous benchmarks, our evaluation is the first to provide stereo and optical flow ground truth for dynamic real-world urban scenes at large scale. Our experiments reveal that rigid motion segmentation can be utilized as an effective regularizer for the scene flow problem, improving upon existing two-frame scene flow methods. At the same time, our method yields plausible object segmentations without requiring an explicitly trained recognition model for a specific object class.

Journal ArticleDOI
TL;DR: In this study, sparse autoencoder, convolutional neural networks (CNN) and unsupervised clustering are combined to solve the ternary change detection problem without any supervision, and results on real datasets validate the effectiveness and superiority of the proposed framework.
Abstract: Ternary change detection aims to detect changes and group the changes into positive change and negative change. It is of great significance in the joint interpretation of spatial-temporal synthetic aperture radar images. In this study, sparse autoencoder, convolutional neural networks (CNN) and unsupervised clustering are combined to solve the ternary change detection problem without any supervision. Firstly, a sparse autoencoder is used to transform the log-ratio difference image into a suitable feature space for extracting key changes and suppressing outliers and noise. Then the learned features are clustered into three classes, which are taken as pseudo labels for training a CNN model as a change feature classifier. The reliable training samples for the CNN are selected from the feature maps learned by the sparse autoencoder with certain selection rules. Given the training samples and the corresponding pseudo labels, the CNN model can be trained using back propagation with stochastic gradient descent. During its training procedure, the CNN is driven to learn the concept of change, and a more powerful model is established to distinguish different types of changes. Unlike traditional methods, the proposed framework integrates the merits of sparse autoencoder and CNN to learn more robust difference representations and the concept of change for ternary change detection. Experimental results on real datasets validate the effectiveness and superiority of the proposed framework.
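The pseudo-labeling stage can be illustrated in a few lines: form the log-ratio image from a SAR pair and cluster its pixels into three groups standing for negative change, no change, and positive change. This sketch clusters the raw log-ratio directly to stay short; in the paper the clustering runs on features learned by the sparse autoencoder, and the labels then train a CNN.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
img1 = rng.uniform(0.1, 1.0, (64, 64))          # synthetic SAR intensity, date 1
img2 = img1 * rng.uniform(0.5, 2.0, (64, 64))   # date 2, with random change

# Log-ratio difference image: the standard change indicator for SAR pairs
log_ratio = np.log(img2 / img1)

labels = KMeans(n_clusters=3, n_init=4, random_state=0).fit_predict(
    log_ratio.reshape(-1, 1))

# Order cluster ids by mean log-ratio so 0/1/2 = negative / no / positive change
order = np.argsort([log_ratio.ravel()[labels == k].mean() for k in range(3)])
pseudo = np.zeros_like(labels)
for new, old in enumerate(order):
    pseudo[labels == old] = new
print(np.bincount(pseudo))
```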

Journal ArticleDOI
TL;DR: In this paper, the formation of the surface urban heat island (SUHI) in a tropical mountain city of Southeast Asia (Baguio City, the summer capital of the Philippines) was monitored and examined using Landsat data (1987-2015).
Abstract: Since it was first described about two centuries ago and due to its adverse impacts on urban ecological environment and the overall livability of cities, the urban heat island (UHI) phenomenon has been, and still is, an important research topic across various fields of study. However, UHI studies on cities in mountain regions are still lacking. This study aims to contribute to this endeavor by monitoring and examining the formation of surface UHI (SUHI) in a tropical mountain city of Southeast Asia –Baguio City, the summer capital of the Philippines– using Landsat data (1987–2015). Based on mean surface temperature difference between impervious surface (IS) and green space (GS1), SUHI intensity (SUHII) in the study area increased from 2.7 °C in 1987 to 3.4 °C in 2015. Between an urban zone (>86% impervious) and a rural zone (
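The SUHII definition used above — the mean land surface temperature (LST) over impervious surface minus the mean over green space — is a one-line computation once the LST raster and the two masks exist. A minimal sketch with a synthetic LST scene:

```python
import numpy as np

def suhii(lst, impervious_mask, green_mask):
    """SUHI intensity: mean LST over impervious surface minus mean LST over green space."""
    return lst[impervious_mask].mean() - lst[green_mask].mean()

rng = np.random.default_rng(6)
lst = rng.normal(22.0, 1.0, (50, 50))            # hypothetical LST scene in deg C
imperv = np.zeros((50, 50), bool)
imperv[:25] = True                               # built-up half of the scene
green = ~imperv
lst[imperv] += 3.0                               # warm the built-up half by ~3 deg C

print(f"SUHII = {suhii(lst, imperv, green):.1f} degC")
```

With real Landsat data, the LST raster would come from the thermal band after emissivity correction, and the masks from an impervious-surface/green-space classification of the same date.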

Journal ArticleDOI
TL;DR: In this article, the authors evaluated the ability of Sentinel imagery for the retrieval and predictive mapping of above-ground biomass of mangroves and their replacement land uses, and developed models each from SAR raw polarisation backscatter data, multispectral bands, vegetation indices and canopy biophysical variables.
Abstract: The recent launch of the Sentinel-1 (SAR) and Sentinel-2 (multispectral) missions offers a new opportunity for land-based biomass mapping and monitoring, especially in the tropics where deforestation is highest. Yet, unlike in agriculture and inland land uses, the use of Sentinel imagery has not been evaluated for biomass retrieval in mangrove forests and the non-forest land uses that replaced mangroves. In this study, we evaluated the ability of Sentinel imagery for the retrieval and predictive mapping of above-ground biomass of mangroves and their replacement land uses. We used Sentinel SAR and multispectral imagery to develop biomass prediction models through conventional linear regression and novel Machine Learning algorithms. We developed models each from SAR raw polarisation backscatter data, multispectral bands, vegetation indices, and canopy biophysical variables. The results show that the model based on the biophysical variable Leaf Area Index (LAI) derived from Sentinel-2 was the most accurate in predicting the overall above-ground biomass. In contrast, the model which utilised optical bands had the lowest accuracy. However, the SAR-based model was more accurate in predicting the biomass of the non-forest replacement land uses with usually nil-to-low vegetation cover, such as abandoned aquaculture pond, cleared mangrove and abandoned salt pond. These models had a 0.82–0.83 correlation/agreement between observed and predicted values, and a root mean square error of 27.8–28.5 Mg ha−1. Among the Sentinel-2 multispectral bands, the red and red-edge bands (bands 4, 5 and 7), combined with elevation data, were the best variable-set combination for biomass prediction. The red-edge-based Inverted Red-Edge Chlorophyll Index had the highest prediction accuracy among the vegetation indices.
Overall, Sentinel-1 SAR and Sentinel-2 multispectral imagery can provide satisfactory results in the retrieval and predictive mapping of the above-ground biomass of mangroves and the replacement non-forest land uses, especially with the inclusion of elevation data. The study demonstrates encouraging results in biomass mapping of mangroves and other coastal land uses in the tropics using the freely accessible and relatively high-resolution Sentinel imagery.
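The best-performing vegetation index above, the Inverted Red-Edge Chlorophyll Index (IRECI), is commonly given as (B7 − B4) / (B5 / B6) on Sentinel-2 reflectances; the reflectance values below are hypothetical, chosen only to contrast a dense canopy with a sparse one.

```python
def ireci(b4, b5, b6, b7):
    """Inverted Red-Edge Chlorophyll Index from Sentinel-2 reflectances.

    Commonly defined as (B7 - B4) / (B5 / B6), where B4 is red,
    B5 and B6 are red-edge bands, and B7 sits at the red-edge/NIR shoulder.
    """
    return (b7 - b4) / (b5 / b6)

# Hypothetical reflectances: a dense mangrove pixel vs a cleared-pond pixel
dense = ireci(b4=0.03, b5=0.10, b6=0.25, b7=0.35)
sparse = ireci(b4=0.08, b5=0.12, b6=0.15, b7=0.18)
print(dense, sparse)
```

The steep red-edge slope of a healthy, dense canopy drives the index up, which is why it tracks chlorophyll (and, indirectly, biomass) better than broad-band indices here.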

Journal ArticleDOI
TL;DR: In this article, an octocopter was used to investigate the spatio-temporal variations of species composition in a tall grassland in Ontario, Canada, during the growing season (April to December) in 2015.
Abstract: Investigating spatio-temporal variations of species composition in grassland is an essential step in evaluating grassland health conditions, understanding the evolutionary processes of the local ecosystem, and developing grassland management strategies. Space-borne remote sensing images (e.g., MODIS, Landsat, and Quickbird) with spatial resolutions varying from less than 1 m to 500 m have been widely applied for vegetation species classification at spatial scales from community to regional levels. However, the spatial resolutions of these images are not fine enough to investigate grassland species composition, since grass species are generally small in size and highly mixed, and vegetation cover is greatly heterogeneous. Unmanned Aerial Vehicle (UAV) as an emerging remote sensing platform offers a unique ability to acquire imagery at very high spatial resolution (centimetres). Compared to satellites or airplanes, UAVs can be deployed quickly and repeatedly, and are less limited by weather conditions, facilitating advantageous temporal studies. In this study, we utilize an octocopter, on which we mounted a modified digital camera (with near-infrared (NIR), green, and blue bands), to investigate species composition in a tall grassland in Ontario, Canada. Seven flight missions were conducted during the growing season (April to December) in 2015 to detect seasonal variations, and four of them were selected in this study to investigate the spatio-temporal variations of species composition. To quantitatively compare images acquired at different times, we establish a processing flow of UAV-acquired imagery, focusing on imagery quality evaluation and radiometric correction. The corrected imagery is then applied to an object-based species classification. Maps of species distribution are subsequently used for a spatio-temporal change analysis. 
Results indicate that UAV-acquired imagery is an unrivalled data source for studying fine-scale grassland species composition, owing to its high spatial resolution. The overall accuracy is around 85% for images acquired at different times. Species composition is spatially structured by topographic features and soil moisture conditions. Spatio-temporal variation of species composition reflects the growth processes and succession of different species, which is critical for understanding the evolutionary features of grassland ecosystems. Strengths and challenges of applying UAV-acquired imagery to vegetation studies are summarized at the end.

Journal ArticleDOI
TL;DR: The impressive results demonstrate that the proposed SSGF and its extended method are effective in addressing the lack of an annotated HRRS dataset; they can learn valuable information from unlabeled samples to improve classification ability and obtain a reliable annotated dataset for supervised learning.
Abstract: High resolution remote sensing (HRRS) image scene classification plays a crucial role in a wide range of applications and has been receiving significant attention. Recently, remarkable efforts have been made to develop a variety of approaches for HRRS scene classification, wherein deep-learning-based methods have achieved considerable performance in comparison with state-of-the-art methods. However, deep-learning-based methods face a severe limitation: a great number of manually-annotated HRRS samples are needed to obtain a reliable model, yet there are still not sufficient annotated datasets in the field of remote sensing. In addition, it is a challenge to build a large-scale HRRS image dataset due to the abundant diversities and variations in HRRS images. In order to address this problem, we propose a semi-supervised generative framework (SSGF), which combines deep learning features, a self-labeling technique, and a discriminative evaluation method to complete the tasks of scene classification and dataset annotation. On this basis, we further develop an extended algorithm (SSGA-E) and evaluate it through exclusive experiments. The experimental results show that the SSGA-E outperforms most fully-supervised and semi-supervised methods: it achieved the third-best accuracy on the UCM dataset, and the second-best accuracy on the WHU-RS, NWPU-RESISC45, and AID datasets. The impressive results demonstrate that the proposed SSGF and the extended method are effective in addressing the lack of annotated HRRS datasets, as they can learn valuable information from unlabeled samples to improve classification ability and obtain a reliable annotated dataset for supervised learning.
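The self-labeling idea — train on the few annotated scenes, adopt confident predictions on unlabeled scenes as new labels, retrain — can be sketched as below. This is a minimal stand-in for the SSGF loop, not the paper's algorithm: the 2-D "deep feature" space, logistic regression classifier, and 0.95 confidence threshold are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
# Two hypothetical scene classes in a 2-D feature space
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.repeat([0, 1], 100)
# Only 10 scenes are annotated (5 per class); the rest are unlabeled
labeled = np.concatenate([rng.choice(100, 5, replace=False),
                          100 + rng.choice(100, 5, replace=False)])
unlabeled = np.setdiff1d(np.arange(200), labeled)

X_l, y_l = X[labeled], y[labeled]
clf = LogisticRegression().fit(X_l, y_l)

# Self-labeling rounds: keep only high-confidence pseudo-labels, then retrain
for _ in range(3):
    proba = clf.predict_proba(X[unlabeled])
    confident = proba.max(axis=1) > 0.95
    X_aug = np.vstack([X_l, X[unlabeled][confident]])
    y_aug = np.concatenate([y_l, proba.argmax(axis=1)[confident]])
    clf = LogisticRegression().fit(X_aug, y_aug)

print(f"accuracy on all scenes: {clf.score(X, y):.2f}")
```

The discriminative evaluation step in the paper plays the role of the confidence threshold here: it filters out pseudo-labels likely to be wrong before they pollute the training set.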

Journal ArticleDOI
TL;DR: Hierarchical semantic cognition (HSC) is presented in this study; it serves as a general cognition structure for recognizing urban functional zones and can further support urban planning and management.
Abstract: As the basic units of urban areas, functional zones are essential for city planning and management, but functional-zone maps are hardly available in most cities, as traditional urban investigations focus mainly on land-cover objects instead of functional zones. As a result, an automatic/semi-automatic method for mapping urban functional zones is highly desirable. Hierarchical semantic cognition (HSC) is presented in this study, and serves as a general cognition structure for recognizing urban functional zones. Unlike traditional classification methods, HSC relies on geographic cognition and considers four semantic layers, i.e., visual features, object categories, spatial object patterns, and zone functions, as well as their hierarchical relations. Here, we used HSC to classify functional zones in Beijing with a very-high-resolution (VHR) satellite image and point-of-interest (POI) data. Experimental results indicate that this method produces more accurate results than Support Vector Machine (SVM) and Latent Dirichlet Allocation (LDA), with a higher overall accuracy of 90.8%. Additionally, the contributions of the diverse semantic layers are quantified: the object-category layer is the most important and makes a 54% contribution to functional-zone classification; while the other semantic layers are less important, their contributions cannot be ignored. Consequently, the presented HSC is effective in classifying urban functional zones, and can further support urban planning and management.

Journal ArticleDOI
TL;DR: The results of fusing Landsat-8 Operational Land Imager data with Moderate Resolution Imaging Spectroradiometer (MODIS), China Environment 1A series, and Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) digital elevation model (DEM) data showed that the fused data integrating temporal, spectral, angular, and topographic features achieved better land cover classification accuracy than the original RS data.
Abstract: Although many advances have been made in past decades, land cover classification of fine-resolution remotely sensed (RS) data integrating multiple temporal, angular, and spectral features remains limited, and the contribution of different RS features to land cover classification accuracy remains uncertain. We proposed to improve land cover classification accuracy by integrating multi-source RS features through data fusion. We further investigated the effect of different RS features on classification performance. The results of fusing Landsat-8 Operational Land Imager (OLI) data with Moderate Resolution Imaging Spectroradiometer (MODIS), China Environment 1A series (HJ-1A), and Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) digital elevation model (DEM) data showed that the fused data integrating temporal, spectral, angular, and topographic features achieved better land cover classification accuracy than the original RS data. Compared with the topographic feature, the temporal and angular features extracted from the fused data played more important roles in classification performance, especially those temporal features containing abundant vegetation growth information, which markedly increased the overall classification accuracy. In addition, the multispectral and hyperspectral fusion successfully discriminated detailed forest types. Our study provides a straightforward strategy for hierarchical land cover classification by making full use of available RS data. All of these methods and findings could be useful for land cover classification at both regional and global scales.

Journal ArticleDOI
TL;DR: A new approach to contextual classification of segmented airborne laser scanning data that combines different point cloud segmentation methods to minimise both under- and over-segmentation, and classifies the resulting segments using a Conditional Random Field.
Abstract: Classification of point clouds is needed as a first step in the extraction of various types of geo-information from point clouds. We present a new approach to contextual classification of segmented airborne laser scanning data. Potential advantages of segment-based classification are easily offset by segmentation errors. We combine different point cloud segmentation methods to minimise both under- and over-segmentation. We propose a contextual segment-based classification using a Conditional Random Field. Segment adjacencies are represented by edges in the graphical model and characterised by a range of features of points along the segment borders. A mix of small and large segments allows the interaction between nearby and distant points. Results of the segment-based classification are compared to results of a point-based CRF classification. Whereas only a small advantage of the segment-based classification is observed for the ISPRS Vaihingen dataset with 4–7 points/m², the percentage of correctly classified points in a 30 points/m² dataset of Rotterdam amounts to 91.0% for the segment-based classification vs. 82.8% for the point-based classification.

Journal ArticleDOI
TL;DR: A new method for extracting roads from high-resolution imagery based on hierarchical graph-based image segmentation; experiments demonstrate the validity and superior performance of the proposed method for road extraction in urban areas.
Abstract: Extraction of road networks in urban areas from remotely sensed imagery plays an important role in many urban applications (e.g. road navigation, geometric correction of urban remote sensing images, updating geographic information systems, etc.). It is normally difficult to accurately differentiate road from its background due to the complex geometry of the buildings and the acquisition geometry of the sensor. In this paper, we present a new method for extracting roads from high-resolution imagery based on hierarchical graph-based image segmentation. The proposed method consists of: (1) extracting features (e.g., using Gabor and morphological filtering) to enhance the contrast between road and non-road pixels; (2) graph-based segmentation, consisting of (i) constructing a graph representation of the image based on an initial segmentation and (ii) hierarchical merging and splitting of image segments based on color and shape features; and (3) post-processing to remove irregularities in the extracted road segments. Experiments are conducted on three challenging datasets of high-resolution images to demonstrate the proposed method and compare it with other similar approaches. The results demonstrate the validity and superior performance of the proposed method for road extraction in urban areas.
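The Gabor-filtering step in the feature-extraction stage exploits the fact that roads appear as elongated, oriented structures. A minimal sketch of an orientation bank built from scratch — the kernel size, frequency, and four-orientation bank are illustrative choices, not the paper's parameters:

```python
import numpy as np
from scipy.ndimage import convolve

def gabor_kernel(theta, freq=0.2, sigma=3.0, size=15):
    """Real Gabor kernel: a cosine carrier along direction theta under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * freq * xr)

# A 4-orientation bank; the per-pixel maximum response across orientations
# enhances road-like structures regardless of their direction.
bank = [gabor_kernel(t) for t in np.linspace(0, np.pi, 4, endpoint=False)]

img = np.zeros((40, 40))
img[18:22, :] = 1.0                      # a synthetic horizontal "road"
response = np.max([convolve(img, k) for k in bank], axis=0)
print(f"on-road {response[20, 20]:.1f}, off-road {response[5, 5]:.1f}")
```

The enhanced response map would then feed the graph construction of stage (2), where pixels with similar road-likeness are merged into segments.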

Journal ArticleDOI
TL;DR: The purpose is to highlight the progress attained in the detection of landmines using hyperspectral imaging and to identify possible perspectives for future work, in order to achieve better detection in real-time operation mode.
Abstract: Hyperspectral imaging is a trending technique in remote sensing that finds its application in many different areas, such as agriculture, mapping, target detection, and food quality monitoring. This technique gives the ability to remotely identify the composition of each pixel of the image. Therefore, it is a natural candidate for the purpose of landmine detection, thanks to its inherent safety and fast response time. In this paper, we present the results of several studies that employed hyperspectral imaging for the purpose of landmine detection, discussing the different signal processing techniques used in this framework for hyperspectral image processing and target detection. Our purpose is to highlight the progress attained in the detection of landmines using hyperspectral imaging and to identify possible perspectives for future work, in order to achieve better detection in real-time operation mode.