Journal ArticleDOI

Use of Salient Features for the Design of a Multistage Framework to Extract Roads From High-Resolution Multispectral Satellite Images

TL;DR: Two salient features of roads, namely, distinct spectral contrast and locally linear trajectory, are exploited to design a multistage framework for extracting roads from high-resolution multispectral satellite images; the results are compared against several state-of-the-art methods to validate the superior performance of the approach.
Abstract: The process of road extraction from high-resolution satellite images is complex, and most researchers have shown results on a small, selected set of images. The type of processing varies with the data acquisition sensor and the geolocation of the region, and users tune several heuristic parameters to achieve a reasonable degree of accuracy. We exploit two salient features of roads, namely, distinct spectral contrast and locally linear trajectory, to design a multistage framework to extract roads from high-resolution multispectral satellite images. We trained four Probabilistic Support Vector Machines separately using four different categories of training samples extracted from urban/suburban areas. Dominant Singular Measure is used to detect locally linear edge segments as potential trajectories for roads. This complementary information is integrated using an optimization framework to obtain potential targets for roads. This provides good results only in situations where the roads have few obstacles (trees, large vehicles, and tall buildings). Linking of disjoint segments uses the local gradient functions at the adjacent pair of road endings. Region part segmentation uses curvature information to remove stray nonroad structures. Medial-Axis-Transform-based hypothesis verification eliminates connected nonroad structures to improve the accuracy of road detection. Results are evaluated on a large set of multispectral remotely sensed images and compared against several state-of-the-art methods to validate the superior performance of our proposed method.
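The locally-linear-trajectory cue can be illustrated with a small sketch. The paper's exact Dominant Singular Measure formulation is not reproduced here; the following numpy-only function (its name and the window handling are illustrative assumptions) scores how close the gradients in a local window are to a single straight edge, using the SVD of the stacked gradient vectors:

```python
import numpy as np

def linearity_score(gx, gy):
    # Stack the window's gradient components into an N x 2 matrix
    # and take its singular values (s[0] >= s[1] >= 0).
    G = np.stack([gx.ravel(), gy.ravel()], axis=1)
    s = np.linalg.svd(G, compute_uv=False)
    # Close to 1 when one gradient direction dominates (a straight edge),
    # closer to 0.5 for isotropic / random gradients.
    return s[0] / (s[0] + s[1] + 1e-12)
```

A window covering a single straight edge (constant gradient direction) scores near 1, while a window of random gradients scores much lower, which is the property exploited to flag candidate road trajectories.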
Citations
Journal ArticleDOI
TL;DR: A semantic segmentation neural network that combines the strengths of residual learning and U-Net is proposed for road area extraction; it outperforms all the compared methods, demonstrating its superiority over recently developed state-of-the-art approaches.
Abstract: Road extraction from aerial images has been a hot research topic in the field of remote sensing image analysis. In this letter, a semantic segmentation neural network, which combines the strengths of residual learning and U-Net, is proposed for road area extraction. The network is built with residual units and has an architecture similar to that of U-Net. The benefits of this model are twofold: first, residual units ease the training of deep networks; second, the rich skip connections within the network facilitate information propagation, allowing us to design networks with fewer parameters but better performance. We test our network on a public road data set and compare it with U-Net and two other state-of-the-art deep-learning-based road extraction methods. The proposed approach outperforms all the compared methods, which demonstrates its superiority over recently developed state-of-the-art techniques.
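The identity skip connection that eases training can be shown with a tiny numeric sketch. This is not the letter's actual convolutional residual unit; a plain matrix product stands in for the convolution, and the function name is an illustrative assumption:

```python
import numpy as np

def residual_unit(x, w1, w2):
    # Schematic residual unit: two linear "convolutions" with a ReLU
    # in between, plus the identity skip connection y = F(x) + x.
    h = np.maximum(0.0, x @ w1)   # first layer + ReLU
    f = h @ w2                    # second layer (residual branch F)
    return f + x                  # identity shortcut
```

With both weight matrices at zero the residual branch vanishes and the unit reduces to the identity map, which is the usual intuition for why residual networks are easy to optimize early in training.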

1,564 citations


Cites background from "Use of Salient Features for the Des..."

  • ...[12] exploited two salient features of roads and designed a multistage framework to extract roads...


Journal ArticleDOI
TL;DR: This paper proposes a novel and effective approach to learn a rotation-invariant CNN (RICNN) model for advancing the performance of object detection, achieved by introducing and learning a new rotation-invariant layer on the basis of existing CNN architectures.
Abstract: Object detection in very high resolution optical remote sensing images is a fundamental problem in remote sensing image analysis. Due to advances in powerful feature representations, machine-learning-based object detection is receiving increasing attention. Although numerous feature representations exist, most of them are handcrafted or shallow-learning-based features. As the object detection task becomes more challenging, their descriptive capability becomes limited or even impoverished. More recently, deep learning algorithms, especially convolutional neural networks (CNNs), have shown much stronger feature representation power in computer vision. Despite the progress made on natural scene images, it is problematic to directly use CNN features for object detection in optical remote sensing images because it is difficult to effectively deal with object rotation variations. To address this problem, this paper proposes a novel and effective approach to learn a rotation-invariant CNN (RICNN) model for advancing the performance of object detection, achieved by introducing and learning a new rotation-invariant layer on the basis of existing CNN architectures. Different from the training of traditional CNN models, which optimizes only the multinomial logistic regression objective, our RICNN model is trained by optimizing a new objective function that imposes a regularization constraint, explicitly enforcing the feature representations of training samples before and after rotation to be mapped close to each other, hence achieving rotation invariance. To facilitate training, we first train the rotation-invariant layer and then domain-specifically fine-tune the whole RICNN network to further boost performance. Comprehensive evaluations on a publicly available ten-class object detection data set demonstrate the effectiveness of the proposed method.
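The regularization idea, pushing the features of rotated copies of a sample toward each other, can be sketched as a simple penalty term. This numpy-only mean-deviation form and its name are illustrative assumptions, not the paper's exact objective:

```python
import numpy as np

def rotation_invariance_penalty(feats):
    # feats: (K, D) array of feature vectors for K rotated copies
    # of one training sample. Penalize deviation from their mean,
    # so minimizing the penalty pulls the rotated features together.
    mean = feats.mean(axis=0, keepdims=True)
    return float(np.mean(np.sum((feats - mean) ** 2, axis=1)))
```

The penalty is zero exactly when all rotated copies map to the same feature vector, i.e., when the representation is rotation-invariant; added to the classification loss, it acts as the regularization constraint described above.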

1,370 citations


Cites background from "Use of Salient Features for the Des..."

  • ...can be performed by learning a classifier, such as support vector machine (SVM) [1], [7], [8], [12], [13], [20]–[24], AdaBoost [2]–[5], k-nearest neighbors [15], [17], conditional random field [6], [19], and sparse-coding-based classifier [9]–[11], [14], [16], which captures the variation in object appearances and views from a set of training data in a supervised [2]–[7], [9]–[14], [16]–[21], [23], [24] or semisupervised [15], [22] or weakly supervised framework [1], [8], [25], [51]....


Journal ArticleDOI
TL;DR: This survey focuses on more generic object categories including, but not limited to, road, building, tree, vehicle, ship, airport, and urban area, and proposes two promising research directions, namely deep-learning-based feature representation and weakly supervised learning-based geospatial object detection.
Abstract: Object detection in optical remote sensing images, being a fundamental but challenging problem in the field of aerial and satellite image analysis, plays an important role in a wide range of applications and has received significant attention in recent years. While numerous methods exist, a deep review of the literature concerning generic object detection is still lacking. This paper aims to provide a review of the recent progress in this field. Different from several previously published surveys that focus on a specific object class such as buildings or roads, we concentrate on more generic object categories including, but not limited to, roads, buildings, trees, vehicles, ships, airports, and urban areas. Covering about 270 publications, we survey (1) template matching-based object detection methods, (2) knowledge-based object detection methods, (3) object-based image analysis (OBIA)-based object detection methods, (4) machine learning-based object detection methods, and (5) five publicly available datasets and three standard evaluation metrics. We also discuss the challenges of current studies and propose two promising research directions, namely deep-learning-based feature representation and weakly supervised learning-based geospatial object detection. It is our hope that this survey will help researchers gain a better understanding of this research field.

994 citations


Cites background or methods from "Use of Salient Features for the Des..."

  • ...…1998) and now has been widely used for various object detection applications, such as man-made objects recognition (Inglada, 2007), road extraction (Das et al., 2011; Huang and Zhang, 2009; Song and Civco, 2004), change detection (Bovolo et al., 2008; De Morsier et al., 2013), multi-class object…...


  • ...…different types of objects in satellite and aerial images, such as roads (Barsi and Heipke, 2003; Barzohar and Coope, 1996; Chaudhuri et al., 2012; Das et al., 2011; Hu et al., 2007; Huang and Zhang, 2009; Kim et al., 2004; Laptev et al., 2000; Leninisha and Vani, 2015; Li et al., 2010; Maillard…...


Journal ArticleDOI
TL;DR: A comprehensive review of recent deep-learning-based object detection progress in both the computer vision and earth observation communities is provided, and a large-scale, publicly available benchmark for object DetectIon in Optical Remote sensing images, named DIOR, is proposed.
Abstract: Substantial efforts have been devoted recently to presenting various methods for object detection in optical remote sensing images. However, the current survey of datasets and deep-learning-based methods for object detection in optical remote sensing images is not adequate. Moreover, most of the existing datasets have shortcomings; for example, the numbers of images and object categories are small, and the image diversity and variations are insufficient. These limitations greatly affect the development of deep-learning-based object detection methods. In this paper, we provide a comprehensive review of recent deep-learning-based object detection progress in both the computer vision and earth observation communities. Then, we propose a large-scale, publicly available benchmark for object DetectIon in Optical Remote sensing images, which we name DIOR. The dataset contains 23,463 images and 192,472 instances, covering 20 object classes. The proposed DIOR dataset (1) is large-scale in the number of object categories, object instances, and total images; (2) has a large range of object size variations, not only in terms of spatial resolution but also in inter- and intra-class size variability across objects; (3) exhibits large variations, as the images are obtained under different imaging conditions, weather, seasons, and image quality; and (4) has high inter-class similarity and intra-class diversity. The proposed benchmark can help researchers develop and validate their data-driven methods. Finally, we evaluate several state-of-the-art approaches on our DIOR dataset to establish a baseline for future research.

771 citations


Cites background from "Use of Salient Features for the Des..."

  • ...To date, significant efforts (Cheng and Han, 2016; Cheng et al., 2016a; Das et al., 2011; Han et al., 2015; Li et al., 2018; Razakarivony and Jurie, 2015; Tang et al., 2017b; Xia et al., 2018; Yokoya and Iwasaki, 2015; Zhang et al., 2016; Zhu et al., 2017) have been made for object detection in remote sensing images....


  • ...Driven by this requirement, significant efforts have been made in the past few years to develop a variety of methods for object detection in optical remote sensing images (Aksoy, 2014; Bai et al., 2014; Cheng et al., 2013a; Cheng and Han, 2016; Cheng et al., 2013b; Cheng et al., 2014; Cheng et al., 2019; Cheng et al., 2016a; Das et al., 2011; Han et al., 2015; Han et al., 2014; Li et al., 2018; Long et al., 2017; Tang et al., 2017b; Yang et al., 2017; Zhang et al., 2016; Zhang et al., 2017; Zhou et al., 2016)....


Journal ArticleDOI
TL;DR: A novel deep model, i.e., a cascaded end-to-end convolutional neural network (CasNet), to simultaneously cope with the road detection and centerline extraction tasks and outperforms the state-of-the-art methods greatly in learning quality and learning speed.
Abstract: Accurate road detection and centerline extraction from very high resolution (VHR) remote sensing imagery are of central importance for a wide range of applications. Due to complex backgrounds and occlusions by trees and cars, most road detection methods produce heterogeneous segments; in addition, for the centerline extraction task, most current approaches fail to extract a centerline network that is smooth, complete, and single-pixel wide. To address these complex issues, we propose a novel deep model, i.e., a cascaded end-to-end convolutional neural network (CasNet), to simultaneously cope with the road detection and centerline extraction tasks. Specifically, CasNet consists of two networks. One aims at the road detection task; its strong representation ability is well suited to tackling the complex backgrounds and occlusions of trees and cars. The other is cascaded to the former, making full use of the feature maps produced earlier to obtain good centerline extraction. Finally, a thinning algorithm is proposed to obtain a smooth, complete, and single-pixel-wide road centerline network. Extensive experiments demonstrate that CasNet greatly outperforms the state-of-the-art methods in learning quality and learning speed: CasNet exceeds the competing methods by a large margin in quantitative performance, and it is nearly 25 times faster. Moreover, as another contribution, a large and challenging road centerline data set for VHR remote sensing images will be made publicly available for further studies.
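The final thinning step can be approximated with the classic Zhang–Suen algorithm. The paper proposes its own thinning procedure, so this numpy-only sketch is only an illustration of how a binary road mask is iteratively reduced toward a single-pixel-wide centerline:

```python
import numpy as np

def zhang_suen_thin(img):
    # img: binary 2-D array (0/1). Returns a thinned copy in which
    # boundary pixels are peeled off until only a skeleton remains.
    img = img.astype(np.uint8).copy()
    changed = True
    while changed:
        changed = False
        for step in (0, 1):  # the two Zhang-Suen sub-iterations
            to_del = []
            for r in range(1, img.shape[0] - 1):
                for c in range(1, img.shape[1] - 1):
                    if img[r, c] == 0:
                        continue
                    # Neighbors p2..p9, clockwise from north.
                    p = [img[r-1, c], img[r-1, c+1], img[r, c+1],
                         img[r+1, c+1], img[r+1, c], img[r+1, c-1],
                         img[r, c-1], img[r-1, c-1]]
                    b = sum(p)  # number of foreground neighbors
                    # a: 0 -> 1 transitions around the neighborhood
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    if not (2 <= b <= 6 and a == 1):
                        continue
                    if step == 0 and p[0]*p[2]*p[4] == 0 and p[2]*p[4]*p[6] == 0:
                        to_del.append((r, c))
                    if step == 1 and p[0]*p[2]*p[6] == 0 and p[0]*p[4]*p[6] == 0:
                        to_del.append((r, c))
            for r, c in to_del:  # delete simultaneously after the scan
                img[r, c] = 0
                changed = True
    return img
```

Applied to a thick rectangular "road" mask, the routine strips pixels from both sides until a connected one-pixel-wide line remains, which is the shape a centerline network is expected to have.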

346 citations


Cites background or methods from "Use of Salient Features for the Des..."

  • ...[5], [7]–[9], [32]–[34] and road centerline extraction methods...


  • ...Most road detection approaches [5], [7]–[10] are based on the pixel-level labeling....


  • ...[5] introduced a multistage framework to extract road from the high-resolution multispectral satellite image, in which probabilistic SVM and salient features were used....


References
Journal ArticleDOI
TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated, and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Abstract: The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are nonlinearly mapped to a very high-dimensional feature space. In this feature space, a linear decision surface is constructed. Special properties of the decision surface ensure the high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

37,861 citations


"Use of Salient Features for the Des..." refers background in this paper

  • ...In SVM, the input vectors are mapped nonlinearly to a very high dimensional feature space [66]....


  • ...However, SVM [66] produces an uncalibrated value that is not...


Journal ArticleDOI
TL;DR: There is a natural uncertainty principle between detection and localization performance, which are the two main goals, and with this principle a single operator shape is derived which is optimal at any scale.
Abstract: This paper describes a computational approach to edge detection. The success of the approach depends on the definition of a comprehensive set of goals for the computation of edge points. These goals must be precise enough to delimit the desired behavior of the detector while making minimal assumptions about the form of the solution. We define detection and localization criteria for a class of edges, and present mathematical forms for these criteria as functionals on the operator impulse response. A third criterion is then added to ensure that the detector has only one response to a single edge. We use the criteria in numerical optimization to derive detectors for several common image features, including step edges. On specializing the analysis to step edges, we find that there is a natural uncertainty principle between detection and localization performance, which are the two main goals. With this principle we derive a single operator shape which is optimal at any scale. The optimal detector has a simple approximate implementation in which edges are marked at maxima in gradient magnitude of a Gaussian-smoothed image. We extend this simple detector using operators of several widths to cope with different signal-to-noise ratios in the image. We present a general method, called feature synthesis, for the fine-to-coarse integration of information from operators at different scales. Finally we show that step edge detector performance improves considerably as the operator point spread function is extended along the edge.
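The core idea of the detector described above, marking edges at maxima of the gradient magnitude of a Gaussian-smoothed image, is easy to demonstrate in one dimension. This is a numpy-only sketch with illustrative function names, not the full Canny pipeline:

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    # Normalized 1-D Gaussian of length 2*radius + 1.
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def edge_response(signal, sigma=2.0):
    # Smooth with a Gaussian, then take the gradient magnitude;
    # edges correspond to local maxima of this response.
    k = gaussian_kernel(sigma, radius=int(3 * sigma))
    smoothed = np.convolve(signal, k, mode="same")
    return np.abs(np.gradient(smoothed))
```

For a unit step, the response peaks within one sample of the step location, matching the paper's characterization of the optimal detector's simple approximate implementation.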

28,073 citations

Book
01 Jan 1973

20,541 citations

Journal ArticleDOI
TL;DR: A review of recent as well as classic image registration methods to provide a comprehensive reference source for the researchers involved in image registration, regardless of particular application areas.
Abstract: This paper aims to present a review of recent as well as classic image registration methods. Image registration is the process of overlaying images (two or more) of the same scene taken at different times, from different viewpoints, and/or by different sensors. Registration geometrically aligns two images (the reference and sensed images). The reviewed approaches are classified according to their nature (area-based and feature-based) and according to the four basic steps of the image registration procedure: feature detection, feature matching, mapping function design, and image transformation and resampling. The main contributions, advantages, and drawbacks of the methods are mentioned in the paper. Problematic issues of image registration and an outlook for future research are discussed as well. The major goal of the paper is to provide a comprehensive reference source for researchers involved in image registration, regardless of particular application areas.

6,842 citations

Journal ArticleDOI
TL;DR: In this article, the authors introduce and study the most basic properties of three new variational problems suggested by applications to computer vision.
Abstract: This paper introduces and studies the most basic properties of three new variational problems which are suggested by applications to computer vision. In computer vision, a fundamental problem is to appropriately decompose the domain R of a function g(x,y) of two variables. This problem starts by describing the physical situation which produces images: assume that a three-dimensional world is observed by an eye or camera from some point P, and that g1(rho) represents the intensity of the light in this world approaching the point P from a direction rho. If one has a lens at P focusing this light on a retina or a film (in both cases a plane domain R in which we may introduce coordinates x, y), then let g(x,y) be the strength of the light signal striking R at a point with coordinates (x,y); g(x,y) is essentially the same as g1(rho), possibly after a simple transformation given by the geometry of the imaging system. The function g(x,y) defined on the plane domain R will be called an image. What sort of function is g? The light reflected off the surfaces Si of various solid objects Oi visible from P will strike the domain R in various open subsets Ri. When one object O1 is partially in front of another object O2 as seen from P, but some of object O2 appears as the background to the sides of O1, then the open sets R1 and R2 will have a common boundary (the 'edge' of object O1 in the image defined on R), and one usually expects the image g(x,y) to be discontinuous along this boundary.

5,516 citations


"Use of Salient Features for the Des..." refers methods in this paper

  • ...It is built on the basis of the modified two-phase Mumford–Shah model [42] with the combined feature constraints....
