Change Detection Based on Deep Siamese Convolutional Network for Optical Aerial Images

doi:10.1109/LGRS.2017.2738149

Journal Article•DOI•

Yang Zhan¹, Kun Fu¹, Menglong Yan¹, Xian Sun¹, Hongqi Wang¹, Xiaosong Qiu¹•Institutions (1)

Chinese Academy of Sciences¹

30 Aug 2017-IEEE Geoscience and Remote Sensing Letters (IEEE)-Vol. 14, Iss: 10, pp 1845-1849

TL;DR: A novel supervised change detection method based on a deep siamese convolutional network for optical aerial images; its results are comparable to, and in some cases better than, those of two state-of-the-art methods in terms of F-measure.

Abstract: In this letter, we propose a novel supervised change detection method based on a deep siamese convolutional network for optical aerial images. We train a siamese convolutional network using the weighted contrastive loss. The novelty of the method is that the siamese network is trained to extract features directly from the image pairs. Compared with the hand-crafted features used by conventional change detection methods, the extracted features are more abstract and robust. Furthermore, because of the advantage of the weighted contrastive loss function, the features have a unique property: the feature vectors of a changed pixel pair are far away from each other, while those of an unchanged pixel pair are close. Therefore, we use the distance between the feature vectors to detect changes between the image pair. Even simple threshold segmentation on the distance map can obtain good performance. For improvement, we use a $k$-nearest neighbor approach to update the initial result. Experimental results show that the proposed method produces results comparable to, and in some cases better than, those of two state-of-the-art methods in terms of F-measure.
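
The thresholding step described in the abstract can be illustrated with a minimal sketch. It assumes per-pixel feature vectors have already been extracted by the two network branches; the function names, the toy feature values, and the threshold of 1.0 are hypothetical and not taken from the letter, and the $k$-nearest neighbor refinement is omitted because its details are not given here.

```python
import math

def distance_map(feat_a, feat_b):
    """Per-pixel Euclidean distance between two H x W grids of feature vectors."""
    return [
        [math.dist(va, vb) for va, vb in zip(row_a, row_b)]
        for row_a, row_b in zip(feat_a, feat_b)
    ]

def threshold_map(dist, threshold):
    """Binary change map: 1 (changed) where the distance exceeds the threshold."""
    return [[1 if d > threshold else 0 for d in row] for row in dist]

def f_measure(pred, truth):
    """F-measure (harmonic mean of precision and recall) for the changed class."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            tp += int(p == 1 and t == 1)
            fp += int(p == 1 and t == 0)
            fn += int(p == 0 and t == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy 2 x 2 image pair with 2-D feature vectors (made-up values).
feat_t1 = [[(0.0, 0.0), (1.0, 1.0)], [(0.5, 0.5), (0.0, 1.0)]]
feat_t2 = [[(0.1, 0.0), (4.0, 5.0)], [(0.5, 0.4), (3.0, 5.0)]]
dist = distance_map(feat_t1, feat_t2)
change = threshold_map(dist, threshold=1.0)  # hypothetical threshold
```

In this toy pair the right-hand column moves far in feature space and is marked as changed, while the left-hand column stays close and is marked as unchanged.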

Citations

Journal Article•DOI•

A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection

Hao Chen, Zhenwei Shi

22 May 2020-Remote Sensing

TL;DR: This work proposes a novel Siamese-based spatial–temporal attention neural network, which improves the F1-score of the baseline model from 83.9 to 87.3 with acceptable computational overhead and introduces a CD dataset LEVIR-CD, which is two orders of magnitude larger than other public datasets of this field.

Abstract: Remote sensing image change detection (CD) is done to identify desired significant changes between bitemporal images. Given two co-registered images taken at different times, the illumination variations and misregistration errors overwhelm the real object changes. Exploring the relationships among different spatial–temporal pixels may improve the performances of CD methods. In our work, we propose a novel Siamese-based spatial–temporal attention neural network. In contrast to previous methods that separately encode the bitemporal images without referring to any useful spatial–temporal dependency, we design a CD self-attention mechanism to model the spatial–temporal relationships. We integrate a new CD self-attention module in the procedure of feature extraction. Our self-attention module calculates the attention weights between any two pixels at different times and positions and uses them to generate more discriminative features. Considering that the object may have different scales, we partition the image into multi-scale subregions and introduce the self-attention in each subregion. In this way, we could capture spatial–temporal dependencies at various scales, thereby generating better representations to accommodate objects of various sizes. We also introduce a CD dataset LEVIR-CD, which is two orders of magnitude larger than other public datasets of this field. LEVIR-CD consists of a large set of bitemporal Google Earth images, with 637 image pairs (1024 × 1024) and over 31 k independently labeled change instances. Our proposed attention module improves the F1-score of our baseline model from 83.9 to 87.3 with acceptable computational overhead. Experimental results on a public remote sensing image CD dataset show our method outperforms several other state-of-the-art methods.
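
The idea of computing attention weights between any two pixels at different times and positions can be sketched as follows. This is only an illustration under stated assumptions, not the authors' module: queries, keys, and values are the raw features (no learned projections), the dot products are scaled by the square root of the feature dimension as in standard scaled dot-product attention, and the multi-scale subregion partitioning is omitted. All names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def spatial_temporal_attention(feat_t1, feat_t2):
    """Self-attention over the pixels of both dates at once.

    feat_t1, feat_t2: lists of per-pixel feature vectors (flattened H*W).
    Returns the attended features, split back into the two dates.
    """
    tokens = feat_t1 + feat_t2          # every pixel of both dates is a token
    dim = len(tokens[0])
    scale = math.sqrt(dim)              # assumption: scaled dot-product scores
    attended = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)       # attention over all times and positions
        attended.append(
            [sum(w * tok[c] for w, tok in zip(weights, tokens)) for c in range(dim)]
        )
    n = len(feat_t1)
    return attended[:n], attended[n:]

# Toy example: two pixels per date, 2-D features (made-up values).
t1 = [[1.0, 0.0], [0.0, 1.0]]
t2 = [[1.0, 0.0], [0.0, 3.0]]
out_t1, out_t2 = spatial_temporal_attention(t1, t2)
```

Each output vector is a convex combination of all input features from both dates, which is how information from one date can influence the representation of the other.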

552 citations

Cites background or methods from "Change Detection Based on Deep Siam..."

...Remote sensing image CD requires pixel-wise prediction and benefits from the dense features by FCN based methods [46,47]....
[...]
...The embedding space can be learned by deep Siamese fully convolutional networks (FCN) [27,28], which contains two identical networks sharing the same weight, each independently generating the feature maps for each temporal image....
[...]
...During the last few years, deep metric learning has been applied in many remote sensing applications [27,28,60,61]....
[...]
...The first approach used an FCN to separately classify the land use of each temporal image and then determined the change type by the change trajectory....
[...]
...The results of DSCNN, rRL and TBSRL are reported by [28]....
[...]

Proceedings Article•DOI•

Fully Convolutional Siamese Networks for Change Detection

Rodrigo Caye Daudt¹, Bertrand Le Saux¹, Alexandre Boulch¹•Institutions (1)

Université Paris-Saclay¹

05 Oct 2018

TL;DR: This paper presents three fully convolutional neural network architectures which perform change detection using a pair of coregistered images, and proposes two Siamese extensions of fully convolutional networks which use heuristics about the current problem to achieve the best results.

Abstract: This paper presents three fully convolutional neural network architectures which perform change detection using a pair of coregistered images. Most notably, we propose two Siamese extensions of fully convolutional networks which use heuristics about the current problem to achieve the best results in our tests on two open change detection datasets, using both RGB and multispectral images. We show that our system is able to learn from scratch using annotated change detection images. Our architectures achieve better performance than previously proposed methods, while being at least 500 times faster than related systems. This work is a step towards efficient processing of data from large scale Earth observation systems such as Copernicus or Landsat.

484 citations

Cites methods or result from "Change Detection Based on Deep Siam..."

...Comparison between the results obtained by the method presented in [11] and the ones described in this paper on the Air Change dataset....
[...]
...For the AC dataset, the methods used for comparison were DSCN [11], CXM [4], and SCCN [8], using the values claimed by Zhan et al. in [11]....
[...]
...The proposed techniques have followed the tendencies of computer vision and image analysis: at first, pixels were analyzed directly using manually crafted techniques; later on, descriptors began to be used in conjunction with simple machine learning techniques [6]; recently, more elaborate machine learning techniques (deep learning) are dominating most problems in the image analysis field, and this evolution is slowly reaching the problem of change detection [7, 8, 9, 10, 11, 3, 12]....
[...]
...For the AC dataset, we followed the data split that was proposed in [11]: the top left 748x448 rectangle of the...
[...]

Journal Article•DOI•

End-to-End Change Detection for High Resolution Satellite Images Using Improved UNet++

Daifeng Peng, Yongjun Zhang, Haiyan Guan

10 Jun 2019-Remote Sensing

TL;DR: A novel end-to-end CD method based on an effective encoder-decoder architecture for semantic segmentation named UNet++, where change maps can be learned from scratch using available annotated datasets, and which outperforms the other state-of-the-art CD methods.

Abstract: Change detection (CD) is essential to the accurate understanding of land surface changes using available Earth observation data. Due to the great advantages in deep feature representation and nonlinear problem modeling, deep learning is becoming increasingly popular to solve CD tasks in remote-sensing community. However, most existing deep learning-based CD methods are implemented by either generating difference images using deep features or learning change relations between pixel patches, which leads to error accumulation problems since many intermediate processing steps are needed to obtain final change maps. To address the above-mentioned issues, a novel end-to-end CD method is proposed based on an effective encoder-decoder architecture for semantic segmentation named UNet++, where change maps could be learned from scratch using available annotated datasets. Firstly, co-registered image pairs are concatenated as an input for the improved UNet++ network, where both global and fine-grained information can be utilized to generate feature maps with high spatial accuracy. Then, the fusion strategy of multiple side outputs is adopted to combine change maps from different semantic levels, thereby generating a final change map with high accuracy. The effectiveness and reliability of our proposed CD method are verified on very-high-resolution (VHR) satellite image datasets. Extensive experimental results have shown that our proposed approach outperforms the other state-of-the-art CD methods.

408 citations

Journal Article•DOI•

DASNet: Dual Attentive Fully Convolutional Siamese Networks for Change Detection in High-Resolution Satellite Images

Jie Chen¹, Yuan Ziyang¹, Jian Peng¹, Li Chen¹, Haozhe Huang¹, Jiawei Zhu¹, Yu Liu², Haifeng Li¹•Institutions (2)

Central South University¹, National University of Defense Technology²

01 Jan 2021-IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

TL;DR: A weighted double-margin contrastive loss is proposed to address sample imbalance in change detection, i.e., unchanged samples are much more abundant than changed samples, which is one of the main reasons for pseudochanges.

Abstract: Change detection is a basic task of remote sensing image processing. The research objective is to identify the change information of interest and filter out the irrelevant change information as interference factors. Recently, the rise in deep learning has provided new tools for change detection, which have yielded impressive results. However, the available methods focus mainly on the difference information between multitemporal remote sensing images and lack robustness to pseudochange information. To overcome the lack of resistance in current methods to pseudochanges, in this article, we propose a new method, namely, dual attentive fully convolutional Siamese networks, for change detection in high-resolution images. Through the dual attention mechanism, long-range dependencies are captured to obtain more discriminant feature representations to enhance the recognition performance of the model. Moreover, sample imbalance is a serious problem in change detection, i.e., unchanged samples are much more abundant than changed samples, which is one of the main reasons for pseudochanges. We propose the weighted double-margin contrastive loss to address this problem by punishing attention to unchanged feature pairs and increasing attention to changed feature pairs. The experimental results of our method on the change detection dataset and the building change detection dataset demonstrate that compared with other baseline methods, the proposed method realizes maximum improvements of 2.9% and 4.2%, respectively, in the F1 score. Our PyTorch implementation is available at https://github.com/lehaifeng/DASNet.
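
A double-margin contrastive loss with class weights might look like the sketch below. The abstract does not give the formula, so the exact form here is an assumption: unchanged pairs are penalized only when their distance exceeds a small margin, changed pairs only when their distance falls below a larger margin, and each term carries its own weight. The margin and weight values and all names are hypothetical; the DASNet repository linked above is the authoritative implementation.

```python
def double_margin_contrastive_loss(
    distances,
    labels,
    m_unchanged=0.3,   # hypothetical margin for unchanged pairs
    m_changed=2.2,     # hypothetical margin for changed pairs
    w_unchanged=1.0,   # hypothetical class weights
    w_changed=1.0,
):
    """Mean weighted double-margin contrastive loss over pixel pairs.

    distances: feature-space distances, one per pixel pair.
    labels: 0 for unchanged pairs, 1 for changed pairs.
    """
    total = 0.0
    for d, y in zip(distances, labels):
        if y == 0:
            # Unchanged pair: no penalty while the distance stays within m_unchanged.
            total += w_unchanged * max(d - m_unchanged, 0.0) ** 2
        else:
            # Changed pair: no penalty once the distance exceeds m_changed.
            total += w_changed * max(m_changed - d, 0.0) ** 2
    return total / len(distances)

# Toy distances: two unchanged pairs followed by two changed pairs.
loss = double_margin_contrastive_loss([0.1, 1.0, 3.0, 1.2], [0, 0, 1, 1])
```

Raising `w_changed` relative to `w_unchanged` is one way to compensate for changed pixels being the minority class.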

324 citations

Cites methods from "Change Detection Based on Deep Siam..."

...SCCN [28] uses a deep symmetrical network to study changes in remote sensing images, and DSCN [29] uses two branch networks that share weights for feature extraction and uses the features that are obtained by the last layer of the two branches for threshold segmentation to obtain a binary change map....
[...]

Journal Article•DOI•

Unsupervised Deep Change Vector Analysis for Multiple-Change Detection in VHR Images

Sudipan Saha¹, Francesca Bovolo¹, Lorenzo Bruzzone²•Institutions (2)

Fondazione Bruno Kessler¹, University of Trento²

10 Jan 2019-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: A novel unsupervised context-sensitive framework, deep change vector analysis (DCVA), is proposed for CD in multitemporal VHR images; it exploits convolutional neural network (CNN) features, and experimental results on multitemporal data sets of Worldview-2, Pleiades, and Quickbird images confirm the effectiveness of the proposed method.

Abstract: Change detection (CD) in multitemporal images is an important application of remote sensing. Recent technological evolution provided very high spatial resolution (VHR) multitemporal optical satellite images showing high spatial correlation among pixels and requiring an effective modeling of spatial context to accurately capture change information. Here, we propose a novel unsupervised context-sensitive framework, deep change vector analysis (DCVA), for CD in multitemporal VHR images that exploits convolutional neural network (CNN) features. To have an unsupervised system, DCVA starts from a suboptimal pretrained multilayered CNN for obtaining deep features that can model spatial relationship among neighboring pixels and thus complex objects. An automatic feature selection strategy is employed layerwise to select features emphasizing both high and low prior probability change information. Selected features from multiple layers are combined into a deep feature hypervector providing a multiscale scene representation. The use of the same pretrained CNN for semantic segmentation of single images enables us to obtain coherent multitemporal deep feature hypervectors that can be compared pixelwise to obtain deep change vectors that also model spatial context information. Deep change vectors are analyzed based on their magnitude to identify changed pixels. Then, deep change vectors corresponding to identified changed pixels are binarized to obtain compressed binary deep change vectors that preserve information about the direction (kind) of change. Changed pixels are analyzed for multiple CD based on the binary features, thus implicitly using the spatial information. Experimental results on multitemporal data sets of Worldview-2, Pleiades, and Quickbird images confirm the effectiveness of the proposed method.
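
The magnitude and binarization steps can be sketched roughly as follows. The sketch assumes the deep feature hypervectors are already available per pixel, uses a fixed hypothetical magnitude threshold, and binarizes each change vector by the sign of its components; the abstract does not specify the thresholding or binarization rule, so both are assumptions, as are all names.

```python
import math
from collections import defaultdict

def deep_change_vectors(hyper_t1, hyper_t2):
    """Pixelwise difference of the two dates' feature hypervectors."""
    return [[b - a for a, b in zip(v1, v2)] for v1, v2 in zip(hyper_t1, hyper_t2)]

def magnitude(vector):
    """Euclidean norm of a change vector."""
    return math.sqrt(sum(c * c for c in vector))

def binarize(vector):
    """Assumed rule: 1 where a component increased, 0 otherwise."""
    return tuple(1 if c > 0 else 0 for c in vector)

def detect_multiple_changes(hyper_t1, hyper_t2, threshold):
    """Group changed pixel indices by their binary change code."""
    groups = defaultdict(list)
    for index, vec in enumerate(deep_change_vectors(hyper_t1, hyper_t2)):
        if magnitude(vec) > threshold:          # pixel counts as changed
            groups[binarize(vec)].append(index) # code encodes the kind of change
    return dict(groups)

# Toy hypervectors for three pixels (made-up values).
hyper_t1 = [[1.0, 1.0, 1.0], [2.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
hyper_t2 = [[1.0, 1.0, 1.1], [5.0, -4.0, 1.0], [-3.0, 0.0, 4.0]]
kinds = detect_multiple_changes(hyper_t1, hyper_t2, threshold=1.0)
```

Here the first pixel is treated as unchanged, while the other two have the same magnitude but different binary codes, so they fall into two different kinds of change.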

310 citations

Cites methods from "Change Detection Based on Deep Siam..."

...[34] proposed a supervised CD method for optical aerial images based on the deep Siamese network....
[...]

References

Posted Content•

Caffe: Convolutional Architecture for Fast Feature Embedding

Yangqing Jia¹, Evan Shelhamer², Jeff Donahue², Sergey Karayev², Jonathan Long², Ross Girshick², Sergio Guadarrama², Trevor Darrell²•Institutions (2)

Google¹, University of California, Berkeley²

20 Jun 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: Caffe is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

Abstract: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models. The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures. Caffe fits industry and internet-scale media needs by CUDA GPU computation, processing over 40 million images a day on a single K40 or Titan GPU ($\approx$ 2.5 ms per image). By separating model representation from actual implementation, Caffe allows experimentation and seamless switching among platforms for ease of development and deployment from prototyping machines to cloud environments. Caffe is maintained and developed by the Berkeley Vision and Learning Center (BVLC) with the help of an active community of contributors on GitHub. It powers ongoing research projects, large-scale industrial applications, and startup prototypes in vision, speech, and multimedia.

12,531 citations

Posted Content•

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

Kaiming He¹, Xiangyu Zhang², Shaoqing Ren¹, Jian Sun¹•Institutions (2)

Microsoft¹, Xi'an Jiaotong University²

06 Feb 2015-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.

Abstract: Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit. PReLU improves model fitting with nearly zero extra computational cost and little overfitting risk. Second, we derive a robust initialization method that particularly considers the rectifier nonlinearities. This method enables us to train extremely deep rectified models directly from scratch and to investigate deeper or wider network architectures. Based on our PReLU networks (PReLU-nets), we achieve 4.94% top-5 test error on the ImageNet 2012 classification dataset. This is a 26% relative improvement over the ILSVRC 2014 winner (GoogLeNet, 6.66%). To our knowledge, our result is the first to surpass human-level performance (5.1%, Russakovsky et al.) on this visual recognition challenge.
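
The two contributions named in the abstract can be sketched in a few lines. PReLU passes positive inputs unchanged and scales negative inputs by a slope `a` (learnable in the paper, fixed here for illustration). The initialization draws weights from a zero-mean Gaussian with standard deviation sqrt(2 / ((1 + a^2) * fan_in)), which reduces to sqrt(2 / fan_in) for plain ReLU. The function names and the seeded generator are hypothetical conveniences, not part of the paper.

```python
import math
import random

def prelu(x, a):
    """Parametric ReLU: identity for positive inputs, slope `a` for negative ones."""
    return x if x > 0 else a * x

def he_std(fan_in, a=0.0):
    """Standard deviation of the rectifier-aware initialization.

    fan_in: number of inputs per output unit (k * k * channels for a conv layer).
    a: negative slope of the activation (0.0 corresponds to plain ReLU).
    """
    return math.sqrt(2.0 / ((1.0 + a * a) * fan_in))

def he_init(fan_in, fan_out, a=0.0, seed=0):
    """Draw a fan_out x fan_in weight matrix from N(0, he_std^2)."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    std = he_std(fan_in, a)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]

weights = he_init(fan_in=8, fan_out=4)
```

With `fan_in=8` and a plain ReLU, the standard deviation is sqrt(2 / 8) = 0.5.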

11,866 citations

"Change Detection Based on Deep Siam..." refers methods in this paper

...The weights of each convolutional layer are initialized with the Msra algorithm [15]....
[...]

Proceedings Article•DOI•

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

Kaiming He¹, Xiangyu Zhang², Shaoqing Ren¹, Jian Sun¹•Institutions (2)

Microsoft¹, Xi'an Jiaotong University²

07 Dec 2015

TL;DR: In this paper, a Parametric Rectified Linear Unit (PReLU) was proposed to improve model fitting with nearly zero extra computational cost and little overfitting risk, which achieved a 4.94% top-5 test error on ImageNet 2012 classification dataset.

Abstract: Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit. PReLU improves model fitting with nearly zero extra computational cost and little overfitting risk. Second, we derive a robust initialization method that particularly considers the rectifier nonlinearities. This method enables us to train extremely deep rectified models directly from scratch and to investigate deeper or wider network architectures. Based on the learnable activation and advanced initialization, we achieve 4.94% top-5 test error on the ImageNet 2012 classification dataset. This is a 26% relative improvement over the ILSVRC 2014 winner (GoogLeNet, 6.66% [33]). To our knowledge, our result is the first to surpass the reported human-level performance (5.1%, [26]) on this dataset.

11,732 citations

Proceedings Article•DOI•

Caffe: Convolutional Architecture for Fast Feature Embedding

Yangqing Jia¹, Evan Shelhamer², Jeff Donahue², Sergey Karayev², Jonathan Long², Ross Girshick², Sergio Guadarrama², Trevor Darrell²•Institutions (2)

Google¹, University of California, Berkeley²

03 Nov 2014

TL;DR: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

Abstract: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models. The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures. Caffe fits industry and internet-scale media needs by CUDA GPU computation, processing over 40 million images a day on a single K40 or Titan GPU (approx 2 ms per image). By separating model representation from actual implementation, Caffe allows experimentation and seamless switching among platforms for ease of development and deployment from prototyping machines to cloud environments. Caffe is maintained and developed by the Berkeley Vision and Learning Center (BVLC) with the help of an active community of contributors on GitHub. It powers ongoing research projects, large-scale industrial applications, and startup prototypes in vision, speech, and multimedia.

10,161 citations

"Change Detection Based on Deep Siam..." refers methods in this paper

...3) Optimization: We implement the proposed siamese network using the Caffe [14] framework....
[...]

Proceedings Article•DOI•

Dimensionality Reduction by Learning an Invariant Mapping

Raia Hadsell¹, Sumit Chopra¹, Yann LeCun¹•Institutions (1)

New York University¹

17 Jun 2006

TL;DR: This work presents a method - called Dimensionality Reduction by Learning an Invariant Mapping (DrLIM) - for learning a globally coherent nonlinear function that maps the data evenly to the output manifold.

Abstract: Dimensionality reduction involves mapping a set of high dimensional input points onto a low dimensional manifold so that "similar" points in input space are mapped to nearby points on the manifold. We present a method - called Dimensionality Reduction by Learning an Invariant Mapping (DrLIM) - for learning a globally coherent nonlinear function that maps the data evenly to the output manifold. The learning relies solely on neighborhood relationships and does not require any distance measure in the input space. The method can learn mappings that are invariant to certain transformations of the inputs, as is demonstrated with a number of experiments. Comparisons are made to other techniques, in particular LLE.

4,524 citations

"Change Detection Based on Deep Siam..." refers background or methods in this paper

...As [10] presented, the contrastive loss can produce the abovementioned function when it gets a minimum value....
[...]
..., the numbers of changed and unchanged pixels vary greatly) in change detection, we use a weighted contrastive loss [10], in which not only the unchanged pixels but also the changed ones are considered as the objective function when training the network....
[...]
...In [10], $L_U$ and $L_C$ are defined as follows:...
[...]
...$L_U$ and $L_C$ must be designed, such that $D_{i,j}$ would produce a low value for a pair of unchanged pixels and a high value for the changed pixel pair when $L$ gets the minimum value [10]....
[...]
...Define $D_w(X_1, X_2)_{i,j}$ be the Euclidean distance between the feature vector $G_w(X_1)_{i,j}$ and $G_w(X_2)_{i,j}$ [10]....
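
The snippets above describe a loss built from an unchanged-pair term and a changed-pair term over the per-pixel distance. A minimal sketch follows, assuming the standard contrastive form from [10]: the unchanged term is half the squared distance, and the changed term is half the squared shortfall of the distance below a margin. The per-class weights, the margin value, and the function name are hypothetical, since the exact weighting scheme is not quoted here.

```python
def weighted_contrastive_loss(
    distances,
    labels,
    margin=2.0,        # hypothetical margin m
    w_unchanged=1.0,   # hypothetical weight on unchanged pairs
    w_changed=1.0,     # hypothetical weight on changed pairs
):
    """Sum of weighted contrastive terms over all pixel pairs.

    distances: Euclidean distances between paired feature vectors.
    labels: 0 for unchanged pixel pairs, 1 for changed pixel pairs.
    """
    total = 0.0
    for d, y in zip(distances, labels):
        l_unchanged = 0.5 * d * d                        # pulls unchanged pairs together
        l_changed = 0.5 * max(0.0, margin - d) ** 2      # pushes changed pairs beyond margin
        total += w_unchanged * (1 - y) * l_unchanged + w_changed * y * l_changed
    return total

# Toy distances: two unchanged pairs followed by two changed pairs.
loss = weighted_contrastive_loss([0.0, 1.0, 3.0, 1.0], [0, 0, 1, 1])
```

The loss reaches its minimum of zero when every unchanged pair has distance 0 and every changed pair has distance of at least the margin, which matches the low-for-unchanged, high-for-changed behavior described in the snippets.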
[...]