Author

Qingjie Liu

Bio: Qingjie Liu is an academic researcher from Beihang University. The author has contributed to research on topics including convolutional neural networks and computer science. The author has an h-index of 16 and has co-authored 72 publications receiving 1,503 citations.


Papers
Journal ArticleDOI
TL;DR: A semantic segmentation neural network that combines the strengths of residual learning and U-Net is proposed for road area extraction; it outperforms all compared methods, demonstrating its advantage over recently developed state-of-the-art approaches.
Abstract: Road extraction from aerial images has been a hot research topic in the field of remote sensing image analysis. In this letter, a semantic segmentation neural network that combines the strengths of residual learning and U-Net is proposed for road area extraction. The network is built with residual units and has an architecture similar to that of U-Net. The benefits of this model are twofold: first, residual units ease the training of deep networks; second, the rich skip connections within the network facilitate information propagation, allowing us to design networks with fewer parameters yet better performance. We test our network on a public road data set and compare it with U-Net and two other state-of-the-art deep-learning-based road extraction methods. The proposed approach outperforms all compared methods, demonstrating its superiority over recently developed state-of-the-art approaches.

1,564 citations
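To make the architecture above concrete, here is a minimal PyTorch sketch of a pre-activation residual unit and a decoder stage with a U-Net-style skip connection. Channel widths, block depth, and the upsampling method are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of a residual unit inside a U-Net-style encoder-decoder, in the
# spirit of the abstract above. Hyperparameters are illustrative only.
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    """Pre-activation residual block: (BN -> ReLU -> Conv) twice,
    plus an identity (or 1x1-projected) shortcut."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
        )
        # Project the shortcut only when the shape changes.
        self.skip = (nn.Identity() if in_ch == out_ch and stride == 1
                     else nn.Conv2d(in_ch, out_ch, 1, stride=stride))

    def forward(self, x):
        return self.body(x) + self.skip(x)

class DecoderStage(nn.Module):
    """Upsample, concatenate the matching encoder feature (the U-Net
    skip connection), then refine with a residual unit."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode='bilinear',
                              align_corners=False)
        self.res = ResidualUnit(in_ch + skip_ch, out_ch)

    def forward(self, x, skip):
        x = self.up(x)
        return self.res(torch.cat([x, skip], dim=1))
```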

Journal ArticleDOI
TL;DR: This paper proposes a novel deep feature-based method to detect ships in very high-resolution optical remote sensing images, using a region proposal network to generate ship candidates from feature maps produced by a deep convolutional neural network.
Abstract: Ship detection is an important and challenging task in remote sensing applications. Most methods rely on specially designed hand-crafted features to detect ships and usually work well on only one scale, which limits their generalization and makes them impractical for identifying ships of various scales in multiresolution images. In this paper, we propose a novel deep feature-based method to detect ships in very high-resolution optical remote sensing images. In our method, a region proposal network is used to generate ship candidates from feature maps produced by a deep convolutional neural network. To efficiently detect ships of various scales, a hierarchical selective filtering layer is proposed to map features of different scales to the same scale space. The proposed method is an end-to-end network that can detect both inshore and offshore ships ranging from dozens to thousands of pixels. We test our network on a large ship data set, to be released in the future, consisting of Google Earth images, GaoFen-2 images, and unmanned aerial vehicle data. Experiments demonstrate the high precision and robustness of our method. Further experiments on aerial images show its good generalization to unseen scenes.

152 citations
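The hierarchical selective filtering layer is described only at a high level here, so the following PyTorch sketch is one plausible reading under stated assumptions: per-level 1x1 projections to a common channel width, resizing to a shared spatial size, and simple additive fusion. The class name and the fusion rule are illustrative, not the paper's.

```python
# Hedged sketch of mapping feature maps from several pyramid levels to
# a common scale space so one detection head can consume them. The real
# layer's filter sizes and fusion rule may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleUnifier(nn.Module):
    def __init__(self, in_channels_per_level, out_ch=256):
        super().__init__()
        # One 1x1 projection per pyramid level.
        self.proj = nn.ModuleList(
            nn.Conv2d(c, out_ch, kernel_size=1)
            for c in in_channels_per_level
        )

    def forward(self, feature_maps, target_size):
        # Map every level to out_ch channels and the same H x W,
        # then fuse by summation (one simple choice among several).
        unified = [
            F.interpolate(p(f), size=target_size, mode='bilinear',
                          align_corners=False)
            for p, f in zip(self.proj, feature_maps)
        ]
        return torch.stack(unified).sum(dim=0)
```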

Journal ArticleDOI
TL;DR: Experiments demonstrate that the proposed Two-stream Fusion Network (TFNet) can fuse PAN and MS images effectively and produce pan-sharpened images that are competitive with, and even superior to, the state of the art.
Abstract: Remote sensing image fusion (also known as pan-sharpening) aims at generating a high resolution multi-spectral (MS) image from a high spatial resolution single-band panchromatic (PAN) image and a low spatial resolution multi-spectral image. Inspired by the astounding achievements of convolutional neural networks (CNNs) in a variety of computer vision tasks, in this paper we propose a Two-stream Fusion Network (TFNet) to address the problem of pan-sharpening. Unlike many previous CNN-based methods that treat pan-sharpening as a super-resolution problem and perform it by mapping the stacked PAN and MS images to the target high resolution MS image, the proposed TFNet fuses PAN and MS images in the feature domain and reconstructs the pan-sharpened image from the fused features. The TFNet consists of three parts. The first comprises two networks extracting features from the PAN and MS images, respectively. The subsequent network fuses them into compact features that simultaneously represent the spatial and spectral information of the PAN and MS images. Finally, the desired high spatial resolution MS image is recovered from the fused features through an image reconstruction network. Experiments on Quickbird and GaoFen-1 images demonstrate that the proposed TFNet can fuse PAN and MS images effectively and produce pan-sharpened images that are competitive with, and even superior to, the state of the art.

140 citations
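As a rough sketch of the two-stream design described above (separate PAN and MS encoders, feature-domain fusion, and a reconstruction network), here is a minimal PyTorch version. All layer widths and depths are assumptions for illustration; the actual TFNet differs in depth and detail.

```python
# Illustrative two-stream fusion: encode PAN and MS separately, fuse in
# the feature domain, then reconstruct the pan-sharpened MS image.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         nn.ReLU(inplace=True))

class TwoStreamFusionSketch(nn.Module):
    def __init__(self, ms_bands=4):
        super().__init__()
        self.pan_encoder = conv_block(1, 32)        # PAN: single band
        self.ms_encoder = conv_block(ms_bands, 32)  # MS: upsampled to PAN size
        self.fuse = conv_block(64, 64)              # fusion in feature domain
        self.reconstruct = nn.Conv2d(64, ms_bands, 3, padding=1)

    def forward(self, pan, ms_up):
        # pan: (B, 1, H, W); ms_up: (B, bands, H, W), i.e. the MS image
        # upsampled to the PAN resolution before entering the network.
        f = torch.cat([self.pan_encoder(pan), self.ms_encoder(ms_up)], dim=1)
        return self.reconstruct(self.fuse(f))
```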

Book ChapterDOI
05 Feb 2018
TL;DR: Experiments on Quickbird and GaoFen-1 satellite images demonstrate that the proposed TFNet can fuse PAN and MS images effectively and produce pan-sharpened images that are competitive with, and even superior to, the state of the art.
Abstract: Remote sensing image fusion (or pan-sharpening) aims at generating a high resolution multi-spectral (MS) image from a high spatial resolution single-band panchromatic (PAN) image and a low spatial resolution multi-spectral image. In this paper, a deep convolutional neural network with two-stream inputs, one for the PAN image and one for the MS image, is proposed for remote sensing image pan-sharpening. First, the network extracts features from the PAN and MS images; it then fuses them into compact feature maps that simultaneously represent the spatial and spectral information of both inputs. Finally, the desired high spatial resolution MS image is recovered from the fused features using an encoding-decoding scheme. Experiments on Quickbird satellite images demonstrate that the proposed method can fuse PAN and MS images effectively.

114 citations
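For a concrete sense of the input/output contract this conference version shares with the journal TFNet above, here is a hypothetical usage of the TwoStreamFusionSketch class from the previous sketch. The tensor shapes assume the MS image has been upsampled to the PAN resolution beforehand, as both abstracts describe.

```python
# Hypothetical usage of the TwoStreamFusionSketch defined earlier.
import torch

model = TwoStreamFusionSketch(ms_bands=4)
pan = torch.randn(1, 1, 256, 256)    # high-resolution panchromatic band
ms_up = torch.randn(1, 4, 256, 256)  # multi-spectral, upsampled to PAN size
fused = model(pan, ms_up)            # pan-sharpened MS estimate
print(fused.shape)                   # torch.Size([1, 4, 256, 256])
```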

Posted Content
Guangshuai Gao, Junyu Gao, Qingjie Liu, Qi Wang, Yunhong Wang
TL;DR: Over 220 works are surveyed to comprehensively and systematically study crowd counting models, mainly CNN-based density map estimation methods, in order to make reasonable inferences and predictions about the future development of crowd counting and to provide feasible solutions for object counting problems in other fields.
Abstract: Accurately estimating the number of objects in a single image is a challenging yet meaningful task that has been applied in many applications such as urban planning and public safety. Among the various object counting tasks, crowd counting is particularly prominent due to its significance for public security and social development. Fortunately, techniques developed for crowd counting can be generalized to related fields such as vehicle counting and environmental surveys, provided field-specific characteristics are set aside. Many researchers have therefore devoted themselves to crowd counting, and a large body of excellent work has emerged. These works have advanced the development of crowd counting, but the question worth considering is why they are effective for this task. Limited by time and effort, we cannot analyze every algorithm. In this paper, we survey over 220 works to comprehensively and systematically study crowd counting models, mainly CNN-based density map estimation methods. Finally, according to the evaluation metrics, we select the top three performers on their crowd counting datasets and analyze their merits and drawbacks. Through this analysis, we aim to make reasonable inferences and predictions about the future development of crowd counting and, at the same time, to provide feasible solutions for object counting problems in other fields. We provide the density maps and prediction results of several mainstream algorithms on the validation set of the NWPU dataset for comparison and testing. Density map generation and evaluation tools are also provided. All code and evaluation results are made publicly available at this https URL.

100 citations
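The density-map formulation at the heart of the surveyed methods is easy to illustrate: each annotated head position becomes a normalized Gaussian, so the map integrates to the crowd count, and models are then scored by the error between predicted and ground-truth counts. A minimal sketch, assuming a fixed kernel width (many papers instead use geometry-adaptive kernels):

```python
# Build a ground-truth density map from head annotations. Sigma is a
# fixed illustrative value, not a setting from any particular paper.
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(points, height, width, sigma=4.0):
    """points: iterable of (row, col) head annotations."""
    dm = np.zeros((height, width), dtype=np.float64)
    for r, c in points:
        r, c = int(round(r)), int(round(c))
        if 0 <= r < height and 0 <= c < width:
            dm[r, c] += 1.0
    # Each unit impulse spreads into a Gaussian that still sums to ~1,
    # so dm.sum() stays close to the number of annotated people.
    return gaussian_filter(dm, sigma=sigma)

def mae(pred_counts, gt_counts):
    """Mean absolute error between predicted and ground-truth counts,
    the standard headline metric in crowd counting benchmarks."""
    pred, gt = np.asarray(pred_counts), np.asarray(gt_counts)
    return float(np.mean(np.abs(pred - gt)))
```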


Cited by


Proceedings ArticleDOI
01 Jun 2019
TL;DR: Experimental results on six public datasets show that the proposed predict-refine architecture, BASNet, outperforms state-of-the-art methods in terms of both regional and boundary evaluation measures.
Abstract: Deep convolutional neural networks have been adopted for salient object detection and have achieved state-of-the-art performance. Most previous works, however, focus on region accuracy rather than boundary quality. In this paper, we propose a predict-refine architecture, BASNet, and a new hybrid loss for Boundary-Aware Salient object detection. Specifically, the architecture is composed of a densely supervised encoder-decoder network and a residual refinement module, which are respectively in charge of saliency prediction and saliency map refinement. The hybrid loss guides the network to learn the transformation between the input image and the ground truth in a three-level hierarchy (pixel-, patch-, and map-level) by fusing binary cross-entropy (BCE), structural similarity (SSIM), and intersection-over-union (IoU) losses. Equipped with the hybrid loss, the proposed predict-refine architecture is able to effectively segment the salient object regions and accurately predict the fine structures with clear boundaries. Experimental results on six public datasets show that our method outperforms state-of-the-art methods in terms of both regional and boundary evaluation measures. Our method runs at over 25 fps on a single GPU. The code is available at: https://github.com/NathanUA/BASNet.

962 citations
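The hybrid loss is the most transferable piece of this abstract. Below is a hedged PyTorch sketch of its three terms: pixel-level BCE, patch-level SSIM, and map-level IoU. The SSIM here uses a uniform averaging window as a simplification of the standard Gaussian-windowed formulation, so treat this as an approximation rather than BASNet's exact loss.

```python
# Sketch of a BCE + SSIM + IoU hybrid loss for saliency maps in [0, 1].
import torch
import torch.nn.functional as F

def ssim_loss(pred, target, window=11, c1=0.01**2, c2=0.03**2):
    # Local means/variances via average pooling (uniform window
    # approximation of the usual Gaussian window).
    pad = window // 2
    mu_p = F.avg_pool2d(pred, window, stride=1, padding=pad)
    mu_t = F.avg_pool2d(target, window, stride=1, padding=pad)
    var_p = F.avg_pool2d(pred * pred, window, 1, pad) - mu_p ** 2
    var_t = F.avg_pool2d(target * target, window, 1, pad) - mu_t ** 2
    cov = F.avg_pool2d(pred * target, window, 1, pad) - mu_p * mu_t
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2))
    return 1.0 - ssim.mean()

def iou_loss(pred, target, eps=1e-7):
    inter = (pred * target).sum(dim=(1, 2, 3))
    union = (pred + target - pred * target).sum(dim=(1, 2, 3))
    return (1.0 - (inter + eps) / (union + eps)).mean()

def hybrid_loss(pred, target):
    # pred and target: saliency maps in [0, 1], shape (B, 1, H, W).
    return (F.binary_cross_entropy(pred, target)
            + ssim_loss(pred, target)
            + iou_loss(pred, target))
```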