
Showing papers by "Yulan Guo published in 2018"


Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this article, the authors propose a network architecture that incorporates all steps of stereo matching, including matching cost calculation, matching cost aggregation, disparity calculation, and disparity refinement, and achieves state-of-the-art performance on the KITTI 2012 and KITTI 2015 benchmarks while maintaining a very fast running time.
Abstract: Stereo matching algorithms usually consist of four steps, including matching cost calculation, matching cost aggregation, disparity calculation, and disparity refinement. Existing CNN-based methods only adopt CNN to solve parts of the four steps, or use different networks to deal with different steps, making it difficult to obtain an overall optimal solution. In this paper, we propose a network architecture to incorporate all steps of stereo matching. The network consists of three parts. The first part calculates the multi-scale shared features. The second part performs matching cost calculation, matching cost aggregation and disparity calculation to estimate the initial disparity using shared features. The initial disparity and the shared features are used to calculate the feature constancy that measures the correctness of the correspondence between two input images. The initial disparity and the feature constancy are then fed into a sub-network to refine the initial disparity. The proposed method has been evaluated on the Scene Flow and KITTI datasets. It achieves state-of-the-art performance on the KITTI 2012 and KITTI 2015 benchmarks while maintaining a very fast running time. Source code is available at http://github.com/leonzfa/iResNet.
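The feature-constancy idea above can be illustrated with a minimal sketch: warp one view's features by the estimated disparity and measure the residual difference. This is an assumption-laden toy (plain scalar features on a single scan line, integer disparities), not the paper's network, which computes this on learned multi-scale CNN features with sub-pixel disparities.

```python
def feature_constancy(left_feats, right_feats, disparity):
    """Feature-constancy sketch for one scan line (illustrative only).
    A left pixel x with disparity d should correspond to right pixel
    x - d, so the absolute feature difference after warping measures
    the correctness of the correspondence (0 = perfect match)."""
    errors = []
    for x, d in enumerate(disparity):
        xr = x - d
        if 0 <= xr < len(right_feats):
            errors.append(abs(left_feats[x] - right_feats[xr]))
        else:
            errors.append(float("inf"))  # warped outside the image
    return errors

# Toy scan line: the right view is the left view shifted by a disparity of 2.
left = [0.1, 0.5, 0.9, 0.3, 0.7]
right = [0.9, 0.3, 0.7, 0.0, 0.0]   # right[x - 2] == left[x]
print(feature_constancy(left, right, [2, 2, 2, 2, 2]))
```

Pixels whose correspondence is correct yield zero error, which is exactly the signal the refinement sub-network consumes alongside the initial disparity.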

252 citations


Journal ArticleDOI
TL;DR: Both simulated and real data experiments demonstrate that the proposed SSTV-LRTF method achieves superior performance for HSI mixed-noise removal, as compared to the state-of-the-art TV regularized and LR-based methods.
Abstract: Several bandwise total variation (TV) regularized low-rank (LR)-based models have been proposed to remove mixed noise in hyperspectral images (HSIs). These methods convert high-dimensional HSI data into 2-D data based on LR matrix factorization. This strategy introduces the loss of useful multiway structure information. Moreover, these bandwise TV-based methods exploit the spatial information in a separate manner. To cope with these problems, we propose a spatial–spectral TV regularized LR tensor factorization (SSTV-LRTF) method to remove mixed noise in HSIs. On the one hand, the hyperspectral data are assumed to lie in an LR tensor, which can exploit the inherent tensorial structure of hyperspectral data. The LRTF-based method can effectively separate the LR clean image from sparse noise. On the other hand, HSIs are assumed to be piecewise smooth in the spatial domain. The TV regularization is effective in preserving the spatial piecewise smoothness and removing Gaussian noise. These facts inspire the integration of the LRTF with TV regularization. To address the limitations of bandwise TV, we use the SSTV regularization to simultaneously consider the local spatial structure and the spectral correlation of neighboring bands. Both simulated and real data experiments demonstrate that the proposed SSTV-LRTF method achieves superior performance for HSI mixed-noise removal, as compared to the state-of-the-art TV regularized and LR-based methods.
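The distinction between bandwise TV and a spatial–spectral penalty can be sketched on a tiny cube. This is a simplification under stated assumptions: it sums absolute first-order differences along both spatial axes and the spectral axis, whereas the paper's SSTV uses its own difference operators and weights inside the LRTF model.

```python
def sstv(hsi):
    """Simplified spatial-spectral TV penalty on a hyperspectral cube
    given as nested lists indexed [band][row][col]. Bandwise TV would
    only sum spatial differences within each band; the extra spectral
    term couples neighboring bands."""
    B, H, W = len(hsi), len(hsi[0]), len(hsi[0][0])
    total = 0.0
    for b in range(B):
        for i in range(H):
            for j in range(W):
                v = hsi[b][i][j]
                if i + 1 < H:        # vertical spatial difference
                    total += abs(hsi[b][i + 1][j] - v)
                if j + 1 < W:        # horizontal spatial difference
                    total += abs(hsi[b][i][j + 1] - v)
                if b + 1 < B:        # spectral difference (the "SS" part)
                    total += abs(hsi[b + 1][i][j] - v)
    return total

flat = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]  # constant cube
print(sstv(flat))  # 0.0: piecewise-smooth data incurs no penalty
```

A constant (piecewise-smooth) cube costs nothing, while isolated noise spikes raise the penalty in all three directions, which is why minimizing it suppresses Gaussian noise without destroying smooth structure.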

144 citations


Journal ArticleDOI
TL;DR: Experimental results show that road boundaries can be robustly extracted with an average completeness over 95%, an average correctness over 98%, and an average quality over 94% on two data sets.
Abstract: Effective extraction of road boundaries plays a significant role in intelligent transportation applications, including autonomous driving, vehicle navigation, and mapping. This paper presents a new method to automatically extract 3-D road boundaries from mobile laser scanning (MLS) data. The proposed method includes two main stages: supervoxel generation and 3-D road boundary extraction. Supervoxels are generated by selecting smooth points as seeds and assigning points into facets centered on these seeds using several attributes (e.g., geometric, intensity, and spatial distance). 3-D road boundaries are then extracted using the alpha-shape algorithm and the graph-cuts-based energy minimization algorithm. The proposed method was tested on two data sets acquired by a RIEGL VMX-450 MLS system. Experimental results show that road boundaries can be robustly extracted with an average completeness over 95%, an average correctness over 98%, and an average quality over 94% on the two data sets. The effectiveness and superiority of the proposed method over the state-of-the-art methods are demonstrated.

73 citations


Book ChapterDOI
02 Dec 2018
TL;DR: This paper proposes an end-to-end trainable video SR framework to super-resolve both images and optical flows and demonstrates that HR optical flows provide more accurate correspondences than their LR counterparts and improve both accuracy and consistency performance.
Abstract: Video super-resolution (SR) aims to generate a sequence of high-resolution (HR) frames with plausible and temporally consistent details from their low-resolution (LR) counterparts. The generation of accurate correspondence plays a significant role in video SR. Traditional video SR methods have demonstrated that simultaneous SR of both images and optical flows can provide accurate correspondences and better SR results. However, existing deep learning based methods use LR optical flows for correspondence generation. In this paper, we propose an end-to-end trainable video SR framework to super-resolve both images and optical flows. Specifically, we first propose an optical flow reconstruction network (OFRnet) to infer HR optical flows in a coarse-to-fine manner. Then, motion compensation is performed according to the HR optical flows. Finally, compensated LR inputs are fed to a super-resolution network (SRnet) to generate the SR results. Extensive experiments demonstrate that HR optical flows provide more accurate correspondences than their LR counterparts and improve both accuracy and temporal consistency. Comparative results on the Vid4 and DAVIS-10 datasets show that our framework achieves state-of-the-art performance.
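The motion-compensation step above can be sketched as backward warping with a flow field. This hypothetical helper uses nearest-neighbor sampling on a 2-D list for brevity; the actual framework warps with sub-pixel (bilinear) sampling driven by the HR flows that OFRnet infers.

```python
def warp(frame, flow_x, flow_y):
    """Backward warping for motion compensation (illustrative sketch,
    not the paper's OFRnet/SRnet code). For each target pixel, the flow
    points back to its source location in the neighboring frame."""
    H, W = len(frame), len(frame[0])
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            src_i = round(i + flow_y[i][j])  # nearest-neighbor sampling
            src_j = round(j + flow_x[i][j])
            if 0 <= src_i < H and 0 <= src_j < W:
                out[i][j] = frame[src_i][src_j]
    return out

# A uniform flow of -1 in x pulls each pixel from its left neighbor,
# shifting the content one pixel to the right.
f = [[0, 1, 2], [3, 4, 5]]
zeros = [[0] * 3 for _ in range(2)]
minus1 = [[-1] * 3 for _ in range(2)]
print(warp(f, minus1, zeros))
```

The more accurate the flow, the better the warped neighbor aligns with the reference frame, which is why HR flows give the SR network cleaner correspondences than LR flows.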

67 citations


Journal ArticleDOI
TL;DR: An accurate and robust method for infrared small target detection is proposed; it uses multiscale gray and variance difference measures to enhance small targets, alleviate the impact of background fluctuation, and improve robustness.
Abstract: As a long-standing problem, infrared small target detection is challenging due to the dimness of targets and the complexity of backgrounds. Considering the limitations of traditional approaches, we propose an accurate and robust method for infrared small target detection using multiscale gray and variance difference measures. A multiscale adaptive gray difference measure is first used to enhance small targets and improve detection accuracy. Then, a multiscale variance difference measure is proposed to alleviate the impact of background fluctuation and improve the robustness of our method. By integrating these two measures, targets can be extracted accurately using threshold-adaptive segmentation. Extensive experiments have been conducted on datasets with various scenes. Results have demonstrated the effectiveness of our method and its superior performance compared to state-of-the-art methods.
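The gray difference measure can be illustrated with a minimal center-versus-surround sketch. This is a simplified stand-in under stated assumptions (fixed scale set, mean of the full surrounding ring, no adaptivity); the paper's measure is scale-adaptive and paired with the variance difference term.

```python
def gray_difference(img, i, j, scales=(1, 2)):
    """Toy multiscale gray difference at pixel (i, j): compare the
    center against the mean gray level of its surrounding ring at each
    scale s (a (2s+1)x(2s+1) window minus the center) and keep the
    maximum response. Small bright targets stand out against their
    local background; flat background responds near zero."""
    H, W = len(img), len(img[0])
    best = 0.0
    for s in scales:
        ring, n = 0.0, 0
        for di in range(-s, s + 1):
            for dj in range(-s, s + 1):
                if di == 0 and dj == 0:
                    continue
                ii, jj = i + di, j + dj
                if 0 <= ii < H and 0 <= jj < W:
                    ring += img[ii][jj]
                    n += 1
        if n:
            best = max(best, img[i][j] - ring / n)
    return best

img = [[10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10],
       [10, 10, 90, 10, 10],   # a single bright "small target"
       [10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10]]
print(gray_difference(img, 2, 2))  # large response at the target
print(gray_difference(img, 0, 0))  # near-zero response on background
```

Thresholding such a response map is what the abstract's threshold-adaptive segmentation step then does.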

35 citations


Posted Content
TL;DR: Wang et al. propose an end-to-end trainable video super-resolution framework to super-resolve both images and optical flows, in which an optical flow reconstruction network (OFRnet) infers HR optical flows in a coarse-to-fine manner.
Abstract: Video super-resolution (SR) aims to generate a sequence of high-resolution (HR) frames with plausible and temporally consistent details from their low-resolution (LR) counterparts. The generation of accurate correspondence plays a significant role in video SR. Traditional video SR methods have demonstrated that simultaneous SR of both images and optical flows can provide accurate correspondences and better SR results. However, existing deep learning based methods use LR optical flows for correspondence generation. In this paper, we propose an end-to-end trainable video SR framework to super-resolve both images and optical flows. Specifically, we first propose an optical flow reconstruction network (OFRnet) to infer HR optical flows in a coarse-to-fine manner. Then, motion compensation is performed according to the HR optical flows. Finally, compensated LR inputs are fed to a super-resolution network (SRnet) to generate the SR results. Extensive experiments demonstrate that HR optical flows provide more accurate correspondences than their LR counterparts and improve both accuracy and temporal consistency. Comparative results on the Vid4 and DAVIS-10 datasets show that our framework achieves state-of-the-art performance.

18 citations


Proceedings ArticleDOI
01 Aug 2018
TL;DR: A multi-scale feature learning block is first introduced to obtain informative contextual features in 3D point clouds and a global and local feature aggregation block is extended to improve the feature learning ability of the network.
Abstract: Semantic segmentation of 3D scenes is a fundamental problem in 3D computer vision. In this paper, we propose a deep neural network for 3D semantic segmentation of raw point clouds. A multi-scale feature learning block is first introduced to obtain informative contextual features in 3D point clouds. A global and local feature aggregation block is then extended to improve the feature learning ability of the network. Based on these strategies, a powerful architecture named 3DMAX-Net is finally provided for semantic segmentation in raw 3D point clouds. Experiments have been conducted on the Stanford large-scale 3D Indoor Spaces Dataset using only geometry information. Experimental results have clearly shown the superiority of the proposed network.
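The idea of multi-scale contextual features on raw points can be sketched with a hand-crafted stand-in. This is illustrative only: 3DMAX-Net learns such features with a network, whereas the sketch below concatenates a trivial per-scale statistic (neighbor count at several hypothetical radii) into a multi-scale descriptor.

```python
def multiscale_features(points, radii=(0.5, 1.0, 2.0)):
    """Toy multi-scale descriptor for raw 3-D points: for each point,
    count neighbors within several radii and concatenate the counts.
    Each radius captures context at a different spatial extent, which
    is the intuition behind a multi-scale feature learning block."""
    feats = []
    for p in points:
        f = []
        for r in radii:
            n = sum(
                1 for q in points
                if q is not p
                and sum((a - b) ** 2 for a, b in zip(p, q)) <= r * r
            )
            f.append(n)
        feats.append(f)
    return feats

# Three points clustered on a line plus one isolated outlier.
pts = [(0, 0, 0), (0.3, 0, 0), (0.9, 0, 0), (5, 5, 5)]
print(multiscale_features(pts))
```

Clustered points get distinct per-scale signatures while the outlier reads as empty at every scale; a learned network replaces the counts with informative feature vectors per scale.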

15 citations


Journal ArticleDOI
TL;DR: This paper presents SISE as a mobile crowdsourcing system that uses a new abstraction for indoor general entities and their semantics, enGraph, to automatically update changed semantics of indoor floorplans using images and inertial data, and proposes efficient methods to generate enGraph.
Abstract: Indoor semantic floorplans are important for a range of location based service (LBS) applications, attracting many research efforts in recent years. In many cases, out-of-date indoor semantic floorplans gradually degrade and can even break down LBS performance. Thus, it is important to automatically update the semantics of indoor floorplans as the environment changes. However, little research has focused on the continuous semantic updating problem. This paper presents SISE, a mobile crowdsourcing system that uses a new abstraction for indoor general entities and their semantics, enGraph, to automatically update changed semantics of indoor floorplans using images and inertial data. We first propose efficient methods to generate the enGraph, so that an image can be associated with an indoor semantic floorplan. Accordingly, we formulate the enGraph matching problem and then propose a quality-based maximum common subgraph matching algorithm so that entities extracted from an image can be matched to entities in the indoor semantic floorplan. Furthermore, we propose a quadrant comparison algorithm and a region-shrinking-based localization algorithm to detect and localize changed entities. Thus, new semantics can be labeled and out-of-date semantics can be removed. Extensive experiments have been conducted on real and synthetic data. Experimental results show that 80 percent of out-of-date semantics of indoor general entities can be updated by SISE.

10 citations


Journal ArticleDOI
TL;DR: This letter proposes a semi-online MOT method using online discriminative appearance learning and tracklet association with a sliding window, which improves by 8.31% and 12.38% in terms of Multiple Object Tracking Accuracy and Multiple Object Tracking Precision, respectively, over the baseline.
Abstract: Online multiple object tracking (MOT) is highly challenging when multiple objects have similar appearances or are under long occlusion. In this letter, we propose a semi-online MOT method using online discriminative appearance learning and tracklet association with a sliding window. We connect similar detections of neighboring frames in a temporal window, and improve the appearance features by online discriminative appearance learning. Then, tracklet association is performed by minimizing a subgraph decomposition cost. Occlusions and missed detections are recovered after tracklet stitching. Our method has been tested on two public datasets. Experimental results have demonstrated a significant performance improvement: the proposed method improves by 8.31% and 12.38% in terms of Multiple Object Tracking Accuracy and Multiple Object Tracking Precision, respectively, compared to the baseline.
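The tracklet-stitching step can be illustrated with a deliberately simple greedy matcher. This sketch is an assumption: the paper minimizes a subgraph decomposition cost over a sliding window, while the toy below links temporally compatible tracklets by scalar appearance distance only.

```python
def stitch(tracklets, thresh=0.3):
    """Greedy tracklet stitching sketch. Each tracklet is a dict with
    'start'/'end' frame indices and a scalar appearance 'feat'.
    Tracklet A may be linked to tracklet B when A ends before B starts
    and their appearance distance is small, recovering identities
    across occlusions and missed detections."""
    pairs, used = [], set()
    # candidate links, ordered by appearance distance
    cands = sorted(
        ((abs(a["feat"] - b["feat"]), i, j)
         for i, a in enumerate(tracklets)
         for j, b in enumerate(tracklets)
         if a["end"] < b["start"]),
        key=lambda t: t[0],
    )
    for d, i, j in cands:
        if d <= thresh and i not in used and j not in used:
            pairs.append((i, j))
            used.update({i, j})
    return pairs

t = [
    {"start": 0, "end": 10, "feat": 0.2},    # object A, before occlusion
    {"start": 0, "end": 12, "feat": 0.9},    # object B, before occlusion
    {"start": 15, "end": 30, "feat": 0.25},  # object A, after occlusion
    {"start": 16, "end": 30, "feat": 0.85},  # object B, after occlusion
]
print(stitch(t))  # links each object's two halves despite the gap
```

The discriminative appearance learning in the letter exists precisely to make those feature distances small for the same identity and large across identities, so simple association costs become reliable.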

9 citations


Book ChapterDOI
23 Nov 2018
TL;DR: A local adaptive contrast measure for robust infrared small target detection using gray and variance difference is proposed, which achieves better detection performance than state-of-the-art approaches.
Abstract: Infrared small target detection plays an important role in infrared monitoring and early warning systems. This paper proposes a local adaptive contrast measure for robust infrared small target detection using gray and variance difference. First, a size-adaptive gray-level target enhancement process is performed. Then, an improved multiscale variance difference method is proposed for target enhancement and cloud clutter removal. To demonstrate the effectiveness of the proposed approach, a test dataset consisting of two infrared image sequences with different backgrounds was collected. Experiments on the test dataset demonstrate that the proposed infrared small target detection method can achieve better detection performance than the state-of-the-art approaches.

3 citations


Proceedings ArticleDOI
01 Aug 2018
TL;DR: The proposed Complementary Learners with Instance-specific Proposals (CLIP) tracker consists of three main components, including a translation filter, a scale filter, and an error correction module, which aims to provide an excellent real-time inference.
Abstract: Correlation filter based trackers have been extensively investigated for their superior efficiency and fairly good robustness. However, it remains challenging to achieve long-term tracking when the object is under occlusion and severe deformation. In this paper, we propose a tracker named Complementary Learners with Instance-specific Proposals (CLIP). The CLIP tracker consists of three main components, including a translation filter, a scale filter, and an error correction module. Complementary features are incorporated into the translation filter to cope with illumination changes and deformation, and an adaptive updating mechanism is proposed to prevent model corruption. The translation filter aims to provide excellent real-time inference. Furthermore, the error correction module is activated to correct localization errors using an instance-specific proposal generator, especially when the target suffers from dramatic appearance changes. Experimental results on the OTB, Temple-Color 128 and UAV20L datasets demonstrate that the CLIP tracker performs favorably against existing competitive trackers in terms of accuracy and robustness. Moreover, our proposed CLIP tracker runs at the speed of 33 fps on the OTB. It is highly suitable for real-time applications.

Proceedings ArticleDOI
01 Aug 2018
TL;DR: An augmented descriptor is proposed by combining ORB feature and the context descriptor to increase its discriminability and matching performance and achieves higher precision/recall and faster speed than the original algorithm proposed by Antonio et al.
Abstract: Visual loop closure is important for pose tracking and relocalization in many robotics and Augmented Reality (AR) systems. For large and highly repetitive environments, sparse keypoint-based methods face several challenges, especially the discriminability of descriptors. In this paper, we propose an augmented descriptor that combines the ORB feature and a context descriptor to increase its discriminability and matching performance. An end-to-end network is adopted to perform simultaneous feature learning and code hashing for the context. In addition, feature position clustering is used to reduce the number of contexts, and hash mapping is adopted to reduce the dimensionality of ORB features. Finally, the context descriptors and the dimensionality-reduced ORB features are stacked. Experimental results on the NewCollege and TUM datasets demonstrate that our algorithm achieves higher precision/recall and faster speed than the original algorithm proposed by Antonio et al. [1].
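Matching such a stacked binary descriptor can be sketched as comparing each part with Hamming distance. The weight `w` below is a hypothetical parameter for illustration; the paper simply stacks the reduced ORB bits and the hashed context code into one descriptor.

```python
def hamming(a, b):
    """Bit-level Hamming distance between two binary descriptors
    packed as Python ints."""
    return bin(a ^ b).count("1")

def stacked_distance(orb_a, ctx_a, orb_b, ctx_b, w=0.5):
    """Sketch of matching an augmented descriptor: compare the ORB bits
    and the hashed context bits separately with Hamming distance and
    combine them with a hypothetical weight w."""
    return hamming(orb_a, orb_b) + w * hamming(ctx_a, ctx_b)

# Two keypoints with identical ORB bits but different surrounding
# context are no longer confused once the context code is stacked on.
print(stacked_distance(0b1011, 0b0001, 0b1011, 0b1110))
```

In a repetitive environment this is exactly the failure mode being fixed: ORB-identical keypoints (distance 0 on the first term) are separated by their context term.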

Patent
06 Nov 2018
TL;DR: In this article, a semi-online visual multi-target tracking method based on a tracklet graph association model is proposed: multi-frame detections of multiple targets are associated into short tracklets within a short time window, and the appearance features and motion speeds of the initial and ending periods of each tracklet are extracted.
Abstract: The invention discloses a semi-online visual multi-target tracking method based on a tracklet graph association model. Multi-frame detections of multiple targets are associated into short tracklets within a short time window, and the appearance features and motion speeds of the initial and ending periods of each tracklet are extracted. After the mutual affinity between tracklets is evaluated, the tracklets are further associated into long tracks through an undirected graph model, and partial results are output after each batch. The method strikes a balance between the high accuracy of offline tracking, which cannot run in real time, and the lower accuracy of real-time online tracking, and is fast, simple, and robust. By building an appearance model of each target, the appearance features become more discriminative; by fixing the temporal order before tracklet association, appearance similarity is analyzed on temporally close features, which effectively reduces the number of identity switches without increasing latency, so the algorithm achieves high precision.