Showing papers on "Homography (computer vision)" published in 2021

Proceedings Article•DOI•

ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring

Dongxu Li¹, Chenchen Xu¹, Kaihao Zhang¹, Xin Yu², Yiran Zhong¹, Wenqi Ren, Hanna Suominen¹, Hongdong Li¹•Institutions (2)

Australian National University¹, University of Technology, Sydney²

01 Jun 2021

TL;DR: This paper proposes a novel implicit method to learn spatial correspondence among blurry frames in the feature space by building a correlation volume pyramid among all pixel pairs between neighboring frames.

Abstract: Video deblurring models exploit consecutive frames to remove blurs from camera shakes and object motions. In order to utilize neighboring sharp patches, typical methods rely mainly on homography or optical flows to spatially align neighboring blurry frames. However, such explicit approaches are less effective in the presence of fast motions with large pixel displacements. In this work, we propose a novel implicit method to learn spatial correspondence among blurry frames in the feature space. To construct distant pixel correspondences, our model builds a correlation volume pyramid among all the pixel-pairs between neighboring frames. To enhance the features of the reference frame, we design a correlative aggregation module that maximizes the pixel-pair correlations with its neighbors based on the volume pyramid. Finally, we feed the aggregated features into a reconstruction module to obtain the restored frame. We design a generative adversarial paradigm to optimize the model progressively. Our proposed method is evaluated on the widely-adopted DVD dataset, along with a newly collected High-Frame-Rate (1000 fps) Dataset for Video Deblurring (HFR-DVD). Quantitative and qualitative experiments show that our model performs favorably on both datasets against previous state-of-the-art methods, confirming the benefit of modeling all-range spatial correspondence for video deblurring.
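
Editor's sketch: the abstract above describes a correlation volume pyramid over all pixel pairs of neighboring frames, but gives no construction details. The snippet below is a minimal NumPy illustration of one common way to build such a volume (dot products between every pair of feature vectors, then average pooling over the neighbor-frame dimensions). The function names, the scaling by the square root of the channel count, and the pooling scheme are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def correlation_volume(feat_ref, feat_nbr):
    """All-pairs correlation between two feature maps of shape (C, H, W).

    Returns an array of shape (H, W, H, W) whose entry [i, j, k, l] is the
    scaled dot product between the reference feature at (i, j) and the
    neighbor feature at (k, l).
    """
    c, h, w = feat_ref.shape
    ref = feat_ref.reshape(c, h * w)
    nbr = feat_nbr.reshape(c, h * w)
    corr = ref.T @ nbr / np.sqrt(c)  # (H*W, H*W); scaling is an assumption
    return corr.reshape(h, w, h, w)

def correlation_pyramid(corr, levels=3):
    """Average-pool the neighbor-frame dimensions by 2 at each level."""
    pyramid = [corr]
    for _ in range(levels - 1):
        h, w, h2, w2 = pyramid[-1].shape
        h2e, w2e = h2 - h2 % 2, w2 - w2 % 2  # drop an odd trailing row/column
        cur = pyramid[-1][:, :, :h2e, :w2e]
        pooled = cur.reshape(h, w, h2e // 2, 2, w2e // 2, 2).mean(axis=(3, 5))
        pyramid.append(pooled)
    return pyramid
```

Coarser pyramid levels summarize correlation over larger neighborhoods of the neighboring frame, which is one way a model could relate pixels separated by large displacements.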

59 citations

Journal Article•DOI•

Unsupervised Deep Image Stitching: Reconstructing Stitched Features to Images

Lang Nie¹, Chunyu Lin¹, Kang Liao¹, Shuaicheng Liu², Yao Zhao¹•Institutions (2)

Beijing Jiaotong University¹, University of Electronic Science and Technology of China²

02 Jul 2021-IEEE Transactions on Image Processing

TL;DR: This paper proposes an unsupervised deep image stitching framework consisting of two stages: unsupervised coarse image alignment, which uses an ablation-based loss and a transformer layer to warp the input images in the stitching-domain space, and unsupervised image reconstruction.

Abstract: Traditional feature-based image stitching technologies rely heavily on feature detection quality, often failing to stitch images with few features or low resolution. The learning-based image stitching solutions are rarely studied due to the lack of labeled data, making the supervised methods unreliable. To address the above limitations, we propose an unsupervised deep image stitching framework consisting of two stages: unsupervised coarse image alignment and unsupervised image reconstruction. In the first stage, we design an ablation-based loss to constrain an unsupervised homography network, which is more suitable for large-baseline scenes. Moreover, a transformer layer is introduced to warp the input images in the stitching-domain space. In the second stage, motivated by the insight that the misalignments in pixel-level can be eliminated to a certain extent in feature-level, we design an unsupervised image reconstruction network to eliminate the artifacts from features to pixels. Specifically, the reconstruction network can be implemented by a low-resolution deformation branch and a high-resolution refined branch, learning the deformation rules of image stitching and enhancing the resolution simultaneously. To establish an evaluation benchmark and train the learning framework, a comprehensive real-world image dataset for unsupervised deep image stitching is presented and released. Extensive experiments well demonstrate the superiority of our method over other state-of-the-art solutions. Even compared with the supervised solutions, our image stitching quality is still preferred by users.

48 citations

Proceedings Article•DOI•

Patch2Pix: Epipolar-Guided Pixel-Level Correspondences

Qunjie Zhou¹, Torsten Sattler², Laura Leal-Taixé¹•Institutions (2)

Technische Universität München¹, Czech Technical University in Prague²

01 Jun 2021

TL;DR: Patch2Pix as discussed by the authors proposes a new perspective to estimate correspondences in a detect-to-refine manner, where they first predict patch-level match proposals and then refine them.

Abstract: The classical matching pipeline used for visual localization typically involves three steps: (i) local feature detection and description, (ii) feature matching, and (iii) outlier rejection. Recently emerged correspondence networks propose to perform those steps inside a single network but suffer from low matching resolution due to the memory bottleneck. In this work, we propose a new perspective to estimate correspondences in a detect-to-refine manner, where we first predict patch-level match proposals and then refine them. We present Patch2Pix, a novel refinement network that refines match proposals by regressing pixel-level matches from the local regions defined by those proposals and jointly rejecting outlier matches with confidence scores. Patch2Pix is weakly supervised to learn correspondences that are consistent with the epipolar geometry of an input image pair. We show that our refinement network significantly improves the performance of correspondence networks on image matching, homography estimation, and localization tasks. In addition, we show that our learned refinement generalizes to fully-supervised methods without retraining, which leads us to state-of-the-art localization performance. The code is available at https://github.com/GrumpyZhou/patch2pix.

43 citations

Journal Article•DOI•

Homography-based measurement of bridge vibration using UAV and DIC method

Gongfa Chen¹, Qiang Liang¹, Wentao Zhong¹, Xingjun Gao¹, Fangsen Cui²•Institutions (2)

Guangdong University of Technology¹, Agency for Science, Technology and Research²

01 Jan 2021-Measurement

TL;DR: A homography-based method that combines an unmanned aerial vehicle (UAV) with digital image correlation (DIC) is proposed for vibration measurement of a bridge model, and its effectiveness is validated against a fixed-camera-based method.

43 citations

Proceedings Article•DOI•

Leveraging Line-point Consistence to Preserve Structures for Wide Parallax Image Stitching

Qi Jia¹, ZhengJun Li¹, Xin Fan¹, Haotian Zhao¹, Shiyu Teng¹, Xinchen Ye¹, Longin Jan Latecki²•Institutions (2)

Dalian University of Technology¹, Temple University²

20 Jun 2021

TL;DR: In this article, a projective invariant, Characteristic Number, is used to match co-planar local sub-regions for input images, which produces consistent line and point pairs, suppressing artifacts in overlapping areas.

Abstract: Generating high-quality stitched images with natural structures is a challenging task in computer vision. In this paper, we succeed in preserving both local and global geometric structures for wide parallax images, while reducing artifacts and distortions. A projective invariant, Characteristic Number, is used to match co-planar local sub-regions for input images. The homography between these well-matched sub-regions produces consistent line and point pairs, suppressing artifacts in overlapping areas. We explore and introduce global collinear structures into an objective function to specify and balance the desired characters for image warping, which can preserve both local and global structures while alleviating distortions. We also develop comprehensive measures for stitching quality to quantify the collinearity of points and the discrepancy of matched line pairs by considering the sensitivity to linear structures for human vision. Extensive experiments demonstrate the superior performance of the proposed method over the state-of-the-art by presenting sharp textures and preserving prominent natural structures in stitched images. Especially, our method not only exhibits lower errors but also the least divergence across all test images. Code is available at https://github.com/dut-media-lab/Image-Stitching.

41 citations

Journal Article•DOI•

Homography-based structural displacement measurement for large structures using unmanned aerial vehicles

Yufeng Weng¹, Jiazeng Shan¹, Zheng Lu¹, Xilin Lu¹, Billie F. Spencer²•Institutions (2)

Tongji University¹, University of Illinois at Urbana–Champaign²

01 Sep 2021-Computer-aided Civil and Infrastructure Engineering

TL;DR: In this article, a homography-based approach using unmanned aerial vehicles (UAVs) mounted with high-resolution cameras is considered for measuring the structural displacement of large structures, a quantity used to assess the health of civil infrastructure.

Abstract: Structural displacement is an important quantity to assess the health of civil infrastructure. Vision‐based approaches using unmanned aerial vehicles (UAV) mounted with high‐resolution cam...

41 citations

Journal Article•DOI•

Unsupervised Deep Image Stitching: Reconstructing Stitched Features to Images

Lang Nie¹, Chunyu Lin¹, Kang Liao¹, Shuaicheng Liu², Yao Zhao¹•Institutions (2)

Beijing Jiaotong University¹, University of Electronic Science and Technology of China²

24 Jun 2021-arXiv: Computer Vision and Pattern Recognition

32 citations

Journal Article•DOI•

A Vision-Based Pipeline for Vehicle Counting, Speed Estimation, and Classification

Chenghuan Liu¹, Du Q. Huynh¹, Yuchao Sun¹, Mark Reynolds¹, Steve Atkinson²•Institutions (2)

University of Western Australia¹, Main Roads Western Australia²

01 Dec 2021-IEEE Transactions on Intelligent Transportation Systems

TL;DR: A novel application of the image-to-world homography which gives the monocular vision system the efficacy of counting vehicles by lane and estimating vehicle length and speed in real-world units.

Abstract: Cameras have been widely used in traffic operations. While many technologically smart camera solutions in the market can be integrated into Intelligent Transport Systems (ITS) for automated detection, monitoring and data generation, many Network Operations (a.k.a Traffic Control) Centres still use legacy camera systems as manual surveillance devices. In this paper, we demonstrate effective use of these older assets by applying computer vision techniques to extract traffic data from videos captured by legacy cameras. In our proposed vision-based pipeline, we adopt recent state-of-the-art object detectors and transfer-learning to detect vehicles, pedestrians, and cyclists from monocular videos. By weakly calibrating the camera, we demonstrate a novel application of the image-to-world homography which gives our monocular vision system the efficacy of counting vehicles by lane and estimating vehicle length and speed in real-world units. Our pipeline also includes a module which combines a convolutional neural network (CNN) classifier with projective geometry information to classify vehicles. We have tested it on videos captured at several sites with different traffic flow conditions and compared the results with the data collected by piezoelectric sensors. Our experimental results show that the proposed pipeline can process 60 frames per second for pre-recorded videos and yield high-quality metadata for further traffic analysis.
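
Editor's sketch: the abstract above relies on an image-to-world homography to report vehicle speed in real-world units. The snippet below shows the underlying arithmetic under simple assumptions: a known 3x3 homography `H` that maps pixel coordinates to road-plane coordinates in metres, and two tracked image positions of the same vehicle point separated by `dt_seconds`. The function names and the calibration are hypothetical; the paper's own calibration and tracking procedure are not described here.

```python
import numpy as np

def apply_homography(H, pts):
    """Map an (N, 2) array of image points through a 3x3 homography."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coords
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]             # back to Cartesian

def speed_kmh(H, p0, p1, dt_seconds):
    """Speed of a point seen at pixel p0 and, dt_seconds later, at pixel p1.

    Assumes H maps pixels to road-plane coordinates in metres.
    """
    w0, w1 = apply_homography(H, [p0, p1])
    metres = np.linalg.norm(w1 - w0)
    return metres / dt_seconds * 3.6  # m/s to km/h

# Hypothetical calibration: a pure scaling where one pixel spans 5 cm.
H_example = np.diag([0.05, 0.05, 1.0])
example_speed = speed_kmh(H_example, (100, 200), (100, 400), dt_seconds=0.5)
```

In the example, 200 pixels correspond to 10 m travelled in 0.5 s, which is 20 m/s or 72 km/h. A real road-plane homography would also include perspective terms, so equal pixel distances would not map to equal ground distances.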

30 citations

Proceedings Article•DOI•

Deep Lucas-Kanade Homography for Multimodal Image Alignment

Yiming Zhao¹, Xinming Huang¹, Ziming Zhang¹•Institutions (1)

Worcester Polytechnic Institute¹

22 Apr 2021

TL;DR: The deep Lucas-Kanade feature map (DLKFM) is proposed to align multimodal image pairs by extending the traditional Lucas-Kanade algorithm with networks; the learned feature map can spontaneously recognize invariant features under various appearance-changing conditions.

Abstract: Estimating homography to align image pairs captured by different sensors or image pairs with large appearance changes is an important and general challenge for many computer vision applications. In contrast to others, we propose a generic solution to pixel-wise align multimodal image pairs by extending the traditional Lucas-Kanade algorithm with networks. The key contribution in our method is how we construct feature maps, named as deep Lucas-Kanade feature map (DLKFM). The learned DLKFM can spontaneously recognize invariant features under various appearance-changing conditions. It also has two nice properties for the Lucas-Kanade algorithm: (1) The template feature map keeps brightness consistency with the input feature map, thus the color difference is very small while they are well-aligned. (2) The Lucas-Kanade objective function built on DLKFM has a smooth landscape around ground truth homography parameters, so the iterative solution of the Lucas-Kanade can easily converge to the ground truth. With those properties, directly updating the Lucas-Kanade algorithm on our feature maps will precisely align image pairs with large appearance changes. We share the datasets, code, and demo video online.

27 citations

Journal Article•DOI•

Image stitching via deep homography estimation

Qiang Zhao¹, Yike Ma¹, Chen Zhu¹, Chunfeng Yao², Bailan Feng², Feng Dai¹•Institutions (2)

Chinese Academy of Sciences¹, Huawei²

25 Aug 2021-Neurocomputing

TL;DR: In this article, a deep neural network that estimates homography accurately enough for image stitching of images with small parallax is presented, where the key components of the network are feature maps with progressively increased resolution and matching cost volumes constructed in a hybrid manner.

22 citations

Proceedings Article•DOI•

Perceptual Loss for Robust Unsupervised Homography Estimation

Daniel Koguciuk, Elahe Arani, Bahram Zonooz

20 Apr 2021

TL;DR: In this article, a bidirectional implicit homography estimation (biHomE) loss is proposed to minimize the distance in the feature space between warped images from the source viewpoint and the corresponding image from the target viewpoint.

Abstract: Homography estimation is often an indispensable step in many computer vision tasks. The existing approaches, however, are not robust to illumination and/or larger viewpoint changes. In this paper, we propose bidirectional implicit Homography Estimation (biHomE) loss for unsupervised homography estimation. biHomE minimizes the distance in the feature space between the warped image from the source viewpoint and the corresponding image from the target viewpoint. Since we use a fixed pre-trained feature extractor and the only learnable component of our framework is the homography network, we effectively decouple the homography estimation from representation learning. We use an additional photometric distortion step in the synthetic COCO dataset generation to better represent the illumination variation of the real-world scenarios. We show that biHomE achieves state-of-the-art performance on synthetic COCO dataset, which is also comparable or better compared to supervised approaches. Furthermore, the empirical results demonstrate the robustness of our approach to illumination variation compared to existing methods.

Posted Content•

Monocular 3D Vehicle Detection Using Uncalibrated Traffic Cameras through Homography.

Minghan Zhu, Songan Zhang, Yuanxin Zhong, Pingping Lu, Huei Peng, John K. Lenneman

29 Mar 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: 3D vehicle detection is conducted by estimating the rotated bounding boxes (r-boxes) in bird's eye view (BEV) images generated from inverse perspective mapping; a new regression target called tailed r-box and a dual-view network architecture boost the detection accuracy on warped BEV images.

Abstract: This paper proposes a method to extract the position and pose of vehicles in the 3D world from a single traffic camera. Most previous monocular 3D vehicle detection algorithms focused on cameras on vehicles from the perspective of a driver, and assumed known intrinsic and extrinsic calibration. On the contrary, this paper focuses on the same task using uncalibrated monocular traffic cameras. We observe that the homography between the road plane and the image plane is essential to 3D vehicle detection and the data synthesis for this task, and the homography can be estimated without the camera intrinsics and extrinsics. We conduct 3D vehicle detection by estimating the rotated bounding boxes (r-boxes) in the bird's eye view (BEV) images generated from inverse perspective mapping. We propose a new regression target called tailed r-box and a dual-view network architecture which boosts the detection accuracy on warped BEV images. Experiments show that the proposed method can generalize to new camera and environment setups despite not seeing images from them during training.
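
Editor's sketch: the abstract above notes that the homography between the road plane and the image plane can be estimated without camera intrinsics or extrinsics. One standard way to do this, shown below, is the direct linear transform (DLT) from at least four point correspondences, for example image pixels of road markings whose positions on the ground are known. This is a generic textbook technique used here as an assumption; the abstract does not say which estimation procedure the authors use, and the function names are hypothetical.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate H (3x3) with dst ~ H @ src from >= 4 correspondences via DLT.

    src, dst: (N, 2) arrays of matching points, no three of them collinear.
    The result is normalized so that H[2, 2] == 1, which assumes that entry
    is nonzero (true for typical image-to-ground mappings).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def map_points(H, pts):
    """Map (N, 2) image points to the road plane (bird's eye view)."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]
```

With `H` in hand, warping every pixel of the image through it produces the bird's eye view image (inverse perspective mapping); `map_points` does the same for individual points such as bounding-box corners.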

Journal Article•DOI•

Stereo-rectification and homography-transform-based stereo matching methods for stereo digital image correlation

Fuqiang Zhong¹, Chenggen Quan²•Institutions (2)

University of Pittsburgh¹, National University of Singapore²

01 Mar 2021-Measurement

TL;DR: The accuracy and efficiency of HTSM are the highest among the three methods if the object has a planar surface, and the efficiency of SRSM is higher than that of CSM.

Proceedings Article•

Motion Basis Learning for Unsupervised Deep Homography Estimation With Subspace Projection

Nianjin Ye¹, Chuan Wang², Haoqiang Fan³, Shuaicheng Liu¹•Institutions (3)

University of Electronic Science and Technology of China¹, Chinese Academy of Sciences², Tsinghua University³

01 Jan 2021

TL;DR: This paper proposes a homography flow representation, which can be estimated by a weighted sum of 8 pre-defined homography flow bases, and a Low Rank Representation (LRR) block that reduces the feature rank, so that features corresponding to the dominant motions are retained while others are rejected.

Abstract: In this paper, we introduce a new framework for unsupervised deep homography estimation. Our contributions are threefold. First, unlike previous methods that regress 4 offsets for a homography, we propose a homography flow representation, which can be estimated by a weighted sum of 8 pre-defined homography flow bases. Second, considering that a homography contains 8 degrees of freedom (DOFs), which is much less than the rank of the network features, we propose a Low Rank Representation (LRR) block that reduces the feature rank, so that features corresponding to the dominant motions are retained while others are rejected. Last, we propose a Feature Identity Loss (FIL) to enforce the learned image feature to be warp-equivariant, meaning that the result should be identical if the order of warp operation and feature extraction is swapped. With this constraint, the unsupervised optimization is achieved more effectively and more stable features are learned. Extensive experiments are conducted to demonstrate the effectiveness of all the newly proposed components, and results show that our approach outperforms the state-of-the-art on the homography benchmark datasets both qualitatively and quantitatively. Code is available online.
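
Editor's sketch: the abstract above represents a homography as a flow field written as a weighted sum of 8 pre-defined flow bases. The abstract does not say how the bases are built, so the construction below is an assumption for illustration: perturb each of the 8 free entries of the identity homography, take the induced flow, and orthonormalize the results with a QR decomposition. All function names and the `eps` parameter are hypothetical.

```python
import numpy as np

def homography_flow(H, h, w):
    """Per-pixel displacement (h, w, 2) induced by warping with homography H."""
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (h, w, 3)
    mapped = pts @ H.T
    mapped = mapped[..., :2] / mapped[..., 2:3]
    return mapped - pts[..., :2]

def flow_bases(h, w, eps=1e-3):
    """Eight orthonormal flow bases, returned as columns of a (2*h*w, 8) matrix."""
    bases = []
    for k in range(8):  # the ninth entry, H[2, 2], only fixes the scale
        H = np.eye(3)
        H[k // 3, k % 3] += eps
        bases.append(homography_flow(H, h, w).ravel())
    B = np.stack(bases, axis=1)
    Q, _ = np.linalg.qr(B)  # reduced QR: orthonormal columns
    return Q

def flow_from_weights(Q, weights, h, w):
    """Weighted sum of the bases, reshaped back to a flow field."""
    return (Q @ np.asarray(weights, dtype=float)).reshape(h, w, 2)
```

Under this construction a network would only need to predict 8 weights, and the flow field follows from `flow_from_weights`. Projecting a given flow onto the bases is `Q.T @ flow.ravel()`, because the columns of `Q` are orthonormal.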

Journal Article•DOI•

Recaptured Screen Image Demoiréing

Huanjing Yue¹, Mao Yan¹, Liang Lipu¹, Hongteng Xu², Chunping Hou¹, Jingyu Yang¹•Institutions (2)

Tianjin University¹, Durham University²

01 Jan 2021-IEEE Transactions on Circuits and Systems for Video Technology

TL;DR: This paper proposes a CNN-based moiré removal method for recaptured screen images, using a convolutional neural Network with Additive and Multiplicative modules (termed AMNet) to transfer the low-light moiré image to the bright moiré-free image.

Abstract: In many situations, such as transferring data between devices and recording precious moments, we would like to capture the contents on screens using digital cameras for convenience. These recaptured screen images and videos suffer from a special type of degradation called “moire pattern”, which is caused by the aliasing between the grid of display screen and the array of camera sensor. However, few works are proposed to tackle this problem. Considering the great success of convolutional neural networks (CNNs) in image restoration, we propose a CNN-based moire removal method for recaptured screen images. There are mainly two contributions in this paper. First, for the generation of training data, we propose an image registration algorithm via global homography transform and local patch matching to compensate the significant viewpoint disparity between the recaptured screen image and the moire-free image obtained via screenshot. We construct a moire removal and brightness improvement (MRBI) database with aligned moire-free and moire images. Second, we propose a convolutional neural Network with Additive and Multiplicative modules (termed as AMNet) to transfer the low light moire image to the bright moire-free image. The proposed network is trained with pixel-wise loss, perceptual loss, and adversarial loss. Extensive experiments on 340 test images demonstrate that the proposed method outperforms state-of-the-art moire removal methods.

Journal Article•DOI•

Three-Dimensional Reconstruction-Based Vibration Measurement of Bridge Model Using UAVs

Zhihua Wu, Gongfa Chen, Qiong Ding, Bing Yuan, Xiaomei Yang

31 May 2021-Applied Sciences

TL;DR: The 3D reconstruction method can effectively overcome the limitation of the homography-based method that the fixed reference points and the target points must be coplanar.

Abstract: This paper presents a measurement method of bridge vibration based on three-dimensional (3D) reconstruction. A video of bridge model vibration is recorded by an unmanned aerial vehicle (UAV), and the displacement of target points on the bridge model is tracked by the digital image correlation (DIC) method. Due to the UAV motion, the DIC-tracked displacement of the bridge model includes the absolute displacement caused by the excitation and the false displacement induced by the UAV motion. Therefore, the UAV motion must be corrected to measure the real displacement. Using four corner points on a fixed object plane as the reference points, the projection matrix for each frame of images can be estimated by the UAV camera calibration, and then the 3D world coordinates of the target points on the bridge model can be recovered. After that, the real displacement of the target points can be obtained. To verify the correctness of the results, the operational modal analysis (OMA) method is used to extract the natural frequencies of the bridge model. The results show that the first natural frequency obtained from the proposed method is consistent with the one obtained from the homography-based method. By further comparing with the homography-based correction method, it is found that the 3D reconstruction method can effectively overcome the limitation of the homography-based method that the fixed reference points and the target points must be coplanar.

Proceedings Article•DOI•

A Robust and Efficient Framework for Sports-Field Registration

Xiaohan Nie¹, Shixing Chen¹, Raffay Hamid¹•Institutions (1)

Amazon.com¹

01 Jan 2021

TL;DR: In this paper, a grid of key-points distributed uniformly on the entire field instead of using only sparse local corners and line intersections is used to extend the keypoint coverage to the textureless parts of the field as well.

Abstract: We propose a novel framework to register sports-fields as they appear in broadcast sports videos. Unlike previous approaches, we particularly address the challenge of field-registration when: (a) there are not enough distinguishable features on the field, and (b) no prior knowledge is available about the camera. To this end, we detect a grid of keypoints distributed uniformly on the entire field instead of using only sparse local corners and line intersections, thereby extending the keypoint coverage to the texture-less parts of the field as well. To further improve keypoint based homography estimate, we differentiably warp and align it with a set of dense field-features defined as normalized distance-map of pixels to their nearest lines and key-regions. We predict the keypoints and dense field-features simultaneously using a multi-task deep network to achieve computational efficiency. To have a comprehensive evaluation, we have compiled a new dataset called SportsFields which is collected from 192 video-clips from 5 different sports covering large environmental and camera variations. We empirically demonstrate that our algorithm not only achieves state of the art field-registration accuracy but also runs in real-time for HD resolution videos using commodity hardware.

Journal Article•DOI•

Mobile projective augmented reality for collaborative robots in construction

Siyuan Xiang¹, Ruoyu Wang¹, Chen Feng¹•Institutions (1)

New York University¹

01 Jul 2021-Automation in Construction

TL;DR: A mobile projective AR framework in which the AR device is detached from human workers and carried by one or more mobile collaborative robots (co-robots) is proposed, which achieves glassless AR that is visible to the naked eye using a camera-projector system to superimpose virtual 3D information onto planar or non-planar physical surfaces.

Proceedings Article•

Stacked Homography Transformations for Multi-View Pedestrian Detection

Liangchen Song¹, Jialian Wu², Ming Yang³, Qian Zhang⁴, Yuan Li⁵, Junsong Yuan⁶•Institutions (6)

Wuhan University¹, University at Buffalo², Nanyang Technological University³, International Institute of Minnesota⁴, Google⁵, State University of New York System⁶

01 Jan 2021

Journal Article•DOI•

Road pollution estimation from vehicle tracking in surveillance videos by deep convolutional neural networks

Jorge García-González¹, Miguel A. Molina-Cabello¹, Rafael Marcos Luque-Baena¹, Juan Miguel Ortiz-de-Lazcano-Lobato¹, Ezequiel López-Rubio¹•Institutions (1)

University of Málaga¹

01 Dec 2021-Applied Soft Computing

TL;DR: In this paper, a method which detects the pollution levels of transport vehicles from the images of IP cameras by means of computer vision techniques and neural networks is proposed, and the trajectory of each vehicle is computed by applying convolutional neural networks for object detection and tracking algorithms.

Journal Article•DOI•

Single-shot cuboids: Geodesics-based end-to-end Manhattan aligned layout estimation from spherical panoramas

Nikolaos Zioulis¹, Federico Alvarez¹, Dimitrios Zarpalas, Petros Daras•Institutions (1)

Technical University of Madrid¹

18 Mar 2021-Image and Vision Computing

TL;DR: This work is the first to directly infer Manhattan-aligned outputs, and introduces the geodesic heatmaps and loss and a boundary-aware center of mass calculation that facilitate higher quality keypoint estimation in the spherical domain.

Journal Article•DOI•

An Image-Based Steel Rebar Size Estimation and Counting Method Using a Convolutional Neural Network Combined with Homography

Yoonsoo Shin, Sekojae Heo, Sehee Han, Junhee Kim, Seunguk Na

09 Oct 2021-Buildings

TL;DR: In this article, the authors developed an automated system to estimate the size and count the number of steel rebars in bale packing using computer vision techniques based on a convolutional neural network (CNN).

Abstract: Conventionally, the number of steel rebars at construction sites is manually counted by workers. However, this practice gives rise to several problems: it is slow, human-resource-intensive, time-consuming, error-prone, and not very accurate. Consequently, a new method of quickly and accurately counting steel rebars with a minimal number of workers needs to be developed to enhance work efficiency and reduce labor costs at construction sites. In this study, the authors developed an automated system to estimate the size and count the number of steel rebars in bale packing using computer vision techniques based on a convolutional neural network (CNN). A dataset containing 622 images of rebars with a total of 186,522 rebar cross sections and 409 poly tags was established for segmenting rebars and poly tags in images. The images were collected in a full HD resolution of 1920 × 1080 pixels and then center-cropped to 512 × 512 pixels. Moreover, data augmentation was carried out to create 4668 images for the training dataset. Based on the training dataset, YOLACT-based steel bar size estimation and a counting model with a Box and Mask of over 30 mAP was generated to satisfy the aim of this study. The proposed method, which is a CNN model combined with homography, can estimate the size and count the number of steel rebars in an image quickly and accurately, and the developed method can be applied to real construction sites to efficiently manage the stock of steel rebars.

Journal Article•DOI•

Creating navigation map in semi-open scenarios for intelligent vehicle localization using multi-sensor fusion

Li Yicheng¹, Li Yicheng², Yingfeng Cai², Reza Malekian³, Hai Wang², Miguel Angel Sotelo⁴, Zhixiong Li⁵, Zhixiong Li⁶•Institutions (6)

Wuhan University of Technology¹, Jiangsu University², Malmö University³, University of Alcalá⁴, Yonsei University⁵, Ocean University of China⁶

01 Dec 2021-Expert Systems With Applications

TL;DR: The proposed road scenario fingerprint (RSF) map can be applied to semi-open scenarios in practice to provide a reliable basis for intelligent vehicle (IV) localization; experiments demonstrate that the mean error of the nodes between the created and actual maps was 2.7 cm.

Abstract: In order to pursue high-accuracy localization for intelligent vehicles (IVs) in semi-open scenarios, this study proposes a new map creation method based on multi-sensor fusion technique. In this new method, the road scenario fingerprint (RSF) was employed to fuse the visual features, three-dimensional (3D) data and trajectories in the multi-view and multi-sensor information fusion process. The visual features were collected in the front and downward views of the IVs; the 3D data were collected by the laser scanner and the downward camera and a homography method was proposed to reconstruct the monocular 3D data; the trajectories were computed from the 3D data in the downward view. Moreover, a new plane-corresponding calibration strategy was developed to ensure the fusion quality of sensory measurements of the camera and laser. In order to evaluate the proposed method, experimental tests were carried out in a 5 km semi-open ring route. A series of nodes were found to construct the RSF map. The experimental results demonstrate that the mean error of the nodes between the created and actual maps was 2.7 cm, the standard deviation of the nodes was 2.1 cm and the max error was 11.8 cm. The localization error of the IV was 10.8 cm. Hence, the proposed RSF map can be applied to semi-open scenarios in practice to provide a reliable basic for IV localization.


Proceedings Article•DOI•

Deep learning assisted visual tracking of evader-UAV


Athanasios Tsoukalas¹, Daitao Xing², Nikolaos Evangeliou¹, Nikolaos Giakoumidis¹, Anthony Tzes¹•Institutions (2)

New York University Abu Dhabi¹, New York University²

15 Jun 2021

TL;DR: This work examines the visual tracking of an evading UAV by a pursuer-UAV, using a method that combines principles of deep learning, optical flow, intra-frame homography and correlation-based tracking.


Abstract: In this work, the visual tracking of an evading UAV using a pursuer-UAV is examined. The developed method combines principles of deep learning, optical flow, intra-frame homography and correlation-based tracking. A YOLO tracker is employed for short-term tracking, complemented by optical flow and homography techniques. If no evader-UAV is detected, the MOSSE tracking algorithm re-initializes the search and the PTZ camera zooms out to cover a wider field of view. The camera's controller adjusts the pan and tilt angles so that the evader-UAV is as close to the center of view as possible, while its zoom is commanded so that the evader-UAV bounding box covers as much of the captured frame as possible. Experimental studies are offered to highlight the algorithm's principle and evaluate its performance.


Journal Article•DOI•

Damage inspection for road markings based on images with hierarchical semantic segmentation strategy and dynamic homography estimation


Chong Wei¹, Shurong Li¹, Kai Wu², Zhang Zijian¹, Ying Wang¹•Institutions (2)

Beijing Jiaotong University¹, Tencent²

01 Nov 2021-Automation in Construction

TL;DR: The experimental results confirm that the proposed computer vision-based damage inspection system for road markings is effective in automating the inspection of road markings and producing objective damage assessments that should significantly assist road managers in prioritizing maintenance operations.


Proceedings Article•DOI•

Driving among Flatmobiles: Bird-Eye-View occupancy grids from a monocular camera for holistic trajectory planning


Abdelhak Loukkal¹, Yves Grandvalet¹, Tom Drummond², You Li³•Institutions (3)

University of Technology of Compiègne¹, Monash University², Renault³

01 Jan 2021

TL;DR: In this article, a novel monocular camera-only holistic end-to-end trajectory planning network with a Bird-Eye-View (BEV) intermediate representation that comes in the form of binary Occupancy Grid Maps (OGMs) is proposed.


Abstract: Camera-based end-to-end driving neural networks bring the promise of a low-cost system that maps camera images to driving control commands. These networks are appealing because they replace laborious hand-engineered building blocks, but their black-box nature makes them difficult to delve into in case of failure. Recent works have shown the importance of using an explicit intermediate representation, which increases both the interpretability and the accuracy of the networks' decisions. Nonetheless, these camera-based networks reason in camera view, where scale is not homogeneous and which is hence not directly suitable for motion forecasting. In this paper, we introduce a novel monocular camera-only holistic end-to-end trajectory planning network with a Bird-Eye-View (BEV) intermediate representation that comes in the form of binary Occupancy Grid Maps (OGMs). To ease the prediction of OGMs in BEV from camera images, we introduce a novel scheme where the OGMs are first predicted as semantic masks in camera view and then warped into BEV using the homography between the two planes. The key element allowing this transformation to be applied to 3D objects such as vehicles consists in predicting solely their footprint in camera view, hence respecting the flat-world hypothesis implied by the homography.
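The warping step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `warp_points` and `footprint_to_grid`, the grid layout, and the cell size are hypothetical, and the homography `H` is assumed to be already known and to map image pixels to metric ground-plane coordinates.

```python
import numpy as np

def warp_points(H, pts):
    """Apply a 3x3 homography to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coordinates
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]             # perspective divide

def footprint_to_grid(H, footprint_px, grid_shape, cell_size):
    """Mark the BEV grid cells hit by warped footprint pixels (assumed layout)."""
    grid = np.zeros(grid_shape, dtype=np.uint8)
    bev = warp_points(H, footprint_px)                # ground-plane coordinates
    cols = np.floor(bev[:, 0] / cell_size).astype(int)
    rows = np.floor(bev[:, 1] / cell_size).astype(int)
    inside = (rows >= 0) & (rows < grid_shape[0]) & (cols >= 0) & (cols < grid_shape[1])
    grid[rows[inside], cols[inside]] = 1              # binary occupancy
    return grid
```

Only footprint pixels are warped, because a homography is valid only for points lying on the ground plane; pixels belonging to the upper parts of a vehicle would be smeared across the grid.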


Journal Article•DOI•

UAV Image Stitching Based on Mesh-Guided Deformation and Ground Constraint


Quan Xu¹, Jun Chen¹, Linbo Luo¹, Wenping Gong¹, Yong Wang¹•Institutions (1)

China University of Geosciences (Wuhan)¹

23 Feb 2021-IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

TL;DR: A drone image stitching method based on mesh-guided deformation and a ground constraint is introduced, which closely matches the characteristics of the images to achieve precise registration and a good stitching result.


Abstract: This article introduces a drone image stitching method based on mesh-guided deformation and a ground constraint, which closely matches the characteristics of the images to achieve precise registration and a good stitching result. Traditional methods use a homography model to align the images, which causes artifacts when stitching images with parallax. To overcome this, the image is divided into meshes and the mesh vertices of the target image are used to guide the warping. A new energy function is designed to represent the deformation characteristics of the image. We propose a new alignment term based on local homography and a local scale term based on the edge information of the mesh. The resulting mesh-guided deformation model can overcome image parallax caused by external factors and eliminate ghosting in the result. Moreover, the imaged scene is not effectively planar and contains some elevation variations, which distort the stitching result. We propose a ground constraint, with the ground plane as the main plane, to reduce projection distortions in non-overlapping areas between images. Finally, a method for creating ground truth is proposed, which makes it possible to evaluate the naturalness of the results and makes comparisons more reasonable. Several sets of challenging drone images are tested, and the experimental results show that our stitching system performs well.


Journal Article•DOI•

Deep corner prediction to rectify tilted license plate images


Hojin Yoo¹, Kyungkoo Jun¹•Institutions (1)

Incheon National University¹

01 Aug 2021-Multimedia Systems

TL;DR: This work proposes deep neural network models that can locate the four corner positions of a plate, which can then be used to perform the perspective transformation that rectifies the plate.


Abstract: Skewness and obliqueness of vehicle plate images influence license plate recognition. The more tilted the plate images are, the harder the recognition task is. Hence, if plate images are preprocessed to be aligned and rectified, the recognition performance should improve. We propose deep neural network models that can locate the four corner positions of a plate, which can then be used to perform the perspective transformation that rectifies the plate. Such a transformation is called a homography. The models consist of two sequential parts: a convolutional feature extraction part and a regression part with fully connected layers. The models are open in the sense that the feature extraction part can host other well-known models, such as MobileNet, as long as they have the feature capture capability. We devise a loss function as the sum of Euclidean distances between predicted coordinates and ground truth, and discuss image augmentation schemes. The experimental results show that the models built on well-known object detection models are able to predict corner positions with relatively high precision.
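As an illustration of the two ingredients named in this abstract, the following Python sketch computes a homography from four corner correspondences with the standard direct linear transform, and evaluates a corner loss as a sum of Euclidean distances. The function names `homography_from_corners` and `corner_loss` are hypothetical and the code is not taken from the paper; in particular, the paper's exact loss normalization is not stated here, so the plain sum is an assumption.

```python
import numpy as np

def homography_from_corners(src, dst):
    """Estimate H (up to scale) from four point pairs via the direct linear transform."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each pair contributes two linear constraints on the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)          # null vector of the constraint matrix
    return H / H[2, 2]                # assumes H[2, 2] is not close to zero

def corner_loss(pred, target):
    """Sum of Euclidean distances between predicted and ground-truth corners."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.linalg.norm(diff, axis=1).sum())
```

In use, `src` would hold the four predicted plate corners and `dst` the corners of the upright target rectangle; the resulting matrix can then be passed to any perspective-warping routine.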


Journal Article•DOI•

Homography-based camera pose estimation with known gravity direction for UAV navigation


Chunhui Zhao¹, Bin Fan¹, Jinwen Hu¹, Quan Pan¹, Zhao Xu¹•Institutions (1)

Northwestern Polytechnical University¹

01 Jan 2021-Science in China Series F: Information Sciences

TL;DR: This paper statistically optimizes the solution for the homography-based relative pose estimation problem; assuming a known gravity direction and a dominant ground plane, the homography representation enables a least squares pose estimation between two views.


Abstract: Relative pose estimation has become a fundamental and important problem in visual simultaneous localization and mapping. This paper statistically optimizes the solution for the homography-based relative pose estimation problem. Assuming a known gravity direction and a dominant ground plane, the homography representation in the normalized image plane enables a least squares pose estimation between two views. Furthermore, an iterative estimation method of the camera trajectory is developed for visual odometry. The accuracy and robustness of the proposed algorithm are experimentally tested on synthetic and real data in indoor and outdoor environments. Various metrics confirm the effectiveness of the proposed method in practical applications.
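To make the setting concrete, here is a minimal Python sketch of a linear least-squares fit under the stated assumptions. It is a generic textbook-style formulation and not the optimized estimator of the paper. It assumes both views have been rotated so that the camera y-axis is aligned with gravity, that the ground plane has normal n = (0, 1, 0) at distance d from the first camera, and that the plane-induced homography in normalized coordinates is H = R_y(θ) + (t/d) nᵀ. The function name and the sign conventions are assumptions made for illustration.

```python
import numpy as np

def estimate_gravity_aligned_homography(x1, x2):
    """Fit H = R_y(theta) + (t/d) n^T with n = (0, 1, 0) by linear least squares.

    x1, x2: (N, 2) normalized image coordinates of ground-plane points
    in two gravity-aligned views. Returns (theta, t_over_d).
    """
    rows = []
    for (u1, v1), (u2, v2) in zip(x1, x2):
        # Unknown vector: (cos, sin, tx/d, 1 + ty/d, tz/d); the two rows
        # come from the cross product x2 x (H x1) = 0.
        rows.append([v2, -v2 * u1, 0.0, -v1, v2 * v1])
        rows.append([u1 - u2, 1.0 + u2 * u1, v1, 0.0, -u2 * v1])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    sol = vt[-1]
    sol = sol / np.hypot(sol[0], sol[1])      # enforce cos^2 + sin^2 = 1
    u1, v1 = x1[0]
    if -sol[1] * u1 + sol[4] * v1 + sol[0] < 0:
        sol = -sol                            # depth ratio must be positive
    c, s, tx, ty1, tz = sol
    return float(np.arctan2(s, c)), np.array([tx, ty1 - 1.0, tz])
```

With gravity known, the rotation reduces to a single yaw angle, so the homography has four degrees of freedom instead of eight. The translation is recovered only up to the unknown plane distance d.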


Journal Article•DOI•

A Homography-Based Dynamic Control Approach Applied to Station Keeping of Autonomous Underwater Vehicles Without Linear Velocity Measurements


Lam-Hung Nguyen¹, Minh-Duc Hua¹, Guillaume Allibert¹, Tarek Hamel¹•Institutions (1)

Centre national de la recherche scientifique¹

01 Sep 2021-IEEE Transactions on Control Systems and Technology

TL;DR: A homography-based dynamic control approach for station keeping of autonomous underwater vehicles (AUVs) without linear velocity measurements is proposed; the controller is robust with respect to model uncertainties and unknown currents.


Abstract: A homography-based dynamic control approach applied to station keeping of autonomous underwater vehicles (AUVs) without relying on linear velocity measurements is proposed. The homography estimated from images of a planar target scene captured by a downward-looking camera is directly used as feedback information. The full dynamics of the AUV are exploited in a hierarchical control design with inner and outer loop architectures. Enhanced by integral compensation actions and disturbance torque estimation, the proposed controller is robust with respect to model uncertainties and unknown currents. The performance of the proposed control approach is illustrated using both comparative simulation results conducted on a realistic AUV model and experimental validations on an in-house AUV.
