Topic

Homography (computer vision)

About: Homography (computer vision) is a research topic. Over its lifetime, 2,247 publications have been published within this topic, receiving 51,916 citations.

Papers

Proceedings Article•

An iterative image registration technique with an application to stereo vision

Bruce D. Lucas¹, Takeo Kanade¹•Institutions (1)

Carnegie Mellon University¹

24 Aug 1981

TL;DR: In this paper, the spatial intensity gradient of the images is used to find a good match using a type of Newton-Raphson iteration, which can be generalized to handle rotation, scaling and shearing.

Abstract: Image registration finds a variety of applications in computer vision. Unfortunately, traditional image registration techniques tend to be costly. We present a new image registration technique that makes use of the spatial intensity gradient of the images to find a good match using a type of Newton-Raphson iteration. Our technique is faster because it examines far fewer potential matches between the images than existing techniques. Furthermore, this registration technique can be generalized to handle rotation, scaling and shearing. We show how our technique can be adapted for use in a stereo vision system.

12,944 citations
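
The gradient-based iteration described in the abstract above can be illustrated in one dimension. The sketch below is only a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian test signal, the names F, dF and estimate_shift, the unit weighting of samples and the fixed iteration count are all hypothetical choices made for this example. It estimates the displacement h that aligns F(x + h) with a second signal G(x) by repeatedly applying a Newton-Raphson style update built from the spatial gradient of F.

```python
import math

SIGMA = 1.5  # width of the synthetic test signal (assumed for this example)

def F(x):
    """Reference signal: a smooth Gaussian bump."""
    return math.exp(-x * x / (2 * SIGMA ** 2))

def dF(x):
    """Spatial intensity gradient of F."""
    return -x / SIGMA ** 2 * F(x)

def estimate_shift(G, xs, h=0.0, iters=20):
    """Find h such that F(x + h) matches G(x) on the sample positions xs."""
    for _ in range(iters):
        # Gradient-weighted residual between the two signals at the current h.
        num = sum(dF(x + h) * (G(x) - F(x + h)) for x in xs)
        den = sum(dF(x + h) ** 2 for x in xs)
        h += num / den  # Newton-Raphson style correction of the estimate
    return h

true_shift = 0.7

def G(x):
    """Second signal: F displaced by true_shift."""
    return F(x + true_shift)

xs = [-5 + 0.1 * i for i in range(101)]  # sample positions
h_est = estimate_shift(G, xs)
print("estimated shift:", h_est)
```

Each pass evaluates the signals only at the current estimate rather than at every candidate displacement, which is the sense in which far fewer potential matches are examined.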

Journal Article•DOI•

Image registration methods: a survey

Barbara Zitová¹, Jan Flusser¹•Institutions (1)

Academy of Sciences of the Czech Republic¹

01 Oct 2003-Image and Vision Computing

TL;DR: A review of recent as well as classic image registration methods to provide a comprehensive reference source for the researchers involved in image registration, regardless of particular application areas.

6,842 citations

Proceedings Article•DOI•

Action Recognition with Improved Trajectories

Heng Wang¹, Cordelia Schmid¹•Institutions (1)

French Institute for Research in Computer Science and Automation¹

01 Dec 2013

TL;DR: Dense trajectories, previously shown to be an efficient video representation for action recognition with state-of-the-art results on a variety of datasets, are improved by taking camera motion into account to correct them.

Abstract: Recently dense trajectories were shown to be an efficient video representation for action recognition and achieved state-of-the-art results on a variety of datasets. This paper improves their performance by taking into account camera motion to correct them. To estimate camera motion, we match feature points between frames using SURF descriptors and dense optical flow, which are shown to be complementary. These matches are, then, used to robustly estimate a homography with RANSAC. Human motion is in general different from camera motion and generates inconsistent matches. To improve the estimation, a human detector is employed to remove these matches. Given the estimated camera motion, we remove trajectories consistent with it. We also use this estimation to cancel out camera motion from the optical flow. This significantly improves motion-based descriptors, such as HOF and MBH. Experimental results on four challenging action datasets (i.e., Hollywood2, HMDB51, Olympic Sports and UCF50) significantly outperform the current state of the art.

3,487 citations
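
The robust homography step mentioned in the abstract above (matches between frames, then RANSAC) can be sketched as follows. This is a hedged, self-contained illustration and not the paper's pipeline: the matches are synthetic rather than SURF or optical-flow matches, no human detector is involved, and the helper names (solve, fit_homography, project, error, ransac), the 1-pixel tolerance and the iteration count are assumptions made for this example. The homography is fitted from four matches with its last entry fixed to 1.

```python
import math
import random

def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting; None if singular."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-12:
            return None  # degenerate sample, e.g. collinear points
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_homography(sample):
    """Fit H (row-major, H[8] fixed to 1) from four (x, y) -> (u, v) matches."""
    A, b = [], []
    for (x, y), (u, v) in sample:
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve(A, b)
    return None if h is None else h + [1.0]

def project(H, p):
    """Apply H to the point p with the perspective division."""
    x, y = p
    w = H[6] * x + H[7] * y + H[8]
    if abs(w) < 1e-12:
        return (float("inf"), float("inf"))
    return ((H[0] * x + H[1] * y + H[2]) / w, (H[3] * x + H[4] * y + H[5]) / w)

def error(H, match):
    """Reprojection error of one match under H, in pixels."""
    (u, v), (pu, pv) = match[1], project(H, match[0])
    return math.hypot(u - pu, v - pv)

def ransac(matches, iters=200, tol=1.0, seed=0):
    """Keep the 4-point model that agrees with the most matches."""
    rng = random.Random(seed)
    best_H, best_inliers = None, []
    for _ in range(iters):
        H = fit_homography(rng.sample(matches, 4))
        if H is None:
            continue
        inliers = [m for m in matches if error(H, m) < tol]
        if len(inliers) > len(best_inliers):
            best_H, best_inliers = H, inliers
    return best_H, best_inliers

# Synthetic data: 25 background matches follow H_true (camera motion);
# 8 matches are displaced independently, standing in for a moving person.
rng = random.Random(1)
H_true = [1.0, 0.02, 5.0, -0.01, 1.0, -3.0, 1e-4, 2e-4, 1.0]
matches = []
for _ in range(25):
    p = (rng.uniform(0, 200), rng.uniform(0, 200))
    matches.append((p, project(H_true, p)))
for _ in range(8):
    p = (rng.uniform(0, 200), rng.uniform(0, 200))
    u, v = project(H_true, p)
    matches.append((p, (u + rng.uniform(20, 60), v - rng.uniform(20, 60))))

H_est, inliers = ransac(matches)
print(len(inliers), "of", len(matches), "matches kept as camera motion")
```

Matches that disagree with the dominant homography are left out of the inlier set, which mirrors the abstract's observation that human motion generates matches inconsistent with camera motion.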

Proceedings Article•DOI•

Visual attention detection in video sequences using spatiotemporal cues

Yun Zhai¹, Mubarak Shah¹•Institutions (1)

University of Central Florida¹

23 Oct 2006

TL;DR: The proposed spatiotemporal video attention framework has been applied on over 20 testing video sequences, and attended regions are detected to highlight interesting objects and motions present in the sequences with very high user satisfaction rate.

Abstract: The human vision system actively seeks interesting regions in images to reduce the search effort in tasks such as object detection and recognition. Similarly, prominent actions in video sequences are more likely to attract our first sight than their surrounding neighbors. In this paper, we propose a spatiotemporal video attention detection technique for detecting the attended regions that correspond to both interesting objects and actions in video sequences. Both spatial and temporal saliency maps are constructed and further fused in a dynamic fashion to produce the overall spatiotemporal attention model. In the temporal attention model, motion contrast is computed based on the planar motions (homography) between images, which are estimated by applying RANSAC on point correspondences in the scene. To compensate for the non-uniformity of the spatial distribution of interest points, spanning areas of motion segments are incorporated in the motion contrast computation. In the spatial attention model, a fast method for computing pixel-level saliency maps has been developed using color histograms of images. A hierarchical spatial attention representation is established to reveal the interesting points in images as well as the interesting regions. Finally, a dynamic fusion technique is applied to combine both the temporal and spatial saliency maps, where temporal attention is dominant over the spatial model when large motion contrast exists, and vice versa. The proposed spatiotemporal attention framework has been applied on over 20 testing video sequences, and attended regions are detected to highlight interesting objects and motions present in the sequences with very high user satisfaction rate.

983 citations
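
The abstract above mentions a fast histogram-based method for pixel-level spatial saliency without giving its formula. The sketch below shows one plausible reading, stated here as an assumption: a pixel's saliency is its summed absolute intensity difference to every other pixel, computed once per distinct intensity level from the histogram instead of once per pixel pair. The function name histogram_saliency and the use of a flat list of grey levels are hypothetical simplifications for this example.

```python
from collections import Counter

def histogram_saliency(pixels):
    """Contrast of each pixel against the whole image, via the histogram."""
    hist = Counter(pixels)  # intensity level -> number of pixels
    # One contrast value per distinct level: sum of count * |level - other|.
    table = {a: sum(f * abs(a - b) for b, f in hist.items()) for a in hist}
    return [table[p] for p in pixels]

image = [0, 0, 0, 10]  # three dark pixels and one bright pixel
saliency = histogram_saliency(image)
print(saliency)
```

Because the lookup table has one entry per distinct level, the cost depends on the number of levels rather than on the square of the pixel count.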

Book Chapter•DOI•

MVSNet: Depth inference for unstructured multi-view stereo

Yao Yao¹, Zixin Luo¹, Shiwei Li¹, Tian Fang, Long Quan¹•Institutions (1)

Hong Kong University of Science and Technology¹

08 Sep 2018

TL;DR: This work presents an end-to-end deep learning architecture for depth map inference from multi-view images that flexibly adapts arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature.

Abstract: We present an end-to-end deep learning architecture for depth map inference from multi-view images. In the network, we first extract deep visual image features, and then build the 3D cost volume upon the reference camera frustum via the differentiable homography warping. Next, we apply 3D convolutions to regularize and regress the initial depth map, which is then refined with the reference image to generate the final output. Our framework flexibly adapts arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature. The proposed MVSNet is demonstrated on the large-scale indoor DTU dataset. With simple post-processing, our method not only significantly outperforms previous state-of-the-art methods, but also is several times faster in runtime. We also evaluate MVSNet on the complex outdoor Tanks and Temples dataset, where our method ranks first before April 18, 2018 without any fine-tuning, showing the strong generalization ability of MVSNet.

746 citations
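
The variance-based cost metric named in the abstract above can be illustrated with a small sketch. The abstract does not spell out the formula, so this assumes the natural reading: for N warped feature volumes, the cost at each position is the element-wise variance across the N views, which yields one cost value per position regardless of N. The name variance_cost and the flattened-list representation of a feature volume are hypothetical choices for this example.

```python
def variance_cost(feature_volumes):
    """Element-wise variance across N equally sized feature volumes."""
    n = len(feature_volumes)
    size = len(feature_volumes[0])
    # Mean feature at each position over all views.
    mean = [sum(v[j] for v in feature_volumes) / n for j in range(size)]
    # Mean squared deviation from that mean: low where the views agree.
    return [sum((v[j] - mean[j]) ** 2 for v in feature_volumes) / n
            for j in range(size)]

# Three views, two positions: the views disagree at position 0, agree at 1.
views = [[1.0, 2.0], [3.0, 2.0], [5.0, 2.0]]
cost = variance_cost(views)
print(cost)
```

The same function accepts two views or ten, which is the property that lets a single cost feature summarize an arbitrary number of inputs.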

Performance

Metrics

Papers: 2,247

Citations: 57,300

No. of papers in the topic in previous years
Year	Papers
2022	3
2021	108
2020	110
2019	145
2018	131
2017	127
