Author

Venu Madhav Govindu

Bio: Venu Madhav Govindu is an academic researcher at the Indian Institute of Science. His research focuses on motion estimation and affine transformations. He has an h-index of 18 and has co-authored 42 publications receiving 1,552 citations. His previous affiliations include the University of Maryland, College Park.

Papers
Proceedings ArticleDOI
19 Jul 2004
TL;DR: The Lie-algebras of the Special Orthogonal and Special Euclidean groups are used to define averages on the Lie-group which gives statistically meaningful, efficient and accurate algorithms for fusing motion information.
Abstract: While motion estimation has been extensively studied in the computer vision literature, the inherent information redundancy in an image sequence has not been well utilised. In particular as many as N(N-1)/2 pairwise relative motions can be estimated efficiently from a sequence of N images. This highly redundant set of observations can be efficiently averaged resulting in fast motion estimation algorithms that are globally consistent. In this paper we demonstrate this using the underlying Lie-group structure of motion representations. The Lie-algebras of the Special Orthogonal and Special Euclidean groups are used to define averages on the Lie-group which in turn gives statistically meaningful, efficient and accurate algorithms for fusing motion information. Using multiple constraints also controls the drift in the solution due to accumulating error. The performance of the method in estimating camera motion is demonstrated on image sequences.
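To make the averaging step concrete, here is a minimal sketch (ours, not the paper's implementation) of the geodesic mean of 3×3 rotation matrices: map into the Lie algebra so(3) with the matrix logarithm, average the axis-angle vectors in the tangent space, and map back with the exponential, iterating until the update vanishes.

```python
import numpy as np

def hat(w):
    # Skew-symmetric matrix of a 3-vector (so(3) element).
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    # Rodrigues' formula: axis-angle vector -> rotation matrix.
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    K = hat(w / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

def log_so3(R):
    # Rotation matrix -> axis-angle vector (assumes angle < pi).
    th = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if th < 1e-12:
        return np.zeros(3)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return th / (2.0 * np.sin(th)) * w

def rotation_mean(Rs, iters=50):
    # Iterative geodesic (Karcher) mean: average in the tangent space
    # at the current estimate, step along the exponential map, repeat.
    mu = Rs[0]
    for _ in range(iters):
        d = np.mean([log_so3(mu.T @ R) for R in Rs], axis=0)
        mu = mu @ exp_so3(d)
        if np.linalg.norm(d) < 1e-12:
            break
    return mu
```

For rotations about a common axis the mean is exact after one update; in general a few iterations suffice.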

274 citations

Proceedings ArticleDOI
01 Dec 2013
TL;DR: The approach works on the Lie group structure of 3D rotations and solves the problem of large-scale robust rotation averaging in two ways, using modern ℓ1 optimizers to carry out robust averaging of relative rotations that is efficient, scalable and robust to outliers.
Abstract: In this paper we address the problem of robust and efficient averaging of relative 3D rotations. Apart from having an interesting geometric structure, robust rotation averaging addresses the need for a good initialization for large scale optimization used in structure-from-motion pipelines. Such pipelines often use unstructured image datasets harvested from the internet thereby requiring an initialization method that is robust to outliers. Our approach works on the Lie group structure of 3D rotations and solves the problem of large-scale robust rotation averaging in two ways. Firstly, we use modern ℓ1 optimizers to carry out robust averaging of relative rotations that is efficient, scalable and robust to outliers. In addition, we also develop a two-step method that uses the ℓ1 solution as an initialisation for an iteratively reweighted least squares (IRLS) approach. These methods achieve excellent results on large-scale, real world datasets and significantly outperform existing methods, namely the state-of-the-art discrete-continuous optimization method of [3] as well as the Weiszfeld method of [8]. We demonstrate the efficacy of our method on two large scale real world datasets and also provide the results of the two aforementioned methods for comparison.
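A toy single-rotation analogue of this robust averaging is the Weiszfeld-style geodesic ℓ1 median: residuals in the tangent space are reweighted by 1/‖r‖ each sweep, so gross outliers contribute almost nothing. The sketch below is our own small-scale illustration of the idea, not the paper's graph-wide solver.

```python
import numpy as np

def hat(w):
    # Skew-symmetric matrix of a 3-vector.
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    # Rodrigues' formula: axis-angle vector -> rotation matrix.
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    K = hat(w / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

def log_so3(R):
    # Rotation matrix -> axis-angle vector (assumes angle < pi).
    th = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if th < 1e-12:
        return np.zeros(3)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return th / (2.0 * np.sin(th)) * w

def l1_rotation_median(Rs, iters=100):
    # Start from the (non-robust) tangent-space mean, then apply
    # Weiszfeld updates: each residual is weighted by 1/||r||, which
    # is exactly the IRLS weighting for an L1 cost.
    mu = exp_so3(np.mean([log_so3(R) for R in Rs], axis=0))
    for _ in range(iters):
        vs = [log_so3(mu.T @ R) for R in Rs]
        ws = [1.0 / max(np.linalg.norm(v), 1e-9) for v in vs]
        step = sum(w * v for w, v in zip(ws, vs)) / sum(ws)
        mu = mu @ exp_so3(step)
        if np.linalg.norm(step) < 1e-12:
            break
    return mu
```

With two agreeing rotations and one far-off outlier, the median snaps to the agreeing pair while the plain mean would be dragged toward the outlier.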

243 citations

Proceedings ArticleDOI
01 Dec 2001
TL;DR: This paper shows how to linearly solve for consistent global motion models using this highly redundant set of constraints to estimate the motion parameters of an image sequence.
Abstract: In this paper we describe two methods for estimating the motion parameters of an image sequence. For a sequence of N images, the global motion can be described by N-1 independent motion models. On the other hand, in a sequence there exist as many as N(N-1)/2 pairwise relative motion constraints that can be solved for efficiently. In this paper we show how to linearly solve for consistent global motion models using this highly redundant set of constraints. In the first case, our method involves estimating all available pairwise relative motions and linearly fitting a global motion model to these estimates. In the second instance, we exploit the fact that algebraic (i.e. epipolar) constraints between various image pairs are all related to each other by the global motion model. This results in an estimation method that directly computes the motion of the sequence by using all possible algebraic constraints. Unlike using reprojection error, our optimisation method does not solve for the structure of points, resulting in a reduction of the dimensionality of the search space. Our algorithms are used for both 3D camera motion estimation and camera calibration. We provide real examples of both applications.
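A minimal illustration of the linear-fitting idea (a toy version, not the paper's exact formulation) represents each rotation by a unit quaternion: the pairwise constraints q_j = q_ij ⊗ q_i are linear in the unknown quaternions, so all absolute rotations can be read off the nullspace of one stacked linear system, determined up to a single global gauge rotation.

```python
import numpy as np

def qmul(a, b):
    # Hamilton product of quaternions [w, x, y, z].
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qconj(q):
    # Conjugate; equals the inverse for unit quaternions.
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def left_mat(q):
    # 4x4 matrix L(q) with L(q) @ p == qmul(q, p).
    w, x, y, z = q
    return np.array([[w, -x, -y, -z],
                     [x,  w, -z,  y],
                     [y,  z,  w, -x],
                     [z, -y,  x,  w]])

def linear_rotation_sync(n, edges):
    # Stack the linear constraints L(q_ij) q_i - q_j = 0 for every
    # measured pair (i, j, q_ij) and take the smallest right singular
    # vector: a nullspace element of the stacked system, i.e. the
    # absolute rotations up to one global gauge quaternion.
    A = np.zeros((4 * len(edges), 4 * n))
    for r, (i, j, qij) in enumerate(edges):
        A[4*r:4*r+4, 4*i:4*i+4] = left_mat(qij)
        A[4*r:4*r+4, 4*j:4*j+4] = -np.eye(4)
    _, _, Vt = np.linalg.svd(A)
    x = Vt[-1].reshape(n, 4)
    return [q / np.linalg.norm(q) for q in x]
```

Because the solution is only defined up to a global rotation, correctness is checked on the recovered relative rotations rather than the absolute ones.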

219 citations

Proceedings ArticleDOI
14 Jun 2020
TL;DR: 3DRegNet is a novel deep learning architecture for the registration of 3D scans that uses a set of point correspondences to address two challenges: (i) classification of the point correspondences into inliers/outliers, and (ii) regression of the motion parameters that align the scans into a common reference frame.
Abstract: We present 3DRegNet, a novel deep learning architecture for the registration of 3D scans. Given a set of 3D point correspondences, we build a deep neural network to address the following two challenges: (i) classification of the point correspondences into inliers/outliers, and (ii) regression of the motion parameters that align the scans into a common reference frame. With regard to regression, we present two alternative approaches: (i) a Deep Neural Network (DNN) registration and (ii) a Procrustes approach using SVD to estimate the transformation. Our correspondence-based approach achieves a higher speedup compared to competing baselines. We further propose the use of a refinement network, which consists of a smaller 3DRegNet as a refinement to improve the accuracy of the registration. Extensive experiments on two challenging datasets demonstrate that we outperform other methods and achieve state-of-the-art results. The code is available.
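The SVD/Procrustes alternative mentioned for the regression stage can be sketched with the classical closed-form rigid alignment (the Kabsch / orthogonal Procrustes solution). This is the standard textbook routine applied once inlier correspondences are known, not the 3DRegNet code itself.

```python
import numpy as np

def kabsch(P, Q):
    # Closed-form rigid alignment: find rotation R and translation t
    # such that Q ≈ P @ R.T + t for row-stacked corresponding points.
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t
```

Given noise-free inliers the recovery is exact; with noisy inliers it is the least-squares optimum.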

129 citations

Journal ArticleDOI
TL;DR: This paper proposes a generalized framework of relative rotation averaging that can use different robust loss functions and jointly optimizes for all the unknown camera rotations, and uses a quasi-Newton optimization which results in an efficient iteratively reweighted least squares (IRLS) formulation that works in the Lie algebra of the 3D rotation group.
Abstract: This paper addresses the problem of robust and efficient relative rotation averaging in the context of large-scale Structure from Motion. Relative rotation averaging finds global or absolute rotations for a set of cameras from a set of observed relative rotations between pairs of cameras. We propose a generalized framework of relative rotation averaging that can use different robust loss functions and jointly optimizes for all the unknown camera rotations. Our method uses a quasi-Newton optimization which results in an efficient iteratively reweighted least squares (IRLS) formulation that works in the Lie algebra of the 3D rotation group. We demonstrate the performance of our approach on a number of large-scale data sets. We show that our method outperforms existing methods in the literature both in terms of speed and accuracy.
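The IRLS machinery behind this formulation can be shown on a deliberately simple 1-D problem: robust location estimation under a Huber loss, where each sweep re-solves a weighted least squares with weights w(r) = ρ'(r)/r from the previous residuals. This is a toy illustration of the reweighting principle only, not the paper's Lie-algebraic rotation solver.

```python
import numpy as np

def huber_weight(r, k=1.0):
    # IRLS weight w(r) = rho'(r)/r for the Huber loss with threshold k:
    # quadratic (weight 1) for small residuals, linear (weight k/|r|)
    # for large ones.
    a = abs(r)
    return 1.0 if a <= k else k / a

def irls_location(xs, k=1.0, iters=50):
    # Each sweep is a weighted least squares solve (here just a
    # weighted mean), so outliers are progressively down-weighted.
    xs = np.asarray(xs, dtype=float)
    mu = xs.mean()
    for _ in range(iters):
        ws = np.array([huber_weight(x - mu, k) for x in xs])
        new_mu = float(np.sum(ws * xs) / np.sum(ws))
        if abs(new_mu - mu) < 1e-12:
            return new_mu
        mu = new_mu
    return mu
```

On data {0.9, 1.0, 1.1, 10.0} the non-robust mean is 3.25, while the Huber IRLS estimate settles at 4/3 (the fixed point where the three inliers get unit weight and the outlier weight is k/|r|).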

128 citations


Cited by
Journal ArticleDOI
TL;DR: A review of recent as well as classic image registration methods to provide a comprehensive reference source for the researchers involved in image registration, regardless of particular application areas.

6,842 citations

Journal ArticleDOI
TL;DR: This paper compares the running times of several standard min-cut/max-flow algorithms, as well as a new algorithm recently developed by the authors that works several times faster than any of the other methods, making near real-time performance possible.
Abstract: Minimum cut/maximum flow algorithms on graphs have emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max-flow algorithms for applications in vision. We compare the running times of several standard algorithms, as well as a new algorithm that we have recently developed. The algorithms we study include both Goldberg-Tarjan style "push-relabel" methods and algorithms based on Ford-Fulkerson style "augmenting paths." We benchmark these algorithms on a number of typical graphs in the contexts of image restoration, stereo, and segmentation. In many cases, our new algorithm works several times faster than any of the other methods, making near real-time performance possible. An implementation of our max-flow/min-cut algorithm is available upon request for research purposes.
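For readers unfamiliar with the primitive being benchmarked, here is a compact textbook max-flow routine (Edmonds-Karp: Ford-Fulkerson with BFS shortest augmenting paths) on a dense capacity matrix; by max-flow/min-cut duality the returned value is also the minimum s-t cut. This is a didactic sketch, not the authors' specialized algorithm.

```python
from collections import deque

def max_flow(capacity, s, t):
    # Edmonds-Karp on an n x n capacity matrix; returns the max-flow
    # value from s to t, which equals the minimum s-t cut.
    n = len(capacity)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and capacity[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:              # no augmenting path left
            return total
        push = float('inf')              # bottleneck residual capacity
        v = t
        while v != s:
            u = parent[v]
            push = min(push, capacity[u][v] - flow[u][v])
            v = u
        v = t
        while v != s:                    # augment along the path
            u = parent[v]
            flow[u][v] += push
            flow[v][u] -= push           # creates reverse residual capacity
            v = u
        total += push
```

On the small graph below (source 0, sink 3), three augmenting paths yield a total flow of 5, matching the capacity of the cut {0} vs {1, 2, 3}.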

4,463 citations

Book
30 Sep 2010
TL;DR: Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images and takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene.
Abstract: Humans perceive the three-dimensional structure of the world with apparent ease. However, despite all of the recent advances in computer vision research, the dream of having a computer interpret an image at the same level as a two-year old remains elusive. Why is computer vision such a challenging problem and what is the current state of the art? Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos. More than just a source of recipes, this exceptionally authoritative and comprehensive textbook/reference also takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene. These problems are also analyzed using statistical models and solved using rigorous engineering techniques. Topics and features: structured to support active curricula and project-oriented courses, with tips in the Introduction for using the book in a variety of customized courses; presents exercises at the end of each chapter with a heavy emphasis on testing algorithms and containing numerous suggestions for small mid-term projects; provides additional material and more detailed mathematical topics in the Appendices, which cover linear algebra, numerical techniques, and Bayesian estimation theory; suggests additional reading at the end of each chapter, including the latest research in each sub-field, in addition to a full Bibliography at the end of the book; supplies supplementary course material for students at the associated website, http://szeliski.org/Book/.
Suitable for an upper-level undergraduate or graduate-level course in computer science or engineering, this textbook focuses on basic techniques that work under real-world conditions and encourages students to push their creative boundaries. Its design and exposition also make it eminently suitable as a unique reference to the fundamental techniques and current research literature in computer vision.

4,146 citations

Book ChapterDOI
03 Sep 2001
TL;DR: The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max-flow algorithms for applications in vision, comparing the running times of several standard algorithms as well as a new, recently developed algorithm.
Abstract: After [10, 15, 12, 2, 4], minimum cut/maximum flow algorithms on graphs emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max-flow algorithms for energy minimization in vision. We compare the running times of several standard algorithms, as well as a new algorithm that we have recently developed. The algorithms we study include both Goldberg-style "push-relabel" methods and algorithms based on Ford-Fulkerson style augmenting paths. We benchmark these algorithms on a number of typical graphs in the contexts of image restoration, stereo, and interactive segmentation. In many cases our new algorithm works several times faster than any of the other methods, making near real-time performance possible.

3,099 citations

Journal ArticleDOI
TL;DR: Simultaneous localization and mapping (SLAM) consists of the concurrent construction of a model of the environment (the map) and the estimation of the state of the robot moving within it.
Abstract: Simultaneous localization and mapping (SLAM) consists in the concurrent construction of a model of the environment (the map), and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications and witnessing a steady transition of this technology to industry. We survey the current state of SLAM and consider future directions. We start by presenting what is now the de facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues that still deserve careful scientific investigation. The paper also contains the authors’ take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?

2,039 citations