Minimum bounding box

About: Minimum bounding box is a research topic. Over its lifetime, 5,561 publications have been published within this topic, receiving 138,240 citations. The topic is also known as MBB.


Papers
Journal ArticleDOI
TL;DR: A way to speed up overlap tests between AABBs, such that for collision detection of rigid models, the difference in performance between the two representations is greatly reduced.
Abstract: We present a scheme for exact collision detection between complex models undergoing rigid motion and deformation. The scheme relies on a hierarchical model representation using axis-aligned bounding boxes (AABBs). Recent work has shown that AABB trees are slower than oriented bounding box (OBB) trees for performing overlap tests. In this paper, we describe a way to speed up overlap tests between AABBs, such that for collision detection of rigid models, the difference in performance between the two representations is greatly reduced. Furthermore, we show how to update an AABB tree quickly as a model is deformed. We thus find AABB trees to be the method of choice for collision detection of complex models undergoing deformation. In fact, because they are not much slower to test, are faster to build, and use less storage than OBB trees, AABB trees might be a reasonable choice for rigid models as well.

859 citations
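
As a rough illustration of the AABB operations the abstract above refers to, the following sketch builds an axis-aligned box from points, tests two boxes for overlap, and merges two child boxes into a parent. It shows the textbook per-axis interval test, not the accelerated test the paper describes, and the bottom-up `merge` step is only one plausible way to refit a tree after deformation. All function names here are hypothetical.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]
AABB = Tuple[Point, Point]  # (min corner, max corner)


def aabb_of(points: Iterable[Point]) -> AABB:
    """Smallest axis-aligned box containing all the given 3D points."""
    pts = list(points)
    if not pts:
        raise ValueError("need at least one point")
    lo = tuple(min(p[i] for p in pts) for i in range(3))
    hi = tuple(max(p[i] for p in pts) for i in range(3))
    return lo, hi


def aabbs_overlap(a: AABB, b: AABB) -> bool:
    """Two AABBs overlap exactly when their intervals overlap on every axis."""
    (alo, ahi), (blo, bhi) = a, b
    return all(alo[i] <= bhi[i] and blo[i] <= ahi[i] for i in range(3))


def merge(a: AABB, b: AABB) -> AABB:
    """Box enclosing two child boxes (assumed refit step: children first, then parent)."""
    (alo, ahi), (blo, bhi) = a, b
    lo = tuple(min(alo[i], blo[i]) for i in range(3))
    hi = tuple(max(ahi[i], bhi[i]) for i in range(3))
    return lo, hi
```

Refitting a parent from its children costs a constant number of comparisons per node, which is consistent with the abstract's remark that AABB trees are cheap to update, although the paper's own update procedure may differ in detail.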

Proceedings ArticleDOI
15 Jun 2019
TL;DR: The proposed method performs on par with state-of-the-art region-based detection methods, with a bounding box AP of 43.7% on COCO test-dev; the estimated extreme points span a coarse octagonal mask with 18.9% Mask AP, which extreme-point-guided segmentation further improves to 34.6%.
Abstract: With the advent of deep learning, object detection drifted from a bottom-up to a top-down recognition problem. State of the art algorithms enumerate a near-exhaustive list of object locations and classify each into: object or not. In this paper, we show that bottom-up approaches still perform competitively. We detect four extreme points (top-most, left-most, bottom-most, right-most) and one center point of objects using a standard keypoint estimation network. We group the five keypoints into a bounding box if they are geometrically aligned. Object detection is then a purely appearance-based keypoint estimation problem, without region classification or implicit feature learning. The proposed method performs on-par with the state-of-the-art region based detection methods, with a bounding box AP of 43.7% on COCO test-dev. In addition, our estimated extreme points directly span a coarse octagonal mask, with a COCO Mask AP of 18.9%, much better than the Mask AP of vanilla bounding boxes. Extreme point guided segmentation further improves this to 34.6% Mask AP.

811 citations
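
The relationship between the four extreme points and the bounding box in the abstract above can be sketched as follows. This assumes image coordinates with y growing downward. The `is_consistent` check is a hypothetical simplification: it compares a detected center point against the geometric center of the box, whereas the paper's actual grouping criterion is not spelled out in the abstract.

```python
from typing import Tuple

Point = Tuple[float, float]          # (x, y), y grows downward
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def box_from_extremes(top: Point, left: Point, bottom: Point, right: Point) -> Box:
    """Bounding box spanned by the top-most, left-most, bottom-most and right-most points."""
    return (left[0], top[1], right[0], bottom[1])


def box_center(box: Box) -> Point:
    """Geometric center of an axis-aligned box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def is_consistent(top: Point, left: Point, bottom: Point, right: Point,
                  center: Point, tol: float = 2.0) -> bool:
    """Hypothetical grouping check: the detected center lies within `tol`
    pixels of the geometric center of the box spanned by the extremes."""
    cx, cy = box_center(box_from_extremes(top, left, bottom, right))
    return abs(cx - center[0]) <= tol and abs(cy - center[1]) <= tol
```

Only one coordinate of each extreme point contributes to the box, which is why four keypoints are enough to define it.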

Proceedings ArticleDOI
TL;DR: UnitBox introduces an Intersection over Union ($IoU$) loss function for bounding box prediction, which regresses the four bounds of a predicted box as a whole unit.
Abstract: In present object detection systems, the deep convolutional neural networks (CNNs) are utilized to predict bounding boxes of object candidates, and have gained performance advantages over the traditional region proposal methods. However, existing deep CNN methods assume the object bounds to be four independent variables, which could be regressed by the $\ell_2$ loss separately. Such an oversimplified assumption is contrary to the well-received observation that those variables are correlated, resulting in less accurate localization. To address the issue, we first introduce a novel Intersection over Union ($IoU$) loss function for bounding box prediction, which regresses the four bounds of a predicted box as a whole unit. By taking advantage of the $IoU$ loss and deep fully convolutional networks, the UnitBox is introduced, which performs accurate and efficient localization, is robust to objects of varied shapes and scales, and converges fast. We apply UnitBox to the face detection task and achieve the best performance among all published methods on the FDDB benchmark.

781 citations
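
To make the idea of treating the four box bounds as one unit concrete, here is a minimal sketch of an IoU computation and an IoU-based loss for axis-aligned boxes. The negative-log form of the loss and the `eps` stabilizer are assumptions made for illustration, since the abstract above does not state the formula. This is plain Python with no gradients, not a training-ready implementation.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Width and height of the intersection, clamped at zero for disjoint boxes.
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred: Box, target: Box, eps: float = 1e-9) -> float:
    """Assumed loss form: -ln(IoU + eps). It is near zero for a perfect match
    and grows as the overlap shrinks."""
    return -math.log(iou(pred, target) + eps)
```

Because all four bounds enter the loss through a single ratio, an error in one bound changes the penalty for the others, unlike an $\ell_2$ loss applied to each bound separately.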

Proceedings ArticleDOI
01 Dec 2013
TL;DR: SUN3D, a large-scale RGB-D video database with camera pose and object labels, capturing the full 3D extent of many places is introduced, and a generalization of bundle adjustment that incorporates object-to-object correspondences is introduced.
Abstract: Existing scene understanding datasets contain only a limited set of views of a place, and they lack representations of complete 3D spaces. In this paper, we introduce SUN3D, a large-scale RGB-D video database with camera pose and object labels, capturing the full 3D extent of many places. The tasks that go into constructing such a dataset are difficult in isolation -- hand-labeling videos is painstaking, and structure from motion (SfM) is unreliable for large spaces. But if we combine them together, we make the dataset construction task much easier. First, we introduce an intuitive labeling tool that uses a partial reconstruction to propagate labels from one frame to another. Then we use the object labels to fix errors in the reconstruction. For this, we introduce a generalization of bundle adjustment that incorporates object-to-object correspondences. This algorithm works by constraining points for the same object from different frames to lie inside a fixed-size bounding box, parameterized by its rotation and translation. The SUN3D database, the source code for the generalized bundle adjustment, and the web-based 3D annotation tool are all available at http://sun3d.cs.princeton.edu.

779 citations
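
The constraint described in the abstract above, that points of one object should lie inside a fixed-size bounding box parameterized by rotation and translation, can be illustrated with a simple penalty. The sketch below is a 2D simplification with hypothetical names. It measures how far each point falls outside a rotated rectangle and sums the squared distances. It is not the paper's generalized bundle adjustment, only an example of the kind of residual such an optimizer could minimize.

```python
import math
from typing import Iterable, Tuple

Point2 = Tuple[float, float]


def outside_distance(point: Point2, center: Point2, angle: float,
                     half_size: Tuple[float, float]) -> float:
    """Distance from `point` to a rectangle centered at `center`, rotated by
    `angle` radians, with half-extents `half_size`. Returns 0.0 if inside."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    c, s = math.cos(angle), math.sin(angle)
    # Express the point in the box's local frame (inverse rotation).
    lx = c * dx + s * dy
    ly = -s * dx + c * dy
    # Per-axis excess beyond the half-extents, clamped at zero.
    ex = max(0.0, abs(lx) - half_size[0])
    ey = max(0.0, abs(ly) - half_size[1])
    return math.hypot(ex, ey)


def box_penalty(points: Iterable[Point2], center: Point2, angle: float,
                half_size: Tuple[float, float]) -> float:
    """Sum of squared outside-distances; zero when every point is inside the box."""
    return sum(outside_distance(p, center, angle, half_size) ** 2 for p in points)
```

Keeping the box size fixed means only the rotation and translation are free, so points observed in different frames can pull the box pose, and the camera poses, toward agreement.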

Proceedings ArticleDOI
01 Jul 2017
TL;DR: A method for 3D object detection and pose estimation from a single image: a CNN regresses 3D orientation (with a hybrid discrete-continuous loss) and 3D object dimensions, and these estimates are combined with the geometric constraints provided by a 2D object bounding box to produce a complete 3D bounding box.
Abstract: We present a method for 3D object detection and pose estimation from a single image. In contrast to current techniques that only regress the 3D orientation of an object, our method first regresses relatively stable 3D object properties using a deep convolutional neural network and then combines these estimates with geometric constraints provided by a 2D object bounding box to produce a complete 3D bounding box. The first network output estimates the 3D object orientation using a novel hybrid discrete-continuous loss, which significantly outperforms the L2 loss. The second output regresses the 3D object dimensions, which have relatively little variance compared to alternatives and can often be predicted for many object types. These estimates, combined with the geometric constraints on translation imposed by the 2D bounding box, enable us to recover a stable and accurate 3D object pose. We evaluate our method on the challenging KITTI object detection benchmark [2] both on the official metric of 3D orientation estimation and also on the accuracy of the obtained 3D bounding boxes. Although conceptually simple, our method outperforms more complex and computationally expensive approaches that leverage semantic segmentation, instance level segmentation and flat ground priors [4] and sub-category detection [23][24]. Our discrete-continuous loss also produces state of the art results for 3D viewpoint estimation on the Pascal 3D+ dataset[26].

773 citations


Network Information
Related Topics (5)
Convolutional neural network: 74.7K papers, 2M citations, 91% related
Feature extraction: 111.8K papers, 2.1M citations, 91% related
Image segmentation: 79.6K papers, 1.8M citations, 90% related
Feature (computer vision): 128.2K papers, 1.7M citations, 89% related
Deep learning: 79.8K papers, 2.1M citations, 89% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    543
2022    1,175
2021    695
2020    882
2019    790
2018    516