Journal ArticleDOI

One Stage Monocular 3D Object Detection Utilizing Discrete Depth and Orientation Representation

- 01 Nov 2022 - 
- Vol. 23, Iss: 11, pp 21630-21640
TLDR
In this article, a monocular 3D object detection method that uses a discrete depth and orientation representation is proposed; it predicts object locations in 3D space through keypoint detection on the object's center point.
Abstract
On-road object detection is a critical component in an autonomous driving system. The safety of the vehicle can only be as good as the reliability of the on-road object detection system. Thus, developing a fast and robust object detection algorithm has been the primary goal of many automotive industries and institutes. In recent years, multi-purpose vision-based driver assistance systems have gained popularity with the emergence of deep neural networks. A monocular camera has been developed to locate an object in the image plane and estimate the distance of that object in the real world or the vehicle plane. In this work, we present a monocular 3D object detection method that utilizes a discrete depth and orientation representation. Our proposed method predicts object locations in 3D space using keypoint detection on the object’s center point. To improve the point detection, we employ center regression on the object’s segmentation mask, reducing the detection offset significantly. The simplicity of our proposed network architecture and its one-stage approach allow our algorithm to achieve competitive speed compared with prior methods. Our proposed method achieves a 26.93% detection score on the Cityscapes 3D object detection dataset, outperforming the preceding monocular method by a margin of 2.8 points.
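
The abstract does not say how depth and orientation are discretized, so the following is only a minimal sketch of the general idea, assuming uniform bins. The function names, the 0 to 100 m depth range, and the bin counts are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def encode_uniform(value, lo, hi, num_bins):
    """Map a continuous value to a bin index over [lo, hi).

    Values outside the range are clamped to the first or last bin.
    Uniform bin widths are an assumption made for this sketch.
    """
    width = (hi - lo) / num_bins
    index = int((value - lo) // width)
    return max(0, min(num_bins - 1, index))


def decode_uniform(index, lo, hi, num_bins):
    """Return the center of bin `index`, used as the recovered value."""
    width = (hi - lo) / num_bins
    return lo + (index + 0.5) * width


def wrap_angle(theta):
    """Wrap an angle in radians into [-pi, pi)."""
    return (theta + math.pi) % (2 * math.pi) - math.pi


# Hypothetical settings: 50 depth bins over 0-100 m, 8 orientation bins.
depth_bin = encode_uniform(37.3, 0.0, 100.0, 50)
depth_hat = decode_uniform(depth_bin, 0.0, 100.0, 50)

angle_bin = encode_uniform(wrap_angle(0.3), -math.pi, math.pi, 8)
angle_hat = decode_uniform(angle_bin, -math.pi, math.pi, 8)
```

With a representation like this, a network can classify over bin indices instead of regressing a continuous value; within the covered range, decoding to the bin center is off by at most half a bin width.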


Citations
Journal ArticleDOI

Touchless Head-Control (THC): Head Gesture Recognition for Cursor and Orientation Control

TL;DR: In this paper, convolutional neural networks with predicted fine-grained feature maps and binned classification are applied to estimate head pose angles, and the mouse pointer or cursor is moved to locations on the screen based on head movement and the center position of the face.
Journal ArticleDOI

Real-Time 3D Object Detection and Classification in Autonomous Driving Environment Using 3D LiDAR and Camera Sensors

TL;DR: In this article, an object detection mechanism that fuses the data received from the camera sensor and the 3D LiDAR sensor (OD-C3DL) is proposed, which can provide an average of 89 real-time objects per frame and reduce the extraction time by a recall rate of 94%.
Proceedings ArticleDOI

Unsupervised Cross-Domain Adaptation through Mutual Mean Learning and GANs for Person Re-identification

TL;DR: In this paper, a mutual learning model is used to transfer knowledge from one model to another, which improves overall model performance and achieves significantly better results than state-of-the-art methods.
References
Proceedings ArticleDOI

Multi-view 3D Object Detection Network for Autonomous Driving

TL;DR: This paper proposes Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LIDAR point cloud and RGB images as input and predicts oriented 3D bounding boxes, and designs a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths.
Proceedings ArticleDOI

Object scene flow for autonomous vehicles

TL;DR: A novel model and dataset for 3D scene flow estimation, with an application to autonomous driving, is presented; each element in the scene is represented by its rigid motion parameters and each superpixel by a 3D plane as well as an index to the corresponding object.
Proceedings ArticleDOI

Deep Ordinal Regression Network for Monocular Depth Estimation

TL;DR: Deep Ordinal Regression Network (DORN) discretizes depth and recasts depth network learning as an ordinal regression problem; by training the network using an ordinary regression loss, it achieves much higher accuracy and faster convergence in synch.
Journal ArticleDOI

On-road vehicle detection: a review

TL;DR: A review of recent vision-based on-road vehicle detection systems where the camera is mounted on the vehicle rather than being fixed such as in traffic/driveway monitoring systems is presented.
Proceedings ArticleDOI

Pyramid Stereo Matching Network

TL;DR: PSMNet is a pyramid stereo matching network consisting of two main modules: spatial pyramid pooling and a 3D CNN, which regularizes the cost volume using stacked multiple hourglass networks in conjunction with intermediate supervision.