Proceedings Article

Are They Going to Cross? A Benchmark Dataset and Baseline for Pedestrian Crosswalk Behavior

TLDR
A novel dataset is introduced that, in addition to providing bounding box information for pedestrian detection, includes behavioral and contextual annotations for the scenes, allowing visual and semantic information to be combined for a better understanding of pedestrians' intentions in various traffic scenarios.
Abstract
Designing autonomous vehicles suitable for urban environments remains an unresolved problem. One of the major dilemmas faced by autonomous cars is how to understand the intentions of other road users and communicate with them. Existing datasets do not provide the necessary means for such higher-level analysis of traffic scenes. With this in mind, we introduce a novel dataset that, in addition to providing bounding box information for pedestrian detection, includes behavioral and contextual annotations for the scenes. This allows visual and semantic information to be combined for a better understanding of pedestrians' intentions in various traffic scenarios. We establish baseline approaches for analyzing the data and show that combining visual and contextual information can improve prediction of pedestrian intention at the point of crossing by at least 20%.
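
As a rough illustration of what "combining visual and contextual information" can look like in practice, the sketch below concatenates a visual feature vector with one-hot encoded contextual attributes before classification. This is only a minimal sketch under stated assumptions, not the authors' baseline: the attribute names in `CONTEXT_VOCAB`, their values, and the `fuse_features` helper are all hypothetical and do not reflect the dataset's actual annotation schema.

```python
from typing import Dict, List, Sequence

# Hypothetical contextual attributes (assumption for illustration only);
# the dataset's real annotation schema is not described in this abstract.
CONTEXT_VOCAB: Dict[str, List[str]] = {
    "crosswalk": ["none", "zebra"],
    "signal": ["none", "traffic_light", "stop_sign"],
}


def one_hot(value: str, choices: Sequence[str]) -> List[float]:
    """Encode one categorical value as a one-hot list over `choices`."""
    if value not in choices:
        raise ValueError(f"unknown value {value!r}; expected one of {list(choices)}")
    return [1.0 if value == c else 0.0 for c in choices]


def fuse_features(visual: Sequence[float], context: Dict[str, str]) -> List[float]:
    """Concatenate a visual feature vector with one-hot contextual attributes."""
    fused = list(visual)
    for name, choices in CONTEXT_VOCAB.items():
        fused.extend(one_hot(context[name], choices))
    return fused


# Two visual features followed by 2 + 3 one-hot context slots.
example = fuse_features([0.2, 0.7], {"crosswalk": "zebra", "signal": "none"})
print(example)  # [0.2, 0.7, 0.0, 1.0, 1.0, 0.0, 0.0]
```

The fused vector could then be passed to any standard classifier; simple concatenation is only one of several possible fusion strategies.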


Citations
Proceedings Article

Estimating Pedestrian Crossing States Based on Single 2D Body Pose

TL;DR: The proposed shallow neural network classifier aims to recognize these three states swiftly from a single body pose, in contrast to previous C/NC state classifiers, which depend on multiple poses or contextual information.
Posted Content

Pedestrian Intention Prediction: A Multi-task Perspective

TL;DR: This work addresses the problem of forecasting pedestrians' intentions sufficiently in advance by jointly predicting the intention and visual states of pedestrians, using a recurrent neural network in a multi-task learning approach.
Posted Content

RNN-based Pedestrian Crossing Prediction using Activity and Pose-related Features

TL;DR: Several variants of a deep learning system, each composed of a CNN-based feature extractor and an RNN module, are proposed for the problem of pedestrian crossing prediction.
Proceedings Article

Interactive Prediction for Multiple, Heterogeneous Traffic Participants with Multi-Agent Hybrid Dynamic Bayesian Network

TL;DR: This paper constructs an integrated framework to estimate and predict the behavior of multiple, heterogeneous agents simultaneously. It incorporates prior knowledge such as map information and traffic rules into the graph structure, and uses a particle filter to track and predict the intentions and trajectories of the agents.
Posted Content

Driving Datasets Literature Review

TL;DR: This report surveys the autonomous driving datasets published to date and describes the diverse driving tasks explored by those datasets.
References
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: A deep convolutional neural network achieved state-of-the-art performance on ImageNet classification. The network consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
Proceedings Article

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Proceedings Article

Histograms of oriented gradients for human detection

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
Journal Article

Vision meets robotics: The KITTI dataset

TL;DR: A novel dataset captured from a VW station wagon for use in mobile robotics and autonomous driving research, using a variety of sensor modalities such as high-resolution color and grayscale stereo cameras and a high-precision GPS/IMU inertial navigation system.
Book Chapter

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

TL;DR: This work equips the networks with another pooling strategy, “spatial pyramid pooling”, to eliminate the fixed-size input requirement of existing CNNs, and develops a new network structure, called SPP-net, which can generate a fixed-length representation regardless of image size/scale.