3D Ball Localization From A Single Calibrated Image

doi:10.1109/cvprw56347.2022.00391

Open Access, Proceedings Article

TLDR

A small neural network, trained on image patches around candidates generated by a conventional ball detector, predicts the confidence that each patch contains a ball. Filtering the detector's candidates by this confidence output improves the detection rate.

Abstract:

Ball 3D localization in team sports has various applications, including automatic offside detection in soccer and shot release localization in basketball. Today, this task is solved either with expensive multi-view setups or by restricting the analysis to ballistic trajectories. In this work, we propose to address the task on a single image from a calibrated monocular camera by estimating the ball diameter in pixels and using the known real ball diameter in meters. This approach is suitable for any game situation where the ball is (even partly) visible. To achieve this, we use a small neural network trained on image patches around candidates generated by a conventional ball detector. Besides predicting the ball diameter, our network outputs the confidence that the image patch contains a ball. Validation on three basketball datasets shows that our model gives remarkable predictions on ball 3D localization. In addition, through its confidence output, our model improves the detection rate by filtering the candidates produced by the detector. The contributions of this work are (i) the first model to address 3D ball localization on a single image, (ii) an effective method for ball 3D annotation from single calibrated images, and (iii) a high-quality 3D ball evaluation dataset annotated from a single viewpoint. In addition, the code to reproduce this research will be made freely available at https://github.com/gabriel-vanzandycke/deepsport
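The geometric idea behind the abstract, recovering depth from the apparent ball diameter, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an ideal pinhole camera with no lens distortion, a hypothetical intrinsic matrix `K`, and an approximate basketball diameter of 0.24 m. The function name `ball_position_camera_frame` and its parameters are chosen here for illustration only.

```python
import numpy as np

# Assumption: approximate diameter of a full-size basketball, in meters.
BALL_DIAMETER_M = 0.24


def ball_position_camera_frame(K, center_px, diameter_px,
                               ball_diameter_m=BALL_DIAMETER_M):
    """Estimate the ball center in camera coordinates (meters).

    K           : 3x3 pinhole intrinsic matrix (assumed, no distortion).
    center_px   : (u, v) pixel coordinates of the ball center.
    diameter_px : apparent ball diameter in pixels (e.g. a network output).
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    f = 0.5 * (fx + fy)  # average focal length in pixels

    # Similar triangles: diameter_px / f = ball_diameter_m / z.
    z = f * ball_diameter_m / diameter_px

    # Back-project the pixel center to depth z.
    u, v = center_px
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])


if __name__ == "__main__":
    K = np.array([[1000.0, 0.0, 960.0],
                  [0.0, 1000.0, 540.0],
                  [0.0, 0.0, 1.0]])
    print(ball_position_camera_frame(K, (1060.0, 540.0), 24.0))
```

With extrinsic calibration (a rotation `R` and translation `t` mapping world to camera coordinates, also assumed here), the result could be moved to the court frame as `R.T @ (p - t)`. The sketch treats the ball's projection as a circle, which is only approximately true away from the image center.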
