Estimating 3D-trajectories from Monocular Video Sequences

01 Jan 2015
TL;DR: In this work, the author investigated how past observations from a stereo system can be used to recreate trajectories when video from only one of the cameras is available; the best method was found to be a nearest-neighbor search optimized by a Kalman filter.
Abstract: Tracking a moving object and reconstructing its trajectory can be done with a stereo camera system, since the two cameras enable depth vision. However, such a system would not work if one of the cameras fails to detect the object. If that happens, it would be beneficial if the system could still use the functioning camera to make an approximate trajectory reconstruction. In this study, I have investigated how past observations from a stereo system can be used to recreate trajectories when video from only one of the cameras is available. Several approaches have been implemented and tested, with varying results. The best method was found to be a nearest-neighbor search optimized by a Kalman filter. On a test set of 10,000 golf shots, the algorithm produced estimates that differed from the correct trajectory by about 3.5 meters on average, with better results for trajectories originating close to the camera.
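
The nearest-neighbor idea in the abstract can be illustrated with a minimal sketch. It assumes a database of past shots stored as (2D trajectory, 3D trajectory) pairs and a mean point-wise Euclidean distance between equally long 2D trajectories. The data layout, the distance measure, and the function names are assumptions made for illustration, not the thesis's implementation, and the Kalman filter refinement is omitted.

```python
import math

def trajectory_distance(a, b):
    """Mean Euclidean distance between two equally long 2D point sequences."""
    if len(a) != len(b):
        raise ValueError("trajectories must have the same length")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def nearest_3d_trajectory(query_2d, database):
    """Return the stored 3D trajectory whose 2D trajectory is closest to the query.

    database is a list of (trajectory_2d, trajectory_3d) pairs; this layout is
    a hypothetical choice for the sketch.
    """
    _, best_3d = min(database, key=lambda pair: trajectory_distance(query_2d, pair[0]))
    return best_3d

# Toy database with two past observations (made-up numbers).
database = [
    ([(0.0, 0.0), (1.0, 1.0), (2.0, 1.5)],
     [(0.0, 0.0, 0.0), (5.0, 1.0, 2.0), (10.0, 2.0, 3.0)]),
    ([(0.0, 0.0), (1.0, 2.0), (2.0, 3.0)],
     [(0.0, 0.0, 0.0), (4.0, -1.0, 4.0), (8.0, -2.0, 6.0)]),
]

# A new 2D trajectory seen by the single working camera.
query = [(0.0, 0.1), (1.0, 1.9), (2.0, 2.8)]
estimate = nearest_3d_trajectory(query, database)
```

Here the query lies closest to the second stored 2D trajectory, so the second 3D trajectory is returned as the estimate.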


Citations
Proceedings ArticleDOI
01 Oct 2019
TL;DR: It is shown theoretically and empirically that a simple motion trajectory analysis suffices to translate from pixel measurements to the person's metric height, reaching a MAE of up to 3.9 cm on jumping motions, and that this works without camera and ground plane calibration.
Abstract: Estimating the metric height of a person from monocular imagery without additional assumptions is ill-posed. Existing solutions either require manual calibration of ground plane and camera geometry, special cameras, or reference objects of known size. We focus on motion cues and exploit gravity on earth as an omnipresent reference 'object' to translate acceleration, and subsequently height, measured in image-pixels to values in meters. We require videos of motion as input, where gravity is the only external force. This limitation is different to those of existing solutions that recover a person's height and, therefore, our method opens up new application fields. We show theoretically and empirically that a simple motion trajectory analysis suffices to translate from pixel measurements to the person's metric height, reaching a MAE of up to 3.9 cm on jumping motions, and that this works without camera and ground plane calibration.
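
The idea of using gravity as a reference can be sketched in a simplified setting. The sketch assumes the motion is parallel to the image plane, perspective change is negligible, and gravity is the only force, so the vertical pixel acceleration corresponds to g = 9.81 m/s^2. The function names and sample values are hypothetical, and the cited paper's actual estimator may differ.

```python
G = 9.81  # gravitational acceleration in m/s^2

def pixel_acceleration(y0, y1, y2, dt):
    """Second finite difference of three equally spaced vertical positions (px/s^2)."""
    return (y2 - 2.0 * y1 + y0) / (dt * dt)

def meters_per_pixel(y0, y1, y2, dt):
    """Scale factor that maps the observed pixel acceleration onto gravity."""
    a_px = pixel_acceleration(y0, y1, y2, dt)
    if a_px == 0:
        raise ValueError("no measurable acceleration")
    return G / abs(a_px)

# Synthetic free fall sampled every 0.1 s with a true scale of 0.01 m/px,
# so the pixel acceleration is 981 px/s^2.
dt = 0.1
samples = [0.5 * 981.0 * (i * dt) ** 2 for i in range(3)]

scale = meters_per_pixel(*samples, dt)
height_m = 175.0 * scale  # a person measured as 175 px tall in the image
```

With these synthetic samples the recovered scale is about 0.01 m/px, so a height of 175 px maps to about 1.75 m.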

18 citations


Cites background or methods from "Estimating 3D-trajectories from Mon..."

  • ..., when the direction of gravity, camera intrinsic, and extrinsic parameters are calibrated, it is true that q can be further decomposed to compute the object’s distance d and extend in all directions, which was the focus of previous studies [16, 23, 24, 17, 29]....

    [...]

  • ...Our method is inspired by approaches that estimate the 3D trajectory of rigid objects in free fall [16, 23, 24, 17, 29]....

    [...]

Journal ArticleDOI
TL;DR: This work proposes to address 3D ball localization on a single image from a calibrated monocular camera by estimating the ball diameter in pixels and using the known real ball diameter in meters, which is suitable for any game situation where the ball is (even partly) visible.
Abstract: Ball 3D localization in team sports has various applications including automatic offside detection in soccer, or shot release localization in basketball. Today, this task is either resolved by using expensive multi-view setups, or by restricting the analysis to ballistic trajectories. In this work, we propose to address the task on a single image from a calibrated monocular camera by estimating the ball diameter in pixels and using the knowledge of the real ball diameter in meters. This approach is suitable for any game situation where the ball is (even partly) visible. To achieve this, we use a small neural network trained on image patches around candidates generated by a conventional ball detector. Besides predicting the ball diameter, our network outputs the confidence of having a ball in the image patch. Validation on three basketball datasets reveals that our model gives remarkable predictions on ball 3D localization. In addition, through its confidence output, our model improves the detection rate by filtering the candidates produced by the detector. The contributions of this work are (i) the first model to address 3D ball localization on a single image, (ii) an effective method for ball 3D annotation from single calibrated images, (iii) a high quality 3D ball evaluation dataset annotated from a single viewpoint. In addition, the code to reproduce this research will be made freely available at https://github.com/gabriel-vanzandycke/deepsport
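
The geometric step behind this approach can be sketched with the pinhole camera model: an object of real diameter D at distance Z appears with a pixel diameter d = f * D / Z, where f is the focal length in pixels. The sketch only inverts that relation. The numbers are illustrative assumptions, and the neural network that estimates the pixel diameter in the cited work is not reproduced here.

```python
def distance_from_diameter(focal_px, real_diameter_m, pixel_diameter):
    """Distance along the optical axis from apparent size (pinhole model).

    focal_px: focal length in pixels (from camera calibration)
    real_diameter_m: known physical diameter of the ball in meters
    pixel_diameter: measured diameter of the ball in the image, in pixels
    """
    if pixel_diameter <= 0:
        raise ValueError("pixel diameter must be positive")
    return focal_px * real_diameter_m / pixel_diameter

# Illustrative values: a ball of about 0.24 m diameter seen 20 px wide
# by a camera with a focal length of 1000 px.
z = distance_from_diameter(1000.0, 0.24, 20.0)
```

Under these assumed values the ball is about 12 m from the camera; halving the pixel diameter doubles the estimated distance.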

3 citations

01 Jan 2017
TL;DR: This thesis is concerned with the problem of predicting the remaining part of the trajectory of a golf ball as it travels through the air where only the three-dimensional position of the ball is known.
Abstract: This thesis is concerned with the problem of predicting the remaining part of the trajectory of a golf ball as it travels through the air where only the three-dimensional position of the ball is ca ...

3 citations


Cites background from "Estimating 3D-trajectories from Mon..."

  • ...Sköld [40] studied a similar problem, by inferring the 3D-trajectory of a golf ball using data only captured from one camera (a 2D-trajectory) and a database of 3D-trajectories with the corresponding 2D-trajectories....

    [...]

Journal ArticleDOI
TL;DR: A real-time algorithm for tracking and detecting moving objects using region properties and color segmentation is presented; the work develops a tracking algorithm for moving bodies across differing video frames using color characteristics and movement.
Abstract: This paper presents a real-time algorithm for tracking and detecting moving objects using region properties and color segmentation. Real-time tracking of moving objects is a vital and difficult problem in human-computer interfaces and video observation. The approach of tracking and detecting moving objects using color characteristics and movement has been introduced with new techniques for automation. Video tracking is a method of locating a moving object over a specific distance by using a color camera to relate target bodies in successive video frames. Depending on the frame rate, this association can be especially troublesome when objects move quickly. A further complication arises when the tracked objects change direction over time. For these cases, video tracking systems typically employ a motion model that describes how the image of the target changes for the possible movements of the object. A tracking algorithm for real-time moving bodies across differing video frames, using color characteristics and movement, is investigated and developed in this work.

Cites background from "Estimating 3D-trajectories from Mon..."

  • ...The tracking technology could be typified as a difficulty of evaluating the route of an object in the image as it shifts roughly at the sight [1]....

    [...]

Journal ArticleDOI
TL;DR: A comprehensive survey of deep learning in sports performance, focusing on three main aspects: algorithms, datasets and virtual environments, and challenges, is presented in this article, which provides valuable reference material for researchers interested in deep learning for sports applications.
Abstract: Deep learning has the potential to revolutionize sports performance, with applications ranging from perception and comprehension to decision. This paper presents a comprehensive survey of deep learning in sports performance, focusing on three main aspects: algorithms, datasets and virtual environments, and challenges. Firstly, we discuss the hierarchical structure of deep learning algorithms in sports performance which includes perception, comprehension and decision while comparing their strengths and weaknesses. Secondly, we list widely used existing datasets in sports and highlight their characteristics and limitations. Finally, we summarize current challenges and point out future trends of deep learning in sports. Our survey provides valuable reference material for researchers interested in deep learning in sports applications.
References
Journal ArticleDOI
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Abstract: Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, back propagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.

72,897 citations


"Estimating 3D-trajectories from Mon..." refers background in this paper

  • ...Hochreiter and Schmidhuber [17] proposed a novel network called Long Short-Term Memory Network which overcomes that problem....

    [...]

01 Jan 2001
TL;DR: This book is referred to read because it is an inspiring book to give you more chance to get experiences and also thoughts and it will show the best book collections and completed collections.
Abstract: Downloading the book in this website lists can give you more advantages. It will show you the best book collections and completed collections. So many books can be found in this website. So, this is not only this multiple view geometry in computer vision. However, this book is referred to read because it is an inspiring book to give you more chance to get experiences and also thoughts. This is simple, read the soft file of the book and you get it.

14,282 citations


"Estimating 3D-trajectories from Mon..." refers background in this paper

  • ...where x and X are given in homogeneous coordinates [15]....

    [...]

  • ...Conversely, if both x and x′ are known, it is possible to calculate the exact position of point X, a method known as triangulation [15]....

    [...]

  • ...It can be described algebraically by a 3 × 3 matrix F called the fundamental matrix [15]....

    [...]

  • ...is called the camera calibration matrix or intrinsic matrix [15]....

    [...]

  • ...will be the corresponding 2D-point [15]....

    [...]
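
The citation contexts above mention homogeneous coordinates and the camera calibration (intrinsic) matrix. As a minimal sketch, the projection of a 3D point onto the image can be written as a matrix product followed by division by the third homogeneous coordinate. The sketch assumes the point is already expressed in the camera's coordinate frame (no rotation or translation) and uses made-up intrinsic values; the names K and project are illustrative.

```python
def project(K, X):
    """Project a 3D point X = (x, y, z), given in camera coordinates, to pixels.

    K is a 3x3 intrinsic matrix stored as nested lists. The product K @ X is a
    homogeneous image point; dividing by its last entry gives pixel coordinates.
    """
    if X[2] <= 0:
        raise ValueError("point must lie in front of the camera")
    h = [sum(K[r][c] * X[c] for c in range(3)) for r in range(3)]
    return (h[0] / h[2], h[1] / h[2])

# Assumed intrinsics: focal length 1000 px, principal point (640, 360).
K = [
    [1000.0, 0.0, 640.0],
    [0.0, 1000.0, 360.0],
    [0.0, 0.0, 1.0],
]

u, v = project(K, (1.0, 2.0, 10.0))
```

Every point on the same viewing ray, for example (2, 4, 20), projects to the same pixel, which is why a single camera cannot recover depth and a second view is needed for triangulation.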

Posted Content
TL;DR: This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
Abstract: Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. Our main result is that on an English to French translation task from the WMT'14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM's BLEU score was penalized on out-of-vocabulary words. Additionally, the LSTM did not have difficulty on long sentences. For comparison, a phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset. When we used the LSTM to rerank the 1000 hypotheses produced by the aforementioned SMT system, its BLEU score increases to 36.5, which is close to the previous best result on this task. The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and are relatively invariant to the active and the passive voice. Finally, we found that reversing the order of the words in all source sentences (but not target sentences) improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.

11,936 citations


"Estimating 3D-trajectories from Mon..." refers background or methods in this paper

  • ...sequence-to-sequence-mapping, speech recognition and similar tasks [12, 27]....

    [...]

  • ...Unfortunately, standard neural networks are inadequate to work with time series since they assume fixed size input [27]....

    [...]

  • ...Artificial neural networks have been used in a variety of machine learning tasks with good results [27]....

    [...]

Proceedings Article
07 Sep 1999
TL;DR: Experimental results indicate that the novel scheme for approximate similarity search based on hashing scales well even for a relatively large number of dimensions, and provide evidence that the method improves running time over other methods for searching in high-dimensional spaces based on hierarchical tree decomposition.
Abstract: The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image databases, document collections, time-series databases, and genome databases. Unfortunately, all known techniques for solving this problem fall prey to the "curse of dimensionality." That is, the data structures scale poorly with data dimensionality; in fact, if the number of dimensions exceeds 10 to 20, searching in k-d trees and related structures involves the inspection of a large fraction of the database, thereby doing no better than brute-force linear search. It has been suggested that since the selection of features and the choice of a distance metric in typical applications is rather heuristic, determining an approximate nearest neighbor should suffice for most practical purposes. In this paper, we examine a novel scheme for approximate similarity search based on hashing. The basic idea is to hash the points from the database so as to ensure that the probability of collision is much higher for objects that are close to each other than for those that are far apart. We provide experimental evidence that our method gives significant improvement in running time over other methods for searching in high-dimensional spaces based on hierarchical tree decomposition. Experimental results also indicate that our scheme scales well even for a relatively large number of dimensions (more than 50).

3,705 citations


"Estimating 3D-trajectories from Mon..." refers methods in this paper

  • ...tive k-nearest neighbors query using locality sensitive hashing (LSH) [13]....

    [...]

Book ChapterDOI
13 Oct 1993
TL;DR: An indexing method for time sequences for processing similarity queries is proposed, using R*-trees to index the sequences and efficiently answer similarity queries, with experimental results showing that the method is superior to search based on sequential scanning.
Abstract: We propose an indexing method for time sequences for processing similarity queries. We use the Discrete Fourier Transform (DFT) to map time sequences to the frequency domain, the crucial observation being that, for most sequences of practical interest, only the first few frequencies are strong. Another important observation is Parseval's theorem, which specifies that the Fourier transform preserves the Euclidean distance in the time or frequency domain. Having thus mapped sequences to a lower-dimensionality space by using only the first few Fourier coefficients, we use R*-trees to index the sequences and efficiently answer similarity queries. We provide experimental results which show that our method is superior to search based on sequential scanning. Our experiments show that a few coefficients (1–3) are adequate to provide good performance. The performance gain of our method increases with the number and length of sequences.
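
The two observations in this abstract can be checked numerically with a small sketch. It assumes a unitary DFT (scaled by 1/sqrt(n)), under which Parseval's theorem makes the Euclidean distance identical in the time and frequency domains; keeping only the first few coefficients can then only shrink the distance, which is what makes the reduced representation safe for filtering candidates. The sequences and function names are made up for illustration, and the R*-tree index itself is not shown.

```python
import cmath
import math

def dft(seq):
    """Unitary discrete Fourier transform of a real or complex sequence."""
    n = len(seq)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(seq))
        / math.sqrt(n)
        for k in range(n)
    ]

def euclidean(a, b):
    """Euclidean distance between two equally long (possibly complex) sequences."""
    return math.sqrt(sum(abs(p - q) ** 2 for p, q in zip(a, b)))

s1 = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]
s2 = [1.5, 2.5, 2.5, 3.5, 3.0, 1.0, 1.0, 0.5]

time_dist = euclidean(s1, s2)                      # distance in the time domain
freq_dist = euclidean(dft(s1), dft(s2))            # same distance, by Parseval
reduced_dist = euclidean(dft(s1)[:3], dft(s2)[:3]) # first three coefficients only
```

Because reduced_dist never exceeds time_dist, a range query on the reduced coefficients cannot miss a true match; it can only return extra candidates that are then checked against the full sequences.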

2,082 citations


"Estimating 3D-trajectories from Mon..." refers background in this paper

  • ...Indexes of the transformed sequences are commonly stored with multidimensional R-trees [1]....

    [...]