Space-time interest points
Citations
Learning Spatiotemporal Features with 3D Convolutional Networks
3D Convolutional Neural Networks for Human Action Recognition
Learning realistic human actions from movies
References
Object recognition from local scale-invariant features
A Combined Corner and Edge Detector
CONDENSATION - Conditional Density Propagation for Visual Tracking
Performance of optical flow techniques
Frequently Asked Questions (14)
Q2. What are the main obstacles in previous approaches for human motion analysis?
The need for careful initialization and/or a simple background has been a frequent obstacle in previous approaches to human motion analysis.
Q3. What is the significance of the normalization with respect to scale parameters?
The normalization with respect to scale parameters guarantees the invariance of the derivative responses with respect to image scalings in both the spatial domain and the temporal domain.
Q4. What is the idea of k-means clustering?
The clustering of spatio-temporal neighborhoods is similar to the idea of textons [21] used to describe image texture as well as to detect object parts for spatial recognition [31].
Q5. What are the successful applications of interest point detectors?
Highly successful applications of interest point detectors have been presented for image indexing [25], stereo matching [30, 23, 29], optic flow estimation and tracking [28], and recognition [20, 10].
Q6. What is the way to find the spatial maxima of H?
For sufficiently large values of k, positive local maxima of H correspond to points with high variation of the image gray-values in both the spatial and the temporal dimensions.
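A minimal sketch of how H could be evaluated at a single space-time point. It assumes the definition given in the full paper, H = det(μ) − k·trace³(μ), where μ is the 3×3 spatio-temporal second-moment matrix; that definition is not restated in this answer. The function name `harris3d_response`, the example matrices, and the default `k = 0.005` are illustrative assumptions, not the authors' code.

```python
def harris3d_response(mu, k=0.005):
    """Return H = det(mu) - k * trace(mu)**3 for a 3x3 matrix mu.

    mu is a list of three rows. The formula is assumed from the paper;
    the default k is an illustrative choice.
    """
    (a, b, c), (d, e, f), (g, h, i) = mu
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    trace = a + e + i
    return det - k * trace ** 3


# Strong variation along x, y and t: all three eigenvalues are large.
spacetime_corner = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
# Variation along x and y only (nothing changes over time): one eigenvalue is zero.
static_corner = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.0]]

h_spacetime = harris3d_response(spacetime_corner)  # positive
h_static = harris3d_response(static_corner)        # negative
```

For three equal eigenvalues λ the expression reduces to H = λ³(1 − 27k), so under this definition H can only become positive when k < 1/27.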
Q7. What is the way to find the match between the model and the data?
To find the best match between the model and the data, the authors search for the model state $\tilde{X}$ that minimizes $H$ in (15),
$$\tilde{X} = \arg\min_{X} H(\tilde{M}(X), D, t_0) \qquad (17)$$
using a standard Gauss-Newton optimization method.
Q8. What is the distance between features of different classes?
The distance $h$ between two features of the same class is defined as a Euclidean distance between two points in space-time, where the spatial and the temporal dimensions are weighted with respect to the parameter $\nu$ as well as by the extents of the features in space-time:
$$h^2(f_m, f_d) = \frac{(x_m - x_d)^2 + (y_m - y_d)^2}{(1 - \nu)(\sigma_m)^2} + \frac{(t_m - t_d)^2}{\nu (\tau_m)^2} \qquad (16)$$
The distance between features of different classes is regarded as infinite.
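The distance in (16) can be written out directly. The sketch below is only an illustration: the dictionary layout of a feature (`x`, `y`, `t`, `sigma`, `tau`, `cls`), the function name, and the default `nu = 0.5` are assumptions made here, not part of the paper.

```python
import math


def feature_distance_sq(fm, fd, nu=0.5):
    """Squared distance h^2 between a model feature fm and a data feature fd.

    Features are dicts with keys x, y, t, sigma, tau, cls (hypothetical layout).
    nu must lie strictly between 0 and 1.
    """
    if fm["cls"] != fd["cls"]:
        # Features of different classes are infinitely far apart.
        return math.inf
    spatial = (fm["x"] - fd["x"]) ** 2 + (fm["y"] - fd["y"]) ** 2
    temporal = (fm["t"] - fd["t"]) ** 2
    # Both terms are normalized by the extent of the model feature.
    return spatial / ((1.0 - nu) * fm["sigma"] ** 2) + temporal / (nu * fm["tau"] ** 2)


model = {"x": 0.0, "y": 0.0, "t": 0.0, "sigma": 2.0, "tau": 1.0, "cls": 1}
same_class = {"x": 3.0, "y": 4.0, "t": 1.0, "sigma": 2.0, "tau": 1.0, "cls": 1}
other_class = dict(same_class, cls=2)

d_same = feature_distance_sq(model, same_class)
d_other = feature_distance_sq(model, other_class)
```

Raising `nu` toward 1 makes temporal offsets cheaper and spatial offsets more expensive; lowering it does the opposite.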
Q9. What is the semi-group property of the Gaussian kernel?
Using the semi-group property of the Gaussian kernel, it follows that the scale-space representation of $f$ is $L(x, y, t; \sigma^2, \tau^2) = g(x, y, t; \sigma_0^2 + \sigma^2, \tau_0^2 + \tau^2)$.
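The semi-group property can be checked numerically in one dimension: smoothing a Gaussian of variance σ₀² with a Gaussian of variance σ² yields a Gaussian of variance σ₀² + σ². The sketch below uses sampled, truncated kernels and plain Python; the helper names and the chosen variances are illustrative assumptions, and only a single axis is tested rather than the full space-time kernel.

```python
import math


def sampled_gaussian(variance, radius):
    """Sample a 1-D Gaussian on the integers -radius..radius and normalize it."""
    w = [math.exp(-(x * x) / (2.0 * variance)) for x in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]


def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def variance_of(kernel):
    """Variance of a normalized kernel centred on its middle sample."""
    centre = (len(kernel) - 1) // 2
    return sum(v * (x - centre) ** 2 for x, v in enumerate(kernel))


sigma0_sq, sigma_sq = 4.0, 9.0
blob = sampled_gaussian(sigma0_sq, radius=40)       # the signal f
kernel = sampled_gaussian(sigma_sq, radius=40)      # the smoothing kernel
smoothed = convolve(blob, kernel)                   # the scale-space signal L
smoothed_variance = variance_of(smoothed)           # close to sigma0_sq + sigma_sq
```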
Q10. How can the authors estimate the spatial extent of a Gaussian blob?
As shown in [15], given the appropriate normalization parameters $a = 1$, $b = 1/4$, $c = 1/2$ and $d = 3/4$, the size of the blob $f$ can be estimated from the scale values $\tilde{\sigma}^2$ and $\tilde{\tau}^2$ for which $\nabla^2_{norm} L$ assumes local extrema over scales, space and time.
Q11. How can the authors estimate the spatio-temporal extent of a Gaussian blob?
The spatio-temporal extent of the blob can be estimated by detecting local extrema of
$$\nabla^2_{norm} L = \sigma^2 \tau^{1/2} (L_{xx} + L_{yy}) + \sigma \tau^{3/2} L_{tt} \qquad (10)$$
over both spatial and temporal scales.
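As a small illustration of (10), the function below combines already-computed second derivatives into the scale-normalized Laplacian. It is a sketch under stated assumptions: the derivatives `lxx`, `lyy`, `ltt` are taken as given numbers (computing them from video is outside this example), and the function name and the sample values are made up for illustration.

```python
def normalized_laplacian(lxx, lyy, ltt, sigma_sq, tau_sq):
    """Scale-normalized Laplacian of Eq. (10).

    sigma_sq and tau_sq are the spatial and temporal variances (sigma^2, tau^2).
    """
    sigma = sigma_sq ** 0.5
    tau = tau_sq ** 0.5
    # Spatial term weighted by sigma^2 * tau^(1/2), temporal term by sigma * tau^(3/2).
    return sigma_sq * tau ** 0.5 * (lxx + lyy) + sigma * tau ** 1.5 * ltt


# Illustrative values: sigma^2 = 4, tau^2 = 16, unit second derivatives.
value = normalized_laplacian(1.0, 1.0, 1.0, sigma_sq=4.0, tau_sq=16.0)
```

Evaluating this quantity over a grid of `(sigma_sq, tau_sq)` pairs and looking for extrema is what provides the scale estimate described in the answer.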
Q12. What is the current scheme for defining the walking model?
It follows that the current scheme does not allow for scalings of the model in the temporal direction and enables only the first-order variations of positions and sizes of the model features over time.
Q13. How do the authors estimate the spatio-temporal extent of an event?
The authors do this by iteratively updating the scale and the position of the interest points by (i) selecting the neighboring spatio-temporal scale that maximizes $(\nabla^2_{norm} L)^2$ and (ii) re-detecting the space-time location of the interest point at the new scale, until the position and the scale converge to stable values [15].
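The two alternating steps can be sketched as a simple loop. This is a schematic outline, not the authors' implementation: the callbacks `neighbours`, `response_sq`, and `redetect` are hypothetical placeholders for the scale neighbourhood, the squared normalized Laplacian, and the interest point detector, and the toy functions at the end exist only to make the example runnable.

```python
def adapt_interest_point(point, scale, neighbours, response_sq, redetect, max_iter=20):
    """Alternate scale selection and re-detection until both stop changing.

    neighbours(scale)         -> candidate scales near `scale` (hypothetical helper)
    response_sq(point, scale) -> squared normalized Laplacian at point and scale
    redetect(point, scale)    -> interest point location re-detected at `scale`
    """
    for _ in range(max_iter):
        # (i) choose the neighbouring scale with the strongest response
        candidates = [scale] + list(neighbours(scale))
        new_scale = max(candidates, key=lambda s: response_sq(point, s))
        # (ii) re-detect the point position at that scale
        new_point = redetect(point, new_scale)
        if new_scale == scale and new_point == point:
            break  # position and scale are stable
        point, scale = new_point, new_scale
    return point, scale


# Toy stand-ins: the response peaks at scale 5 and the detector always
# snaps to the location (10, 10, 3).
final_point, final_scale = adapt_interest_point(
    point=(9, 10, 3),
    scale=2,
    neighbours=lambda s: [s - 1, s + 1],
    response_sq=lambda p, s: -(s - 5) ** 2,
    redetect=lambda p, s: (10, 10, 3),
)
```

In a real detector the scale would be a pair of spatial and temporal variances, and the neighbourhood would contain the adjacent levels in both.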
Q14. What is the simplest way to find spatio-temporal interest points?
To illustrate the detection of spatio-temporal interest points on synthetic image sequences, the authors show the spatio-temporal data as 3-D space-time plots where the original signal is represented by a threshold surface while the detected interest points are presented by ellipsoids with semi-axes proportional to the corresponding scale parameters $\sigma_l$ and $\tau_l$.
Proceedings of the Ninth IEEE International Conference on Computer Vision (ICCV’03) 0-7695-1950-4/03 $17.00 © 2003 IEEE