Online Video SEEDS for Temporal Window Objectness
Citations
What Makes for Effective Detection Proposals
Superpixels: An evaluation of the state-of-the-art
Video Segmentation by Non-Local Consensus voting.
Fast action proposals for human action detection and search
Spatio-Temporal Object Detection Proposals
References
The Pascal Visual Object Classes (VOC) Challenge
Efficient Graph-Based Image Segmentation
Contour Detection and Hierarchical Image Segmentation
Spectral grouping using the Nystrom method
Frequently Asked Questions (19)
Q2. How many possible windows in a 25-frame sequence?
If each bounding box could move to 100 nearby positions in each subsequent frame, it leaves around 10^50 possible temporal windows in a 25-frame sequence.
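The order of magnitude can be checked with a few lines of arithmetic. This is a minimal sketch that assumes, as a simplification, an independent 100-way choice in each of the 25 frames; the variable names are illustrative and not taken from the paper.

```python
import math

positions_per_frame = 100  # nearby positions a box may move to in a frame
num_frames = 25            # length of the sequence

# Assumption: every frame, including the first, is an independent 100-way choice.
num_windows = positions_per_frame ** num_frames

# Report the count as a power of ten.
exponent = round(math.log10(num_windows))
print(f"about 10^{exponent} possible temporal windows")
```

Counting only the 24 subsequent frames would give 100^24 = 10^48 per starting box, so the figure is best read as an order-of-magnitude estimate.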
Q3. What is the definition of the 3D Under-segmentation Error?
The 3D Under-segmentation Error penalizes temporal superpixels that contain more than one object. The other two video metrics are the 3D Boundary Recall, which is the standard recall for temporal object boundaries, and the Explained Variation, a human-independent metric that considers how well the superpixel means represent the information in the video.
Q4. What are the standard metrics used to evaluate the accuracy of the randomized superpixel samples?
The authors use the standard metrics, which are 2D Undersegmentation Error, 2D Boundary Recall, and 2D Segmentation Accuracy for still images, and 3D Undersegmentation Error, 3D Boundary Recall and Explained Variation for video.
Q5. What is the energy function of the color histogram?
Then the energy function is
$$H(s) = \sum_k \sum_{\{H_j\}} \left( c_{A_k^{t:0}}(j) \right)^2, \tag{1}$$
which is maximal when the histograms have only one nonzero bin for each video superpixel.
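To make the behavior of the energy in Eq. (1) concrete, here is a minimal sketch. It assumes each histogram is normalized by the number of pixels in its superpixel, and it represents a superpixel simply as a list of color-bin indices; the function name `histogram_energy` and this data layout are illustrative choices, not the authors' implementation.

```python
from collections import Counter

def histogram_energy(superpixels):
    """Sum of squared normalized histogram bins over all superpixels.

    `superpixels` is a list of lists; each inner list holds the color-bin
    index of every pixel assigned to that (video) superpixel.
    """
    energy = 0.0
    for bins in superpixels:
        if not bins:
            continue  # an empty superpixel contributes nothing
        counts = Counter(bins)
        total = len(bins)
        # Normalized bin value is count / total; square it and sum over bins.
        energy += sum((n / total) ** 2 for n in counts.values())
    return energy

# Two superpixels, each containing a single color bin.
pure = [[0, 0, 0, 0], [3, 3, 3, 3]]
# Two superpixels, each split evenly between two color bins.
mixed = [[0, 0, 3, 3], [0, 3, 0, 3]]

print(histogram_energy(pure), histogram_energy(mixed))
```

Under this normalization each superpixel contributes at most 1, reached exactly when all of its pixels fall into one bin, which matches the statement that the energy is maximal for single-bin histograms.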
Q6. What is the objectness score for the test split?
As baselines, the authors use the output of boundary detectors, instead of using randomized SEEDS, to compute their objectness score in still images.
Q7. How many superpixels can be computed using two levels of integral images?
In the supplementary material, the authors show that the score can be computed very efficiently using two levels of integral images, with only 8 additions, allowing for the evaluation of over 100 million bounding boxes per second.
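The two-level scheme from the supplementary material is not reproduced here. As background, the sketch below shows the standard single-level integral image (summed-area table), which returns the sum over any axis-aligned box with four lookups regardless of box size. The function names and the toy `edges` map are illustrative assumptions.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[y][x] = sum of img[:y][:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 (four lookups)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

# Toy binary map standing in for superpixel boundary pixels.
edges = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
ii = integral_image(edges)
print(box_sum(ii, 1, 1, 3, 3))  # sum over the central 2x2 box
```

Because the cost per box is constant, scoring very large numbers of candidate bounding boxes becomes a matter of table lookups rather than per-pixel loops.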
Q8. How does the amount of variation in video affect the accuracy of the randomized superpixels?
Note that the amount of variation grows faster in videos because it is propagated from the first frame of the video until the end.
Q9. What is the objectness measure based on randomized SEEDS?
The objectness measure based on randomized SEEDS with 5 samples outperforms the one computed using only one sample, which emphasizes the usefulness of randomized SEEDS.
Q10. What is the objectness score for video objects?
The video objectness score is proposed as a volumetric extension of Eq. (6) in the time dimension, normalized by the tube volume (denoted as the 3D edge score in the experiments).
Q11. What is the main reason for the interest in having similar concepts extracted from time sequences?
With an increasing number of papers on the analysis of videos, the interest in having similar concepts extracted from time sequences is increasing as well.
Q12. Why is the video objectness score able to be seen as a form of multiple samples?
The video objectness score can itself be seen as a form of multiple samples: the score is the sum over 25 samples in time.
Q13. How can the authors evaluate the accuracy of the randomized superpixels?
The result of this experiment is shown in Fig. 8. In both cases (still images and video), a variation between samples of about 20-30% per frame can be induced without significantly affecting the accuracy of the superpixels.
Q14. What is the measure of the objectness in a still image?
In the following, the authors first define the measure of the objectness in a still image, and then the authors introduce how to extend it to temporal windows (tubes of bounding boxes).
Q15. What is the objectness measure introduced by Alexe et al.?
The objectness measure was introduced by Alexe et al. [1] for still images, whereafter [11] and [6] introduced new cues to boost performance.
Q16. What is the objectness score for the video?
To show the usefulness of the video objectness score (denoted as 3D edge in the figure), the authors compare it with a method that uses only propagation.
Q17. What is the way to extract the superpixel structure?
In practice, the authors use 4 block layers and propagate at the 2nd layer, as shown in Fig. 4. Some superpixel methods offer extra capabilities, such as the extraction of a hierarchy of superpixels [17].
Q18. What is the value of the variable for the uniform random noise in the interval?
That is,
$$\mathrm{int}\left(c_{B_n^t}, c_{A_m^{t:0}}\right) + a\xi \ge \mathrm{int}\left(c_{B_n^t}, c_{A_n^{t:0} \setminus B_n^t}\right), \tag{5}$$
where ξ is the variable for the uniform random noise in the interval [−1, 1] and a is a scale factor.
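The condition in Eq. (5) can be sketched as a noisy comparison of two histogram similarities. The sketch assumes that int(·,·) denotes histogram intersection over normalized histograms; the function names, argument names, and example histograms are hypothetical and chosen only for illustration.

```python
import random

def intersection(h1, h2):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def accept_move(c_block, c_target, c_source_rest, a, rng=random):
    """Noisy test of whether a block should move to the target superpixel.

    c_block:       histogram of the block B being considered
    c_target:      histogram of the candidate superpixel A_m
    c_source_rest: histogram of the current superpixel A_n without the block
    a:             scale of the uniform noise (a = 0 gives a deterministic test)
    """
    xi = rng.uniform(-1.0, 1.0)  # uniform random noise in [-1, 1]
    return intersection(c_block, c_target) + a * xi >= intersection(c_block, c_source_rest)

block = [1.0, 0.0]   # block is entirely in color bin 0
target = [0.9, 0.1]  # candidate superpixel is mostly bin 0
rest = [0.2, 0.8]    # remainder of the current superpixel is mostly bin 1

rng = random.Random(0)  # seeded generator so runs are repeatable
print(accept_move(block, target, rest, a=0.1, rng=rng))
```

With a = 0 the test reduces to a plain comparison of the two intersections; increasing a makes it more likely that borderline blocks are assigned differently from one sample to the next, which is the source of the variation between randomized samples.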
Q19. How can the authors use the algorithm to calculate the objectness of a video?
In this paper the authors have introduced a novel online video superpixel algorithm that is able to run in real-time, with accuracy comparable to offline methods.