Abstract:
We present a key-frame selection algorithm based on three iso-content principles (iso-content distance, iso-content error and iso-content distortion), so that the selected key frames are equidistant in video content according to the principle used. Two automatic approaches for defining the most appropriate number of key frames are proposed by exploiting supervised and unsupervised content criteria. Experimental results and comparisons with existing methods from the literature on a large dataset of real-life video sequences illustrate the high performance of the proposed schemata.
TL;DR: This paper formulates the video summarization task as a novel minimum sparse reconstruction (MSR) problem, where the original video sequence can be best reconstructed with as few selected keyframes as possible.
TL;DR: This work proposes a keypoint-based framework to address the keyframe selection problem so that local features can be employed in selecting keyframes, and introduces two criteria, coverage and redundancy, based on keypoint matching in the selection process.
TL;DR: The proposed Bag-of-Importance (BoI) model for static video summarization is able to exploit both the inter-frame and intra-frame properties of feature representations and identify keyframes capturing both the dominant content and discriminative details within a video.
TL;DR: An Eratosthenes Sieve based key-frame extraction approach for video summarization (VS) which can work better for real-time applications and outperform the state-of-the-art models on F-measure.
TL;DR: A new approach which interactively combines visualization of categorical changes over time; various spatial data displays; computational techniques for task-oriented selection of time steps provides an expressive visualization with regard to either the overall evolution over time or unusual changes.
TL;DR: An overview of color and texture descriptors that have been approved for the Final Committee Draft of the MPEG-7 standard is presented, explained in detail by their semantics, extraction and usage.
TL;DR: The results of a performance evaluation and characterization of a number of shot-change detection methods that use color histograms, block motion matching, or MPEG compressed data are presented.
TL;DR: This work proposes techniques to analyze video and build a compact pictorial summary for visual presentation and presents a set of video posters, each of which is a compact, visually pleasant, and intuitive representation of the story content.
TL;DR: A novel method for generating key frames and previews for an arbitrary video sequence by first applying multiple partitional clustering to all frames of a video sequence and then selecting the most suitable clustering option(s) using an unsupervised procedure for cluster-validity analysis.
TL;DR: In this application, video summaries that emphasize both content balance and perceptual quality can be generated directly from a temporal graph that embeds both the structure and attention information.
Q1. What are the popular video sequences that the authors have used?
The authors used the widely known MPEG test sequences, such as the coast sequence, the table tennis sequence, the hall monitor sequence, etc.
Q2. How long does it take to execute the proposed algorithm?
A typical processing time for the execution of the proposed EP algorithm, when the shot contains 300 images (e.g. the coast MPEG sequence) and M = 10, is between 4 and 5 seconds, depending on the principle used.
Q3. What is the first approach to extract key frames?
The first exploits a minimization of a cross correlation criterion [7], so that the most uncorrelated frames are extracted as key ones.
Q4. what is the goal of the proposed method?
The goal of the proposed method is to compute the M − 2 intermediate key frames t′i, i ∈ {2, · · · , M − 1}, under the constraint that they are equidistant in the sense of the used semimetric function g(x, y): g(t′i−1, t′i) = g(t′i, t′i+1), i ∈ {2, · · · , M − 1}, with t′1 = 0 and t′M = 1.
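A minimal Python sketch of this equipartition idea for the iso-content-distance case: accumulate the content distance g between consecutive frames and pick the frames closest to M equally spaced levels of the cumulative distance. This is a simplification, not the paper's iterative EP algorithm, and the function name and interface are illustrative.

```python
def select_key_frames(frames, g, M):
    """Pick M key-frame indices approximately equidistant in content.

    Approximates the iso-content-distance principle: the cumulative
    content distance g between consecutive frames is computed, and the
    frames nearest to M equally spaced levels of that cumulative
    distance are selected. The first and last frames are always chosen,
    matching the constraint t'1 = 0, t'M = 1.
    """
    # Cumulative content distance measured from the first frame.
    cum = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        cum.append(cum[-1] + g(prev, cur))
    total = cum[-1]

    keys = []
    for i in range(M):
        target = total * i / (M - 1)
        # Index of the frame whose cumulative distance is nearest the target.
        idx = min(range(len(frames)), key=lambda j: abs(cum[j] - target))
        if idx not in keys:
            keys.append(idx)
    return keys
```

With a toy "video" of scalar frames and g(a, b) = |a − b|, the selected frames split the total content change into equal parts.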
Q5. What is the definition of the equipartition problem?
Let g(x, y), where x, y ∈ [0, 1] denote normalized time variables, be the used smooth semimetric function between two curve points C(x), C(y).
Q6. What is the simplest way to measure the content distance of two CLDs?
The authors used the following function D to measure the content distance of two CLDs, {DY, DCb, DCr} and {DY′, DCb′, DCr′}: D = √(Σi (DYi − DY′i)²) + √(Σi (DCbi − DCb′i)²) + √(Σi (DCri − DCr′i)²), where DYi, DCbi, DCri denote the i-th DCT coefficients of the respective color components.
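A minimal sketch of this distance in Python, assuming each Color Layout Descriptor is stored as a dict of per-channel DCT coefficient lists (the dict layout and function name are illustrative, not from the paper):

```python
import math

def cld_distance(cld_a, cld_b):
    """Content distance D between two MPEG-7 Color Layout Descriptors.

    Each descriptor is a dict mapping the channels 'Y', 'Cb', 'Cr' to
    lists of DCT coefficients; D is the sum of the per-channel
    Euclidean distances, as in the formula above.
    """
    return sum(
        math.sqrt(sum((a - b) ** 2 for a, b in zip(cld_a[ch], cld_b[ch])))
        for ch in ("Y", "Cb", "Cr")
    )
```

For identical descriptors the distance is zero; otherwise each channel contributes the Euclidean distance between its coefficient vectors.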
Q7. What is the simplest way to solve the equipartition problem?
Let M be the number of the selected key frames and t′i ∈ [0, 1], i ∈ {1, · · · , M} be the selected key frames under the normalized time space.
Q8. what is the proposed method for selecting the first and last key frame?
The proposed method selects the first and the last key frame to be the first and the last frame of the shot sequence, respectively (t′1 = 0, t′M = 1).
Q9. What is the definition of the Iso-Content Distance principle?
If the authors define the total distortion as the maximum of the corresponding distortions, maxi∈{1,··· ,M−1} d̄(t′i, t′i+1), then almost optimal solutions are achieved using the proposed schema.