Unsupervised video summarization framework using keyframe extraction and video skimming

doi:10.1109/ICCCA49541.2020.9250764

Open AccessProceedings ArticleDOI

Unsupervised video summarization framework using keyframe extraction and video skimming

Shruti Jadon, +1 more

- 10 Oct 2019 -

arXiv: Information Retrieval

TLDR

This paper attempts to solve video summarization through unsupervised learning by employing traditional vision-based algorithmic methodologies for accurate feature extraction from video frames and proposes a deep learning-based feature extraction followed by multiple clustering methods to find an effective way of summarizing a video by interesting key-frame extraction.

Abstract:

Video is one of the robust sources of information and the consumption of online and offline videos has reached an unprecedented level in the last few years. A fundamental challenge of extracting information from videos is a viewer has to go through the complete video to understand the context, as opposed to an image where the viewer can extract information from a single frame. Apart from context understanding, it almost impossible to create a universal summarized video for everyone, as everyone has their own bias of keyframe, e.g; In a soccer game, a coach person might consider those frames which consist of information on player placement, techniques, etc; however, a person with less knowledge about a soccer game, will focus more on frames which consist of goals and score-board. Therefore, if we were to tackle problem video summarization through a supervised learning path, it will require extensive personalized labeling of data. In this paper, we attempt to solve video summarization through unsupervised learning by employing traditional vision-based algorithmic methodologies for accurate feature extraction from video frames. We have also proposed a deep learning-based feature extraction followed by multiple clustering methods to find an effective way of summarizing a video by interesting key-frame extraction. We have compared the performance of these approaches on the SumMe dataset and showcased that using deep learning-based feature extraction has been proven to perform better in case of dynamic viewpoint videos.

Unsupervised video summarization framework using keyframe extraction and video skimming

Citations

Improving Siamese Networks for One-Shot Learning Using Kernel-Based Activation Functions

A Multimodal Corpus for Emotion Recognition in Sarcasm

Video Summarization Techniques: A Review

Deep Learning Framework Based on Audio–Visual Features for Video Summarization

VSMCNN-dynamic summarization of videos using salient features from multi-CNN model

References

Histograms of oriented gradients for human detection

ImageNet Large Scale Visual Recognition Challenge

Face Description with Local Binary Patterns: Application to Face Recognition

Lucas/Kanade meets Horn/Schunck: combining local and global optic flow methods

Image enhancement based on equal area dualistic sub-image histogram equalization method

Related Papers (5)

Unsupervised video summarization framework using keyframe extraction and video skimming

Video Summarization using Deep Semantic Features

Frame clustering technique towards single video summarization

Comprehensive Video Understanding: Video Summarization with Content-Based Video Recommender Design

Extractive Video Summarizer with Memory Augmented Neural Networks