Bio: Sumana Gupta is an academic researcher at the Indian Institute of Technology Kanpur. The author has contributed to research in the topics of motion compensation and motion estimation, has an h-index of 11, and has co-authored 74 publications receiving 338 citations.
Topics: Motion compensation, Motion estimation, Image restoration, Automatic summarization, Video tracking
07 Nov 2009
TL;DR: This paper proposes a semi-automatic colorization process in which the user indicates how each region should be colored by placing the desired color marker in the interior of the region, and the proposed algorithm then colors the entire video sequence.
Abstract: Colorization is a computer-aided process of adding color to a grayscale image or video. Colorizing a grayscale image involves assigning three-dimensional (RGB) pixel values to an image that varies along only one dimension (luminance, or intensity). Since different colors may have the same luminance value but differ in hue or saturation, the mapping between intensity and color is not unique, and colorization is inherently ambiguous, requiring some amount of human interaction or external information. In this paper we propose a semi-automatic colorization process in which the user indicates how each region should be colored by placing the desired color marker in the interior of the region. Based on the position and color of the markers, the algorithm segments the image and colors it. To colorize videos, a few reference frames are chosen manually from a set of automatically generated key frames and colorized using the above marker approach; their chrominance information is then transferred to the other frames of the video by a color-transfer technique based on motion estimation. The colorization results obtained are visually very good. In addition, the amount of manual intervention is reduced, since the user only has to apply color markers to a few selected reference frames and the proposed algorithm colors the entire video sequence.
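The color-transfer step is described only at a high level; the sketch below assumes classic exhaustive block matching on the luminance channel followed by per-block copying of reference chrominance. Function names and parameters are illustrative, not the paper's.

```python
import numpy as np

def block_match(ref_y, cur_y, bs=8, search=4):
    """Estimate one motion vector per bs-by-bs block of the current
    grayscale frame by exhaustive SAD search in the reference luminance."""
    H, W = cur_y.shape
    mv = np.zeros((H // bs, W // bs, 2), dtype=int)
    for bi in range(H // bs):
        for bj in range(W // bs):
            y0, x0 = bi * bs, bj * bs
            cur = cur_y[y0:y0 + bs, x0:x0 + bs]
            best, best_sad = (0, 0), np.inf
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y0 + dy, x0 + dx
                    if 0 <= yy <= H - bs and 0 <= xx <= W - bs:
                        sad = np.abs(cur - ref_y[yy:yy + bs, xx:xx + bs]).sum()
                        if sad < best_sad:
                            best_sad, best = sad, (dy, dx)
            mv[bi, bj] = best
    return mv

def transfer_chroma(ref_cbcr, mv, bs=8):
    """Copy the chrominance of each matched reference block into the
    corresponding block of the (still grayscale) current frame."""
    out = np.zeros_like(ref_cbcr)
    for bi in range(mv.shape[0]):
        for bj in range(mv.shape[1]):
            dy, dx = mv[bi, bj]
            y0, x0 = bi * bs, bj * bs
            out[y0:y0 + bs, x0:x0 + bs] = ref_cbcr[y0 + dy:y0 + dy + bs,
                                                   x0 + dx:x0 + dx + bs]
    return out
```

Here `ref_cbcr` would hold the Cb/Cr channels of a marker-colorized reference frame; the current frame keeps its own luminance and receives only the transferred chrominance.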
08 May 2014
TL;DR: This work proposes a fast algorithm to increase the contrast of an image locally using a singular value decomposition (SVD) approach, and attempts to define parameters that give clues to the progress of the enhancement process.
Abstract: Image enhancement is a well-established field in image processing. Its main objective is to increase the perceptual information contained in an image for better representation, using intermediate steps such as contrast enhancement, deblurring, and denoising. Among these, contrast enhancement is especially important, as human eyes are more sensitive to the luminance than to the chrominance components of an image. Most contrast enhancement algorithms proposed so far are global methods. The major drawback of this global approach is that, in practical scenarios, the contrast of an image does not deteriorate uniformly, and the outputs of the enhancement techniques saturate at properly contrasted points, which leads to information loss. In fact, to the best of our knowledge, no non-reference perceptual measure of image quality has yet been proposed to measure localized enhancement. We propose a fast algorithm to increase the contrast of an image locally using a singular value decomposition (SVD) approach, and attempt to define some parameters that give clues to the progress of the enhancement process.
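The abstract does not spell out the local SVD formulation. As a minimal sketch of the general idea, the code below assumes a Demirel-style singular-value equalization (the scale factor ξ is the ratio of the largest singular values of a histogram-equalized tile and the original tile), applied tile by tile rather than globally; the helper names are hypothetical.

```python
import numpy as np

def equalize(img):
    # simple histogram equalization to [0, 255]; used only to derive the scale
    hist, _ = np.histogram(img.flatten(), bins=256, range=(0, 256))
    cdf = hist.cumsum()
    cdf = 255.0 * cdf / cdf[-1]
    return cdf[img.astype(np.uint8)]

def svd_enhance_block(block):
    # xi = ratio of the largest singular values of the equalized and
    # original tiles; note that scaling every singular value by the same
    # xi acts as a tile-wise gain -- the paper's local scheme is
    # presumably more elaborate than this illustration
    U, s, Vt = np.linalg.svd(block.astype(float), full_matrices=False)
    se = np.linalg.svd(equalize(block), compute_uv=False)
    xi = se[0] / s[0] if s[0] > 0 else 1.0
    return np.clip(U @ np.diag(xi * s) @ Vt, 0, 255)

def local_svd_enhance(img, tile=32):
    # apply the SVD scaling independently on each tile for a local effect
    out = np.zeros_like(img, dtype=float)
    H, W = img.shape
    for i in range(0, H, tile):
        for j in range(0, W, tile):
            out[i:i + tile, j:j + tile] = svd_enhance_block(img[i:i + tile, j:j + tile])
    return out.astype(np.uint8)
```

On a dark, low-contrast tile the equalized reference has larger singular values, so ξ > 1 stretches the tile's dynamic range toward the full [0, 255] interval.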
TL;DR: This paper introduces for the first time the features of the human visual system within the summarization framework itself to allow for the emphasis of perceptually significant events while simultaneously eliminating perceptual redundancy from the summaries.
Abstract: The enormous growth of video content in recent times has raised the need to abbreviate that content for human consumption. Thus, there is a need for summaries of a quality that meets the requirements of human users, which in turn means that summarization must incorporate the peculiar features of human perception. We present a new framework for video summarization in this paper. Unlike many available summarization algorithms that exploit only statistical redundancy, we introduce for the first time the features of the human visual system within the summarization framework itself, allowing perceptually significant events to be emphasized while perceptual redundancy is simultaneously eliminated from the summaries. The framework has been evaluated using both subjective and objective scores.
14 Jul 2017
TL;DR: An overview of how to use an optimal summarization framework for surveillance videos, together with a proposal to convert the content-based video retrieval problem into a content-based image retrieval problem.
Abstract: In recent years, video surveillance technology has become ubiquitous in every sphere of our life. But automated video surveillance generates huge quantities of data, which ultimately rely upon manual inspection at some stage. The present work aims to address this ever-increasing gap between the volume of data actually generated and the volume that can reasonably be inspected manually. It is laborious and time-consuming to scrutinize the salient events in large video databases. We introduce smart surveillance by using video summarization for various applications; techniques like video summarization epitomize the vast content of a video in a succinct manner. In this paper, we give an overview of how to use an optimal summarization framework for surveillance videos. In addition, to reduce the search time, we propose to convert the content-based video retrieval problem into a content-based image retrieval problem. We have performed several experiments on different data sets to validate our proposed approach for smart surveillance.
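The abstract does not say which image features the retrieval step uses once the video problem has been reduced to image retrieval over summary key frames. A minimal sketch, assuming gray-level histograms ranked by histogram intersection (both choices are illustrative):

```python
import numpy as np

def hist_feature(img, bins=16):
    # normalized gray-level histogram as a simple image descriptor
    h, _ = np.histogram(img, bins=bins, range=(0, 256), density=True)
    return h

def retrieve(query_img, keyframes, k=3):
    # rank summary key frames by histogram intersection with the query image
    q = hist_feature(query_img)
    scores = [np.minimum(q, hist_feature(f)).sum() for f in keyframes]
    return np.argsort(scores)[::-1][:k]
```

Because only the key frames of each summarized video are indexed, a query touches a few descriptors per video instead of every frame, which is the source of the claimed search-time reduction.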
TL;DR: This is the first time that accident detection is solved as an optimization problem and the frames to be selected are filtered through a single formulation, for vehicular accidents.
Abstract: Roads are the vital mode of transportation for people and goods around the globe, and their use has grown dramatically over the years. There is one death every four minutes due to road accidents in the developing nations. This is of deep concern to the entire humanity. Road accident detection and vehicle behavior analysis are of great interest to the research community in intelligent transportation systems. It is very difficult, with state-of-the-art techniques, to provide an abstract form of the salient parts of accidents from road surveillance videos. To resolve these issues, we present perceptual video summarization techniques that speed up visualization of the accident content in a stack of videos. The problem of vehicle analysis is formulated as an optimization problem. To the best of our knowledge, this is the first time accident detection is solved as an optimization problem, with the frames to be selected filtered through a single formulation. With a camera in the surrounding infrastructure capturing video, we exploit the properties of submodularity to provide a relevant and condensed key-frame summary. We have studied the approach on various real-world traffic surveillance videos comprising vehicular accidents, making it a promising approach.
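The abstract's objective function is not given. As an illustrative sketch of why submodularity helps here, consider a facility-location coverage objective f(S) = Σ_i max_{j∈S} sim(i, j): it is monotone submodular, so greedy selection is guaranteed to be within (1 − 1/e) of the optimal key-frame set. The per-frame feature vectors are hypothetical descriptors, not the paper's.

```python
import numpy as np

def greedy_keyframes(features, k):
    """Greedily maximize the facility-location objective
    f(S) = sum_i max_{j in S} sim(i, j) over frame feature vectors."""
    X = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = X @ X.T                      # cosine similarity between frames
    n = len(X)
    selected = []
    best = np.zeros(n)                 # best[i] = max sim of frame i to S
    for _ in range(k):
        # marginal gain of adding each candidate j: f(S + {j}) - f(S)
        gains = np.maximum(sim, best).sum(axis=1) - best.sum()
        gains[selected] = -np.inf      # never re-pick a selected frame
        j = int(np.argmax(gains))
        selected.append(j)
        best = np.maximum(best, sim[j])
    return selected
```

Diminishing marginal gains are visible directly in the update: once `best` covers a frame well, later candidates earn little from it, which is exactly the submodularity the abstract exploits.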
01 Jun 2019
TL;DR: A visual-attention-consistent Densely Annotated VSOD (DAVSOD) dataset, containing 226 videos with 23,938 frames that cover diverse realistic scenes, objects, instances, and motions, together with a baseline model equipped with a saliency-shift-aware convLSTM that efficiently captures video saliency dynamics by learning human attention-shift behavior.
Abstract: The last decade has witnessed a growing interest in video salient object detection (VSOD). However, the research community has long lacked a well-established VSOD dataset representative of real dynamic scenes with high-quality annotations. To address this issue, we elaborately collected a visual-attention-consistent Densely Annotated VSOD (DAVSOD) dataset, which contains 226 videos with 23,938 frames covering diverse realistic scenes, objects, instances, and motions. With corresponding real human eye-fixation data, we obtain precise ground truths. This is the first work that explicitly emphasizes the challenge of saliency shift, i.e., that the salient object(s) in a video may change dynamically. To further contribute a complete benchmark to the community, we systematically assess 17 representative VSOD algorithms over seven existing VSOD datasets and our DAVSOD, with ~84K frames in total (the largest scale to date). Using three well-known metrics, we then present a comprehensive and insightful performance analysis. Furthermore, we propose a baseline model equipped with a saliency-shift-aware convLSTM, which can efficiently capture video saliency dynamics by learning human attention-shift behavior. Extensive experiments open up promising future directions for model development and comparison.
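The three metrics are not named in the abstract; mean absolute error (MAE) and the F-measure (conventionally with β² = 0.3 in salient object detection) are among the most widely used, and can be sketched as:

```python
import numpy as np

def mae(pred, gt):
    # mean absolute error between a [0, 1] saliency map and binary ground truth
    return np.abs(pred.astype(float) - gt.astype(float)).mean()

def f_measure(pred, gt, beta2=0.3, thresh=0.5):
    # F-measure with beta^2 = 0.3, the usual convention in salient
    # object detection (weights precision above recall)
    b = pred >= thresh
    tp = np.logical_and(b, gt).sum()
    prec = tp / max(b.sum(), 1)
    rec = tp / max(gt.sum(), 1)
    return (1 + beta2) * prec * rec / max(beta2 * prec + rec, 1e-8)
```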
01 Nov 1985
A.K. Krishnamurthy • Institutions (1)
TL;DR: An image-processing assignment covering spatial and frequency-domain filtering and a simplified Canny edge detector, with code and outputs to be turned in as a single readable report.
Abstract: You must turn in your code as well as output files. Please generate a report that contains the code and output in a single readable format.
Getting Started: You may want to download the Irfanview image-viewing software. It handles pretty much any image type, lets you convert, and provides batch processing. Download the sample images from the class website.
The following question operates on the city.jpg image. (a) Perform image smoothing using a 7×7 averaging filter and a Gaussian filter with σ = 0.5 and 3. Compare the outputs. (b) Perform edge enhancement using the Sobel operator (Matlab's default parameters). Repeat using the Laplacian and Laplacian of Gaussian operators. Compare the outputs.
2. Frequency Domain Filtering. The following question operates on the city.jpg image. (a) Find the Fourier transform of the image. Be sure to center the frequencies. (b) Perform image smoothing in the frequency domain using the filters defined in the previous problem. Compare the output images from the two methods (spatial and frequency) and the time for operation. (c) Perform edge enhancement using the filters defined in the previous problem. (d) Define a lowpass filter in the frequency domain with a radius of 1/4 the height. Show the result. Repeat with a similarly sized Gaussian and compare the results. Give the σ parameter you used and show the output transform image. (e) Repeat with a rectangular filter with the same dimensions as the ideal lowpass. Compare the results between the ideal filter and the rectangular approximation.
3. Canny Edge Detection. (a) Give the convolution kernels for determining the gradient. You may examine the function gradient.m to help with the explanation. (It may be easiest to apply the gradient to an impulse and inspect the results.) (b) Implement the simplified version of the Canny edge detector (single scale).
The syntax of the function should be [E, M, A] = canny(I, sig, tau), where E contains the detected edges, M the smoothed gradient magnitude, A the gradient angle, I is the input image, sig is the σ parameter for the smoothing filter, and tau = [τ_h, τ_l] is the two-element vector containing the hysteresis thresholds. See Algorithm 6.4 for non-maximal suppression and Algorithm 6.5 for hysteresis thresholding. (It may be more efficient to implement the hysteresis as edge tracking.)
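A compact Python sketch of part (b), following the description above: Gaussian smoothing, central-difference gradients, non-maximal suppression along the quantized gradient direction, and hysteresis implemented as iterative 4-connected region growing (an approximation for brevity, not the text's Algorithms 6.4/6.5).

```python
import numpy as np

def gaussian_kernel1d(sig):
    radius = max(1, int(3 * sig))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sig**2))
    return k / k.sum()

def smooth(I, sig):
    # separable Gaussian: convolve rows, then columns, with edge padding
    k = gaussian_kernel1d(sig)
    pad = len(k) // 2
    P = np.pad(I, pad, mode='edge')
    rows = np.apply_along_axis(np.convolve, 1, P, k, 'valid')
    return np.apply_along_axis(np.convolve, 0, rows, k, 'valid')

def canny(I, sig, tau):
    tau_h, tau_l = tau
    S = smooth(I.astype(float), sig)
    gy, gx = np.gradient(S)            # central differences
    M = np.hypot(gx, gy)               # smoothed gradient magnitude
    A = np.arctan2(gy, gx)             # gradient angle
    # non-maximal suppression: keep only local maxima of M along the
    # quantized gradient direction
    H, W = M.shape
    N = np.zeros_like(M)
    ang = (np.rad2deg(A) + 180.0) % 180.0
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            a = ang[i, j]
            if a < 22.5 or a >= 157.5:     # ~horizontal gradient
                n1, n2 = M[i, j - 1], M[i, j + 1]
            elif a < 67.5:                 # ~45 degrees
                n1, n2 = M[i - 1, j + 1], M[i + 1, j - 1]
            elif a < 112.5:                # ~vertical gradient
                n1, n2 = M[i - 1, j], M[i + 1, j]
            else:                          # ~135 degrees
                n1, n2 = M[i - 1, j - 1], M[i + 1, j + 1]
            if M[i, j] >= n1 and M[i, j] >= n2:
                N[i, j] = M[i, j]
    # hysteresis: strong pixels seed edges; weak pixels join if connected
    strong, weak = N >= tau_h, N >= tau_l
    E = strong.copy()
    while True:
        grow = E.copy()
        grow[1:, :] |= E[:-1, :]; grow[:-1, :] |= E[1:, :]
        grow[:, 1:] |= E[:, :-1]; grow[:, :-1] |= E[:, 1:]
        grow &= weak
        grow |= E
        if np.array_equal(grow, E):
            break
        E = grow
    return E, M, A
```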
01 Jan 2016
Handbook of Image and Video Processing.
24 Jul 1994