Human visual system model

About: Human visual system model is a research topic. Over its lifetime, 8,697 publications have been published within this topic, receiving 259,440 citations.


Papers
Journal Article
TL;DR: A new philosophy in designing image and video quality metrics is followed, which uses structural distortion as an estimate of perceived visual distortion as part of full-reference (FR) video quality assessment.
Abstract: Objective image and video quality measures play important roles in a variety of image and video processing applications, such as compression, communication, printing, analysis, registration, restoration, enhancement and watermarking. Most proposed quality assessment approaches in the literature are error sensitivity-based methods. In this paper, we follow a new philosophy in designing image and video quality metrics, which uses structural distortion as an estimate of perceived visual distortion. A computationally efficient approach is developed for full-reference (FR) video quality assessment. The algorithm is tested on the video quality experts group (VQEG) Phase I FR-TV test data set.
Keywords: Image quality assessment, video quality assessment, human visual system, error sensitivity, structural distortion, video quality experts group (VQEG)

1,083 citations

Proceedings Article
25 Apr 2018
TL;DR: This work proposes to extend an object recognition system with an attention-based few-shot classification weight generator, and to redesign the classifier of a ConvNet model as the cosine similarity function between feature representations and classification weight vectors.
Abstract: The human visual system has the remarkable ability to effortlessly learn novel concepts from only a few examples. Mimicking the same behavior on machine learning vision systems is an interesting and very challenging research problem with many practical advantages on real world vision applications. In this context, the goal of our work is to devise a few-shot visual learning system that, at test time, can efficiently learn novel categories from only a few training examples while not forgetting the initial categories on which it was trained (here called base categories). To achieve that goal we propose (a) to extend an object recognition system with an attention-based few-shot classification weight generator, and (b) to redesign the classifier of a ConvNet model as the cosine similarity function between feature representations and classification weight vectors. The latter, apart from unifying the recognition of both novel and base categories, also leads to feature representations that generalize better on "unseen" categories. We extensively evaluate our approach on Mini-ImageNet where we manage to improve the prior state-of-the-art on few-shot recognition (i.e., we achieve 56.20% and 73.00% on the 1-shot and 5-shot settings respectively) while at the same time we do not sacrifice any accuracy on the base categories, which is a characteristic that most prior approaches lack. Finally, we apply our approach to the recently introduced few-shot benchmark of Bharath and Girshick [4] where we also achieve state-of-the-art results.

1,082 citations
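
The cosine-similarity classifier mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the fixed `scale` value, and the toy vectors are assumptions made for illustration, and the attention-based weight generator is left out entirely. The sketch only shows the idea of scoring a feature vector against each class weight vector after L2-normalizing both.

```python
import math


def l2_normalize(vec):
    """Return vec scaled to unit L2 norm (unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_logits(feature, class_weights, scale=10.0):
    """Score a feature against every class weight vector.

    Each score is scale * cos(feature, weight). The scale factor is an
    assumed hyperparameter for this sketch, not a value from the source.
    """
    f = l2_normalize(feature)
    logits = []
    for w in class_weights:
        w_n = l2_normalize(w)
        logits.append(scale * sum(a * b for a, b in zip(f, w_n)))
    return logits


def predict(feature, class_weights):
    """Return the index of the class with the highest cosine score."""
    logits = cosine_logits(feature, class_weights)
    return max(range(len(logits)), key=logits.__getitem__)


# Toy example: class 0 has a weight vector with a very large norm,
# class 1 a very small one. Only the direction matters here.
weights = [[100.0, 0.0], [0.0, 0.01]]
print(predict([0.1, 1.0], weights))
```

With a plain dot product, the toy feature would be assigned to class 0 simply because that weight vector has a much larger norm. After normalization only the direction counts, which is one plausible reason such a classifier can treat base-category and newly added weight vectors on an equal footing.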

Journal Article
TL;DR: A novel observation model based on motion compensated subsampling is proposed for a video sequence and Bayesian restoration with a discontinuity-preserving prior image model is used to extract a high-resolution video still given a short low-resolution sequence.
Abstract: The human visual system appears to be capable of temporally integrating information in a video sequence in such a way that the perceived spatial resolution of a sequence appears much higher than the spatial resolution of an individual frame. While the mechanisms in the human visual system that do this are unknown, the effect is not too surprising given that temporally adjacent frames in a video sequence contain slightly different, but unique, information. This paper addresses the use of both the spatial and temporal information present in a short image sequence to create a single high-resolution video frame. A novel observation model based on motion compensated subsampling is proposed for a video sequence. Since the reconstruction problem is ill-posed, Bayesian restoration with a discontinuity-preserving prior image model is used to extract a high-resolution video still given a short low-resolution sequence. Estimates computed from a low-resolution image sequence containing a subpixel camera pan show dramatic visual and quantitative improvements over bilinear, cubic B-spline, and Bayesian single frame interpolations. Visual and quantitative improvements are also shown for an image sequence containing objects moving with independent trajectories. Finally, the video frame extraction algorithm is used for the motion-compensated scan conversion of interlaced video data, with a visual comparison to the resolution enhancement obtained from progressively scanned frames.

1,058 citations

Journal Article
TL;DR: The capabilities of the human visual system with respect to face identification, analysis of facial expressions, and classification based on physical features of the face are discussed.

1,008 citations

Journal Article
TL;DR: A new approach to mask the watermark according to the characteristics of the human visual system (HVS) is presented, which is accomplished pixel by pixel by taking into account the texture and the luminance content of all the image subbands.
Abstract: A watermarking algorithm operating in the wavelet domain is presented. Performance improvement with respect to existing algorithms is obtained by means of a new approach to mask the watermark according to the characteristics of the human visual system (HVS). In contrast to conventional methods operating in the wavelet domain, masking is accomplished pixel by pixel by taking into account the texture and the luminance content of all the image subbands. The watermark consists of a pseudorandom sequence which is adaptively added to the largest detail bands. As usual, the watermark is detected by computing the correlation between the watermarked coefficients and the watermarking code, and the detection threshold is chosen in such a way that the knowledge of the watermark energy used in the embedding phase is not needed, thus permitting one to adapt it to the image at hand. Experimental results and comparisons with other techniques operating in the wavelet domain prove the effectiveness of the new algorithm.

949 citations
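
The embed-then-correlate scheme described in the abstract above can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it uses a single uniform embedding strength `alpha` in place of the pixel-by-pixel HVS mask, a synthetic list of numbers in place of real wavelet detail coefficients, and it does not reproduce the paper's detection threshold. All names and values are assumptions.

```python
import random


def make_watermark(key, length):
    """Pseudorandom +/-1 sequence derived from a secret integer key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed(coeffs, watermark, alpha=2.0):
    """Additively embed the watermark with uniform strength alpha.

    A perceptual scheme would replace alpha with a per-coefficient
    mask; a constant is used here only to keep the sketch short.
    """
    return [c + alpha * w for c, w in zip(coeffs, watermark)]


def correlation(coeffs, watermark):
    """Mean product between coefficients and a candidate watermark."""
    return sum(c * w for c, w in zip(coeffs, watermark)) / len(coeffs)


# Synthetic stand-in for detail-band coefficients (assumed Gaussian).
host_rng = random.Random(0)
host = [host_rng.gauss(0.0, 10.0) for _ in range(4096)]

marked = embed(host, make_watermark(key=42, length=len(host)))

# Detection: correlate with the sequence regenerated from each key.
rho_right = correlation(marked, make_watermark(42, len(marked)))
rho_wrong = correlation(marked, make_watermark(7, len(marked)))
print(rho_right, rho_wrong)
```

The correlation for the correct key should sit near `alpha`, while the correlation for an unrelated key should stay near zero. A detector would compare this value against a threshold, and the abstract states that the threshold is chosen without knowledge of the embedding energy; that step is not modeled here.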


Network Information
Related Topics (5)
Feature (computer vision)
128.2K papers, 1.7M citations
89% related
Feature extraction
111.8K papers, 2.1M citations
86% related
Image segmentation
79.6K papers, 1.8M citations
86% related
Image processing
229.9K papers, 3.5M citations
85% related
Convolutional neural network
74.7K papers, 2M citations
84% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    49
2022    94
2021    279
2020    311
2019    351
2018    348