scispace - formally typeset
Topic

Human visual system model

About: Human visual system model is a research topic. Over its lifetime, 8,697 publications on this topic have received 259,440 citations.


Papers
Proceedings ArticleDOI
21 Jul 2017
TL;DR: Zhang et al. as discussed by the authors proposed a Visual Translation Embedding Network (VTransE) for visual relation detection, which places objects in a low-dimensional relation space where a relation can be modeled as a simple vector translation, i.e., subject + predicate ≈ object.
Abstract: Visual relations, such as person ride bike and bike next to car, offer a comprehensive scene understanding of an image, and have already shown their great utility in connecting computer vision and natural language. However, due to the challenging combinatorial complexity of modeling subject-predicate-object relation triplets, very little work has been done to localize and predict visual relations. Inspired by the recent advances in relational representation learning of knowledge bases and convolutional object detection networks, we propose a Visual Translation Embedding network (VTransE) for visual relation detection. VTransE places objects in a low-dimensional relation space where a relation can be modeled as a simple vector translation, i.e., subject + predicate ≈ object. We propose a novel feature extraction layer that enables object-relation knowledge transfer in a fully-convolutional fashion that supports training and inference in a single forward/backward pass. To the best of our knowledge, VTransE is the first end-to-end relation detection network. We demonstrate the effectiveness of VTransE over other state-of-the-art methods on two large-scale datasets: Visual Relationship and Visual Genome. Note that even though VTransE is a purely visual model, it is still competitive with Lu's multi-modal model with language priors [27].
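The translation principle behind VTransE can be illustrated with a minimal numpy sketch (the embeddings and names below are hypothetical, not taken from the paper): the correct object for a subject-predicate pair should lie near subject + predicate in the learned relation space, so scoring reduces to a vector distance.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # illustrative embedding size

# Hypothetical low-dimensional embeddings for a subject, a predicate,
# and two candidate objects.
person = rng.normal(size=dim)
ride = rng.normal(size=dim)
bike = person + ride + 0.01 * rng.normal(size=dim)  # near-perfect translation
car = rng.normal(size=dim)                          # unrelated object

def translation_score(subj, pred, obj):
    """Lower is better: residual of the translation subj + pred ≈ obj."""
    return float(np.linalg.norm(subj + pred - obj))

# The correct object yields a much smaller translation residual.
print(translation_score(person, ride, bike))  # small
print(translation_score(person, ride, car))   # large
```

In the actual network these embeddings are produced by convolutional features rather than fixed vectors, but the same residual drives both training and relation prediction.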

484 citations

Journal ArticleDOI
TL;DR: A deep neural network-based approach to image quality assessment (IQA) that allows for joint learning of local quality and local weights in a unified framework and shows a high ability to generalize between different databases, indicating a high robustness of the learned features.
Abstract: We present a deep neural network-based approach to image quality assessment (IQA). The network is trained end-to-end and comprises ten convolutional layers and five pooling layers for feature extraction, and two fully connected layers for regression, which makes it significantly deeper than related IQA models. Unique features of the proposed architecture are that: 1) with slight adaptations it can be used in a no-reference (NR) as well as in a full-reference (FR) IQA setting and 2) it allows for joint learning of local quality and local weights, i.e., relative importance of local quality to the global quality estimate, in a unified framework. Our approach is purely data-driven and does not rely on hand-crafted features or other types of prior domain knowledge about the human visual system or image statistics. We evaluate the proposed approach on the LIVE, CSIQ, and TID2013 databases as well as the LIVE In the Wild Image Quality Challenge database and show superior performance to state-of-the-art NR and FR IQA methods. Finally, cross-database evaluation shows a high ability to generalize between different databases, indicating a high robustness of the learned features.

479 citations

Proceedings ArticleDOI
10 Sep 2000
TL;DR: A new approach that can blindly measure blocking artifacts in images without reference to the originals is proposed, which has the flexibility to integrate human visual system features such as the luminance and the texture masking effects.
Abstract: The objective measurement of blocking artifacts plays an important role in the design, optimization, and assessment of image and video coding systems. We propose a new approach that can blindly measure blocking artifacts in images without reference to the originals. The key idea is to model the blocky image as a non-blocky image interfered with a pure blocky signal. The task of the blocking effect measurement algorithm is then to detect and evaluate the power of the blocky signal. The proposed approach has the flexibility to integrate human visual system features such as the luminance and the texture masking effects.
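The core idea, treating a blocky image as a non-blocky image plus a periodic blocky signal, can be sketched as follows. This is a simplified stand-in for the paper's blind measure (the function and threshold are illustrative): edge energy concentrated at 8-pixel block boundaries, relative to the image's overall edge energy, indicates blocking.

```python
import numpy as np

def blockiness_power(img, block=8):
    """Estimate the relative power of a periodic 'blocky signal'.

    Averages absolute horizontal differences per column, then compares
    energy at block-boundary columns against the overall mean.
    A ratio well above 1 indicates visible blocking artifacts.
    """
    img = np.asarray(img, dtype=float)
    diff = np.abs(np.diff(img, axis=1)).mean(axis=0)  # per-column edge energy
    boundary = diff[block - 1::block].mean()          # at 8-pixel boundaries
    overall = diff.mean()
    return float(boundary / (overall + 1e-12))

# Synthetic blocky image: constant 8x8 blocks with random DC levels,
# versus a smooth horizontal gradient of the same size.
rng = np.random.default_rng(1)
levels = rng.uniform(0, 255, size=(8, 8))
blocky = np.kron(levels, np.ones((8, 8)))
smooth = np.tile(np.linspace(0, 255, 64), (64, 1))

print(blockiness_power(blocky) > blockiness_power(smooth))  # True
```

No reference image is needed, which is what makes the measure "blind"; the paper additionally modulates such a measure with luminance and texture masking from the human visual system.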

473 citations

Journal ArticleDOI
TL;DR: These experiments show how top-down visual knowledge, acquired through implicit learning, constrains what to expect and guides where to attend and look and shows that regularities in dynamic visual environments can also be learned to guide search behavior.
Abstract: The visual environment is extremely rich and complex, producing information overload for the visual system. But the environment also embodies structure in the form of redundancies and regularities that may serve to reduce complexity. How do perceivers internalize this complex informational structure? We present new evidence of visual learning that illustrates how observers learn how objects and events covary in the visual world. This information serves to guide visual processes such as object recognition and search. Our first experiment demonstrates that search and object recognition are facilitated by learned associations (covariation) between novel visual shapes. Our second experiment shows that regularities in dynamic visual environments can also be learned to guide search behavior. In both experiments, learning occurred incidentally and the memory representations were implicit. These experiments show how top-down visual knowledge, acquired through implicit learning, constrains what to expect and guides where to attend and look.

469 citations

Journal ArticleDOI
TL;DR: This paper considers the task of detection of a weak signal in a noisy image and suggests the Hotelling model with channels as a useful model observer for the purpose of assessing and optimizing image quality with respect to simple detection tasks.
Abstract: Image quality can be defined objectively in terms of the performance of some "observer" (either a human or a mathematical model) for some task of practical interest. If the end user of the image will be a human, model observers are used to predict the task performance of the human, as measured by psychophysical studies, and hence to serve as the basis for optimization of image quality. In this paper, we consider the task of detection of a weak signal in a noisy image. The mathematical observers considered include the ideal Bayesian, the nonprewhitening matched filter, a model based on linear-discriminant analysis and referred to as the Hotelling observer, and the Hotelling and Bayesian observers modified to account for the spatial-frequency-selective channels in the human visual system. The theory behind these observer models is briefly reviewed, and several psychophysical studies relating to the choice among them are summarized. Only the Hotelling model with channels is mathematically tractable in all cases considered here and capable of accounting for all of these data. This model requires no adjustment of parameters to fit the data and is relatively insensitive to the details of the channel mechanism. We therefore suggest it as a useful model observer for the purpose of assessing and optimizing image quality with respect to simple detection tasks.
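The Hotelling observer referenced above is a linear discriminant: its template is the noise covariance inverse applied to the expected signal, and its detectability index follows directly. A minimal numpy sketch (the toy covariance and signal are illustrative, and the channelized variant from the paper is omitted):

```python
import numpy as np

def hotelling_template(cov, signal):
    """Hotelling observer template w = K^{-1} s.

    cov:    covariance matrix K of the image noise
    signal: expected difference of class means (the weak signal) s
    """
    return np.linalg.solve(cov, signal)

def hotelling_snr(cov, signal):
    """Detectability: SNR = sqrt(s^T K^{-1} s)."""
    return float(np.sqrt(signal @ np.linalg.solve(cov, signal)))

# Toy example: white noise with unit variance and a weak 3-pixel signal.
n = 16
cov = np.eye(n)
signal = np.zeros(n)
signal[5:8] = 0.5

w = hotelling_template(cov, signal)
# Detection reduces to thresholding the linear statistic w @ image.
print(hotelling_snr(cov, signal))  # sqrt(3 * 0.25) ≈ 0.866
```

The channelized version in the paper replaces pixel space with the responses of a small set of spatial-frequency-selective channels, which keeps the covariance estimable while matching human data.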

465 citations


Network Information
Related Topics (5)
Feature (computer vision): 128.2K papers, 1.7M citations, 89% related
Feature extraction: 111.8K papers, 2.1M citations, 86% related
Image segmentation: 79.6K papers, 1.8M citations, 86% related
Image processing: 229.9K papers, 3.5M citations, 85% related
Convolutional neural network: 74.7K papers, 2M citations, 84% related
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    49
2022    94
2021    279
2020    311
2019    351
2018    348