scispace - formally typeset
Search or ask a question
Journal ArticleDOI

Precise No-Reference Image Quality Evaluation Based on Distortion Identification

TL;DR: The difficulty of no-reference image quality assessment (NR IQA) often lies in the lack of knowledge about the distortion in the image, which makes quality assessment blind and thus inefficient.
Abstract: The difficulty of no-reference image quality assessment (NR IQA) often lies in the lack of knowledge about the distortion in the image, which makes quality assessment blind and thus inefficient. To...
Citations
More filters
Proceedings ArticleDOI
06 Apr 2022
TL;DR: This paper proposes a novel framework to explore the 3D Skinned Multi-Person Linear (SMPL) model of the human body for gait recognition, named SMPLGait, and provides 3D SMPL models recovered from video frames which can provide dense 3D information of body shape, viewpoint, and dynamics.
Abstract: Existing studies for gait recognition are dominated by 2D representations like the silhouette or skeleton of the human body in constrained scenes. However, humans live and walk in the unconstrained 3D space, so projecting the 3D human body onto the 2D plane will discard a lot of crucial information like the viewpoint, shape, and dynamics for gait recognition. Therefore, this paper aims to explore dense 3D representations for gait recognition in the wild, which is a practical yet neglected problem. In particular, we propose a novel framework to explore the 3D Skinned Multi-Person Linear (SMPL) model of the human body for gait recognition, named SMPLGait. Our framework has two elaborately-designed branches of which one extracts appearance features from silhouettes, the other learns knowledge of 3D viewpoints and shapes from the 3D SMPL model. In addition, due to the lack of suitable datasets, we build the first large-scale 3D representation-based gait recognition dataset, named Gait3D. It contains 4,000 subjects and over 25,000 sequences extracted from 39 cameras in an unconstrained indoor scene. More importantly, it provides 3D SMPL models recovered from video frames which can provide dense 3D information of body shape, viewpoint, and dynamics. Based on Gait3D, we comprehensively compare our method with existing gait recognition approaches, which reflects the superior performance of our framework and the potential of 3D representations for gait recognition in the wild. The code and dataset are available at: https://gait3d.github.io.

30 citations

Journal ArticleDOI
TL;DR: A systematic survey of the literature on the use of big data analytics and data mining methods in IoT to identify the lines of research that should receive more attention in future works and provides a summary of the methods used.

15 citations

Journal ArticleDOI
TL;DR: An efficient super-resolution model based on neural architecture search and attention mechanism that introduces the Bayesian algorithm for hyper-parameter tuning and improves the model’s performance based on the optimal sub-network searched out.
Abstract: Although the current super-resolution model based on deep learning has achieved excellent reconstruction results, the increasing depth of the model results in huge parameters, limiting the further application of the super-resolution deep model. To solve this problem, we propose an efficient super-resolution model based on neural architecture search and attention mechanism. First, we use global residual learning to limit the search to the non-linear mapping part of the network and add a down-sampling to this part to reduce the feature map’s size and computation. Second, we establish a lightweight search space and joint rewards for searching the optimal network structure. The model divides the search into macro search and micro search, which are used to search for the optimal down-sampling position and the optimal cell structure, respectively. In addition, we introduce the Bayesian algorithm for hyper-parameter tuning and further improve the model’s performance based on the optimal sub-network searched out. Detailed experiments show that our model achieves excellent super-resolution performance and high computational efficiency compared with some state-of-the-art models.

8 citations

Journal ArticleDOI
TL;DR: This work establishes a large-scale underwater image dataset, dubbed UID2021, for evaluating no-reference UIQA metrics, and enables ones to evaluate NR UIZA algorithms comprehensively and paves the way for further research onUIQA.
Abstract: Achieving subjective and objective quality assessment of underwater images is of high significance in underwater visual perception and image/video processing. However, the development of underwater image quality assessment (UIQA) is limited for the lack of publicly available underwater image datasets with human subjective scores and reliable objective UIQA metrics. To address this issue, we establish a large-scale underwater image dataset, dubbed UID2021, for evaluating no-reference (NR) UIQA metrics. The constructed dataset contains 60 multiply degraded underwater images collected from various sources, covering six common underwater scenes (i.e., bluish scene, blue-green scene, greenish scene, hazy scene, low-light scene, and turbid scene), and their corresponding 900 quality improved versions are generated by employing 15 state-of-the-art underwater image enhancement and restoration algorithms. Mean opinion scores with 52 observers for each image of UID2021 are also obtained by using the pairwise comparison sorting method. Both in-air and underwater-specific NR IQA algorithms are tested on our constructed dataset to fairly compare their performance and analyze their strengths and weaknesses. Our proposed UID2021 dataset enables ones to evaluate NR UIQA algorithms comprehensively and paves the way for further research on UIQA. The dataset is available at https://github.com/Hou-Guojia/UID2021.

8 citations

References
More filters
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions1, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

123,388 citations

Journal ArticleDOI
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, which can be applied to both subjective ratings and objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu//spl sim/lcv/ssim/.

40,609 citations

Journal ArticleDOI
01 Nov 1973
TL;DR: These results indicate that the easily computable textural features based on gray-tone spatial dependancies probably have a general applicability for a wide variety of image-classification applications.
Abstract: Texture is one of the important characteristics used in identifying objects or regions of interest in an image, whether the image be a photomicrograph, an aerial photograph, or a satellite image. This paper describes some easily computable textural features based on gray-tone spatial dependancies, and illustrates their application in category-identification tasks of three different kinds of image data: photomicrographs of five kinds of sandstones, 1:20 000 panchromatic aerial photographs of eight land-use categories, and Earth Resources Technology Satellite (ERTS) multispecial imagery containing seven land-use categories. We use two kinds of decision rules: one for which the decision regions are convex polyhedra (a piecewise linear decision rule), and one for which the decision regions are rectangular parallelpipeds (a min-max decision rule). In each experiment the data set was divided into two parts, a training set and a test set. Test set identification accuracy is 89 percent for the photomicrographs, 82 percent for the aerial photographic imagery, and 83 percent for the satellite imagery. These results indicate that the easily computable textural features probably have a general applicability for a wide variety of image-classification applications.

20,442 citations

Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors explore ways to scale up networks in ways that aim at utilizing the added computation as efficiently as possible by suitably factorized convolutions and aggressive regularization.
Abstract: Convolutional networks are at the core of most state of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks started to become mainstream, yielding substantial gains in various benchmarks. Although increased model size and computational cost tend to translate to immediate quality gains for most tasks (as long as enough labeled data is provided for training), computational efficiency and low parameter count are still enabling factors for various use cases such as mobile vision and big-data scenarios. Here we are exploring ways to scale up networks in ways that aim at utilizing the added computation as efficiently as possible by suitably factorized convolutions and aggressive regularization. We benchmark our methods on the ILSVRC 2012 classification challenge validation set demonstrate substantial gains over the state of the art: 21:2% top-1 and 5:6% top-5 error for single frame evaluation using a network with a computational cost of 5 billion multiply-adds per inference and with using less than 25 million parameters. With an ensemble of 4 models and multi-crop evaluation, we report 3:5% top-5 error and 17:3% top-1 error on the validation set and 3:6% top-5 error on the official test set.

16,962 citations

Proceedings ArticleDOI
21 Jun 1994
TL;DR: A feature selection criterion that is optimal by construction because it is based on how the tracker works, and a feature monitoring method that can detect occlusions, disocclusions, and features that do not correspond to points in the world are proposed.
Abstract: No feature-based vision system can work unless good features can be identified and tracked from frame to frame. Although tracking itself is by and large a solved problem, selecting features that can be tracked well and correspond to physical points in the world is still hard. We propose a feature selection criterion that is optimal by construction because it is based on how the tracker works, and a feature monitoring method that can detect occlusions, disocclusions, and features that do not correspond to points in the world. These methods are based on a new tracking algorithm that extends previous Newton-Raphson style search methods to work under affine image transformations. We test performance with several simulations and experiments. >

8,432 citations