Author

Suguo Zhu

Bio: Suguo Zhu is an academic researcher from Hangzhou Dianzi University. The author has contributed to research topics including deep learning and computer science. The author has an h-index of 5 and has co-authored 9 publications receiving 130 citations.

Papers
Journal Article
TL;DR: Thorough experiments on five standard databases show that adopting multi-level deep representations from a very deep DNN model yields a significantly more effective BIQA model; consequently, BLINDER considerably outperforms previous state-of-the-art BIQA methods on authentically distorted images.
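The multi-level idea summarized above lends itself to a simple pipeline: pool activations from several depths of a pre-trained network and fit one quality regressor per level. The sketch below is a minimal illustration of that idea, assuming a torchvision VGG16 backbone, global average pooling at the end of each conv block, and per-level SVRs whose predictions are averaged; it is not the exact BLINDER configuration.

```python
# Minimal sketch of multi-level deep representations for blind IQA.
# Assumptions: torchvision VGG16 as the deep model, GAP at the end of each
# conv block, one SVR per level, averaged predictions. Illustrative only,
# not the exact BLINDER pipeline.
import numpy as np
import torch
import torchvision.models as models
from sklearn.svm import SVR

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
TAPS = {4, 9, 16, 23, 30}  # indices of the max-pool layers closing each conv block

def multilevel_features(img):
    """img: [3, H, W] tensor -> one pooled feature vector per tapped layer."""
    feats, x = [], img.unsqueeze(0)
    with torch.no_grad():
        for i, layer in enumerate(vgg):
            x = layer(x)
            if i in TAPS:
                feats.append(x.mean(dim=(2, 3)).squeeze(0).numpy())  # GAP per level
    return feats

def train_level_regressors(images, mos):
    """Fit one SVR per level on (pooled features, MOS) pairs."""
    per_level = zip(*(multilevel_features(im) for im in images))
    return [SVR().fit(np.stack(f), mos) for f in per_level]

def predict_quality(regressors, img):
    """Average the per-level SVR predictions into a single quality score."""
    feats = multilevel_features(img)
    return float(np.mean([r.predict(f[None])[0] for r, f in zip(regressors, feats)]))
```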

107 citations

Journal Article
TL;DR: This work presents a novel pulmonary nodule classification framework based on attentive and ensemble 3D Dual Path Networks, which employs a contextual attention mechanism to model the contextual correlations among adjacent locations and thereby improves the representativeness of deep features.

41 citations

Journal Article
TL;DR: This paper considered five categories of common homogeneous distortions in video surveillance applications, i.e., low resolution, blurring, additive white Gaussian noise, salt-and-pepper noise, and Poisson noise, and proposed a novel biometric quality assessment (BQA) method for face images, exploring its applications in face recognition.
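For reference, the five distortion types listed above are straightforward to simulate; the sketch below does so with scikit-image, using illustrative parameter values (scale factor, blur sigma, noise amounts) that are not taken from the paper.

```python
# Sketch: simulating the five homogeneous distortions named above with
# scikit-image. Parameter values are illustrative, not the paper's settings.
from skimage import filters, transform, util

def distort(img, kind):
    """img: float image in [0, 1] with shape [H, W, 3]; kind: distortion name."""
    if kind == "low_resolution":      # downsample, then upsample back to size
        small = transform.rescale(img, 0.25, channel_axis=-1, anti_aliasing=True)
        return transform.resize(small, img.shape, anti_aliasing=True)
    if kind == "blur":                # Gaussian blurring
        return filters.gaussian(img, sigma=3, channel_axis=-1)
    if kind == "gaussian_noise":      # additive white Gaussian noise
        return util.random_noise(img, mode="gaussian", var=0.01)
    if kind == "salt_pepper":         # salt-and-pepper noise
        return util.random_noise(img, mode="s&p", amount=0.05)
    if kind == "poisson":             # Poisson (shot) noise
        return util.random_noise(img, mode="poisson")
    raise ValueError(f"unknown distortion: {kind}")
```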

31 citations

Proceedings Article
Xuantong Meng, Fei Gao, Shengjie Shi, Suguo Zhu, Jingjie Zhu
01 Nov 2018
TL;DR: This paper builds three multilayer aggregation networks (MLANs) on different baseline networks (MobileNet, VGG16, and Inception-v3) and shows significant superiority over previous state-of-the-art methods in the aesthetic score prediction task.
Abstract: Image aesthetic assessment aims at computationally evaluating the quality of images based on artistic perceptions. Although existing deep learning based approaches have obtained promising performance, they typically use the high-level features in convolutional neural networks (CNNs) for aesthetic prediction. However, low-level and intermediate-level features are also highly correlated with image aesthetics. In this paper, we propose to use multi-level features from a CNN for learning effective image aesthetic assessment models. Specifically, we extract features from multiple layers and then aggregate them for predicting an image aesthetic score. To evaluate its effectiveness, we build three multilayer aggregation networks (MLANs) based on different baseline networks, namely MobileNet, VGG16, and Inception-v3. Experimental results show that aggregating multilayer features consistently and considerably improves performance. Moreover, MLANs show significant superiority over previous state-of-the-art methods in the aesthetic score prediction task.
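As a rough illustration of the multilayer-aggregation idea described above, the sketch below pools intermediate features of a pre-trained MobileNetV2 (a stand-in for the paper's MobileNet baseline), concatenates them, and regresses an aesthetic score; the tapped blocks and head size are assumptions, not the published MLAN configuration.

```python
# Sketch of a multilayer aggregation network (MLAN-style): intermediate
# features of a pre-trained backbone are globally pooled, concatenated, and
# fed to a small regression head that predicts an aesthetic score.
# Tap indices and head width are illustrative, not the paper's exact setup.
import torch
import torch.nn as nn
import torchvision.models as models

class MultiLayerAggregationNet(nn.Module):
    def __init__(self, tap_indices=(3, 6, 13, 18)):
        super().__init__()
        weights = models.MobileNet_V2_Weights.IMAGENET1K_V1
        self.features = models.mobilenet_v2(weights=weights).features
        self.taps = set(tap_indices)
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Infer the concatenated feature width with a dummy forward pass.
        with torch.no_grad():
            dim = self._aggregate(torch.zeros(1, 3, 224, 224)).shape[1]
        self.head = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def _aggregate(self, x):
        pooled = []
        for i, block in enumerate(self.features):
            x = block(x)
            if i in self.taps:
                pooled.append(self.pool(x).flatten(1))   # global average pooling
        return torch.cat(pooled, dim=1)                  # multilayer aggregation

    def forward(self, x):
        return self.head(self._aggregate(x)).squeeze(1)  # predicted aesthetic score

# model = MultiLayerAggregationNet()
# scores = model(torch.randn(2, 3, 224, 224))   # -> tensor of shape [2]
```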

9 citations

Journal Article
TL;DR: The robust sparse representation reconstructs the features extracted by the network for a flexible solution, using parameters learned from the gallery, and handles images of arbitrary size in partial person re-identification.

6 citations


Cited by
Journal Article
TL;DR: This work presents a systematic and scalable approach to creating KonIQ-10k, the largest IQA dataset to date, consisting of 10,073 quality-scored images, and proposes a novel deep learning model (KonCept512) that shows excellent generalization beyond the test set.
Abstract: Deep learning methods for image quality assessment (IQA) are limited by the small size of existing datasets. Extensive datasets require substantial resources both for generating publishable content and for annotating it accurately. We present a systematic and scalable approach to creating KonIQ-10k, the largest IQA dataset to date, consisting of 10,073 quality-scored images. It is the first in-the-wild database aiming for ecological validity, concerning the authenticity of distortions, the diversity of content, and quality-related indicators. Through crowdsourcing, we obtained 1.2 million reliable quality ratings from 1,459 crowd workers, paving the way for more general IQA models. We propose a novel deep learning model (KonCept512) that shows excellent generalization beyond the test set (0.921 SROCC) to the current state-of-the-art database LIVE-in-the-Wild (0.825 SROCC). The model derives its core performance from the InceptionResNet architecture and from being trained at a higher resolution ($512\times 384$) than previous models. Correlation analysis shows that KonCept512 performs similarly to having 9 subjective scores for each test image.
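A minimal sketch of the core recipe described above (regressing a mean opinion score at the higher 512x384 input resolution) follows; it uses timm's inception_resnet_v2 as a stand-in backbone, and the loss, optimizer, and learning rate are assumptions rather than the authors' training setup.

```python
# Sketch: MOS regression at 512x384 input resolution, in the spirit of
# KonCept512. timm's inception_resnet_v2 is a stand-in backbone; the loss,
# optimizer, and learning rate are illustrative, not the published recipe.
import timm
import torch
import torch.nn as nn

model = timm.create_model("inception_resnet_v2", pretrained=True, num_classes=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()

def train_step(images, mos):
    """images: [B, 3, 384, 512] float tensor; mos: [B] quality scores."""
    optimizer.zero_grad()
    pred = model(images).squeeze(1)   # single regression output per image
    loss = criterion(pred, mos)
    loss.backward()
    optimizer.step()
    return loss.item()

# Example: one step on a random batch at the higher 512x384 resolution.
# loss = train_step(torch.randn(4, 3, 384, 512), torch.rand(4) * 4 + 1)
```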

299 citations

Journal Article
TL;DR: An end-to-end network is designed to conduct future frame prediction and reconstruction sequentially, which makes the reconstruction errors large enough to facilitate the identification of abnormal events, while reconstruction helps enhance the predicted future frames of normal events.

144 citations

Journal Article
TL;DR: A novel IQA-oriented CNN method that can efficiently represent quality degradation is developed for blind IQA (BIQA), and the cascaded CNN with HDC (named CaHDC) is introduced; experiments demonstrate the superiority of CaHDC compared with existing BIQA methods.
Abstract: The deep convolutional neural network (CNN) has achieved great success in image recognition. Many image quality assessment (IQA) methods directly use recognition-oriented CNNs for quality prediction. However, the properties of the IQA task differ from those of image recognition: image recognition should be sensitive to visual content and robust to distortion, while IQA should be sensitive to both distortion and visual content. In this paper, an IQA-oriented CNN method is developed for blind IQA (BIQA), which can efficiently represent quality degradation. CNNs are driven by large amounts of data, while the sizes of existing IQA databases are too small for CNN optimization. Thus, a large IQA dataset is first established, which includes more than one million distorted images (each image is assigned a quality score as a substitute for the Mean Opinion Score (MOS), abbreviated as pseudo-MOS). Next, inspired by the hierarchical perception mechanism (from local structure to global semantics) in the human visual system, a novel IQA-oriented CNN method is designed in which the hierarchical degradation is considered. Finally, by jointly optimizing the multilevel feature extraction, hierarchical degradation concatenation (HDC), and quality prediction in an end-to-end framework, the cascaded CNN with HDC (named CaHDC) is introduced. Experiments on the benchmark IQA databases demonstrate the superiority of CaHDC compared with existing BIQA methods. Meanwhile, CaHDC (with about 0.73M parameters) is lightweight compared to other CNN-based BIQA models and can be easily realized in microprocessing systems. The dataset and source code of the proposed method are available at https://web.xidian.edu.cn/wjj/paper.html .
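The hierarchical degradation concatenation (HDC) idea can be sketched roughly as follows: pooled features from every stage of a small cascaded CNN are concatenated and mapped to a single quality score, trained end-to-end. The layer widths and depths below are assumptions; this is not the published ~0.73M-parameter CaHDC architecture.

```python
# Sketch of the hierarchical-degradation-concatenation idea: pool features
# from each stage of a small cascaded CNN, concatenate them, and regress a
# quality score end-to-end. Widths/depths are illustrative, not CaHDC's.
import torch
import torch.nn as nn

def stage(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(),
                         nn.MaxPool2d(2))

class CascadedHDC(nn.Module):
    def __init__(self, widths=(16, 32, 64, 128)):
        super().__init__()
        chans = (3,) + widths
        self.stages = nn.ModuleList(stage(chans[i], chans[i + 1])
                                    for i in range(len(widths)))
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(sum(widths), 1)            # quality prediction

    def forward(self, x):
        pooled = []
        for s in self.stages:
            x = s(x)
            pooled.append(self.pool(x).flatten(1))       # local-to-global features
        hdc = torch.cat(pooled, dim=1)                   # hierarchical concatenation
        return self.head(hdc).squeeze(1)

# q = CascadedHDC()(torch.randn(2, 3, 224, 224))   # -> predicted quality, shape [2]
```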

113 citations

Proceedings Article
02 Apr 2019
TL;DR: This work proposes the first method that efficiently supports full-resolution images as input and can be trained on variable input sizes, significantly improving upon the state of the art on ground-truth mean opinion scores.
Abstract: We propose an effective deep learning approach to aesthetics quality assessment that relies on a new type of pre-trained features, and apply it to the AVA data set, currently the largest aesthetics database. While previous approaches miss some of the information in the original images by taking small crops, down-scaling, or warping the originals during training, we propose the first method that efficiently supports full-resolution images as input and can be trained on variable input sizes. This allows us to significantly improve upon the state of the art, increasing the Spearman rank-order correlation coefficient (SRCC) of ground-truth mean opinion scores (MOS) from the best previously reported 0.612 to 0.756. To achieve this performance, we extract multi-level spatially pooled (MLSP) features from all convolutional blocks of a pre-trained InceptionResNet-v2 network, and train a custom shallow Convolutional Neural Network (CNN) architecture on these new features.
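To make the MLSP idea concrete, the sketch below pools the feature maps from every stage of a frozen, pre-trained backbone and trains a small head on the concatenated vector; a timm ResNet-50 stands in for the InceptionResNet-v2 used in the paper, and the head layout (an MLP rather than the authors' shallow CNN) is an assumption.

```python
# Sketch of MLSP-style features: spatially pool the feature maps from every
# stage of a frozen, pre-trained backbone and train a small head on top.
# A timm ResNet-50 stands in for InceptionResNet-v2; head layout is illustrative.
import timm
import torch
import torch.nn as nn

backbone = timm.create_model("resnet50", pretrained=True, features_only=True)
backbone.eval()                                   # frozen feature extractor
pool = nn.AdaptiveAvgPool2d(1)

def mlsp_features(images):
    """Concatenate globally pooled activations from every backbone stage."""
    with torch.no_grad():
        maps = backbone(images)                   # list of per-stage feature maps
    return torch.cat([pool(m).flatten(1) for m in maps], dim=1)

feat_dim = sum(backbone.feature_info.channels())  # total pooled-feature width
head = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU(),
                     nn.Dropout(0.25), nn.Linear(512, 1))

# mos_pred = head(mlsp_features(torch.randn(2, 3, 384, 512)))  # -> [2, 1] scores
```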

104 citations

Journal Article
TL;DR: A comprehensive survey of the latest advances in deep learning-based visual object detection is provided, with a rigorous overview of backbone architectures for object detection followed by systematic coverage of current learning strategies.

88 citations