
Showing papers on "Standard test image" published in 2017


Proceedings ArticleDOI
21 Jul 2017
TL;DR: This paper reviews the first challenge on single image super-resolution (restoration of rich details in a low resolution image) with a focus on the proposed solutions and results, and gauges the state-of-the-art in single image super-resolution.
Abstract: This paper reviews the first challenge on single image super-resolution (restoration of rich details in a low resolution image) with focus on proposed solutions and results. A new DIVerse 2K resolution image dataset (DIV2K) was employed. The challenge had 6 competitions divided into 2 tracks with 3 magnification factors each. Track 1 employed the standard bicubic downscaling setup, while Track 2 had unknown downscaling operators (blur kernel and decimation) but learnable through low- and high-resolution training images. Each competition had ∼100 registered participants and 20 teams competed in the final testing phase. The challenge results gauge the state-of-the-art in single image super-resolution.

1,243 citations
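
Track 1's bicubic setup is simple to reproduce for experimentation; below is a minimal sketch, assuming OpenCV is available and "hr.png" stands in for a DIV2K high-resolution image:

    # Sketch: generate the low-resolution input via standard bicubic downscaling,
    # as in Track 1 of the challenge. "hr.png" is a placeholder path.
    import cv2

    scale = 4                                       # magnification factor (2, 3, or 4)
    hr = cv2.imread("hr.png")                       # high-resolution ground truth
    h, w = hr.shape[:2]
    lr = cv2.resize(hr, (w // scale, h // scale),
                    interpolation=cv2.INTER_CUBIC)  # bicubic downscaling
    cv2.imwrite("lr_x4.png", lr)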


Proceedings ArticleDOI
11 Sep 2017
TL;DR: In this paper, a network is trained that, given a small set of annotated images, produces parameters for a Fully Convolutional Network (FCN), which then performs dense pixel-level prediction on a test image for the new semantic class.
Abstract: Low-shot learning methods for image classification support learning from sparse data. We extend these techniques to support dense semantic image segmentation. Specifically, we train a network that, given a small set of annotated images, produces parameters for a Fully Convolutional Network (FCN). We use this FCN to perform dense pixel-level prediction on a test image for the new semantic class. Our architecture shows a 25% relative meanIoU improvement compared to the best baseline methods for one-shot segmentation on unseen classes in the PASCAL VOC 2012 dataset and is at least 3 times faster.

413 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: It is demonstrated that the approach allows the recovery of plausible illumination conditions and enables photorealistic virtual object insertion from a single image and significantly outperforms previous solutions to this problem.
Abstract: We present a CNN-based technique to estimate high-dynamic range outdoor illumination from a single low dynamic range image. To train the CNN, we leverage a large dataset of outdoor panoramas. We fit a low-dimensional physically-based outdoor illumination model to the skies in these panoramas giving us a compact set of parameters (including sun position, atmospheric conditions, and camera parameters). We extract limited field-of-view images from the panoramas, and train a CNN with this large set of input image–output lighting parameter pairs. Given a test image, this network can be used to infer illumination parameters that can, in turn, be used to reconstruct an outdoor illumination environment map. We demonstrate that our approach allows the recovery of plausible illumination conditions and enables photorealistic virtual object insertion from a single image. An extensive evaluation on both the panorama dataset and captured HDR environment maps shows that our technique significantly outperforms previous solutions to this problem.

238 citations


Journal ArticleDOI
TL;DR: It is shown how sparse autoencoders can be leveraged to partition images into tissue sub-types, so that color standardization for each can be performed independently.

181 citations


Journal ArticleDOI
TL;DR: A deep learning model is proposed that extracts vein features using limited a priori knowledge and recovers missing finger-vein patterns in the segmented image.
Abstract: Finger-vein biometrics has been extensively investigated for personal verification. Despite recent advances in finger-vein verification, current solutions completely depend on domain knowledge and still lack the robustness to extract finger-vein features from raw images. This paper proposes a deep learning model to extract and recover vein features using limited a priori knowledge. First, based on a combination of the known state-of-the-art handcrafted finger-vein image segmentation techniques, we automatically identify two regions: a clear region with high separability between finger-vein patterns and background, and an ambiguous region with low separability between them. The first is associated with pixels on which all the above-mentioned segmentation techniques assign the same segmentation label (either foreground or background), while the second corresponds to all the remaining pixels. This scheme is used to automatically discard the ambiguous region and to label the pixels of the clear region as foreground or background. A training data set is constructed based on the patches centered on the labeled pixels. Second, a convolutional neural network (CNN) is trained on the resulting data set to predict the probability of each pixel of being foreground (i.e., vein pixel), given a patch centered on it. The CNN learns what a finger-vein pattern is by learning the difference between vein patterns and background ones. The pixels in any region of a test image can then be classified effectively. Third, we propose another new and original contribution by developing and investigating a fully convolutional network to recover missing finger-vein patterns in the segmented image. The experimental results on two public finger-vein databases show a significant improvement in terms of finger-vein verification accuracy.

170 citations
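
The automatic labeling step in this paper reduces to a consensus over several binary segmentation maps; the numpy sketch below assumes the handcrafted segmenters have already produced 0/1 masks (the mask inputs are illustrative):

    # Sketch: keep pixels where all handcrafted segmenters agree (the "clear"
    # region); everything else is the discarded "ambiguous" region. The input
    # masks are illustrative 0/1 arrays, one per segmentation technique.
    import numpy as np

    def consensus_labels(masks):
        stack = np.stack([m.astype(bool) for m in masks])     # (n_methods, H, W)
        all_fg = stack.all(axis=0)                            # unanimous foreground
        all_bg = (~stack).all(axis=0)                         # unanimous background
        labels = np.full(stack.shape[1:], -1, dtype=np.int8)  # -1 = ambiguous
        labels[all_fg] = 1                                    # clear vein pixels
        labels[all_bg] = 0                                    # clear background
        return labels
    # Training patches for the CNN are then centered on pixels labeled 0 or 1.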


Journal ArticleDOI
TL;DR: Thorough experiments conducted on standard databases show that the proposed novel full-reference IQA framework, codenamed DeepSim, can accurately predict human-perceived image quality and surpasses the previous state of the art.

135 citations


Posted Content
TL;DR: In this paper, a network is trained that, given a small set of annotated images, produces parameters for a Fully Convolutional Network (FCN), which then performs dense pixel-level prediction on a test image for the new semantic class.
Abstract: Low-shot learning methods for image classification support learning from sparse data. We extend these techniques to support dense semantic image segmentation. Specifically, we train a network that, given a small set of annotated images, produces parameters for a Fully Convolutional Network (FCN). We use this FCN to perform dense pixel-level prediction on a test image for the new semantic class. Our architecture shows a 25% relative meanIoU improvement compared to the best baseline methods for one-shot segmentation on unseen classes in the PASCAL VOC 2012 dataset and is at least 3 times faster.

124 citations


Proceedings ArticleDOI
03 Nov 2017
TL;DR: This paper presents an approach for group-level emotion recognition in the Emotion Recognition in the Wild Challenge 2017, based on two types of Convolutional Neural Networks, namely individual facial emotion CNNs and global image based CNNs.
Abstract: This paper presents our approach for group-level emotion recognition in the Emotion Recognition in the Wild Challenge 2017. The task is to classify an image into one of the group emotion categories, such as positive, neutral or negative. Our approach is based on two types of Convolutional Neural Networks (CNNs), namely individual facial emotion CNNs and global image based CNNs. For the individual facial emotion CNNs, we first extract all the faces in an image, and assign the image label to all faces for training. In particular, we utilize a large-margin softmax loss for discriminative learning and we train two CNNs on both aligned and non-aligned faces. For the global image based CNNs, we compare several recent state-of-the-art network structures and data augmentation strategies to boost performance. For a test image, we average the scores from all faces and the image to predict the final group emotion category. We win the challenge with accuracies of 83.9% and 80.9% on the validation set and testing set respectively, which improve the baseline results by about 30%.

72 citations


Proceedings ArticleDOI
01 Aug 2017
TL;DR: A modified LBPH algorithm based on the pixel neighborhood gray median (MLBPH) is proposed, and the results show that the MLBPH algorithm is superior to the LBPH algorithm in recognition rate.
Abstract: The Local Binary Pattern Histogram (LBPH) algorithm is a simple solution to the face recognition problem, which can recognize both frontal and side faces. However, the recognition rate of the LBPH algorithm decreases under illumination diversification, expression variation and attitude deflection. To solve this problem, a modified LBPH algorithm based on the pixel neighborhood gray median (MLBPH) is proposed. The gray value of each pixel is replaced by the median of its neighborhood sampling values, the feature values are then extracted over sub-blocks, and a statistical histogram is established to form the MLBPH feature dictionary, against which the test image is compared to recognize identity. Experiments carried out on the standard FERET face database and a newly created face database show that the MLBPH algorithm is superior to the LBPH algorithm in recognition rate.

62 citations
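
The core MLBPH modification, replacing each pixel's gray value with its neighborhood median before computing local binary patterns, can be sketched with OpenCV and scikit-image (file path and parameters are illustrative):

    # Sketch of the MLBPH idea: median-filter first, then build an LBP histogram.
    import cv2
    import numpy as np
    from skimage.feature import local_binary_pattern

    img = cv2.imread("face.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
    med = cv2.medianBlur(img, 3)                # pixel value -> neighborhood median
    lbp = local_binary_pattern(med, P=8, R=1, method="uniform")
    hist, _ = np.histogram(lbp, bins=int(lbp.max()) + 1, density=True)
    # The full method splits the image into sub-blocks and concatenates per-block
    # histograms into the MLBPH feature dictionary used for matching.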


Journal ArticleDOI
TL;DR: The experimental results demonstrate the superiority of the proposed method over several state-of-the-art methods, especially for practical image splicing, where the noise difference between the original and spliced regions is typically small.
Abstract: Image splicing is one of the most common image tampering operations, where the content of the tampered image usually significantly differs from that of the original one. As a consequence, forensic methods aiming to locate the spliced areas are of great realistic significance. Among these methods, the noise based ones, which utilize the fact that images from different sources tend to have various noise levels, have drawn much attention due to their convenience to implement and the relaxation of some operation specific assumptions. However, the performances of the existing noise based image splicing localization methods are unsatisfactory when the noise difference between the original and spliced regions is relatively small. In this paper, through incorporation of a recently developed noise level estimation algorithm, we propose an effective image splicing localization method. The proposed method performs blockwise noise level estimation of a test image with a principal component analysis (PCA)-based algorithm, and segments the tampered region from the original region by k-means clustering. The experimental results demonstrate the superiority of the proposed method over several state-of-the-art methods, especially for practical image splicing, where the noise difference between the original and spliced regions is typically small.

56 citations
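
The localization pipeline is easy to prototype; the sketch below substitutes a simple Laplacian-residual noise proxy for the paper's PCA-based estimator and keeps the blockwise k-means segmentation (block size and path are illustrative):

    # Sketch: blockwise noise estimate + 2-cluster k-means. A Laplacian-residual
    # statistic stands in for the paper's PCA-based noise level estimator.
    import cv2
    import numpy as np
    from sklearn.cluster import KMeans

    img = cv2.imread("test.png", cv2.IMREAD_GRAYSCALE).astype(np.float64)
    B = 64                                          # block size (illustrative)
    H, W = img.shape
    feats, coords = [], []
    for y in range(0, H - B + 1, B):
        for x in range(0, W - B + 1, B):
            resid = cv2.Laplacian(img[y:y + B, x:x + B], cv2.CV_64F)
            feats.append([resid.std()])             # crude per-block noise level
            coords.append((y, x))
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(np.array(feats))
    mask = np.zeros((H, W), np.uint8)
    for (y, x), lab in zip(coords, labels):
        mask[y:y + B, x:x + B] = 255 * lab          # one cluster = suspected splice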


Proceedings ArticleDOI
Li Linshen, Lin Zhang, Xiyuan Li, Xiao Liu, Ying Shen, Lu Xiong
01 Jul 2017
TL;DR: A learning based parking-slot detection approach is proposed: given a test image, the marking-points are detected first and the valid parking-slots are then inferred; its efficacy and efficiency have been corroborated on the database.
Abstract: Recent years have witnessed a growing interest in developing automatic parking systems in the field of intelligent vehicles. However, how to effectively and efficiently locate parking-slots using a vision-based system is still an unresolved issue. In this paper, we attempt to fill this research gap to some extent and our contributions are twofold. Firstly, to facilitate the study of vision-based parking-slot detection, a large-scale parking-slot image database is established. For each image in this database, the marking-points and parking-slots are carefully labelled. Such a database can serve as a benchmark to design and validate parking-slot detection algorithms. Secondly, a learning based parking-slot detection approach is proposed. With this approach, given a test image, the marking-points are detected first and the valid parking-slots are then inferred. Its efficacy and efficiency have been corroborated on our database. The labeled database and the source codes are publicly available at http://sse.tongji.edu.cn/linzhang/ps/index.htm.

Journal ArticleDOI
TL;DR: The proposed approach is based on a novel Class Adapting Principal Directions (CAPD) concept that allows multiple embeddings of image features into a semantic space, and can generalize the seen CAPDs by estimating seen-unseen diversity, which significantly improves the performance of generalized zero-shot learning.
Abstract: Prevalent techniques in zero-shot learning do not generalize well to other related problem scenarios. Here, we present a unified approach for conventional zero-shot, generalized zero-shot and few-shot learning problems. Our approach is based on a novel Class Adapting Principal Directions (CAPD) concept that allows multiple embeddings of image features into a semantic space. Given an image, our method produces one principal direction for each seen class. Then, it learns how to combine these directions to obtain the principal direction for each unseen class such that the CAPD of the test image is aligned with the semantic embedding of the true class, and opposite to the other classes. This allows efficient and class-adaptive information transfer from seen to unseen classes. In addition, we propose an automatic process for selection of the most useful seen classes for each unseen class to achieve robustness in zero-shot learning. Our method can update the unseen CAPD, taking advantage of a few unseen images, to work in a few-shot learning scenario. Furthermore, our method can generalize the seen CAPDs by estimating seen-unseen diversity, which significantly improves the performance of generalized zero-shot learning. Our extensive evaluations demonstrate that the proposed approach consistently achieves superior performance in zero-shot, generalized zero-shot and few/one-shot learning problems.

Proceedings ArticleDOI
01 Jul 2017
TL;DR: In this paper, the authors propose a novel image set classification technique using linear regression models, where the gallery image sets are interpreted as subspaces of a high dimensional space to avoid the computationally expensive training step.
Abstract: We propose a novel image set classification technique using linear regression models. Downsampled gallery image sets are interpreted as subspaces of a high dimensional space to avoid the computationally expensive training step. We estimate regression models for each test image using the class specific gallery subspaces. Images of the test set are then reconstructed using the regression models. Based on the minimum reconstruction error between the reconstructed and the original images, a weighted voting strategy is used to classify the test set. We performed extensive evaluation on the benchmark UCSD/Honda, CMU Mobo and YouTube Celebrity datasets for face classification, and ETH-80 dataset for object classification. The results demonstrate that by using only a small amount of training data, our technique achieved competitive classification accuracy and superior computational speed compared with the state-of-the-art methods.
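
The classification rule, reconstructing each test image from each class's gallery subspace and voting by minimum residual, reduces to per-class least squares; a numpy sketch under those assumptions (a plain majority vote stands in for the paper's weighted voting):

    # Sketch: class-specific linear regression; classify the test set by minimum
    # reconstruction error. Galleries are matrices of vectorized, downsampled
    # images (one column per gallery image); a plain majority vote replaces the
    # paper's weighted voting.
    import numpy as np

    def classify_set(test_vecs, galleries):
        """test_vecs: (d, m) test images; galleries: dict class -> (d, n_c)."""
        classes, residuals = [], []
        for c, G in galleries.items():
            A, *_ = np.linalg.lstsq(G, test_vecs, rcond=None)  # G @ A ~= test_vecs
            recon = G @ A
            classes.append(c)
            residuals.append(np.linalg.norm(test_vecs - recon, axis=0))
        resid = np.stack(residuals)                 # (n_classes, m)
        winners = resid.argmin(axis=0)              # each test image votes
        return classes[np.bincount(winners, minlength=len(classes)).argmax()]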

Journal ArticleDOI
TL;DR: This study incorporated SSF with a Lab color transformation to reduce over-detection problems associated with the original luminance image and ported four of the most time-consuming processes to the graphics processing unit (GPU) to improve computational efficiency.
Abstract: The use of unmanned aerial vehicles (UAVs) can allow individual tree detection for forest inventories in a cost-effective way. The scale-space filtering (SSF) algorithm is commonly used and has the capability of detecting trees of different crown sizes. In this study, we made two improvements with regard to the existing method and implementations. First, we incorporated SSF with a Lab color transformation to reduce over-detection problems associated with the original luminance image. Second, we ported four of the most time-consuming processes to the graphics processing unit (GPU) to improve computational efficiency. The proposed method was implemented using PyCUDA, which enabled access to NVIDIA's compute unified device architecture (CUDA) through high-level scripting of the Python language. Our experiments were conducted using two images captured by the DJI Phantom 3 Professional and a recent NVIDIA GTX 1080 GPU. The resulting accuracy was high, with an F-measure larger than 0.94. The speedups achieved by our parallel implementation were 44.77 and 28.54 for the first and second test image, respectively. For each 4000 × 3000 image, the total runtime was less than 1 s, which was sufficient for real-time performance and interactive application.
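
The color-space and detection steps can be prototyped on the CPU with scikit-image, where blob_log, a Laplacian-of-Gaussian scale-space detector, stands in for the paper's GPU SSF implementation (path and sigma bounds are illustrative):

    # Sketch: Lab lightness + multi-scale Laplacian-of-Gaussian blob detection as
    # a CPU stand-in for the paper's GPU scale-space filtering (SSF) pipeline.
    from skimage import io, color
    from skimage.feature import blob_log

    rgb = io.imread("uav_scene.png")                # placeholder path
    lightness = color.rgb2lab(rgb)[..., 0] / 100.0  # L channel scaled to [0, 1]
    # Each detected blob approximates one tree crown; the sigma range bounds
    # the expected crown sizes.
    blobs = blob_log(lightness, min_sigma=5, max_sigma=30, num_sigma=10,
                     threshold=0.1)                 # rows: (y, x, sigma)
    print(len(blobs), "candidate crowns")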

Journal ArticleDOI
TL;DR: The objective of this paper is to develop a photo forensics algorithm which can detect any photo manipulation; results showed that the proposed algorithm could successfully identify the modified image as well as the exact location of the modifications.
Abstract: Nowadays, image manipulation is common due to the availability of image processing software, such as Adobe Photoshop or GIMP. The original image captured by a digital camera or smartphone is normally saved in the JPEG format due to its popularity. The JPEG algorithm works on independently compressed image grids of 8x8 pixels. For an unmodified image, all 8x8 grids should have a similar error level. For a resaving operation, each block should degrade at approximately the same rate due to the introduction of a similar amount of error across the entire image. For a modified image, the altered blocks should have a higher error potential compared to the remaining part of the image. The objective of this paper is to develop a photo forensics algorithm which can detect any photo manipulation. The error level analysis (ELA) was further enhanced using vertical and horizontal histograms of the ELA image to pinpoint the exact location of modification. Results showed that our proposed algorithm could successfully identify the modified image as well as show the exact location of modifications.
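
Basic error level analysis is only a resave-and-difference; a minimal Pillow sketch follows (quality level and paths are illustrative), to which the paper adds vertical and horizontal histograms of the ELA image to pinpoint the edit:

    # Sketch: basic ELA -- resave as JPEG at a known quality, amplify the diff.
    from PIL import Image, ImageChops

    orig = Image.open("photo.jpg").convert("RGB")   # placeholder path
    orig.save("resaved.jpg", quality=90)            # controlled resave
    ela = ImageChops.difference(orig, Image.open("resaved.jpg"))
    peak = max(mx for _, mx in ela.getextrema())    # brightest difference
    ela = ela.point(lambda px: px * (255.0 / max(1, peak)))  # stretch for display
    ela.save("ela.png")
    # Column and row sums of this image give the vertical/horizontal histograms
    # the paper uses to pinpoint the modified region.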

Proceedings ArticleDOI
22 Mar 2017
TL;DR: All 14 types of defects are detected and classified into all possible classes using a referential inspection approach, and the results show that the proposed algorithm is suitable for automatic visual inspection of PCBs.
Abstract: Inspection of printed circuit boards (PCBs) has been a crucial process in the electronic manufacturing industry to guarantee product quality and reliability, cut manufacturing cost and increase production. PCB inspection involves detection of defects in the PCB and classification of those defects in order to identify the roots of defects. In this paper, all 14 types of defects are detected and classified into all possible classes using a referential inspection approach. The proposed algorithm is mainly divided into five stages: image registration, pre-processing, image segmentation, defect detection and defect classification. The algorithm is able to perform inspection even when the captured test image is rotated, scaled and translated with respect to the template image, which makes the algorithm rotation, scale and translation invariant. The novelty of the algorithm lies in its robustness to analyze a defect in its different possible appearances and severities. In addition to this, the algorithm takes only 2.528 s to inspect a PCB image. The efficacy of the proposed algorithm is verified by conducting experiments on different PCB images, and the results show that the proposed algorithm is suitable for automatic visual inspection of PCBs.
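
The referential core, aligning the test image to the template before differencing, can be sketched with ORB feature registration (a common choice here, not necessarily the paper's exact registration method; paths and thresholds are illustrative):

    # Sketch: register the test PCB image to the template (rotation/scale/
    # translation invariance via a homography), then difference to expose defects.
    import cv2
    import numpy as np

    tmpl = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)  # placeholder paths
    test = cv2.imread("test.png", cv2.IMREAD_GRAYSCALE)

    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(tmpl, None)
    k2, d2 = orb.detectAndCompute(test, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    src = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    aligned = cv2.warpPerspective(test, H, tmpl.shape[::-1])
    _, diff = cv2.threshold(cv2.absdiff(tmpl, aligned), 50, 255, cv2.THRESH_BINARY)
    # Connected components of `diff` are defect candidates to be classified.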

Journal ArticleDOI
TL;DR: The research properly identified small areas in eggs and compared preprocessing steps, methods, and image-processing results, using the centroid and bounding box to determine the object and the small area of chicken eggs.
Abstract: The research used watermarking techniques to establish image originality. The aims of the research were to properly identify small areas in eggs and to compare preprocessing steps, methods, and image-processing results. The study improves on previous papers by combining all of the methods into a single analysis. Centroid and bounding-box computations were used to determine the object and the small area of chicken eggs, and a segmentation method was used to compare the original image with the watermarked image. Image processing was performed on watermarked image data so as to maintain the authenticity of the images used in the study. Several methods were applied to the watermarked images for the identification of chicken eggs, and segmentation was also deployed to process the images and count the objects. The results showed that the original image and the watermarked image had the same values and that the eggs were recognized; identification achieved 100% for all samples.

Journal ArticleDOI
TL;DR: In this work, a fuzzy pre-classifier is used to complement a set of support vector machines (SVM) to manage the large wood database and classify the wood species efficiently.
Abstract: An automated wood texture recognition system of 48 tropical wood species is presented. For each wood species, 100 macroscopic texture images are captured from different timber logs where 70 images are used for training while 30 images are used for testing. In this work, a fuzzy pre-classifier is used to complement a set of support vector machines (SVM) to manage the large wood database and classify the wood species efficiently. Given a test image, a set of texture pore features is extracted from the image and used as inputs to a fuzzy pre-classifier which assigns it to one of the four broad categories. Then, another set of texture features is extracted from the image and used with the SVM dedicated to the selected category to further classify the test image to a particular wood species. The advantage of dividing the database into four smaller databases is that when a new wood species is added into the system, only the SVM classifier of one of the four databases needs to be retrained instead of those of the entire database. This shortens the training time and emulates the experts’ reasoning when expanding the wood database. The results show that the proposed model is more robust as the size of wood database is increased.

Proceedings ArticleDOI
01 Jul 2017
TL;DR: A multi-class face segmentation algorithm is implemented and a model for each considered pose is trained, achieving competitive results when compared to most recent methods, according to mean absolute error and accuracy metrics.
Abstract: The aim of this work is to explore the usefulness of face semantic segmentation for head pose estimation. We implement a multi-class face segmentation algorithm and we train a model for each considered pose. Given a new test image, the probabilities associated with face parts by the different models are used as the only information for estimating the head orientation. A simple algorithm is proposed to exploit such probabilities in order to predict the pose. The proposed scheme achieves competitive results when compared to most recent methods, according to mean absolute error and accuracy metrics. Moreover, we release and make publicly available a face segmentation dataset consisting of 294 images belonging to 13 different poses, manually labeled into six semantic regions, which we used to train the segmentation models.

Journal ArticleDOI
TL;DR: An AIA system using a Non-negative Matrix Factorization (NMF) framework is presented, which discovers a latent space by factorizing data into a set of non-negative bases and coefficients, and is competitive with current state-of-the-art methods.

Journal ArticleDOI
TL;DR: A novel NR-IQA measure is introduced in which quality-aware statistics are used as perceptual features for the quality prediction, which demonstrates that the proposed technique outperforms the state-of-the-art NR measures.
Abstract: The aim of no-reference image quality assessment (NR-IQA) techniques is to measure the perceptual quality of an image without access to the reference image. In this letter, a novel NR-IQA measure is introduced in which quality-aware statistics are used as perceptual features for the quality prediction. In the method, the distorted image is converted to grayscale and filtered using gradient operators. Then, the speeded-up robust feature (SURF) technique is employed to detect and describe keypoints in obtained images. The SURF interest point detection method is affected by distortions in the filtered image. Therefore, it can be used to reflect the decreased attention of the human visual system caused by image distortions. In the method, statistics are calculated for processed images and their SURF descriptors. Finally, they are mapped into subjective opinion scores using a support vector regression technique. The experimental evaluation conducted on four demanding large benchmark datasets, which contain images corrupted by single and multiple distortions, demonstrates that the proposed technique outperforms the state-of-the-art NR measures.
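
The feature pipeline can be prototyped directly; since SURF is patented and shipped only with opencv-contrib, the sketch below swaps in ORB as the keypoint detector, and the statistics and SVR mapping are illustrative rather than the paper's exact feature set:

    # Sketch of the pipeline: grayscale -> gradient image -> keypoint statistics
    # -> SVR. ORB stands in for SURF (SURF requires opencv-contrib); the feature
    # statistics here are illustrative, not the paper's exact set.
    import cv2
    import numpy as np
    from sklearn.svm import SVR

    def quality_features(path):
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
        gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
        grad = np.uint8(np.clip(cv2.magnitude(gx, gy), 0, 255))
        kps = cv2.ORB_create().detect(grad, None)   # distortions suppress keypoints
        sizes = [k.size for k in kps] or [0.0]
        return [len(kps), np.mean(sizes), np.std(sizes), grad.mean(), grad.std()]

    # X = [quality_features(p) for p in train_paths]; y = subjective scores
    # model = SVR(kernel="rbf").fit(X, y); model.predict([quality_features(test)])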

Proceedings ArticleDOI
01 Nov 2017
TL;DR: An approach for automatic detection of helmeted and non-helmeted motorcyclists using convolutional neural networks (CNNs), which detects the person class instead of the motorcycle in order to increase the accuracy of helmet detection in the input image.
Abstract: Detection of helmeted and non-helmeted motorcyclists is mandatory nowadays in order to ensure the safety of riders on the road. However, due to many constraints such as poor video quality, occlusion, illumination, and other varying factors it becomes very difficult to detect them accurately. In this paper, we introduce an approach for automatic detection of helmeted and non-helmeted motorcyclists using a convolutional neural network (CNN). During the past several years, advancements in deep learning models have drastically improved the performance of object detection. One such model is YOLOv2 [1], which combines both classification and object detection in a single architecture. Here, we use YOLOv2 at two different stages, one after another, in order to improve the helmet detection accuracy. At the first stage, the YOLOv2 model is used to detect different objects in the test image. Since this model is trained on the COCO dataset, it can detect all classes of the COCO dataset. In the proposed approach, we use detection of the person class instead of the motorcycle in order to increase the accuracy of helmet detection in the input image. The cropped images of detected persons are used as input to the second YOLOv2 stage, which was trained on our dataset of helmeted images. The non-helmeted images are processed further to extract the license plate by using OpenALPR. In the proposed approach, we use two different datasets, i.e., the COCO and helmet datasets. We tested the potential of our approach on different helmeted and non-helmeted images. Experimental results show that the proposed method performs better when compared to other existing approaches, with 94.70% helmet detection accuracy.
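
The two-stage logic is independent of the particular detector; the sketch below uses a placeholder detect() stub, a hypothetical stand-in for YOLOv2 inference rather than a real API, to show the person-crop cascade:

    # Sketch of the two-stage cascade. detect() is a HYPOTHETICAL stub standing in
    # for YOLOv2 inference; it is expected to return (label, confidence, box) tuples.
    def detect(model, image):
        raise NotImplementedError("plug a real detector in here")

    def find_non_helmeted(coco_model, helmet_model, image):
        offenders = []
        # Stage 1: detect persons (rather than motorcycles) with the COCO model.
        for label, conf, (x, y, w, h) in detect(coco_model, image):
            if label != "person":
                continue
            crop = image[y:y + h, x:x + w]          # person crop for stage 2
            # Stage 2: run the helmet-trained model on the crop.
            labels = [l for l, _, _ in detect(helmet_model, crop)]
            if "helmet" not in labels:
                offenders.append((x, y, w, h))      # pass on for plate extraction
        return offenders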

Proceedings ArticleDOI
01 Jan 2017
TL;DR: The main focus of this paper is to recognize whether a given face input corresponds to a registered person in the database, using the Histogram of Oriented Gradients technique on the AT&T database.
Abstract: Face recognition is widely used in computer vision and in many other biometric applications where security is a major concern. The most common problems in recognizing a face arise due to pose variations, different illumination conditions and so on. The main focus of this paper is to recognize whether a given face input corresponds to a registered person in the database. Face recognition is done using the Histogram of Oriented Gradients (HOG) technique on the AT&T database, with the inclusion of a real-time subject to evaluate the performance of the algorithm. The feature vectors generated by the HOG descriptor are used to train Support Vector Machines (SVM), and results are verified against a given test input. The proposed method checks whether a test image in different pose and lighting conditions is matched correctly with the trained images of the facial database. The results of the proposed approach show minimal false positives and improved detection accuracy.
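
The HOG-plus-SVM pipeline is a few lines with scikit-image and scikit-learn; a minimal sketch, assuming the AT&T images are loaded as equal-size grayscale arrays with subject labels:

    # Sketch: HOG descriptors + linear SVM for face identification. X_imgs is a
    # list of equal-size grayscale face arrays, y the corresponding subject IDs.
    from skimage.feature import hog
    from sklearn.svm import SVC

    def hog_features(img):
        return hog(img, orientations=9, pixels_per_cell=(8, 8),
                   cells_per_block=(2, 2), block_norm="L2-Hys")

    def train(X_imgs, y):
        return SVC(kernel="linear").fit([hog_features(i) for i in X_imgs], y)

    def recognize(clf, test_img):
        return clf.predict([hog_features(test_img)])[0]   # predicted subject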

Posted Content
TL;DR: A framework is proposed to analyze predictions in terms of the model's internal features by inspecting information flow through the network; comparing the sets of neurons selected by two metrics suggests a way to investigate the internal attention mechanisms of convolutional neural networks.
Abstract: The predictive power of neural networks often costs model interpretability. Several techniques have been developed for explaining model outputs in terms of input features; however, it is difficult to translate such interpretations into actionable insight. Here, we propose a framework to analyze predictions in terms of the model's internal features by inspecting information flow through the network. Given a trained network and a test image, we select neurons by two metrics, both measured over a set of images created by perturbations to the input image: (1) magnitude of the correlation between the neuron activation and the network output and (2) precision of the neuron activation. We show that the former metric selects neurons that exert large influence over the network output while the latter metric selects neurons that activate on generalizable features. By comparing the sets of neurons selected by these two metrics, our framework suggests a way to investigate the internal attention mechanisms of convolutional neural networks.
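
Given activations collected over input perturbations, both selection metrics are simple statistics; the numpy sketch below uses inverse activation variance as a simplified proxy for the paper's precision measure (array shapes are illustrative):

    # Sketch: score neurons over a set of perturbed inputs by (1) correlation of
    # activation with the network output and (2) a precision proxy (inverse
    # activation variance -- a simplification of the paper's measure).
    import numpy as np

    def neuron_scores(acts, outputs):
        """acts: (n_perturbations, n_neurons); outputs: (n_perturbations,)."""
        a = acts - acts.mean(axis=0)
        o = outputs - outputs.mean()
        denom = acts.std(axis=0) * outputs.std() + 1e-12
        correlation = (a * o[:, None]).mean(axis=0) / denom   # influence on output
        precision = 1.0 / (acts.var(axis=0) + 1e-12)          # stability of firing
        return correlation, precision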

Journal ArticleDOI
01 Jun 2017-Optik
TL;DR: The experimental results show that Canny edge detection based feature extraction achieves a lower root mean square error (RMSE) than the Otsu method for detecting the grain count.

Proceedings ArticleDOI
01 Nov 2017
TL;DR: A technique is proposed to remove visible watermarks automatically using image inpainting: a statistical method detects the watermark region, and an exemplar-based inpainting algorithm that investigates the sparsity of natural image patches removes it.
Abstract: This paper introduces a technique to remove visible watermarks automatically using image inpainting algorithms. The pending images which need watermark removal are assumed to have the same resolution and watermark region, and we show this assumption is reasonable. Our proposed technique includes two basic steps. The first step is detecting the watermark region: we propose a statistical method in which a thresholding algorithm for segmentation operates on the accumulation image, calculated by accumulating the gray-scale maps of the pending images. The second step is removing the watermark using image inpainting algorithms. Since watermarks usually cover large regions, an exemplar-based inpainting algorithm that investigates the sparsity of natural image patches is proposed for this step. Experiments were carried out on a test image set of 889 images downloaded from a shopping website, with a resolution of 800∗800 and the same watermark regions.
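
Both steps map onto OpenCV primitives; the sketch below uses Otsu thresholding on the accumulation image and OpenCV's built-in inpainting as a stand-in for the paper's exemplar-based, sparsity-driven algorithm (paths are illustrative):

    # Sketch: (1) locate the shared watermark region by accumulating gray-scale
    # maps and thresholding (Otsu here); (2) remove it by inpainting. cv2.inpaint
    # stands in for the paper's exemplar-based sparse inpainting.
    import cv2
    import numpy as np

    def watermark_mask(paths):
        acc = None                        # images share resolution and watermark
        for p in paths:
            g = cv2.imread(p, cv2.IMREAD_GRAYSCALE).astype(np.float64)
            acc = g if acc is None else acc + g
        acc = cv2.normalize(acc, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
        _, mask = cv2.threshold(acc, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        return mask

    def remove_watermark(img_bgr, mask):
        return cv2.inpaint(img_bgr, mask, 3, cv2.INPAINT_TELEA)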

Journal ArticleDOI
TL;DR: This correspondence paper proposes a new approach called cascaded elastically progressive model aiming for pixel-wise landmark localization and shows advantages for accurate landmark localization compared with prevailing methods.
Abstract: While recently published face alignment algorithms have mainly focused on occlusion, low image quality, and complex head poses, subtle variances of facial components are often overlooked. In this correspondence paper, we propose a new approach called the cascaded elastically progressive model, aiming for pixel-wise landmark localization. First of all, the elastically progressive model (EPM) is designed to synthesize prior knowledge of the face shape and the appearance of the test image. More specifically, a novel framework referred to as the inherent linear structure (ILS) is explored for capturing the characteristics of the shape, which is more plastic and flexible than the extensively used principal component analysis-based modeling. A locally linear support vector machine (LL-SVM) is used as a local expert for searching candidate feature points. In order to optimally integrate the ILS with the localization results of the LL-SVM, we introduce a Kalman filter (KF) to dynamically estimate the true shape in the sense of least mean square error. Two schemes are utilized based on our modeling of the KF. First, we embed a heuristic line-like search strategy into the framework to guarantee and accelerate convergence. Second, the Kalman gain is manipulated adaptively in accordance with the confidence of the localizers, so that poorly localized points are more subject to the global constraint than well localized ones. To further improve robustness to initializations, two EPMs are cascaded, in which the primary EPM detects the global structure and the secondary EPM captures the details. Validation experiments are conducted on the in-the-wild LFPW and HELEN databases. Our method shows advantages in accurate landmark localization compared with prevailing methods.

Proceedings ArticleDOI
30 Sep 2017
TL;DR: An algorithm that provides a pixel-wise classification of building facades that integrates appearance and layout cues in a single framework and is on par with the reported performance results.
Abstract: We propose an algorithm that provides a pixel-wise classification of building facades. Building facades provide a rich environment for testing semantic segmentation techniques. They come in a variety of styles that reflect both appearance and layout characteristics. On the other hand, they exhibit a degree of stability in the arrangement of structures across different instances. We integrate appearance and layout cues in a single framework. The most likely label based on appearance is obtained through applying state-of-the-art deep convolution networks. This is further optimized through Restricted Boltzmann Machines (RBM), applied on vertical and horizontal scanlines of facade models. Learning the probability distributions of the models via the RBMs is utilized in two settings. Firstly, we use them in learning from pre-seen facade samples, in the traditional training sense. Secondly, we learn from the test image at hand, in a way that allows the transfer of visual knowledge of the scene from correctly classified areas to others. Experimentally, we are on par with the reported performance results. However, we do not explicitly specify any hand-engineered features that are architectural scene dependent, nor do we include any dataset specific heuristics/thresholds.

Patent
23 Jun 2017
TL;DR: In this paper, a radar echo extrapolation method based on a dynamic convolution neural network is proposed, which comprises a step of offline convolutional neural network training which comprises the steps of carrying out data preprocessing on a given training image set to obtain a training sample set.
Abstract: The invention discloses a radar echo extrapolation method based on a dynamic convolution neural network. The method comprises a step of offline convolutional neural network training, which comprises the steps of carrying out data preprocessing on a given training image set to obtain a training sample set, initializing a dynamic convolution neural network model, training the dynamic convolution neural network by using the training sample set, calculating an output value through network forward propagation, and updating network parameters through backward propagation such that the dynamic convolution neural network converges. The method also comprises a step of online radar echo extrapolation, which comprises the steps of converting a test image set into a test sample set through data preprocessing, testing the trained dynamic convolution neural network by using the test sample set, and convolving the last radar echo image of the input image sequence with a probability vector obtained in the network forward propagation to obtain a predicted radar echo extrapolation image.

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed hybrid method has substantial quality improvement, in terms of the CPSNR quality, visual effect, CPSNR-bitrate trade-off, and Bjøntegaard delta PSNR performance, of the reconstructed RGB images when compared with existing chroma subsampling schemes.
Abstract: In this paper, we propose a novel and effective hybrid method, which joins conventional chroma subsampling and distortion-minimization-based luma modification together, to improve the quality of the reconstructed RGB full-color image. Assume the input RGB full-color image has been transformed to a YUV image prior to compression. For each $2\times 2$ UV block, one 4:2:0 subsampling is applied to determine the subsampled U and V components, $U_{s}$ and $V_{s}$. Based on $U_{s}$, $V_{s}$, and the corresponding $2\times 2$ original RGB block, a main theorem is provided to determine the ideally modified $2\times 2$ luma block in constant time such that the color peak signal-to-noise ratio (CPSNR) quality distortion between the original $2\times 2$ RGB block and the reconstructed $2\times 2$ RGB block is minimized in a globally optimal sense. Furthermore, the proposed hybrid method and the delivered theorem are adjusted to tackle digital time delay integration images and Bayer mosaic images, whose Bayer CFA structure has been widely used in modern commercial digital cameras. Based on the IMAX, Kodak, and screen content test image sets, the experimental results demonstrate that in high efficiency video coding, the proposed hybrid method delivers substantial quality improvement, in terms of CPSNR quality, visual effect, CPSNR-bitrate trade-off, and Bjøntegaard delta PSNR performance, of the reconstructed RGB images when compared with existing chroma subsampling schemes.
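
The 4:2:0 step itself, one U and one V sample per $2\times 2$ block, is a small numpy reduction; the sketch below uses block averaging, one common subsampling choice, while the paper's contribution is the subsequent CPSNR-optimal luma modification:

    # Sketch: 4:2:0 chroma subsampling by 2x2 block averaging of a U or V plane.
    # The paper's contribution then modifies each 2x2 luma block, given the
    # subsampled (U_s, V_s), to minimize CPSNR distortion of the RGB block.
    import numpy as np

    def subsample_420(chroma):
        """chroma: (H, W) U or V plane with even H and W -> (H/2, W/2)."""
        h, w = chroma.shape
        blocks = chroma.reshape(h // 2, 2, w // 2, 2).astype(np.float64)
        return blocks.mean(axis=(1, 3))   # one chroma sample per 2x2 block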