
Showing papers on "Standard test image published in 2019"


Proceedings ArticleDOI
20 Jun 2019
TL;DR: This paper utilizes GANs to train a very powerful generator of facial texture in UV space and revisits the original 3D Morphable Models (3DMMs) fitting approaches making use of non-linear optimization to find the optimal latent parameters that best reconstruct the test image but under a new perspective.
Abstract: In the past few years, a lot of work has been done towards reconstructing the 3D facial structure from single images by capitalizing on the power of Deep Convolutional Neural Networks (DCNNs). In the most recent works, differentiable renderers were employed in order to learn the relationship between the facial identity features and the parameters of a 3D morphable model for shape and texture. The texture features either correspond to components of a linear texture space or are learned by auto-encoders directly from in-the-wild images. In all cases, the quality of the facial texture reconstruction of the state-of-the-art methods is still not capable of modeling textures in high fidelity. In this paper, we take a radically different approach and harness the power of Generative Adversarial Networks (GANs) and DCNNs in order to reconstruct the facial texture and shape from single images. That is, we utilize GANs to train a very powerful generator of facial texture in UV space. Then, we revisit the original 3D Morphable Models (3DMMs) fitting approaches making use of non-linear optimization to find the optimal latent parameters that best reconstruct the test image but under a new perspective. We optimize the parameters with the supervision of pretrained deep identity features through our end-to-end differentiable framework. We demonstrate excellent results in photorealistic and identity preserving 3D face reconstructions and achieve for the first time, to the best of our knowledge, facial texture reconstruction with high-frequency details.

291 citations


Journal ArticleDOI
16 May 2019
TL;DR: An approach is proposed that combines automatic features learned by convolutional neural networks (CNN) and handcrafted features computed by the bag-of-visual-words (BOVW) model in order to achieve state-of-the-art results in facial expression recognition (FER).
Abstract: We present an approach that combines automatic features learned by convolutional neural networks (CNN) and handcrafted features computed by the bag-of-visual-words (BOVW) model in order to achieve the state-of-the-art results in facial expression recognition (FER). To obtain automatic features, we experiment with multiple CNN architectures, pre-trained models, and training procedures, e.g., Dense–Sparse–Dense. After fusing the two types of features, we employ a local learning framework to predict the class label for each test image. The local learning framework is based on three steps. First, a k-nearest neighbors model is applied in order to select the nearest training samples for an input test image. Second, a one-versus-all support vector machines (SVM) classifier is trained on the selected training samples. Finally, the SVM classifier is used to predict the class label only for the test image it was trained for. Although we have used local learning in combination with handcrafted features in our previous work, to the best of our knowledge, local learning has never been employed in combination with deep features. The experiments on the 2013 FER Challenge data set, the FER+ data set, and the AffectNet data set demonstrate that our approach achieves the state-of-the-art results. With a top accuracy of 75.42% on the FER 2013, 87.76% on the FER+, 59.58% on the AffectNet eight-way classification, and 63.31% on the AffectNet seven-way classification, we surpass the state-of-the-art methods by more than 1% on all data sets.
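
As a rough illustration of the local learning step described above (select the k nearest training samples for each test image, fit a one-versus-all SVM on just that subset, and use it only for that test image), the following scikit-learn sketch shows the idea; the feature dimensionality, value of k, and SVM settings are illustrative assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import LinearSVC

def local_learning_predict(train_feats, train_labels, test_feats, k=200):
    """Predict a label for each test sample with a test-specific SVM.

    For every test feature vector, the k nearest training samples are
    selected and a one-vs-all linear SVM is fitted on that local subset
    only; the SVM is then used for this single test sample and discarded.
    """
    nn = NearestNeighbors(n_neighbors=k).fit(train_feats)
    predictions = []
    for x in test_feats:
        _, idx = nn.kneighbors(x.reshape(1, -1))
        local_X = train_feats[idx[0]]
        local_y = train_labels[idx[0]]
        if len(np.unique(local_y)) == 1:
            # All neighbours share one class: no SVM needed.
            predictions.append(local_y[0])
            continue
        clf = LinearSVC(C=1.0).fit(local_X, local_y)  # one-vs-rest by default
        predictions.append(clf.predict(x.reshape(1, -1))[0])
    return np.array(predictions)

# Toy usage with random stand-ins for the fused CNN + BOVW features.
rng = np.random.default_rng(0)
train_feats = rng.normal(size=(1000, 128))
train_labels = rng.integers(0, 7, size=1000)   # e.g. 7 expression classes
test_feats = rng.normal(size=(5, 128))
print(local_learning_predict(train_feats, train_labels, test_feats, k=50))
```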

223 citations


Journal ArticleDOI
TL;DR: A data-driven approach to semantic segmentation of cloud and cloud shadow in single date images based on a modified U-Net convolutional neural network that consistently outperforms Fmask and a traditional Random Forest classifier on a globally distributed multi-sensor test dataset in terms of accuracy, Cohen's Kappa coefficient, Dice coefficient and inference speed.

111 citations


Proceedings ArticleDOI
15 Jun 2019
TL;DR: This work proposes a data-driven learned sky model, which is used for outdoor lighting estimation from a single image, and shows that it can be used to recover plausible illumination, leading to visually pleasant virtual object insertions.
Abstract: We propose a data-driven learned sky model, which we use for outdoor lighting estimation from a single image. As no large-scale dataset of images and their corresponding ground truth illumination is readily available, we use complementary datasets to train our approach, combining the vast diversity of illumination conditions of SUN360 with the radiometrically calibrated and physically accurate Laval HDR sky database. Our key contribution is to provide a holistic view of both lighting modeling and estimation, solving both problems end-to-end. From a test image, our method can directly estimate an HDR environment map of the lighting without relying on analytical lighting models. We demonstrate the versatility and expressivity of our learned sky model and show that it can be used to recover plausible illumination, leading to visually pleasant virtual object insertions. To further evaluate our method, we capture a dataset of HDR 360° panoramas and show through extensive validation that we significantly outperform previous state-of-the-art.

91 citations


Journal ArticleDOI
TL;DR: A novel template establishment is presented and a simple guidance template-based algorithm for strip steel surface defect detection is proposed, which achieves a better average detection rate of 96.2% on a data set including 1500 test images.
Abstract: Automatic defect detection on strip steel surfaces is a challenging task in computer vision, owing to miscellaneous patterns of defects, disturbance of pseudodefects, and random arrangement of gray-level in background. In this paper, a novel template establishment is presented. Further, a simple guidance template-based algorithm for strip steel surface defect detection is proposed. First, a large number of defect-free images are collected to obtain the statistical characteristic of normal textures. Second, for each given test image, the initial template is built according to the statistical characteristic and the size of test image. Then, a sorting operation is applied to the given test image. Further, by updating the initial template, a unique guidance template is generated based on specific intensity distribution of the sorted test image. So far, the background of each test image is approximately reconstructed in the guidance template. Finally, based on pixel-wise detection, the defects can be located accurately by subtraction operation between the guidance template and sorted test image, reverse sorting operation, and adaptive threshold determination. Experimental results show that the proposed method is both efficient and effective. It achieves a better average detection rate of 96.2% on a data set including 1500 test images.
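
A heavily simplified sketch of the template-subtraction idea is given below: a template is built from the statistics of defect-free images, the sorted test image is compared against it, the residual is thresholded adaptively, and the flagged pixels are mapped back through a reverse sort. The template construction and threshold rule here are assumptions and omit the paper's guidance-template update.

```python
import numpy as np

def detect_defects(test_img, defect_free_imgs, k=4.0):
    """Simplified template-based defect detection on one grayscale image."""
    # Sort each row of the reference images to capture the intensity
    # distribution of normal texture (a stand-in for the statistical template).
    template = np.mean([np.sort(img, axis=1) for img in defect_free_imgs], axis=0)

    order = np.argsort(test_img, axis=1)             # sorting operation
    sorted_test = np.take_along_axis(test_img, order, axis=1)

    residual = np.abs(sorted_test.astype(float) - template)
    thresh = residual.mean() + k * residual.std()    # adaptive threshold
    sorted_mask = residual > thresh

    # Reverse sorting: put each flagged pixel back where it came from.
    mask = np.zeros_like(sorted_mask)
    np.put_along_axis(mask, order, sorted_mask, axis=1)
    return mask

rng = np.random.default_rng(1)
refs = [rng.integers(90, 110, size=(64, 64)).astype(float) for _ in range(20)]
test = rng.integers(90, 110, size=(64, 64)).astype(float)
test[30:34, 40:44] = 200                             # synthetic bright defect
print(detect_defects(test, refs).sum(), "pixels flagged")
```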

80 citations


Journal ArticleDOI
TL;DR: A surrogate-assisted method based on convolutional neural networks (CNNs) is proposed to classify retinal OCT images automatically; evaluation on different databases shows that the proposed method is a very promising tool for automatic retinal OCT image classification.
Abstract: Optical Coherence Tomography (OCT) is becoming one of the most important modalities for the noninvasive assessment of retinal eye diseases. As the number of acquired OCT volumes increases, automating the OCT image analysis is becoming increasingly relevant. In this paper, we propose a surrogate-assisted classification method to classify retinal OCT images automatically based on convolutional neural networks (CNNs). Image denoising is first performed to reduce the noise. Thresholding and morphological dilation are applied to extract the masks. The denoised images and the masks are then employed to generate a lot of surrogate images, which are used to train the CNN model. Finally, the prediction for a test image is determined by the average of the outputs from the trained CNN model on the surrogate images. The proposed method has been evaluated on different databases. The results (AUC of 0.9783 in the local database and AUC of 0.9856 in the Duke database) show that the proposed method is a very promising tool for classifying the retinal OCT images automatically.
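
The preprocessing and surrogate-averaging steps can be sketched as follows; the denoising filter, threshold, dilation count, and the way surrogate images are generated are placeholder assumptions, and the CNN is replaced by a dummy callable.

```python
import numpy as np
from scipy import ndimage

def extract_mask(oct_img, dilation_iters=3):
    """Denoise, threshold, and dilate an OCT B-scan to get a coarse mask
    (a simplified stand-in for the paper's preprocessing)."""
    denoised = ndimage.median_filter(oct_img, size=5)     # noise reduction
    mask = denoised > denoised.mean()                     # global threshold
    mask = ndimage.binary_dilation(mask, iterations=dilation_iters)
    return denoised, mask

def predict_from_surrogates(model_predict, surrogates):
    """Final prediction = average of the model outputs on all surrogate
    images generated for one test scan."""
    probs = np.stack([model_predict(s) for s in surrogates])
    return probs.mean(axis=0)

# Toy usage with a dummy "CNN" that outputs random class probabilities.
rng = np.random.default_rng(2)
scan = rng.normal(0.5, 0.2, size=(128, 128))
denoised, mask = extract_mask(scan)
surrogates = [denoised * np.roll(mask, shift, axis=1) for shift in (0, 2, 4)]
dummy_cnn = lambda img: rng.dirichlet(np.ones(3))         # 3 disease classes
print(predict_from_surrogates(dummy_cnn, surrogates))
```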

71 citations


Proceedings ArticleDOI
Lin Zhang1, Lijun Zhang1, Xiao Liu1, Ying Shen1, Shaoming Zhang1, Shengjie Zhao1 
15 Oct 2019
TL;DR: This paper proposes a "zero-shot" scheme for back-lit image restoration, which exploits the power of deep learning, but does not rely on any prior image examples or prior training, and is the first unsupervised CNN-based back-lit image restoration method.
Abstract: How to restore back-lit images still remains a challenging task. State-of-the-art methods in this field are based on supervised learning and thus they are usually restricted to specific training data. In this paper, we propose a "zero-shot" scheme for back-lit image restoration, which exploits the power of deep learning, but does not rely on any prior image examples or prior training. Specifically, we train a small image-specific CNN, namely ExCNet (short for Exposure Correction Network) at test time, to estimate the "S-curve" that best fits the test back-lit image. Once the S-curve is estimated, the test image can be then restored straightforwardly. ExCNet can adapt itself to different settings per image. This makes our approach widely applicable to different shooting scenes and kinds of back-lighting conditions. Statistical studies performed on 1512 real back-lit images demonstrate that our approach can outperform the competitors by a large margin. To the best of our knowledge, our scheme is the first unsupervised CNN-based back-lit image restoration method. To make the results reproducible, the source code is available at https://cslinzhang.github.io/ExCNet/.

70 citations


Journal ArticleDOI
TL;DR: An approach based on deep learning that uses autoencoders to extract discriminative features can detect defects without using any defect samples during training, and it can be applied to different types of defects with minimal customization.

67 citations


Journal ArticleDOI
TL;DR: A global Fourier image reconstruction method to detect and localize small defects in nonperiodical pattern images that is invariant to translation and illumination, and can detect subtle defects as small as 1-pixel wide in a wide variety of nonperiodical patterns found in the electronic industry.
Abstract: For defect detection in nonperiodical pattern images, such as printed circuit boards or integrated circuit dies found in the electronic industry, template matching could be the only applicable method to tackle the problem. The traditional template matching techniques work in the spatial domain and rely on the local pixel information. They are sensitive to geometric and lighting changes, and random product variations. The currently available Fourier-based methods mainly work for plain and periodical texture surfaces. In this paper, we propose a global Fourier image reconstruction method to detect and localize small defects in nonperiodical pattern images. It is based on the comparison of the whole Fourier spectra between the template and the inspection image. It retains only the frequency components associated with the local spatial anomaly. The inverse Fourier transform is then applied to reconstruct the test image, where the local anomaly will be restored and the common pattern will be removed as a uniform surface. The proposed method is invariant to translation and illumination, and can detect subtle defects as small as 1-pixel wide in a wide variety of nonperiodical patterns found in the electronic industry.
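
The spectral-comparison idea (retain only the frequency components where the inspection image differs from the template, then inverse-transform so the common pattern vanishes and the anomaly is restored) can be sketched with NumPy FFTs; the selection rule for "differing" components and the keep ratio are assumptions.

```python
import numpy as np

def fourier_residual(template, inspected, keep_ratio=0.02):
    """Reconstruct only the frequency content that differs between the
    template and the inspected image, so the common pattern is removed
    and the local anomaly is restored in the spatial domain."""
    F_t = np.fft.fft2(template)
    F_i = np.fft.fft2(inspected)
    diff = np.abs(F_i - F_t)

    # Keep only the strongest spectral differences (assumed criterion).
    thresh = np.quantile(diff, 1.0 - keep_ratio)
    keep = diff >= thresh
    residual_spectrum = (F_i - F_t) * keep

    # Back to the spatial domain: common pattern ~ flat, defect restored.
    return np.abs(np.fft.ifft2(residual_spectrum))

rng = np.random.default_rng(3)
template = rng.normal(0.5, 0.05, size=(128, 128))
inspected = template.copy()
inspected[60, 70] += 1.0                    # a 1-pixel-wide synthetic defect
res = fourier_residual(template, inspected)
print("peak residual at", np.unravel_index(res.argmax(), res.shape))
```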

59 citations


Posted Content
TL;DR: This work proposes a completely generic deep pose estimation approach, which does not require the network to have been trained on relevant categories, nor objects in a category to have a canonical pose, and demonstrates that the method boosts performance for supervised category pose estimation on standard benchmarks.
Abstract: Most deep pose estimation methods need to be trained for specific object instances or categories. In this work we propose a completely generic deep pose estimation approach, which does not require the network to have been trained on relevant categories, nor objects in a category to have a canonical pose. We believe this is a crucial step to design robotic systems that can interact with new objects in the wild not belonging to a predefined category. Our main insight is to dynamically condition pose estimation with a representation of the 3D shape of the target object. More precisely, we train a Convolutional Neural Network that takes as input both a test image and a 3D model, and outputs the relative 3D pose of the object in the input image with respect to the 3D model. We demonstrate that our method boosts performances for supervised category pose estimation on standard benchmarks, namely Pascal3D+, ObjectNet3D and Pix3D, on which we provide results superior to the state of the art. More importantly, we show that our network trained on everyday man-made objects from ShapeNet generalizes without any additional training to completely new types of 3D objects by providing results on the LINEMOD dataset as well as on natural entities such as animals from ImageNet.

50 citations


Journal ArticleDOI
TL;DR: This work incorporates the CNN feature of an image into the proposed SEM model using the well-known AlexNet network; the feature is extracted by removing AlexNet's final layer and proves useful in the authors' SEM model.
Abstract: Automatic image annotation (AIA) methods are considered an efficient way to bridge the semantic gap between original images and their semantic information. However, traditional annotation models work well only with finely crafted manual features. To address this problem, we incorporate the CNN feature of an image into our proposed model, which we refer to as SEM, using the well-known AlexNet CNN model. We extract a CNN feature by removing the network's final layer, and this feature proves useful in our SEM model. Additionally, based on experience with traditional KNN models, we propose a model that addresses image tag refinement and assignment simultaneously while maintaining the simplicity of the KNN model. The proposed model groups images with similar features into a semantic neighbor group. Moreover, using a self-defined Bayesian-based model, we distribute the tags belonging to the neighbor group to the test images according to the distance between the test image and its neighbors. Finally, experiments are performed on three typical image datasets, Corel5K, ESP Game, and IAPR TC-12, which verify the effectiveness of the proposed model.
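
Extracting a CNN feature by removing AlexNet's final layer is straightforward with torchvision; the sketch below is generic feature extraction, not the SEM pipeline itself, and the image path in the commented usage line is hypothetical (older torchvision versions use pretrained=True instead of the weights argument).

```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

# Load a pretrained AlexNet and drop its last fully connected layer so the
# forward pass returns the 4096-D penultimate activation as the image feature.
alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
alexnet.classifier = nn.Sequential(*list(alexnet.classifier.children())[:-1])
alexnet.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def cnn_feature(image_path):
    """Return the 4096-D AlexNet feature of one image."""
    img = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        feat = alexnet(preprocess(img).unsqueeze(0))
    return feat.squeeze(0).numpy()

# feature = cnn_feature("example.jpg")   # hypothetical path; shape: (4096,)
```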

Journal ArticleDOI
TL;DR: A blind quality assessment method is developed that can effectively and efficiently evaluate the quality of contrast-distorted images without requiring reference information; it is more consistent with subjective evaluation results than state-of-the-art image quality assessment methods and requires lower computational complexity.
Abstract: This paper mainly focuses on developing a blind quality assessment method that can effectively and efficiently evaluate the quality of contrast distorted images without requiring reference information. Through experiments, we discover and validate that the global intensity change is the main characteristic of contrast distorted images and has a close relationship to the perceptual quality. With these observations, two elements are utilized to quantify this characteristic, i.e., the maximum information entropy of intensity values and the Kullback–Leibler (K–L) divergence between the test image’s intensity histogram and the prior one based on the statistical experiment over a great number of high-quality images. To be specific, the entropy represents the valuable information of an image and the K–L divergence reflects the change degree of intensity distribution. In view of these, the proposed method is generated by combining these two elements linearly. Extensive experiments on three publicly available databases demonstrate the superiority of the proposed method. More specifically, it is more consistent with subjective evaluation results than the state-of-the-art image quality assessment methods and requires a lower computational complexity.
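
The two cues (entropy of the intensity histogram and K–L divergence to a prior histogram learned from high-quality images) combined linearly can be sketched in NumPy as below; the bin count, prior construction, and combination weights alpha/beta are illustrative assumptions, not the paper's fitted values.

```python
import numpy as np

def intensity_histogram(img, bins=256):
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    return hist.astype(float) / hist.sum()

def entropy(p, eps=1e-12):
    """Shannon entropy of an intensity distribution (bits)."""
    return float(-np.sum(p * np.log2(p + eps)))

def kl_divergence(p, q, eps=1e-12):
    """K-L divergence D(p || q) between test and prior histograms."""
    return float(np.sum(p * np.log2((p + eps) / (q + eps))))

def contrast_quality(img, prior_hist, alpha=1.0, beta=-1.0):
    """Linear combination of the two cues: higher entropy is rewarded,
    larger deviation from the prior of high-quality images is penalized."""
    p = intensity_histogram(img)
    return alpha * entropy(p) + beta * kl_divergence(p, prior_hist)

# Toy prior built from a synthetic "high-quality" image.
rng = np.random.default_rng(4)
prior = intensity_histogram(rng.integers(0, 256, size=(512, 512)))
low_contrast = rng.integers(100, 156, size=(256, 256))
high_contrast = rng.integers(0, 256, size=(256, 256))
print(contrast_quality(low_contrast, prior), contrast_quality(high_contrast, prior))
```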

Journal ArticleDOI
TL;DR: The proposed morphological operations with arc-shaped SEs can efficiently intensify local defects and remove the tool-mark background in the circular machined surface and can achieve high detection accuracy for various small defects, including scratch, bump and edge burst.

Journal ArticleDOI
TL;DR: A CNN-based segmentation algorithm that, in addition to being highly accurate and fast, is also resilient to variation in the input acquisition, and consistent across a wide range of acquisition protocols is proposed.

Journal ArticleDOI
15 Apr 2019-Entropy
TL;DR: An effective technique based on the Electromagnetic Field Optimization (EFO) algorithm with a fuzzy entropy criterion is proposed; in addition, a novel chaotic strategy is embedded into EFO to develop a new algorithm named CEFO, whose robustness is evaluated against competing algorithms.
Abstract: Multilevel thresholding segmentation of color images is an important technology in various applications which has received more attention in recent years. The process of determining the optimal threshold values in the case of traditional methods is time-consuming. In order to mitigate the above problem, meta-heuristic algorithms have been employed in this field for searching the optima during the past few years. In this paper, an effective technique of Electromagnetic Field Optimization (EFO) algorithm based on a fuzzy entropy criterion is proposed, and in addition, a novel chaotic strategy is embedded into EFO to develop a new algorithm named CEFO. To evaluate the robustness of the proposed algorithm, other competitive algorithms such as Artificial Bee Colony (ABC), Bat Algorithm (BA), Wind Driven Optimization (WDO), and Bird Swarm Algorithm (BSA) are compared using fuzzy entropy as the fitness function. Furthermore, the proposed segmentation method is also compared with the most widely used approaches of Otsu's variance and Kapur's entropy to verify its segmentation accuracy and efficiency. Experiments are conducted on ten Berkeley benchmark images and the simulation results are presented in terms of peak signal to noise ratio (PSNR), mean structural similarity (MSSIM), feature similarity (FSIM), and computational time (CPU Time) at different threshold levels of 4, 6, 8, and 10 for each test image. A series of experiments can significantly demonstrate the superior performance of the proposed technique, which can deal with multilevel thresholding color image segmentation excellently.
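
The evaluation side of this pipeline, applying a candidate set of thresholds to a channel and scoring the quantized result with PSNR, can be sketched as below; the metaheuristic search and the fuzzy-entropy fitness are omitted, and the threshold values are purely illustrative.

```python
import numpy as np

def apply_thresholds(channel, thresholds):
    """Quantize a grayscale/colour channel at the given threshold levels,
    replacing each intensity segment by its mean value."""
    edges = [0] + sorted(int(t) for t in thresholds) + [256]
    out = np.zeros_like(channel, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        seg = (channel >= lo) & (channel < hi)
        if seg.any():
            out[seg] = channel[seg].mean()
    return out

def psnr(original, segmented, max_val=255.0):
    mse = np.mean((original.astype(float) - segmented) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

rng = np.random.default_rng(5)
img = rng.integers(0, 256, size=(256, 256)).astype(float)
# Thresholds that a metaheuristic (e.g. CEFO) might return at level 4.
candidate = [60, 120, 180, 230]
print("PSNR:", psnr(img, apply_thresholds(img, candidate)))
```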

Journal ArticleDOI
TL;DR: Comparative studies with three existing methods confirm that the proposed convolutional capsule network for detecting vehicles from high-resolution remote sensing images effectively performs in detecting vehicles of various conditions.
Abstract: Vehicle detection plays an important role in a variety of traffic-related applications. However, due to the scale and orientation variations and partial occlusions of vehicles, it is still challenging to accurately detect vehicles from remote sensing images. This letter proposes a convolutional capsule network for detecting vehicles from high-resolution remote sensing images. First, a test image is segmented into superpixels to generate meaningful and nonredundant patches. Then, these patches are input to a convolutional capsule network to label them into vehicles or the background. Finally, nonmaximum suppression is adopted to eliminate repetitive detections. Quantitative evaluations on four test data sets show that average completeness, correctness, quality, and F1-measure of 0.93, 0.97, 0.90, and 0.95, respectively, are obtained. Comparative studies with three existing methods confirm that the proposed method effectively performs in detecting vehicles of various conditions.
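
The final nonmaximum suppression step can be sketched as standard greedy IoU-based NMS; the box format, scores, and IoU threshold below are assumptions, not values from the letter.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.3):
    """Greedy non-maximum suppression.

    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) classifier
    confidences for the 'vehicle' class. Returns indices of kept boxes.
    """
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU of the top box with the remaining candidates.
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_thresh]   # drop highly overlapping boxes
    return keep

boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]], float)
scores = np.array([0.9, 0.8, 0.95])
print(nms(boxes, scores))   # -> [2, 0]
```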

Journal ArticleDOI
TL;DR: KonIQ-10k, introduced in this paper, is an in-the-wild dataset for image quality assessment (IQA) consisting of 10,073 quality-scored images, the largest IQA dataset to date.
Abstract: Deep learning methods for image quality assessment (IQA) are limited due to the small size of existing datasets. Extensive datasets require substantial resources both for generating publishable content and annotating it accurately. We present a systematic and scalable approach to creating KonIQ-10k, the largest IQA dataset to date, consisting of 10,073 quality scored images. It is the first in-the-wild database aiming for ecological validity, concerning the authenticity of distortions, the diversity of content, and quality-related indicators. Through the use of crowdsourcing, we obtained 1.2 million reliable quality ratings from 1,459 crowd workers, paving the way for more general IQA models. We propose a novel, deep learning model (KonCept512), to show an excellent generalization beyond the test set (0.921 SROCC), to the current state-of-the-art database LIVE-in-the-Wild (0.825 SROCC). The model derives its core performance from the InceptionResNet architecture, being trained at a higher resolution than previous models (512x384). Correlation analysis shows that KonCept512 performs similar to having 9 subjective scores for each test image.
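
The correlations reported here (e.g., 0.921 SROCC) are Spearman rank-order correlations between predicted scores and subjective ratings; computing them is a one-liner with SciPy, sketched below on hypothetical score arrays.

```python
import numpy as np
from scipy.stats import spearmanr, pearsonr

# Hypothetical predicted quality scores and ground-truth MOS values.
rng = np.random.default_rng(6)
mos = rng.uniform(1, 5, size=1000)                 # mean opinion scores
predicted = mos + rng.normal(0, 0.4, size=1000)    # a model's estimates

srocc, _ = spearmanr(predicted, mos)   # rank correlation (reported as SROCC)
plcc, _ = pearsonr(predicted, mos)     # linear correlation, also often reported
print(f"SROCC={srocc:.3f}  PLCC={plcc:.3f}")
```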

Book ChapterDOI
13 Oct 2019
TL;DR: A novel multi-task deep learning framework for simultaneous histopathology image classification and retrieval, leveraging on the classic concept of k-nearest neighbours to improve model interpretability and evaluate the method on colorectal cancer histology slides to show that the confidence estimates are strongly correlated with model performance.
Abstract: Deep neural networks have achieved tremendous success in image recognition, classification and object detection. However, deep learning is often criticised for its lack of transparency and general inability to rationalise its predictions. The issue of poor model interpretability becomes critical in medical applications: a model that is not understood and trusted by physicians is unlikely to be used in daily clinical practice. In this work, we develop a novel multi-task deep learning framework for simultaneous histopathology image classification and retrieval, leveraging on the classic concept of k-nearest neighbours to improve model interpretability. For a test image, we retrieve the most similar images from our training databases. These retrieved nearest neighbours can be used to classify the test image with a confidence score, and provide a human-interpretable explanation of our classification. Our original framework can be built on top of any existing classification network (and therefore benefit from pretrained models), by (i) combining a triplet loss function with a novel triplet sampling strategy to compare distances between samples and (ii) adding a Cauchy hashing loss function to accelerate neighbour searching. We evaluate our method on colorectal cancer histology slides and show that the confidence estimates are strongly correlated with model performance. Nearest neighbours are intuitive and useful for expert evaluation. They give insights into understanding possible model failures, and can support clinical decision making by comparing archived images and patient records with the actual case.
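
The retrieval-based classification step (embed the test image, fetch its k nearest training neighbours, take a majority vote, and report the vote share as a confidence score) can be sketched with scikit-learn; the embeddings below are random placeholders for the learned triplet features, and k and the metric are assumptions.

```python
import numpy as np
from collections import Counter
from sklearn.neighbors import NearestNeighbors

def retrieve_and_classify(train_embed, train_labels, test_embed, k=5):
    """For each test embedding, return the majority label of its k nearest
    training neighbours, the vote share as a confidence score, and the
    neighbour indices (which can be shown to an expert as evidence)."""
    nn = NearestNeighbors(n_neighbors=k, metric="euclidean").fit(train_embed)
    _, idx = nn.kneighbors(test_embed)
    results = []
    for neighbours in idx:
        votes = Counter(train_labels[neighbours])
        label, count = votes.most_common(1)[0]
        results.append((label, count / k, neighbours))
    return results

# Toy usage with random embeddings standing in for the learned features.
rng = np.random.default_rng(7)
train_embed = rng.normal(size=(500, 64))
train_labels = rng.integers(0, 9, size=500)        # e.g. 9 tissue classes
test_embed = rng.normal(size=(2, 64))
for label, conf, nbrs in retrieve_and_classify(train_embed, train_labels, test_embed):
    print(f"predicted class {label} with confidence {conf:.2f}; neighbours {nbrs}")
```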


Journal ArticleDOI
Tao Zhao1, Juan Liu1, Junyi Duan1, Xin Li1, Yongtian Wang1 
TL;DR: In this paper, a gradient-limited random phase addition method is developed to avoid excessively diffusing object information, where an image is segmented into two regions according to its frequency characteristics.

Journal ArticleDOI
TL;DR: A novel blur metric based on Multiscale SVD fusion (M-SVD) fuses different sub-bands of the selected singular values (SVs) in multiscale image windows, which drastically reduces false positives in blur detection and overcomes the difficulty of sharp regions being misjudged as blurred because of their smooth texture.

Proceedings ArticleDOI
01 Oct 2019
TL;DR: A comparative analysis is provided of the most widely used gradation correction models (power, exponential, and logarithmic) applied to an overexposed digital image; these models are capable of automatic adaptation to different brightness scales.
Abstract: The paper provides a comparative analysis of the application of the most widely used gradation correction models (power, exponential, and logarithmic) to an overexposed digital image; these models are capable of automatic adaptation to different brightness scales. It discusses the features of their practical application and sets up an experiment to improve an overexposed photo. In the experiment, the test image is modified using the different gradation correction models with different parameters, which demonstrates the practical value of such modifications, and an image enhancement coefficient is given to provide a comparative analysis of the influence of the input parameters on the final result. Based on the results of the experiment, analysis and recommendations for the practical use of the models are given, which helps to solve applied tasks of digital image quality improvement.
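
The three gradation correction families compared in the paper have simple closed forms; the sketch below applies them to a normalized image, with parameter values chosen only for illustration (they are not the paper's settings).

```python
import numpy as np

def power_correction(img, gamma=2.2):
    """Power-law (gamma) correction; gamma > 1 darkens an overexposed image."""
    return img ** gamma

def log_correction(img, c=1.0):
    """Logarithmic correction; compresses high intensities."""
    return c * np.log1p(img) / np.log(2.0)

def exp_correction(img, alpha=3.0):
    """Exponential-style correction; alpha controls the remapping strength."""
    return (np.exp(alpha * img) - 1.0) / (np.exp(alpha) - 1.0)

rng = np.random.default_rng(8)
bright = np.clip(rng.normal(0.8, 0.1, size=(64, 64)), 0, 1)  # overexposed image
for name, fn in [("power", power_correction), ("log", log_correction),
                 ("exp", exp_correction)]:
    print(f"{name:5s} mean brightness: {fn(bright).mean():.3f}")
```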


Journal ArticleDOI
TL;DR: A multilevel reconstruction-based multitask joint sparse representation method, which can not only restrain the background clutter and noise but also augment the data set, is proposed in this paper.
Abstract: Template-matching-based approaches have been developed for many years in the field of synthetic aperture radar (SAR) automatic target recognition (ATR). However, the performance of template-matching-based approaches is strongly affected by two factors: background clutter and noise and the size of the data set. To solve the problems mentioned above, a multilevel reconstruction-based multitask joint sparse representation method is proposed in this paper. According to the theory of the attributed scattering center (ASC) model, a SAR image exhibits strong point-scatter-like behavior, which can be modeled by scattering centers on the target. As a result, the ASCs can be extracted from SAR images based on the ASC model. Then, ASCs extracted from SAR images are used to reconstruct the SAR target at multilevels based on energy ratio (ER). The multilevel reconstruction is a process of data augmentation, which can not only restrain the background clutter and noise but also augment the data set. Several subdictionaries are designed after multilevel reconstruction according to the label of training samples. Meanwhile, a test image chip is reconstructed into multiple test images. The random projection coefficients associated with multiple reconstructed test images are fed into a multitask joint sparse representation classification framework. The final decision is made in terms of accumulated reconstruction error. Experiments on moving and stationary target acquisition and recognition (MSTAR) data set proved the effectiveness of our method.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: In this paper, weight gradients from backpropagation were used to characterize the representation space learned by deep learning algorithms for perceptual image quality assessment and out-of-distribution classification.
Abstract: In this paper, we utilize weight gradients from backpropagation to characterize the representation space learned by deep learning algorithms. We demonstrate the utility of such gradients in applications including perceptual image quality assessment and out-of-distribution classification. The applications are chosen to validate the effectiveness of gradients as features when the test image distribution is distorted from the train image distribution. In both applications, the proposed gradient based features outperform activation features. In image quality assessment, the proposed approach is compared with other state of the art approaches and is generally the top performing method on TID 2013 and MULTI-LIVE databases in terms of accuracy, consistency, linearity, and monotonic behavior. Finally, we analyze the effect of regularization on gradients using CURE-TSR dataset for out-of-distribution classification.
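
A generic sketch of gradients-as-features is shown below: run a forward pass, form a loss against the model's own argmax prediction (so no label is needed at test time), backpropagate, and flatten the weight gradients of a chosen layer into a feature vector. The model, loss, and layer choice are assumptions and differ from the paper's exact setup.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# A small pretrained classifier stands in for the trained network.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.eval()

def gradient_feature(image_batch):
    """Return a feature vector built from weight gradients of the last layer.

    Distorted or out-of-distribution inputs tend to induce differently
    shaped gradients than clean, in-distribution training data.
    """
    model.zero_grad()
    logits = model(image_batch)
    pseudo_labels = logits.argmax(dim=1)          # self-supervised target
    loss = F.cross_entropy(logits, pseudo_labels)
    loss.backward()
    grads = model.fc.weight.grad                  # gradients of the final layer
    return grads.flatten().detach().clone()

x = torch.randn(1, 3, 224, 224)                   # placeholder test image
feat = gradient_feature(x)
print(feat.shape)                                 # torch.Size([512000]) for resnet18
```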

Journal ArticleDOI
TL;DR: A deep learning model to extract vein features by combining the Convolutional Neural Networks (CNN) model and Long Short-Term Memory (LSTM) model is proposed, which significantly improves the finger-vein verification accuracy.
Abstract: Finger-vein biometrics has been extensively investigated for personal verification. A challenge is that the finger-vein acquisition is affected by many factors, which results in many ambiguous regions in the finger-vein image. Generally, the separability between vein and background is poor in such regions. Despite recent advances in finger-vein pattern segmentation, current solutions still lack the robustness to extract finger-vein features from raw images because they do not take into account the complex spatial dependencies of vein pattern. This paper proposes a deep learning model to extract vein features by combining the Convolutional Neural Networks (CNN) model and Long Short-Term Memory (LSTM) model. Firstly, we automatically assign the label based on a combination of known state of the art handcrafted finger-vein image segmentation techniques, and generate various sequences for each labeled pixel along different directions. Secondly, several Stacked Convolutional Neural Networks and Long Short-Term Memory (SCNN-LSTM) models are independently trained on the resulting sequences. The outputs of various SCNN-LSTMs form a complementary and over-complete representation and are conjointly put into Probabilistic Support Vector Machine (P-SVM) to predict the probability of each pixel of being foreground (i.e., vein pixel) given several sequences centered on it. Thirdly, we propose a supervised encoding scheme to extract the binary vein texture. A threshold is automatically computed by taking into account the maximal separation between the inter-class distance and the intra-class distance. In our approach, the CNN learns robust features for vein texture pattern representation and LSTM stores the complex spatial dependencies of vein patterns. So, the pixels in any region of a test image can then be classified effectively. In addition, the supervised information is employed to encode the vein patterns, so the resulting encoding images contain more discriminating features. The experimental results on one public finger-vein database show that the proposed approach significantly improves the finger-vein verification accuracy.

Patent
22 Jan 2019
TL;DR: In this article, an automatic image annotation method for weakly supervised semantic segmentation is proposed, where the object border and the semantic label are regarded as a kind of weak supervised semantic label of image level.
Abstract: An automatic image annotation method for weakly supervised semantic segmentation. The object border is located by an image object detection method and assigned a semantic label. The object border and the semantic label are regarded as a form of weak, image-level semantic supervision. Using a traditional image segmentation method, the whole object region is segmented out and a segmentation template for training the classification network is generated. The segmentation template is then used as a supervisory signal to train the classification network. Finally, the trained classification network is used to semantically segment the test image. The technical proposal of the invention uses an object detection method to obtain the border and semantic tag of an object in an image, uses a traditional image segmentation method to segment the object region, and combines the result with the semantic tag to serve as a training sample for weakly supervised semantic segmentation. By automatically generating training samples for weakly supervised semantic segmentation, the method avoids the time-consuming and laborious manual labeling of large numbers of images.

Proceedings ArticleDOI
01 Nov 2019
TL;DR: The performance evaluation results demonstrate that the two proposed anomaly detection models based on deep learning can well meet the dual requirements of real-time and accuracy for anomaly detection in the high-speed industrial production scenarios.
Abstract: In the process of industrial production, anomaly detection is the key link to ensure the high quality of the product. This paper deeply studies the method of anomaly detection for industrial products based on deep learning. For the balanced image data set of industrial production products, this paper proposes a supervised anomaly detection model based on YOLOv3. This model constructs the ROI classifier to detect anomaly types. For the unbalanced image data set (only a few anomaly images) of industrial production products, this paper proposes a semi-supervised anomaly detection model based on Fast-AnoGAN. This model is built from normal samples only. It uses the trained WGAN-GP model to generate images, and achieves anomaly detection by monitoring the anomaly score which is obtained by calculating the difference between the generated image and the test image. The two proposed anomaly detection models are evaluated on both balanced and unbalanced data sets from real industrial production scenarios. The performance evaluation results demonstrate that the two proposed anomaly detection models based on deep learning can well meet the dual requirements of real-time performance and accuracy for anomaly detection in high-speed industrial production scenarios.
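
The scoring step of the semi-supervised branch, comparing the test image against the image the generator reconstructs for it, can be sketched as below in AnoGAN style; the generator and feature-matching term are replaced by placeholders, and the weighting and threshold are assumptions.

```python
import numpy as np

def anomaly_score(test_img, reconstructed, lam=0.9):
    """Residual-based anomaly score as used by AnoGAN-style detectors:
    a weighted sum of the pixel-wise reconstruction error and a (here
    simplified) feature-matching error. Higher scores mean 'more anomalous'."""
    residual = np.abs(test_img - reconstructed).mean()
    feature_err = np.abs(test_img.mean() - reconstructed.mean())  # placeholder
    return lam * residual + (1.0 - lam) * feature_err

def is_anomalous(test_img, reconstructed, threshold=0.02):
    return anomaly_score(test_img, reconstructed) > threshold

rng = np.random.default_rng(9)
normal = rng.normal(0.5, 0.02, size=(64, 64))
recon = normal + rng.normal(0, 0.01, size=(64, 64))   # generator output stand-in
defective = normal.copy()
defective[20:30, 20:30] += 0.8                         # synthetic anomaly
print(is_anomalous(normal, recon), is_anomalous(defective, recon))
```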

Journal ArticleDOI
TL;DR: A robust synthetic aperture radar (SAR) automatic target recognition (ATR) method is proposed by combining the global and local filters, especially aiming to improve the recognition performance under various extended operating conditions (EOCs).

Proceedings ArticleDOI
Hu Tian1, Fei Li1
27 May 2019
TL;DR: By exploring similarities between different patches in the whole test image, a novel autoencoder-based fabric defect detection method is proposed and the original encoded latent variable is modified, and the cross-patch similarity is introduced for determining the modification function.
Abstract: Fabric quality inspection plays an important role in the textile industry. As an effective approach to learn data representations, autoencoder has been adopted for defect detection. With the basic idea that the defect area cannot be recovered by the model trained on non-defective image patches, the residual is often used as an indication for defect judgement. However, usually the texture (non-defect) area in a defective patch also cannot be well reconstructed, which makes the pixel-wise detection inaccurate. In this paper, by exploring similarities between different patches in the whole test image, a novel autoencoder-based fabric defect detection method is proposed. In order to maintain the texture area in the reconstructed patch, the original encoded latent variable is modified, and the cross-patch similarity is introduced for determining the modification function. The whole algorithm is conducted in an iterative way, and the detection results will become better and better. Experimental results on the benchmark datasets demonstrate the effectiveness of our proposal.