
Showing papers presented at the "International Symposium on Image and Signal Processing and Analysis" in 2019


Proceedings ArticleDOI
01 Sep 2019
TL;DR: A straightforward and fast face alignment technique is used for preprocessing, and facial attributes are estimated with MobileNetV2 and NASNet-Mobile, two lightweight CNN (Convolutional Neural Network) architectures that perform similarly well in terms of accuracy and speed.
Abstract: In this paper, we propose two simple yet effective methods to estimate facial attributes in unconstrained images. We use a straightforward and fast face alignment technique for preprocessing and estimate the facial attributes using MobileNetV2 and NASNet-Mobile, two lightweight CNN (Convolutional Neural Network) architectures. Both architectures perform similarly well in terms of accuracy and speed. A comparison with state-of-the-art methods with respect to processing time and accuracy shows that our proposed approach is faster than the best state-of-the-art model and more accurate than the fastest one. Moreover, our approach is easy to use and can be deployed on mobile devices.
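
As an illustration of the kind of lightweight attribute estimator the abstract describes, here is a minimal Keras sketch: a MobileNetV2 backbone with a multi-label sigmoid head. The 40-attribute output and 224x224 aligned input are assumptions (CelebA-style), not details from the paper.

```python
# Minimal sketch (not the authors' exact setup): MobileNetV2 backbone with a
# multi-label sigmoid head; 40 binary attributes and 224x224 inputs are assumptions.
import tensorflow as tf

NUM_ATTRIBUTES = 40  # hypothetical, CelebA-style attribute set

backbone = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_ATTRIBUTES, activation="sigmoid"),  # one score per attribute
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["binary_accuracy"])
```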

49 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: It is demonstrated that Mask R-CNN can be used to localize cracks on concrete surfaces and obtain their corresponding masks, which aid in extracting other properties that are useful for inspection.
Abstract: In order to avoid possible failures and prevent damage in civil infrastructures, such as tunnels and bridges, inspection should be done on a regular basis. Cracks are one of the earliest indications of degradation; hence, their detection allows preventive measures to be taken to avoid further damage. In this paper, we demonstrate that Mask R-CNN can be used to localize cracks on concrete surfaces and obtain their corresponding masks, which aid in extracting other properties that are useful for inspection. Such a tool can help mitigate the drawbacks of manual inspection by automating crack detection, lowering the time needed to execute this task, reducing costs and increasing the safety of the personnel. To train Mask R-CNN for crack detection, we built a ground-truth database of masks on images from a subset of a standard crack dataset. Tests on the trained model achieved a precision of 93.94% and a recall of 77.5%.
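
For a concrete starting point, a hedged sketch of Mask R-CNN inference with torchvision is shown below; the paper fine-tunes the model on a crack dataset, which is not reproduced here, and the 0.5 thresholds are assumptions.

```python
# Hedged sketch: off-the-shelf torchvision Mask R-CNN used for inference only.
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 480, 640)            # stand-in for a concrete-surface photo
with torch.no_grad():
    pred = model([image])[0]               # dict with boxes, labels, scores, masks

keep = pred["scores"] > 0.5                # confidence threshold (assumption)
crack_masks = pred["masks"][keep] > 0.5    # binary masks for downstream measurements
```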

26 citations


Proceedings ArticleDOI
22 May 2019
TL;DR: The proposed end-to-end autoencoder network balances accuracy and computation cost to enable real-time use in underwater visual tasks, and is compared to a state-of-the-art method.
Abstract: Visual inspection of underwater structures by vehicles, e.g. remotely operated vehicles (ROVs), plays an important role in scientific, military, and commercial sectors. However, the automatic extraction of information using software tools is hindered by the characteristics of water, which degrade the quality of captured videos. As a contribution to restoring the color of underwater images, an Underwater Denoising Autoencoder (UDAE) model is developed using a denoising autoencoder with a U-Net architecture. The proposed network takes both accuracy and computation cost into consideration to enable real-time implementation of underwater visual tasks with an end-to-end autoencoder network. Underwater vehicles' perception is improved by reconstructing the captured frames, which yields better performance in underwater tasks. Related learning methods use generative adversarial networks (GANs) to generate color-corrected underwater images; to our knowledge, this paper is the first to show that a single autoencoder is capable of producing the same or better results. Moreover, image pairs are constructed for training the proposed network, since such a dataset is hard to obtain from underwater scenery. Finally, the proposed model is compared to a state-of-the-art method.
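
A minimal sketch of a U-Net-style denoising autoencoder follows, with a single skip connection; the actual UDAE depth, channel widths and training loss are not specified in the abstract, and the values below are placeholders.

```python
# Minimal single-skip encoder-decoder in the spirit of a U-Net denoising
# autoencoder; depth, widths and loss are placeholders, not the UDAE spec.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.up = nn.Sequential(nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU())
        self.out = nn.Conv2d(64, 3, 3, padding=1)   # 64 = 32 (skip) + 32 (upsampled)

    def forward(self, x):
        e1 = self.enc1(x)                            # full-resolution features
        b = self.down(e1)                            # bottleneck at half resolution
        u = self.up(b)                               # back to full resolution
        return torch.sigmoid(self.out(torch.cat([u, e1], dim=1)))

model = TinyUNet()
restored = model(torch.rand(1, 3, 256, 256))  # degraded frame in, color-corrected out
# trained with e.g. an L1/L2 loss against paired "clean" reference frames
```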

24 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: Evaluated on a challenging realistic video dataset, both types of models were able to correctly distinguish between normal and abnormal data sequences, with an average F-score of 0.93, and pose-estimated data was found to compare very well with sensor data.
Abstract: Detecting human abnormal activities is the process of observing rare events that deviate from normality. In this study, an automated camera-based system that is able to detect irregular human behaviour is proposed. PoseNet and OpenPose, two pre-trained pose estimation models, are used to detect the person in the frame and extract the body keypoints. These data are used to train two types of autoencoders, based on LSTM and CNN units, in a semi-supervised approach whose goal is to learn a general representation of normal behaviour. Evaluated on a challenging realistic video dataset, the results show that both types of models were able to correctly distinguish between normal and abnormal data sequences, with an average F-score of 0.93. The results also show that the proposed method outperformed similar work done on the same dataset. Furthermore, it was determined that pose-estimated data compares very well with sensor data. This shows that pose-estimated data can be informative enough to understand and classify human actions.
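
A hedged sketch of the LSTM-autoencoder variant is shown below: reconstruct keypoint windows and flag high reconstruction error as abnormal. The window length, feature size (17 keypoints x 2 coordinates) and error threshold are assumptions.

```python
# Sketch of an LSTM autoencoder over pose keypoints: trained on normal motion
# only; high reconstruction error marks a window as abnormal.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

T, F = 30, 34                              # frames per window, (x, y) per keypoint (assumption)
inp = keras.Input(shape=(T, F))
z = layers.LSTM(64)(inp)                   # encode the normal-motion manifold
x = layers.RepeatVector(T)(z)
x = layers.LSTM(64, return_sequences=True)(x)
out = layers.TimeDistributed(layers.Dense(F))(x)
autoencoder = keras.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")

# after training on normal sequences only:
window = np.random.rand(1, T, F).astype("float32")
error = float(np.mean((autoencoder.predict(window) - window) ** 2))
is_abnormal = error > 0.05                 # threshold tuned on validation data (assumption)
```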

22 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: The proposed work aims at analyzing the bees' sound, introducing feature extraction useful for sound classification techniques and for determining dangerous situations.
Abstract: The increase in bee mortality in recent years has underlined the necessity of intensive beehive monitoring in order to better understand the problems which are seriously affecting honey bee health. The sound emitted inside a beehive is one of the key parameters for non-invasive monitoring of the bees' health condition. In this context, the proposed work aims at analyzing the bees' sound, introducing feature extraction useful for sound classification techniques and for determining dangerous situations. Several experiments in a real scenario have been performed, focusing on the orphaned-colony case and highlighting the potential of the proposed approach.
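
The abstract does not name the extracted features; MFCC statistics are a common choice for hive-audio classification and serve here as a hedged stand-in (librosa), with a hypothetical input file.

```python
# Hedged stand-in features: MFCC mean/std per recording (the paper's actual
# feature set is not specified in the abstract).
import numpy as np
import librosa

y, sr = librosa.load("hive_recording.wav", sr=22050)   # hypothetical file
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)     # shape (13, n_frames)
features = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])
# `features` (length 26) can feed a classifier, e.g. for queenless-colony detection
```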

20 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: Two deep learning based methods for flaw detection, the YOLO and SSD convolutional neural networks, are presented and tested on a dataset acquired by scanning metal blocks containing different types of defects.
Abstract: Non-destructive ultrasonic testing (UT) of materials is used for monitoring critical parts in the power, aeronautics, oil and gas, and space industries. Due to the vast amount of time needed for a human expert to perform an inspection, it is practical for a computer to take over that task. Some attempts have been made to produce algorithms for automatic UT scan inspection, mainly using older, inflexible analysis methods. In this paper, two deep learning based methods for flaw detection are presented: the YOLO and SSD convolutional neural networks. The methods' performance was tested on a dataset that was acquired by scanning metal blocks containing different types of defects. YOLO achieved an average precision (AP) of 89.7%, while SSD achieved an AP of 84.5%.

18 citations


Proceedings ArticleDOI
31 May 2019
TL;DR: A new tool, called Ra, for de novo genome assembly of long uncorrected reads is presented; it is a fast and memory-friendly assembler based on sequence classification and assembly graphs, developed with large genomes in mind.
Abstract: Advances in sequencing technologies have pushed the limits of genome assemblies beyond imagination. The sheer amount of long read data that is being generated enables the assembly of even the largest and most complex organisms, for which efficient algorithms are needed. We present a new tool, called Ra, for de novo genome assembly of long uncorrected reads. It is a fast and memory-friendly assembler based on sequence classification and assembly graphs, developed with large genomes in mind. It is freely available at https://github.com/lbcbsci/ra.

18 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: A performance comparison between several classic hand-crafted and deep keypoint detector and descriptor methods is given, where a subset of all combinations is paired into detector-descriptor pipelines.
Abstract: The purpose of this study is to give a performance comparison between several classic hand-crafted and deep keypoint detector and descriptor methods. In particular, we consider the following classical algorithms: SIFT, SURF, ORB, FAST, BRISK, MSER, HARRIS, KAZE, AKAZE, AGAST, GFTT, FREAK, BRIEF and RootSIFT, where a subset of all combinations is paired into detector-descriptor pipelines. Additionally, we analyze the performance of two recent and promising deep detector-descriptor models, LF-Net and SuperPoint. Our benchmark relies on the HPSequences dataset, which provides real and diverse images under various geometric and illumination changes. We analyze the performance on three evaluation tasks: keypoint verification, image matching and keypoint retrieval. The results show that certain classic and deep approaches are still comparable, with some classic detector-descriptor combinations outperforming pretrained deep models. In terms of the execution times of the tested implementations, the SuperPoint model is the fastest, followed by ORB.
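
As a concrete example of one benchmarked detector-descriptor pipeline, here is an ORB detect-describe-match sketch in OpenCV with Lowe's ratio test; the image files and parameter values are placeholders, not the benchmark's settings.

```python
# One concrete detector-descriptor pipeline of the kind benchmarked:
# ORB keypoints + ORB descriptors, matched with a ratio test.
import cv2

img1 = cv2.imread("ref.jpg", cv2.IMREAD_GRAYSCALE)     # hypothetical image pair
img2 = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING)              # binary descriptors -> Hamming distance
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]             # Lowe's ratio test
print(f"{len(good)} putative matches")
```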

18 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: This paper proposes two distinct methods to classify pain based on temporal information and performs cross-database validations on two benchmark pain databases: BioVid and X-ITE.
Abstract: So far, all studies investigating the facial expression of pain have validated methods on the same database, whereas cross-database performance has received less attention. This may be due to the poor performance of well-trained models on other databases. In this paper, we propose two distinct methods to classify pain based on temporal information. To explore the generalization capability of pain recognition models, we perform cross-database validations on two benchmark pain databases: BioVid and X-ITE. We also experiment with combining both databases. Experimental results (1) show that our methods can be successfully used to classify pain (both methods perform similarly well), (2) demonstrate that the performance is robust under cross-database verification, and (3) show that the performance of pain assessment improves with more data (combined databases).

17 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: The goal of the dataset was to provide an alternative to existing cognitive-load-focused datasets, usually based around Stroop or working-memory tasks, and to implement the cognitive load tasks in a way that makes the responses appropriate for both speech and physiological response analysis, ultimately making the dataset multimodal.
Abstract: This paper presents a dataset for multimodal classification of cognitive load recorded on a sample of students. The cognitive load was induced by performing basic arithmetic tasks, while the multimodal aspect of the dataset comes in the form of both speech and physiological responses to those tasks. The goal of the dataset was two-fold: firstly, to provide an alternative to existing cognitive-load-focused datasets, usually based around Stroop tasks or working-memory tasks; and secondly, to implement the cognitive load tasks in a way that would make the responses appropriate for both speech and physiological response analysis, ultimately making the dataset multimodal. The paper also presents preliminary classification benchmarks, in which SVM classifiers were trained and evaluated on speech signals alone, on physiological signals alone, and on combinations of the two. The multimodal nature of the classifiers may improve results on this inherently challenging machine learning problem, because it provides more data about both the intra-participant and inter-participant differences in how cognitive load manifests itself in affective responses.
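
A hedged sketch of the described benchmark setup follows: SVM classifiers on speech features alone, physiological features alone, and their feature-level fusion. The feature dimensions and random data are placeholders for the dataset's actual features.

```python
# Sketch of the benchmark pattern: per-modality SVMs vs. feature-level fusion.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_speech = rng.normal(size=(200, 40))      # e.g. prosodic/spectral features (placeholder)
X_physio = rng.normal(size=(200, 12))      # e.g. heart-rate / skin-conductance stats (placeholder)
y = rng.integers(0, 2, size=200)           # low vs. high cognitive load

for name, X in [("speech", X_speech), ("physio", X_physio),
                ("fusion", np.hstack([X_speech, X_physio]))]:
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```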

13 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: An automated solution based on deep learning techniques using convolutional neural networks to identify the gender of a person from panoramic dental x-ray images of patients of European origin, with the images taken by a wide range of orthopantomographs.
Abstract: Identifying the gender of a person is one of the fundamental tasks in forensic medicine. One possible application is right after a catastrophic event, such as a mass disaster with a high victim count. In such cases it is necessary to identify the people involved, which can require a large number of forensic experts, depending on the scale of the event. From panoramic dental x-ray images, the biological gender of a person can be estimated by analyzing skeletal structures that express sexual dimorphism. Current methods require the manual measurement of a wide array of mandibular parameters, which are then manually compared to references based on these measurements and the assumed ethnicity of the people involved. We propose an automated solution based on deep learning techniques using convolutional neural networks. Our data consist of 4000 panoramic dental x-ray images of patients of European origin, taken by a wide range of orthopantomographs. Our automated method can process 64 images per second on contemporary hardware, requires no human intervention, and achieves state-of-the-art results with an accuracy of 96.87% ± 0.96%.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: A simple convolutional neural network was able to win the ISISPA color constancy competition, and a partial reimplementation of the neural architecture from [1] would have shown even better results in this setup.
Abstract: A simple convolutional neural network was able to win the ISISPA color constancy competition. A partial reimplementation of the neural architecture from [1] would have shown even better results in this setup.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This paper proposes a face-morphing preprocessing method to improve the ability of machine-learning-based classifiers: by using the average face, it removes person-specific facial differences and magnifies the signal of subtle expressions.
Abstract: Current facial expression classifiers are created based on face images with high-intensity expressions. Therefore, the recognition accuracy is low for subtle facial expressions. In this paper, we propose a face-morphing preprocessing method to improve the ability of a machine-learning-based classifier. That is, by using the average face, the method removes person-specific facial differences and magnifies the signal of subtle expressions. In addition, we created an artificial subtle facial expression image dataset using average and morphed faces, for use both in our algorithm and in its verification. Finally, our proposed method, used as preprocessing for a machine-learning-based recognizer, significantly improved the recognition accuracy of a facial expression recognition system such as a CNN.
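
The core preprocessing idea can be sketched in a few lines: blend each aligned face part-way toward the average face, canceling identity-specific appearance while keeping the expression signal. Real face morphing also warps landmark geometry; this pixel blend and the alpha value are simplifying assumptions.

```python
# Simplified stand-in for the morphing step: a pixel-space blend toward the
# average face (the paper's morphing presumably also warps landmark geometry).
import numpy as np

def toward_average(face: np.ndarray, avg_face: np.ndarray, alpha: float = 0.5):
    """alpha=0 returns the average face, alpha=1 the original; 0.5 is an assumption."""
    return ((1.0 - alpha) * avg_face + alpha * face).astype(face.dtype)
```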

Proceedings ArticleDOI
01 Sep 2019
TL;DR: By measuring the centrality of the various sectors in the network, it is found that the financial sector is regarded as the most important for the majority of the dataset.
Abstract: Network-based methods for studying financial markets have been popular due to their ability to represent a complex system in a simple manner. We are interested in whether we can measure the influence between various companies using partial correlation. Calculating partial correlation can be challenging with financial data, so to rectify this we use the SPACE estimator. With this estimator we infer networks from daily S&P 500 returns, study how these networks vary over time and draw parallels to the macroeconomic events that may explain the changes. We see that companies tend to have more connections to those in the same sector, and that some sectors tend to be more self-contained than others. By measuring the centrality of the various sectors in the network, we find that the financial sector is regarded as the most important for the majority of the dataset. Finally, we show there is a mild negative correlation between the centrality of a company and its out-of-sample risk.
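
For intuition, partial correlations can be read off the inverse of the correlation matrix; the sketch below shows this dense baseline, whereas the paper uses the sparse SPACE estimator (a penalized regression, not reproduced here). The data and edge threshold are placeholders.

```python
# Dense partial correlations from the precision matrix - a baseline for the
# sparse SPACE estimator the paper actually uses.
import numpy as np

returns = np.random.randn(500, 30)                 # T days x N stocks (placeholder)
R = np.corrcoef(returns, rowvar=False)             # sample correlation
P = np.linalg.inv(R)                               # precision matrix
d = np.sqrt(np.diag(P))
partial_corr = -P / np.outer(d, d)                 # standard identity for partial correlation
np.fill_diagonal(partial_corr, 1.0)

adjacency = (np.abs(partial_corr) > 0.1).astype(int)  # edge threshold is an assumption
```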

Proceedings ArticleDOI
01 Sep 2019
TL;DR: In this article, state-of-the-art kernel estimation methods are adapted to handle high noise levels while keeping their efficiency, and a fast non-blind deconvolution method is significantly improved by first denoising the blurry image.
Abstract: The goal of blind image deblurring is to recover a sharp image from a motion-blurred one without knowing the camera motion. Current state-of-the-art methods have remarkably good performance on images with no noise or very low noise levels. However, the noiseless assumption is not realistic, considering that low-light conditions are the main reason for the presence of motion blur due to the longer exposure times they require. In fact, motion blur and moderate to high noise often appear together. Most works approach this problem by first estimating the blur kernel k and then deconvolving the noisy blurred image. In this work, we first show that current state-of-the-art kernel estimation methods based on the ℓ0 gradient prior can be adapted to handle high noise levels while keeping their efficiency. Then, we show that a fast non-blind deconvolution method can be significantly improved by first denoising the blurry image. The proposed approach yields results that are equivalent to those obtained with much more computationally demanding methods.
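
A minimal sketch of the non-blind step is given below: Wiener deconvolution in the Fourier domain, meant to run after denoising the blurry image. The paper's specific denoiser and kernel estimator are not reproduced, and the noise-to-signal constant is a tuning assumption.

```python
# Classic Wiener deconvolution as a stand-in for the fast non-blind step;
# assumes the kernel is registered at the origin (use np.fft.ifftshift for a
# centered kernel).
import numpy as np

def wiener_deconvolve(blurred: np.ndarray, kernel: np.ndarray, nsr: float = 1e-2):
    """nsr: noise-to-signal power ratio; a tuning constant, not from the paper."""
    K = np.fft.fft2(kernel, s=blurred.shape)
    B = np.fft.fft2(blurred)
    X = np.conj(K) * B / (np.abs(K) ** 2 + nsr)    # Wiener filter in frequency domain
    return np.real(np.fft.ifft2(X))
```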

Proceedings ArticleDOI
01 Sep 2019
TL;DR: 3D skeletal joint information is used to create 2D (image) representations, which are transformed to the spectral domain using well-known transforms and fed to a deep Convolutional Neural Network architecture.
Abstract: In this paper we present an approach for the recognition of human actions which is based on a deep Convolutional Neural Network architecture. More specifically, 3D skeletal joint information is used to create 2D (image) representations. To compensate for potential viewpoint changes, these images are pre-processed using geometric transformations. Then, they are transformed to the spectral domain using well-known transforms. We focus on actions that are close to activities of daily living (ADLs), yet we evaluate our approach on a large-scale action dataset. We cover single-view, cross-view and cross-subject cases and thoroughly discuss the experimental results and the potential of our approach.
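
A hedged sketch of the representation: each skeleton frame is flattened into a row, rows are stacked over time into a 2D pseudo-image, and a 2D DCT moves it to the spectral domain (the paper evaluates several well-known transforms; sizes here are placeholders).

```python
# Skeleton sequence -> 2D pseudo-image -> spectral domain (2D DCT as one of
# the "well-known transforms"); frame/joint counts are placeholders.
import numpy as np
from scipy.fft import dctn

T, J = 64, 25                          # frames, joints (placeholders)
seq = np.random.rand(T, J, 3)          # 3D joint coordinates over time
image = seq.reshape(T, J * 3)          # (64, 75) pseudo-image, one row per frame
spectral = dctn(image, norm="ortho")   # CNN input after e.g. scaling/resizing
```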

Proceedings ArticleDOI
27 Oct 2019
TL;DR: In this paper, a new convolutional neural network architecture is proposed that looks for the regions, i.e., patches, in the image that contain the most useful information about the scene illumination.
Abstract: Achieving color constancy is an important part of the image preprocessing pipeline of contemporary digital cameras. Its goal is to eliminate the influence of the illumination color on the colors of the objects in the image scene. State-of-the-art results have been achieved with learning-based methods, especially when deep learning approaches are applied. Several methods exist that combine local patches for global illumination estimation. However, in this paper, a new convolutional neural network architecture is proposed. It is trained to look for the regions, i.e., patches, in the image that contain the most useful information about the scene illumination. This is achieved with an attention mechanism stacked on top of a pretrained convolutional neural network. Additionally, the common problem of the lack of data in color constancy benchmark datasets is alleviated by utilizing stage-wise training. Experimental results show that the proposed approach achieves competitive results.
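
A minimal sketch of attention pooling over patches follows: a learned score weights per-patch illuminant estimates into a single global RGB estimate. The layer sizes and two-head layout are assumptions, not the paper's architecture.

```python
# Attention-weighted pooling of per-patch illuminant estimates (a sketch of the
# general mechanism, not the proposed architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchAttentionPool(nn.Module):
    def __init__(self, feat_dim=128):                # feat_dim is a placeholder
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)          # how informative is this patch?
        self.illum = nn.Linear(feat_dim, 3)          # per-patch illuminant estimate

    def forward(self, patch_feats):                  # (P, feat_dim) from a CNN backbone
        w = F.softmax(self.score(patch_feats), dim=0)        # (P, 1) attention weights
        return (w * self.illum(patch_feats)).sum(dim=0)      # global RGB estimate

est = PatchAttentionPool()(torch.rand(32, 128))
```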

Proceedings ArticleDOI
17 Oct 2019
TL;DR: It is shown that using one side of the face reduces accuracy by only 0.34% at half the computation time, and it is demonstrated that using smaller portions of the face yields the expected computation reduction without suffering the same degree of accuracy loss.
Abstract: Research by psychologists has shown that subjects have a preference for one side of a face when it is expressing emotions. This paper seeks to find what accuracies can be attained when only a segment of the face is considered. We show that using one side of the face reduces accuracy by only 0.34%, at half the computation time. Various other sections of the face are evaluated for similar performance. We demonstrate that using smaller portions of the face yields the expected computation reduction but does not suffer the same degree of accuracy loss. For evaluation, we train a Convolutional Neural Network. To test which portions of a facial image are useful, the full face, half face, eyes, single eye, mouth and half of the mouth are chosen. These images come from the JAFFE, CK+ and KDEF datasets.
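
The evaluated face segments can be derived from an aligned face image by simple slicing, as sketched below; the exact crop boxes used in the paper may differ, and the eye/mouth bands here are rough assumptions.

```python
# Sketch of the crop scheme on an aligned face image; crop boxes are assumptions.
import numpy as np

face = np.zeros((128, 128, 3), dtype=np.uint8)   # aligned face (placeholder)
h, w = face.shape[:2]
segments = {
    "full": face,
    "left_half": face[:, : w // 2],
    "right_half": face[:, w // 2 :],
    "eyes": face[h // 5 : h // 2, :],            # rough horizontal band (assumption)
    "mouth": face[2 * h // 3 :, :],              # rough horizontal band (assumption)
}
```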

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This paper focuses on the problem of cell segmentation in digitized Pap smear images, a prerequisite of automatically detecting cervical cancer in its early stage, and applies an ensemble of FCNN and traditional segmentation approaches that provides sufficiently large diversity with respect to the most challenging manual-annotation-related issues.
Abstract: In this paper, we focus on the problem of cell segmentation in digitized Pap smear images, which is a prerequisite of automatically detecting cervical cancer in its early stage. Following current trends, we consider deep learning based approaches in the form of fully convolutional neural networks (FCNNs). A common bottleneck of deep learning is that a large annotated dataset is required for proper training. As large public datasets are not yet available in this field, we have composed a corresponding manually labeled dataset. Though this dataset is quite large, manual annotation is less reliable in this domain, so we had to apply a deep learning framework that is able to overcome this issue. Accordingly, we have applied an ensemble of FCNN and traditional segmentation approaches that provides sufficiently large diversity with respect to the most challenging manual-annotation-related issues, such as the inaccurate selection of cell boundaries. We propose ensembles to merge the outputs of the different segmentation methods, and these ensembles have proven superior to any of their members in our experimental studies.
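
One simple way to merge the ensemble members' outputs is a pixel-wise majority vote over binary cell masks, sketched below; the paper's merging strategy may be more elaborate.

```python
# Pixel-wise majority vote over binary masks from different segmenters
# (ties, for an even member count, are counted as foreground here).
import numpy as np

def majority_vote(masks):
    """masks: list of equally sized binary (H, W) arrays from the ensemble members."""
    stack = np.stack(masks).astype(np.uint8)
    return (stack.sum(axis=0) * 2 >= len(masks)).astype(np.uint8)
```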

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This work fine-tunes a model from the Two-Stream Inflated 3D architecture, pre-trained on the ImageNet and Kinetics source datasets, to classify video sequences of crowd movements from the Crowd-11 target dataset, demonstrating its superiority over the state-of-the-art in terms of classification accuracy.
Abstract: The automatic recognition of a crowd movement captured by a CCTV camera can be of considerable help to security forces whose mission is to ensure the safety of people in public areas. In this context, we propose to fine-tune a model from the Two-Stream Inflated 3D architecture, pre-trained on the ImageNet and Kinetics source datasets, to classify video sequences of crowd movements from the Crowd-11 target dataset. The evaluation of our model demonstrates its superiority over the state-of-the-art in terms of classification accuracy.
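
To illustrate the fine-tuning pattern (pretrained video backbone, new 11-class head for Crowd-11), here is a hedged sketch; torchvision ships no I3D model, so a Kinetics-pretrained R3D-18 stands in for the Two-Stream Inflated 3D architecture.

```python
# Fine-tuning pattern only: R3D-18 is a stand-in backbone, not the paper's I3D.
import torch.nn as nn
import torchvision

model = torchvision.models.video.r3d_18(weights="DEFAULT")  # Kinetics-pretrained
model.fc = nn.Linear(model.fc.in_features, 11)              # new head for 11 Crowd-11 classes
# then train with a small learning rate on clips of shape (B, 3, T, H, W)
```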

Proceedings ArticleDOI
24 Oct 2019
TL;DR: It is found that shrinkage methods generally improve out-of-sample portfolio performance, and that the proposed cluster-based method yields improved results, with portfolios that outperform the other considered methods.
Abstract: The estimation of correlation and covariance matrices from asset return time series is a critical step in financial portfolio optimization. Although sample estimates are reliable when the length of time series is very large compared to the number of assets, in high-dimensional settings estimation issues arise. To reduce estimation errors and mitigate their propagation to out-of-sample performance of portfolios based on noisy estimates, shrinkage methods are applied. In this paper we consider several shrinkage methods for correlation matrix estimation and define a cluster-based shrinkage procedure which introduces information about the structures of communities identified in asset dependence graphs. To test the considered shrinkage methods we apply them in a portfolio optimization scenario using the global minimum variance portfolio, and perform backtests on a large sample of NYSE daily stock return data. We find that shrinkage methods generally improve out-of-sample portfolio performance, and the proposed cluster-based method yields improved results and portfolios which outperform other considered methods.
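
For reference, plain linear shrinkage of the sample correlation matrix toward the identity looks as follows; the paper's cluster-based variant instead shrinks toward a block target built from detected communities, and the intensity value below is arbitrary.

```python
# Linear shrinkage toward the identity target (the simplest of the considered
# family; the cluster-based method replaces np.eye with a block target).
import numpy as np

def shrink_correlation(R_sample: np.ndarray, delta: float = 0.3):
    """delta in [0, 1]: shrinkage intensity (0.3 is an arbitrary illustration)."""
    N = R_sample.shape[0]
    return (1.0 - delta) * R_sample + delta * np.eye(N)
```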

Proceedings ArticleDOI
23 Sep 2019
TL;DR: This work proposes to first automatically construct negative samples through linear interpolation of paired natural and colorized images, and then progressively insert these negative samples into the original training dataset while continuing to train the network.
Abstract: Image colorization achieves more and more realistic results with the increasing power of recent deep learning techniques, and it is becoming more difficult to identify synthetic colorized images by eye. In the literature, handcrafted-feature-based and convolutional neural network (CNN)-based forensic methods have been proposed to distinguish between natural images (NIs) and colorized images (CIs). Although a recent CNN-based method achieves very good detection performance, an important issue, the blind detection problem, remains and has not been thoroughly studied. In this work, we focus on this challenging scenario of blind detection, i.e., no training sample is available from the "unknown" colorization algorithm that we may encounter during the testing phase. This blind detection performance can be regarded as the generalization capability of a forensic detector. In this paper, we propose to first automatically construct negative samples through linear interpolation of paired natural and colorized images. Then, we progressively insert these negative samples into the original training dataset and continue to train the network. Experimental results demonstrate that our enhanced training can significantly improve the generalization performance of different CNN models.
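
The enhanced-training trick reduces to one line of arithmetic: a negative sample is a convex combination of a natural image and its colorized counterpart, as sketched below (the sampling scheme for lam is an assumption).

```python
# Negative-sample construction by linear interpolation of a paired natural and
# colorized image; how lam is scheduled during training is an assumption.
import numpy as np

def make_negative(natural: np.ndarray, colorized: np.ndarray, lam: float):
    """lam in (0, 1]; intermediate lam yields near-natural colorized negatives."""
    x = (1.0 - lam) * natural.astype(np.float32) + lam * colorized.astype(np.float32)
    return np.clip(x, 0, 255).astype(np.uint8)
```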

Proceedings ArticleDOI
01 Sep 2019
TL;DR: The proposed work addresses paint loss detection in digitized paintings with a multiscale deep learning system using dilated convolutions, which enable a large receptive field with limited training parameters to avoid overtraining.
Abstract: We explore the potential of deep learning in digital painting analysis to facilitate condition reporting and to support restoration treatments. We address the problem of paint loss detection and develop a multiscale deep learning system with dilated convolutions that enables a large receptive field with limited training parameters to avoid overtraining. Our model efficiently handles the multimodal data that are typically acquired in art investigation. As a case study we use multimodal data of the Ghent Altarpiece. Our results indicate the great potential of the proposed approach in terms of accuracy, as well as its fast execution, which allows interactivity and continuous learning.
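
The receptive-field idea can be sketched with a stack of 3x3 convolutions of growing dilation, which widens context without extra parameters per layer; the channel counts and the four-channel multimodal input below are placeholders.

```python
# Dilated-convolution stack: dilation 1, 2, 4 grows the receptive field while
# keeping parameter count fixed per layer (sizes are placeholders).
import torch
import torch.nn as nn

multiscale = nn.Sequential(
    nn.Conv2d(4, 32, 3, padding=1, dilation=1), nn.ReLU(),   # 4 = multimodal input channels (assumption)
    nn.Conv2d(32, 32, 3, padding=2, dilation=2), nn.ReLU(),  # padding = dilation keeps spatial size
    nn.Conv2d(32, 32, 3, padding=4, dilation=4), nn.ReLU(),
    nn.Conv2d(32, 1, 1),                                     # paint-loss probability map
)
out = torch.sigmoid(multiscale(torch.rand(1, 4, 128, 128)))
```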

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This work takes advantage of the SMPL model, which provides human body models in many shapes and sizes in an essentially automatic fashion, thereby avoiding a cumbersome procedure of manual collection and preparation of training data.
Abstract: In the case of structured data, such as 2D images, many variants of traditional convolutional neural network architectures have been successfully proposed. Learning from unstructured sets of data, such as sets of 3D point clouds, is a challenging task for numerous reasons, the two most important being that a 3D point cloud is generally (i) unordered and (ii) a sparse data set. Therefore, architectures have been proposed which are invariant to both the ordering and the number of points in the point cloud. PointNet is one such architecture, originally introduced and demonstrated on the tasks of classification and segmentation on the ModelNet40 data set. In this work we study the performance of PointNet on an even more demanding task: segmentation of human body parts. Finding enough training data of sufficient quality is a general problem in deep learning, and especially so for human body segmentation. To that end we take advantage of the SMPL model, which provides human body models in many shapes and sizes in an essentially automatic fashion, thereby avoiding a cumbersome procedure of manual collection and preparation of training data. Our results show that the proposed PointNet variant trained using the SMPL model provides competitive segmentation results on the task of human body segmentation.
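
For context, the order-invariance trick at the heart of PointNet is a shared per-point MLP followed by a symmetric max-pool, as sketched below; layer sizes are placeholders, and the segmentation variant additionally concatenates the global feature back to each point.

```python
# The PointNet core idea: a shared per-point MLP plus a symmetric (max) pooling
# makes the output invariant to point ordering and count.
import torch
import torch.nn as nn

class PointNetCore(nn.Module):
    def __init__(self, classes=10):                  # class count is a placeholder
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                 nn.Linear(64, 128), nn.ReLU())
        self.head = nn.Linear(128, classes)

    def forward(self, pts):                 # pts: (B, N, 3), any N, any point order
        f = self.mlp(pts)                   # shared weights applied to every point
        g = f.max(dim=1).values             # symmetric pooling -> permutation invariance
        return self.head(g)

logits = PointNetCore()(torch.rand(2, 1024, 3))
```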

Proceedings ArticleDOI
01 Sep 2019
TL;DR: Two submissions to the Illumination Estimation Challenge are introduced: the Fourier-transform-based submission ranked 3rd, and the statistical gray-pixel-based one ranked 6th.
Abstract: We briefly introduce two submissions to the Illumination Estimation Challenge, held at the Int'l Workshop on Color Vision, affiliated with the 11th Int'l Symposium on Image and Signal Processing and Analysis. The Fourier-transform-based submission ranked 3rd, and the statistical gray-pixel-based one ranked 6th.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: A history of the Color Checker dataset usage is given, the origins and reasons for its misuses are described, and the old and new mistakes introduced in the most recent publications that tried to handle the issue are explained, in order to prevent similar future misuses.
Abstract: The pipelines of digital cameras contain a part for computational color constancy, which aims to remove the influence of the illumination on the scene colors. One of the best known and most widely used benchmark datasets for this problem is the Color Checker dataset. However, due to the improper handling of the black level in its images, this dataset has been widely misused and while some recent publications tried to alleviate the problem, they nevertheless erred and created additional wrong data. This paper gives a history of the Color Checker dataset usage, it describes the origins and reasons for its misuses, and it explains the old and new mistakes introduced in the most recent publications that tried to handle the issue. This should, hopefully, help to prevent similar future misuses.

Proceedings ArticleDOI
19 Mar 2019
TL;DR: The use of Chebyshev polynomials of the first kind on intervals is proposed for an efficient representation of one-dimensional, continuous-time signals, offering a new paradigm for efficient processing of analog data on a digital computer.
Abstract: Compressed sensing (CS) is a technique for signal sampling below the Nyquist rate, based on the assumption that the signal is sparse in some transform domain. The acquired signal is represented in a compressed form that is appropriate for storage, transmission and further processing. In this paper, the use of Chebyshev polynomials of the first kind on intervals is proposed for an efficient representation of one-dimensional, continuous-time signals. To avoid boundary artifacts, a desired number of derivatives are equalized at each interval end in a spline-like fashion. Unlike splines, the proposed system of equations is underdetermined, providing the degree of freedom necessary for achieving sparsity via ℓ1 optimization. The obtained parametric model fits into the compressed sensing setup and offers a new paradigm for efficient processing of analog data on a digital computer. Simulation results for the proposed measurement system and an example of data processing are given to prove its potential.
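
A Chebyshev representation of a signal segment takes only a few NumPy calls, as sketched below; the paper additionally equalizes derivatives across interval ends and recovers sparse coefficients via ℓ1 optimization, which is omitted here.

```python
# Chebyshev-series representation of one signal segment (boundary-derivative
# constraints and the l1-sparse recovery from the paper are omitted).
import numpy as np
from numpy.polynomial import chebyshev as C

t = np.linspace(-1.0, 1.0, 400)               # one interval, normalized to [-1, 1]
y = np.cos(6 * np.pi * t) + 0.3 * t           # toy continuous-time segment
coeffs = C.chebfit(t, y, deg=25)              # Chebyshev coefficients of the segment
y_hat = C.chebval(t, coeffs)                  # reconstruction from the coefficients
print("max error:", np.max(np.abs(y - y_hat)))
```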

Proceedings ArticleDOI
01 Sep 2019
TL;DR: In this study, a U-Net type model with a ResNet-based encoder pretrained on the ImageNet dataset is combined with a postprocessing step to obtain segmented layer boundaries; a one-sided Wilcoxon signed-rank test shows that the pretrained U-Net type model outperforms the original U-Net model in segmenting three regions bounded by four layer boundaries.
Abstract: Retinal layer analysis on OCT images is a standard procedure used by ophthalmologists to diagnose various diseases. Due to the large number of OCT images generated for each patient, manual image analysis can be time-consuming and error-prone, which can consequently affect the timeliness and quality of the diagnosis. Therefore, in recent years, a variety of methods, based predominantly on deep learning, have been proposed for the automatic segmentation of retinal layers. In our study, a U-Net type model with a ResNet-based encoder, pretrained on the ImageNet dataset, is utilized. In addition, the model is combined with a postprocessing step to obtain segmented layer boundaries. Modified versions of the U-Net type model have already been applied to various non-medical image segmentation tasks, achieving outstanding results. To investigate whether the pretrained U-Net type model improves retinal layer segmentation, two models are trained and validated on 23 volumes of OCT images with age-related macular degeneration (AMD): a U-Net model with a ResNet34 encoder pretrained on the ImageNet dataset, and the original U-Net model trained from scratch. A one-sided Wilcoxon signed-rank test has shown that the pretrained U-Net type model outperforms the original U-Net model in segmenting three regions bounded by four layer boundaries.
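
One common way to build such a model (the paper's exact code base is unknown) is the segmentation_models_pytorch library, which provides a U-Net with a pretrained ResNet34 encoder out of the box; the class count below is an assumption.

```python
# One library option for "U-Net with a pretrained ResNet34 encoder"; not
# necessarily what the authors used.
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet34",
    encoder_weights="imagenet",   # the pretraining under comparison
    in_channels=1,                # OCT B-scans are grayscale
    classes=3,                    # three regions bounded by four boundaries (assumption)
)
```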

Proceedings ArticleDOI
01 Sep 2019
TL;DR: A sparse reconstruction algorithm combining the fast intersection of confidence intervals (FICI) rule and the two-step iterative shrinkage/thresholding algorithm (TwIST), denoted the FICI-TwIST algorithm, is applied to modeled and real-life teleseismic signals to achieve a high-resolution sparse TFD with heavily suppressed cross-terms.
Abstract: Time-frequency distributions (TFDs) are powerful tools for the analysis of non-stationary signals; however, they are heavily under-used, since most TFD calculation methods introduce unwanted artifacts, so-called cross-terms. A recently investigated approach to TFD cross-term removal enforces a sparsity constraint on the resulting TFD, ultimately leading to a high-resolution sparse TFD with heavily suppressed cross-terms. In this paper, we apply a sparse reconstruction algorithm which combines the fast intersection of confidence intervals (FICI) rule and the two-step iterative shrinkage/thresholding algorithm (TwIST), denoted the FICI-TwIST algorithm, to modeled and real-life teleseismic signals. The obtained results are compared, in terms of the resulting TFD concentration and the algorithm execution time, to state-of-the-art sparse reconstruction algorithms.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: Improvements to the Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) and a novel neural network are proposed; results show that these propositions can decrease the NMSE value by up to 10.8 dB and require fewer layers than when only LISTA is used to estimate the signal.
Abstract: Compressive sensing enables the recovery of sparse signals from fewer measurements than required by the Nyquist rate, thus leading to energy and processing savings. Accuracy and complexity improvements can be achieved by applying neural networks to the sparse linear inverse problem. This work focuses on sparse recovery with deep networks. Improvements to the Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) and a novel neural network are proposed. Results show that these propositions can decrease the NMSE value by up to 10.8 dB and require fewer layers than when only LISTA is used to estimate the signal.
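
For reference, a minimal LISTA block, the baseline the paper improves upon, is sketched below: learned matrices W and S plus per-layer soft-thresholds replace the fixed operators of ISTA. Dimensions and layer count are placeholders.

```python
# Minimal LISTA baseline: x_{t+1} = soft(W y + S x_t, theta_t) with learned
# W, S and per-layer thresholds theta_t.
import torch
import torch.nn as nn

def soft_threshold(x, theta):
    return torch.sign(x) * torch.clamp(x.abs() - theta, min=0.0)

class LISTA(nn.Module):
    def __init__(self, m=64, n=256, layers=6):   # m measurements, n-dim sparse code (placeholders)
        super().__init__()
        self.W = nn.Linear(m, n, bias=False)
        self.S = nn.Linear(n, n, bias=False)
        self.theta = nn.Parameter(torch.full((layers,), 0.1))
        self.layers = layers

    def forward(self, y):                        # y: (B, m) measurements
        x = soft_threshold(self.W(y), self.theta[0])
        for t in range(1, self.layers):
            x = soft_threshold(self.W(y) + self.S(x), self.theta[t])
        return x                                 # (B, n) sparse estimate

x_hat = LISTA()(torch.rand(8, 64))
```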