
Showing papers presented at the "International Symposium on Image and Signal Processing and Analysis" in 2019


Proceedings ArticleDOI
01 Sep 2019
TL;DR: A straightforward and fast face alignment technique is used for preprocessing, and facial attributes are estimated with MobileNetV2 and NASNet-Mobile, two lightweight CNN (Convolutional Neural Network) architectures that perform similarly well in terms of accuracy and speed.
Abstract: In this paper, we propose two simple yet effective methods to estimate facial attributes in unconstrained images. We use a straightforward and fast face alignment technique for preprocessing and estimate the facial attributes using MobileNetV2 and NASNet-Mobile, two lightweight CNN (Convolutional Neural Network) architectures. Both architectures perform similarly well in terms of accuracy and speed. A comparison with state-of-the-art methods with respect to processing time and accuracy shows that our proposed approach is faster than the best state-of-the-art model and more accurate than the fastest one. Moreover, our approach is easy to use and can be deployed on mobile devices.
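
As an illustration of the kind of lightweight attribute estimator the abstract describes, here is a minimal Keras sketch: a MobileNetV2 backbone with a multi-label sigmoid head. The 40-attribute output and 224x224 aligned input are assumptions (CelebA-style), not details from the paper.

```python
# Minimal sketch (not the authors' exact setup): MobileNetV2 backbone with a
# multi-label sigmoid head; 40 binary attributes and 224x224 inputs are assumptions.
import tensorflow as tf

NUM_ATTRIBUTES = 40  # hypothetical, CelebA-style attribute set

backbone = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_ATTRIBUTES, activation="sigmoid"),  # one score per attribute
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["binary_accuracy"])
```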

49 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: It is demonstrated that Mask R-CNN can be used to localize cracks on concrete surfaces and obtain their corresponding masks, which aid in extracting other properties that are useful for inspection.
Abstract: In order to avoid possible failures and prevent damage in civil infrastructures, such as tunnels and bridges, inspection should be done on a regular basis. Cracks are one of the earliest indications of degradation; hence, their detection allows preventive measures to be taken to avoid further damage. In this paper, we demonstrate that Mask R-CNN can be used to localize cracks on concrete surfaces and obtain their corresponding masks, which aid in extracting other properties that are useful for inspection. Such a tool can help mitigate the drawbacks of manual inspection by automating crack detection, lowering the time needed to execute this task, reducing costs and increasing the safety of the personnel. To train Mask R-CNN for crack detection, we built a ground-truth database of masks on images from a subset of a standard crack dataset. Tests on the trained model achieved a precision of 93.94% and a recall of 77.5%.
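
For a concrete starting point, a hedged sketch of Mask R-CNN inference with torchvision is shown below; the paper fine-tunes the model on a crack dataset, which is not reproduced here, and the 0.5 thresholds are assumptions.

```python
# Hedged sketch: off-the-shelf torchvision Mask R-CNN used for inference only.
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 480, 640)            # stand-in for a concrete-surface photo
with torch.no_grad():
    pred = model([image])[0]               # dict with boxes, labels, scores, masks

keep = pred["scores"] > 0.5                # confidence threshold (assumption)
crack_masks = pred["masks"][keep] > 0.5    # binary masks for downstream measurements
```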

26 citations


Proceedings ArticleDOI
22 May 2019
TL;DR: The proposed end-to-end autoencoder network balances accuracy and computation cost to enable real-time use in underwater visual tasks, and is compared to a state-of-the-art method.
Abstract: Visual inspection of underwater structures by vehicles, e.g. remotely operated vehicles (ROVs), plays an important role in scientific, military, and commercial sectors. However, the automatic extraction of information using software tools is hindered by the characteristics of water, which degrade the quality of captured videos. As a contribution to restoring the color of underwater images, an Underwater Denoising Autoencoder (UDAE) model is developed using a denoising autoencoder with a U-Net architecture. The proposed network takes both accuracy and computation cost into consideration to enable real-time implementation of underwater visual tasks with an end-to-end autoencoder network. Underwater vehicles' perception is improved by reconstructing the captured frames, which yields better performance in underwater tasks. Related learning methods use generative adversarial networks (GANs) to generate color-corrected underwater images; to our knowledge, this paper is the first to show that a single autoencoder is capable of producing the same or better results. Moreover, image pairs are constructed for training the proposed network, since such a dataset is hard to obtain from underwater scenery. Finally, the proposed model is compared to a state-of-the-art method.
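
A minimal sketch of a U-Net-style denoising autoencoder follows, with a single skip connection; the actual UDAE depth, channel widths and training loss are not specified in the abstract, and the values below are placeholders.

```python
# Minimal single-skip encoder-decoder in the spirit of a U-Net denoising
# autoencoder; depth, widths and loss are placeholders, not the UDAE spec.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.up = nn.Sequential(nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU())
        self.out = nn.Conv2d(64, 3, 3, padding=1)   # 64 = 32 (skip) + 32 (upsampled)

    def forward(self, x):
        e1 = self.enc1(x)                            # full-resolution features
        b = self.down(e1)                            # bottleneck at half resolution
        u = self.up(b)                               # back to full resolution
        return torch.sigmoid(self.out(torch.cat([u, e1], dim=1)))

model = TinyUNet()
restored = model(torch.rand(1, 3, 256, 256))  # degraded frame in, color-corrected out
# trained with e.g. an L1/L2 loss against paired "clean" reference frames
```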

24 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: Evaluated on a challenging realistic video dataset, both types of models were able to correctly distinguish between normal and abnormal data sequences, with an average F-score of 0.93, and pose-estimated data was found to compare very well with sensor data.
Abstract: Detecting human abnormal activities is the process of observing rare events that deviate from normality. In this study, an automated camera-based system that is able to detect irregular human behaviour is proposed. PoseNet and OpenPose, two pre-trained pose estimation models, are used to detect the person in the frame and extract the body keypoints. These data are used to train two types of autoencoders, based on LSTM and CNN units, in a semi-supervised approach whose goal is to learn a general representation of normal behaviour. Evaluated on a challenging realistic video dataset, the results show that both types of models were able to correctly distinguish between normal and abnormal data sequences, with an average F-score of 0.93. The results also show that the proposed method outperformed similar work done on the same dataset. Furthermore, it was determined that pose-estimated data compares very well with sensor data. This shows that pose-estimated data can be informative enough to understand and classify human actions.
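
A hedged sketch of the LSTM-autoencoder variant is shown below: reconstruct keypoint windows and flag high reconstruction error as abnormal. The window length, feature size (17 keypoints x 2 coordinates) and error threshold are assumptions.

```python
# Sketch of an LSTM autoencoder over pose keypoints: trained on normal motion
# only; high reconstruction error marks a window as abnormal.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

T, F = 30, 34                              # frames per window, (x, y) per keypoint (assumption)
inp = keras.Input(shape=(T, F))
z = layers.LSTM(64)(inp)                   # encode the normal-motion manifold
x = layers.RepeatVector(T)(z)
x = layers.LSTM(64, return_sequences=True)(x)
out = layers.TimeDistributed(layers.Dense(F))(x)
autoencoder = keras.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")

# after training on normal sequences only:
window = np.random.rand(1, T, F).astype("float32")
error = float(np.mean((autoencoder.predict(window) - window) ** 2))
is_abnormal = error > 0.05                 # threshold tuned on validation data (assumption)
```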

22 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: The proposed work aims at analyzing the bees' sound, introducing feature extraction useful for sound classification techniques and for determining dangerous situations.
Abstract: The increase in bee mortality in recent years has underlined the necessity of intensive beehive monitoring in order to better understand the problems which are seriously affecting honey bee health. The sound emitted inside a beehive is one of the key parameters for non-invasive monitoring of the bees' health condition. In this context, the proposed work aims at analyzing the bees' sound, introducing feature extraction useful for sound classification techniques and for determining dangerous situations. Several experiments in a real scenario have been performed, focusing on the orphaned-colony case and highlighting the potential of the proposed approach.
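
The abstract does not name the extracted features; MFCC statistics are a common choice for hive-audio classification and serve here as a hedged stand-in (librosa), with a hypothetical input file.

```python
# Hedged stand-in features: MFCC mean/std per recording (the paper's actual
# feature set is not specified in the abstract).
import numpy as np
import librosa

y, sr = librosa.load("hive_recording.wav", sr=22050)   # hypothetical file
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)     # shape (13, n_frames)
features = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])
# `features` (length 26) can feed a classifier, e.g. for queenless-colony detection
```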

20 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: Two deep learning based methods for flaw detection, the YOLO and SSD convolutional neural networks, are presented and tested on a dataset acquired by scanning metal blocks containing different types of defects.
Abstract: Non-destructive ultrasonic testing (UT) of materials is used for monitoring critical parts in the power, aeronautics, oil and gas, and space industries. Due to the vast amount of time needed for a human expert to perform an inspection, it is practical for a computer to take over that task. Some attempts have been made to produce algorithms for automatic UT scan inspection, mainly using older, inflexible analysis methods. In this paper, two deep learning based methods for flaw detection are presented: the YOLO and SSD convolutional neural networks. The methods' performance was tested on a dataset that was acquired by scanning metal blocks containing different types of defects. YOLO achieved an average precision (AP) of 89.7%, while SSD achieved an AP of 84.5%.

18 citations


Proceedings ArticleDOI
31 May 2019
TL;DR: A new tool, called Ra, for de novo genome assembly of long uncorrected reads is presented; it is a fast and memory-friendly assembler based on sequence classification and assembly graphs, developed with large genomes in mind.
Abstract: Advances in sequencing technologies have pushed the limits of genome assemblies beyond imagination. The sheer amount of long read data that is being generated enables the assembly of even the largest and most complex organisms, for which efficient algorithms are needed. We present a new tool, called Ra, for de novo genome assembly of long uncorrected reads. It is a fast and memory-friendly assembler based on sequence classification and assembly graphs, developed with large genomes in mind. It is freely available at https://github.com/lbcbsci/ra.

18 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: A performance comparison between several classic hand-crafted and deep keypoint detector and descriptor methods is given, where a subset of all combinations is paired into detector-descriptor pipelines.
Abstract: The purpose of this study is to give a performance comparison between several classic hand-crafted and deep keypoint detector and descriptor methods. In particular, we consider the following classical algorithms: SIFT, SURF, ORB, FAST, BRISK, MSER, HARRIS, KAZE, AKAZE, AGAST, GFTT, FREAK, BRIEF and RootSIFT, where a subset of all combinations is paired into detector-descriptor pipelines. Additionally, we analyze the performance of two recent and promising deep detector-descriptor models, LF-Net and SuperPoint. Our benchmark relies on the HPSequences dataset, which provides real and diverse images under various geometric and illumination changes. We analyze the performance on three evaluation tasks: keypoint verification, image matching and keypoint retrieval. The results show that certain classic and deep approaches are still comparable, with some classic detector-descriptor combinations outperforming pretrained deep models. In terms of the execution times of the tested implementations, the SuperPoint model is the fastest, followed by ORB.
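
As a concrete example of one benchmarked detector-descriptor pipeline, here is an ORB detect-describe-match sketch in OpenCV with Lowe's ratio test; the image files and parameter values are placeholders, not the benchmark's settings.

```python
# One concrete detector-descriptor pipeline of the kind benchmarked:
# ORB keypoints + ORB descriptors, matched with a ratio test.
import cv2

img1 = cv2.imread("ref.jpg", cv2.IMREAD_GRAYSCALE)     # hypothetical image pair
img2 = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING)              # binary descriptors -> Hamming distance
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]             # Lowe's ratio test
print(f"{len(good)} putative matches")
```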

18 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: This paper proposes two distinct methods to classify pain based on temporal information and performs cross-database validations on two benchmark pain databases: BioVid and X-ITE.
Abstract: So far, all studies investigating the facial expression of pain have validated methods on the same database, whereas cross-database performance has received less attention. This may be due to the poor performance of well-trained models on other databases. In this paper, we propose two distinct methods to classify pain based on temporal information. To explore the generalization capability of pain recognition models, we perform cross-database validations on two benchmark pain databases: BioVid and X-ITE. We also experiment with combining both databases. Experimental results (1) show that our methods can be successfully used to classify pain (both methods perform similarly well), (2) demonstrate that the performance is robust under cross-database verification, and (3) show that the performance of pain assessment improves with more data (combined databases).

17 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: The goal of the dataset was to provide an alternative to existing cognitive-load-focused datasets, usually based around Stroop or working-memory tasks, and to implement the cognitive load tasks in a way that makes the responses appropriate for both speech and physiological response analysis, ultimately making the dataset multimodal.
Abstract: This paper presents a dataset for multimodal classification of cognitive load recorded on a sample of students. The cognitive load was induced by performing basic arithmetic tasks, while the multimodal aspect of the dataset comes in the form of both speech and physiological responses to those tasks. The goal of the dataset was two-fold: firstly, to provide an alternative to existing cognitive-load-focused datasets, usually based around Stroop tasks or working-memory tasks; and secondly, to implement the cognitive load tasks in a way that would make the responses appropriate for both speech and physiological response analysis, ultimately making the dataset multimodal. The paper also presents preliminary classification benchmarks, in which SVM classifiers were trained and evaluated on speech signals alone, on physiological signals alone, and on combinations of the two. The multimodal nature of the classifiers may improve results on this inherently challenging machine learning problem, because it provides more data about both the intra-participant and inter-participant differences in how cognitive load manifests itself in affective responses.
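
A hedged sketch of the described benchmark setup follows: SVM classifiers on speech features alone, physiological features alone, and their feature-level fusion. The feature dimensions and random data are placeholders for the dataset's actual features.

```python
# Sketch of the benchmark pattern: per-modality SVMs vs. feature-level fusion.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_speech = rng.normal(size=(200, 40))      # e.g. prosodic/spectral features (placeholder)
X_physio = rng.normal(size=(200, 12))      # e.g. heart-rate / skin-conductance stats (placeholder)
y = rng.integers(0, 2, size=200)           # low vs. high cognitive load

for name, X in [("speech", X_speech), ("physio", X_physio),
                ("fusion", np.hstack([X_speech, X_physio]))]:
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```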

13 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: An automated solution based on deep learning techniques using convolutional neural networks to identify the gender of a person from panoramic dental x-ray images of patients of European origin, with the images taken by a wide range of orthopantomographs.
Abstract: Identifying the gender of a person is one of the fundamental tasks in forensic medicine. One possible application is right after a catastrophic event, such as a mass disaster with a high victim count. In such cases it is necessary to identify the people involved, which can require a large number of forensic experts, depending on the scale of the event. From panoramic dental x-ray images, the biological gender of a person can be estimated by analyzing skeletal structures that express sexual dimorphism. Current methods require the manual measurement of a wide array of mandibular parameters, which are then manually compared to references based on these measurements and the assumed ethnicity of the people involved. We propose an automated solution based on deep learning techniques using convolutional neural networks. Our data consist of 4000 panoramic dental x-ray images of patients of European origin, taken by a wide range of orthopantomographs. Our automated method can process 64 images per second on contemporary hardware, requires no human intervention, and achieves state-of-the-art results with an accuracy of 96.87% ± 0.96%.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: A simple convolutional neural network was able to win the ISISPA color constancy competition, and a partial reimplementation of the neural architecture from [1] would have shown even better results in this setup.
Abstract: A simple convolutional neural network was able to win the ISISPA color constancy competition. A partial reimplementation of the neural architecture from [1] would have shown even better results in this setup.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This paper proposes a face-morphing preprocessing method to improve the ability of machine-learning-based classifiers: by using the average face, it removes person-specific facial differences and magnifies the signal of subtle expressions.
Abstract: Current facial expression classifiers are created based on face images with high-intensity expressions. Therefore, the recognition accuracy is low for subtle facial expressions. In this paper, we propose a face-morphing preprocessing method to improve the ability of a machine-learning-based classifier. That is, by using the average face, the method removes person-specific facial differences and magnifies the signal of subtle expressions. In addition, we created an artificial subtle facial expression image dataset using average and morphed faces, for use both in our algorithm and in its verification. Finally, our proposed method, used as preprocessing for a machine-learning-based recognizer, significantly improved the recognition accuracy of a facial expression recognition system such as a CNN.
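
The core preprocessing idea can be sketched in a few lines: blend each aligned face part-way toward the average face, canceling identity-specific appearance while keeping the expression signal. Real face morphing also warps landmark geometry; this pixel blend and the alpha value are simplifying assumptions.

```python
# Simplified stand-in for the morphing step: a pixel-space blend toward the
# average face (the paper's morphing presumably also warps landmark geometry).
import numpy as np

def toward_average(face: np.ndarray, avg_face: np.ndarray, alpha: float = 0.5):
    """alpha=0 returns the average face, alpha=1 the original; 0.5 is an assumption."""
    return ((1.0 - alpha) * avg_face + alpha * face).astype(face.dtype)
```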

Proceedings ArticleDOI
01 Sep 2019
TL;DR: By measuring the centrality of the various sectors in the network, it is found that the financial sector is regarded as the most important for the majority of the dataset.
Abstract: Network-based methods for studying financial markets have been popular due to their ability to represent a complex system in a simple manner. We are interested in whether we can measure the influence between various companies using partial correlation. Calculating partial correlation can be challenging with financial data, so to rectify this we use the SPACE estimator. With this estimator we infer networks from daily S&P 500 returns, study how these networks vary over time and draw parallels to the macroeconomic events that may explain the changes. We see that companies tend to have more connections to those in the same sector, and that some sectors tend to be more self-contained than others. By measuring the centrality of the various sectors in the network, we find that the financial sector is regarded as the most important for the majority of the dataset. Finally, we show there is a mild negative correlation between the centrality of a company and its out-of-sample risk.
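
For intuition, partial correlations can be read off the inverse of the correlation matrix; the sketch below shows this dense baseline, whereas the paper uses the sparse SPACE estimator (a penalized regression, not reproduced here). The data and edge threshold are placeholders.

```python
# Dense partial correlations from the precision matrix - a baseline for the
# sparse SPACE estimator the paper actually uses.
import numpy as np

returns = np.random.randn(500, 30)                 # T days x N stocks (placeholder)
R = np.corrcoef(returns, rowvar=False)             # sample correlation
P = np.linalg.inv(R)                               # precision matrix
d = np.sqrt(np.diag(P))
partial_corr = -P / np.outer(d, d)                 # standard identity for partial correlation
np.fill_diagonal(partial_corr, 1.0)

adjacency = (np.abs(partial_corr) > 0.1).astype(int)  # edge threshold is an assumption
```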

Proceedings ArticleDOI
01 Sep 2019
TL;DR: In this article, state-of-the-art kernel estimation methods are adapted to handle high noise levels while keeping their efficiency, and a fast non-blind deconvolution method is significantly improved by first denoising the blurry image.
Abstract: The goal of blind image deblurring is to recover a sharp image from a motion-blurred one without knowing the camera motion. Current state-of-the-art methods have remarkably good performance on images with no noise or very low noise levels. However, the noiseless assumption is not realistic, considering that low-light conditions are the main reason for the presence of motion blur due to the longer exposure times they require. In fact, motion blur and moderate to high noise often appear together. Most works approach this problem by first estimating the blur kernel k and then deconvolving the noisy blurred image. In this work, we first show that current state-of-the-art kernel estimation methods based on the ℓ0 gradient prior can be adapted to handle high noise levels while keeping their efficiency. Then, we show that a fast non-blind deconvolution method can be significantly improved by first denoising the blurry image. The proposed approach yields results that are equivalent to those obtained with much more computationally demanding methods.
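
A minimal sketch of the non-blind step is given below: Wiener deconvolution in the Fourier domain, meant to run after denoising the blurry image. The paper's specific denoiser and kernel estimator are not reproduced, and the noise-to-signal constant is a tuning assumption.

```python
# Classic Wiener deconvolution as a stand-in for the fast non-blind step;
# assumes the kernel is registered at the origin (use np.fft.ifftshift for a
# centered kernel).
import numpy as np

def wiener_deconvolve(blurred: np.ndarray, kernel: np.ndarray, nsr: float = 1e-2):
    """nsr: noise-to-signal power ratio; a tuning constant, not from the paper."""
    K = np.fft.fft2(kernel, s=blurred.shape)
    B = np.fft.fft2(blurred)
    X = np.conj(K) * B / (np.abs(K) ** 2 + nsr)    # Wiener filter in frequency domain
    return np.real(np.fft.ifft2(X))
```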

Proceedings ArticleDOI
01 Sep 2019
TL;DR: 3D skeletal joint information is used to create 2D (image) representations, which are transformed to the spectral domain using well-known transforms and fed to a deep Convolutional Neural Network architecture.
Abstract: In this paper we present an approach for the recognition of human actions which is based on a deep Convolutional Neural Network architecture. More specifically, 3D skeletal joint information is used to create 2D (image) representations. To compensate for potential viewpoint changes, these images are pre-processed using geometric transformations. Then, they are transformed to the spectral domain using well-known transforms. We focus on actions that are close to activities of daily living (ADLs), yet we evaluate our approach on a large-scale action dataset. We cover single-view, cross-view and cross-subject cases and thoroughly discuss the experimental results and the potential of our approach.
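
A hedged sketch of the representation: each skeleton frame is flattened into a row, rows are stacked over time into a 2D pseudo-image, and a 2D DCT moves it to the spectral domain (the paper evaluates several well-known transforms; sizes here are placeholders).

```python
# Skeleton sequence -> 2D pseudo-image -> spectral domain (2D DCT as one of
# the "well-known transforms"); frame/joint counts are placeholders.
import numpy as np
from scipy.fft import dctn

T, J = 64, 25                          # frames, joints (placeholders)
seq = np.random.rand(T, J, 3)          # 3D joint coordinates over time
image = seq.reshape(T, J * 3)          # (64, 75) pseudo-image, one row per frame
spectral = dctn(image, norm="ortho")   # CNN input after e.g. scaling/resizing
```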

Proceedings ArticleDOI
27 Oct 2019
TL;DR: In this paper, a new convolutional neural network architecture is proposed that looks for the regions, i.e., patches, in the image that contain the most useful information about the scene illumination.
Abstract: Achieving color constancy is an important part of the image preprocessing pipeline of contemporary digital cameras. Its goal is to eliminate the influence of the illumination color on the colors of the objects in the image scene. State-of-the-art results have been achieved with learning-based methods, especially when deep learning approaches are applied. Several methods exist that combine local patches for global illumination estimation. However, in this paper, a new convolutional neural network architecture is proposed. It is trained to look for the regions, i.e., patches, in the image that contain the most useful information about the scene illumination. This is achieved with an attention mechanism stacked on top of a pretrained convolutional neural network. Additionally, the common problem of the lack of data in color constancy benchmark datasets is alleviated by utilizing stage-wise training. Experimental results show that the proposed approach achieves competitive results.
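
A minimal sketch of attention pooling over patches follows: a learned score weights per-patch illuminant estimates into a single global RGB estimate. The layer sizes and two-head layout are assumptions, not the paper's architecture.

```python
# Attention-weighted pooling of per-patch illuminant estimates (a sketch of the
# general mechanism, not the proposed architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchAttentionPool(nn.Module):
    def __init__(self, feat_dim=128):                # feat_dim is a placeholder
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)          # how informative is this patch?
        self.illum = nn.Linear(feat_dim, 3)          # per-patch illuminant estimate

    def forward(self, patch_feats):                  # (P, feat_dim) from a CNN backbone
        w = F.softmax(self.score(patch_feats), dim=0)        # (P, 1) attention weights
        return (w * self.illum(patch_feats)).sum(dim=0)      # global RGB estimate

est = PatchAttentionPool()(torch.rand(32, 128))
```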

Proceedings ArticleDOI
17 Oct 2019
TL;DR: It is shown that using one side of the face reduces accuracy by only 0.34% at half the computation time, and it is demonstrated that using smaller portions of the face yields the expected computation reduction without suffering the same degree of accuracy loss.
Abstract: Research by psychologists has shown that subjects have a preference for one side of a face when it is expressing emotions. This paper seeks to find what accuracies can be attained when only a segment of the face is considered. We show that using one side of the face reduces accuracy by only 0.34%, at half the computation time. Various other sections of the face are evaluated for similar performance. We demonstrate that using smaller portions of the face yields the expected computation reduction but does not suffer the same degree of accuracy loss. For evaluation, we train a Convolutional Neural Network. To test which portions of a facial image are useful, the full face, half face, eyes, single eye, mouth and half of the mouth are chosen. These images come from the JAFFE, CK+ and KDEF datasets.
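
The evaluated face segments can be derived from an aligned face image by simple slicing, as sketched below; the exact crop boxes used in the paper may differ, and the eye/mouth bands here are rough assumptions.

```python
# Sketch of the crop scheme on an aligned face image; crop boxes are assumptions.
import numpy as np

face = np.zeros((128, 128, 3), dtype=np.uint8)   # aligned face (placeholder)
h, w = face.shape[:2]
segments = {
    "full": face,
    "left_half": face[:, : w // 2],
    "right_half": face[:, w // 2 :],
    "eyes": face[h // 5 : h // 2, :],            # rough horizontal band (assumption)
    "mouth": face[2 * h // 3 :, :],              # rough horizontal band (assumption)
}
```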

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This paper focuses on the problem of cell segmentation in digitized Pap smear images, a prerequisite of automatically detecting cervical cancer in its early stage, and applies an ensemble of FCNN and traditional segmentation approaches that provides sufficiently large diversity with respect to the most challenging manual-annotation-related issues.
Abstract: In this paper, we focus on the problem of cell segmentation in digitized Pap smear images, which is a prerequisite of automatically detecting cervical cancer in its early stage. Following current trends, we consider deep learning based approaches in the form of fully convolutional neural networks (FCNNs). A common bottleneck of deep learning is that a large annotated dataset is required for proper training. As large public datasets are not yet available in this field, we have composed a corresponding manually labeled dataset. Though this dataset is quite large, manual annotation is less reliable in this domain, so we had to apply a deep learning framework that is able to overcome this issue. Accordingly, we have applied an ensemble of FCNN and traditional segmentation approaches that provides sufficiently large diversity with respect to the most challenging manual-annotation-related issues, such as the inaccurate selection of cell boundaries. We propose ensembles to merge the outputs of the different segmentation methods, and these ensembles have proven superior to any of their members in our experimental studies.
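
One simple way to merge the ensemble members' outputs is a pixel-wise majority vote over binary cell masks, sketched below; the paper's merging strategy may be more elaborate.

```python
# Pixel-wise majority vote over binary masks from different segmenters
# (ties, for an even member count, are counted as foreground here).
import numpy as np

def majority_vote(masks):
    """masks: list of equally sized binary (H, W) arrays from the ensemble members."""
    stack = np.stack(masks).astype(np.uint8)
    return (stack.sum(axis=0) * 2 >= len(masks)).astype(np.uint8)
```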

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This work fine-tunes a model from the Two-Stream Inflated 3D architecture, pre-trained on the ImageNet and Kinetics source datasets, to classify video sequences of crowd movements from the Crowd-11 target dataset, demonstrating its superiority over the state-of-the-art in terms of classification accuracy.
Abstract: The automatic recognition of a crowd movement captured by a CCTV camera can be of considerable help to security forces whose mission is to ensure the safety of people in public areas. In this context, we propose to fine-tune a model from the Two-Stream Inflated 3D architecture, pre-trained on the ImageNet and Kinetics source datasets, to classify video sequences of crowd movements from the Crowd-11 target dataset. The evaluation of our model demonstrates its superiority over the state-of-the-art in terms of classification accuracy.
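
To illustrate the fine-tuning pattern (pretrained video backbone, new 11-class head for Crowd-11), here is a hedged sketch; torchvision ships no I3D model, so a Kinetics-pretrained R3D-18 stands in for the Two-Stream Inflated 3D architecture.

```python
# Fine-tuning pattern only: R3D-18 is a stand-in backbone, not the paper's I3D.
import torch.nn as nn
import torchvision

model = torchvision.models.video.r3d_18(weights="DEFAULT")  # Kinetics-pretrained
model.fc = nn.Linear(model.fc.in_features, 11)              # new head for 11 Crowd-11 classes
# then train with a small learning rate on clips of shape (B, 3, T, H, W)
```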

Proceedings ArticleDOI
24 Oct 2019
TL;DR: It is found that shrinkage methods generally improve out-of-sample portfolio performance, and that the proposed cluster-based method yields improved results, with portfolios that outperform the other considered methods.
Abstract: The estimation of correlation and covariance matrices from asset return time series is a critical step in financial portfolio optimization. Although sample estimates are reliable when the length of time series is very large compared to the number of assets, in high-dimensional settings estimation issues arise. To reduce estimation errors and mitigate their propagation to out-of-sample performance of portfolios based on noisy estimates, shrinkage methods are applied. In this paper we consider several shrinkage methods for correlation matrix estimation and define a cluster-based shrinkage procedure which introduces information about the structures of communities identified in asset dependence graphs. To test the considered shrinkage methods we apply them in a portfolio optimization scenario using the global minimum variance portfolio, and perform backtests on a large sample of NYSE daily stock return data. We find that shrinkage methods generally improve out-of-sample portfolio performance, and the proposed cluster-based method yields improved results and portfolios which outperform other considered methods.
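
For reference, plain linear shrinkage of the sample correlation matrix toward the identity looks as follows; the paper's cluster-based variant instead shrinks toward a block target built from detected communities, and the intensity value below is arbitrary.

```python
# Linear shrinkage toward the identity target (the simplest of the considered
# family; the cluster-based method replaces np.eye with a block target).
import numpy as np

def shrink_correlation(R_sample: np.ndarray, delta: float = 0.3):
    """delta in [0, 1]: shrinkage intensity (0.3 is an arbitrary illustration)."""
    N = R_sample.shape[0]
    return (1.0 - delta) * R_sample + delta * np.eye(N)
```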

Proceedings ArticleDOI
23 Sep 2019
TL;DR: This work proposes to first automatically construct negative samples through linear interpolation of paired natural and colorized images, and then progressively insert these negative samples into the original training dataset while continuing to train the network.
Abstract: Image colorization achieves more and more realistic results with the increasing power of recent deep learning techniques, and it is becoming more difficult to identify synthetic colorized images by eye. In the literature, handcrafted-feature-based and convolutional neural network (CNN)-based forensic methods have been proposed to distinguish between natural images (NIs) and colorized images (CIs). Although a recent CNN-based method achieves very good detection performance, an important issue, the blind detection problem, remains and has not been thoroughly studied. In this work, we focus on this challenging scenario of blind detection, i.e., no training sample is available from the "unknown" colorization algorithm that we may encounter during the testing phase. This blind detection performance can be regarded as the generalization capability of a forensic detector. In this paper, we propose to first automatically construct negative samples through linear interpolation of paired natural and colorized images. Then, we progressively insert these negative samples into the original training dataset and continue to train the network. Experimental results demonstrate that our enhanced training can significantly improve the generalization performance of different CNN models.
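
The enhanced-training trick reduces to one line of arithmetic: a negative sample is a convex combination of a natural image and its colorized counterpart, as sketched below (the sampling scheme for lam is an assumption).

```python
# Negative-sample construction by linear interpolation of a paired natural and
# colorized image; how lam is scheduled during training is an assumption.
import numpy as np

def make_negative(natural: np.ndarray, colorized: np.ndarray, lam: float):
    """lam in (0, 1]; intermediate lam yields near-natural colorized negatives."""
    x = (1.0 - lam) * natural.astype(np.float32) + lam * colorized.astype(np.float32)
    return np.clip(x, 0, 255).astype(np.uint8)
```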

Proceedings ArticleDOI
01 Sep 2019
TL;DR: The proposed work addresses paint loss detection in digitized paintings with a multiscale deep learning system using dilated convolutions, which enable a large receptive field with limited training parameters to avoid overtraining.
Abstract: We explore the potential of deep learning in digital painting analysis to facilitate condition reporting and to support restoration treatments. We address the problem of paint loss detection and develop a multiscale deep learning system with dilated convolutions that enables a large receptive field with limited training parameters to avoid overtraining. Our model efficiently handles the multimodal data that are typically acquired in art investigation. As a case study we use multimodal data of the Ghent Altarpiece. Our results indicate the great potential of the proposed approach in terms of accuracy, as well as its fast execution, which allows interactivity and continuous learning.
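
The receptive-field idea can be sketched with a stack of 3x3 convolutions of growing dilation, which widens context without extra parameters per layer; the channel counts and the four-channel multimodal input below are placeholders.

```python
# Dilated-convolution stack: dilation 1, 2, 4 grows the receptive field while
# keeping parameter count fixed per layer (sizes are placeholders).
import torch
import torch.nn as nn

multiscale = nn.Sequential(
    nn.Conv2d(4, 32, 3, padding=1, dilation=1), nn.ReLU(),   # 4 = multimodal input channels (assumption)
    nn.Conv2d(32, 32, 3, padding=2, dilation=2), nn.ReLU(),  # padding = dilation keeps spatial size
    nn.Conv2d(32, 32, 3, padding=4, dilation=4), nn.ReLU(),
    nn.Conv2d(32, 1, 1),                                     # paint-loss probability map
)
out = torch.sigmoid(multiscale(torch.rand(1, 4, 128, 128)))
```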

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This work takes advantage of the SMPL model, which provides human body models in many shapes and sizes in an essentially automatic fashion, thereby avoiding a cumbersome procedure of manual collection and preparation of training data.
Abstract: In the case of structured data, such as 2D images, many variants of traditional convolutional neural network architectures have been successfully proposed. Learning from unstructured sets of data, such as sets of 3D point clouds, is a challenging task for numerous reasons, the two most important being that a 3D point cloud is generally (i) unordered and (ii) a sparse data set. Therefore, architectures have been proposed which are invariant to both the ordering and the number of points in the point cloud. PointNet is one such architecture, originally introduced and demonstrated on the tasks of classification and segmentation on the ModelNet40 data set. In this work we study the performance of PointNet on an even more demanding task: segmentation of human body parts. Finding enough training data of sufficient quality is a general problem in deep learning, and especially so for human body segmentation. To that end we take advantage of the SMPL model, which provides human body models in many shapes and sizes in an essentially automatic fashion, thereby avoiding a cumbersome procedure of manual collection and preparation of training data. Our results show that the proposed PointNet variant trained using the SMPL model provides competitive segmentation results on the task of human body segmentation.
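
For context, the order-invariance trick at the heart of PointNet is a shared per-point MLP followed by a symmetric max-pool, as sketched below; layer sizes are placeholders, and the segmentation variant additionally concatenates the global feature back to each point.

```python
# The PointNet core idea: a shared per-point MLP plus a symmetric (max) pooling
# makes the output invariant to point ordering and count.
import torch
import torch.nn as nn

class PointNetCore(nn.Module):
    def __init__(self, classes=10):                  # class count is a placeholder
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                 nn.Linear(64, 128), nn.ReLU())
        self.head = nn.Linear(128, classes)

    def forward(self, pts):                 # pts: (B, N, 3), any N, any point order
        f = self.mlp(pts)                   # shared weights applied to every point
        g = f.max(dim=1).values             # symmetric pooling -> permutation invariance
        return self.head(g)

logits = PointNetCore()(torch.rand(2, 1024, 3))
```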

Proceedings ArticleDOI
01 Sep 2019
TL;DR: Two submissions to the Illumination Estimation Challenge are introduced: the Fourier-transform-based submission ranked 3rd, and the statistical gray-pixel-based one ranked 6th.
Abstract: We briefly introduce two submissions to the Illumination Estimation Challenge, held at the Int'l Workshop on Color Vision, affiliated with the 11th Int'l Symposium on Image and Signal Processing and Analysis. The Fourier-transform-based submission ranked 3rd, and the statistical gray-pixel-based one ranked 6th.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: A history of the Color Checker dataset usage is given, the origins and reasons for its misuses are described, and the old and new mistakes introduced in the most recent publications that tried to handle the issue are explained, in order to prevent similar future misuses.
Abstract: The pipelines of digital cameras contain a part for computational color constancy, which aims to remove the influence of the illumination on the scene colors. One of the best known and most widely used benchmark datasets for this problem is the Color Checker dataset. However, due to the improper handling of the black level in its images, this dataset has been widely misused and while some recent publications tried to alleviate the problem, they nevertheless erred and created additional wrong data. This paper gives a history of the Color Checker dataset usage, it describes the origins and reasons for its misuses, and it explains the old and new mistakes introduced in the most recent publications that tried to handle the issue. This should, hopefully, help to prevent similar future misuses.

Proceedings ArticleDOI
19 Mar 2019
TL;DR: The use of Chebyshev polynomials of the first kind on intervals is proposed for an efficient representation of one-dimensional, continuous-time signals, offering a new paradigm for efficient processing of analog data on a digital computer.
Abstract: Compressed sensing (CS) is a technique for signal sampling below the Nyquist rate, based on the assumption that the signal is sparse in some transform domain. The acquired signal is represented in a compressed form that is appropriate for storage, transmission and further processing. In this paper, the use of Chebyshev polynomials of the first kind on intervals is proposed for an efficient representation of one-dimensional, continuous-time signals. To avoid boundary artifacts, a desired number of derivatives are equalized at each interval end in a spline-like fashion. Unlike splines, the proposed system of equations is underdetermined, providing the degree of freedom necessary for achieving sparsity via ℓ1 optimization. The obtained parametric model fits into the compressed sensing setup and offers a new paradigm for efficient processing of analog data on a digital computer. Simulation results for the proposed measurement system and an example of data processing are given to prove its potential.
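
A Chebyshev representation of a signal segment takes only a few NumPy calls, as sketched below; the paper additionally equalizes derivatives across interval ends and recovers sparse coefficients via ℓ1 optimization, which is omitted here.

```python
# Chebyshev-series representation of one signal segment (boundary-derivative
# constraints and the l1-sparse recovery from the paper are omitted).
import numpy as np
from numpy.polynomial import chebyshev as C

t = np.linspace(-1.0, 1.0, 400)               # one interval, normalized to [-1, 1]
y = np.cos(6 * np.pi * t) + 0.3 * t           # toy continuous-time segment
coeffs = C.chebfit(t, y, deg=25)              # Chebyshev coefficients of the segment
y_hat = C.chebval(t, coeffs)                  # reconstruction from the coefficients
print("max error:", np.max(np.abs(y - y_hat)))
```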

Proceedings ArticleDOI
01 Sep 2019
TL;DR: In this study, a U-Net type model with a ResNet-based encoder pretrained on the ImageNet dataset is combined with a postprocessing step to obtain segmented layer boundaries; a one-sided Wilcoxon signed-rank test shows that the pretrained U-Net type model outperforms the original U-Net model in segmenting three regions bounded by four layer boundaries.
Abstract: Retinal layer analysis on OCT images is a standard procedure used by ophthalmologists to diagnose various diseases. Due to the large number of OCT images generated for each patient, manual image analysis can be time-consuming and error-prone, which can consequently affect the timeliness and quality of the diagnosis. Therefore, in recent years, a variety of methods, based predominantly on deep learning, have been proposed for the automatic segmentation of retinal layers. In our study, a U-Net type model with a ResNet-based encoder, pretrained on the ImageNet dataset, is utilized. In addition, the model is combined with a postprocessing step to obtain segmented layer boundaries. Modified versions of the U-Net type model have already been applied to various non-medical image segmentation tasks, achieving outstanding results. To investigate whether the pretrained U-Net type model improves retinal layer segmentation, two models are trained and validated on 23 volumes of OCT images with age-related macular degeneration (AMD): a U-Net model with a ResNet34 encoder pretrained on the ImageNet dataset, and the original U-Net model trained from scratch. A one-sided Wilcoxon signed-rank test has shown that the pretrained U-Net type model outperforms the original U-Net model in segmenting three regions bounded by four layer boundaries.
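
One common way to build such a model (the paper's exact code base is unknown) is the segmentation_models_pytorch library, which provides a U-Net with a pretrained ResNet34 encoder out of the box; the class count below is an assumption.

```python
# One library option for "U-Net with a pretrained ResNet34 encoder"; not
# necessarily what the authors used.
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet34",
    encoder_weights="imagenet",   # the pretraining under comparison
    in_channels=1,                # OCT B-scans are grayscale
    classes=3,                    # three regions bounded by four boundaries (assumption)
)
```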

Proceedings ArticleDOI
01 Sep 2019
TL;DR: A sparse reconstruction algorithm combining the fast intersection of confidence intervals (FICI) rule and the two-step iterative shrinkage/thresholding algorithm (TwIST), denoted the FICI-TwIST algorithm, is applied to modeled and real-life teleseismic signals to achieve a high-resolution sparse TFD with heavily suppressed cross-terms.
Abstract: Time-frequency distributions (TFDs) are powerful tools for the analysis of non-stationary signals; however, they are heavily under-used, since most TFD calculation methods introduce unwanted artifacts, so-called cross-terms. A recently investigated approach to TFD cross-term removal enforces a sparsity constraint on the resulting TFD, ultimately leading to a high-resolution sparse TFD with heavily suppressed cross-terms. In this paper, we apply a sparse reconstruction algorithm which combines the fast intersection of confidence intervals (FICI) rule and the two-step iterative shrinkage/thresholding algorithm (TwIST), denoted the FICI-TwIST algorithm, to modeled and real-life teleseismic signals. The obtained results are compared, in terms of the resulting TFD concentration and the algorithm execution time, to state-of-the-art sparse reconstruction algorithms.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: Improvements to the Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) and a novel neural network are proposed; results show that these propositions can decrease the NMSE value by up to 10.8 dB and require fewer layers than when only LISTA is used to estimate the signal.
Abstract: Compressive sensing enables the recovery of sparse signals from fewer measurements than required by the Nyquist rate, thus leading to energy and processing savings. Accuracy and complexity improvements can be achieved by applying neural networks to the sparse linear inverse problem. This work focuses on sparse recovery with deep networks. Improvements to the Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) and a novel neural network are proposed. Results show that these propositions can decrease the NMSE value by up to 10.8 dB and require fewer layers than when only LISTA is used to estimate the signal.
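
For reference, a minimal LISTA block, the baseline the paper improves upon, is sketched below: learned matrices W and S plus per-layer soft-thresholds replace the fixed operators of ISTA. Dimensions and layer count are placeholders.

```python
# Minimal LISTA baseline: x_{t+1} = soft(W y + S x_t, theta_t) with learned
# W, S and per-layer thresholds theta_t.
import torch
import torch.nn as nn

def soft_threshold(x, theta):
    return torch.sign(x) * torch.clamp(x.abs() - theta, min=0.0)

class LISTA(nn.Module):
    def __init__(self, m=64, n=256, layers=6):   # m measurements, n-dim sparse code (placeholders)
        super().__init__()
        self.W = nn.Linear(m, n, bias=False)
        self.S = nn.Linear(n, n, bias=False)
        self.theta = nn.Parameter(torch.full((layers,), 0.1))
        self.layers = layers

    def forward(self, y):                        # y: (B, m) measurements
        x = soft_threshold(self.W(y), self.theta[0])
        for t in range(1, self.layers):
            x = soft_threshold(self.W(y) + self.S(x), self.theta[t])
        return x                                 # (B, n) sparse estimate

x_hat = LISTA()(torch.rand(8, 64))
```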