
Showing papers in "Pattern Recognition and Image Analysis in 2016"


Journal ArticleDOI
TL;DR: Deep learning methods for image classification and object detection are overviewed and such deep models as autoencoders, restricted Boltzmann machines and convolutional neural networks are considered.
Abstract: Deep learning methods for image classification and object detection are overviewed. In particular, we consider deep models such as autoencoders, restricted Boltzmann machines, and convolutional neural networks. Existing software packages for deep learning problems are compared.

230 citations


Journal ArticleDOI
TL;DR: This paper targets Indian sign language recognition based on dynamic hand gesture recognition techniques in a real-time scenario, and such a system would be helpful in teaching and communication for hearing-impaired persons.
Abstract: Needs and new technologies always inspire people to create new ways to interact with machines. This interaction can serve a specific purpose or form a framework applicable to many applications. Sign language recognition is a very important area where easier interaction with humans or machines would help many people. India currently has 2.8 million people who cannot speak or cannot hear properly. This paper targets Indian sign language recognition based on dynamic hand gesture recognition techniques in a real-time scenario. The captured video was converted to HSV color space for pre-processing, and segmentation was then performed based on skin pixels. Depth information was used in parallel to obtain more accurate results. Hu moments and motion trajectories were extracted from the image frames, and gestures were classified with a Support Vector Machine. The system was tested with a webcam as well as with the MS Kinect. Such a system would be helpful in teaching and communication for hearing-impaired persons.
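The Hu-moment shape features mentioned above can be illustrated with a minimal NumPy sketch (hypothetical code, not the paper's implementation) computing the first two Hu invariants of a binary hand mask:

```python
import numpy as np

def hu_first_two(mask):
    """First two Hu moment invariants of a binary mask.

    Both are invariant to translation, scale, and rotation of the shape.
    """
    ys, xs = np.nonzero(mask)
    m00 = float(len(xs))                    # zeroth moment = area
    x, y = xs - xs.mean(), ys - ys.mean()   # centre the coordinates

    def eta(p, q):
        # normalised central moment: mu_pq / m00^((p+q)/2 + 1)
        return (x ** p * y ** q).sum() / m00 ** ((p + q) / 2 + 1)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    h1 = e20 + e02
    h2 = (e20 - e02) ** 2 + 4 * e11 ** 2
    return h1, h2
```

In the paper's setting, such invariants, together with motion-trajectory features, would feed the SVM classifier.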

85 citations


Journal ArticleDOI
TL;DR: This paper proposes a method, based on the total variation of the image intensity (brightness) function, to remove a type of noise that is common in biomedicine and can be considered a combination of Gaussian and Poisson noise.
Abstract: Many modern devices create digital images using optical sensors, so image quality depends on the quality of those sensors. Because of technological limits, these sensors cannot reconstruct images perfectly and always introduce some defects; one of these defects is noise. Noise reduces image quality and degrades the results of image processing. Image noise can be classified into several types: Gaussian noise, Poisson noise, speckle noise, and so on. For particular noise types there are efficient removal methods, but no universal method removes all types of noise effectively. In this paper, we propose a method to remove a type of noise that is common in biomedicine and can be considered a combination of Gaussian and Poisson noise. Our method is based on the total variation of the image intensity (brightness) function.
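A minimal NumPy sketch of total-variation denoising by gradient descent (a generic smoothed-TV scheme, not the authors' exact formulation) illustrates the idea:

```python
import numpy as np

def tv_denoise(f, lam=0.2, step=0.2, iters=100, eps=1e-6):
    """Gradient descent on 0.5*||u - f||^2 + lam * TV_eps(u).

    TV_eps is the smoothed total variation sum(sqrt(ux^2 + uy^2 + eps));
    boundaries are handled approximately (periodic in the divergence).
    """
    u = f.astype(np.float64).copy()
    for _ in range(iters):
        ux = np.diff(u, axis=1, append=u[:, -1:])   # forward differences
        uy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(ux ** 2 + uy ** 2 + eps)
        px, py = ux / mag, uy / mag                  # dual (unit) field
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        u -= step * ((u - f) - lam * div)            # descend the energy
    return u
```

For a mixed Gaussian-Poisson model, the quadratic data term would be replaced by the corresponding negative log-likelihood; the TV regularizer stays the same.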

44 citations


Journal ArticleDOI
TL;DR: In this paper, a fast way to implement the Niblack binarization algorithm is described: it uses not only the integral image for the local mean calculation but also the second-order integral image for the local variance calculation, and a generalization of the integral-image representation, called the "k-order integral image," for fast calculation of higher-order local statistics.
Abstract: A fast way to implement the Niblack binarization algorithm is described. It uses not only the integral image for the local mean calculation but also the second-order integral image for the local variance calculation. With the proposed approach, segmentation time is significantly reduced, making practical use possible. The generalization of the integral-image representation, called the "k-order integral image," can be used for fast calculation of higher-order local statistics. An example algorithm for the segmentation of cells and Chlamydial inclusions in microscope images, including color deconvolution and fast adaptive local binarization steps, is presented.
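The integral-image trick can be sketched as follows (a hypothetical NumPy version, not the authors' code): a first-order integral image gives window sums for the local mean, and a second-order one gives sums of squares for the local variance, so each Niblack threshold T = m + k*s costs O(1) per pixel:

```python
import numpy as np

def integral_image(img, order=1):
    """Zero-padded running sum: S[y, x] = sum of img[:y, :x] ** order."""
    s = np.cumsum(np.cumsum(img.astype(np.float64) ** order, axis=0), axis=1)
    return np.pad(s, ((1, 0), (1, 0)))

def window_sums(S, w):
    """Sum and pixel count of a w x w window around every pixel."""
    h, wd = S.shape[0] - 1, S.shape[1] - 1
    pad = w // 2
    y0 = np.clip(np.arange(h) - pad, 0, h)[:, None]
    y1 = np.clip(np.arange(h) + pad + 1, 0, h)[:, None]
    x0 = np.clip(np.arange(wd) - pad, 0, wd)[None, :]
    x1 = np.clip(np.arange(wd) + pad + 1, 0, wd)[None, :]
    return S[y1, x1] - S[y0, x1] - S[y1, x0] + S[y0, x0], (y1 - y0) * (x1 - x0)

def niblack(img, w=15, k=-0.2):
    s1, n = window_sums(integral_image(img, 1), w)
    s2, _ = window_sums(integral_image(img, 2), w)
    mean = s1 / n
    var = np.maximum(s2 / n - mean ** 2, 0.0)   # E[X^2] - E[X]^2
    return img <= mean + k * np.sqrt(var)       # dark objects -> True
```

A k-order integral image would extend `integral_image` with higher `order` values to obtain higher-order local statistics at the same O(1) per-pixel cost.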

31 citations


Journal ArticleDOI
TL;DR: In this article, a real-time approach to detect and localise defects in grey-scale textures within a Compressed Sensing framework is presented, where a Gaussian Mixture model is trained with features extracted from a handful of defect-free texture samples and novelty detection of texture samples is performed by comparing each pixel to the likelihood obtained in the training process.
Abstract: We present a real-time approach to detect and localise defects in grey-scale textures within a Compressed Sensing framework. Inspired by recent results in texture classification, we use compressed local grey-scale patches for texture description. In a first step, a Gaussian Mixture model is trained with the features extracted from a handful of defect-free texture samples. In a second step, the novelty detection of texture samples is performed by comparing each pixel to the likelihood obtained in the training process. The inspection stage is embedded into a multi-scale framework to enable real-time defect detection and localisation. The performance of compressed grey-scale patches for texture error detection is evaluated on two independent datasets. The proposed method is able to outperform the performance of non-compressed grey-scale patches in terms of accuracy and speed.
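The pipeline can be roughly sketched as follows (a simplified single-Gaussian stand-in for the paper's Gaussian Mixture model, with a random projection playing the role of the compressed patch features; all names are hypothetical):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def compressed_patches(img, p, P):
    """All overlapping p x p patches, projected to a low dimension by P."""
    X = sliding_window_view(img, (p, p)).reshape(-1, p * p)
    return X @ P

def fit_texture_model(img, p=8, d=10, seed=0):
    """Fit a single Gaussian to compressed patches of a defect-free sample."""
    rng = np.random.default_rng(seed)
    P = rng.standard_normal((p * p, d)) / np.sqrt(d)   # random projection
    X = compressed_patches(img, p, P)
    mu = X.mean(axis=0)
    icov = np.linalg.inv(np.cov(X.T) + 1e-6 * np.eye(d))
    return P, mu, icov

def novelty_scores(img, model, p=8):
    """Squared Mahalanobis distance of each patch to the trained model."""
    P, mu, icov = model
    X = compressed_patches(img, p, P) - mu
    return np.einsum('ij,jk,ik->i', X, icov, X)
```

Patches whose score exceeds a threshold learned on defect-free data would be flagged as defective; the paper's multi-scale framework repeats this at several resolutions.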

25 citations


Journal ArticleDOI
TL;DR: A procedure for estimating the constant parameters of autoregressive models according to the given type of model and to the real image is presented and whether it is possible to use the estimated parameters for image restoration is investigated.
Abstract: This paper considers methods for estimating the parameters of autoregressive models of images. Special attention is given to estimating the internal autoregression and its parameters in doubly stochastic image models. A procedure for estimating the constant parameters of autoregressive models according to the given type of model and to the real image is presented. We investigate whether the estimated parameters can be used for image restoration. In addition, we present an image restoration algorithm that combines pseudogradient and nonlinear Kalman estimation. The efficiency of the different procedures on simulated and real images is analyzed.

25 citations


Journal ArticleDOI
TL;DR: It is shown that the analysis of compactness of metric spaces may lead to some heuristic cluster criteria commonly used in cluster analysis, and a key concept of a ρ-network arises as a subset of points that allows one to estimate an arbitrary distance in an arbitrary metric configuration.
Abstract: In the context of the algebraic approach to recognition of Yu.I. Zhuravlev's scientific school, metric analysis of feature descriptions is necessary to obtain adequate formulations for poorly formalized recognition/classification problems. Formalization of recognition problems is a cross-disciplinary issue between supervised machine learning and unsupervised machine learning. This work presents the results of the analysis of compact metric spaces arising during the formalization of recognition problems. Necessary and sufficient conditions of compactness of metric spaces over lattices of the sets of feature descriptions are analyzed, and approaches to the completion of the discrete metric spaces (completion by lattice expansion or completion by variation of estimate) are formulated. It is shown that the analysis of compactness of metric spaces may lead to some heuristic cluster criteria commonly used in cluster analysis. During the analysis of the properties of compactness, a key concept of a ρ-network arises as a subset of points that allows one to estimate an arbitrary distance in an arbitrary metric configuration. The analysis of compactness properties and the conceptual apparatus introduced (ρ-networks, their quality functionals, the metric range condition, i- and ρ-spectra, ρ-neighborhood in a metric cone, ρ-isomorphism of complete weighted graphs, etc.) allow one to apply the methods of functional analysis, probability theory, metric geometry, and graph theory to the analysis of poorly formalized problems of recognition and classification.

22 citations


Journal ArticleDOI
TL;DR: Using the concepts of ρ-isomorphism and σ-completion of metric configurations, a system of criteria for assessing the properties of “generalized density” is obtained and points to a new plethora of algorithms for searching metric condensations.
Abstract: In order to obtain tractable formal descriptions of poorly formalized problems within the context of the algebraic approach to pattern recognition, we develop methods for analyzing metric configurations. In this paper, using the concepts of ρ-isomorphism and σ-completion of metric configurations, a system of criteria for assessing the properties of "generalized density" is obtained. The analysis of the density properties along the axes of a metric configuration allowed us to formulate methods for calculating the topological neighborhoods of points and for finding the "grains" of metric condensations. The theoretical results point to a new plethora of algorithms for searching metric condensations: methods based on the "restoration" of the set (the condensation searched) using the data on the components of the projection of the set on the axes of the metric configuration. The only mandatory parameters of any algorithm of this family are the metric itself and the distribution of σ, which characterizes the accuracy of the values of the metric.

21 citations


Journal ArticleDOI
TL;DR: Facial expression recognition is a challenging problem in the field of intelligent human-computer interaction; an algorithm based on the Gabor wavelet transformation and non-negative matrix factorization (G-NMF) is presented.
Abstract: Facial expression recognition is a challenging problem in the field of intelligent human-computer interaction. An algorithm based on the Gabor wavelet transformation and non-negative matrix factorization (G-NMF) is presented. The main process includes image preprocessing, feature extraction, and classification. First, the face region containing emotional information is obtained and normalized. Then, expression features are extracted by the Gabor wavelet transformation, and the high-dimensional data are reduced by non-negative matrix factorization (NMF). Finally, a two-layer classifier (TLC) is designed for expression recognition. Experiments are performed on the JAFFE facial expression database. The results show that the proposed method performs better.
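The NMF dimensionality-reduction step can be sketched with the classical Lee-Seung multiplicative updates (a generic NMF, not the paper's exact G-NMF pipeline); in the paper's setting the columns of V would be Gabor feature vectors:

```python
import numpy as np

def nmf(V, r, iters=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix V ~ W @ H with inner rank r.

    Lee-Seung multiplicative updates for the Frobenius loss; the loss is
    non-increasing at every step and W, H stay non-negative.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + 0.1
    H = rng.random((r, m)) + 0.1
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

The columns of W then act as non-negative basis features and H as the reduced representation passed to the classifier.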

20 citations


Journal ArticleDOI
TL;DR: A system of criteria for estimating association rules is developed that can be used to automate the analysis of properties and to compare various models based on association rules when solving pattern recognition problems.
Abstract: This paper addresses the problem of association rule extraction. To extract quantitative association rules from given sets of observations, a stochastic method is proposed. The developed method improves the reliability and interpretability of recognition models based on association rules, employs the stochastic approach to search through various combinations of sets of elements in association rules, and uses a priori information about the informativity of intervals of feature values. A system of criteria for estimating association rules is developed that can be used to automate the analysis of properties and to compare various models based on association rules when solving pattern recognition problems.
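The standard measures by which association rules are usually scored can be sketched as follows (illustrative helper functions, not the paper's criteria system):

```python
def support(transactions, itemset):
    """Fraction of transactions (sets of items) containing the itemset."""
    s = set(itemset)
    return sum(s <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """P(consequent | antecedent) estimated over the transactions."""
    joint = set(antecedent) | set(consequent)
    return support(transactions, joint) / support(transactions, antecedent)

def lift(transactions, antecedent, consequent):
    """Confidence normalized by the consequent's base rate (>1: positive association)."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)
```

A criteria system like the paper's would combine such measures (and interval-informativity priors) to rank candidate quantitative rules.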

18 citations


Journal ArticleDOI
TL;DR: A new linear method that is based on dual quaternions and extends the work of Daniilidis 1999 (IJRR) to SCARA robots is presented, and a subsequent nonlinear optimization is proposed to improve the accuracy.
Abstract: In SCARA robots, which are often used in industrial applications, all joint axes are parallel, covering three degrees of freedom in translation and one degree of freedom in rotation. Therefore, conventional approaches for the hand-eye calibration of articulated robots cannot be used for SCARA robots. In this paper, we present a new linear method that is based on dual quaternions and extends the work of Daniilidis 1999 (IJRR) to SCARA robots. To improve the accuracy, a subsequent nonlinear optimization is proposed. We address several practical implementation issues and show the effectiveness of the method by evaluating it on synthetic and real data.

Journal ArticleDOI
TL;DR: This paper proposes that skin detection in a digital color image can be significantly improved by employing automated color space switching, and a system with three robust algorithms based on different color spaces has been built for automatic skin classification in a 2D image.
Abstract: Skin detection is very popular and has vast applications in computer vision and human-computer interaction. Skin color changes beyond comparable limits with considerable change in the nature of the light source. Different properties are taken into account when colors are represented in different color spaces. However, no single color space has yet been found that accommodates all the illumination changes that can occur for practically similar objects. Therefore, a dynamic skin color model must be constructed for robust skin pixel detection that can cope with natural changes in illumination. This paper proposes that skin detection in a digital color image can be significantly improved by employing automated color space switching. A system with three robust algorithms based on different color spaces has been built for automatic skin classification in a 2D image. These algorithms are based on the statistical mean value of the skin pixels in the image. We also take Bayesian approaches to discriminate between skin-like and non-skin pixels to avoid noise. This work is tested on a set of images captured in varying light conditions, from highly illuminated to almost dark.
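The Bayesian skin/non-skin discrimination step can be sketched with quantized color histograms (a common histogram-Bayes scheme; hypothetical code rather than the paper's system):

```python
import numpy as np

def _bin_index(pixels, bins):
    """Map (N, 3) color pixels in [0, 255] to flat histogram bin indices."""
    q = (np.asarray(pixels) // (256 // bins)).astype(int)
    return q[:, 0] * bins * bins + q[:, 1] * bins + q[:, 2]

def fit_skin_model(pixels, labels, bins=8):
    """Histogram estimates of P(color | skin) and P(color | non-skin)."""
    idx = _bin_index(pixels, bins)
    n = bins ** 3
    h_skin = np.bincount(idx[labels == 1], minlength=n) + 1.0  # Laplace smoothing
    h_bg = np.bincount(idx[labels == 0], minlength=n) + 1.0
    return h_skin / h_skin.sum(), h_bg / h_bg.sum()

def skin_probability(pixels, model, prior=0.5, bins=8):
    """Posterior P(skin | color) by Bayes' rule."""
    p_skin, p_bg = model
    idx = _bin_index(pixels, bins)
    num = p_skin[idx] * prior
    return num / (num + p_bg[idx] * (1.0 - prior))
```

Color-space switching would amount to maintaining such models in several color spaces and selecting the one best matched to the current illumination.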

Journal ArticleDOI
TL;DR: The experimental results show that the proposed algorithm effectively reduces the dimension of the classification features while maintaining good classification performance.
Abstract: Local binary patterns have been used to distinguish photorealistic computer graphics from photographic images; however, the dimension of the extracted features is too high. Accordingly, the Local Ternary Count, based on the Local Ternary Patterns and the Local Binary Count, was developed in this study. Furthermore, a novel algorithm based on the Local Ternary Count is presented to classify photorealistic computer graphics and photographic images. The experimental results show that the proposed algorithm effectively reduces the dimension of the classification features while maintaining good classification performance.
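The Local Binary Count idea, counting rather than encoding the thresholded neighbours, can be sketched as follows (hypothetical minimal version; the paper's Local Ternary Count additionally uses a tolerance band around the centre value):

```python
import numpy as np

def local_binary_count(img):
    """For each interior pixel, count the 8-neighbours that are >= the centre.

    Unlike LBP, only the count (0..8) is kept, so the histogram feature has
    9 bins instead of 256 -- the dimensionality reduction the paper exploits.
    """
    c = img[1:-1, 1:-1]
    h, w = img.shape
    out = np.zeros(c.shape, dtype=np.uint8)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            out += img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx] >= c
    return out

def lbc_histogram(img):
    """9-dimensional normalized LBC feature."""
    hist = np.bincount(local_binary_count(img).ravel(), minlength=9)
    return hist / hist.sum()
```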

Journal ArticleDOI
TL;DR: The goal of this paper is to examine the applicability of genetic algorithms to the segmentation of digital image data, to implement such an algorithm, and to create tools for testing it.
Abstract: The mechanism of evolution, which allows living organisms to endure for years and adapt to their surrounding environment, is the inspiration for creating a new genetic algorithm. The goal of this paper is to examine the applicability of genetic algorithms to the segmentation of digital image data, to implement such an algorithm, and to create tools for testing it. A further goal is to examine possible choices of the algorithm's parameters and to compare the quality of the results with other segmentation methods on various image data.

Journal ArticleDOI
TL;DR: The authors show that the simulation results make it possible to perform the classification of structural elements of the signal using symbolic approximation in the frequency domain.
Abstract: In this work, a model of a geoacoustic emission signal constructed on the basis of methods of sparse approximation is presented and a model identification algorithm is proposed. The authors show that the simulation results make it possible to perform the classification of structural elements of the signal using symbolic approximation in the frequency domain.

Journal ArticleDOI
TL;DR: An evaluation of six well established line segment distance functions within the scope of line segment matching shows analytically, using synthetic data, the properties of the distance functions with respect to rotation, translation, and scaling.
Abstract: In this paper we present an evaluation of six well established line segment distance functions within the scope of line segment matching. We show analytically, using synthetic data, the properties of the distance functions with respect to rotation, translation, and scaling. The evaluation points out the main characteristics of the distance functions. In addition, we demonstrate the practical relevance of line segment matching and introduce a new distance function.
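Two of the classical line segment distance components evaluated in such studies can be sketched as follows (illustrative functions, not the paper's specific six):

```python
import numpy as np

def midpoint_distance(s1, s2):
    """Euclidean distance between segment midpoints; a segment is ((x, y), (x, y))."""
    (a1, b1), (a2, b2) = np.asarray(s1, float), np.asarray(s2, float)
    return np.linalg.norm((a1 + b1) / 2 - (a2 + b2) / 2)

def angular_distance(s1, s2):
    """Smallest angle between the segments' directions, in radians (0..pi/2)."""
    d1 = np.asarray(s1[1], float) - np.asarray(s1[0], float)
    d2 = np.asarray(s2[1], float) - np.asarray(s2[0], float)
    cos = abs(d1 @ d2) / (np.linalg.norm(d1) * np.linalg.norm(d2))
    return np.arccos(np.clip(cos, 0.0, 1.0))
```

The angular component is invariant to translation and scaling; the midpoint component is invariant under a common rigid motion of both segments. Making such behaviour explicit is exactly the kind of analysis the synthetic-data evaluation performs.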

Journal ArticleDOI
D. J. Kim
TL;DR: A facial expression recognition system using active shape model (ASM) landmark information and an appearance-based classification algorithm, the embedded hidden Markov model (EHMM), shows performance improvements through weight factors applied to the probabilities produced by the EHMM.
Abstract: Facial expression recognition is a challenging field in numerous studies and impacts important applications in many areas, such as human-computer interaction and data-driven animation. This paper therefore proposes a facial expression recognition system using active shape model (ASM) landmark information and an appearance-based classification algorithm, the embedded hidden Markov model (EHMM). First, we use ASM landmark information for facial image normalization and for weight factors applied to the probabilities produced by the EHMM. Each weight factor is calculated by investigating the Kullback-Leibler (KL) divergence of the best feature with high discrimination power. Next, we introduce the appearance-based recognition algorithm for the classification of emotion states; here, appearance-based recognition means the EHMM algorithm using two-dimensional discrete cosine transform (2D-DCT) feature vectors. The proposed method was evaluated on the CK facial expression database and the JAFFE database. The method using ASM information showed performance improvements of 6.5 and 2.5% over a previous method using ASM-based face alignment for the CK and JAFFE databases, respectively.

Journal ArticleDOI
TL;DR: A novel approach to automated classification of long-bone fractures based on the analysis of an input X-ray image using certain geometric properties of digital curves such as relaxed digital straight line segments (RDSS), arcs, discrete curvature, and concavity index.
Abstract: The classification of a patient's fractures plays an important role in orthopaedic evaluation and diagnosis. It not only aids in assessing the severity of the disease or injury but also serves as a basis for treatment or surgical correction. This paper proposes a novel approach to the automated classification of long-bone fractures based on the analysis of an input X-ray image. The method consists of four major steps: (i) extraction of the bone contour from a given X-ray image, (ii) identification of fracture points or cracks, (iii) determination of an equivalent set of geometric features in tune with the Muller-AO clinical classification of fractures, and (iv) identification and detailed assessment of the fracture type. The decision procedure makes use of certain geometric properties of digital curves, such as relaxed digital straight line segments (RDSS), arcs, discrete curvature, and concavity index. The proposed method for the analysis of fractures is applied to different types of bone images and is observed to produce correct classification in most of the test cases.

Journal ArticleDOI
TL;DR: An automatic system for the classification of welding defects from radiographic images is described and compared with KNN and SVM classifiers; linear defects such as lack of penetration, incomplete fusion, and external undercut are classified and recognized.
Abstract: Welding defect detection and classification are very important to guarantee welding quality. Over the last 30 years, a large amount of research has attempted to develop automatic (or semiautomatic) systems for the detection and classification of weld defects in continuous welds using radiography. In this paper, we describe an automatic system for the classification of welding defects from radiographic images and compare it with KNN and SVM classifiers. We classify and recognize linear defects such as lack of penetration, incomplete fusion, and external undercut. Experimental results show that the classification method is useful for lengthy defects and that the results obtained with our method are better than those of the two reference classifiers.

Journal ArticleDOI
TL;DR: A camera-based 3D pose estimation system based on a Kalman filter is proposed and evaluated against previously published methods for the same problem.
Abstract: Knowledge of the relative poses within a tractor/trailer combination is a vital prerequisite for kinematic modelling and trajectory estimation. In the case of autonomous vehicles or driver assistance systems, for example, monitoring an attached passive trailer is crucial for operational safety. We propose a camera-based 3D pose estimation system based on a Kalman filter. It is evaluated against previously published methods for the same problem.

Journal ArticleDOI
TL;DR: An efficiency analysis of some information-theoretic measures that can be used as objective functions in image registration is carried out; Renyi entropy potentially provides a faster convergence rate and a lower variance of parameter estimates when using recurrent image registration algorithms.
Abstract: An efficiency analysis of some information-theoretic measures that can be used as objective functions in image registration is carried out. Shannon mutual information and Renyi and Tsallis entropies are examined using synthesized images with correlation function, intensity, and noise distributions close to Gaussian. The results show that Renyi entropy potentially provides a faster convergence rate and a lower variance of parameter estimates when using recurrent image registration algorithms. By these criteria, Tsallis entropy gives slightly worse results but has a larger effective range. Shannon mutual information loses to both entropy measures and is more sensitive to noise; nevertheless, it is more effective in terms of computational complexity.
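The compared measures can be sketched as plug-in estimates from a joint intensity histogram (generic estimators with order q = 2 assumed for Renyi and Tsallis; hypothetical code, not the paper's experimental setup):

```python
import numpy as np

def registration_measures(a, b, bins=32, q=2.0):
    """Shannon mutual information plus Renyi and Tsallis joint entropies.

    All are plug-in estimates from the 2-D histogram of the two images'
    intensities; higher MI / lower joint entropy indicates better alignment.
    """
    h, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = h / h.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    nz = p > 0
    mi = (p[nz] * np.log(p[nz] / (px[:, None] * py[None, :])[nz])).sum()
    renyi = np.log((p ** q).sum()) / (1.0 - q)
    tsallis = (1.0 - (p ** q).sum()) / (q - 1.0)
    return mi, renyi, tsallis
```

A registration loop would evaluate one of these measures over candidate transform parameters and follow its gradient (or a recurrent estimate of it).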

Journal ArticleDOI
TL;DR: This paper is devoted to construction of a hybrid intelligent system of express-diagnostics of possible information security attackers (HIS DIVNAR) based on a synergy of several sciences and scientific directions.
Abstract: This paper is devoted to the construction of a hybrid intelligent system for the express-diagnostics of possible information security attackers (HIS DIVNAR) based on a synergy of several sciences and scientific directions: test pattern recognition; discrete mathematics; threshold and fuzzy logic; artificial intelligence; finite state machine (FSM) theory; reliability; the theory of separating systems; probability theory and mathematical statistics; and cognitive means. The proposed approach and the basis of the mathematical apparatus for constructing HIS DIVNAR are given in fragments. HIS DIVNAR consists of four components: the first, called IS DIOS, is designed for the express-diagnostics of organizational stress of the subject; the second (IS DIAPROD) is for the express-diagnostics and prevention of depression; the third (DIDEV) is for the express-diagnostics and prevention of deviant behavior; and the fourth, an intelligent system for the express-diagnostics of information security attackers (IS DINARLOG2), is for making and justifying decisions with the use of cognitive means based on previously revealed regularities, including fault-tolerant irredundant unconditional diagnostic tests, fault-tolerant mixed diagnostic tests, regularities, and decision rules, which are built using the applied IS DINARLOG1 constructed on the basis of the intelligent instrumental software IMSLOG. Further development of this approach is proposed.

Journal ArticleDOI
TL;DR: The modified phase correlation method was implemented with OpenMP parallel processing, which made it possible to achieve the necessary performance indicators, and it showed high efficiency and robustness.
Abstract: In this paper, the effectiveness of methods for determining object movement between frames of a video sequence is investigated as applied to the task of roundwood parameter control. The phase correlation method shows the best accuracy and performance under the given conditions. This method was modified to improve the accuracy and reliability of object tracking in the machine vision system under development. The modified phase correlation method was implemented with OpenMP parallel processing, which made it possible to achieve the necessary performance indicators. The method was tested on an image database from a real production process of round timber moving on a conveyor belt and showed high efficiency and robustness.
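The baseline phase correlation method being modified can be sketched in a few lines of NumPy (a serial sketch; the OpenMP parallelization and the authors' modifications are not reproduced):

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the integer circular shift (dy, dx) with b ~ roll(a, (dy, dx)).

    The normalized cross-power spectrum keeps only phase differences, so its
    inverse FFT is a sharp peak at the displacement.
    """
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    R = np.conj(A) * B
    R /= np.abs(R) + 1e-12          # keep phase only
    corr = np.fft.ifft2(R).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return int(dy), int(dx)
```

For conveyor tracking, the recovered shift between consecutive frames gives the displacement of the log region.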

Journal ArticleDOI
TL;DR: In this article, a method of modeling and analysis of the parameters of the ionosphere, which allows prediction of the data and identification of the anomalies during the ionospheric disturbances, is given.
Abstract: This work is aimed at creating methods for studying processes in the ionosphere-magnetosphere system during periods of increased solar and geomagnetic activity. A method for modeling and analyzing ionospheric parameters, which allows prediction of the data and identification of anomalies during ionospheric disturbances, is given. Computational solutions for the determination and estimation of geomagnetic disturbances are described. A method for determining anomalous changes in the time course of cosmic rays, which allows qualitative estimation of the moments of their origination, their duration, and their intensity, is suggested. On the basis of the elaborated methods, data on periods of strong and moderate magnetic storms are comprehensively analyzed. Sharp oscillations in the electron density of the ionosphere with positive and negative phases, which originate in the analyzed regions during increases in geomagnetic activity, are distinguished. Positive phases of the ionospheric disturbances, from several hours to one and a half days long, formed before the beginning of the magnetic storms. At the moments of increased electron concentration, a local increase is observed in the level of cosmic rays (several hours before the magnetic storms), which supports the solar nature of these effects. During the strongest geomagnetic disturbances, the electron concentration in the ionosphere decreased significantly, leading to prolonged negative phases of ionospheric storms, which coincided with a decrease in the level of cosmic rays (a Forbush decrease).

Journal ArticleDOI
TL;DR: An efficient action recognition system using Difference Intensity Distance Group Pattern (DIDGP) method and recognition using Support Vector Machines (SVM) classifier is presented.
Abstract: Recognition of human actions is a very important task in many applications, such as human-computer interaction, content-based video retrieval and indexing, intelligent video surveillance, gesture recognition, robot learning and control, etc. An efficient action recognition system using the Difference Intensity Distance Group Pattern (DIDGP) feature and a Support Vector Machine (SVM) classifier is presented. Initially, a Region of Interest (ROI) representing the motion information is extracted from the difference frame. The extracted ROI is divided into two blocks, B1 and B2. The proposed DIDGP feature is applied to the maximum-intensity block of the ROI to discriminate each action in the video sequences. The feature vectors obtained from the DIDGP are recognized using an SVM with polynomial and RBF kernels. The proposed method has been evaluated on the KTH action dataset, which consists of actions such as walking, running, jogging, hand waving, clapping, and boxing, and achieves an overall accuracy of 94.67% with the RBF kernel.

Journal ArticleDOI
TL;DR: In this paper, an algorithm is presented for quantifying the degree of receptor expression to steroid hormones by automatic analysis of microscope images of immunocytochemical specimens; a high correlation between the results of the automatic analysis and visual expert assessment is shown.
Abstract: The paper presents an algorithm for quantifying the degree of receptor expression to steroid hormones by automatic analysis of microscope images of immunocytochemical specimens. Experiments showed a high correlation between the results of the automatic analysis and visual expert assessment, confirming that the proposed algorithm can be applied to automate immunocytochemical analysis.

Journal ArticleDOI
TL;DR: A complete pipeline that relies on the LBP-TOP descriptors and the Bag-of-Words model for basic expressions classification is developed that achieves the average recognition rate of 97.7%, thus outperforming state- of-the-art methods in terms of accuracy.
Abstract: In this work we investigate the problem of robust dynamic facial expression recognition. We develop a complete pipeline that relies on the LBP-TOP descriptors and the Bag-of-Words (BoW) model for basic expression classification. Experiments performed on a standard dataset, the Extended Cohn-Kanade (CK+) database, show that the developed approach achieves an average recognition rate of 97.7%, thus outperforming state-of-the-art methods in terms of accuracy. The proposed method is quite robust, as it uses only relevant parts of the video frames, such as the areas around the mouth, nose, and eyes. The ability to work with sequences of arbitrary length is also a plus for practical applications, since it removes the need for complex temporal normalization methods.

Journal ArticleDOI
TL;DR: Experiments applying the proposed method to image enhancement algorithms show improved image quality.
Abstract: A method to improve the results of image enhancement is proposed. The idea of the method is to warp the pixel grid by moving pixels towards the nearest image edges. This makes edges sharper while keeping textured areas almost intact. Experiments applying the proposed method to image enhancement algorithms show improved image quality.

Journal ArticleDOI
TL;DR: A method for constructing linear decision rules using Fisher's criterion is discussed, and its efficiency is investigated on the example of complex arrhythmia recognition based on the spectral description of electrocardiosignals.
Abstract: Methods for generating spectral features for biosignal recognition in the frequency domain are described. A method for constructing linear decision rules using Fisher's criterion is discussed. The efficiency of the method is investigated on the example of complex arrhythmia recognition based on the spectral description of electrocardiosignals.
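The Fisher-criterion construction of a linear decision rule can be sketched as follows (a textbook two-class Fisher discriminant with a midpoint threshold; hypothetical code, not the paper's exact procedure):

```python
import numpy as np

def fisher_rule(X0, X1, ridge=1e-6):
    """Direction w maximizing between-class over within-class scatter."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = np.cov(X0.T) * (len(X0) - 1) + np.cov(X1.T) * (len(X1) - 1)
    w = np.linalg.solve(Sw + ridge * np.eye(len(m0)), m1 - m0)
    b = -w @ (m0 + m1) / 2            # threshold midway between projected means
    return w, b

def classify(X, w, b):
    """1 if the linear rule fires, else 0."""
    return (X @ w + b > 0).astype(int)
```

In the paper's setting, the rows of X0 and X1 would be spectral feature vectors of the two electrocardiosignal classes.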

Journal ArticleDOI
TL;DR: A solution based on partitioning a video into blocks of equal length and detecting objects in the first and last frames of each block is proposed; it is no less efficient than the currently used Lucas-Kanade and Tracking-Learning-Detection methods.
Abstract: This paper considers the problem of vehicle video detection and tracking. A solution based on partitioning a video into blocks of equal length and detecting objects in the first and last frames of each block is proposed. Matching vehicle locations in the first and last frames helps detect pairs of locations of the same object. Reconstruction of vehicle locations in the intermediate frames allows separate parts of motion tracks to be restored, and combining consecutive segments by matching makes it possible to reconstruct a complete track. Analysis of detection quality shows a true positive rate of more than 75%, including partially visible vehicles, while the average number of false positives per frame is less than 0.3. The tracking results for separate vehicles show that objects are tracked to the final frame, and for the majority of them the average overlap percentage is no worse than that of the currently used Lucas-Kanade and Tracking-Learning-Detection methods. The average tracking accuracy over all vehicles is about 70%.
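The overlap measure used to compare tracks is typically intersection-over-union of bounding boxes; a minimal sketch (hypothetical helper, boxes given as (x1, y1, x2, y2)):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))   # intersection width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))   # intersection height
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

Averaging this quantity between a tracker's boxes and ground-truth boxes over all frames gives the per-vehicle overlap percentage reported above.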