
Showing papers in "Signal, Image and Video Processing in 2018"


Journal ArticleDOI
TL;DR: The best proposal, named DeepBIQ, estimates the image quality by average-pooling the scores predicted on multiple subregions of the original image, having a linear correlation coefficient with human subjective scores of almost 0.91.
Abstract: In this work, we investigate the use of deep learning for distortion-generic blind image quality assessment. We report on different design choices, ranging from the use of features extracted from pre-trained convolutional neural networks (CNNs) as a generic image description, to the use of features extracted from a CNN fine-tuned for the image quality task. Our best proposal, named DeepBIQ, estimates the image quality by average-pooling the scores predicted on multiple subregions of the original image. Experimental results on the LIVE In the Wild Image Quality Challenge Database show that DeepBIQ outperforms the state-of-the-art methods compared, having a linear correlation coefficient with human subjective scores of almost 0.91. These results are further confirmed also on four benchmark databases of synthetically distorted images: LIVE, CSIQ, TID2008, and TID2013.
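A minimal sketch of the patch-wise pooling idea described above, assuming a placeholder CNN and patch size (the backbone, patch geometry, and regressor head here are illustrative, not the authors' DeepBIQ configuration):

```python
# Patch-wise quality scoring with average pooling, sketched with a placeholder CNN.
import torch
import torch.nn as nn

class TinyQualityCNN(nn.Module):
    """Stand-in for a fine-tuned quality CNN: maps an RGB patch to a scalar score."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(16, 1)

    def forward(self, x):
        f = self.features(x).flatten(1)
        return self.head(f).squeeze(1)

def predict_image_quality(image, model, patch=224, stride=224):
    """Score the subregions of one image and average-pool the patch scores."""
    _, h, w = image.shape
    scores = []
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            crop = image[:, top:top + patch, left:left + patch].unsqueeze(0)
            scores.append(model(crop))
    return torch.cat(scores).mean()  # average-pooled image-level quality

model = TinyQualityCNN().eval()
with torch.no_grad():
    q = predict_image_quality(torch.rand(3, 480, 640), model)
```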

254 citations


Journal ArticleDOI
TL;DR: Stationary wavelet transform has been used to decompose the segmented multilead electrocardiogram (ECG) signal into different sub-bands, and the proposed technique has been scrutinized under both the “class-oriented” and the more practical “subject-oriented” approaches.
Abstract: Early and accurate detection of myocardial infarction is imperative for reducing the mortality rate due to heart attack. The present work proposes a novel technique aimed at accurate and timely detection of inferior myocardial infarction (IMI). Stationary wavelet transform has been used to decompose the segmented multilead electrocardiogram (ECG) signal into different sub-bands. Sample entropy, normalized sub-band energy, log energy entropy, and median slope calculated over selected bands of multilead ECG are used as features. Support vector machine (SVM) and K-nearest neighbor (KNN) classifiers have been used to distinguish between subjects admitted for health control (HC) and patients suffering from IMI, using attributes selected on the basis of gain ratio. The full-length ECG of leads II, III, and aVF of all the subjects having IMI or admitted for HC from the Physikalisch-Technische Bundesanstalt Database (PTB-DB) has been used in the present work. The proposed technique has been scrutinized under both the “class-oriented” and the more practical “subject-oriented” approach. Under the class-oriented approach, data have been divided into training and test data irrespective of the patients, whereas in the subject-oriented approach, data from one patient have been used for testing and training has been done on the rest of the subjects. Under the class-oriented approach, the area under the receiver operating characteristic curve (Roc), sensitivity (Se%), specificity (Sp%), positive predictivity (+P%), and accuracy (Ac%) are Roc = 0.9945, Se% = 98.67, Sp% = 98.72, +P% = 98.79, Ac% = 98.69 using KNN, and Roc = 0.9994, Se% = 99.35, Sp% = 98.29, +P% = 98.41, Ac% = 98.84 using SVM. For the subject-oriented approach, an average Ac% = 81.71, Se% = 79.01, Sp% = 79.26, and +P% = 80.25 has been achieved. This shows the potential of the proposed technique to work for an unknown subject on which it has not been trained.
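A rough sketch of the sub-band feature idea, assuming a generic wavelet and toy data (the wavelet family, decomposition level, feature subset, and classifier settings are illustrative, not those reported by the authors; sample entropy and median slope are omitted for brevity):

```python
# Sub-band features from a stationary wavelet transform, then an SVM classifier.
import numpy as np
import pywt
from sklearn.svm import SVC

def subband_features(segment, wavelet="db4", level=4):
    """Normalized sub-band energy and log-energy entropy per SWT sub-band."""
    coeffs = pywt.swt(segment, wavelet, level=level)   # [(cA_i, cD_i), ...]
    bands = [c for pair in coeffs for c in pair]
    energies = np.array([np.sum(b ** 2) for b in bands])
    norm_energy = energies / energies.sum()
    log_energy_entropy = np.array([np.sum(np.log(b ** 2 + 1e-12)) for b in bands])
    return np.concatenate([norm_energy, log_energy_entropy])

# Toy usage: rows are segmented single-lead ECG windows (length divisible by 2**level).
rng = np.random.default_rng(0)
X = np.vstack([subband_features(rng.standard_normal(512)) for _ in range(20)])
y = np.array([0] * 10 + [1] * 10)                      # 0 = HC, 1 = IMI (toy labels)
clf = SVC(kernel="rbf").fit(X, y)
```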

95 citations


Journal ArticleDOI
TL;DR: The results show that the proposed approaches for human identification based on electrocardiogram (ECG) are robust and effective compared with other recent works.
Abstract: This paper presents hybrid approaches for human identification based on electrocardiogram (ECG). The proposed approaches consist of four phases, namely data acquisition, preprocessing, feature extraction and classification. In the first phase, the data acquisition phase, data sets are collected from two different databases, the ECG-ID and MIT-BIH Arrhythmia databases. In the second phase, noise reduction of the ECG signals is performed using wavelet transform and a series of de-noising filters. In the third phase, features are obtained using three different intelligent approaches: a non-fiducial approach, a fiducial approach and a fusion of the two. In the last phase, the classification phase, three classifiers are developed to classify subjects. The first classifier is based on an artificial neural network (ANN). The second classifier is based on K-nearest neighbor (KNN), relying on Euclidean distance. The last classifier is based on a support vector machine (SVM). A classification accuracy of 95% is obtained for ANN, 98% for KNN and 99% for SVM on the ECG-ID database, while 100% is obtained for ANN, KNN, and SVM on the MIT-BIH Arrhythmia database. The results show that the proposed approaches are robust and effective compared with other recent works.

53 citations


Journal ArticleDOI
TL;DR: The experimental results showed that the proposed approach achieved reasonable segmentation results for the indoor and outdoor thermal images, accuracy of the segmented images better than the non-segmented ones, and the entropy-based feature selection method obtained the best classification accuracy.
Abstract: Infrared spectrum-based human recognition systems offer straightforward and robust solutions for achieving excellent performance under uncontrolled illumination. In this paper, a human thermal face recognition model is proposed. The model consists of four main steps. Firstly, the grey wolf optimization algorithm is used to find optimal superpixel parameters of the quick-shift segmentation method. Then, the segmentation-based fractal texture analysis algorithm is used for extracting features, and rough set-based methods are used to select the most discriminative features. Finally, the AdaBoost classifier is employed for the classification process. For evaluating our proposed approach, thermal images from the Terravic Facial Infrared dataset were used. The experimental results showed that the proposed approach achieved (1) reasonable segmentation results for the indoor and outdoor thermal images, (2) better accuracy for the segmented images than for the non-segmented ones, and (3) the best classification accuracy with the entropy-based feature selection method. Overall, the classification accuracy of the proposed model reached 99%, which is better than some of the related work by around 5%.

50 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed method outperforms existing methods based on the UCSD anomaly detection video datasets.
Abstract: In this paper, a new method for detecting abnormal events in public surveillance systems is proposed. In the first step of the proposed method, candidate regions are extracted, and the redundant information is eliminated. To describe appearance and motion of the extracted regions, HOG-LBP and HOF are calculated for each region. Finally, abnormal events are detected using two distinct one-class SVM models. To achieve more accurate anomaly localization, the large regions are divided into non-overlapping cells, and the abnormality of each cell is examined separately. Experimental results show that the proposed method outperforms existing methods based on the UCSD anomaly detection video datasets.
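A minimal sketch of the appearance branch of such a pipeline, assuming plain HOG descriptors and a one-class SVM (the HOG-LBP/HOF combination, the region extraction step, and the authors' exact SVM settings are simplified away):

```python
# One-class SVM over HOG descriptors of candidate regions (appearance branch only).
import numpy as np
from skimage.feature import hog
from sklearn.svm import OneClassSVM

def describe_region(patch):
    """HOG descriptor of a grayscale region already resized to a fixed size upstream."""
    return hog(patch, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

# Train on regions extracted from normal frames only.
rng = np.random.default_rng(1)
normal_patches = [rng.random((64, 64)) for _ in range(50)]
X_train = np.vstack([describe_region(p) for p in normal_patches])
model = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(X_train)

# At test time, regions with a negative decision score are flagged as abnormal.
test_patch = rng.random((64, 64))
is_abnormal = model.decision_function([describe_region(test_patch)])[0] < 0
```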

48 citations


Journal ArticleDOI
TL;DR: This work proposes a data augmentation which enables the ConvNets to learn more adequately the patterns of the signal, and proposes a novel ConvNet architecture to explore the patterns among the accelerometer axes throughout the layers that compose the network.
Abstract: An increasing number of works have investigated the use of convolutional neural network (ConvNet) approaches to perform human activity recognition (HAR) based on wearable sensor data. These approaches present state-of-the-art results in HAR, outperforming traditional approaches such as handcrafted methods and 1D convolutions. Motivated by this, in this work we propose a set of methods to enhance ConvNets for HAR. First, we propose a data augmentation technique that enables the ConvNets to learn the signal patterns more adequately. Second, we exploit the attitude estimation of the accelerometer data to devise a set of novel feature descriptors which allow the ConvNets to better discriminate the activities. Finally, we propose a novel ConvNet architecture that explores the patterns among the accelerometer axes throughout the layers that compose the network. We demonstrate that this is a simpler way of improving activity recognition than proposing more complex architectures, serving as a direction for future work on building ConvNet architectures. The experimental results show that our proposed methods achieve notable improvements and outperform existing state-of-the-art methods.
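A small illustration of wearable-signal data augmentation, assuming common jittering and scaling transforms (the paper's specific augmentation is not detailed here, so these are generic stand-ins):

```python
# Simple augmentations for accelerometer windows (shape: samples x 3 axes).
import numpy as np

def jitter(window, sigma=0.05, rng=None):
    """Add Gaussian noise to every sample."""
    rng = rng or np.random.default_rng()
    return window + rng.normal(0.0, sigma, size=window.shape)

def scale(window, sigma=0.1, rng=None):
    """Multiply each axis by a random factor close to 1."""
    rng = rng or np.random.default_rng()
    factors = rng.normal(1.0, sigma, size=(1, window.shape[1]))
    return window * factors

rng = np.random.default_rng(0)
w = rng.standard_normal((128, 3))          # one 128-sample tri-axial window
augmented = [jitter(w, rng=rng), scale(w, rng=rng)]
```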

47 citations


Journal ArticleDOI
TL;DR: A new contrast enhancement algorithm is proposed, which is based on the fact that, for conventional histogram equalization, a uniform input histogram produces an equalized output histogram, and can improve the contrast while preserving original image features.
Abstract: A new contrast enhancement algorithm is proposed, which is based on the fact that, for conventional histogram equalization, a uniform input histogram produces an equalized output histogram. Hence, before applying histogram equalization, we modify the input histogram in such a way that it is close to a uniform histogram as well as to the original one. Thus, the proposed method can improve the contrast while preserving original image features. The main steps of the new algorithm are adaptive gamma transform, exposure-based histogram splitting, and histogram addition. The purpose of the gamma transform is to restrain histogram spikes in order to avoid over-enhancement and noise artifacts. Histogram splitting is for preserving mean brightness, and histogram addition is used to control histogram pits. Extensive experiments are conducted on 300 test images. The results are evaluated subjectively as well as by the DE, PSNR, EBCM, GMSD, and MCSD metrics, on which, except for the PSNR, the proposed algorithm shows improvements of 2.89, 9.83, 28.32, and 26.38% over the second-best ESIHE algorithm, respectively. That is to say, the overall image quality is better.
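A toy sketch of the modify-then-equalize idea, assuming a simple blend of the input histogram with a uniform one as a stand-in for the paper's gamma transform, histogram splitting, and histogram addition steps:

```python
# Equalize after pulling the input histogram toward a uniform one (simplified).
import numpy as np

def modified_histogram_equalization(img_u8, blend=0.5):
    hist = np.bincount(img_u8.ravel(), minlength=256).astype(float)
    uniform = np.full(256, hist.sum() / 256.0)
    target = (1 - blend) * hist + blend * uniform    # stand-in for the paper's modification
    cdf = np.cumsum(target)
    lut = np.round(255.0 * cdf / cdf[-1]).astype(np.uint8)
    return lut[img_u8]

rng = np.random.default_rng(0)
low_contrast = (rng.random((64, 64)) * 80 + 60).astype(np.uint8)
enhanced = modified_histogram_equalization(low_contrast)
```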

45 citations


Journal ArticleDOI
TL;DR: A novel flame detection algorithm based on CNN in real time by processing the video data generated by an ordinary camera monitoring a scene is proposed and can effectively realize the real-time performance of fire warning in practice.
Abstract: Computer vision-based fire detection is one of the crucial tasks in modern surveillance systems. In recent years, the convolutional neural network (CNN) has become an active topic because of its high recognition accuracy in a wide range of applications. How to reliably and effectively solve the problem of flame detection, however, is still challenging in practice. In this paper, we propose a novel real-time flame detection algorithm based on a CNN that processes the video data generated by an ordinary camera monitoring a scene. Firstly, to improve the efficiency of recognition, a candidate target area extraction algorithm is proposed for dealing with suspected flame areas. Secondly, the extracted feature maps of the candidate areas are classified by the designed deep neural network model based on CNN. Finally, the corresponding alarm signal is obtained from the classification results. The experimental results show that the proposed method can effectively identify fire and achieve a higher alarm rate on the homemade database. The proposed method can effectively realize real-time fire warning in practice.

45 citations


Journal ArticleDOI
TL;DR: A coarse-to-fine palmprint recognition method is proposed by combining the weighted adaptive center symmetric local binary pattern (WACS-LBP) and weighted sparse representation based classification (WSRC).
Abstract: In order to extract features that are invariant to scale, rotation and affine distortion in palmprints, a coarse-to-fine palmprint recognition method is proposed by combining the weighted adaptive center symmetric local binary pattern (WACS-LBP) and weighted sparse representation based classification (WSRC). The method consists of coarse and fine stages. In the coarse stage, using the similarity between the test sample and one sample of each training class, most of the training classes can be excluded and a small number of candidate classes for the test sample are retained. Thus, the original classification problem becomes clearer and simpler. In the fine stage, a robust rotation-invariant weighted histogram feature vector is extracted from each candidate sample and the test sample by WACS-LBP, the weighted sparse representation optimization problem is constructed from the similarity between the test sample and each candidate training sample, and the test sample is recognized by the minimum residual. The proposed method is tested and compared with existing algorithms on the PolyU and CASIA databases. The experimental results illustrate the better performance and rational interpretation of the proposed method.

43 citations


Journal ArticleDOI
TL;DR: Automated identification of retinal blood vessels based on whale algorithm seems highly successful through a comprehensive optimization process of operational parameters.
Abstract: The aim was to present a novel automated approach for extracting the vasculature of retinal fundus images. The proposed vasculature extraction method for retinal fundus images consists of two phases: a preprocessing phase and a segmentation phase. In the first phase, brightness enhancement is applied to the retinal fundus images. For the vessel segmentation phase, a hybrid model of multilevel thresholding along with the whale optimization algorithm (WOA) is used. WOA is used to improve the segmentation accuracy by finding the n−1 optimal thresholds for n-level thresholding of the fundus image. To evaluate performance, sensitivity, specificity, accuracy, and receiver operating characteristic (ROC) curve analysis are used. The proposed approach achieved an overall accuracy of 97.8%, sensitivity of 88.9%, and specificity of 98.7% for the identification of retinal blood vessels using a dataset that was collected from the Bostan diagnostic center in Fayoum city. The area under the ROC curve reached a value of 0.967. Automated identification of retinal blood vessels based on the whale algorithm seems highly successful through a comprehensive optimization process of the operational parameters.
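A compact sketch of the two-phase idea, assuming CLAHE for brightness enhancement and multi-Otsu thresholds as a stand-in for the whale-optimized threshold search:

```python
# Multilevel thresholding of a brightness-enhanced fundus image (multi-Otsu stands in
# for the whale-optimized threshold search).
import numpy as np
from skimage import exposure, filters

def segment_vessels(gray, classes=3):
    enhanced = exposure.equalize_adapthist(gray)          # brightness/contrast enhancement
    thresholds = filters.threshold_multiotsu(enhanced, classes=classes)  # n-1 thresholds
    labels = np.digitize(enhanced, bins=thresholds)       # n-level quantization
    return labels == 0                                     # darkest class as vessel candidates

rng = np.random.default_rng(0)
mask = segment_vessels(rng.random((128, 128)))
```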

42 citations


Journal ArticleDOI
TL;DR: This paper proposes a novel perception-based seam-cutting approach that considers the nonlinearity and the nonuniformity of human perception into the energy minimization and uses a sigmoid metric to characterize the perception of color discrimination.
Abstract: Image stitching is still challenging in consumer-level photography due to imperfect image captures. Recent works show that seam-cutting approaches can effectively relieve the artifacts generated by local misalignment. Normally, the seam-cutting approach is described in terms of energy minimization. However, few existing methods consider human perception in their energy functions, which sometimes means that another seam exists that is perceptually better than the one with the minimum energy. In this paper, we propose a novel perception-based seam-cutting approach that incorporates the nonlinearity and the nonuniformity of human perception into the energy minimization. Our method uses a sigmoid metric to characterize the perception of color discrimination and a saliency weight to reflect the fact that the human eye tends to pay more attention to salient objects. In addition, our approach can be easily integrated into other stitching pipelines. Representative experiments demonstrate substantial improvements over the conventional seam-cutting approach.

Journal ArticleDOI
TL;DR: This paper implemented a new skin lesion detection method based on the genetic algorithm for optimizing the neutrosophic set (NS) operation to reduce the indeterminacy on the dermoscopy images using the proposed ONKM method, which achieved the best performance using five fold cross-validation.
Abstract: This paper implements a new skin lesion detection method based on the genetic algorithm (GA) for optimizing the neutrosophic set (NS) operation to reduce the indeterminacy in dermoscopy images. Then, k-means clustering is applied to segment the skin lesion regions. The proposed method is therefore called optimized neutrosophic k-means (ONKM). On the training image set, an initial value of α in the α-mean operation of the NS is used with the GA to determine the optimized α value. The Jaccard index is used as the fitness function during the optimization process. The GA found the optimal α in the α-mean operation to be α_optimal = 0.0014 in the NS, which achieved the best performance using fivefold cross-validation. Afterward, the dermoscopy images are transformed into the neutrosophic domain via three memberships, namely true, indeterminate, and false, using α_optimal. The proposed ONKM method is then carried out to segment the dermoscopy images. Different random subsets of 50 images from the ISIC 2016 challenge training dataset are used during the fivefold cross-validation to train the proposed system and determine α_optimal. Several evaluation metrics, namely the Dice coefficient, specificity, sensitivity, and accuracy, are measured for performance evaluation of the test images using the proposed ONKM method with α_optimal = 0.0014, compared to the k-means and the γ–k-means methods. The results show the dominance of the ONKM method, with an average accuracy of 99.29 ± 1.61% compared with the k-means and γ–k-means methods.

Journal ArticleDOI
TL;DR: A pipeline that consists of feature extraction and filtering, shot clustering, and labeling stages, and a deep convolutional network is used as the source of the features, which can be applied to semantic video annotation in real time.
Abstract: The semantic video indexing problem is still underexplored. Solutions to the problem will significantly enrich the experience of video search, monitoring, and surveillance. This paper concerns scene detection and annotation, and specifically, the task of video structure mining for video indexing using deep features. The paper proposes and implements a pipeline that consists of feature extraction and filtering, shot clustering, and labeling stages. A deep convolutional network is used as the source of the features. The pipeline is evaluated using metrics for both scene detection and annotation. The results obtained show high scene detection and annotation quality estimated with various metrics. Additionally, we performed an overview and analysis of contemporary segmentation and annotation metrics. The outcome of this work can be applied to semantic video annotation in real time.

Journal ArticleDOI
TL;DR: It is observed that the resultant images of the proposed fusion scheme show appropriate fusion characteristics and retain the bone, CSF and edema details in the clinical format required for disease evaluation by the radiologists.
Abstract: This research proposes a novel fusion scheme for non-subsampled shearlet transform (NSST) which is based on simplified model of pulse coupled neural network (PCNN). The images to be fused are acquired from Postgraduate Institute of Medical Education and Research, Chandigarh, India, and internet repository. The image database contains computed tomography and T2-weighted magnetic resonance images. The images to be fused are decomposed into approximation and detail sub-bands using NSST. The regional energy-based activity measure with consistency verification is applied to fuse the approximation sub-band of NSST. The novel morphological gradient of detail sub-bands is fed as external stimulus to PCNN to fuse detail sub-bands. The proposed method is compared with five state-of-the-art fusion schemes visually and using five fusion performance parameters. It is observed that the resultant images of the proposed fusion scheme show appropriate fusion characteristics and retain the bone, CSF and edema details in the clinical format required for disease evaluation by the radiologists. The proposed scheme requires lesser computational time than other state-of-the-art PCNN-based fusion schemes.

Journal ArticleDOI
TL;DR: Simulation results in sparse system identification and echo cancellation applications are presented, which demonstrate that the proposed proportionate MCC exhibits outstanding performance under the impulsive noise environments.
Abstract: Proportionate-type adaptive filtering (PtAF) algorithms have been successfully applied to sparse system identification. Traditional PtAF algorithms based on the mean square error (MSE) criterion show poor robustness in the presence of impulsive noises or abrupt changes because MSE is only valid and rational under the Gaussian assumption. However, this assumption is not satisfied in most real-world applications. To improve robustness under non-Gaussian environments, we incorporate the maximum correntropy criterion (MCC) into the update equation of the PtAF to develop the proportionate MCC (PMCC) algorithm. Mean and mean-square convergence analyses are also performed. Simulation results in sparse system identification and echo cancellation applications are presented, which demonstrate that the proposed PMCC exhibits outstanding performance under impulsive noise environments.
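A rough sketch combining a PNLMS-style proportionate gain with the correntropy-weighted error of MCC, assuming an illustrative step size, kernel width, and gain rule (not the paper's exact PMCC recursion or its convergence-analysis setup):

```python
# Proportionate adaptive filter with a correntropy-weighted (MCC) error term.
import numpy as np

def pmcc_identify(x, d, L=16, mu=0.05, sigma=1.0, rho=0.01, delta=0.01):
    """Estimate a length-L sparse FIR system from input x and noisy output d."""
    w = np.zeros(L)
    for n in range(L, len(x)):
        u = x[n - L + 1:n + 1][::-1]                 # regressor (most recent sample first)
        e = d[n] - w @ u                             # a priori error
        kernel = np.exp(-e ** 2 / (2 * sigma ** 2))  # correntropy weight: small for outliers
        gamma = np.maximum(rho * max(delta, np.abs(w).max()), np.abs(w))
        g = gamma / gamma.mean()                     # PNLMS-style proportionate gains
        w = w + mu * kernel * e * g * u / (u @ (g * u) + 1e-8)
    return w

# Toy sparse echo path with occasional impulsive noise.
rng = np.random.default_rng(0)
h = np.zeros(16); h[[2, 9]] = [0.8, -0.5]
x = rng.standard_normal(4000)
d = np.convolve(x, h)[:len(x)] + 0.01 * rng.standard_normal(len(x))
d += (rng.random(len(x)) < 0.01) * rng.standard_normal(len(x)) * 5   # impulsive outliers
w_hat = pmcc_identify(x, d)
```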

Journal ArticleDOI
TL;DR: The numerical and statistical results indicate that MFE-BSA has higher peak signal-to-noise ratio, lower mean square error for all the images at different thresholding levels, and shows very good segmentation results in terms of preciseness, robustness, and stability.
Abstract: Multilevel thresholding of color images such as natural and satellite images becomes a challenging task due to the inherent fuzziness and ambiguity in such images. To address this issue, a modified fuzzy entropy (MFE) function is proposed in this paper. The MFE function is the difference of adjacent entropies, which is optimized to provide threshold levels such that all regions have almost equal entropies. To improve the performance of MFE, the backtracking search algorithm (BSA) is used. The numerical and statistical results indicate that MFE-BSA has a higher peak signal-to-noise ratio and lower mean square error for all the images at different thresholding levels. Moreover, the structural and feature similarity indices for MFE-BSA are closer to unity, and the average fitness value obtained using MFE-BSA is minimal (less than 0.5). Overall, MFE-BSA shows very good segmentation results in terms of preciseness, robustness, and stability.

Journal ArticleDOI
TL;DR: This paper describes a new approach of the first and the second challenge presented by Pattern Analysis, Statistical Modeling and Computational Learning (PASCAL) Classifying Heart Sounds Challenge by means of the segmentation total error value and the precision of each category.
Abstract: This paper describes a new approach to the first and the second challenge presented by the Pattern Analysis, Statistical Modeling and Computational Learning (PASCAL) Classifying Heart Sounds Challenge. The segmentation of phonocardiogram signals into the first heart sound S1 and the second heart sound S2 consists of heart sound preprocessing, heart sound peak detection, extra peak rejection, and S1 and S2 peak identification. Regarding the classification of heart sounds into a few classes, relevant descriptors have been extracted from the phonocardiogram signals, some of which rely on the segmentation results, and used as parameters for an appropriate classifier. The results of this methodology are compared with those of other approaches obtained at the PASCAL Classifying Heart Sounds Challenge by means of the segmentation total error value and the precision of each category.

Journal ArticleDOI
TL;DR: This study investigates the computer vision and machine learning methods for classification of brain magnetic resonance (MR) slices using Gabor filter and support vector machines, and proves the overall efficacy of the proposed method.
Abstract: In computational and clinical environments, autoclassification of brain magnetic resonance image (MRI) slices as normal and abnormal is challenging. The purpose of this study is to investigate the computer vision and machine learning methods for classification of brain magnetic resonance (MR) slices. In routine health-care units, MR scanners are being used to generate a massive number of brain slices, underlying the anatomical details. Pathological assessment from this medical data is being carried out manually by the radiologists or neuro-oncologists. It is almost impossible to analyze each slice manually due to the large amount of data produced by MRI devices at each moment. Irrefutably, if an automated protocol performing this task is executed, not only the radiologist will be assisted, but a better pathological assessment process can also be expected. Numerous schemes have been reported to address the issue of autoclassification of brain MRI slices as normal and abnormal, but accuracy, robustness and optimization are still an open issue. The proposed method, using Gabor filter and support vector machines, classifies brain MRI slices as normal or abnormal. Accuracy, sensitivity, specificity and ROC-curve have been used as standard quantitative measures to evaluate the proposed algorithm. To the best of our knowledge, this is the first study in which experiments have been performed on Whole Brain Atlas-Harvard Medical School (HMS) dataset, achieving an accuracy of 97.5%, sensitivity of 99%, specificity of 92% and ROC-curve as 0.99. To test the robustness against medical traits based on ethnicity and to achieve optimization, a locally developed dataset has also been used for experiments and remarkable results with accuracy (96.5%), sensitivity (98%), specificity (92%) and ROC-curve (0.97) were achieved. Comparison with state-of-the-art methods proved the overall efficacy of the proposed method.
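A minimal sketch of Gabor filter-bank features feeding an SVM, assuming illustrative frequencies, orientations, and energy statistics (not the authors' exact feature set or kernel parameters):

```python
# Gabor filter-bank features from an MR slice, classified with an SVM.
import numpy as np
from skimage.filters import gabor
from sklearn.svm import SVC

def gabor_features(slice2d, frequencies=(0.1, 0.2, 0.3), n_orient=4):
    feats = []
    for f in frequencies:
        for k in range(n_orient):
            real, imag = gabor(slice2d, frequency=f, theta=k * np.pi / n_orient)
            mag = np.hypot(real, imag)
            feats += [mag.mean(), mag.std()]          # simple energy statistics per filter
    return np.array(feats)

rng = np.random.default_rng(0)
X = np.vstack([gabor_features(rng.random((64, 64))) for _ in range(20)])
y = np.array([0] * 10 + [1] * 10)                     # 0 = normal, 1 = abnormal (toy labels)
clf = SVC(kernel="rbf").fit(X, y)
```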

Journal ArticleDOI
TL;DR: It is proved that although this regularization term is non-convex, the cost function can maintain convexity by specifying α in a proper range and the effectiveness of MCTV for both 1-D signal and 2-D image denoising is demonstrated.
Abstract: Total variation (TV) denoising is a commonly used method for recovering a 1-D signal or 2-D image from an additive white Gaussian noise observation. In this paper, we define the Moreau enhanced function of the $$L_1$$ norm as $$\Phi_\alpha(x)$$ and introduce the minmax-concave TV (MCTV) in the form of $$\Phi_\alpha(Dx)$$, where D is the finite difference operator. We show that MCTV approaches $$\Vert Dx\Vert_0$$ if the non-convexity parameter $$\alpha$$ is chosen properly and apply it to the denoising problem. MCTV can strongly induce signal sparsity in the gradient domain, and moreover, its form allows us to develop corresponding fast optimization algorithms. We also prove that although this regularization term is non-convex, the cost function can maintain convexity by specifying $$\alpha$$ in a proper range. Experimental results demonstrate the effectiveness of MCTV for both 1-D signal and 2-D image denoising.
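For orientation, a toy gradient-descent solver for plain smoothed (convex) 1-D TV denoising; it only illustrates the role of the difference operator D and is not the non-convex MCTV penalty or the fast algorithms of the paper:

```python
# Gradient descent on a smoothed 1-D TV objective: 0.5*||x - y||^2 + lam * sum sqrt((Dx)^2 + eps).
import numpy as np

def tv_denoise_1d(y, lam=1.0, eps=1e-2, step=0.02, iters=3000):
    x = y.copy()
    for _ in range(iters):
        dx = np.diff(x)                                   # D x
        w = dx / np.sqrt(dx ** 2 + eps)                   # derivative of the smoothed |.|
        grad = (x - y) + lam * np.concatenate(([-w[0]], -np.diff(w), [w[-1]]))  # + lam * D^T w
        x -= step * grad
    return x

rng = np.random.default_rng(0)
clean = np.repeat([0.0, 1.0, -0.5, 0.5], 50)              # piecewise-constant signal
noisy = clean + 0.1 * rng.standard_normal(clean.size)
denoised = tv_denoise_1d(noisy)
```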

Journal ArticleDOI
TL;DR: A fast and robust seam estimation method (FARSE) is presented by defining gray-weighted distance and gradient-domain region of differences to avoid visible seams and ghosting and results indicate that the FARSE method is scale-invariant and it is fast and more robust than the other methods.
Abstract: Image stitching has a wide range of applications in computer vision/graphics and virtual reality. Seam estimation is one of the key steps in image stitching. This step can relieve ghosts and artifacts generated by misalignment or moving objects in the overlap region. This paper presents a fast and robust seam estimation method (FARSE) that defines a gray-weighted distance and a gradient-domain region of differences to avoid visible seams and ghosting. The optimal seam is estimated by searching in two weighted matrices, namely a cost matrix and a value matrix. The proposed method can be implemented simply. Results indicate that the FARSE method is scale-invariant and that it is fast and more robust than the other methods.

Journal ArticleDOI
TL;DR: It turns out that the debayering performance can be improved quite dramatically after fusion based on extensive evaluations, and none of the seven algorithms can yield the best performance in terms of peak signal-to-noise ratio (PSNR), CIELAB score, and subjective evaluation.
Abstract: Bayer pattern has been widely used in commercial digital cameras. In NASA’s mast camera (Mastcams) onboard the Mars rover Curiosity, Bayer pattern has also been used in capturing the RGB bands. It is well known that debayering, also known as demosaicing in the literature, introduces artifacts such as false colors and zipper edges. In this paper, we first present four fusion approaches, including weighted and the well-known alpha-trimmed mean filtering approaches. Each fusion approach combines demosaicing results from seven debayering algorithms in the literature, which are selected based on their performance mentioned in other survey papers and the availability of open source codes. Second, we present debayering results using two benchmark image data sets: IMAX and Kodak. It was observed that none of the seven algorithms in the literature can yield the best performance in terms of peak signal-to-noise ratio (PSNR), CIELAB score, and subjective evaluation. Although the fusion algorithms are simple, it turns out that the debayering performance can be improved quite dramatically after fusion based on our extensive evaluations. In particular, the average PSNR improvements of the weighted fusion algorithm over the best individual method are 1.1 dB for the IMAX database and 1.8 dB for the Kodak database, respectively. Third, we applied the various algorithms to 36 actual Mastcam images. Subjective evaluation indicates that the fusion algorithms still work well, but not as good as the existing debayering algorithm used by NASA.
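A small sketch of the alpha-trimmed mean fusion step, assuming the outputs of K debayering algorithms are already available as an array (the weighting scheme of the weighted fusion variant is omitted):

```python
# Alpha-trimmed mean fusion of demosaicing results from several debayering algorithms.
import numpy as np

def alpha_trimmed_fusion(demosaiced_stack, trim=1):
    """demosaiced_stack: (K, H, W, 3) outputs of K debayering algorithms.
    Per pixel and channel, drop the `trim` lowest and highest values, then average the rest."""
    ordered = np.sort(demosaiced_stack, axis=0)
    kept = ordered[trim:demosaiced_stack.shape[0] - trim]
    return kept.mean(axis=0)

# Toy usage with K = 5 stand-in outputs (real inputs would come from existing debayering codes).
rng = np.random.default_rng(0)
stack = rng.random((5, 32, 32, 3))
fused = alpha_trimmed_fusion(stack, trim=1)
```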

Journal ArticleDOI
TL;DR: A method of multiple moving object detection and tracking by combining background subtraction and K-means clustering is proposed and it is capable of handling merging and splitting of moving objects using spatial information.
Abstract: Object detection and tracking is a fundamental and challenging task in computer vision; continuous deformation of objects during movement and background clutter lead to poor tracking. In this paper, a method for multiple moving object detection and tracking that combines background subtraction and K-means clustering is proposed. The proposed method can handle object occlusion, shadows and camera jitter. Background subtraction filters out irrelevant information, and K-means clustering is employed to select the moving objects from the remaining information; the method is capable of handling merging and splitting of moving objects using spatial information. Experimental results show that the proposed method is robust when compared to other techniques.
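A minimal sketch of the detection stage, assuming an OpenCV MOG2 background subtractor and a fixed number of clusters (the paper's handling of occlusion, shadows, merging, and splitting is not reproduced here):

```python
# Background subtraction followed by K-means clustering of foreground pixels into objects.
import cv2
import numpy as np
from sklearn.cluster import KMeans

subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

def detect_objects(frame_bgr, n_objects=3):
    """Call once per frame of a video stream; returns one (x, y) centroid per moving object."""
    mask = subtractor.apply(frame_bgr)
    ys, xs = np.nonzero(mask == 255)                 # foreground pixels (shadows excluded)
    if len(xs) < n_objects:
        return []
    pts = np.column_stack([xs, ys]).astype(np.float32)
    centers = KMeans(n_clusters=n_objects, n_init=10).fit(pts).cluster_centers_
    return centers
```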

Journal ArticleDOI
TL;DR: A newly fingertip electrocardiogram (ECG) data acquisition device capable of recording the lead-1 ECG signal through the right- and left-hand thumb fingers and a biometric identification method based on combining autocorrelation and discrete cosine transform-based features, cepstral features, and QRS beat information is proposed.
Abstract: In this research work, we present a new fingertip electrocardiogram (ECG) data acquisition device capable of recording the lead-1 ECG signal through the right- and left-hand thumb fingers. The proposed device is highly sensitive, dry-contact, portable, user-friendly, and inexpensive, and does not require conventional components that are cumbersome and irritating, such as wet adhesive Ag/AgCl electrodes. Another advantage of this device is that it makes it possible to record and use the lead-1 ECG signal easily anywhere and in any condition, in combination with any platform, for advanced applications such as biometric recognition and clinical diagnostics. Furthermore, we propose a biometric identification method based on combining autocorrelation and discrete cosine transform-based features, cepstral features, and QRS beat information. The proposed method was evaluated on three fingertip ECG signal databases recorded with the proposed device. The experimental results demonstrate that the proposed biometric identification method achieves person recognition rates of 100% (30 out of 30), 100% (45 out of 45), and 98.33% (59 out of 60) for 30, 45, and 60 subjects, respectively.
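A rough sketch of the autocorrelation + DCT (AC/DCT) feature branch with nearest-neighbor matching, assuming illustrative window, lag, and coefficient counts (cepstral features and QRS beat information are omitted):

```python
# Autocorrelation + DCT (AC/DCT) features from an ECG window, with nearest-neighbor matching.
import numpy as np
from scipy.fft import dct

def ac_dct_features(ecg_window, n_lags=100, n_coeffs=30):
    x = ecg_window - ecg_window.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:len(x) - 1 + n_lags]
    ac /= ac[0]                                     # normalize by the zero-lag energy
    return dct(ac, norm="ortho")[:n_coeffs]         # keep the first DCT coefficients

def identify(probe, gallery):
    """gallery: dict of subject_id -> enrolled feature vector."""
    return min(gallery, key=lambda s: np.linalg.norm(gallery[s] - probe))

rng = np.random.default_rng(0)
gallery = {f"subject_{i}": ac_dct_features(rng.standard_normal(1000)) for i in range(3)}
probe_id = identify(ac_dct_features(rng.standard_normal(1000)), gallery)
```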

Journal ArticleDOI
TL;DR: In this work, a new dehazing algorithm based on the dark channel prior, mathematical morphology operations (opening and dilation), and a Gaussian filter is proposed, and the performance of the proposed algorithm is compared qualitatively and quantitatively against previously reported algorithms.
Abstract: Image pre-processing is a critical stage in computer vision systems, with greater relevance when the input images are captured in outdoor environments, because the pictures may exhibit low contrast and modified colors. A common condition present in outdoor images is haze. In this work, a new dehazing algorithm based on the dark channel prior, mathematical morphology operations (opening and dilation), and a Gaussian filter is proposed. Moreover, the performance of the proposed algorithm is compared qualitatively and quantitatively against previously reported algorithms. The obtained results show that the proposed algorithm requires less processing time while providing higher-quality dehazing results than other state-of-the-art approaches.
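A compact sketch of dark channel prior dehazing that uses grey erosion for the minimum filter and a Gaussian filter to smooth the transmission, assuming common default parameters rather than the paper's exact morphological pipeline:

```python
# Dark channel prior dehazing: morphological (erosion) dark channel, Gaussian-smoothed transmission.
import numpy as np
from scipy.ndimage import grey_erosion, gaussian_filter

def dehaze(img, patch=15, omega=0.95, t0=0.1, sigma=8):
    """img: float RGB image in [0, 1]."""
    dark = grey_erosion(img.min(axis=2), size=patch)          # dark channel via min-filter/erosion
    # Atmospheric light: mean color of the brightest 0.1% dark-channel pixels.
    idx = np.argsort(dark.ravel())[-max(1, dark.size // 1000):]
    A = img.reshape(-1, 3)[idx].mean(axis=0)
    # Transmission estimate, smoothed with a Gaussian filter instead of soft matting.
    t = 1.0 - omega * grey_erosion((img / A).min(axis=2), size=patch)
    t = np.clip(gaussian_filter(t, sigma), t0, 1.0)
    return np.clip((img - A) / t[..., None] + A, 0.0, 1.0)

rng = np.random.default_rng(0)
hazy = np.clip(rng.random((64, 64, 3)) * 0.5 + 0.4, 0, 1)
recovered = dehaze(hazy)
```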

Journal ArticleDOI
TL;DR: Comparative studies of HCAR system identification established efficacy of the designed methodology based on differential evolution over its counterparts, and the performance of meta-heuristic approaches is validated through statistical performance indices based on absolute error, weight deviations and mean squared error.
Abstract: In the present study, the strength of stochastic computational paradigms is investigated for parameter estimation of the Hammerstein control autoregressive (HCAR) model by exploiting differential evolution, genetic algorithms and pattern search methods. The multidimensional and nonlinear nature of the problem arising in digital signal systems, along with noise, makes it a challenging optimization task, which is handled with the robustness and effectiveness of stochastic solvers to ensure convergence and avoid trapping in local minima. The performance of the meta-heuristic approaches is validated through statistical performance indices based on absolute error, weight deviations and mean squared error. Comparative studies of HCAR system identification established the efficacy of the designed methodology based on differential evolution over its counterparts.
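A toy example of differential-evolution-based parameter estimation for a simple Hammerstein-type model (static polynomial nonlinearity followed by a first-order linear block); the model order, bounds, and noise level are assumptions, not the paper's HCAR specification:

```python
# Differential evolution for parameter estimation of a toy Hammerstein model.
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)
u = rng.uniform(-1, 1, 500)

def simulate(theta, u):
    a2, c1, d1 = theta
    v = u + a2 * u ** 3                        # static nonlinearity (leading coefficient fixed to 1)
    y = np.zeros(len(u))
    for n in range(1, len(u)):
        y[n] = -c1 * y[n - 1] + d1 * v[n - 1]  # first-order linear ARX block
    return y

true_theta = np.array([0.5, -0.6, 0.8])
y_meas = simulate(true_theta, u) + 0.02 * rng.standard_normal(len(u))

cost = lambda theta: np.mean((simulate(theta, u) - y_meas) ** 2)
bounds = [(-2, 2), (-0.95, 0.95), (-2, 2)]     # keep the linear block stable
result = differential_evolution(cost, bounds, seed=0)
theta_hat = result.x                           # should land close to true_theta
```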

Journal ArticleDOI
TL;DR: A multi-trend binary code descriptor (MTBCD) that captures the global color features, but also reflects the local texture information and exploits the trend of pixels change in four symmetric directions to obtain the texture feature and extracts the spatial correlation information using co-occurrence matrix.
Abstract: With the development of image vision technology, local descriptors have attracted wide attention in the fields of image retrieval and classification. Even though a variety of methods based on local descriptors have achieved excellent performance, most of them cannot effectively represent the trend of pixel change, and they neglect the mutual occurrence of patterns. Therefore, how to construct local descriptors is of vital importance but challenging. In order to solve this problem, this paper proposes a multi-trend binary code descriptor (MTBCD). MTBCD mimics human visual perception to describe images by constructing a set of multi-trend descriptors which are encoded with binary codes. The method exploits the trend of pixel change in four symmetric directions to obtain the texture feature, and extracts the spatial correlation information using a co-occurrence matrix. These intermediate features are integrated into one histogram using a new fusion strategy. The proposed method not only captures the global color features, but also reflects the local texture information. Extensive experiments have demonstrated the excellent performance of the proposed method.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed 3D and 2D representations and the deep features extracted from them are robust and efficient, and the method achieves results comparable with state-of-the-art methods in the literature.
Abstract: In activity recognition, usage of depth data is a rapidly growing research area. This paper presents a method for recognizing single-person activities and dyadic interactions by using deep features extracted from both 3D and 2D representations, which are constructed from depth sequences. First, a 3D volume representation is generated by considering spatiotemporal information in the depth frames of an action sequence. Then, a 3D-CNN is trained to learn features from these 3D volume representations. In addition to this, a 2D representation is constructed from the weighted sum of the depth sequences. This 2D representation is used with a pre-trained CNN model. Features learned from this model and the 3D-CNN model are used in training the final approach after a feature selection step. Among the various classifiers, an SVM-based model produced the best results. The proposed method was tested on the MSR-Action3D dataset for single-person activities, the SBU dataset for dyadic interactions, and the NTU RGB+D dataset for both types of actions. Experimental results show that the proposed 3D and 2D representations and the deep features extracted from them are robust and efficient. The proposed method achieves results comparable with state-of-the-art methods in the literature.

Journal ArticleDOI
TL;DR: Simulations indicate that in contrast to the original VA, the proposed IVA can effectively suppress the SP caused by the intersected IFs and thus can achieve more accurate IFs for the multicomponent signals especially those with monotonous IFs.
Abstract: The Viterbi algorithm (VA) applied to a time–frequency (TF) representation is a high-performing instantaneous frequency (IF) estimator for discrete-time signals, but it suffers from the switch problem (SP) at the intersection points of multiple components on the TF plane. To suppress the SP in the VA, the improved VA (IVA) presented in this paper assumes that the variation trends between two adjacent IF estimates are not large, and a novel penalty function is therefore introduced and added to the original VA. To verify the algorithm, the proposed algorithm is first simulated on several multicomponent signals; then, how the parameter in the new penalty function influences the performance is analyzed. A comparison of the proposed algorithm with the VA on signals in the presence of noise is also made. Simulations indicate that, in contrast to the original VA, the proposed IVA can effectively suppress the SP caused by intersecting IFs and thus can achieve more accurate IFs for multicomponent signals, especially those with monotonic IFs.

Journal ArticleDOI
TL;DR: A novel, local feature-based vein representation method based on minutiae features from skeleton images of venous networks to learn the most discriminative regions and features of dorsal hand veins to identify persons who are scanned.
Abstract: This paper presents a novel, local feature-based vein representation method based on minutiae features from skeleton images of venous networks. The main motivation is to learn the most discriminative regions and features of dorsal hand veins to identify persons who are scanned. These minutiae features include end points and the arc lines between the two end points as measured along the boundary of the region of interest. In addition, we propose a dynamic pattern tree to accelerate matching performance and evaluate the discriminatory power of these feature points for verifying a person’s identity. In a comparison with six existing verification algorithms, the proposed method achieved the highest accuracy in the lowest tested matching time.

Journal ArticleDOI
TL;DR: A novel feature extraction method for low-quality fingerprint images is proposed, which mimics magnetic energy attracting iron filings; this method is based on image energies attracting uniformly distributed points to form the final features that can describe a fingerprint.
Abstract: In fingerprint recognition systems, feature extraction is an important part because of its impact on the final performance of the overall system, particularly in the case of low-quality images, which pose significant challenges to traditional fingerprint feature extraction methods. In this work, we make two major contributions. First, a novel feature extraction method for low-quality fingerprint images is proposed, which mimics magnetic energy attracting iron filings; this method is based on image energies attracting uniformly distributed points to form the final features that can describe a fingerprint. Second, we created a new low-quality fingerprint image database to evaluate the proposed method. We used a mobile phone camera to capture the fingerprints of 136 different persons, with five samples each, to obtain 680 fingerprint images in total. To match the computed features, we used dynamic time warping and evaluated the performance of our system based on a k-nearest neighbor classifier. Further, we represent the features using their probability density functions to evaluate the method with some other classifiers. The highest identification accuracy recorded over several experiments reached 95.11% using our in-house database. The experimental results show that the proposed method can be used as a general feature extraction method for other applications.
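A small sketch of dynamic time warping used for matching variable-length feature sequences, with a nearest-neighbor identification rule; the feature sequences themselves (the energy-attracted points) are assumed to be precomputed:

```python
# Dynamic time warping distance between two variable-length feature sequences,
# used here for nearest-neighbor fingerprint matching.
import numpy as np

def dtw_distance(a, b):
    """a, b: 1-D feature sequences (e.g., sampled point descriptors)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def identify(probe, gallery):
    """gallery: dict of person_id -> list of enrolled feature sequences."""
    best = min(((pid, dtw_distance(probe, s)) for pid, seqs in gallery.items() for s in seqs),
               key=lambda t: t[1])
    return best[0]

rng = np.random.default_rng(0)
gallery = {"p1": [rng.random(40)], "p2": [rng.random(55)]}
match = identify(rng.random(48), gallery)
```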