Showing papers in "Signal, Image and Video Processing in 2015"
TL;DR: The proposed method fuses source images by a weighted average, using weights computed from detail images extracted from the source images with CBF; it shows good performance, and the visual quality of its fused images is superior to that of other methods.
Abstract: Like the bilateral filter (BF), the cross bilateral filter (CBF) considers both gray-level similarities and geometric closeness of the neighboring pixels without smoothing edges, but it uses one image for finding the kernel and the other to filter, and vice versa. In this paper, it is proposed to fuse source images by weighted average using the weights computed from the detail images that are extracted from the source images using CBF. The performance of the proposed method has been verified on several pairs of multisensor and multifocus images and compared with existing methods visually and quantitatively. It is found that none of the methods shows consistent performance across all the performance metrics. Compared to them, however, the proposed method has shown good performance in most of the cases. Further, the visual quality of the fused image produced by the proposed method is superior to that of the other methods.
417 citations
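As a rough illustration of the fusion rule summarized above, the sketch below filters each source image with a cross bilateral filter guided by the other source (using OpenCV's jointBilateralFilter from opencv-contrib), takes the residuals as detail images, and fuses the sources by a weighted average. The weight rule here (local detail energy over an 11 × 11 window) is a simplified stand-in for the paper's weight computation, not its exact formula.

```python
# Sketch of CBF-based fusion, assuming two registered grayscale sources.
# The weight rule (local detail energy) is a simplified stand-in for the
# paper's weight computation.
import cv2
import numpy as np

def cbf_fuse(a, b, d=9, sigma_color=25.0, sigma_space=9.0, eps=1e-6):
    a32, b32 = a.astype(np.float32), b.astype(np.float32)
    # Cross bilateral filter: kernel found in one image, filtering applied to the other.
    a_cbf = cv2.ximgproc.jointBilateralFilter(b32, a32, d, sigma_color, sigma_space)
    b_cbf = cv2.ximgproc.jointBilateralFilter(a32, b32, d, sigma_color, sigma_space)
    det_a, det_b = a32 - a_cbf, b32 - b_cbf               # detail images
    # Pixel-wise weights from local detail energy (box-filtered squared detail).
    w_a = cv2.boxFilter(det_a * det_a, -1, (11, 11))
    w_b = cv2.boxFilter(det_b * det_b, -1, (11, 11))
    fused = (w_a * a32 + w_b * b32) / (w_a + w_b + eps)   # weighted average
    return np.clip(fused, 0, 255).astype(np.uint8)

# Example: fused = cbf_fuse(cv2.imread('a.png', 0), cv2.imread('b.png', 0))
```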
TL;DR: A new vision-based fall detection technique based on human shape variation is presented, in which only three points are used to represent a person instead of the conventional ellipse or bounding box; this increases the fall detection rate without increasing the computational complexity.
Abstract: Falls are one of the major health hazards among the aging population aged 65 and above, and could potentially result in a significant hindrance to independent living. With the advances in medical science in the last few decades, the aging population increases every year, and thus, a fall detection system at home is increasingly important. This paper presents a new vision-based fall detection technique that is based on human shape variation, where only three points are used to represent a person instead of the conventional ellipse or bounding box. Falls are detected by analyzing the shape change of the human silhouette through the features extracted from the three points. Experimental results show that, in comparison with the conventional ellipse and bounding box techniques, the proposed three-point-based technique increases the fall detection rate without increasing the computational complexity.
127 citations
TL;DR: This paper surveys the problems in signal, image, and video processing to which the ABC algorithm has been applied and describes how the ABC algorithm was used in the approaches for solving these kinds of problems.
Abstract: The artificial bee colony (ABC) algorithm is a swarm intelligence algorithm that simulates the foraging behavior of honeybees. It has been successfully applied to many optimization problems in different areas. Since 2009, the ABC algorithm has been employed for various problems in the signal, image, and video processing fields. This paper surveys the problems in these fields to which the ABC algorithm has been applied and describes how the ABC algorithm was used in the approaches for solving these kinds of problems.
120 citations
TL;DR: In this article, a proximal approach is proposed to deal with a class of convex variational problems involving nonlinear constraints, which can be expressed as the lower-level set of a sum of convex functions evaluated over different blocks of the linearly transformed signal.
Abstract: We propose a proximal approach to deal with a class of convex variational problems involving nonlinear constraints. A large family of constraints, proven to be effective in the solution of inverse problems, can be expressed as the lower-level set of a sum of convex functions evaluated over different blocks of the linearly transformed signal. For such constraints, the associated projection operator generally does not have a simple form. We circumvent this difficulty by splitting the lower-level set into as many epigraphs as functions involved in the sum. In particular, we focus on constraints involving \(\varvec{\ell }_q\)-norms with \(q\ge 1\), distance functions to a convex set, and \(\varvec{\ell }_{1,p}\)-norms with \(p\in \{2,{+\infty }\}\). The proposed approach is validated in the context of image restoration by making use of constraints based on Non-Local Total Variation. Experiments show that our method leads to significant improvements in terms of convergence speed over existing algorithms for solving similar constrained problems.
102 citations
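The epigraph-splitting step described above can be stated compactly. The notation below is generic (blocks \(L_b\), functions \(g_b\), bound \(\eta\)) and illustrative rather than the paper's exact formulation:

```latex
% A constraint given as a lower-level set of a sum of convex functions,
%   C = { x : \sum_{b=1}^{B} g_b(L_b x) \le \eta },
% is split by introducing one auxiliary variable per term, so that each g_b
% only appears through its epigraph (whose projection is often simple):
\begin{equation*}
  x \in C
  \iff
  \exists\, \zeta \in \mathbb{R}^{B} :\;
  \sum_{b=1}^{B} \zeta_b \le \eta
  \quad\text{and}\quad
  (L_b x,\, \zeta_b) \in \operatorname{epi} g_b
  \quad (b = 1,\dots,B),
\end{equation*}
% where epi g_b = { (u, \zeta) : g_b(u) \le \zeta }.
```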
TL;DR: The problem of dictionary learning and its analogy to source separation is addressed, and a fast dictionary learning algorithm based on the steepest descent method is proposed; it is fast because both coefficients and dictionary elements are updated simultaneously rather than column by column.
Abstract: In this paper, the problem of dictionary learning and its analogy to source separation is addressed. First, we extend the well-known K-SVD method to incoherent K-SVD, to enforce the algorithm to achieve an incoherent dictionary. Second, a fast dictionary learning algorithm based on the steepest descent method is proposed. The main advantage of this method is its high speed, since both coefficients and dictionary elements are updated simultaneously rather than column by column. Finally, we apply the proposed methods to both synthetic and real functional magnetic resonance imaging data for the detection of activated regions in the brain. The results of our experiments confirm the effectiveness of the proposed ideas. In addition, we compare the quality of the results and empirically demonstrate the superiority of the proposed dictionary learning methods over the conventional algorithms.
87 citations
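A minimal sketch of the simultaneous-update idea behind the steepest-descent dictionary learning described above: both the dictionary and the coefficients take full gradient steps on the reconstruction error in each iteration, instead of a column-by-column update. The step sizes and the soft-threshold sparsity step are illustrative choices, not the paper's exact algorithm.

```python
# Steepest-descent dictionary learning sketch: D (dictionary) and A
# (coefficients) are updated simultaneously with gradient steps on
# ||X - D A||_F^2; the soft-threshold and step sizes are illustrative.
import numpy as np

def dict_learn_sd(X, n_atoms, n_iter=200, lr=1e-2, lam=0.1, seed=0):
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    A = np.zeros((n_atoms, X.shape[1]))
    for _ in range(n_iter):
        R = X - D @ A                                   # reconstruction residual
        D += lr * R @ A.T                               # gradient step on the dictionary
        A += lr * D.T @ R                               # gradient step on the coefficients
        A = np.sign(A) * np.maximum(np.abs(A) - lam * lr, 0.0)   # promote sparsity
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12    # keep unit-norm atoms
    return D, A
```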
TL;DR: An innovative construction of the nonlinear element of a block cipher, based on the fractional Rössler chaotic system, is presented; the investigations validate that the designed cryptosystem is consistent for secure communication.
Abstract: In this article, we have presented an innovative construction of the nonlinear element of a block cipher. The suggested construction is chaos based, where we used the fractional Rössler chaotic system. We have studied various features of our proposed nonlinear component. The outcomes of the investigations validate that the designed cryptosystem is consistent for secure communication.
80 citations
TL;DR: The simulation results show that the proposed histogram equalization method outperforms other state-of-the-art methods in terms of both visual quality and runtime.
Abstract: This paper proposes a new histogram equalization method for effective and efficient mean brightness preservation and contrast enhancement, which prevents intensity saturation and has the ability to preserve image fine details. Basically, the proposed method first separates the test image histogram into two sub-histograms. Then, the plateau limits are calculated from the respective sub-histograms, and they are used to modify those sub-histograms. Histogram equalization is then separately performed on the two sub-histograms to yield a clean and enhanced image. To demonstrate the feasibility of the proposed method, a total of 190 test images are used in simulation and comparison, of which 72 are standard test images, while the remainder are real natural images obtained from a personal digital camera. The simulation results show that the proposed method outperforms other state-of-the-art methods in terms of both visual quality and runtime. Moreover, the simple implementation and fast runtime further underline the suitability of the proposed method for consumer electronic products, such as mobile phones, digital cameras, and video devices.
74 citations
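A minimal sketch of the two-step idea described above, assuming the histogram is split at the mean intensity and the plateau limit of each sub-histogram is its median non-zero bin count; the paper's exact separation point and plateau rule may differ.

```python
# Bi-histogram equalization with plateau clipping (illustrative split point
# and plateau rule, not the paper's exact ones).
import numpy as np

def plateau_bi_he(img):
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    m = int(round(img.mean()))                        # separation point
    mapping = np.arange(256, dtype=np.float64)        # identity mapping by default
    for lo, hi in ((0, m), (min(m + 1, 255), 255)):
        sub = hist[lo:hi + 1].copy()
        nz = sub[sub > 0]
        if nz.size == 0 or hi <= lo:
            continue
        sub = np.minimum(sub, np.median(nz))          # plateau clipping
        cdf = np.cumsum(sub) / sub.sum()
        mapping[lo:hi + 1] = lo + cdf * (hi - lo)     # equalize within the sub-range
    return mapping[img].astype(np.uint8)
```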
TL;DR: This paper proposes a robust watermarking method for medical images to avoid their detachment from the corresponding EPR data in which the watermark is embedded using the digital imaging and communications in medicine standard metadata together with cryptographic techniques.
Abstract: In general, management of medical data involves several issues of medical information such as authentication, security, integrity, and privacy, among others. Because medical images and their related electronic patient record (EPR) data are stored separately, the probability of corruption of this information, or of its detachment from the corresponding EPR data, can be very high. Losing data from the corresponding medical image may lead to a wrong diagnosis. Digital watermarking has recently emerged as a suitable solution to some of the problems associated with the management of medical images. This paper proposes a robust watermarking method for medical images to avoid their detachment from the corresponding EPR data, in which the watermark is embedded using the digital imaging and communications in medicine (DICOM) standard metadata together with cryptographic techniques. In order to provide high robustness of the watermark while preserving a high quality of the watermarked images, the generated watermark is embedded into the magnitude of the middle frequencies of the discrete Fourier transform of the original medical image. During the detection process, the watermark data bits are recovered and detected using the bit correct rate criterion. Extensive experiments were carried out, and the performance of the proposed method is evaluated in terms of imperceptibility, payload, robustness, and detachment detection. Quantitative evaluation of the watermarked images is performed using three of the most common metrics: the peak signal-to-noise ratio, the structural similarity index, and visual information fidelity. Experimental results show the watermark's robustness against several of the most aggressive geometric and signal processing distortions. The receiver operating characteristic curves also show the desirable detachment detection performance of the proposed method. A comparison of the proposed method with previously reported methods having similar purposes is also provided.
65 citations
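A minimal sketch of the embedding step described above: watermark bits additively modify the DFT magnitude on a middle-frequency ring, with the conjugate-symmetric position changed identically so the inverse transform stays (nearly) real. The ring radius, strength, and bit placement are illustrative; the DICOM-metadata watermark generation, cryptographic steps, and bit-correct-rate detector of the paper are not reproduced.

```python
# Additive embedding of bits into the magnitude of middle-frequency DFT
# coefficients (illustrative radius/strength; image assumed larger than ~2x radius).
import numpy as np

def embed_dft_ring(img, bits, radius=60, strength=8.0):
    F = np.fft.fftshift(np.fft.fft2(img.astype(np.float64)))
    mag, phase = np.abs(F), np.angle(F)
    cy, cx = img.shape[0] // 2, img.shape[1] // 2
    for k, bit in enumerate(bits):
        theta = np.pi * k / len(bits)                 # half ring; other half is symmetric
        u = int(round(cy + radius * np.sin(theta)))
        v = int(round(cx + radius * np.cos(theta)))
        delta = strength if bit else -strength
        for y, x in ((u, v), (2 * cy - u, 2 * cx - v)):   # keep conjugate symmetry
            mag[y, x] = max(mag[y, x] + delta, 0.0)
    Fw = mag * np.exp(1j * phase)
    return np.real(np.fft.ifft2(np.fft.ifftshift(Fw)))
```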
TL;DR: This work attempts to present an overview of the compressive sensing (CS) aspects of radar system design, namely the sparse signal model, the measurement system, and the recovery method, particularly when CS is applied in monostatic pulse-Doppler and MIMO types of radar.
Abstract: Modern radar systems tend to utilize high bandwidth, which requires a high sampling rate, and in many cases, these systems involve phased array configurations with a large number of transmit–receive elements. In contrast, the ultimate goal of a radar system is often to estimate only a limited number of target parameters. Thus, there is a pursuit to find better means to perform radar signal acquisition and processing with a much reduced amount of data and power. Recently, there has been great interest in considering compressive sensing (CS) for radar system design; CS is a novel technique that offers a framework for sparse signal detection and estimation with optimized data handling. In radars, CS enables better range-Doppler resolution in comparison with traditional techniques. However, CS requires the selection of a suitable (sparse) signal model, the design of the measurement system, and the implementation of an appropriate signal recovery method. This work attempts to present an overview of these CS aspects, particularly when CS is applied in monostatic pulse-Doppler and MIMO types of radar. Some of the associated challenges, e.g., grid mismatch and detector design issues, are also discussed.
61 citations
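The sparse-recovery component mentioned above can be illustrated with a generic (non-radar-specific) compressive sensing toy example: a scene that is sparse on a parameter grid is recovered from far fewer random measurements than grid cells using orthogonal matching pursuit.

```python
# Generic CS sketch: sparse scene, random measurement matrix, OMP recovery.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
n_grid, n_meas, n_targets = 256, 64, 3                 # grid cells, measurements, targets
x = np.zeros(n_grid)
x[rng.choice(n_grid, n_targets, replace=False)] = rng.uniform(1.0, 2.0, n_targets)
Phi = rng.standard_normal((n_meas, n_grid)) / np.sqrt(n_meas)   # measurement matrix
y = Phi @ x                                            # compressed measurements
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_targets, fit_intercept=False).fit(Phi, y)
print(np.flatnonzero(x), np.flatnonzero(omp.coef_))    # true vs. recovered support
```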
TL;DR: The results achieved by the method outperformed the auto-correlation (AC)/discrete cosine transform (DCT) method where the DCT coefficients are derived from the AC of ECG segments and fed into the RBF network for classification.
Abstract: This paper proposes a discrete wavelet feature extraction method for an electrocardiogram (ECG)-based biometric system. In this method, the RR intervals are extracted and decomposed using a discrete biorthogonal wavelet into wavelet coefficient structures. These structures are reduced by excluding the non-informative coefficients, and then they are fed into a radial basis function (RBF) neural network for classification. Moreover, the ability to use only the QT or QRS intervals instead of the RR intervals is also investigated. Finally, the results achieved by our method outperformed the auto-correlation (AC)/discrete cosine transform (DCT) method, where the DCT coefficients are derived from the AC of ECG segments and fed into the RBF network for classification. The conducted experiments were validated using four Physionet databases. Critical issues like stability over time, the ability to reject impostors, scalability, and generalization to other datasets have also been addressed.
58 citations
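A minimal sketch of the feature-extraction step described above, using PyWavelets with a biorthogonal wavelet: a heartbeat segment is decomposed and only the larger coefficients are kept as the feature vector fed to the classifier. The wavelet, decomposition level, and pruning rule are illustrative, not the paper's exact settings, and the RBF network stage is omitted.

```python
# Discrete biorthogonal wavelet features for an ECG segment (illustrative
# wavelet/level/pruning; the RBF classifier stage is not shown).
import numpy as np
import pywt

def ecg_wavelet_features(segment, wavelet="bior3.3", level=4, keep=64):
    coeffs = pywt.wavedec(np.asarray(segment, dtype=float), wavelet, level=level)
    flat = np.concatenate(coeffs)
    keep = min(keep, flat.size)
    idx = np.sort(np.argsort(np.abs(flat))[-keep:])    # drop low-magnitude coefficients
    return flat[idx]                                   # reduced feature vector
```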
TL;DR: The experimental results showed that the proposed segmentation technique achieves good agreement with the gold standard and the ensemble classifier is highly effective in the diagnosis of brain tumor with an accuracy of 99.09 % (sensitivity 100 % and specificity 98.21 %).
Abstract: The manual analysis of brain tumors on magnetic resonance (MR) images is time-consuming and subjective. Thus, to avoid human errors in brain tumor diagnosis, this paper presents an automatic and accurate computer-aided diagnosis (CAD) system based on an ensemble classifier for the characterization of brain tumors on MR images as benign or malignant. Brain tumor tissue was automatically extracted from MR images by the proposed segmentation technique. A tumor is represented by extracting its texture, shape, and boundary features. The most significant features are selected by using information gain-based feature ranking and independent component analysis techniques. Next, these features are used to train the ensemble classifier, consisting of support vector machine, artificial neural network, and k-nearest neighbor classifiers, to characterize the tumor. Experiments were carried out on a dataset consisting of T1-weighted post-contrast and T2-weighted MR images of 550 patients. The developed CAD system was tested using the leave-one-out method. The experimental results showed that the proposed segmentation technique achieves good agreement with the gold standard and that the ensemble classifier is highly effective in the diagnosis of brain tumors, with an accuracy of 99.09 % (sensitivity 100 % and specificity 98.21 %). Thus, the proposed system can assist radiologists in the accurate diagnosis of brain tumors.
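A minimal sketch of the classification stage only, using scikit-learn: an ensemble of an SVM, a neural network (an MLP standing in for the paper's ANN), and a k-NN classifier votes on the selected tumor features. The segmentation, feature extraction, and feature selection stages are assumed to have already produced the feature matrix and labels.

```python
# Ensemble of SVM + MLP (stand-in for the ANN) + k-NN for benign/malignant
# characterization; features X and labels y come from the earlier stages.
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

ensemble = VotingClassifier(
    estimators=[
        ("svm", make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))),
        ("ann", make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000))),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ],
    voting="soft",
)
# ensemble.fit(X_train, y_train); y_pred = ensemble.predict(X_test)
```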
TL;DR: The simulation results show higher performance of the proposed blind watermarking scheme compared to similar existing techniques under different geometric and nongeometric attacks such as amplification, median filtering, sharpening, scaling, rotation, Gaussian noise, salt and pepper noise, Gaussian filtering, and JPEG compression.
Abstract: In this paper, a blind watermarking scheme based on the significant difference of lifting wavelet transform coefficients is proposed. The difference between the two maximum coefficients in a block is called the significant difference. Embedding of the binary watermark is done based on the largest coefficient of randomly shuffled blocks of the CH3 sub-band. This sub-band is quantized using a predefined threshold value by comparing the significant difference value with the average of the significant difference values of all blocks. The watermarked image shows no perceptual degradation, as the PSNR value exceeds 42 dB. An adaptive-thresholding-based method is used for watermark extraction. In the proposed technique, the benefit of using the lifting wavelet over the traditional wavelet is its maximum energy compaction property, which helps in resisting different attacks. The simulation results show higher performance of the proposed technique compared to similar existing techniques under different geometric and nongeometric attacks such as amplification, median filtering, sharpening, scaling, rotation, Gaussian noise, salt and pepper noise, Gaussian filtering, and JPEG compression.
TL;DR: A general scheme for analyzing the performance of a generic localization algorithm for multilateration (MLAT) systems (or for other distributed sensor, passive localization technology) is presented and a set of data models and numerical methods that can describe most localization algorithms are reviewed.
Abstract: We present a general scheme for analyzing the performance of a generic localization algorithm for multilateration (MLAT) systems (or for other distributed-sensor, passive localization technologies). MLAT systems are used for airport surface surveillance and are based on time difference of arrival measurements of Mode S signals (replies and 1,090 MHz extended squitter, or 1090ES). In the paper, we propose to consider a localization algorithm as composed of two components: a data model and a numerical method, both being properly defined and described. In this way, the performance of the localization algorithm can be related to the proper combination of statistical and numerical performances. We present and review a set of data models and numerical methods that can describe most localization algorithms. We also select a set of existing localization algorithms that can be considered the most relevant, and we describe them under the proposed classification. We show that the performance of any localization algorithm has two components, i.e., a statistical one and a numerical one. The statistical performance is related to providing unbiased and minimum-variance solutions, while the numerical one is related to ensuring the convergence of the solution. Furthermore, we show that a robust localization (i.e., statistically and numerically efficient) strategy for airport surface surveillance has to be composed of two specific kinds of algorithms. Finally, an accuracy analysis using real data is performed for the analyzed algorithms; some general guidelines are drawn and conclusions are provided.
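One data-model / numerical-method pairing of the kind classified above can be sketched as follows: TDOA measurements modelled as range differences to a reference sensor, solved by iterative nonlinear least squares (a Gauss-Newton-type method via SciPy). The sensor geometry, noise level, and initial guess are illustrative.

```python
# TDOA localization sketch: range-difference data model + iterative
# nonlinear least squares (illustrative geometry and noise).
import numpy as np
from scipy.optimize import least_squares

C = 299_792_458.0                                      # propagation speed (m/s)
sensors = np.array([[0.0, 0.0], [1200.0, 0.0], [0.0, 1000.0], [1500.0, 900.0]])

def tdoa_residuals(p, sensors, tdoa):
    ranges = np.linalg.norm(sensors - p, axis=1)
    return (ranges[1:] - ranges[0]) / C - tdoa         # model minus measurement

true_p = np.array([400.0, 650.0])
ranges = np.linalg.norm(sensors - true_p, axis=1)
tdoa = (ranges[1:] - ranges[0]) / C
tdoa += 1e-9 * np.random.default_rng(1).standard_normal(tdoa.size)   # timing noise

sol = least_squares(tdoa_residuals, x0=np.array([700.0, 500.0]), args=(sensors, tdoa))
print(sol.x)                                           # estimated position
```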
TL;DR: A new transformation function is developed based on the existing sigmoid and tanh functions, which have very useful properties for enhancing images suffering from low illumination or non-uniform lighting conditions.
Abstract: Images captured with insufficient illumination generally have dark shadows and low contrast. This problem seriously affects other forms of image processing such as face detection, security surveillance, and image fusion. In this paper, a new image enhancement algorithm using the important features of the contourlet transform is presented. A new transformation function is developed based on the existing sigmoid and tanh functions, which have very useful properties for enhancing images suffering from low illumination or non-uniform lighting conditions. The literature indicates that the contourlet transform represents image salient features such as edges, lines, curves, and contours better than wavelets, owing to its anisotropy and directionality, and is therefore well suited for multiscale edge-based image enhancement. The algorithm works for grayscale and color images. A color image is first converted from the RGB (red, green, and blue) to the HSI (hue, saturation, and intensity) color model. Then, the intensity component of the HSI color space is adjusted, preserving the original color, using a new nonlinear transformation function. The simulation results show that this approach gives encouraging results for images taken in low-light and/or non-uniform lighting conditions. The results obtained are compared with other enhancement algorithms based on the wavelet transform, curvelet transform, bandlet transform, histogram equalization (HE), and contrast-limited adaptive histogram equalization. The performance of the contourlet-transform-based enhancement method is superior. The algorithm is tested on a total of 151 test images, of which 120 are used for subjective evaluation and 31 for objective evaluation. For over 90 % of the cases, the system is superior to the other enhancement methods.
TL;DR: This study introduces a novel watermarking scheme based on the discrete wavelet transform (DWT) in combination with the chirp z-transform (CZT) and the singular value decomposition (SVD).
Abstract: Digital watermarking has attracted increasing attention as the current solution to copyright protection and content authentication, which have become issues to be addressed in multimedia technology. This study introduces a novel watermarking scheme based on the discrete wavelet transform (DWT) in combination with the chirp z-transform (CZT) and the singular value decomposition (SVD). Firstly, the image is decomposed into its frequency subbands by using a 1-level DWT. Then, the high-frequency subband is transformed into the z-domain by using the CZT. Afterward, by SVD, the watermark is added to the singular value matrix of the transformed image. Finally, the watermarked image is obtained by applying the inverse CZT and the inverse DWT. This algorithm combines the advantages of all three algorithms. The experimental results show that the algorithm is imperceptible and robust to several attacks and signal processing operations.
TL;DR: A novel watermarking algorithm is proposed to embed the color image watermark in the direct current (DC) coefficients and the alternating current (AC) coefficients of the color host image by utilizing the two-level DCT.
Abstract: With the widespread use of color images in many areas, colorful logos or marks have gradually been used as watermarks to protect copyright in recent years. Since a color image watermark carries more bit information, designing a robust color watermarking scheme is challenging. By utilizing the two-level DCT, a novel watermarking algorithm is proposed to embed the color image watermark in the direct current (DC) coefficients and the alternating current (AC) coefficients of the color host image. Firstly, the host image is divided into 8 × 8 non-overlapping blocks, and these blocks are transformed by one-level DCT. Secondly, the upper-left 4 × 4 coefficients of each block are further transformed by two-level DCT, and the transformed coefficients are ordered by zigzag arrangement. Thirdly, according to the human visual system (HVS), the digital watermarks are embedded into the DC coefficient and the first seven AC coefficients of these blocks, respectively. Experimental results show that the proposed watermarking algorithm is robust to many common image processing attacks and geometric attacks, and that its performance outperforms other color watermarking methods considered in this paper.
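A minimal sketch of the per-block coefficient path described above: one-level DCT of an 8 × 8 block, two-level DCT of its upper-left 4 × 4 coefficients, zigzag ordering, and a simple parity-quantization embedding into the DC and first seven AC coefficients. The embedding rule and strength are illustrative, not the paper's HVS-based scheme, and only a single channel is shown.

```python
# Two-level DCT embedding sketch for one 8x8 block of one color channel
# (illustrative quantization-parity rule, not the paper's HVS-based scheme).
import cv2
import numpy as np

ZIGZAG8 = [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), (0, 3), (1, 2)]  # DC + first 7 AC

def embed_block(block8, bits, q=24.0):
    d1 = cv2.dct(block8.astype(np.float32))            # one-level DCT of the 8x8 block
    d2 = cv2.dct(d1[:4, :4].copy())                    # two-level DCT of the upper-left 4x4
    for (r, c), bit in zip(ZIGZAG8, bits):             # zigzag-ordered coefficients
        k = int(np.round(d2[r, c] / q))
        if k % 2 != int(bit):                          # force parity to encode the bit
            k += 1
        d2[r, c] = k * q
    d1[:4, :4] = cv2.idct(d2)
    return cv2.idct(d1)                                # watermarked 8x8 block

# Example: wm_block = embed_block(host_block, bits=[1, 0, 1, 1, 0, 0, 1, 0])
```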
TL;DR: Two approaches using the 2D+T curvelet transform are presented and compared on three new large databases; the feature vectors used for recognition and their relevance are described, and the performance of the different methods is discussed.
Abstract: The research context of this article is the recognition and description of dynamic textures. In image processing, the wavelet transform has been successfully used for characterizing static textures. To the best of our knowledge, only two works use a spatio-temporal multiscale decomposition based on the tensor product for dynamic texture recognition. One contribution of this article is to analyze and compare the ability of the 2D+T curvelet transform, a geometric multiscale decomposition, to characterize dynamic textures in image sequences. Two approaches using the 2D+T curvelet transform are presented and compared using three new large databases. A second contribution is the construction of these three publicly available benchmarks of increasing complexity. Existing benchmarks are either too small, not available, or not constructed from a reference database. The feature vectors used for recognition and their relevance are described, and the performance of the different methods is discussed. Finally, future prospects are outlined.
TL;DR: It is shown that in many cases, mixed-resolution coding achieves a similar subjective quality to that of symmetric stereoscopic video coding, while the computational complexity is significantly reduced.
Abstract: In asymmetric stereoscopic video compression, the views are coded with different qualities. According to the binocular suppression theory, the perceived quality is closer to that of the higher-fidelity view. Hence, a higher compression ratio is potentially achieved through asymmetric coding. Furthermore, when mixed-resolution coding is applied, the complexity of the coding and decoding is reduced. In this paper, we study whether asymmetric stereoscopic video coding achieves the mentioned claimed benefits. Two sets of systematic subjective quality evaluation experiments are presented in the paper. In the first set of the experiments, we analyze the extent of downsampling for the lower-resolution view in mixed-resolution stereoscopic videos. We show that the lower-resolution view becomes dominant in the subjective quality rating at a certain downsampling ratio, and this is dependent on the sequence, the angular resolution, and the angular width. In the second set of the experiments, we compare symmetric stereoscopic video coding, quality-asymmetric stereoscopic video coding, and mixed-resolution coding subjectively. We show that in many cases, mixed-resolution coding achieves a similar subjective quality to that of symmetric stereoscopic video coding, while the computational complexity is significantly reduced.
TL;DR: A new motion history representation that incorporates both optical flow and a revised MHI is proposed; it yields 100 % recognition rates on both test datasets with a fast processing rate of 47 fps on 200 × 150 images.
Abstract: The motion history image (MHI) is a global spatiotemporal representation for video sequences. It is computationally very simple and efficient. It has been widely used for many real-time action recognition tasks. However, the conventional MHI assigns a fixed motion strength to each detected foreground point and then updates it with a small constant for the background points. Local body parts with different movement speeds and durations will then have the same intensity in the MHI. Similar actions may generate indistinguishable MHI patterns. In this paper, we propose a new motion history representation that incorporates both optical flow and a revised MHI. The motion strength of each pixel is adaptively accumulated by the optical flow length at that location. It is then exponentially updated over time. It can better describe local movements of body parts in the global temporal template. The motion duration is implicitly given by the update rate for better description of various actions in the scene. For action classification, a set of training action samples is first collected to form the basis templates. An action sequence is then constructed as a linear combination of the basis templates. The coefficients of the combination give the feature vector. The Euclidean distance is finally used to evaluate the similarity between the feature vectors. Experimental results on the widely used KTH and Weizmann datasets have shown that the proposed scheme yields 100 % recognition rates on both test datasets with a fast processing rate of 47 fps on 200 × 150 images.
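A minimal sketch of the modified motion history update described above: dense optical flow (Farneback, via OpenCV) gives a per-pixel motion strength whose magnitude is accumulated into the history image, which then decays exponentially over time. The decay rate and scaling are illustrative, not the paper's exact update rule, and the template-based classification stage is omitted.

```python
# Optical-flow-weighted motion history update (illustrative decay/scale).
import cv2
import numpy as np

def update_flow_mhi(mhi, prev_gray, gray, decay=0.9, scale=1.0):
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)                 # optical-flow length per pixel
    return decay * mhi + scale * mag                   # accumulate, with exponential decay

# mhi = np.zeros(frame_shape, np.float32)
# per frame: mhi = update_flow_mhi(mhi, prev_gray, gray); prev_gray = gray
```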
TL;DR: A new method for segmentation of moving objects, based on a double change detection technique applied to the Daubechies complex wavelet coefficients of three consecutive frames, is introduced and is found to have a higher degree of segmentation accuracy than other state-of-the-art methods.
Abstract: Motion segmentation is a crucial step in video analysis and is associated with a number of computer vision applications. This paper introduces a new method for segmentation of moving objects which is based on a double change detection technique applied on the Daubechies complex wavelet coefficients of three consecutive frames. The Daubechies complex wavelet transform has been chosen for segmentation of moving objects as it is approximately shift invariant and has better directional selectivity as compared to the real-valued wavelet transform. The double change detection technique is used to obtain the video object plane by inter-frame difference of three consecutive frames. The double change detection technique also provides automatic detection of the appearance of new objects. The proposed method does not require any other parameter except the Daubechies complex wavelet coefficients. Results of the proposed method for segmentation of moving objects are compared with results of other state-of-the-art methods in terms of visual performance and a number of quantitative performance metrics, viz. Misclassification Penalty, Relative Foreground Area Measure, Pixel Classification Based Measure, Normalized Absolute Error, and Percentage of Correct Classification. The proposed method is found to have a higher degree of segmentation accuracy than the other state-of-the-art methods.
TL;DR: A function is introduced for calculating the Euclidean distance transform of large binary images of dimension three or higher in Matlab; it significantly outperforms Matlab's standard distance transform function “bwdist” in terms of both computation time and the possible data sizes.
Abstract: In this note, we introduce a function for calculating the Euclidean distance transform of large binary images of dimension three or higher in Matlab. This function uses a transparent and fast line-scan algorithm that can be efficiently implemented on vector processing architectures such as Matlab and significantly outperforms Matlab's standard distance transform function “bwdist” in terms of both computation time and the possible data sizes. The described function can also be used to calculate the distance transform of data with anisotropic voxel aspect ratios. These advantages make this function especially useful for high-performance scientific and engineering applications that require distance transform calculations for large multidimensional and/or anisotropic datasets in Matlab. The described function is publicly available from the Matlab Central website under the name “bwdistsc”, “Euclidean Distance Transform for Variable Data Aspect Ratio”.
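Outside Matlab, the same anisotropic Euclidean distance transform can be sketched with SciPy, whose distance_transform_edt accepts per-axis voxel spacing; this is an analogue for comparison, not the bwdistsc code itself.

```python
# SciPy analogue of an anisotropic 3-D Euclidean distance transform
# (not the bwdistsc implementation itself).
import numpy as np
from scipy import ndimage

vol = np.ones((64, 64, 32), dtype=bool)
vol[32, 32, 16] = False                                # a single "feature" voxel
# Distance of every voxel to the nearest feature, with 1 x 1 x 2.5 voxel spacing.
dist = ndimage.distance_transform_edt(vol, sampling=(1.0, 1.0, 2.5))
print(dist.max())
```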
TL;DR: The proposed iterative bilateral filter improves the denoising efficiency, preserves the fine structures and also reduces the bias due to Rician noise.
Abstract: Noise removal from magnetic resonance images is important for further processing and visual analysis. The bilateral filter is known for its effectiveness in edge-preserving image denoising. In this paper, an iterative bilateral filter for filtering the Rician noise in magnitude magnetic resonance images is proposed. The proposed iterative bilateral filter improves the denoising efficiency, preserves the fine structures, and also reduces the bias due to Rician noise. The visual and diagnostic quality of the image is well preserved. The quantitative analysis based on standard metrics like the peak signal-to-noise ratio and the mean structural similarity index shows that the proposed method performs better than other recently proposed denoising methods for MRI.
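A minimal sketch of the iterate-and-correct idea described above: the magnitude image is repeatedly bilateral-filtered and the Rician bias is removed using the second-moment relation E[M²] = A² + 2σ² for Rician magnitude data. The parameter schedule is illustrative and not the paper's exact iteration rule.

```python
# Iterative bilateral filtering with Rician bias removal (illustrative parameters).
import cv2
import numpy as np

def iterative_bilateral_rician(mag, sigma_noise, n_iter=3,
                               d=5, sigma_color=30.0, sigma_space=5.0):
    out = mag.astype(np.float32)
    for _ in range(n_iter):
        out = cv2.bilateralFilter(out, d, sigma_color, sigma_space)
        # Bias correction from E[M^2] = A^2 + 2*sigma^2 for Rician magnitude data.
        out = np.sqrt(np.maximum(out ** 2 - 2.0 * sigma_noise ** 2, 0.0))
    return out
```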
TL;DR: This paper develops multiresolution analysis associated with the FRWT and derives a construction of orthogonal wavelets for theFRWT, and some applications of the derived results are discussed.
Abstract: The fractional wavelet transform (FRWT), which generalizes the classical wavelet transform, has been shown to be potentially useful for signal processing. Many fundamental results of this transform are already known, but the theory of multiresolution analysis and orthogonal wavelets is still missing. In this paper, we first develop multiresolution analysis associated with the FRWT and then derive a construction of orthogonal wavelets for the FRWT. Several fractional wavelets are also presented. Moreover, some applications of the derived results are discussed.
TL;DR: By employing the proposed method, which enhances the fingerprint images using the better enhancing filter in each part, the experimental results show that the whole fingerprint is better enhanced, and consequently, this leads to a better recognition rate.
Abstract: Fingerprints are the best biometric identity mark due to their consistency during a lifetime and their uniqueness. To increase the classification accuracy of fingerprint images, it is necessary to improve image quality, which plays a key role in correct recognition. In other words, enhancing the fingerprint images leads to better results in the classification of fingerprint images. Although both the Gabor filter and the fast Fourier transform (FFT) are used to enhance fingerprint images, the Gabor filter performs better than the FFT in the detection of incorrect ridge endings and ridge bifurcations, while the FFT tries to connect broken ridges together and fill the created holes. This paper enhances gray-scale fingerprint images by combining the Gabor filter and the FFT in order to benefit from the advantages of each enhancing filter. A method is proposed for fingerprint image segmentation based on the image histogram and density. By employing the proposed method, which enhances the fingerprint images using the better enhancing filter in each part, the experimental results show that the whole fingerprint is better enhanced, and consequently, this leads to a better recognition rate.
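The two enhancement paths that the paper combines can be sketched separately: a Gabor filter tuned to a ridge orientation and frequency, and a blockwise FFT-domain enhancement that boosts the dominant spectral energy (so broken ridges get reconnected). Parameter values, the orientation/frequency estimation, and the rule for choosing the better filter per part are illustrative or omitted.

```python
# The two enhancement paths, sketched in isolation; in practice theta and
# lambd come from local ridge orientation/frequency estimates, and the
# per-part filter choice follows the proposed segmentation (not shown).
import cv2
import numpy as np

def gabor_enhance(img, ksize=21, sigma=4.0, theta=0.0, lambd=9.0, gamma=0.5):
    kern = cv2.getGaborKernel((ksize, ksize), sigma, theta, lambd, gamma)
    return cv2.filter2D(img.astype(np.float32), -1, kern)

def fft_block_enhance(block, k=0.45):
    F = np.fft.fft2(block.astype(np.float64))
    return np.real(np.fft.ifft2(F * np.abs(F) ** k))   # boost dominant ridge frequency
```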
TL;DR: This paper explores the use of Zernike moment-based global features for initial landmark estimation and for computing a small expectation window for each landmark, together with local template matching based on the ring and central projection method for a closer approximation of the landmark position.
Abstract: Cephalometry is an essential clinical and research tool in orthodontics. It has been used for decades to obtain absolute and relative measures of the craniofacial skeleton. Since manual identification of predefined anatomical landmarks is a very tedious approach, there is a strong need for automated methods. This paper explores the use of Zernike moment-based global features for initial landmark estimation and for computing a small expectation window for each landmark. Using this expectation window and local template matching based on the ring and central projection method, a closer approximation of the landmark position is obtained. A smaller search window based on this approximation is used to find the exact location of the landmark positions through template matching using a combination of sum of squared distance and normalized cross-correlation. The system was tested on 18 commonly used landmarks using a dataset of 85 randomly selected cephalograms. A total of 89.5 % of the localizations of the 18 selected landmarks are within a window of ±2 mm. The average mean error for the 18 landmarks is 1.84 mm, and the average SD of the mean error is 1.24.
TL;DR: This paper investigates the feasibility of using 3-channel forehead biosignals as informative channels for emotion recognition during music listening as well as employing two parallel cascade-forward neural networks as arousal and valence classifiers.
Abstract: Emotion recognition systems are helpful in human–machine interaction and clinical applications. This paper investigates the feasibility of using 3-channel forehead biosignals (from the left temporalis, frontalis, and right temporalis channels) as informative channels for emotion recognition during music listening. Classification of four emotional states (positive valence/low arousal, positive valence/high arousal, negative valence/high arousal, and negative valence/low arousal) in arousal–valence space was performed by employing two parallel cascade-forward neural networks as arousal and valence classifiers. The inputs of the classifiers were obtained by applying a fuzzy rough model feature evaluation criterion and a sequential forward floating selection algorithm. An average classification accuracy of 87.05 % was achieved, corresponding to an average valence classification accuracy of 93.66 % and an average arousal classification accuracy of 93.29 %.
TL;DR: An evolutionary approach for designing an SVM-based classifier (ESVM) by automatic parameter tuning using a genetic algorithm is proposed, and it is shown that ESVM can obtain a high accuracy using tenfold cross-validation on the EMG datasets.
Abstract: Support vector machines (SVMs) have been widely used in many pattern recognition problems. Generally, the performance of SVM classifiers is affected by the selection of the kernel parameters. However, the SVM does not offer a mechanism for proper setting of its control parameters. The objective of this research is to optimize the parameters without degrading the SVM classification accuracy in the diagnosis of neuromuscular disorders. An evolutionary approach for designing an SVM-based classifier (ESVM) by automatic parameter tuning using a genetic algorithm is proposed. To illustrate and evaluate the efficiency of ESVM, a typical application to EMG signal classification using normal, myopathic, and neurogenic datasets is adopted. In the proposed method, the EMG signals were decomposed into frequency sub-bands using the discrete wavelet transform (DWT), and a set of statistical features was extracted from the sub-bands to represent the distribution of the wavelet coefficients. It is shown that ESVM can obtain a high accuracy of 97 % using tenfold cross-validation on the EMG datasets. ESVM is developed as an efficient tool, so that various SVMs can be used conveniently as the core of ESVM for the diagnosis of neuromuscular disorders.
TL;DR: The proposed CBIR system was tested using the Corel and VisTex image data sets; the results were satisfactory and showed that it had better performance than other related systems.
Abstract: In this paper, we have proposed a content-based image retrieval (CBIR) system based on two kinds of features: intra-class and inter-class. The intra-class features are a new layout for the color distribution of an image in RGB color space. This layout has been proposed based on the concept of the co-occurrence matrix and is called the Distribution of Color Ton. The inter-class features are extracted using the dual-tree complex wavelet transform, singular value decomposition (SVD), and conceptual segmentation based on the human vision system. In the proposed method, these two kinds of features together are followed by a self-organizing map as the classifier, yielding an efficient CBIR system that combines the advantages of both structural and signal-processing feature descriptors. The proposed approach was tested using the Corel and VisTex image data sets, and the results were satisfactory. The experimental results showed that the proposed method had better performance than the other related ones.
TL;DR: An efficient automated method for facial expression recognition based on the histogram of oriented gradients (HOG) descriptor achieves a recognition rate higher than that of almost all other single-image- or video-based methods for facial emotion recognition.
Abstract: This article proposes an efficient automated method for facial expression recognition based on the histogram of oriented gradients (HOG) descriptor. This subject-independent method was designed for recognizing six prototypical emotions. It recognizes emotions by calculating differences at the level of feature descriptors between a neutral expression and a peak expression of an observed person. The parameters for the HOG descriptor were determined by using a genetic algorithm. Support vector machines (SVM) were applied during the recognition phase, where one SVM classifier was trained per emotion. Each classifier was trained using difference vectors obtained by subtracting the HOG feature vectors calculated for the neutral and apex emotion images of a subject. The proposed method was tested using a leave-one-subject-out validation strategy on 106 subjects and 1232 images from the Cohn-Kanade database, and on 10 subjects and 192 images from the JAFFE database. A mean recognition rate of 95.64 % was obtained on the Cohn-Kanade database, which is higher than the recognition rates of almost all other single-image- or video-based methods for facial emotion recognition.
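A minimal sketch of the descriptor-difference idea described above: HOG is computed for the neutral and apex images of the same subject and their difference is the feature used by one binary SVM per emotion. The HOG parameters below are ordinary defaults, whereas the paper selects them with a genetic algorithm.

```python
# Difference of HOG descriptors between neutral and apex expression images,
# fed to one-vs-rest SVMs (one per emotion). HOG parameters are defaults here.
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def hog_difference(neutral_img, apex_img):
    params = dict(orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    return hog(apex_img, **params) - hog(neutral_img, **params)   # same-size images assumed

# One binary classifier per emotion label e:
# clf = {e: SVC(kernel="linear").fit(X_diff, (y == e).astype(int)) for e in emotions}
```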
TL;DR: A new fusion framework for spatially registered visual and infrared images is described, which utilizes the properties of fractal dimension and phase congruency in the non-subsampled contourlet transform (NSCT) domain.
Abstract: Night-vision image fusion plays a critical role in detecting targets and obstructions in low light or total darkness, which is of great importance for pedestrian recognition, vehicle navigation, surveillance, and monitoring applications. The central idea is to fuse low-light visible and infrared imagery into a single output. In this paper, we describe a new fusion framework for spatially registered visual and infrared images. The proposed framework utilizes the properties of fractal dimension and phase congruency in the non-subsampled contourlet transform (NSCT) domain. The proposed framework applies the multiscale NSCT to the visual and IR images to obtain low- and high-frequency bands. The various frequency bands of the transformed images are then fused while exploiting their characteristics. Finally, the inverse NSCT is performed to obtain the fused image. The performance of the proposed framework is validated by extensive experiments on different scene imagery, where its definite advantages are demonstrated subjectively and objectively.