
Showing papers in "Signal & Image Processing : An International Journal in 2017"


Journal ArticleDOI
TL;DR: A robust approach for human action recognition is proposed by extracting stable spatio-temporal features in terms of pairwise local binary pattern (P-LBP) and scale invariant feature transform (SIFT).
Abstract: Human action recognition is still a challenging problem, and researchers are investigating it using different techniques. We propose a robust approach for human action recognition, achieved by extracting stable spatio-temporal features in terms of the pairwise local binary pattern (P-LBP) and the scale invariant feature transform (SIFT). These features are used to train an MLP neural network during the training stage, and the action classes are inferred from the test videos during the testing stage. The proposed features capture the motion of individuals well and yield consistent, higher accuracy on a challenging dataset. The experimental evaluation is conducted on a benchmark dataset commonly used for human action recognition. In addition, we show that our approach outperforms the individual features, i.e., using only spatial or only temporal information.
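As a rough illustration of this kind of pipeline, the sketch below pools frame-level LBP histograms over time and trains an MLP classifier; it assumes scikit-image and scikit-learn, and does not reproduce the paper's pairwise P-LBP variant or the SIFT fusion.

```python
# Minimal sketch: frame-level LBP histograms pooled over time, fed to an MLP.
# The paper's pairwise LBP (P-LBP) and SIFT fusion are not reproduced here.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.neural_network import MLPClassifier

def video_lbp_descriptor(frames, P=8, R=1.0):
    """Average uniform-LBP histogram over the grayscale frames of one video."""
    hists = []
    for frame in frames:  # each frame: 2-D uint8 array
        lbp = local_binary_pattern(frame, P, R, method="uniform")
        hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
        hists.append(hist)
    return np.mean(hists, axis=0)  # temporal pooling

# Hypothetical training data: lists of videos and their action labels.
# X = np.vstack([video_lbp_descriptor(v) for v in train_videos])
# clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500).fit(X, y)
```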

18 citations


Journal ArticleDOI
TL;DR: Improvements achievable in the accuracy of post-fire effects mapping with machine learning algorithms that use hyperspatial (sub-decimeter) drone imagery are investigated.
Abstract: A variety of machine learning algorithms have been used to map wildland fire effects, but previous attempts to map post-fire effects have relied on relatively low-resolution satellite imagery. Small unmanned aircraft systems (sUAS) provide opportunities to acquire imagery with much higher spatial resolution than is possible with satellites or manned aircraft. This effort investigates the improvements achievable in the accuracy of post-fire effects mapping when machine learning algorithms use hyperspatial (sub-decimeter) drone imagery. A variety of texture metrics were also evaluated to determine whether spatial context, supplied to the analytic tools as an additional input alongside the three color bands, improves accuracy. This analysis shows that adding texture as a fourth input increases classifier accuracy when mapping post-fire effects.
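One plausible way to realise the four-band input is to append a texture band to the three color bands; the sketch below uses local entropy as a stand-in texture metric, with the window size an assumption.

```python
# Minimal sketch: stack a local-entropy texture band onto the RGB bands so a
# classifier sees a 4-channel input. The disk radius is an assumption, and
# local entropy stands in for whichever texture metric performed best.
import numpy as np
from skimage.filters.rank import entropy
from skimage.morphology import disk
from skimage.color import rgb2gray
from skimage.util import img_as_ubyte

def add_texture_band(rgb):
    """rgb: H x W x 3 float array in [0, 1]; returns H x W x 4."""
    gray = img_as_ubyte(rgb2gray(rgb))
    texture = entropy(gray, disk(5)).astype(np.float32)
    texture /= texture.max() + 1e-9          # normalize to [0, 1]
    return np.dstack([rgb, texture])
```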

9 citations



Journal ArticleDOI
TL;DR: This work proposes a novel generative robust model that trains a Deep Neural Network to learn about image features after extracting information about the content of images, and uses the novel combination of CNN and LSTM.
Abstract: In this digital world, artificial intelligence has provided solutions to many problems, including problems related to digital images and operations over extensive sets of images. To analyze an image, we first need to extract features describing its content. Image description methods involve natural language processing and concepts from computer vision. The purpose of this work is to provide an efficient and accurate description of an unknown image by using deep learning methods. We propose a novel generative robust model that trains a deep neural network to learn image features after extracting information about the content of images, using a novel combination of a CNN and an LSTM. We trained our model on the MSCOCO dataset, which provides a set of annotations for each image, and once the model was fully trained, we tested it on raw images. Several experiments were performed to check the efficiency and robustness of the system, for which we calculated the BLEU score.
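A minimal sketch of the CNN-plus-LSTM combination is given below in PyTorch; the backbone, embedding sizes, and vocabulary handling are assumptions, and the MSCOCO training loop and BLEU evaluation are not reproduced.

```python
# Minimal sketch of a CNN-encoder / LSTM-decoder captioner (PyTorch).
# The vocabulary size, embedding width, and ResNet-18 backbone are
# assumptions; the paper's exact architecture is not reproduced.
import torch
import torch.nn as nn
from torchvision import models

class CaptionNet(nn.Module):
    def __init__(self, vocab_size, embed=256, hidden=512):
        super().__init__()
        cnn = models.resnet18(weights=None)
        self.encoder = nn.Sequential(*list(cnn.children())[:-1])  # drop fc
        self.img_proj = nn.Linear(512, embed)
        self.embed = nn.Embedding(vocab_size, embed)
        self.lstm = nn.LSTM(embed, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, images, captions):
        feats = self.encoder(images).flatten(1)        # (B, 512)
        feats = self.img_proj(feats).unsqueeze(1)      # image as first token
        words = self.embed(captions)                   # (B, T, E)
        seq, _ = self.lstm(torch.cat([feats, words], dim=1))
        return self.out(seq)                           # next-word logits
```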

5 citations


Journal ArticleDOI
TL;DR: A simple and effective method for decreasing the effect of noise on the autocorrelation of the clean signal by inserting a speech/noise cross correlation term into the equations used for the estimation of clean speech autoc orrelation is discussed.
Abstract: Previous research has found the autocorrelation domain to be an appropriate domain for signal and noise separation. This paper discusses a simple and effective method for decreasing the effect of noise on the autocorrelation of the clean signal, which could later be used in extracting mel-cepstral parameters for speech recognition. Two different methods are proposed to deal with the error introduced by considering speech and noise completely uncorrelated. The basic approach reduces the effect of noise by estimating its contribution and subtracting it from the noisy speech signal autocorrelation. To improve this method, we insert a speech/noise cross-correlation term into the equations used for the estimation of the clean speech autocorrelation, using an estimate of it found through a kernel method. Alternatively, we estimate the cross-correlation term using an averaging approach. A further improvement is obtained by introducing an overestimation parameter in the basic method. We tested the proposed methods on the Aurora 2 task. The basic method shows considerable improvement over the standard features and some other robust autocorrelation-based features, and the proposed techniques further increase the robustness of the basic autocorrelation-based method.
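The basic subtraction step might look like the following sketch, which estimates the noise autocorrelation from noise-only frames and subtracts a (possibly overestimated) version of it; the cross-correlation refinements via the kernel and averaging methods are not reproduced.

```python
# Minimal sketch of the basic approach: estimate the noise autocorrelation
# from non-speech frames and subtract it from the noisy-speech autocorrelation.
# Frame length, lag count, and alpha are assumptions.
import numpy as np

def autocorr(x, nlags):
    """Biased autocorrelation estimate for lags 0..nlags-1."""
    r = np.correlate(x, x, mode="full") / len(x)
    return r[len(x) - 1 : len(x) - 1 + nlags]

def clean_autocorr_estimate(noisy_frame, noise_frames, nlags=64, alpha=1.0):
    r_noisy = autocorr(noisy_frame, nlags)
    r_noise = np.mean([autocorr(f, nlags) for f in noise_frames], axis=0)
    return r_noisy - alpha * r_noise   # alpha > 1 gives overestimation
```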

3 citations


Journal ArticleDOI
TL;DR: A diffusion-steered model is proposed that gives an effective interplay between total variation and Perona-Malik models and can be evolved over a longer time without smudging critical image features.
Abstract: Ultrasonograms are images generated through ultrasonography, a technique that applies ultrasound pulses to delineate internal structures of the body. Despite being useful in medicine, ultrasonograms usually suffer from multiplicative noise that can hinder doctors in analysing and interpreting them. Attempts to address the challenge have been made in previous works, but denoising ultrasonograms while preserving semantic features remains an open problem. In this work, we propose a diffusion-steered model that provides an effective interplay between the total variation and Perona-Malik models. Two parameters are introduced into the framework to convexify our energy functional. Also, to deal with multiplicative noise, we incorporate a log-based prior into the framework. Empirical results show that the proposed method generates sharper and more detailed images. Even more importantly, our framework can be evolved over a longer time without smudging critical image features.
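A minimal sketch of the interplay is an explicit diffusion step whose diffusivity is a convex combination of the Perona-Malik and regularized total-variation diffusivities; the weights and parameters below are assumptions, not the paper's tuned model, and the log-based prior is omitted.

```python
# Minimal sketch: one explicit diffusion step blending Perona-Malik and
# regularized TV diffusivities. Keep dt small: the explicit scheme is only
# conditionally stable. All parameter values are assumptions.
import numpy as np

def diffusion_step(u, k=10.0, w=0.5, dt=0.02, eps=0.1):
    gy, gx = np.gradient(u)                      # axis 0 = y, axis 1 = x
    mag = np.sqrt(gx**2 + gy**2)
    g_pm = 1.0 / (1.0 + (mag / k) ** 2)          # Perona-Malik diffusivity
    g_tv = 1.0 / np.sqrt(mag**2 + eps**2)        # regularized TV diffusivity
    g = w * g_pm + (1.0 - w) * g_tv              # convex combination
    div = np.gradient(g * gx, axis=1) + np.gradient(g * gy, axis=0)
    return u + dt * div                          # u_{n+1} = u_n + dt * div(g grad u)
```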

2 citations


Journal ArticleDOI
TL;DR: The proposed CPD thresholding algorithm does not assume any prior statistical distribution of background and object grey levels and is less influenced by an outlier due to the judicious derivation of a robust criterion function depending on Kullback-Leibler divergence measure.
Abstract: The aim of this paper is the reformulation of the global image thresholding problem as a well-founded statistical method known as the change-point detection (CPD) problem. Our proposed CPD thresholding algorithm does not assume any prior statistical distribution of background and object grey levels. Further, the method is less influenced by outliers due to our judicious derivation of a robust criterion function based on the Kullback-Leibler (KL) divergence measure. Experimental results show the efficacy of the proposed method compared to other popular methods for global image thresholding. In this paper we also propose a performance criterion for comparing thresholding algorithms; this criterion does not depend on any ground-truth image. We have used it to compare the results of the proposed thresholding algorithm with the most cited global thresholding algorithms in the literature.
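The paper's robust CPD criterion is not spelled out here, but a classical KL-divergence-based stand-in is Li's minimum cross-entropy threshold, sketched below over a 256-bin histogram.

```python
# Li's minimum cross-entropy threshold: maximize m0*log(m0/w0) + m1*log(m1/w1),
# a KL-divergence-based criterion. This is a classical stand-in, not the
# paper's robust CPD criterion.
import numpy as np

def min_cross_entropy_threshold(image):
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    g = np.arange(256) + 0.5              # bin centres, kept positive for log
    best_t, best = 0, -np.inf
    for t in range(1, 256):
        w0, w1 = hist[:t].sum(), hist[t:].sum()   # class populations
        if w0 == 0 or w1 == 0:
            continue
        m0 = (g[:t] * hist[:t]).sum()     # first moments of each class
        m1 = (g[t:] * hist[t:]).sum()
        score = m0 * np.log(m0 / w0) + m1 * np.log(m1 / w1)
        if score > best:
            best, best_t = score, t
    return best_t
```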

2 citations



Journal ArticleDOI
TL;DR: The aim of this paper is to show a procedure for identifying the barycentre of a diver by means of video processing, intended to introduce quantitative analysis tools into diving performance measurement and therefore into diving training.
Abstract: The aim of this paper is to show a procedure for identifying the barycentre of a diver by means of video processing. The procedure is intended to introduce quantitative analysis tools into diving performance measurement and therefore into diving training. Sport performance analysis is a trend that is growing exponentially for athletes of all levels; it has been applied extensively in some sports, such as cycling. Sport performance analysis has mainly been applied to high-level athletes; to be usable also by middle- or low-level athletes, the proposed technique has to be flexible and low-cost. Video processing is suitable for fulfilling both these requirements. In diving, the first analysis that has to be done is barycentre trajectory tracking.
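A minimal sketch of barycentre tracking from video, assuming OpenCV: segment the diver with background subtraction and take the centroid from image moments. The parameters are assumptions, not the paper's calibrated pipeline.

```python
# Minimal sketch: per-frame barycentre (centroid) of the segmented diver
# using OpenCV background subtraction and image moments.
import cv2

def track_barycentre(video_path):
    cap = cv2.VideoCapture(video_path)
    bg = cv2.createBackgroundSubtractorMOG2()
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    trajectory = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = bg.apply(frame)
        _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)  # drop shadows
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)       # remove specks
        m = cv2.moments(mask, binaryImage=True)
        if m["m00"] > 0:   # barycentre = first moments / area
            trajectory.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    cap.release()
    return trajectory
```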

Journal ArticleDOI
TL;DR: Experimental results show that the proposed technique outperforms conventional and state-of-the-art techniques, and that images denoised using the Dual Tree Complex Wavelet Transform (DT-CWT) strike a better balance between smoothness and accuracy than those denoised using the DWT.
Abstract: This paper addresses an image enhancement system consisting of an image denoising technique based on the Dual Tree Complex Wavelet Transform (DT-CWT). The proposed algorithm first models the noisy remote sensing image (NRSI) statistically by aptly amalgamating its structural features and textures. This statistical model is decomposed using the DT-CWT with Tap-10 (length-10) filter banks based on the Farras wavelet implementation, and the subband coefficients are denoised with a method that combines clustering techniques with soft thresholding. The clustering techniques classify noisy and image pixels based on neighborhood connected component analysis (CCA), connected pixel analysis, and inter-pixel intensity variance (IPIV), and calculate an appropriate threshold value for noise removal. This threshold value is used with the soft thresholding technique to denoise the image. Experimental results show that the proposed technique outperforms conventional and state-of-the-art techniques, and that images denoised using the DT-CWT strike a better balance between smoothness and accuracy than those denoised using the DWT. We used the Peak Signal to Noise Ratio (PSNR) along with the RMSE to assess the quality of the denoised images.
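A minimal sketch of the DT-CWT denoising step, using the third-party dtcwt Python package; a global universal threshold stands in for the paper's clustering-derived threshold (CCA/IPIV), which is not reproduced.

```python
# Minimal sketch: DT-CWT denoising with a global soft threshold on the
# complex highpass coefficients, via the third-party `dtcwt` package.
import numpy as np
import dtcwt

def soft(c, t):
    """Soft-threshold complex coefficients by magnitude."""
    mag = np.abs(c)
    return np.where(mag > t, (1 - t / (mag + 1e-12)) * c, 0)

def dtcwt_denoise(image, nlevels=4, sigma=10.0):
    transform = dtcwt.Transform2d()
    pyr = transform.forward(image.astype(float), nlevels=nlevels)
    t = sigma * np.sqrt(2 * np.log(image.size))   # universal threshold
    for h in pyr.highpasses:                      # complex coeffs per level
        h[...] = soft(h, t)
    return transform.inverse(pyr)
```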

Journal ArticleDOI
TL;DR: A statistical framework for recognising 2D shapes represented as an arrangement of curves or strokes, showing how the stroke parameters, shape-alignment parameters and stroke labels may be recovered by applying the expectation-maximization (EM) algorithm to the utility measure.
Abstract: This paper presents a statistical framework for recognising 2D shapes which are represented as an arrangement of curves or strokes. The approach is a hierarchical one which mixes geometric and symbolic information in a three-layer architecture. Each curve primitive is represented using a point-distribution model which describes how its shape varies over a set of training data. We assign stroke labels to the primitives, indicating to which class they belong. Shapes are decomposed into an arrangement of primitives, and the global shape representation has two components. The first is a second point-distribution model that represents the geometric arrangement of the curve centre-points. The second is a string of stroke labels that represents the symbolic arrangement of strokes. Hence each shape can be represented by a set of centre-point deformation parameters and a dictionary of permissible stroke label configurations. The hierarchy is a two-level architecture in which the curve models reside at the nonterminal lower level of the tree, and the top level represents the curve arrangements allowed by the dictionary of permissible stroke combinations. The aim in recognition is to minimise the cross entropy between the probability distributions for geometric alignment errors and curve label errors. We show how the stroke parameters, shape-alignment parameters and stroke labels may be recovered by applying the expectation-maximization (EM) algorithm to the utility measure. We apply the resulting shape-recognition method to Arabic character recognition.
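The curve-level point-distribution model can be sketched as PCA over aligned landmark vectors, as below; the EM recovery of stroke labels and alignment parameters is not reproduced.

```python
# Minimal sketch of a point-distribution model: PCA over aligned landmark
# vectors, as used for the curve primitives and centre-point arrangements.
import numpy as np

def fit_pdm(shapes, n_modes=5):
    """shapes: (N, 2K) array of aligned landmark coordinates."""
    mean = shapes.mean(axis=0)
    cov = np.cov(shapes - mean, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_modes]      # largest variance first
    return mean, vecs[:, order], vals[order]

def synthesize(mean, modes, b):
    """Generate a shape from deformation parameters b."""
    return mean + modes @ b
```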

Journal ArticleDOI
TL;DR: This paper proposes a spatial-domain watermarking scheme for color images that uses the Sobel and Canny edge detection methods to determine edge information in the luminance and chrominance components of the color image.
Abstract: Copyright protection has become a challenging problem in real-world situations. A good-quality watermarking scheme should have high perceptual transparency and should also be robust against potential attacks. This paper proposes a spatial-domain watermarking scheme for color images. The scheme uses the Sobel and Canny edge detection methods to determine edge information in the luminance and chrominance components of the color image. The edge detection methods are used to determine the embedding capacity of each color component: more watermark bits are embedded into the component with more edge information. The strength of the proposed scheme is analyzed under different kinds of image processing attacks, such as blurring and added noise.
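A minimal sketch of the capacity-selection idea, assuming OpenCV: compute Sobel and Canny edge maps for the luminance and chrominance channels and allocate the watermark to the channel with the most edge content. The thresholds are assumptions, and the actual embedding is not reproduced.

```python
# Minimal sketch: rank Y/Cr/Cb channels by Sobel+Canny edge content and pick
# the one with the largest embedding capacity. Thresholds are assumptions.
import cv2
import numpy as np

def edge_content(channel):
    gx = cv2.Sobel(channel, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(channel, cv2.CV_64F, 0, 1)
    sobel_edges = np.hypot(gx, gy) > 50        # strong-gradient pixels
    canny_edges = cv2.Canny(channel, 100, 200) > 0
    return int(np.count_nonzero(sobel_edges | canny_edges))

def pick_embedding_channel(bgr_image):
    ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)
    scores = [edge_content(ycrcb[:, :, i]) for i in range(3)]
    return int(np.argmax(scores))   # channel with the largest edge capacity
```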

Journal ArticleDOI
TL;DR: A novel approach to background subtraction that compares the current frame with a previously built background model, so that each pixel of the image can be classified as a foreground or background element; the tracking step is divided into two methods, the surface method and the K-NN method.
Abstract: Object tracking can be defined as the process of detecting an object of interest in a video scene and keeping track of its motion, orientation, occlusion, etc., in order to extract useful information. It is a challenging problem and an important task, and many researchers are being drawn to computer vision, specifically to object tracking in video surveillance. The main purpose of this paper is to give the reader an overview of the present state of the art in object tracking, together with the steps involved in background subtraction and its techniques. In the related literature we found several main methods of object tracking: optical flow; background subtraction, which is divided into two types presented in this paper; temporal differencing and the SIFT method; and the mean-shift method. We present a novel approach to background subtraction that compares the current frame with a background model set beforehand, so that each pixel of the image can be classified as a foreground or background element; the tracking step then represents our object of interest, a person, by its centroid. The tracking step is divided into two different methods, the surface method and the K-NN method, both of which are explained in the paper. Our proposed method is implemented and evaluated using the CAVIAR database.
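The pixel-classification step described above can be sketched as a simple per-pixel comparison against the background model, followed by a centroid computation; the threshold is an assumption, and the surface and K-NN tracking variants are not reproduced.

```python
# Minimal sketch: classify pixels against a fixed background model, then
# represent the foreground object by its centroid. Threshold is an assumption.
import numpy as np

def classify_pixels(frame_gray, background_gray, thresh=25):
    """Returns a boolean mask: True = foreground."""
    return np.abs(frame_gray.astype(int) - background_gray.astype(int)) > thresh

def centroid(mask):
    ys, xs = np.nonzero(mask)
    return (xs.mean(), ys.mean()) if xs.size else None
```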

Journal ArticleDOI
TL;DR: The novel concept of HVDLP is introduced in the proposed method to enhance performance, and the Euclidean distance is used to compare the final features of the face database and test images to compute the performance parameters.
Abstract: The face image is an efficient biometric trait for recognizing human beings without expecting any co-operation from the person. In this paper, we propose HVDLP: Horizontal Vertical Diagonal Local Pattern based face recognition using the Discrete Wavelet Transform (DWT) and the Local Binary Pattern (LBP). In pre-processing, face images of different sizes are converted to a uniform size of 108×90, and color images are converted to grayscale. The DWT is applied to the pre-processed images, and the LL band of size 54×45 is obtained. The novel concept of HVDLP is introduced in the proposed method to enhance performance. HVDLP is applied on 9×9 sub-matrices of the LL band to obtain the HVDLP coefficients. The LBP is then applied on the HVDLP of the LL band. The final features are generated by applying guided filters to the HVDLP and LBP matrices. The Euclidean distance (ED) is used to compare the final features of the face database and test images to compute the performance parameters.
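The surrounding pipeline (DWT LL band, LBP, Euclidean matching) can be sketched as below with pywt and scikit-image; the paper's novel HVDLP step and the guided filtering are omitted.

```python
# Minimal sketch of the surrounding pipeline: DWT LL band -> LBP histogram ->
# Euclidean matching. The paper's HVDLP step and guided filters are omitted.
import numpy as np
import pywt
from skimage.feature import local_binary_pattern

def face_features(gray):
    ll, _ = pywt.dwt2(gray.astype(float), "haar")   # LL band, half size
    lbp = local_binary_pattern(ll, P=8, R=1, method="uniform")
    hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return hist

def match(test_feat, db_feats):
    dists = [np.linalg.norm(test_feat - f) for f in db_feats]
    return int(np.argmin(dists))   # index of the closest database face
```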

Journal ArticleDOI
TL;DR: The researchers estimate the woman's voice under both types of interfering noise, the firefighter siren and the faculty-room noise, and assess the impact through plots of the signal as a function of time, the waveform shape of the signal, and the SNR.
Abstract: This study aimed to estimate an original voice signal corrupted by noise, using an MMSE method based on the Wiener filter. The Wiener filter is a form of the MMSE estimator studied by previous researchers. The study assessed a voice recording of a European woman counting down, distorted by two types of noise: an outdoor noise, the sound of a firefighter siren, and an indoor noise, represented by noise in a lecturer's room on campus. The noisy signal is estimated by an MMSE estimator approximated by a Wiener filter, which requires computing the covariance of each signal process involved in the system. The researchers thus estimate the woman's voice under both types of interfering noise, the firefighter siren and the faculty-room noise, and assess the impact through plots of the signal as a function of time, the waveform shape of the signal, and the SNR.
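A minimal sketch of a Wiener-type gain in the STFT domain, with the noise power spectrum estimated from a noise-only segment; the frame parameters are assumptions and the paper's exact covariance-based formulation is not reproduced.

```python
# Minimal sketch: spectral Wiener gain. Estimate the noise power spectrum
# from a noise-only segment, then attenuate each STFT bin of the noisy signal.
import numpy as np
from scipy.signal import stft, istft

def wiener_denoise(noisy, noise_only, fs=16000, nperseg=512):
    _, _, N = stft(noise_only, fs, nperseg=nperseg)
    noise_psd = np.mean(np.abs(N) ** 2, axis=1, keepdims=True)
    _, _, Y = stft(noisy, fs, nperseg=nperseg)
    speech_psd = np.maximum(np.abs(Y) ** 2 - noise_psd, 1e-12)
    gain = speech_psd / (speech_psd + noise_psd)     # Wiener gain H = S/(S+N)
    _, clean = istft(gain * Y, fs, nperseg=nperseg)
    return clean
```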

Journal ArticleDOI
TL;DR: In this paper, a new gradient-based method for the extraction of the orientation field associated to a fingerprint, and a regularisation procedure to improve the orientation fields computed from noisy fingerprint images was proposed.
Abstract: We propose a new gradient-based method for the extraction of the orientation field associated with a fingerprint, and a regularisation procedure to improve the orientation field computed from noisy fingerprint images. The regularisation algorithm is based on three new integral operators, introduced and discussed in this paper. A pre-processing technique is also proposed to improve the performance of the algorithm. The results of a numerical experiment are reported to give evidence of the efficiency of the proposed algorithm.
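The standard gradient-based estimate the paper builds on can be sketched with doubled-angle block averaging, as below; the three integral regularisation operators are not reproduced, and the block size is an assumption.

```python
# Minimal sketch: gradient-based orientation field with doubled-angle block
# averaging (smoothing the tensor components avoids sign cancellation).
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def orientation_field(fingerprint, block=16):
    gx = sobel(fingerprint.astype(float), axis=1)
    gy = sobel(fingerprint.astype(float), axis=0)
    gxx = uniform_filter(gx * gx, block)
    gyy = uniform_filter(gy * gy, block)
    gxy = uniform_filter(gx * gy, block)
    # dominant gradient orientation; the ridge direction is perpendicular
    return 0.5 * np.arctan2(2 * gxy, gxx - gyy)
```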


Journal ArticleDOI
TL;DR: A device, algorithm, and graphical user interface are introduced to obtain anthropometric measurements of the foot; no significant difference between manual and image-based anthropometry was observed.
Abstract: This paper introduces a device, algorithm, and graphical user interface to obtain anthropometric measurements of the foot. The presented device facilitates obtaining the image scale and simplifies image processing by capturing one image of the side of the foot and the underfoot simultaneously. The introduced image-processing algorithm minimizes a noise criterion; it is suitable for object detection in single-object images and outperforms well-known image thresholding methods when lighting conditions are poor. The performance of the image-based method was compared to the manual method. Image-based measurements of the underfoot were on average 4 mm less than the actual measures. The mean absolute error of the underfoot length was 1.6 mm, whereas the length obtained from the side view had a 4.4 mm mean absolute error. Furthermore, based on t-test and f-test results, no significant difference between manual and image-based anthropometry was observed. To maintain performance in different situations, the user interface was designed to handle changes in lighting conditions and to alter the speed of the algorithm.
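A minimal sketch of the measurement idea under strong assumptions: segment the single foot object with a crude global threshold (standing in for the paper's noise-criterion minimization) and convert the pixel extent to millimetres using a known scale.

```python
# Minimal sketch: single-object segmentation and length measurement.
# The mean threshold stands in for the paper's noise-criterion method,
# and the mm-per-pixel scale is an assumption.
import numpy as np

def foot_length_mm(gray, mm_per_pixel=0.5, dark_object=True):
    t = gray.mean()                        # crude global threshold stand-in
    mask = gray < t if dark_object else gray > t
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return float(xs.max() - xs.min()) * mm_per_pixel   # horizontal extent
```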

Journal ArticleDOI
TL;DR: An algorithm is proposed that views and converts .dcm image files to the JPEG2000 standard image format, whereby the images become viewable with common image viewer programs.
Abstract: The Digital Imaging and Communications in Medicine (DICOM) standard is an image archive system which serves as an image manager controlling the acquisition, retrieval, and distribution of medical images within an entire picture archiving and communication system (PACS). DICOM technology is suitable for sending images between different departments within a hospital, between hospitals, and to consultants. However, some hospitals lack a DICOM system. In this paper, an algorithm is proposed that views and converts .dcm image files to the JPEG2000 standard, whereby the image becomes viewable with common image viewer programs. These files are then ready to be transferred via the internet and are easily viewable on normal computer systems using a JPEG2000 viewer, on both Linux and Windows platforms.
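A minimal sketch of such a conversion, assuming pydicom and a Pillow build with OpenJPEG support; windowing/VOI LUT handling is omitted and the file names are hypothetical.

```python
# Minimal sketch: read a .dcm file with pydicom, rescale to 8-bit, and save
# as JPEG2000 with Pillow. Windowing/VOI LUT handling is omitted.
import numpy as np
import pydicom
from PIL import Image

def dcm_to_jp2(dcm_path, jp2_path):
    ds = pydicom.dcmread(dcm_path)
    px = ds.pixel_array.astype(float)
    px = (255 * (px - px.min()) / (np.ptp(px) + 1e-9)).astype(np.uint8)
    Image.fromarray(px).save(jp2_path, format="JPEG2000")

# Hypothetical usage:
# dcm_to_jp2("study.dcm", "study.jp2")
```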