
Showing papers in "Journal of Signal and Information Processing in 2014"


Journal ArticleDOI
TL;DR: Various acoustic features are combined into a feature set to detect voice disorders in children, on the basis of which further treatment can be prescribed by a pathologist; a successful classification would enable an automatic, non-invasive device to diagnose and analyze a patient's voice.
Abstract: The identification and classification of pathological voice are still a challenging area of research in speech processing. Acoustic features of speech are used mainly to discriminate normal voices from pathological voices. This paper explores and compares various classification models to assess the ability of acoustic parameters to differentiate normal voices from pathological voices. An attempt is made to analyze and to discriminate pathological voice from normal voice in children using different classification methods. The classification of pathological voice from normal voice is implemented using the Support Vector Machine (SVM) and the Radial Basis Functional Neural Network (RBFNN). The normal and pathological voices of children are used to train and test the classifiers. A dataset is constructed by recording speech utterances of a set of Tamil phrases. The speech signal is then analyzed in order to extract acoustic parameters such as signal energy, pitch, formant frequencies, mean square residual signal, reflection coefficients, jitter and shimmer. In this study, various acoustic features are combined into a feature set for detecting voice disorders in children, on the basis of which further treatment can be prescribed by a pathologist. Hence, a successful pathological voice classification will enable an automatic, non-invasive device to diagnose and analyze the voice of a patient.
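A minimal sketch (not the authors' code) of the classification step: placeholder feature vectors standing in for the extracted acoustic parameters (energy, pitch, formants, residual, reflection coefficients, jitter, shimmer) are standardized and fed to an RBF-kernel SVM, one of the two classifiers compared in the paper.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 7))        # placeholder acoustic feature vectors
y = rng.integers(0, 2, size=100)     # 0 = normal, 1 = pathological

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
scaler = StandardScaler().fit(X_tr)                 # standardize features
clf = SVC(kernel="rbf").fit(scaler.transform(X_tr), y_tr)
print("accuracy:", clf.score(scaler.transform(X_te), y_te))
```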

37 citations


Journal ArticleDOI
TL;DR: Results showed that using one-fourth overlapped data buffers with 128-point Hanning windows and no frame averaging leads to the best performance in removing noise from the noisy speech.
Abstract: Spectral subtraction is used in this research as a method to remove noise from noisy speech signals in the frequency domain. This method consists of computing the spectrum of the noisy speech using the Fast Fourier Transform (FFT) and subtracting the average magnitude of the noise spectrum from the noisy speech spectrum. We applied spectral subtraction to the speech signal “Real graph”. A digital audio recorder system embedded in a personal computer was used to sample the speech signal “Real graph”, to which we digitally added vacuum cleaner noise. The noise removal algorithm was implemented using Matlab software by storing the noisy speech data into Hanning time-windowed half-overlapped data buffers, computing the corresponding spectra using the FFT, removing the noise from the noisy speech, and reconstructing the speech back into the time domain using the inverse Fast Fourier Transform (IFFT). The performance of the algorithm was evaluated by calculating the Speech to Noise Ratio (SNR). Frame averaging was introduced as an optional technique that could improve the SNR. Seventeen different configurations with various lengths of the Hanning time windows, various degrees of data buffer overlapping, and various numbers of frames to be averaged were investigated in view of improving the SNR. Results showed that using one-fourth overlapped data buffers with 128-point Hanning windows and no frame averaging leads to the best performance in removing noise from the noisy speech.
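A hedged sketch of the magnitude spectral subtraction pipeline described above: 128-point Hanning windows with half-overlapped buffers, a noise magnitude averaged over frames of a noise-only recording (an assumed input), subtraction floored at zero, and overlap-add reconstruction via the IFFT.

```python
import numpy as np

def avg_noise_magnitude(noise, n, hop, win):
    # average noise magnitude spectrum over all frames of a noise recording
    frames = [win * noise[s:s + n] for s in range(0, len(noise) - n + 1, hop)]
    return np.mean([np.abs(np.fft.rfft(f)) for f in frames], axis=0)

def spectral_subtract(noisy, noise, n=128):
    hop, win = n // 2, np.hanning(n)           # half-overlapped Hanning buffers
    noise_mag = avg_noise_magnitude(noise, n, hop, win)
    out = np.zeros(len(noisy))
    for s in range(0, len(noisy) - n + 1, hop):
        spec = np.fft.rfft(win * noisy[s:s + n])
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)  # subtract, floor at 0
        # keep the noisy phase, invert, and overlap-add
        out[s:s + n] += np.fft.irfft(mag * np.exp(1j * np.angle(spec)), n)
    return out
```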

35 citations


Journal ArticleDOI
TL;DR: In this article, a nonlinear autoregressive approach with exogenous input is used as a novel method for statistical forecasting of the disturbance storm time index, a measure of space weather related to the ring current which surrounds the Earth, and of fluctuations in disturbance storm time field strength resulting from incoming solar particles.
Abstract: A nonlinear autoregressive approach with exogenous input is used as a novel method for statistical forecasting of the disturbance storm time index, a measure of space weather related to the ring current which surrounds the Earth, and of fluctuations in disturbance storm time field strength as a result of incoming solar particles. This ring current produces a magnetic field which opposes the planetary geomagnetic field. Given the occurrence of solar activity hours or days before subsequent geomagnetic fluctuations and the potential effects that geomagnetic storms have on terrestrial systems, it would be useful to be able to predict geophysical parameters in advance using both historical disturbance storm time indices and external input of solar winds and the interplanetary magnetic field. By assessing various statistical techniques, it is determined that artificial neural networks may be ideal for predicting disturbance storm time index values, which may in turn be used to forecast geomagnetic storms. Furthermore, it is found that a Bayesian regularization neural network algorithm may be the most accurate model compared with both the other forms of artificial neural network used and the linear models employing regression analyses.
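An illustrative NARX-style setup (not the authors' model): the index at time t is predicted from p past index values plus lagged exogenous solar wind / interplanetary magnetic field inputs. scikit-learn has no Bayesian-regularization trainer (MATLAB's trainbr), so an L2-regularized MLP stands in here, and the series and exogenous inputs are placeholders.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_narx_dataset(dst, exo, p=6):
    # rows: [dst[t-p..t-1], exo[t-p..t-1]] -> target dst[t]
    X, y = [], []
    for t in range(p, len(dst)):
        X.append(np.concatenate([dst[t - p:t], exo[t - p:t].ravel()]))
        y.append(dst[t])
    return np.array(X), np.array(y)

rng = np.random.default_rng(1)
dst = np.cumsum(rng.normal(size=500))     # placeholder index time series
exo = rng.normal(size=(500, 2))           # placeholder solar wind / IMF inputs
X, y = make_narx_dataset(dst, exo)
model = MLPRegressor(hidden_layer_sizes=(20,), alpha=1e-3,
                     max_iter=2000, random_state=0).fit(X[:-50], y[:-50])
print("held-out R^2:", model.score(X[-50:], y[-50:]))
```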

23 citations


Journal ArticleDOI
TL;DR: It is shown that applying the wavelet edge detection method to segmented images generated through the proposed image preprocessing approach yields superior performance compared with other standard edge detection methods.
Abstract: Edge detection is the process of determining where the boundaries of objects fall within an image. So far, several standard operator-based methods have been widely used for edge detection. However, due to the inherent quality of images, these methods prove ineffective if they are applied without any preprocessing. In this paper, an image preprocessing approach has been adopted in order to obtain certain parameters that are useful for performing better edge detection with the standard operator-based edge detection methods. The proposed preprocessing approach involves computing the histogram, finding the total number of peaks and suppressing irrelevant peaks. From the intensity values corresponding to the relevant peaks, threshold values are obtained. From these threshold values, optimal multilevel thresholds are calculated using the Otsu method, and then multilevel image segmentation is carried out. Finally, a standard edge detection method can be applied to the resulting segmented image. Simulation results are presented to show that our preprocessing approach, when used with a standard edge detection method, enhances its performance. It is also shown that applying the wavelet edge detection method to the segmented images generated through our preprocessing approach yields superior performance compared with the other standard edge detection methods.
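A sketch of the resulting chain, assuming scikit-image as the toolkit: multilevel Otsu thresholds quantize the image into segments, and a standard edge operator is then applied to the segmented image. The histogram peak-suppression step is omitted here; threshold_multiotsu takes the class count directly.

```python
import numpy as np
from skimage import data, filters
from skimage.filters import threshold_multiotsu

img = data.camera()                                # sample grayscale image
thresholds = threshold_multiotsu(img, classes=3)   # optimal multilevel Otsu
segmented = np.digitize(img, bins=thresholds)      # multilevel segmentation
edges = filters.sobel(segmented.astype(float))     # standard edge operator
```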

22 citations


Journal ArticleDOI
TL;DR: A mathematical approach is proposed to identify characteristics of acceleration and deceleration signals from real traffic situations, to quantify their similarity and complexity, and to enable the reconstruction of bus ACC and DEC signals.
Abstract: Public transportation by bus is an essential part of mobility. Braking and starting, e.g., when approaching a bus stop, are documented as the main cause of non-collision incidents. These situations are evoked by acceleration forces that perturb the passenger’s base of support. In laboratory studies, perturbations are applied to gain insight into the postural control system and neuromuscular responses. However, bus perturbations diverge from laboratory ones with respect to duration, maximum and shape, and it was shown recently that these characteristics influence the postural response. Thus, results from posturographic studies cannot be generalised and transferred to bus perturbations. In this study, acceleration (ACC) and deceleration (DEC) signals of real traffic situations were examined. A mathematical approach is proposed in order to identify characteristics of these signals and to quantify their similarity and complexity. Typical characteristics (duration, maximum, and shape) of real-world driving manoeuvres concerning start and stop situations could be identified. A mean duration of 13.6 s for ACC and 9.8 s for DEC signals was found, which is clearly longer than laboratory perturbations. ACC and DEC signals are also more complex than the signals used for platform displacements in the laboratory. The proposed method enables the reconstruction of bus ACC and DEC signals. The data can be used as input for studies on postural control with high ecological validity.

16 citations


Journal ArticleDOI
TL;DR: Inspired by the core foundation of quantum mechanics, a new, simple shape representation for content based image retrieval is proposed by borrowing the concept of quantum superposition into the basis of the distance histogram.
Abstract: Content Based Image Retrieval (CBIR) is a technique in which images are indexed based on their visual contents, and retrieval is based only on these indexed contents. Among the visual contents used to describe image details is shape. The shape of an object is considered the most important distinguishing feature, one that living things recognize easily, and large efforts are currently underway to describe image contents by their shapes. Inspired by the core foundation of quantum mechanics, a new, simple shape representation for content based image retrieval is proposed by borrowing the concept of quantum superposition into the basis of the distance histogram. Results show better retrieval accuracy of the proposed method when compared with the distance histogram.
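A hedged sketch of the plain distance histogram the paper builds on (the quantum superposition weighting itself is not reproduced here): distances from the shape centroid to its boundary points, normalized for scale and binned into a histogram descriptor.

```python
import numpy as np

def distance_histogram(boundary_xy, bins=32):
    # boundary_xy: (N, 2) array of boundary point coordinates
    centroid = boundary_xy.mean(axis=0)
    d = np.linalg.norm(boundary_xy - centroid, axis=1)
    d = d / d.max()                            # scale invariance
    hist, _ = np.histogram(d, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()                   # normalized shape descriptor
```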

9 citations


Journal ArticleDOI
TL;DR: A test on parts of the PB.A57 manuscript concluded, with a 95% level of confidence, that the success percentage of preprocessing in producing Javanese character images ranged from 85.9% to 94.82%.
Abstract: Manuscript preprocessing is the earliest stage in the transliteration process of manuscripts in Javanese scripts. The manuscript preprocessing stage aims to produce images of the letters forming the manuscript, to be processed further in the manuscript transliteration system. There are four main steps in manuscript preprocessing: manuscript binarization, noise reduction, line segmentation, and character segmentation for every line image produced by line segmentation. A test on parts of the PB.A57 manuscript, containing 291 character images, concluded with a 95% level of confidence that the success percentage of preprocessing in producing Javanese character images ranged from 85.9% to 94.82%.
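A sketch of two of the four preprocessing steps under common assumptions: Otsu binarization and line segmentation by horizontal projection, where rows containing no ink separate consecutive text lines. Noise reduction and per-line character segmentation (by vertical projection) would be analogous.

```python
import numpy as np
from skimage.filters import threshold_otsu

def segment_lines(gray):
    ink = gray < threshold_otsu(gray)          # binarization (True = ink)
    rows = ink.sum(axis=1)                     # horizontal projection profile
    in_line, lines, start = False, [], 0
    for i, r in enumerate(rows):
        if r > 0 and not in_line:              # line begins at first ink row
            in_line, start = True, i
        elif r == 0 and in_line:               # line ends at first blank row
            in_line = False
            lines.append(ink[start:i])         # one binarized text-line image
    return lines
```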

7 citations


Journal ArticleDOI
TL;DR: In this paper, an improved partial transmit sequence (PTS) scheme based on combining the grouped discrete cosine transform (DCT) with the PTS technique is proposed, where adjacent partitioned data are first transformed by a DCT into new modified data, and the proposed scheme then utilizes the conventional PTS technique to further reduce the peak-to-average power ratio (PAPR) of the OFDM signal.
Abstract: The high peak-to-average power ratio (PAPR) is one of the serious problems in the application of OFDM technology. In this paper, an improved partial transmit sequence (PTS) scheme based on combining the grouped discrete cosine transform (DCT) with the PTS technique is proposed. In the proposed scheme, the adjacent partitioned data are first transformed by a DCT into new modified data. After that, the scheme utilizes the conventional PTS technique to further reduce the PAPR of the OFDM signal. The PAPR performance is evaluated using computer simulation. The simulation results indicate that the proposed scheme may improve the PAPR performance compared with the conventional PTS scheme, the grouped DCT scheme, and original OFDM.
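An illustrative sketch of the combined scheme: adjacent partitioned data are DCT-transformed, then a conventional PTS search over sub-block phase factors minimizes the PAPR. The number of sub-blocks and the phase set {+1, -1} are assumptions, not the paper's parameters.

```python
import numpy as np
from scipy.fftpack import dct
from itertools import product

def papr_db(x):
    p = np.abs(x) ** 2
    return 10 * np.log10(p.max() / p.mean())

def dct_pts(symbols, n_blocks=4):
    X = dct(symbols, norm="ortho")                    # grouped DCT step
    blocks = np.split(X, n_blocks)                    # adjacent partitioning
    best, best_papr = None, np.inf
    for phases in product([1, -1], repeat=n_blocks):  # exhaustive phase search
        Xp = np.concatenate([b * w for b, w in zip(blocks, phases)])
        x = np.fft.ifft(Xp)                           # OFDM modulation
        if (p := papr_db(x)) < best_papr:
            best, best_papr = x, p
    return best, best_papr

# QPSK-like complex symbols as a stand-in input
qpsk = (np.random.default_rng(0).choice([1, -1], (2, 256)) * [[1], [1j]]).sum(0)
print("PAPR (dB):", dct_pts(qpsk)[1])
```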

7 citations


Journal ArticleDOI
TL;DR: A new adaptive control method is presented for adjusting the output voltage and current of a DC-DC (Direct Current) power converter under different sudden changes in load.
Abstract: The purpose of this paper is to present a new adaptive control method used to adjust the output voltage and current of a DC-DC (DC: Direct Current) power converter under different sudden changes in load. The controller is a PID controller (Proportional, Integral, and Derivative). The gains of the PID controller (KP, KI and KD) are tuned using the Simulated Annealing (SA) algorithm, which belongs to the family of generic probabilistic metaheuristics. The new control system is expected to have a fast transient response, with less undershoot of the output voltage and less overshoot of the reactor current. Pulse Width Modulation (PWM) is utilized to switch the power electronic devices.
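A hedged sketch of the tuning idea: simulated annealing searches for PID gains (KP, KI, KD) that minimize a step-response cost, here an integral-of-absolute-error on a stand-in first-order plant rather than the paper's converter model.

```python
import math, random

def step_cost(kp, ki, kd, dt=1e-3, steps=2000):
    y = integ = prev_e = 0.0
    cost = 0.0
    for _ in range(steps):
        e = 1.0 - y                        # unit-step setpoint
        integ += e * dt
        d = (e - prev_e) / dt
        prev_e = e
        u = kp * e + ki * integ + kd * d   # PID control law
        y += dt * (-y + u)                 # stand-in first-order plant
        if abs(y) > 1e6:                   # diverged: penalize heavily
            return 1e9
        cost += abs(e) * dt                # integral of absolute error
    return cost

def anneal(iters=2000, t0=1.0, cooling=0.995):
    gains = [1.0, 1.0, 0.0]                # initial (Kp, Ki, Kd)
    cur = step_cost(*gains)
    best_gains, best = list(gains), cur
    t = t0
    for _ in range(iters):
        cand = [max(0.0, g + random.gauss(0, 0.1)) for g in gains]
        c = step_cost(*cand)
        # accept improvements always, worse moves with Boltzmann probability
        if c < cur or random.random() < math.exp((cur - c) / t):
            gains, cur = cand, c
            if cur < best:
                best_gains, best = list(gains), cur
        t *= cooling                       # geometric cooling schedule
    return best_gains, best

print(anneal())
```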

5 citations


Journal ArticleDOI
TL;DR: In this article, a closed-form approximation for the residual inter-symbol interference (ISI) obtained by blind adaptive equalizers is proposed for the noisy and biased input case, where the error of the equalized output signal may be expressed as a polynomial function of order 3.
Abstract: Recently, two expressions (for the noiseless and noisy cases) were proposed for the residual inter-symbol interference (ISI) obtained by blind adaptive equalizers, where the error of the equalized output signal may be expressed as a polynomial function of order 3. However, those expressions are not applicable to biased input signals. In this paper, a closed-form approximate expression is proposed for the residual ISI applicable to the noisy and biased-input case. This new expression is valid for blind adaptive equalizers where the error of the equalized output signal may be expressed as a polynomial function of order 3. It depends on the equalizer’s tap length, input signal statistics, channel power, SNR, step-size parameter and the input signal’s bias. Simulation results indicate a high correlation between the simulated results and those obtained from our new proposed expression.

4 citations


Journal ArticleDOI
TL;DR: The results showed that the rate of correct recognition of the proposed system is about 100% for training files and 95.7% for one testing file per speaker from the ELSDSR database, and its efficiency was better than that of the well-known Mel Frequency Cepstral Coefficient (MFCC) method and the Zak transform.
Abstract: In this paper, an expert system for security based on biometric human features that can be obtained without any contact with the registering sensor is presented. These features are extracted from the human voice, so the system is called a Voice Recognition System (VRS). The proposed system consists of a combination of three stages: signal pre-processing, feature extraction using the Wavelet Packet Transform (WPT) and feature matching using Artificial Neural Networks (ANNs). The feature vectors are formed in two steps: first, the speech signal is decomposed at level 7 with the Daubechies 20-tap wavelet (db20); second, the energy corresponding to each WPT node is calculated, and these energies are collected to form a feature vector. A 128-element feature vector for each speaker was fed to a Feed Forward Back-propagation Neural Network (FFBPNN). The data used in this paper are drawn from the English Language Speech Database for Speaker Recognition (ELSDSR), which consists of audio files for training and other files for testing. The performance of the proposed system is evaluated using the test files. Our results showed that the rate of correct recognition of the proposed system is about 100% for training files and 95.7% for one testing file per speaker from the ELSDSR database. The proposed method showed better efficiency than the well-known Mel Frequency Cepstral Coefficient (MFCC) method and the Zak transform.
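A sketch of the feature-extraction stage, assuming the PyWavelets package: a 7-level wavelet packet decomposition with db20, with the energy of each of the 2^7 = 128 terminal nodes collected into one feature vector.

```python
import numpy as np
import pywt

def wpt_energy_features(signal, wavelet="db20", level=7):
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, maxlevel=level)
    nodes = wp.get_level(level, order="natural")   # 2**7 = 128 terminal nodes
    # energy of each node's coefficients, collected into one vector
    return np.array([np.sum(n.data ** 2) for n in nodes])

x = np.random.default_rng(0).normal(size=16384)    # placeholder speech signal
features = wpt_energy_features(x)
print(features.shape)                              # (128,)
```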

Journal ArticleDOI
TL;DR: Measurements reiterate recent measurements in a large population of human brains showing the superimposition of Schumann power densities in QEEG data and indicate that intrinsic features of proton densities within cerebral water may be a fundamental basis to consciousness that can be simulated experimentally.
Abstract: The physical properties of water, particularly the nature of interfacial water and pH shifts associated with dynamics of the hydronium ion near any surface, may be a primary source of the complex electromagnetic patterns frequently correlated with consciousness. Effectively all of the major correlates of consciousness, including the 40 Hz and 8 Hz coupling between the cerebral cortices and hippocampal formation, can be accommodated by the properties of water within a specific-shaped volume exposed to a magnetic field. In the present study, quantitative electroencephalographic activity was measured from an experimental simulation of the human head constructed using conductive dough whose pH could be changed systematically. Spectral analyses of electrical potentials generated over the regions equivalent to the left and right temporal lobes in humans exhibited patterns characteristic of Schumann Resonance. This fundamental and its harmonics are generated within the earth-ionospheric cavity with intensities similar to the volumetric intracerebral magnetic (~2 pT) and electric field (~6 × 10^-1 V·m^-1) strengths. The power densities for specific pH values were moderately correlated with those obtained from normal human brains for the fundamental (first) and second harmonic for the level simulating the cerebral cortices. Calculations indicated that the effective pH would be similar to that encountered within a single layer of protons near the plasma membrane surface. These results reiterate recent measurements in a large population of human brains showing the superimposition of Schumann power densities in QEEG data and indicate that intrinsic features of proton densities within cerebral water may be a fundamental basis to consciousness that can be simulated experimentally.

Journal ArticleDOI
TL;DR: Experimental results indicate that this de-noising method for IMU acceleration signals can efficiently and adaptively remove noise and can better meet the precision requirement.
Abstract: In track irregularity detection, the acceleration signals output by the inertial measurement unit (IMU) contain low-frequency components and noise; this paper studies a de-noising algorithm for such signals. Based on the criterion of consecutive mean square error, a de-noising method for IMU acceleration signals based on empirical mode decomposition (EMD) is proposed. This method divides the intrinsic mode functions (IMFs) derived from EMD into signal-dominant modes and noise-dominant modes; the modes reflecting the important structures of the signal are then combined to form a partially reconstructed, de-noised signal. Simulations were conducted for simulated signals and real IMU acceleration signals using this method. Experimental results indicate that the method can efficiently and adaptively remove noise and can better meet the precision requirement.
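A hedged sketch of EMD-based partial reconstruction, assuming the PyEMD package: the IMFs are split at an index chosen by a simplified consecutive mean-square-error rule, the noise-dominant leading modes are dropped, and the remaining modes are summed.

```python
import numpy as np
from PyEMD import EMD

def emd_denoise(x):
    imfs = EMD().emd(x)                       # intrinsic mode functions
    if len(imfs) < 2:
        return x
    # partial reconstructions dropping the first k (noise-dominant) modes
    recons = [imfs[k:].sum(axis=0) for k in range(len(imfs))]
    # consecutive MSE between successive partial reconstructions
    cmse = [np.mean((recons[k] - recons[k + 1]) ** 2)
            for k in range(len(recons) - 1)]
    k_star = int(np.argmin(cmse)) + 1         # first signal-dominant mode
    return imfs[k_star:].sum(axis=0)
```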

Journal ArticleDOI
TL;DR: The convergence property of the regularized low pass filtering algorithm is proved in theory and tested by numerical results.
Abstract: In this paper, low pass filtering is discussed in the noisy case, and a regularized low pass filter is presented. The convergence property of the regularized low pass filtering algorithm is proved in theory and tested with numerical results.

Journal ArticleDOI
TL;DR: A simple method is proposed to automatically localize signboard texts within JPEG mobile phone camera images using the Discrete Cosine Transform employed by the JPEG compression format.
Abstract: Extraction of the text data present in images involves text detection, localization, tracking, extraction, enhancement and recognition. Due to their inherent complexity, traditional text localization algorithms for natural scenes, especially multi-context scenes, are not implementable on architectures with low computational resources such as mobile phones. In this paper, we propose a simple method to automatically localize signboard texts within JPEG mobile phone camera images. Taking into account the information provided by the Discrete Cosine Transform (DCT) used by the JPEG compression format, we delimit the borders of the most important text region. This system is simple, reliable, affordable, easily implementable, and quick, even working on architectures with low computational resources.
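An illustrative sketch of the underlying idea: text regions concentrate energy in the AC coefficients of 8x8 DCT blocks (the block size JPEG itself uses), so a coarse text-likelihood map can be built from per-block AC energy. The paper works from the coefficients already present in the JPEG stream; here they are recomputed from pixels for simplicity.

```python
import numpy as np
from scipy.fftpack import dctn

def ac_energy_map(gray, b=8):
    # crop to a multiple of the block size, then tile into 8x8 blocks
    h, w = (d - d % b for d in gray.shape)
    blocks = gray[:h, :w].reshape(h // b, b, w // b, b).swapaxes(1, 2)
    coeffs = dctn(blocks, axes=(2, 3), norm="ortho")   # per-block 2-D DCT
    coeffs[:, :, 0, 0] = 0.0                           # discard DC term
    return (coeffs ** 2).sum(axis=(2, 3))              # per-block AC energy
```

Thresholding this map and taking the bounding box of the largest high-energy component would delimit a candidate text region.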

Journal ArticleDOI
TL;DR: New data analysis methods and new ways to detect people's work locations via mobile computing technology are presented, together with a centralized mobile device management (MDM) system.
Abstract: Client software on mobile devices that can remotely control data mining tasks and display their results adds significant value for nomadic users and organizations that need to analyze data stored in a repository far away from the site where they work, allowing them to generate knowledge regardless of their physical location. This paper presents new data analysis methods and new ways to detect people's work locations via mobile computing technology. A growing number of applications, content, and data can be accessed from a wide range of devices, so it becomes necessary to introduce centralized mobile device management (MDM). MDM is a KDE software package working with enterprise systems using mobile devices. The paper discusses the system design in detail.

Journal ArticleDOI
TL;DR: A model of speech synthesis that focuses on intonation is proposed, producing a prosody file that can be pronounced by a speech engine application.
Abstract: Prosody in speech synthesis (text-to-speech) systems determines the tone, duration, and loudness of speech sound. Intonation is the part of prosody that determines the speech tone. In Indonesian, intonation is determined by the structure of a sentence, the type of sentence, and the position of each word in the sentence. In this study, a model of speech synthesis that focuses on intonation is proposed. The speech intonation is determined by sentence structure, intonation patterns of example sentences, and general rules of Indonesian pronunciation. The model receives texts and intonation patterns as inputs. Based on the general principles of Indonesian pronunciation, a prosody file is made. Based on the input text, the sentence structure is determined, and then the intervals among parts of the sentence (phrases) can be determined. These intervals are used to correct the durations in the initial prosody file. Furthermore, the frequencies in the prosody file are corrected using the intonation patterns. The final result is a prosody file that can be pronounced by a speech engine application. Experimental results using the original voice of a radio news announcer and the speech synthesis show that the peaks of F0 are determined by whichever general rules or intonation patterns are dominant. A similarity test with the PESQ method shows that the synthesis result scores 1.18 on the MOS-LQO scale.

Journal ArticleDOI
Amin Alqudah
TL;DR: Some computational geometry preliminaries are discussed, followed by a summary of different techniques used to address the surface reconstruction problem, emphasizing their advantages and disadvantages.
Abstract: Surface reconstruction is a problem in the field of computational geometry that is concerned with recreating a surface from scattered data points sampled from an unknown surface. To date, the primary application of surface reconstruction algorithms has been in computer graphics, where physical models are digitized in three dimensions with laser range scanners or mechanical digitizing probes (Bernardini et al., 1999 [1]). Surface reconstruction algorithms are used to convert the set of digitized points into a wire frame mesh model, which can be colored, textured, shaded, and placed into a 3D scene (in a movie or television commercial, for example). In this paper, we discuss some computational geometry preliminaries, and then move on to a summary of some different techniques used to address the surface reconstruction problem. The following sections describe two algorithms: that of Hoppe et al. (1992 [2]) and that of Amenta et al. (1998 [3]). Finally, we present other applications of surface reconstruction and a brief comparison of some algorithms in this field, emphasizing their advantages and disadvantages.

Journal ArticleDOI
TL;DR: This article, which proposed a method for template matching from 2-D into 1-D, has been retracted.
Abstract: The following article has been retracted due to special reasons of the authors. This paper, published in Vol. 5, No. 2, 2014, has been removed from this site. Title: Template Matching from 2-D into 1-D. Author: Yasser Fouda.

Journal ArticleDOI
TL;DR: The results obtained by the proposed algorithm show that the tracking process was successfully carried out for a set of color videos with different challenging conditions such as occlusion, illumination changes, cluttered conditions, and object scale changes.
Abstract: This paper presents a new kernel-based algorithm for video object tracking called rebound of region of interest (RROI). The novel algorithm uses a rectangle-shaped section as region of interest (ROI) to represent and track specific objects in videos. The proposed algorithm consists of two stages. The first stage seeks to determine the direction of the object’s motion by analyzing the changing regions around the object being tracked between two consecutive frames. Once the direction of the object’s motion has been predicted, an iterative process is initialized that seeks to minimize a dissimilarity function in order to find the location of the object being tracked in the next frame. The main advantage of the proposed algorithm is that, unlike existing kernel-based methods, it is immune to highly cluttered conditions. The results obtained by the proposed algorithm show that the tracking process was successfully carried out for a set of color videos with different challenging conditions such as occlusion, illumination changes, cluttered conditions, and object scale changes.

Journal ArticleDOI
TL;DR: On the basis of the tree matrix algorithm, complete descriptions of loops can be obtained, providing a foundation for further research on the relations between loops and LDPC codes’ performance.
Abstract: LDPC codes are finding increasing use in applications requiring reliable and highly efficient information transfer over bandwidth-limited channels. An LDPC code is defined by a sparse parity-check matrix and can be described by a bipartite graph called a Tanner graph. Loops in the Tanner graph prevent the sum-product algorithm from converging. Further, loops, especially short loops, degrade the performance of the LDPC decoder, because they affect the independence of the extrinsic information exchanged in the iterative decoding. Using graph theory, this paper deduces the cut-node tree graph of an LDPC code and depicts it with a matrix. On the basis of the tree matrix algorithm, complete descriptions of loops can be obtained, providing a foundation for further research on the relations between loops and LDPC codes’ performance.
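A hedged baseline sketch (not the tree matrix algorithm itself): the girth of a Tanner graph, i.e. the length of its shortest loop, found by breadth-first search from each variable node of a parity-check matrix H.

```python
import numpy as np
from collections import deque

def tanner_girth(H):
    m, n = H.shape
    # nodes 0..n-1: variable nodes; nodes n..n+m-1: check nodes
    adj = [[] for _ in range(n + m)]
    for i, j in zip(*np.nonzero(H)):
        adj[j].append(n + i)
        adj[n + i].append(j)
    girth = np.inf
    for src in range(n):                       # every loop touches a variable node
        dist = {src: 0}
        q = deque([(src, -1)])                 # (node, parent)
        while q:
            u, parent = q.popleft()
            for v in adj[u]:
                if v == parent:
                    continue
                if v in dist:                  # non-tree edge closes a loop
                    girth = min(girth, dist[u] + dist[v] + 1)
                else:
                    dist[v] = dist[u] + 1
                    q.append((v, u))
    return girth

H = np.array([[1, 1, 0, 1], [0, 1, 1, 1], [1, 0, 1, 0]])
print(tanner_girth(H))                         # 4 (shortest loop length)
```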

Journal ArticleDOI
TL;DR: A novel implementation of the level set method that achieves real-time level-set-based object tracking through simple operations such as switching the values of the level set functions, with no need to solve any partial differential equations (PDEs).
Abstract: This paper proposes a novel implementation of the level set method that achieves real-time level-set-based object tracking. In the proposed algorithm, the evolution of the curve is realized by simple operations such as switching the values of the level set functions, and there is no need to solve any partial differential equations (PDEs). The object contour may change due to changes in location or orientation, or due to the changeable nature of the object shape itself. Knowing the contour, the average color value of the pixels within the contour can be found. The estimated object color and contour in one frame are the bases for locating the object in the next frame. The color is used to segment the object pixels, and the estimated contour is used to initialize the deformation process. Thus, the algorithm works in a closed cycle in which the color is used to segment the object pixels to get the object contour, and the contour is used to get the typical color of the object. With our fast algorithm, a real-time system has been implemented on a standard PC. Results from standard test sequences and our real-time system are presented.
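A minimal sketch of the closed color/contour cycle described above: the mean color inside the current mask segments candidate object pixels for the next frame, and the boundary of the new mask re-initializes the contour. The fast list-based level set switching itself is not reproduced here.

```python
import numpy as np

def track_step(frame, mask, tol=30.0):
    # frame: (H, W, 3) float image; mask: (H, W) boolean object mask
    mean_color = frame[mask].mean(axis=0)          # typical object color
    dist = np.linalg.norm(frame - mean_color, axis=2)
    new_mask = dist < tol                          # color-based segmentation
    # contour = mask pixels with at least one background 4-neighbor
    pad = np.pad(new_mask, 1)
    interior = (pad[:-2, 1:-1] & pad[2:, 1:-1] &
                pad[1:-1, :-2] & pad[1:-1, 2:])
    contour = new_mask & ~interior
    return new_mask, contour                       # feeds the next frame
```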