
Showing papers in "EURASIP Journal on Advances in Signal Processing in 2007"


Journal ArticleDOI
TL;DR: Several ECG applications are reviewed where PCA techniques have been successfully employed, including data compression, ST-T segment analysis for the detection of myocardial ischemia and abnormalities in ventricular repolarization, extraction of atrial fibrillatory waves for detailed characterization of atrial fibrillation, and analysis of body surface potential maps.
Abstract: This paper reviews the current status of principal component analysis in the area of ECG signal processing. The fundamentals of PCA are briefly described and the relationship between PCA and the Karhunen-Loeve transform is explained. Aspects of PCA related to data with temporal and spatial correlations are considered, as is adaptive estimation of principal components. Several ECG applications are reviewed where PCA techniques have been successfully employed, including data compression, ST-T segment analysis for the detection of myocardial ischemia and abnormalities in ventricular repolarization, extraction of atrial fibrillatory waves for detailed characterization of atrial fibrillation, and analysis of body surface potential maps.

322 citations
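
As a concrete illustration of the PCA/Karhunen-Loeve machinery the review covers, the following minimal numpy sketch computes principal components of R-peak-aligned heartbeats from the sample covariance and uses them for compression-style reconstruction; the beat matrix, its dimensions, and the component count are illustrative assumptions, not data from the paper.

```python
# Sketch: PCA of time-aligned ECG beats via the sample covariance matrix.
import numpy as np

def ecg_pca(beats, n_components=8):
    """beats: (n_beats, n_samples) array of R-peak-aligned ECG cycles."""
    centered = beats - beats.mean(axis=0)
    cov = centered.T @ centered / (len(beats) - 1)   # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)           # ascending order
    order = np.argsort(eigvals)[::-1]                # sort descending
    basis = eigvecs[:, order[:n_components]]         # principal components
    scores = centered @ basis                        # expansion coefficients
    return basis, scores, eigvals[order]

# Data-compression use: keep only the leading coefficients and reconstruct;
# the residual energy equals the sum of the discarded eigenvalues.
rng = np.random.default_rng(0)
beats = rng.standard_normal((200, 300))              # placeholder beat data
basis, scores, _ = ecg_pca(beats)
reconstruction = scores @ basis.T + beats.mean(axis=0)
```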


Journal ArticleDOI
TL;DR: A discriminative model for polyphonic piano transcription is presented and a frame-level transcription accuracy of 68% was achieved on a newly generated test set, and direct comparisons to previous approaches are provided.
Abstract: We present a discriminative model for polyphonic piano transcription. Support vector machines trained on spectral features are used to classify frame-level note instances. The classifier outputs are temporally constrained via hidden Markov models, and the proposed system is used to transcribe both synthesized and real piano recordings. A frame-level transcription accuracy of 68% was achieved on a newly generated test set, and direct comparisons to previous approaches are provided.

261 citations
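
The two-stage design described above (frame-level classification temporally constrained by an HMM) can be sketched as follows for a single note; the SVM features, probabilities, and transition prior are illustrative assumptions, and a two-state Viterbi pass stands in for the paper's trained HMMs.

```python
# Sketch: frame-level SVM for one piano note, smoothed by a two-state
# (off/on) HMM via Viterbi decoding. All data and probabilities are toy.
import numpy as np
from sklearn.svm import SVC

def viterbi_smooth(frame_prob_on, p_stay=0.98):
    """Binary Viterbi over frames. frame_prob_on: P(note on | frame)."""
    n = len(frame_prob_on)
    logA = np.log(np.array([[p_stay, 1 - p_stay],
                            [1 - p_stay, p_stay]]))   # state transitions
    eps = 1e-9
    logB = np.log(np.stack([1 - frame_prob_on + eps,
                            frame_prob_on + eps], axis=1))
    delta = np.zeros((n, 2)); psi = np.zeros((n, 2), dtype=int)
    delta[0] = np.log([0.5, 0.5]) + logB[0]
    for t in range(1, n):
        trans = delta[t - 1][:, None] + logA          # prev-state x cur-state
        psi[t] = trans.argmax(axis=0)
        delta[t] = trans.max(axis=0) + logB[t]
    path = np.zeros(n, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(n - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path                                       # 0 = off, 1 = on

# Frame-level SVM with probability outputs feeding the HMM stage:
X_train = np.random.randn(500, 40)                    # placeholder spectra
y_train = np.random.randint(0, 2, 500)
svm = SVC(probability=True).fit(X_train, y_train)
p_on = svm.predict_proba(np.random.randn(100, 40))[:, 1]
notes = viterbi_smooth(p_on)
```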


Journal ArticleDOI
TL;DR: Automatic identification of bird species by their vocalization is studied and results with the proposed method suggest better or equal performance when compared to existing reference methods.
Abstract: Automatic identification of bird species by their vocalization is studied in this paper. Bird sounds are represented with two different parametric representations: (i) the mel-cepstrum parameters and (ii) a set of low-level signal parameters, both of which have been found useful for bird species recognition. Recognition is performed in a decision tree with support vector machine (SVM) classifiers at each node that perform classification between two species. Recognition is tested with two sets of bird species whose recognition has been previously tested with alternative methods. Recognition results with the proposed method suggest better or equal performance when compared to existing reference methods.

241 citations


Journal ArticleDOI
TL;DR: This paper uses the fixed-point iteration method and preconditioning techniques to efficiently solve the associated nonlinear Euler-Lagrange equations of the corresponding variational problem in SR.
Abstract: Super-resolution (SR) reconstruction is a technique capable of producing a high-resolution image from a sequence of low-resolution images. In this paper, we study an efficient SR algorithm for digital video. To effectively deal with the intractable problems in SR video reconstruction, such as inevitable motion estimation errors, noise, blurring, missing regions, and compression artifacts, total variation (TV) regularization is employed in the reconstruction model. We use the fixed-point iteration method and preconditioning techniques to efficiently solve the associated nonlinear Euler-Lagrange equations of the corresponding variational problem in SR. The proposed algorithm has been tested in several cases of motion and degradation. It is also compared with the Laplacian regularization-based SR algorithm and other TV-based SR algorithms. Experimental results are presented to illustrate the effectiveness of the proposed algorithm.

210 citations
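
The fixed-point (lagged-diffusivity) treatment of the TV Euler-Lagrange equation can be sketched on the simpler denoising problem min_u 0.5*||u - f||^2 + lam*TV(u), i.e., without the motion, blur, and sampling operators of the full SR model; the parameters and the linearized update below are illustrative, not the paper's preconditioned solver.

```python
# Sketch: lagged-diffusivity fixed point for TV-regularized denoising.
import numpy as np

def grad(u):
    ux = np.diff(u, axis=1, append=u[:, -1:])   # forward differences
    uy = np.diff(u, axis=0, append=u[-1:, :])
    return ux, uy

def div(px, py):
    dx = np.diff(px, axis=1, prepend=px[:, :1]) # adjoint-style backward diff
    dy = np.diff(py, axis=0, prepend=py[:1, :])
    return dx + dy

def tv_fixed_point(f, lam=0.1, eps=1e-3, outer=20, step=0.2):
    u = f.copy()
    for _ in range(outer):
        ux, uy = grad(u)
        w = 1.0 / np.sqrt(ux**2 + uy**2 + eps**2)   # lagged diffusivity
        # linearized update of the Euler-Lagrange residual with frozen w
        u = u - step * ((u - f) - lam * div(w * ux, w * uy))
    return u
```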


Journal ArticleDOI
TL;DR: Channel equalization in filter bank based multicarrier (FBMC) modulation is addressed, and a novel structure, consisting of a linear-phase FIR amplitude equalizer and an allpass filter as phase equalizer, is found to provide enhanced robustness to timing estimation errors.
Abstract: Channel equalization in filter bank based multicarrier (FBMC) modulation is addressed. We utilize an efficient oversampled filter bank concept with 2x-oversampled subcarrier signals that can be equalized independently of each other. Due to Nyquist pulse shaping, consecutive symbol waveforms overlap in time, which calls for special means for equalization. Two alternative linear low-complexity subcarrier equalizer structures are developed together with straightforward channel estimation-based methods to calculate the equalizer coefficients using pointwise equalization within each subband (in a frequency-sampled manner). A novel structure, consisting of a linear-phase FIR amplitude equalizer and an allpass filter as phase equalizer, is found to provide enhanced robustness to timing estimation errors. This allows the receiver to be operated without time synchronization before the filter bank. The coded error-rate performance of FBMC with the studied equalization scheme is compared to a cyclic prefix OFDM reference in wireless mobile channel conditions, taking into account issues like spectral regrowth with practical nonlinear transmitters and sensitivity to frequency offsets. It is further emphasized that FBMC provides flexible means for high-quality frequency selective filtering in the receiver to suppress strong interfering spectral components within or close to the used frequency band.

201 citations
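
The "pointwise equalization within each subband (in a frequency-sampled manner)" idea can be sketched as a 3-tap complex FIR designed to interpolate zero-forcing targets at three in-band frequency points; the tap count, frequency grid, and channel values below are assumptions, and the paper's amplitude-plus-allpass structure is not reproduced here.

```python
# Sketch: frequency-sampled design of a low-complexity subcarrier equalizer.
import numpy as np

def three_tap_equalizer(H_points, omegas=(-np.pi / 2, 0.0, np.pi / 2)):
    """H_points: channel response at three normalized in-band frequencies
    (subband edges and centre). Returns 3 complex taps (w0, w1, w2)."""
    targets = 1.0 / np.asarray(H_points)              # pointwise ZF targets
    # Tap response at omega: w0*e^{j w} + w1 + w2*e^{-j w}
    A = np.array([[np.exp(1j * w), 1.0, np.exp(-1j * w)] for w in omegas])
    return np.linalg.solve(A, targets)                # interpolating taps

# Hypothetical channel estimates at the three frequency points:
taps = three_tap_equalizer([0.8 + 0.3j, 1.1 - 0.1j, 0.7 + 0.5j])
```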


Journal ArticleDOI
TL;DR: A real-time audio rendering system is introduced which combines a full room-specific simulation, dynamic crosstalk cancellation, and multitrack binaural synthesis for virtual acoustical imaging and can be used as a reliable platform for further research on VR applications.
Abstract: A real-time audio rendering system is introduced which combines a full room-specific simulation, dynamic crosstalk cancellation, and multitrack binaural synthesis for virtual acoustical imaging. The system is applicable to any room shape (normal, long, flat, coupled), independent of the a priori assumption of a diffuse sound field. This provides the possibility of simulating indoor or outdoor spatially distributed, freely movable sources and a moving listener in virtual environments. In addition, near-to-head sources can be simulated using measured near-field HRTFs. The reproduction component provides headphone-free reproduction by means of dynamic crosstalk cancellation. The focus of the project is mainly on the integration and interaction of all involved subsystems. It is demonstrated that the system is capable of real-time room simulation and reproduction and, thus, can be used as a reliable platform for further research on VR applications.

154 citations
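
A minimal sketch of the crosstalk-cancellation component: per frequency bin, a regularized inverse of the 2x2 loudspeaker-to-ear transfer matrix is computed so that binaural signals reach the ears with the crosstalk paths suppressed. The HRTF data and regularization constant are placeholders, not the system's measured filters.

```python
# Sketch: regularized per-bin inversion for crosstalk cancellation.
import numpy as np

def crosstalk_canceller(H, beta=0.005):
    """H: (n_bins, 2, 2) loudspeaker-to-ear responses per frequency bin.
    Returns C with H @ C ~ I (Tikhonov-regularized inverse per bin)."""
    Hh = np.conj(np.transpose(H, (0, 2, 1)))          # Hermitian transpose
    eye = np.eye(2)
    return np.linalg.solve(Hh @ H + beta * eye, Hh)   # (H^H H + bI)^-1 H^H

n_bins = 257
H = np.random.randn(n_bins, 2, 2) + 1j * np.random.randn(n_bins, 2, 2)
C = crosstalk_canceller(H)        # apply to the binaural spectra, bin by bin
```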


Journal ArticleDOI
TL;DR: This paper considers mobile agents (MAs) in multihop environments, adopts directed diffusion (DD) to dispatch the MA, and shows that MADD exhibits better performance than original DD (in the client/server paradigm) in terms of packet delivery ratio, energy consumption, and end-to-end delivery latency.
Abstract: In environments where the source nodes are close to one another and generate a lot of redundant sensory data traffic, transmitting all sensory data from individual nodes not only wastes the scarce wireless bandwidth, but also consumes a lot of battery energy. Instead of each source node sending sensory data to its sink for aggregation (so-called client/server computing), Qi et al. in 2003 proposed a mobile agent (MA)-based distributed sensor network (MADSN) for collaborative signal and information processing, which considerably reduces both the sensory data traffic and the query latency. However, MADSN is based on the assumption that the operation of the mobile agent is only carried out within one hop in a clustering-based architecture. This paper considers MAs in multihop environments and adopts directed diffusion (DD) to dispatch the MA. The gradient in DD gives a hint to efficiently forward the MA among target sensors. The mobile agent paradigm in combination with the DD framework is dubbed mobile agent-based directed diffusion (MADD). With appropriate parameters set, extensive simulation shows that MADD exhibits better performance than original DD (in the client/server paradigm) in terms of packet delivery ratio, energy consumption, and end-to-end delivery latency.

154 citations


Journal ArticleDOI
TL;DR: Considering the difficulties and limitations of recording long-term ECG data, especially from pregnant women, the model described in this paper may serve as an effective means of simulation and analysis of a wide range of ECGs, including adults and fetuses.
Abstract: A three-dimensional dynamic model of the electrical activity of the heart is presented. The model is based on the single dipole model of the heart and is later related to the body surface potentials through a linear model which accounts for the temporal movements and rotations of the cardiac dipole, together with a realistic ECG noise model. The proposed model is also generalized to maternal and fetal ECG mixtures recorded from the abdomen of pregnant women in single and multiple pregnancies. The applicability of the model for the evaluation of signal processing algorithms is illustrated using independent component analysis. Considering the difficulties and limitations of recording long-term ECG data, especially from pregnant women, the model described in this paper may serve as an effective means of simulation and analysis of a wide range of ECGs, including adults and fetuses.

145 citations
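
The single-dipole/dynamic modeling family this paper builds on represents each ECG wave as a Gaussian-shaped deflection at a fixed phase of the cardiac cycle; the sketch below generates one synthetic channel that way. All wave parameters are illustrative, not the paper's fitted values.

```python
# Sketch: sum-of-Gaussians synthetic ECG channel (dynamic dipole family).
import numpy as np

def synthetic_ecg(t, rate_hz=1.2):
    theta = 2 * np.pi * rate_hz * t                  # cardiac phase
    theta = np.angle(np.exp(1j * theta))             # wrap to (-pi, pi]
    params = [  # (amplitude a_i, phase centre theta_i, width b_i) -- toy
        (0.12, -np.pi / 3, 0.25),    # P wave
        (-0.15, -np.pi / 12, 0.10),  # Q wave
        (1.20, 0.0, 0.10),           # R wave
        (-0.25, np.pi / 12, 0.10),   # S wave
        (0.35, np.pi / 2, 0.40),     # T wave
    ]
    return sum(a * np.exp(-(theta - th)**2 / (2 * b**2))
               for a, th, b in params)

t = np.arange(0, 10, 1 / 500.0)      # 10 s at 500 Hz
ecg = synthetic_ecg(t)               # one lead; noise/rotation not modeled
```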


Journal ArticleDOI
TL;DR: An extensive overview of the available estimators is presented, and a theoretical estimator is derived to experimentally assess an upper bound to the performance that can be achieved by any subspace-based method.
Abstract: The objective of this paper is threefold: (1) to provide an extensive review of signal subspace speech enhancement, (2) to derive an upper bound for the performance of these techniques, and (3) to present a comprehensive study of the potential of subspace filtering to increase the robustness of automatic speech recognisers against stationary additive noise distortions. Subspace filtering methods are based on the orthogonal decomposition of the noisy speech observation space into a signal subspace and a noise subspace. This decomposition is possible under the assumption of a low-rank model for speech, and on the availability of an estimate of the noise correlation matrix. We present an extensive overview of the available estimators, and derive a theoretical estimator to experimentally assess an upper bound to the performance that can be achieved by any subspace-based method. Automatic speech recognition (ASR) experiments with noisy data demonstrate that subspace-based speech enhancement can significantly increase the robustness of these systems in additive coloured noise environments. Optimal performance is obtained only if no explicit rank reduction of the noisy Hankel matrix is performed. Although this strategy might increase the level of the residual noise, it reduces the risk of removing essential signal information for the recogniser's back end. Finally, it is also shown that subspace filtering compares favourably to the well-known spectral subtraction technique.

141 citations
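
The core subspace-filtering step can be sketched as follows for the white-noise case: eigendecompose the noisy covariance, subtract the noise floor from the eigenvalues, and apply a Wiener-type gain in the eigendomain. The frame dimensions and noise variance are assumptions; the paper's coloured-noise handling (via an estimated noise correlation matrix) and Hankel-matrix formulation are not reproduced here.

```python
# Sketch: eigendomain Wiener-type gain for signal-subspace enhancement.
import numpy as np

def subspace_enhance(frames, noise_var):
    """frames: (n_frames, dim) noisy speech vectors, white-noise case."""
    R = frames.T @ frames / len(frames)            # noisy covariance estimate
    lam, V = np.linalg.eigh(R)
    lam_clean = np.maximum(lam - noise_var, 0.0)   # clean eigenvalue estimate
    gain = lam_clean / (lam_clean + noise_var + 1e-12)   # Wiener-type gain
    H = V @ np.diag(gain) @ V.T                    # eigendomain filter
    return frames @ H.T

rng = np.random.default_rng(1)
noisy = rng.standard_normal((400, 32))             # placeholder frames
enhanced = subspace_enhance(noisy, noise_var=1.0)
```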


Journal ArticleDOI
TL;DR: A combinatorial approach initially designed for real numbers is compared with a second-order cone programming (SOCP) approach designed for complex numbers; the former is found to be comparable to, or even better than, the SOCP solution, with a lower computational cost for problems with low input/output dimensions.
Abstract: We address the problem of underdetermined blind source separation (BSS). While most previous approaches are designed for instantaneous mixtures, we propose a time-frequency-domain algorithm for convolutive mixtures. We adopt a two-step method based on a general maximum a posteriori (MAP) approach. In the first step, we estimate the mixing matrix based on hierarchical clustering, assuming that the source signals are sufficiently sparse. The algorithm works directly on the complex-valued data in the time-frequency domain and shows better convergence than algorithms based on self-organizing maps. The assumption of Laplacian priors for the source signals in the second step leads to an algorithm for estimating the source signals. It involves the l1-norm minimization of complex numbers because of the use of the time-frequency-domain approach. We compare a combinatorial approach initially designed for real numbers with a second-order cone programming (SOCP) approach designed for complex numbers. We found that although the former approach is not theoretically justified for complex numbers, its results are comparable to, or even better than, the SOCP solution. The advantage is a lower computational cost for problems with low input/output dimensions.

124 citations
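
The first step (mixing-matrix estimation by hierarchical clustering of sparse time-frequency points) might look like the following scipy-based sketch; the normalization convention and "average" linkage are assumptions on our part, and the second-step l1 source recovery is omitted.

```python
# Sketch: mixing-matrix estimation by hierarchical clustering of
# normalized complex time-frequency observation vectors.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def estimate_mixing_matrix(X_tf, n_sources):
    """X_tf: (n_sensors, n_points) complex STFT observations."""
    # Normalize each TF point: zero phase on sensor 0, then unit norm,
    # so points dominated by the same source cluster together.
    Z = X_tf * np.exp(-1j * np.angle(X_tf[0])) / (np.abs(X_tf[0]) + 1e-12)
    Z = Z / np.linalg.norm(Z, axis=0)
    feats = np.vstack([Z.real, Z.imag]).T           # real features for scipy
    labels = fcluster(linkage(feats, method='average'),
                      n_sources, criterion='maxclust')
    cols = [Z[:, labels == k].mean(axis=1) for k in range(1, n_sources + 1)]
    A_hat = np.array(cols).T                        # cluster centroids
    return A_hat / np.linalg.norm(A_hat, axis=0)    # column-normalized
```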


Journal ArticleDOI
TL;DR: This work focuses on the receiver signal processing algorithms and derives a maximum likelihood frequency-domain detector that takes into account the presence of impulse noise as well as the intercode interference (ICI) and the multiple-access interference (MAI) that are generated by the frequency-selective power line channel.
Abstract: We consider a bit-interleaved coded wideband impulse-modulated system for power line communications. Impulse modulation is combined with direct-sequence code-division multiple access (DS-CDMA) to obtain a form of orthogonal modulation and to multiplex the users. We focus on the receiver signal processing algorithms and derive a maximum likelihood frequency-domain detector that takes into account the presence of impulse noise as well as the intercode interference (ICI) and the multiple-access interference (MAI) that are generated by the frequency-selective power line channel. To reduce complexity, we propose several simplified frequency-domain receiver algorithms with different complexity and performance. We address the problem of the practical estimation of the channel frequency response as well as the estimation of the correlation of the ICI-MAI-plus-noise that is needed in the detection metric. To improve the estimators' performance, a simple hard feedback from the channel decoder is also used. Simulation results show that the scheme provides robust performance as a result of spreading the symbol energy both in frequency (through the wideband pulse) and in time (through the spreading code and the bit-interleaved convolutional code).

Journal ArticleDOI
TL;DR: A new modified wavelet transform is presented that can be applied to ECG signals in order to remove noise from them under a wide range of noise variations, by adaptively determining both the center frequency of each scale together with the T-function and applying a new proposed thresholding rule.
Abstract: We present a new modified wavelet transform, called the multiadaptive bionic wavelet transform (MABWT), that can be applied to ECG signals in order to remove noise from them under a wide range of noise variations. By using the definition of the bionic wavelet transform and adaptively determining both the center frequency of each scale together with the T-function, the problem of desired signal decomposition is solved. Applying a new proposed thresholding rule works successfully in denoising the ECG. Moreover, by using the multiadaptation scheme, lowpass noisy interference effects on the baseline of the ECG are removed as a direct task. The method was extensively tested with real and simulated ECG signals and showed high noise-reduction performance, comparable to that of the wavelet transform (WT). Quantitative evaluation of the proposed algorithm shows that the average SNR improvement of the MABWT is 1.82 dB more than the WT-based results in the best case. The procedure also proves largely advantageous over wavelet-based methods for baseline wander cancellation, including both DC components and baseline drifts.
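
For reference, the conventional wavelet-thresholding baseline that the MABWT is compared against can be sketched with PyWavelets; the wavelet, decomposition level, and universal-threshold rule below are standard textbook choices, not the paper's adaptive centre-frequency/T-function scheme.

```python
# Sketch: standard soft-threshold wavelet denoising of an ECG signal.
import numpy as np
import pywt

def wt_denoise(ecg, wavelet='db4', level=5):
    coeffs = pywt.wavedec(ecg, wavelet, level=level)
    # universal threshold, noise sigma estimated from finest detail scale
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thr = sigma * np.sqrt(2 * np.log(len(ecg)))
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode='soft')
                            for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(ecg)]
```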

Journal ArticleDOI
TL;DR: This paper presents a novel method to efficiently recognize inharmonic and transient bird sounds, using wavelet decomposition for feature extraction and either a supervised or an unsupervised classifier for recognition.
Abstract: This paper presents a novel method to recognize inharmonic and transient bird sounds efficiently. The recognition algorithm consists of feature extraction using wavelet decomposition and recognition using either a supervised or an unsupervised classifier. The proposed method was tested on sounds of eight bird species, of which five species have inharmonic sounds and three reference species have harmonic sounds. Inharmonic sounds are not well matched to conventional spectral analysis methods, because the spectral domain does not include any visible trajectories that a computer can track and identify. Thus, wavelet analysis was selected due to its ability to preserve both frequency and temporal information, and its ability to analyze signals that contain discontinuities and sharp spikes. The shift-invariant feature vectors calculated from the wavelet coefficients were used as inputs of two neural networks: the unsupervised self-organizing map (SOM) and the supervised multilayer perceptron (MLP). The results were encouraging: the SOM network recognized 78% and the MLP network 96% of the test sounds correctly.
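
The pipeline in the abstract might be sketched as follows, with per-scale wavelet energies standing in for the paper's shift-invariant features (an assumption on our part) and scikit-learn's MLPClassifier standing in for the supervised network.

```python
# Sketch: wavelet-energy features plus an MLP classifier for bird sounds.
import numpy as np
import pywt
from sklearn.neural_network import MLPClassifier

def wavelet_energy_features(signal, wavelet='db4', level=6):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    energies = np.array([np.sum(c**2) for c in coeffs])
    return energies / (energies.sum() + 1e-12)   # normalized scale energies

rng = np.random.default_rng(2)
X = np.array([wavelet_energy_features(rng.standard_normal(4096))
              for _ in range(80)])               # placeholder sound clips
y = rng.integers(0, 8, size=80)                  # 8 species labels (toy)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500).fit(X, y)
```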

Journal ArticleDOI
TL;DR: Two different iris segmentation methods are presented, and it is observed that relying on a smaller but more reliable part of the iris, though reducing the net amount of information, improves the overall performance.
Abstract: Accurate iris detection is a crucial part of an iris recognition system. One of the main issues in iris segmentation is coping with the occlusion that happens due to eyelids and eyelashes. In the literature, various methods have been suggested to solve the occlusion problem. In this paper, two different segmentations of the iris are presented. In the first algorithm, a circle with an appropriate diameter is located around the pupil; the iris area encircled by the circular boundary is then used for recognition. In the second method, again a circle is located around the pupil, this time with a larger diameter; however, only the lower part of the encircled iris area is utilized for individual recognition. Wavelet-based texture features are used in the process. Hamming and harmonic mean distance classifiers are exploited as a mixed classifier in the suggested algorithm. It is observed that relying on a smaller but more reliable part of the iris, though reducing the net amount of information, improves the overall performance. Experimental results on the CASIA database show that our method has a promising performance with an accuracy of 99.31%. The sensitivity of the proposed method is also analyzed versus contrast, illumination, and noise, where lower sensitivity to all factors is observed when the lower half of the iris is used for recognition.
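
The Hamming-distance half of the mixed classifier reduces to counting disagreeing bits of two binary iris codes over the positions both masks flag as valid (e.g., restricted to the lower half of the iris in the second method); the code length and mask layout below are assumptions.

```python
# Sketch: masked Hamming distance between binary iris codes.
import numpy as np

def masked_hamming(code_a, code_b, mask_a, mask_b):
    valid = mask_a & mask_b                      # jointly usable bits only
    n = valid.sum()
    if n == 0:
        return 1.0                               # nothing comparable
    return float(np.count_nonzero((code_a ^ code_b) & valid)) / n

rng = np.random.default_rng(3)
a = rng.integers(0, 2, 2048, dtype=np.uint8)
b = rng.integers(0, 2, 2048, dtype=np.uint8)
m = np.ones(2048, dtype=np.uint8)                # e.g., lower-half-only mask
d = masked_hamming(a, b, m, m)                   # ~0.5 for unrelated codes
```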

Journal ArticleDOI
TL;DR: A method for performing automatic segmentation based on features related to rhythm, timbre, and harmony is presented and evaluated, and it is shown that the timbre-related feature performs best.
Abstract: The segmentation of music into intro, chorus, verse, outro, and similar segments is a difficult topic. A method for performing automatic segmentation based on features related to rhythm, timbre, and harmony is presented; comparisons are made between the features, and between the features and manual segmentations of a database of 48 songs. Standard information retrieval performance measures are used in the comparison, and it is shown that the timbre-related feature performs best.

Journal ArticleDOI
TL;DR: The results obtained in the dome analysis show that prewhitened signal gating outperforms the other two optimum detectors.
Abstract: Optimum detection is applied to ultrasonic signals corrupted with significant levels of grain noise. The aim is to enhance the echoes produced by the interface between the first and second layers of a dome to obtain interface traces in echo pulse B-scan mode. This is useful information for the restorer before restoration of the dome paintings. Three optimum detectors are considered: matched filter, signal gating, and prewhitened signal gating. Assumed models and practical limitations of the three optimum detectors are considered. The results obtained in the dome analysis show that prewhitened signal gating outperforms the other two optimum detectors.

Journal ArticleDOI
TL;DR: Simulation results using the standard asymmetric digital subscriber line (ADSL) test loops show that the proposed heuristic optimal discrete bit allocation algorithm is efficient for practical DMT transmissions.
Abstract: A heuristic optimal discrete bit allocation algorithm is proposed for solving the margin maximization problem in discrete multitone (DMT) systems. Starting from an initial equal-power-assignment bit distribution, the proposed algorithm employs a multistaged bit rate allocation scheme to meet the target rate. If the total bit rate is far from the target rate, a multiple-bits loading procedure is used to obtain a bit allocation close to the target rate. When close to the target rate, a parallel bit-loading procedure is used to achieve the target rate; this is computationally more efficient than the conventional greedy bit-loading algorithm. Finally, the target bit rate distribution is checked: if it is efficient, then it is also the optimal solution; otherwise, the optimal bit distribution can be obtained with only a few bit swaps. Simulation results using the standard asymmetric digital subscriber line (ADSL) test loops show that the proposed algorithm is efficient for practical DMT transmissions.
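
The conventional greedy bit-loading baseline that the parallel procedure is compared against adds one bit at a time to the subchannel with the smallest incremental power; the SNR gap constant and channel SNRs below are illustrative.

```python
# Sketch: greedy (Hughes-Hartogs-style) bit loading for DMT subchannels.
import numpy as np

def greedy_bit_loading(snr, target_bits, gamma=10**(9.8 / 10)):
    """snr: per-subchannel SNR (linear). Power to carry b bits on channel i
    is P(b) = gamma * (2**b - 1) / snr[i]; bits are allocated greedily."""
    bits = np.zeros(len(snr), dtype=int)
    for _ in range(target_bits):
        # incremental power of loading one more bit on each subchannel
        inc = gamma * (2.0**(bits + 1) - 2.0**bits) / snr
        bits[int(np.argmin(inc))] += 1
    total_power = np.sum(gamma * (2.0**bits - 1) / snr)
    return bits, total_power

bits, power = greedy_bit_loading(np.array([50.0, 200.0, 800.0, 20.0]), 12)
```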

Journal ArticleDOI
TL;DR: Results obtained have shown that the HNR and the critical-band energy spectrum can be used to correlate laryngeal pathology and voice alteration, using previously classified voice samples, and could be an additional acoustic indicator that supplements the clinical diagnostic features for voice evaluation.
Abstract: Acoustic analysis of speech signals is a noninvasive technique that has been proved to be an effective tool for the objective support of vocal and voice disease screening. In the present study acoustic analysis of sustained vowels is considered. A simple k-means nearest neighbor classifier is designed to test the efficacy of a harmonics-to-noise ratio (HNR) measure and the critical-band energy spectrum of the voiced speech signal as tools for the detection of laryngeal pathologies. It groups the given voice signal sample into pathologic and normal. The voiced speech signal is decomposed into harmonic and noise components using an iterative signal extrapolation algorithm. The HNRs at four different frequency bands are estimated and used as features. Voiced speech is also filtered with 21 critical-bandpass filters that mimic the human auditory neurons. Normalized energies of these filter outputs are used as another set of features. The results obtained have shown that the HNR and the critical-band energy spectrum can be used to correlate laryngeal pathology and voice alteration, using previously classified voice samples. This method could be an additional acoustic indicator that supplements the clinical diagnostic features for voice evaluation.
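
A classical autocorrelation-based HNR estimate for a sustained vowel can be sketched as below; note that the paper instead decomposes the signal into harmonic and noise components by iterative extrapolation and computes HNRs in four frequency bands, so this is only the simpler textbook variant.

```python
# Sketch: autocorrelation-based harmonics-to-noise ratio for one frame.
import numpy as np

def hnr_db(frame, fs, f0_range=(70.0, 400.0)):
    frame = frame - frame.mean()
    r = np.correlate(frame, frame, mode='full')[len(frame) - 1:]
    r = r / r[0]                                   # normalized autocorrelation
    lo, hi = int(fs / f0_range[1]), int(fs / f0_range[0])
    r_max = min(r[lo:hi].max(), 0.999)             # peak at the pitch period
    return 10.0 * np.log10(r_max / (1.0 - r_max))  # HNR in dB
```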

Journal ArticleDOI
TL;DR: The two main types of classification methods for power quality disturbances based on underlying causes are presented: deterministic classification, giving an expert system as an example, and statistical classification, with support vector machines (a novel method) as an example.
Abstract: This paper presents the two main types of classification methods for power quality disturbances based on underlying causes: deterministic classification, giving an expert system as an example, and statistical classification, with support vector machines (a novel method) as an example. An expert system is suitable when one has a limited amount of data and sufficient power system expert knowledge; however, its application requires a set of threshold values. Statistical methods are suitable when a large amount of data is available for training. Two important issues in guaranteeing the effectiveness of a classifier, data segmentation and feature extraction, are discussed. Segmentation of a sequence of data recordings is a preprocessing step that partitions the data into segments, each representing a duration containing either an event or a transition between two events. Extraction of features is applied to each segment individually. Some useful features and their effectiveness are then discussed. Some experimental results are included to demonstrate the effectiveness of both systems. Finally, conclusions are given together with a discussion of some future research directions.

Journal ArticleDOI
TL;DR: A new solution to the problem of feature variations caused by the overlapping of sounds in instrument identification in polyphonic music is provided by weighting features based on how much they are affected by overlapping; instrument identification is further improved using musical context.
Abstract: We provide a new solution to the problem of feature variations caused by the overlapping of sounds in instrument identification in polyphonic music. When multiple instruments simultaneously play, partials (harmonic components) of their sounds overlap and interfere, which makes the acoustic features different from those of monophonic sounds. To cope with this, we weight features based on how much they are affected by overlapping. First, we quantitatively evaluate the influence of overlapping on each feature as the ratio of the within-class variance to the between-class variance in the distribution of training data obtained from polyphonic sounds. Then, we generate feature axes using a weighted mixture that minimizes the influence via linear discriminant analysis. In addition, we improve instrument identification using musical context. Experimental results showed that the recognition rates using both feature weighting and musical context were 84.1% for duo, 77.6% for trio, and 72.3% for quartet; those without using either were 53.4, 49.6, and 46.5%, respectively.
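
The quantitative scoring step (the within-class to between-class variance ratio of each feature on polyphonic training data) might look like the sketch below; building the LDA axes from the weighted mixture, and the musical-context model, are omitted.

```python
# Sketch: scoring features by within/between-class variance ratio so that
# features disturbed most by overlapping partials can be down-weighted.
import numpy as np

def overlap_influence(X, y):
    """X: (n_samples, n_features), y: instrument labels.
    Returns within/between variance ratio per feature (lower = more robust)."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    within = np.zeros(X.shape[1])
    between = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        within += ((Xc - Xc.mean(axis=0))**2).sum(axis=0)
        between += len(Xc) * (Xc.mean(axis=0) - mu)**2
    return within / (between + 1e-12)

rng = np.random.default_rng(4)
X, y = rng.standard_normal((300, 20)), rng.integers(0, 5, 300)  # toy data
weights = 1.0 / (1.0 + overlap_influence(X, y))  # robust features weigh more
```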

Journal ArticleDOI
TL;DR: This paper reviews the use of wavelet transform approach in processing power quality data and the strengths, limitations, and challenges in employing the methods are discussed with consideration of the needs and expectations when analyzing power quality disturbances.
Abstract: The emergence of power quality as a topical issue in power systems in the 1990s largely coincides with the huge advancements achieved in computing technology and information theory. This unsurprisingly has spurred the development of more sophisticated instruments for measuring power quality disturbances and the use of new methods in processing and analyzing the measurements. Fourier theory was the core of many traditional techniques and is still widely used today. However, it is increasingly being replaced by newer approaches, notably the wavelet transform, especially in the post-event processing of time-varying phenomena. This paper reviews the use of the wavelet transform approach in processing power quality data. The strengths, limitations, and challenges in employing the methods are discussed with consideration of the needs and expectations when analyzing power quality disturbances. Several examples are given and discussions are made on the various design issues and considerations, which would be useful to those contemplating adopting the wavelet transform in power quality applications. A new approach combining the wavelet transform and rank correlation is introduced as an alternative method for identifying capacitor-switching transients.

Journal ArticleDOI
TL;DR: This work uses McKay and Fujinaga's 3-root and 9-leaf genre data set with normalized compression distance (NCD) for MIDI music genre classification, and achieves classifier diversity by varying the number of seconds of the MIDI file used, the sample rates and sizes of the audio file, and the classification algorithms.
Abstract: We report our findings on using MIDI files and audio features from MIDI, separately and combined together, for MIDI music genre classification. We use McKay and Fujinaga's 3-root and 9-leaf genre data set. In order to compute distances between MIDI pieces, we use normalized compression distance (NCD). NCD uses the compressed length of a string as an approximation to its Kolmogorov complexity and has previously been used for music genre and composer clustering. We convert the MIDI pieces to audio and then use the audio features to train different classifiers. The MIDI and audio-from-MIDI classifiers alone achieve much lower accuracies than those reported by McKay and Fujinaga, who used a number of domain-based MIDI features rather than NCD for their classification. Combining the MIDI and audio-from-MIDI classifiers improves accuracy, getting closer to, but still falling short of, McKay and Fujinaga's results. The best root genre accuracies achieved using MIDI, audio, and their combination are 0.75, 0.86, and 0.93, respectively, compared to 0.98 for McKay and Fujinaga. Successful classifier combination requires diversity of the base classifiers. We achieve diversity by varying the number of seconds of the MIDI file used, the sample rates and sizes of the audio file, and the classification algorithms.
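
The NCD computation described above is direct to implement; zlib stands in for the compressor here (the paper's compressor choice may differ).

```python
# Sketch: normalized compression distance,
#   NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
# where C(.) is the compressed length approximating Kolmogorov complexity.
import zlib

def c(data: bytes) -> int:
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

# e.g., a pairwise distance over MIDI files read as raw bytes:
# d = ncd(open('a.mid', 'rb').read(), open('b.mid', 'rb').read())
```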

Journal ArticleDOI
TL;DR: A biologically motivated multiresolution contour detection method is proposed, using Bayesian denoising and a surround inhibition technique, together with a contour-oriented binarization that relies on the observation that object contours lead to long connected components rather than the short rods obtained from textures.
Abstract: Standard edge detectors react to all local luminance changes, irrespective of whether they are due to the contours of the objects represented in a scene or due to natural textures like grass, foliage, water, and so forth. Moreover, edges due to texture are often stronger than edges due to object contours. This implies that further processing is needed to discriminate object contours from texture edges. In this paper, we propose a biologically motivated multiresolution contour detection method using Bayesian denoising and a surround inhibition technique. Specifically, the proposed approach deploys computation of the gradient at different resolutions, followed by Bayesian denoising of the edge image. Then, a biologically motivated surround inhibition step is applied in order to suppress edges that are due to texture. We propose an improvement of the surround suppression used in previous works. Finally, a contour-oriented binarization algorithm is used, relying on the observation that object contours lead to long connected components rather than to short rods obtained from textures. Experimental results show that our contour detection method outperforms standard edge detectors as well as other methods that deploy inhibition.

Journal ArticleDOI
TL;DR: A novel collaborative image coding and transmission scheme is proposed to minimize the energy for transmitting image data collected in a sensor network.
Abstract: Imaging sensors are able to provide intuitive visual information for quick recognition and decision making. However, imaging sensors usually generate vast amounts of data. Therefore, processing and coding of image data collected in a sensor network for the purpose of energy-efficient transmission poses a significant technical challenge. In particular, multiple sensors may be collecting similar visual information simultaneously. We propose in this paper a novel collaborative image coding and transmission scheme to minimize the energy for data transmission. First, we apply a shape matching method to coarsely register images to find the maximal overlap, exploiting the spatial correlation between images acquired from neighboring sensors. For a given image sequence, we transmit the background image only once. A lightweight and efficient background subtraction method is employed to detect targets. Only the target regions and their spatial locations are transmitted to the monitoring center. The whole image can then be reconstructed by fusing the background and the target images as well as their spatial locations. Experimental results show that the energy for image transmission can indeed be greatly reduced with collaborative image coding and transmission.

Journal ArticleDOI
TL;DR: The experimental results show that the proposed method is superior in terms of precision versus recall and can be used for 3D model search and retrieval in a highly efficient manner.
Abstract: This paper presents a novel methodology for content-based search and retrieval of 3D objects. After proper positioning of the 3D objects using translation and scaling, a set of functionals is applied to the 3D model producing a new domain of concentric spheres. In this new domain, a new set of functionals is applied, resulting in a descriptor vector which is completely rotation invariant and thus suitable for 3D model matching. Further, weights are assigned to each descriptor, so as to significantly improve the retrieval results. Experiments on two different databases of 3D objects are performed so as to evaluate the proposed method in comparison with those most commonly cited in the literature. The experimental results show that the proposed method is superior in terms of precision versus recall and can be used for 3D model search and retrieval in a highly efficient manner.

Journal ArticleDOI
Geoffroy Peeters
TL;DR: A novel approach to automatic estimation of tempo over time based on a proposed reassigned spectral energy flux for the detection of musical events and a proposed combination of discrete Fourier transform and frequency-mapped autocorrelation function is presented.
Abstract: We present a novel approach to automatic estimation of tempo over time. This method aims at detecting tempo at the tactus level for percussive and nonpercussive audio. The front-end of our system is based on a proposed reassigned spectral energy flux for the detection of musical events. The dominant periodicities of this flux are estimated by a proposed combination of discrete Fourier transform and frequency-mapped autocorrelation function. The most likely meter, beat, and tatum over time are then estimated jointly using proposed meter/beat subdivision templates and a Viterbi decoding algorithm. The performances of our system have been evaluated on four different test sets among which three were used during the ISMIR 2004 tempo induction contest. The performances obtained are close to the best results of this contest.
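
The periodicity-estimation stage can be caricatured with a plain autocorrelation of an onset-strength flux; the paper's reassigned spectral energy flux front-end and its DFT/frequency-mapped-autocorrelation combination are considerably more elaborate, so this is only a baseline sketch with assumed inputs.

```python
# Sketch: autocorrelation-based tempo estimate from an onset-strength flux.
import numpy as np

def estimate_tempo(flux, frame_rate, bpm_range=(40, 200)):
    """flux: onset-strength signal sampled at frame_rate (frames/s)."""
    flux = flux - flux.mean()
    r = np.correlate(flux, flux, mode='full')[len(flux) - 1:]
    lo = int(round(60.0 * frame_rate / bpm_range[1]))   # shortest beat lag
    hi = int(round(60.0 * frame_rate / bpm_range[0]))   # longest beat lag
    best = lo + int(np.argmax(r[lo:hi]))
    return 60.0 * frame_rate / best                     # beats per minute
```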

Journal ArticleDOI
TL;DR: This paper proposes a model order selection method for noncoherent sources that continues to work when few snapshots are available, while maintaining low computational complexity, and that allows the probability of false alarm to be controlled and predefined, which is a crucial point for systems such as radars.
Abstract: High-resolution methods for estimating signal processing parameters, such as bearing angles in array processing or frequencies in spectral analysis, may be hampered by a poorly selected model order. As classical model order selection methods fail when the number of snapshots available is small, this paper proposes a method for noncoherent sources which continues to work under such conditions while maintaining low computational complexity. For white Gaussian noise and short data records, we show that the profile of the ordered noise eigenvalues approximately fits an exponential law. This fact is used to provide a recursive algorithm which detects a mismatch between the observed eigenvalue profile and the theoretical noise-only eigenvalue profile, as such a mismatch indicates the presence of a source. Moreover, the proposed method allows the probability of false alarm to be controlled and predefined, which is a crucial point for systems such as radars. Results of simulations are provided in order to show the capabilities of the algorithm.
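
The detection idea (an exponential law for the ordered noise eigenvalues, checked recursively for a mismatch) might be sketched as follows; the fit, threshold, and starting index are simplifications of the paper's algorithm and of its controlled false-alarm probability.

```python
# Sketch: source enumeration by fitting an exponential profile to the
# smallest sample eigenvalues and flagging the first profile mismatch.
import numpy as np

def count_sources(eigvals, thresh=1.5, n_fit=4):
    """eigvals: eigenvalues of the sample covariance matrix, any order."""
    lam = np.sort(eigvals)                 # ascending: noise first
    for k in range(n_fit, len(lam)):
        # exponential (log-linear) fit on the k smallest eigenvalues
        slope = np.polyfit(np.arange(k), np.log(lam[:k]), 1)[0]
        predicted = lam[k - 1] * np.exp(slope)   # expected next noise value
        if lam[k] > thresh * predicted:    # mismatch => signal eigenvalue
            return len(lam) - k            # remaining eigenvalues are signal
    return 0
```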

Journal ArticleDOI
TL;DR: An important property of STSs is established: the initiation times of actors in an STS are bounded by the initiation times of the same actors in any static periodic schedule of the same job, which guarantees strictly periodic behavior of a task within a self-timed implementation.
Abstract: This paper deals with the scheduling analysis of hard real-time streaming applications. These applications are mapped onto a heterogeneous multiprocessor system-on-chip (MPSoC), where we must jointly meet the timing requirements of several jobs. Each job is independently activated and processes streams at its own rate. The dynamic starting and stopping of jobs necessitates the use of self-timed schedules (STSs). By modeling job implementations using multirate data flow (MRDF) graph semantics, real-time analysis can be performed. Traditionally, temporal analysis of STSs for MRDF graphs only aims at evaluating the average throughput. It does not cope well with latency, and it does not take into account the temporal behavior during the initial transient phase. In this paper, we establish an important property of STSs: the initiation times of actors in an STS are bounded by the initiation times of the same actors in any static periodic schedule of the same job. Based on this property, we show how to guarantee strictly periodic behavior of a task within a self-timed implementation; we then provide useful bounds on maximum latency for jobs with periodic, sporadic, and bursty sources, as well as a technique to check latency requirements. We present two case studies that exemplify the application of these techniques: a simplified channel equalizer and a wireless LAN receiver.

Journal ArticleDOI
TL;DR: A new discriminative learning algorithm to improve people identification accuracy and a novel method of body obscuring, which removes the appearance information of the people while preserving rich structure and motion information, are proposed.
Abstract: This paper presents a system for protecting the privacy of specific individuals in video recordings. We address the following two problems: automatic people identification with limited labeled data, and human body obscuring with preserved structure and motion information. In order to address the first problem, we propose a new discriminative learning algorithm to improve people identification accuracy using limited training data labeled from the original video and imperfect pairwise constraints labeled from face obscured video data. We employ a robust face detection and tracking algorithm to obscure human faces in the video. Our experiments in a nursing home environment show that the system can obtain a high accuracy of people identification using limited labeled data and noisy pairwise constraints. The study result indicates that human subjects can perform reasonably well in labeling pairwise constraints with the face masked data. For the second problem, we propose a novel method of body obscuring, which removes the appearance information of the people while preserving rich structure and motion information. The proposed approach provides a way to minimize the risk of exposing the identities of the protected people while maximizing the use of the captured data for activity/behavior analysis.

Journal ArticleDOI
TL;DR: New fixed-point algorithms for the blind separation of complex-valued mixtures of independent, noncircularly symmetric, and non-Gaussian source signals are derived and have superior finite-sample performance in data-starved scenarios as compared to existing complex ICA methods.
Abstract: We derive new fixed-point algorithms for the blind separation of complex-valued mixtures of independent, noncircularly symmetric, and non-Gaussian source signals. Leveraging recently developed results on the separability of complex-valued signal mixtures, we systematically construct iterative procedures on a kurtosis-based contrast whose evolutionary characteristics are identical to those of the FastICA algorithm of Hyvarinen and Oja in the real-valued mixture case. Thus, our methods inherit the fast convergence properties, computational simplicity, and ease of use of the FastICA algorithm while at the same time extending this class of techniques to complex signal mixtures. For extracting multiple sources, symmetric and asymmetric signal deflation procedures can be employed. Simulations for both noiseless and noisy mixtures indicate that the proposed algorithms have superior finite-sample performance in data-starved scenarios as compared to existing complex ICA methods while performing about as well as the best of these techniques for larger data-record lengths.
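
A kurtosis-based complex fixed-point update in this style can be sketched as below for the simpler circular case (the Bingham-Hyvarinen form); the paper's algorithms additionally exploit noncircularity and use deflation to extract multiple sources. Whitened data are assumed.

```python
# Sketch: one-unit complex FastICA-style fixed point with a kurtosis
# contrast, for whitened, circular complex mixtures.
import numpy as np

def complex_fastica_one_unit(X, n_iter=100, seed=0):
    """X: (dim, n_samples) whitened complex mixtures.
    Returns one demixing vector w (one row of the demixing matrix)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[0]) + 1j * rng.standard_normal(X.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        y = w.conj() @ X                          # current source estimate
        # fixed-point update: E{x y* |y|^2} - 2w  (E{|y|^2} = 1 when whitened)
        w_new = (X * (np.abs(y)**2 * y.conj())).mean(axis=1) - 2 * w
        w = w_new / np.linalg.norm(w_new)         # renormalize each step
    return w
```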