
Showing papers on "Time–frequency analysis" published in 2017


Journal Article
TL;DR: The main idea of SET is to only retain the TF information of STFT results most related to time-varying features of the signal and to remove most smeared TF energy, such that the energy concentration of the novel TF representation can be enhanced greatly.
Abstract: In this paper, we introduce a new time-frequency (TF) analysis (TFA) method to study the trend and instantaneous frequency (IF) of nonlinear and nonstationary data. Our proposed method is termed the synchroextracting transform (SET), which is a postprocessing procedure applied to the short-time Fourier transform (STFT). Compared with classical TFA methods, the proposed method can generate a more energy-concentrated TF representation and allows for signal reconstruction. The proposed SET method is inspired by the recently proposed synchrosqueezing transform (SST) and the theory of the ideal TFA. To analyze a signal, it is important to obtain the time-varying information, such as the IF and instantaneous amplitude. The SST squeezes all TF coefficients into the IF trajectory. Unlike the squeezing approach of SST, the main idea of SET is to only retain the TF information of STFT results most related to time-varying features of the signal and to remove most smeared TF energy, such that the energy concentration of the novel TF representation can be enhanced greatly. Numerical and real-world signals are employed to validate the effectiveness of the SET method.

310 citations
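
The extraction idea in the SET abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a Gaussian window, an STFT whose phase is referenced to the window centre, and a retention rule that keeps a coefficient only when its local IF estimate falls within half a frequency bin of that coefficient's own frequency. The function name `synchroextract` and all parameter values are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def synchroextract(x, half_len=64, sigma=16.0, n_bins=128):
    """Sketch of a synchroextracting-style operator on a 1-D real signal.

    Returns (omega, G, Te): the frequency grid in rad/sample, the STFT G,
    and the extracted representation Te (G with smeared coefficients zeroed).
    """
    k = np.arange(-half_len, half_len + 1)
    g = np.exp(-k**2 / (2.0 * sigma**2))      # Gaussian window (assumed)
    dg = -k / sigma**2 * g                    # its derivative w.r.t. time
    omega = np.linspace(0.0, np.pi, n_bins, endpoint=False)
    d_omega = omega[1] - omega[0]

    frames = sliding_window_view(x, k.size)   # (n_frames, window length)
    kernel = np.exp(-1j * np.outer(k, omega)) # phase referenced to window centre
    G = (frames * g) @ kernel                 # STFT with window g
    G_d = (frames * dg) @ kernel              # STFT with window g'

    # For a pure tone at omega0, G_d / G is approximately 1j * (omega - omega0),
    # so the imaginary part measures the offset from the local IF estimate.
    safe = np.abs(G) > 1e-8 * np.abs(G).max()
    ratio = np.zeros_like(G)
    ratio[safe] = G_d[safe] / G[safe]
    keep = safe & (np.abs(ratio.imag) < d_omega / 2)
    return omega, G, np.where(keep, G, 0.0)
```

For a stationary tone, nearly all of the energy in each frame of `Te` should sit in the single bin closest to the tone frequency, whereas the plain STFT `G` spreads it over several neighbouring bins.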


Journal ArticleDOI
TL;DR: The proposed CNN architecture achieves better results with fewer learnable parameters than similar architectures used for fault detection, including cases with experimental noise.
Abstract: Traditional feature extraction and selection is a labor-intensive process requiring expert knowledge of the relevant features pertinent to the system. This knowledge is sometimes a luxury and could introduce added uncertainty and bias to the results. To address this problem, a deep-learning-enabled featureless methodology is proposed to automatically learn the features of the data. Time-frequency representations of the raw data are used to generate image representations of the raw signal, which are then fed into a deep convolutional neural network (CNN) architecture for classification and fault diagnosis. This methodology was applied to two public data sets of rolling element bearing vibration signals. Three time-frequency analysis methods (short-time Fourier transform, wavelet transform, and Hilbert-Huang transform) were explored for their representation effectiveness. The proposed CNN architecture achieves better results with fewer learnable parameters than similar architectures used for fault detection, including cases with experimental noise.

303 citations


Journal ArticleDOI
TL;DR: Efficient detection of epileptic seizures is achieved when seizure events appear for long durations in hours-long EEG recordings; the proposed method develops a time–frequency plane for multivariate signals and builds patient-specific models for EEG seizure detection.
Abstract: Objective: This paper investigates the multivariate oscillatory nature of electroencephalogram (EEG) signals in adaptive frequency scales for epileptic seizure detection. Methods: The empirical wavelet transform (EWT) has been explored for multivariate signals in order to determine the joint instantaneous amplitudes and frequencies in signal-adaptive frequency scales. The proposed multivariate extension of EWT has been studied on a multivariate multicomponent synthetic signal, as well as on multivariate EEG signals of the Children's Hospital Boston-Massachusetts Institute of Technology (CHB-MIT) scalp EEG database. In a moving-window-based analysis, 2-s-duration multivariate EEG signal epochs containing five automatically selected channels have been decomposed and three features have been extracted from each 1-s part of the 2-s-duration joint instantaneous amplitudes of multivariate EEG signals. The extracted features from each oscillatory level have been processed using a proposed feature processing step and joint features have been computed in order to achieve better discrimination of seizure and seizure-free EEG signal epochs. Results: The proposed detection method has been evaluated over 177 h of EEG records using six classifiers. We have achieved average sensitivity, specificity, and accuracy values of 97.91%, 99.57%, and 99.41%, respectively, using the tenfold cross-validation method, which are higher than those of the compared state-of-the-art methods studied on this database. Conclusion: Efficient detection of epileptic seizures is achieved when seizure events appear for long durations in hours-long EEG recordings. Significance: The proposed method develops a time–frequency plane for multivariate signals and builds patient-specific models for EEG seizure detection.

291 citations


Journal ArticleDOI
TL;DR: A generalization of the short-time Fourier-based synchrosqueezing transform using a new local estimate of instantaneous frequency not only achieves a highly concentrated time-frequency representation for a wide variety of amplitude- and frequency-modulated multicomponent signals but also reconstructs their modes with high accuracy.
Abstract: This paper puts forward a generalization of the short-time Fourier-based synchrosqueezing transform using a new local estimate of instantaneous frequency. Such a technique not only achieves a highly concentrated time-frequency representation for a wide variety of amplitude- and frequency-modulated multicomponent signals but also reconstructs their modes with high accuracy. Numerical investigation on synthetic and gravitational-wave signals shows the efficiency of this new approach.

282 citations


Journal ArticleDOI
TL;DR: In this article, a variational nonlinear chirp mode decomposition (VNCMD) is proposed to analyze wide-band NCSs, which can be viewed as a time-frequency filter bank, which concurrently extracts all the signal modes.
Abstract: Variational mode decomposition (VMD), a recently introduced method for adaptive data analysis, has aroused much attention in various fields. However, the VMD is formulated based on the assumption of narrow-band property of the signal model. To analyze wide-band nonlinear chirp signals (NCSs), we present an alternative method called variational nonlinear chirp mode decomposition (VNCMD). The VNCMD is developed from the fact that a wideband NCS can be transformed to a narrow-band signal by using demodulation techniques. Our decomposition problem is, thus, formulated as an optimal demodulation problem, which is efficiently solved by the alternating direction method of multipliers. Our method can be viewed as a time–frequency filter bank, which concurrently extracts all the signal modes. Some simulated and real data examples are provided showing the effectiveness of the VNCMD in analyzing NCSs containing close or even crossed modes.

204 citations
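
The observation behind VNCMD, that a wide-band nonlinear chirp becomes narrow-band after demodulation, can be illustrated with a small sketch. It assumes the instantaneous phase is already known (in the paper it is estimated as part of the optimization), and the helper names `demodulate` and `rms_bandwidth` are hypothetical.

```python
import numpy as np


def rms_bandwidth(z, fs):
    """RMS spectral width (Hz) of a complex signal, from its power spectrum."""
    power = np.abs(np.fft.fft(z)) ** 2
    freqs = np.fft.fftfreq(z.size, d=1.0 / fs)
    p = power / power.sum()
    mean_f = np.sum(freqs * p)
    return np.sqrt(np.sum((freqs - mean_f) ** 2 * p))


def demodulate(z, phase, t, fc):
    """Remove the frequency modulation of z, leaving a carrier at fc (Hz).

    `phase` is the (assumed known) instantaneous phase of z in radians.
    """
    return z * np.exp(-1j * (phase - 2.0 * np.pi * fc * t))


fs = 1000.0
t = np.arange(0.0, 1.0, 1.0 / fs)
phase = 2.0 * np.pi * (100.0 * t + 100.0 * t**2)   # IF sweeps 100 -> 300 Hz
chirp = np.exp(1j * phase)                          # wide-band analytic chirp
narrow = demodulate(chirp, phase, t, fc=200.0)      # pure 200 Hz carrier
```

Comparing `rms_bandwidth(chirp, fs)` with `rms_bandwidth(narrow, fs)` shows the bandwidth collapsing from tens of hertz to essentially zero, which is what makes a narrow-band decomposition applicable after demodulation.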


Journal ArticleDOI
TL;DR: The local cut-off frequency is adaptively designed by fully facilitating the instantaneous amplitude and frequency information and is able to improve the frequency separation performance, as well as the stability under low sampling rates.

171 citations


Journal ArticleDOI
TL;DR: A novel class of orthogonal wavelet filter banks localized in the time–frequency domain is employed to detect FC and NFC EEG signals automatically, which helps localize the affected brain area that needs to undergo surgery.
Abstract: It is difficult to detect subtle and vital differences in electroencephalogram (EEG) signals simply by visual inspection. Further, the non-stationary nature of EEG signals makes the task more difficult. Determination of epileptic focus is essential for the treatment of pharmacoresistant focal epilepsy. This requires accurate separation of focal and non-focal groups of EEG signals. Hence, an intelligent system that can detect and discriminate focal–class (FC) and non–focal–class (NFC) of EEG signals automatically can aid the clinicians in their diagnosis. In order to facilitate accurate analysis of non-stationary signals, joint time–frequency localized bases are highly desirable. The performance of wavelet bases is found to be effective in analyzing transient and abrupt behavior of EEG signals. Hence, we employ a novel class of orthogonal wavelet filter banks which are localized in time–frequency domain to detect FC and NFC EEG signals automatically. We classify EEG signals as FC and NFC using the proposed wavelet based system. We compute various entropies from the wavelet coefficients of the signals. These entropies are used as discriminating features for the classification of FC and NFC of EEG signals. The features are ranked using Student’s t-test ranking algorithm and then fed to Least Squares-Support Vector Machine (LS–SVM) to classify the signals. Our proposed method achieved the highest classification accuracy of 94.25%. We have obtained 91.95% sensitivity and 96.56% specificity, respectively, using this method. The classification of FC and NFC of EEG signals helps in localization of the affected brain area which needs to undergo surgery.

148 citations


Journal ArticleDOI
TL;DR: The proposed adaptive parameterless EWT (APEWT) method could effectively fulfill the fault diagnosis of rotor rubbing and show a better effect than EMD and EEMD methods, according to analysis of experiment data.

124 citations


Journal ArticleDOI
TL;DR: This paper presents a systematic and up-to-date review on adaptive mode decomposition in two major topics, i.e., mono-component decomposition algorithms and instantaneous frequency estimation approaches reported in more than 80 representative articles published since 1998.
Abstract: Effective signal processing methods are essential for machinery fault diagnosis. Most conventional signal processing methods lack adaptability and are therefore unable to extract the embedded meaningful information well. Adaptive mode decomposition methods have excellent adaptability and high flexibility in describing arbitrary complicated signals, and are free from the limitations imposed by conventional basis expansion, thus being able to adapt to the signal characteristics, extract rich characteristic information, and therefore reveal the underlying physical nature. This paper presents a systematic and up-to-date review on adaptive mode decomposition in two major topics, i.e., mono-component decomposition algorithms (such as empirical mode decomposition, local mean decomposition, intrinsic time-scale decomposition, local characteristic scale decomposition, Hilbert vibration decomposition, empirical wavelet transform, variational mode decomposition, nonlinear mode decomposition, and adaptive local iterative filtering) and instantaneous frequency estimation approaches (including Hilbert-transform-based analytic signal, direct quadrature, and normalized Hilbert transform based on empirical AM-FM decomposition, as well as generalized zero-crossing and energy separation) reported in more than 80 representative articles published since 1998. Their fundamental principles, advantages and disadvantages, and applications to signal analysis in machinery fault diagnosis are examined. Examples are provided to illustrate their performance.

121 citations


Journal ArticleDOI
TL;DR: The designed three-band filter banks and multi-layer perceptron neural network (MLPNN) are further used together to implement a signal classifier that provides classification accuracy better than the recently reported results for epileptic seizure EEG signal classification.

114 citations


Posted Content
TL;DR: This study supports the hypothesis that time-frequency representations are valuable in learning useful features for sound classification and observes that the optimal window size during transformation is dependent on the characteristics of the audio signal and architecturally, 2D convolution yielded better results in most cases compared to 1D.
Abstract: Recent successful applications of convolutional neural networks (CNNs) to audio classification and speech recognition have motivated the search for better input representations for more efficient training. Visual displays of an audio signal, through various time-frequency representations such as spectrograms offer a rich representation of the temporal and spectral structure of the original signal. In this letter, we compare various popular signal processing methods to obtain this representation, such as short-time Fourier transform (STFT) with linear and Mel scales, constant-Q transform (CQT) and continuous Wavelet transform (CWT), and assess their impact on the classification performance of two environmental sound datasets using CNNs. This study supports the hypothesis that time-frequency representations are valuable in learning useful features for sound classification. Moreover, the actual transformation used is shown to impact the classification accuracy, with Mel-scaled STFT outperforming the other discussed methods slightly and baseline MFCC features to a large degree. Additionally, we observe that the optimal window size during transformation is dependent on the characteristics of the audio signal and architecturally, 2D convolution yielded better results in most cases compared to 1D.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a bearing fault detection method based on stator currents analysis using the Hilbert-Huang transform (HHT) and empirical mode decomposition (EMD).
Abstract: This paper focuses on rolling elements bearing fault detection in induction machines based on stator currents analysis. Specifically, it proposes to process the stator currents using the Hilbert–Huang transform. This approach relies on two steps: empirical mode decomposition and Hilbert transform. The empirical mode decomposition is used in order to estimate the intrinsic mode functions (IMFs). These IMFs are assumed to be mono-component signals and can be processed using demodulation technique. Afterward, the Hilbert transform is used to compute the instantaneous amplitude (IA) and instantaneous frequency (IF) of these IMFs. The analysis of the IA and IF allows identifying fault signature that can be used for more accurate diagnosis. The proposed approach is used for bearing fault detection in induction machines at several fault degrees. The effectiveness of the proposed approach is verified by a series of simulation and experimental tests corresponding to different bearing fault conditions. The fault severity is assessed based on the IMFs energy and the variance of the IA and IF of each IMF.
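
The second step of the Hilbert–Huang approach described above, computing IA and IF from a mono-component signal, can be sketched with `scipy.signal.hilbert`. This is an illustration only: the empirical mode decomposition step is omitted and the input is assumed to already be a single IMF; the function name is hypothetical.

```python
import numpy as np
from scipy.signal import hilbert


def instantaneous_amplitude_frequency(imf, fs):
    """Return (IA, IF) of a mono-component signal sampled at fs (Hz).

    IA has the same length as `imf`; IF (in Hz) is one sample shorter
    because it is obtained by differencing the unwrapped phase.
    """
    analytic = hilbert(imf)                    # imf + 1j * Hilbert{imf}
    ia = np.abs(analytic)                      # instantaneous amplitude
    phase = np.unwrap(np.angle(analytic))      # continuous phase in radians
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)
    return ia, inst_freq
```

Statistics such as the variance of `ia` and `inst_freq`, which the abstract mentions as severity indicators, can then be computed directly from the returned arrays.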

Journal ArticleDOI
TL;DR: In this paper, a new feature extraction step that combines the classical wavelet packet decomposition energy distribution technique and a feature extraction technique based on the selection of the most impulsive frequency bands is presented.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed the complex variational mode decomposition (CVMD) algorithm for the analysis of complex-valued data in the presence of white noise and the effects of initialization of center frequency on the filter bank property.

Journal ArticleDOI
TL;DR: A general model to characterize MCCSs is developed, in which the instantaneous frequencies (IFs) and instantaneous amplitudes (IAs) of the intrinsic chirp components (ICCs) are modeled as Fourier series, and the decomposition problem boils down to identifying the developed model.

Journal ArticleDOI
TL;DR: This paper overviews recent advances dealing with time-frequency processing of sparse signals acquired using compressive sensing approaches based on the polynomial Fourier transform and the short-time Fourier transform.

Journal ArticleDOI
TL;DR: By combining with analyses of the PCFCRD, several unclear mechanisms of the constant delay introduction are mathematically interpreted, especially quantitative influences on the chirp rate resolution and theoretical antinoise performance.
Abstract: This paper presents an extension of the integrated time-chirp rate analysis technique, known as the parameterized centroid frequency-chirp rate distribution (PCFCRD), for noisy multicomponent linear frequency modulated signals analysis. The PCFCRD is based on a newly defined correlation function and its auto term accumulation is two-dimensionally coherent. The principle, cross term characteristic, implementation, properties, parameter selection criterion, antinoise performance, and estimation accuracy are analyzed for the PCFCRD. In this paper, comparisons with the coherently integrated cubic phase function, maximum likelihood method, and Lv's distribution are performed. With mathematical analyses and numerical simulations, we demonstrate that the PCFCRD outperforms these three representative counterparts. Further, by combining with analyses of the PCFCRD, several unclear mechanisms of the constant delay introduction are mathematically interpreted, especially quantitative influences on the chirp rate resolution and theoretical antinoise performance.

Journal ArticleDOI
TL;DR: Concentration in frequency and time is proposed to distinguish the different TF contents of time-dependent signals with time-varying amplitude and instantaneous frequencies and this promising TF analysis tool is introduced to seismic data processing.
Abstract: Time–frequency (TF) analysis can reveal local variations in seismic data processing and interpretation, where seismic signals are nonstationary and time varying. High-quality TF representation (TFR) is important for revealing the local information about these nonstationary seismic signals and describing geological structures. Due to the Heisenberg uncertainty principle, traditional TF methods (e.g., short time Fourier transform and continuous wavelet transform) cannot get the finest time resolution and the best frequency resolution at the same time, which leads to ambiguous TFR with a negative effect on the seismic signal analysis. Concentration in frequency and time is proposed to distinguish the different TF contents of time-dependent signals with time-varying amplitude and instantaneous frequencies. We introduce this promising TF analysis tool to seismic data processing. Experiments on synthetic signals and seismic data show its validity and effectiveness, which is helpful for seismic data interpretation in the future.

Journal ArticleDOI
TL;DR: New chirp rate and instantaneous frequency estimators designed for frequency-modulated signals are introduced, paving the way to the real-time computation of a time-frequency representation that is both invertible and sharply localized in frequency.
Abstract: This letter introduces new chirp rate and instantaneous frequency estimators designed for frequency-modulated signals. These estimators are first investigated from a deterministic point of view, then compared together in terms of statistical efficiency. They are also used to design new recursive versions of the vertically synchrosqueezed short-time Fourier transform, using a previously published method (D. Fourer, F. Auger, and P. Flandrin, “Recursive versions of the Levenberg-Marquardt reassigned spectrogram and of the synchrosqueezed STFT,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. , Mar. 2016, pp. 4880–4884). This study paves the way to the real-time computation of a time-frequency representation, which is both invertible and sharply localized in frequency.

Journal ArticleDOI
TL;DR: This paper deals with the problem of extracting information from non-stationary signals in the form of features that can be used for effective decision-making in both data analysis and machine learning for automatic classification systems, and forms a TF/TS feature set including two complementary categories: signal related features and image features.

Journal ArticleDOI
TL;DR: A novel signal processing method based on parameterized demodulation (PD) that is free of the resolution problem in fast Fourier transform-based methods with a limited data length and can obtain accurate rate tracking in a noisy environment is proposed.
Abstract: Utilizing Doppler radar to conduct noncontact vital sign detection has attracted growing interest in recent years. Aiming to extract the vital sign information from the baseband signal effectively and accurately, a novel signal processing method based on parameterized demodulation (PD) is proposed. To effectively characterize the baseband signal whose phase consists of two oscillating components (i.e., the respiration and heartbeat components), the proposed algorithm defines a demodulation operator with sine kernel functions and formulates the phase demodulation as a parameter optimization problem. To increase the computational efficiency of the algorithm, the parameters corresponding to the respiration and heartbeat components are estimated sequentially. Specifically, the respiration component is first estimated and removed from the phase of the baseband signal, and then, the heartbeat component is extracted from the residual signal. Compared with the existing methods, the proposed algorithm is free of the resolution problem in fast Fourier transform-based methods with a limited data length and can obtain accurate rate tracking in a noisy environment. Both simulated and experimental results are provided to demonstrate the advantages and effectiveness of the proposed method for accurate noncontact vital sign detection.

Journal ArticleDOI
TL;DR: Methods of choosing the mother wavelet suited for DC faults are presented, based on the degree of correlation to the fault pattern and the time delay, and the wavelet analysis is performed on a multi-terminal HVDC system built in PSCAD/EMTDC software.

Journal ArticleDOI
TL;DR: In this article, a non-iterative method for the reconstruction of the short-time Fourier transform (STFT) phase from the magnitude is presented, which is based on the direct relationship between the partial derivatives of the phase and the logarithm of the magnitude of the un-sampled STFT with respect to the Gaussian window.
Abstract: A noniterative method for the reconstruction of the short-time Fourier transform (STFT) phase from the magnitude is presented. The method is based on the direct relationship between the partial derivatives of the phase and the logarithm of the magnitude of the un-sampled STFT with respect to the Gaussian window. Although the theory holds in the continuous setting only, the experiments show that the algorithm performs well even in the discretized setting (discrete Gabor transform) with low redundancy using the sampled Gaussian window, the truncated Gaussian window and even other compactly supported windows such as the Hann window. Due to its noniterative nature, the algorithm is very fast and is suitable for long audio signals. Moreover, solutions of iterative phase reconstruction algorithms can be improved considerably by initializing them with the phase estimate provided by the present algorithm. We present an extensive comparison with the state-of-the-art algorithms in a reproducible manner.

Journal ArticleDOI
TL;DR: Time-frequency distributions were often used to estimate the spectrotemporal signal features and appeared more suited for estimating IF of actual SCG signals, and STFT had lower error than CWT methods for most test signals and PCT had the most consistently accurate IF estimations.
Abstract: Accurate estimation of seismocardiographic (SCG) signal features can help successful signal characterization and classification in health and disease. This may lead to new methods for diagnosing and monitoring heart function. Time-frequency distributions (TFD) were often used to estimate the spectrotemporal signal features. In this study, the performance of different TFDs (e.g., short-time Fourier transform (STFT), polynomial chirplet transform (PCT), and continuous wavelet transform (CWT) with different mother functions) was assessed using simulated signals, and then utilized to analyze actual SCGs. The instantaneous frequency (IF) was determined from TFD and the error in estimating IF was calculated for simulated signals. Results suggested that the lowest IF error depended on the TFD and the test signal. STFT had lower error than CWT methods for most test signals. For a simulated SCG, Morlet CWT more accurately estimated IF than other CWTs, but Morlet did not provide noticeable advantages over STFT or PCT. PCT had the most consistently accurate IF estimations and appeared more suited for estimating IF of actual SCG signals. PCT analysis showed that actual SCGs from eight healthy subjects had multiple spectral peaks at 9.20 ± 0.48, 25.84 ± 0.77, 50.71 ± 1.83 Hz (mean ± SEM). These may prove useful features for SCG characterization and classification.
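
Determining the IF from a time-frequency distribution, as the abstract above describes, is often done by following the spectral peak (ridge) in each time frame. The sketch below does this for the STFT case only, using `scipy.signal.stft` with its default Hann window; it is an illustrative assumption, not the estimator used in the study, and `ridge_if` is a hypothetical name.

```python
import numpy as np
from scipy.signal import stft


def ridge_if(x, fs, nperseg=256):
    """Estimate IF (Hz) as the frequency of the largest STFT magnitude per frame.

    Returns (frame_times, if_estimate). Resolution is limited to the STFT
    bin spacing fs / nperseg.
    """
    freqs, times, Z = stft(x, fs=fs, nperseg=nperseg)
    peak_bins = np.argmax(np.abs(Z), axis=0)   # one peak per time frame
    return times, freqs[peak_bins]
```

The error of such an estimate against a known test signal can be computed frame by frame, which mirrors how the IF error was assessed on simulated signals before analysing actual SCGs.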

Journal ArticleDOI
TL;DR: The resulting multidirectional distribution (MDD) approach proves to be more effective than classical methods like extended modified B distribution, S-method, or compact kernel distribution in terms of auto-terms resolution and cross-terms suppression.
Abstract: This paper presents a new advanced methodology for designing high resolution time–frequency distributions (TFDs) of multicomponent nonstationary signals that can be approximated using piece-wise linear frequency modulated (PW-LFM) signals. Most previous kernel design methods assumed that signals auto-terms are mostly centered around the origin of the $(\nu,\tau)$ ambiguity domain while signal cross-terms are mostly away from the origin. This study uses a multicomponent test signal for which each component is modeled as a PW-LFM signal; it finds that the above assumption is a very rough approximation of the location of the auto-terms energy and cross-terms energy in the ambiguity domain and it is only valid for signals that are well separated in the $(t,f)$ domain. A refined investigation led to improved specifications for separating cross-terms from auto-terms in the $(\nu,\tau)$ ambiguity domain. The resulting approach first represents the signal in the ambiguity domain, and then applies a multidirectional signal dependent compact kernel that accounts for the direction of the auto-terms energy. The resulting multidirectional distribution (MDD) approach proves to be more effective than classical methods like extended modified B distribution, S-method, or compact kernel distribution in terms of auto-terms resolution and cross-terms suppression. Results on simulated and real data validate the improved performance of the MDD, showing up to 8% gain as compared to more standard state-of-the-art TFDs.

Journal ArticleDOI
TL;DR: A fast algorithm that does not search over the target's motion parameters is proposed to address the detection of radar maneuvering targets with jerk motion; comparisons with other representative algorithms in computational cost, motion parameter estimation performance, and detection ability indicate that the proposed algorithm achieves a good balance between computational cost and detection ability.
Abstract: The detection performance of radar maneuvering target with jerk motion is affected by the range migration (RM) and Doppler frequency migration (DFM). To address these problems, a fast algorithm without searching target's motion parameters is proposed. In this algorithm, the second-order keystone transform is first applied to eliminate the quadratic coupling between the range frequency and slow time. Then, by employing a new defined symmetric autocorrelation function, scaled Fourier transform, and inverse fast Fourier transform, the target's initial range and velocity are estimated. With these two estimates, the azimuth echoes along the target's trajectory, which can be modeled as a cubic phase signal (CPS), are extracted. Thereafter, the target's radial acceleration and jerk are estimated by approaches for parameters estimation of the CPS. Finally, by constructing a compensation function, the RM and DFM are compensated simultaneously, followed by the coherent integration and target detection. Comparisons with other representative algorithms in computational cost, motion parameter estimation performance, and detection ability indicate that the proposed algorithm can achieve a good balance between the computational cost and detection ability. The simulation and raw data processing results demonstrate the effectiveness of the proposed algorithm.

Journal ArticleDOI
TL;DR: In this paper, a mathematical model based on the concept of modal strain energy and signal processing method based on Hilbert-Huang Transform (HHT) was proposed to identify the cracks.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a sliding window discrete Fourier transform and the effect of sidelobes of sideband frequencies on the fundamental component amplitude of stator current, which can detect the amplitude of the fault indicator frequency in vicinity of the fundamental one.
Abstract: The Fourier transform is widely used to diagnose induction motor faults through the monitoring of fault signatures from measured signals such as stator currents. For a good frequency resolution, Fourier transform needs a long signal acquisition time that increases the probability of speed fluctuations which, leads to fault signatures variations. In addition, limited acquisition time and acquired points generate unwanted sidelobes leakage phenomenon, caused by step frequency resolution. In signal processing, the use of window functions allows the avoidance of this phenomenon with the cost of losing a part of signal information. In this paper, the authors propose a new method for the diagnosis of induction motor broken bar fault based on sliding window discrete Fourier transform and the effect of sidelobes of sideband frequencies on the fundamental component amplitude of stator current. The main advantage of the proposed method is that one can detect the amplitude of the fault indicator frequency in vicinity of the fundamental one in shorter time and with good precision even if the motor turns at no-load when compared to used methods, as fast Fourier transform, zoom fast Fourier transform, multiple signal classification, and zoom multiple signal classification. The simulation and experimental results validate the effectiveness of the proposed method.
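
The sliding window DFT that the method above builds on can be sketched with the standard one-bin recursion, which updates a single DFT coefficient as the window advances by one sample instead of recomputing a full FFT. This shows only the generic recursion; the fault-indicator logic of the paper is not reproduced, and the function name is hypothetical.

```python
import numpy as np


def sliding_dft(x, window_len, k):
    """Track DFT bin k over every length-`window_len` window of x.

    out[n] equals np.fft.fft(x[n:n + window_len])[k], computed recursively:
    drop the oldest sample, add the newest, then rotate by one bin step.
    """
    x = np.asarray(x, dtype=float)
    twiddle = np.exp(2j * np.pi * k / window_len)
    out = np.empty(x.size - window_len + 1, dtype=complex)
    m = np.arange(window_len)
    # Direct DFT for the first window initialises the recursion.
    out[0] = np.sum(x[:window_len] * np.exp(-2j * np.pi * k * m / window_len))
    for n in range(1, out.size):
        out[n] = (out[n - 1] - x[n - 1] + x[n - 1 + window_len]) * twiddle
    return out
```

Each update costs a constant number of operations per tracked bin, which is why the approach suits monitoring a few frequencies near the fundamental over short acquisition times. Rounding errors accumulate slowly in the recursion, so long runs may need periodic re-initialisation.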

Journal ArticleDOI
TL;DR: In this article, a direction of arrival (DOA) estimation algorithm based on short time Fourier transform (STFT) and multiple invariance estimation of signal parameters via rotational invariance techniques (MI-ESPRIT) is proposed.
Abstract: In order to improve the angle measurement precision of LFM signals with a low computational complexity, a direction of arrival (DOA) estimation algorithm STFT-MI-ESPRIT is proposed in this paper. The algorithm is based on short time Fourier transform (STFT) and multiple invariance estimation of signal parameters via rotational invariance techniques (MI-ESPRIT). Firstly, the STFT of the array element’s output is calculated and the signals are transformed to the time-frequency domain. Then the spatial time-frequency distribution matrix can be obtained through selecting multiple single-source time-frequency points in the time-frequency plane and the signal subspace can also be obtained using Eigen decomposition. Finally, the multiple rotational invariant equation of the array based on STFT is obtained and the closed-form solution is obtained using the multi-least-squares (MLS) criterion. The simulation results show that the proposed algorithm can improve the estimation precision greatly compared with the traditional ESPRIT-like algorithms and its computational complexity remains the same in general. This paper also proposes that the STFT-MI-ESPRIT algorithm can use partial rotational invariances of the array instead of all the rotational invariances, which can reduce the computational complexity on the basis of ensuring the estimation precision basically. The simulation results verify the effectiveness of the conclusion.

Proceedings ArticleDOI
16 Jun 2017
TL;DR: A fully complex-valued deep neural network (FCDNN) that learns the nonlinear mapping from complex-valued STFT coefficients of a mixture to sources and outperforms the state-of-the-art DNN-based methods on singing source separation.
Abstract: Deep neural networks (DNNs) have become a popular means of separating a target source from a mixed signal. Most DNN-based methods modify only the magnitude spectrum of the mixture. The phase spectrum, which is inherent in the short-time Fourier transform (STFT) coefficients of the input signal, is left unchanged. However, recent studies have revealed that incorporating phase information can improve the quality of separated sources. To estimate simultaneously the magnitude and the phase of STFT coefficients, this paper develops a fully complex-valued deep neural network (FCDNN) that learns the nonlinear mapping from complex-valued STFT coefficients of a mixture to sources. In addition, to reinforce the sparsity of the estimated spectra, a sparse penalty term is incorporated into the objective function of the FCDNN. Finally, the proposed method is applied to singing source separation. Experimental results indicate that the proposed method outperforms the state-of-the-art DNN-based methods.