Author

Pascal Scalart

Other affiliations: Orange S.A., Laval University, CNET
Bio: Pascal Scalart is an academic researcher from University of Rennes. The author has contributed to research in topics: Speech enhancement & Noise reduction. The author has an h-index of 20 and has co-authored 85 publications receiving 2212 citations. Previous affiliations of Pascal Scalart include Orange S.A. & Laval University.


Papers
Proceedings ArticleDOI
07 May 1996
TL;DR: A new approach is then developed which achieves a trade-off between effective noise reduction and low computational load for real-time operations and demonstrates that the subjective and objective results are much better than existing methods.
Abstract: This paper addresses the problem of single microphone frequency domain speech enhancement in noisy environments. The main characteristics of available frequency domain noise reduction algorithms are presented. We have confirmed that the a priori SNR estimation leads to the best subjective results. According to these conclusions, a new approach is then developed which achieves a trade-off between effective noise reduction and low computational load for real-time operations. The obtained solutions demonstrate that the subjective and objective results are much better than existing methods.

794 citations

Journal ArticleDOI
TL;DR: A method called two-step noise reduction (TSNR) technique is proposed which solves this problem while maintaining the benefits of the decision-directed approach and a significant improvement is brought by HRNR compared to TSNR thanks to the preservation of harmonics.
Abstract: This paper addresses the problem of single-microphone speech enhancement in noisy environments. State-of-the-art short-time noise reduction techniques are most often expressed as a spectral gain depending on the signal-to-noise ratio (SNR). The well-known decision-directed (DD) approach drastically limits the level of musical noise, but the estimated a priori SNR is biased since it depends on the speech spectrum estimation in the previous frame. Therefore, the gain function matches the previous frame rather than the current one, which degrades the noise reduction performance. The consequence of this bias is an annoying reverberation effect. We propose a method called two-step noise reduction (TSNR) technique which solves this problem while maintaining the benefits of the decision-directed approach. The estimation of the a priori SNR is refined by a second step to remove the bias of the DD approach, thus removing the reverberation effect. However, classic short-time noise reduction techniques, including TSNR, introduce harmonic distortion in enhanced speech because of the unreliability of estimators for small signal-to-noise ratios. This is mainly due to the difficult task of noise power spectral density (PSD) estimation in single-microphone schemes. To overcome this problem, we propose a method called harmonic regeneration noise reduction (HRNR). A nonlinearity is used to regenerate the degraded harmonics of the distorted signal in an efficient way. The resulting artificial signal is produced in order to refine the a priori SNR used to compute a spectral gain able to preserve the speech harmonics. These methods are analyzed, and objective and formal subjective test results comparing the HRNR and TSNR techniques are provided. A significant improvement is brought by HRNR compared to TSNR thanks to the preservation of harmonics.

286 citations
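
The two-step idea described in the abstract above can be sketched for a single frame as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a Wiener-type gain xi / (1 + xi), a known per-bin noise power, and a smoothing factor `beta` of 0.98. The function names `wiener_gain` and `tsnr_frame` are hypothetical, and which previous-frame estimate is fed back into the first step is left to the caller.

```python
def wiener_gain(xi):
    """Wiener-type spectral gain for an a priori SNR value xi (assumed gain rule)."""
    return xi / (1.0 + xi)


def tsnr_frame(noisy_power, noise_power, prev_clean_power, beta=0.98):
    """Process one frame; every argument is a per-bin list of power values.

    noisy_power      -- |X(p, k)|^2 for the current frame
    noise_power      -- noise PSD estimate (assumed known and non-zero)
    prev_clean_power -- |S_hat(p-1, k)|^2 from the previous frame
    beta             -- decision-directed smoothing factor (assumed value)

    Returns (gain_dd, gain_tsnr, clean_power), each a per-bin list.
    """
    gain_dd, gain_tsnr, clean_power = [], [], []
    for x2, n2, s2_prev in zip(noisy_power, noise_power, prev_clean_power):
        snr_post = x2 / n2  # a posteriori SNR of the current frame

        # Step 1: decision-directed a priori SNR, dominated by the
        # previous frame's speech estimate when beta is close to 1.
        xi_dd = beta * s2_prev / n2 + (1.0 - beta) * max(snr_post - 1.0, 0.0)
        g_dd = wiener_gain(xi_dd)

        # Step 2: re-estimate the a priori SNR from the current frame
        # by applying the first-step gain to the noisy spectrum.
        xi_step2 = (g_dd ** 2) * x2 / n2
        g_step2 = wiener_gain(xi_step2)

        gain_dd.append(g_dd)
        gain_tsnr.append(g_step2)
        clean_power.append((g_step2 ** 2) * x2)
    return gain_dd, gain_tsnr, clean_power
```

At a speech onset (previous estimate zero, current frame strong), the first-step gain stays low because it follows the previous frame, while the second-step gain is computed from the current frame and is therefore higher. This is the frame-delay bias that the abstract attributes to the decision-directed approach.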

Journal ArticleDOI
TL;DR: An overview of recent bibliographic references dealing with speech processing in mobile terminals is given, together with a fairly large, commented list of references taken from many conference proceedings and journals.

136 citations

Journal ArticleDOI
TL;DR: This paper presents a summary of the solutions retained for this dual reduction in the context of mono-channel and two-channel sound pick-ups.
Abstract: The modern telecommunications field is concerned with freedom and, in this context, hands-free systems offer subscribers the possibility of talking more naturally, without using a handset. This new type of use leads to new problems which were negligible in traditional telephony, namely the superposition of noise and echo on the speech signal. To solve these problems and provide a quality that is sufficient for telecommunications, combined reduction of these disturbances is required. This paper presents a summary of the solutions retained for this dual reduction in the context of mono-channel and two-channel sound pick-ups.

111 citations

Proceedings ArticleDOI
17 May 2004
TL;DR: A new method, called the two-step noise reduction (TSNR) technique, is proposed, which solves the problem of single microphone speech enhancement in noisy environments while maintaining the benefits of the decision-directed approach.
Abstract: The paper addresses the problem of single microphone speech enhancement in noisy environments. Common short-time noise reduction techniques proposed in the art are expressed as a spectral gain depending on the a priori SNR. In the well-known decision-directed approach, the a priori SNR depends on the speech spectrum estimation in the previous frame. As a consequence, the gain function matches the previous frame rather than the current one, which degrades the noise reduction performance. We propose a new method, called the two-step noise reduction (TSNR) technique, which solves this problem while maintaining the benefits of the decision-directed approach. This method is analyzed and results in voice communication and speech recognition contexts are given.

89 citations


Cited by
Patent
11 Jan 2011
TL;DR: In this article, an intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions.
Abstract: An intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionality powered by external services with which the system can interact.

1,462 citations

Journal ArticleDOI
TL;DR: The proposed DNN approach can well suppress highly nonstationary noise, which is tough to handle in general, and is effective in dealing with noisy speech data recorded in real-world scenarios without the generation of the annoying musical artifact commonly observed in conventional enhancement methods.
Abstract: In contrast to the conventional minimum mean square error (MMSE)-based noise reduction techniques, we propose a supervised method to enhance speech by means of finding a mapping function between noisy and clean speech signals based on deep neural networks (DNNs). In order to be able to handle a wide range of additive noises in real-world situations, a large training set that encompasses many possible combinations of speech and noise types, is first designed. A DNN architecture is then employed as a nonlinear regression function to ensure a powerful modeling capability. Several techniques have also been proposed to improve the DNN-based speech enhancement system, including global variance equalization to alleviate the over-smoothing problem of the regression model, and the dropout and noise-aware training strategies to further improve the generalization capability of DNNs to unseen noise conditions. Experimental results demonstrate that the proposed framework can achieve significant improvements in both objective and subjective measures over the conventional MMSE based technique. It is also interesting to observe that the proposed DNN approach can well suppress highly nonstationary noise, which is tough to handle in general. Furthermore, the resulting DNN model, trained with artificial synthesized data, is also effective in dealing with noisy speech data recorded in real-world scenarios without the generation of the annoying musical artifact commonly observed in conventional enhancement methods.

1,250 citations

Proceedings ArticleDOI
28 Mar 2017
TL;DR: This work proposes the use of generative adversarial networks for speech enhancement; it operates at the waveform level, trains the model end-to-end, and incorporates 28 speakers and 40 different noise conditions into the same model, such that model parameters are shared across them.
Abstract: Current speech enhancement techniques operate on the spectral domain and/or exploit some higher-level feature. The majority of them tackle a limited number of noise conditions and rely on first-order statistics. To circumvent these issues, deep networks are being increasingly used, thanks to their ability to learn complex functions from large example sets. In this work, we propose the use of generative adversarial networks for speech enhancement. In contrast to current techniques, we operate at the waveform level, training the model end-to-end, and incorporate 28 speakers and 40 different noise conditions into the same model, such that model parameters are shared across them. We evaluate the proposed model using an independent, unseen test set with two speakers and 20 alternative noise conditions. The enhanced samples confirm the viability of the proposed model, and both objective and subjective evaluations confirm the effectiveness of it. With that, we open the exploration of generative architectures for speech enhancement, which may progressively incorporate further speech-centric design choices to improve their performance.

1,001 citations

Journal ArticleDOI
S. Biyiksiz
01 Mar 1985
TL;DR: This book by Elliott and Rao is a valuable contribution to the general areas of signal processing and communications and can be used for a graduate level course in perhaps two ways.
Abstract: There has been a great deal of material in the area of discrete-time transforms that has been published in recent years. This book does an excellent job of presenting important aspects of such material in a clear manner. The book has 11 chapters and a very useful appendix. Seven of these chapters are essentially devoted to the Fourier series/transform, discrete Fourier transform, fast Fourier transform (FFT), and applications of the FFT in the area of spectral estimation. Chapters 8 through 10 deal with many other discrete-time transforms and algorithms to compute them. Of these transforms, the Karhunen-Loève, the discrete cosine, and the Walsh-Hadamard transform are perhaps the most well-known. A lucid discussion of number theoretic transforms is presented in Chapter 11. This reviewer feels that the authors have done a fine job of compiling the pertinent material and presenting it in a concise and clear manner. There are a number of problems at the end of each chapter, an appreciable number of which are challenging. The authors have included a comprehensive set of references at the end of the book. In brief, this book is a valuable contribution to the general areas of signal processing and communications. It can be used for a graduate level course in perhaps two ways. One would be to cover the first seven chapters in great detail. The other would be to cover the whole book by focussing on different topics in a selective manner. This book by Elliott and Rao is extremely useful to researchers/engineers who are working in the areas of signal processing and communications. It is also an excellent reference book, and hence a valuable addition to one’s library.

843 citations

Book
30 Apr 2013
TL;DR: This book offers a unified presentation of OFDM theory and high speed and wireless applications, in particular, ADSL, wireless LAN, and digital broadcasting technologies are explained.
Abstract: From the Publisher: Multi-carrier modulation, in particular orthogonal frequency division multiplexing (OFDM), has been successfully applied to a wide variety of digital communications applications for several years. Although OFDM has been chosen as the physical layer standard for a diversity of important systems, the theory, algorithms, and implementation techniques remain subjects of current interest. This book is intended to be a concise summary of the present state of the art of the theory and practice of OFDM technology. This book offers a unified presentation of OFDM theory and high speed and wireless applications. In particular, ADSL, wireless LAN, and digital broadcasting technologies are explained. It is hoped that this book will prove valuable both to developers of such systems, and to researchers and graduate students involved in analysis of digital communications, and will remain a valuable summary of the technology, providing an understanding of new advances as well as the present core technology.

755 citations