Jean Laroche

Other affiliations: Télécom ParisTech, Orange S.A., Creative Technology ...

Bio: Jean Laroche is an academic researcher from Audience. The author has contributed to research in the topics Audio signal and Frequency domain. The author has an h-index of 29 and has co-authored 75 publications receiving 3215 citations. Previous affiliations of Jean Laroche include Télécom ParisTech and Orange S.A.

Papers published on a yearly basis (chart omitted; years shown span 1989 to 2018)

Papers

Journal Article

Non-parametric techniques for pitch-scale and time-scale modification of speech

Eric Moulines¹, Jean Laroche¹

Télécom ParisTech¹

01 Feb 1995-Speech Communication

TL;DR: This contribution reviews frequency-domain algorithms (phase vocoder) and time-domain algorithms (Time-Domain Pitch-Synchronous Overlap/Add and the like) in the same framework and presents more recent variations of these schemes.

363 citations

Journal Article

Improved phase vocoder time-scale modification of audio

Jean Laroche, Mark Dolson

01 May 1999-IEEE Transactions on Speech and Audio Processing

TL;DR: This paper examines the problem of phasiness in the context of time-scale modification and provides new insights into its causes. Two extensions to the standard phase vocoder algorithm are introduced, and the resulting sound quality is shown to be significantly improved.

Abstract: The phase vocoder is a well established tool for time scaling and pitch shifting speech and audio signals via modification of their short-time Fourier transforms (STFTs). In contrast to time-domain time-scaling and pitch-shifting techniques, the phase vocoder is generally considered to yield high quality results, especially for large modification factors and/or polyphonic signals. However, the phase vocoder is also known for introducing a characteristic perceptual artifact, often described as "phasiness", "reverberation", or "loss of presence". This paper examines the problem of phasiness in the context of time-scale modification and provides new insights into its causes. Two extensions to the standard phase vocoder algorithm are introduced, and the resulting sound quality is shown to be significantly improved. Moreover, the modified phase vocoder is shown to provide a factor-of-two decrease in computational cost.

355 citations

Proceedings Article

HNS: Speech modification based on a harmonic+noise model

Jean Laroche¹, Yannis Stylianou¹, Eric Moulines¹

Télécom ParisTech¹

27 Apr 1993

TL;DR: HNS (harmonic plus noise synthesis), an analysis/modification/synthesis model based on a harmonic plus noise representation of the speech signal, is presented, and informal listening tests demonstrate the effectiveness of this approach for time-scale modifications.

Abstract: HNS (harmonic plus noise synthesis), an analysis/modification/synthesis model based on a harmonic plus noise representation of the speech signal, is presented. Significant improvements over previous work on the subject are proposed at both the analysis and the synthesis stages: a model of harmonically related sinusoids with linearly varying complex amplitudes for the representation of the deterministic part of the signal; a joint time-domain and frequency-domain representation of the stochastic part of the signal; and a pitch-synchronous PSOLA (pitch-synchronized overlap-add)-like synthesis scheme. Informal listening tests demonstrate the effectiveness of this approach for time-scale modifications.

168 citations

Patent

Process for identifying audio content

Jean Laroche¹

Creative Technology¹

15 May 2001

TL;DR: A fingerprint of an audio signal is generated based on the energy content in frequency subbands and is then compared to a database to identify the audio signal; the fingerprint is designed to remain useful for signals altered subsequent to its generation.

Abstract: A fingerprint of an audio signal is generated based on the energy content in frequency subbands. Processing techniques assure a robust identification fingerprint that will be useful for signals altered subsequent to the generation of the fingerprint. The fingerprint is compared to a database to identify the audio signal.

167 citations

Proceedings Article

New phase-vocoder techniques for pitch-shifting, harmonizing and other exotic effects

Jean Laroche, Mark Dolson

17 Oct 1999

TL;DR: The phase-vocoder is usually presented as a high-quality solution for time-scale modification of signals, pitch-scale modifications usually being implemented as a combination of time-scaling and sampling-rate conversion.

Abstract: The phase-vocoder is usually presented as a high-quality solution for time-scale modification of signals, pitch-scale modifications usually being implemented as a combination of time-scaling and sampling-rate conversion. We present two new phase-vocoder-based techniques which allow direct manipulation of the signal in the frequency domain, enabling such applications as pitch-shifting, chorusing, harmonizing, partial stretching and other exotic modifications which cannot be achieved by the standard time-scale/sampling-rate conversion scheme. The new techniques are based on a very simple peak-detection stage, followed by a peak-shifting stage. The simplest one allows for 50% overlap but restricts the precision of the modifications, while the most flexible technique requires a more expensive 75% overlap.

136 citations

Cited by

Journal Article

Musical genre classification of audio signals

George Tzanetakis¹, Perry R. Cook¹

Princeton University¹

07 Nov 2002-IEEE Transactions on Speech and Audio Processing

TL;DR: The automatic classification of audio signals into a hierarchy of musical genres is explored, and three feature sets for representing timbral texture, rhythmic content and pitch content are proposed.

Abstract: Musical genres are categorical labels created by humans to characterize pieces of music. A musical genre is characterized by the common characteristics shared by its members. These characteristics typically are related to the instrumentation, rhythmic structure, and harmonic content of the music. Genre hierarchies are commonly used to structure the large collections of music available on the Web. Currently musical genre annotation is performed manually. Automatic musical genre classification can assist or replace the human user in this process and would be a valuable addition to music information retrieval systems. In addition, automatic musical genre classification provides a framework for developing and evaluating features for any type of content-based analysis of musical signals. In this paper, the automatic classification of audio signals into a hierarchy of musical genres is explored. More specifically, three feature sets for representing timbral texture, rhythmic content and pitch content are proposed. The performance and relative importance of the proposed features are investigated by training statistical pattern recognition classifiers using real-world audio collections. Both whole-file and real-time frame-based classification schemes are described. Using the proposed feature sets, a classification accuracy of 61% for ten musical genres is achieved. This result is comparable to results reported for human musical genre classification.

2,668 citations

Book

Digital Watermarking and Steganography

Ingemar J. Cox, Matthew L. Miller, Jeffrey Adam Bloom, Jessica Fridrich, Ton Kalker (+1 more)

23 Nov 2007

TL;DR: This new edition now contains essential information on steganalysis and steganography, and digital watermark embedding is given a complete update with new processes and applications.

Abstract: Digital audio, video, images, and documents are flying through cyberspace to their respective owners. Unfortunately, along the way, individuals may choose to intervene and take this content for themselves. Digital watermarking and steganography technology greatly reduces the instances of this by limiting or eliminating the ability of third parties to decipher the content that they have taken. The many techniques of digital watermarking (embedding a code) and steganography (hiding information) continue to evolve as applications that necessitate them do the same. The authors of this second edition provide an update on the framework for applying these techniques that they provided researchers and professionals in the well-received first edition. Steganography and steganalysis (the art of detecting hidden information) have been added to a robust treatment of digital watermarking, as many in each field research and deal with the other. New material includes watermarking with side information, QIM, and dirty-paper codes. The revision and inclusion of new material by these influential authors has created a must-own book for anyone in this profession. This new edition now contains essential information on steganalysis and steganography. New concepts and new applications, including QIM, are introduced. Digital watermark embedding is given a complete update with new processes and applications.

1,773 citations

Journal Article

Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds

Hideki Kawahara, Ikuyo Masuda-Katsuse, Alain de Cheveigné

01 Apr 1999-Speech Communication

TL;DR: A set of simple new procedures has been developed to enable the real-time manipulation of speech parameters by using pitch-adaptive spectral analysis combined with a surface reconstruction method in the time–frequency region.

1,741 citations

Journal Article

Continuous probabilistic transform for voice conversion

Yannis Stylianou¹, Olivier Cappé², Eric Moulines²

Bell Labs¹, École Normale Supérieure²

01 Mar 1998-IEEE Transactions on Speech and Audio Processing

TL;DR: A new methodology is designed for representing the relationship between two sets of spectral envelopes, and the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previously proposed conversion methods.

Abstract: Voice conversion, as considered in this paper, is defined as modifying the speech signal of one speaker (source speaker) so that it sounds as if it had been pronounced by a different speaker (target speaker). Our contribution includes the design of a new methodology for representing the relationship between two sets of spectral envelopes. The proposed method is based on the use of a Gaussian mixture model of the source speaker spectral envelopes. The conversion itself is represented by a continuous parametric function which takes into account the probabilistic classification provided by the mixture model. The parameters of the conversion function are estimated by least squares optimization on the training data. This conversion method is implemented in the context of the HNM (harmonic+noise model) system, which allows high-quality modifications of speech signals. Compared to earlier methods based on vector quantization, the proposed conversion scheme results in a much better match between the converted envelopes and the target envelopes. Evaluation by objective tests and formal listening tests shows that the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previously proposed conversion methods.

1,109 citations

Journal Article

Multirate digital signal processing

S. Biyiksiz¹

Raytheon¹

01 Mar 1985

TL;DR: This book by Elliott and Rao is a valuable contribution to the general areas of signal processing and communications and can be used for a graduate level course in perhaps two ways.

Abstract: There has been a great deal of material in the area of discrete-time transforms that has been published in recent years. This book does an excellent job of presenting important aspects of such material in a clear manner. The book has 11 chapters and a very useful appendix. Seven of these chapters are essentially devoted to the Fourier series/transform, discrete Fourier transform, fast Fourier transform (FFT), and applications of the FFT in the area of spectral estimation. Chapters 8 through 10 deal with many other discrete-time transforms and algorithms to compute them. Of these transforms, the Karhunen-Loève, the discrete cosine, and the Walsh-Hadamard transforms are perhaps the most well-known. A lucid discussion of number theoretic transforms is presented in Chapter 11. This reviewer feels that the authors have done a fine job of compiling the pertinent material and presenting it in a concise and clear manner. There are a number of problems at the end of each chapter, an appreciable number of which are challenging. The authors have included a comprehensive set of references at the end of the book. In brief, this book is a valuable contribution to the general areas of signal processing and communications. It can be used for a graduate level course in perhaps two ways. One would be to cover the first seven chapters in great detail. The other would be to cover the whole book by focussing on different topics in a selective manner. This book by Elliott and Rao is extremely useful to researchers/engineers who are working in the areas of signal processing and communications. It is also an excellent reference book, and hence a valuable addition to one’s library.

843 citations
