Author

Juan José Burred

Bio: Juan José Burred is an academic researcher from IRCAM. The author has contributed to research in topics: Source separation & Audio signal processing. The author has an h-index of 12 and has co-authored 30 publications receiving 445 citations. Previous affiliations of Juan José Burred include Technical University of Berlin & Free University of Berlin.

Papers
Journal Article
TL;DR: The design, implementation, and evaluation of a system for automatic audio signal classification is presented; signals are classified by audio type, differentiating between three speech classes, 13 musical genres, and background noise.
Abstract: The design, implementation, and evaluation of a system for automatic audio signal classification is presented. The signals are classified according to audio type, differentiating between three speech classes, 13 musical genres, and background noise. A large number of audio features are evaluated for their suitability in such a classification task, including MPEG-7 descriptors and several new features. The selection of the features is carried out systematically with regard to their robustness to noise and bandwidth changes, as well as to their ability to distinguish a given set of audio types. Direct and hierarchical approaches for the feature selection and for the classification are evaluated and compared.

76 citations
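A minimal sketch of this kind of two-stage pipeline, assuming librosa and scikit-learn; the MFCC summary features and the speech/music/noise-then-genre hierarchy are illustrative stand-ins for the paper's MPEG-7 descriptors and feature-selection procedure, not its actual design:

```python
# Minimal sketch of a hierarchical audio-type classifier in the spirit of the
# paper. Feature choice (MFCC statistics) and the two-stage hierarchy are
# illustrative assumptions, not the paper's exact feature set or topology.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def features(path):
    """Summarize a clip as mean/std of MFCCs (a stand-in for the paper's
    MPEG-7 and custom descriptors)."""
    y, sr = librosa.load(path, sr=22050, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Stage 1: coarse audio type (speech / music / noise).
# Stage 2: genre, trained only on the music subset.
coarse = RandomForestClassifier(n_estimators=200, random_state=0)
genre = RandomForestClassifier(n_estimators=200, random_state=0)

def fit(paths, coarse_labels, genre_labels):
    # genre_labels may hold None for non-music clips; they are masked out.
    X = np.stack([features(p) for p in paths])
    coarse.fit(X, coarse_labels)
    is_music = np.asarray(coarse_labels) == "music"
    genre.fit(X[is_music], np.asarray(genre_labels, dtype=object)[is_music])

def predict(path):
    x = features(path).reshape(1, -1)
    top = coarse.predict(x)[0]
    return (top, genre.predict(x)[0]) if top == "music" else (top, None)
```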

Journal ArticleDOI
TL;DR: An experimental paradigm that combines state-of-the-art voice transformation algorithms with psychophysical reverse correlation is introduced and shows that two of the most important dimensions of social judgments, a speaker’s perceived dominance and trustworthiness, are driven by robust and distinguishing pitch trajectories in short utterances like the word “Hello.”
Abstract: Human listeners excel at forming high-level social representations about each other, even from the briefest of utterances. In particular, pitch is widely recognized as the auditory dimension that conveys most of the information about a speaker’s traits, emotional states, and attitudes. While past research has primarily looked at the influence of mean pitch, almost nothing is known about how intonation patterns, i.e., finely tuned pitch trajectories around the mean, may determine social judgments in speech. Here, we introduce an experimental paradigm that combines state-of-the-art voice transformation algorithms with psychophysical reverse correlation and show that two of the most important dimensions of social judgments, a speaker’s perceived dominance and trustworthiness, are driven by robust and distinguishing pitch trajectories in short utterances like the word “Hello,” which remained remarkably stable whether male or female listeners judged male or female speakers. These findings reveal a unique communicative adaptation that enables listeners to infer social traits regardless of speakers’ physical characteristics, such as sex and mean pitch. By characterizing how any given individual’s mental representations may differ from this generic code, the method introduced here opens avenues to explore dysprosody and social-cognitive deficits in disorders like autism spectrum and schizophrenia. In addition, once derived experimentally, these prototypes can be applied to novel utterances, thus providing a principled way to modulate personality impressions in arbitrary speech signals.

56 citations
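The reverse-correlation logic can be illustrated with a toy simulation: random pitch contours perturb a base utterance, a listener repeatedly picks which of two variants sounds more trustworthy, and averaging chosen minus rejected contours estimates the internal prototype. The sketch below (plain NumPy, with a simulated listener and made-up numbers) abstracts away the voice transformation entirely:

```python
# Toy sketch of psychophysical reverse correlation on pitch contours.
# Actual stimuli are resynthesized utterances; here the manipulation is
# reduced to a random pitch contour (in cents) over 7 breakpoints, and
# the "listener" is simulated. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(0)
N_TRIALS, N_POINTS = 500, 7

def random_contour():
    return rng.normal(0.0, 70.0, N_POINTS)  # pitch offsets in cents

# Hypothetical internal prototype the simulated listener "prefers"
# (e.g., a final pitch rise for trustworthiness).
prototype = np.linspace(-40, 40, N_POINTS)

chosen, rejected = [], []
for _ in range(N_TRIALS):
    a, b = random_contour(), random_contour()
    # Two-interval forced choice: pick the contour closer to the prototype.
    pick_a = np.dot(a, prototype) > np.dot(b, prototype)
    chosen.append(a if pick_a else b)
    rejected.append(b if pick_a else a)

# First-order reverse-correlation kernel: the mean chosen minus mean
# rejected contour recovers the shape of the prototype (up to scale).
kernel = np.mean(chosen, axis=0) - np.mean(rejected, axis=0)
print(np.round(kernel, 1))
```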

Journal ArticleDOI
TL;DR: A computational model of musical instrument sounds that focuses on capturing the dynamic behavior of the spectral envelope, which results in a compact representation in the form of a set of prototype curves in feature space, or equivalently of prototype spectro-temporal envelopes in the time-frequency domain.
Abstract: We present a computational model of musical instrument sounds that focuses on capturing the dynamic behavior of the spectral envelope. A set of spectro-temporal envelopes belonging to different notes of each instrument are extracted by means of sinusoidal modeling and subsequent frequency interpolation, before being subjected to principal component analysis. The prototypical evolution of the envelopes in the obtained reduced-dimensional space is modeled as a nonstationary Gaussian Process. This results in a compact representation in the form of a set of prototype curves in feature space, or equivalently of prototype spectro-temporal envelopes in the time-frequency domain. Finally, the obtained models are successfully evaluated in the context of two music content analysis tasks: classification of instrument samples and detection of instruments in monaural polyphonic mixtures.

45 citations
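The representation step lends itself to a short sketch: per-frame spectral envelopes are projected with PCA so that each note becomes a trajectory in a low-dimensional space. The sinusoidal analysis, frequency interpolation, and Gaussian-process prototype fitting are omitted here, and the input envelopes are synthetic:

```python
# Sketch of the representation step: frame-wise spectral envelopes are
# projected with PCA so each note becomes a curve in a low-dimensional
# feature space. Envelope extraction and GP prototype fitting are omitted.
import numpy as np
from sklearn.decomposition import PCA

def envelope_matrix(notes):
    """Stack per-frame log spectral envelopes from all notes.
    `notes` is a list of (n_frames, n_bins) arrays, assumed precomputed
    by sinusoidal analysis + interpolation onto a common frequency grid."""
    return np.vstack(notes)

def to_trajectories(notes, n_components=3):
    pca = PCA(n_components=n_components)
    pca.fit(envelope_matrix(notes))
    # Each note -> (n_frames, n_components) curve; the paper models the
    # distribution of such curves per instrument as a nonstationary GP.
    return [pca.transform(n) for n in notes], pca

# Example with synthetic envelopes: 2 "notes", 100 frames, 257 bins each.
rng = np.random.default_rng(1)
notes = [rng.standard_normal((100, 257)) for _ in range(2)]
curves, model = to_trajectories(notes)
print(curves[0].shape)  # (100, 3)
```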

Proceedings Article
01 Jan 2007
TL;DR: This paper proposes a framework for the sound source separation and timbre classification of polyphonic, multi-instrumental music signals, inspired by ideas from Computational Auditory Scene Analysis and formulated as a graph partitioning problem.
Abstract: The identification of the instruments playing in a polyphonic music signal is an important and unsolved problem in Music Information Retrieval. In this paper, we propose a framework for the sound source separation and timbre classification of polyphonic, multi-instrumental music signals. The sound source separation method is inspired by ideas from Computational Auditory Scene Analysis and formulated as a graph partitioning problem. It utilizes a sinusoidal analysis front-end and makes use of the normalized cut, applied as a global criterion for segmenting graphs. Timbre models for six musical instruments are used for the classification of the resulting sound sources. The proposed framework is evaluated on a dataset consisting of mixtures of a variable number of simultaneous pitches and instruments, up to a maximum of four concurrent notes.

37 citations
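As a rough illustration of the grouping step, sinusoidal peaks can be treated as nodes of a similarity graph and partitioned with spectral clustering, whose objective is a relaxation of the normalized cut; the Gaussian similarity on time and log-frequency below is a simplified stand-in for the paper's CASA-inspired cues:

```python
# Sketch of the grouping step: sinusoidal peaks (time, log-frequency) are
# nodes of a similarity graph, partitioned with spectral clustering, a
# relaxation of the normalized cut used in the paper. The similarity
# function is a simplified stand-in for the paper's perceptual cues.
import numpy as np
from sklearn.cluster import SpectralClustering

def group_peaks(peaks, n_sources):
    """peaks: (n, 2) array of (weighted time, log2 frequency) per peak."""
    sc = SpectralClustering(
        n_clusters=n_sources,
        affinity="rbf",        # Gaussian similarity on the peak features
        gamma=5.0,             # kernel width, tuned per representation
        assign_labels="kmeans",
        random_state=0,
    )
    return sc.fit_predict(peaks)

# Two synthetic "sources": peaks around 220 Hz and 440 Hz.
rng = np.random.default_rng(2)
t = rng.uniform(0, 1, 200)
f = np.where(rng.random(200) < 0.5, np.log2(220), np.log2(440))
f += rng.normal(0, 0.02, 200)
# Time is weighted down so the frequency cue dominates the partition.
labels = group_peaks(np.column_stack([0.1 * t, f]), n_sources=2)
print(np.bincount(labels))
```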

Proceedings ArticleDOI
25 Mar 2012
TL;DR: A novel application of genetic motif discovery to symbolic sequence representations of sound is proposed for audio event detection; recurrent, structurally salient sound events are found in an unsupervised and query-less manner, and the discovered motifs can be interpreted as statistical temporal models of spectral evolution.
Abstract: We introduce a novel application of genetic motif discovery in symbolic sequence representations of sound for audio event detection. Sounds are represented as a set of parallel symbolic sequences, each symbol representing a spectral shape, and each layer indicating the contribution weights of each spectral shape to the sound. Such layered symbolic representations are input to a genetic motif discovery algorithm that detects and clusters recurrent and structurally salient sound events in an unsupervised and query-less manner. The discovered motifs can be interpreted as statistical temporal models of spectral evolution. The system is successfully evaluated in two tasks: environmental sound event detection, and drum onset detection.

28 citations
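The symbolization stage feeding the motif discovery can be sketched with NMF: the spectrogram is factored into spectral shapes and activations, and each component's activation is quantized into a small alphabet, yielding the parallel symbolic layers. The genetic motif discovery itself is replaced here by a naive recurrent-substring count, purely for illustration:

```python
# Sketch of the symbolization step: NMF factors the spectrogram into
# spectral shapes and activations; each component's activation is
# quantized into a 3-letter alphabet, giving parallel symbolic layers.
# The genetic motif discovery is replaced by a naive substring count.
import numpy as np
from sklearn.decomposition import NMF
from collections import Counter

def symbolize(spectrogram, n_shapes=4, levels="abc"):
    nmf = NMF(n_components=n_shapes, init="nndsvda", max_iter=400)
    H = nmf.fit_transform(spectrogram.T).T     # (n_shapes, n_frames)
    edges = np.quantile(H, [1/3, 2/3], axis=1) # per-layer tercile thresholds
    layers = []
    for h, (lo, hi) in zip(H, edges.T):
        layers.append("".join(levels[int(v > lo) + int(v > hi)] for v in h))
    return layers  # one symbol string per spectral shape

def recurrent_motifs(layer, length=4, min_count=2):
    counts = Counter(layer[i:i + length] for i in range(len(layer) - length + 1))
    return [(m, c) for m, c in counts.most_common() if c >= min_count]

rng = np.random.default_rng(3)
S = np.abs(rng.standard_normal((128, 300)))  # stand-in magnitude spectrogram
layers = symbolize(S)
print(recurrent_motifs(layers[0])[:3])
```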


Cited by

01 Jan 2005
TL;DR: In this article, a general technique called Bubbles is proposed to assign the credit of human categorization performance to specific visual information; it is illustrated on three face categorization tasks (gender, expressive or not, and identity).
Abstract: Everyday, people flexibly perform different categorizations of common faces, objects and scenes. Intuition and scattered evidence suggest that these categorizations require the use of different visual information from the input. However, there is no unifying method, based on the categorization performance of subjects, that can isolate the information used. To this end, we developed Bubbles, a general technique that can assign the credit of human categorization performance to specific visual information. To illustrate the technique, we applied Bubbles on three categorization tasks (gender, expressive or not and identity) on the same set of faces, with human and ideal observers to compare the features they used.

623 citations
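The Bubbles logic is easy to simulate: each trial reveals a stimulus through random Gaussian apertures, and relating aperture locations to response accuracy maps the image regions that drive the categorization. Everything below (image size, the simulated observer, the assumed diagnostic region) is an illustrative assumption:

```python
# Toy sketch of the Bubbles analysis: random Gaussian apertures reveal
# parts of an image; visibility on correct trials, normalized by total
# visibility, highlights the regions driving performance.
import numpy as np

rng = np.random.default_rng(4)
SIZE, N_BUBBLES, SIGMA, N_TRIALS = 64, 10, 4.0, 1000
yy, xx = np.mgrid[0:SIZE, 0:SIZE]

def bubble_mask():
    mask = np.zeros((SIZE, SIZE))
    for _ in range(N_BUBBLES):
        cy, cx = rng.integers(0, SIZE, 2)
        mask += np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * SIGMA**2))
    return np.clip(mask, 0, 1)

# Hypothetical diagnostic region (e.g., the "eyes") the task depends on.
diagnostic = ((yy - 20) ** 2 + (xx - 32) ** 2) < 8**2

correct_sum = np.zeros((SIZE, SIZE))
mask_sum = np.zeros((SIZE, SIZE))
for _ in range(N_TRIALS):
    m = bubble_mask()
    # Simulated observer: more likely correct when the diagnostic
    # region is visible through the bubbles.
    p_correct = 0.5 + 0.5 * m[diagnostic].mean()
    if rng.random() < p_correct:
        correct_sum += m
    mask_sum += m

# Visibility on correct trials relative to overall visibility; the
# diagnostic region should score higher on average.
diagnostic_map = correct_sum / np.maximum(mask_sum, 1e-9)
print(diagnostic_map[20, 32], diagnostic_map[50, 50])
```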

Journal ArticleDOI
01 Dec 2013
TL;DR: Limitations of current transcription methods are analysed and promising directions for future research are identified, including the integration of information from multiple algorithms and different musical aspects.
Abstract: Automatic music transcription is considered by many to be a key enabling technology in music signal processing. However, the performance of transcription systems is still significantly below that of a human expert, and accuracies reported in recent years seem to have reached a limit, although the field is still very active. In this paper we analyse limitations of current methods and identify promising directions for future research. Current transcription methods use general purpose models which are unable to capture the rich diversity found in music signals. One way to overcome the limited performance of transcription systems is to tailor algorithms to specific use-cases. Semi-automatic approaches are another way of achieving a more reliable transcription. Also, the wealth of musical scores and corresponding audio data now available are a rich potential source of training data, via forced alignment of audio to scores, but large scale utilisation of such data has yet to be attempted. Other promising approaches include the integration of information from multiple algorithms and different musical aspects.

298 citations

Journal ArticleDOI
TL;DR: Bregman argues that there are two kinds of principle for auditory grouping and segregation, schema-based and primitive, and provides a comprehensive review and interpretation of perceptual experiments up to about 1989, so his book pre-dates recent attempts to implement auditory grouping principles in computational models.
Abstract: The world is full of sources of sound. As I write this review, I can hear the humming of the word processor, the creaking of a door in the wind, the distant rumble of an aeroplane, the passage of a car close by, a bird twittering, my neighbour talking on his doorstep, music from his son’s hi-fi, and someone speaking on the radio in the next room. Although each source generates a particular pattern of changes in air-pressure, the changes have summed together by the time they reach my ears, yet I perceive each source distinctly. What principles of perceptual grouping and segregation do listeners use to partition such mixtures of sound? Which principles are applied automatically to all sounds? Which are specialized for particular classes of sound, such as speech? In what ways have the principles been exploited in musical composition? These are the major concerns of this lengthy, scholarly, but readable book. Bregman’s approach is functional not physiological, empirical not computational. He provides a comprehensive review and interpretation of perceptual experiments up to about 1989, so his book pre-dates recent attempts to implement auditory grouping principles in computational models and to find a physiological substrate for them. One important distinction is sustained throughout the book. Bregman argues that there are two kinds of principle for auditory grouping and segregation: “schema-based” and “primitive”. Schema-based principles are specific to particular types of source. They are learnt by listeners, and their application is under attentional control. One example may be the use of the knowledge of the timbre of an instrument to follow its part in an ensemble. Another example may be the use of phonetic knowledge to integrate acoustic cues in speech perception. Primitive grouping principles, in contrast, are innate, learnt through evolution. They automatically exploit fundamental physical properties of sounds and sound sources. For example: the sizes of resonators generally change slowly; they often generate energy simultaneously over a wide frequency range; when they vibrate, they create energy at the discrete

273 citations