Institution

IRCAM

Education · Paris, France
About: IRCAM is an education organization based in Paris, France. It is known for its research contributions in the topics of Timbre and Audio signal processing. The organization has 313 authors who have published 754 publications receiving 18,031 citations.


Papers
Journal Article
TL;DR: An algorithm is presented for the estimation of the fundamental frequency (F0) of speech or musical sounds, based on the well-known autocorrelation method with a number of modifications that combine to prevent errors.
Abstract: An algorithm is presented for the estimation of the fundamental frequency (F0) of speech or musical sounds. It is based on the well-known autocorrelation method with a number of modifications that combine to prevent errors. The algorithm has several desirable features. Error rates are about three times lower than the best competing methods, as evaluated over a database of speech recorded together with a laryngograph signal. There is no upper limit on the frequency search range, so the algorithm is suited for high-pitched voices and music. The algorithm is relatively simple and may be implemented efficiently and with low latency, and it involves few parameters that must be tuned. It is based on a signal model (periodic signal) that may be extended in several ways to handle various forms of aperiodicity that occur in particular applications. Finally, interesting parallels may be drawn with models of auditory processing.
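For a concrete feel for the autocorrelation family of methods this paper builds on, here is a minimal Python sketch using a cumulative mean normalized difference function; it is a simplified illustration under assumed parameters (frame length, search range, threshold), not the authors' exact algorithm.

```python
import numpy as np

def estimate_f0(frame, sr, fmin=60.0, fmax=1000.0, threshold=0.1):
    """Rough F0 estimate for one audio frame via a cumulative mean
    normalized difference function (simplified, YIN-style sketch)."""
    tau_min = int(sr / fmax)
    tau_max = int(sr / fmin)
    # Difference function d(tau) = sum_t (x[t] - x[t + tau])^2
    d = np.array([np.sum((frame[:-tau] - frame[tau:]) ** 2)
                  for tau in range(1, tau_max + 1)])
    # Cumulative mean normalized difference function
    cmnd = d * np.arange(1, tau_max + 1) / np.cumsum(d)
    below = np.where(cmnd[tau_min:] < threshold)[0]
    if below.size:
        i = below[0] + tau_min
        while i + 1 < len(cmnd) and cmnd[i + 1] < cmnd[i]:
            i += 1            # walk down to the local minimum of this dip
    else:
        i = int(np.argmin(cmnd[tau_min:])) + tau_min
    return sr / (i + 1)       # lag index i corresponds to period tau = i + 1

# A 440 Hz sine should yield an estimate close to 440 Hz
sr = 44100
t = np.arange(2048) / sr
print(estimate_f0(np.sin(2 * np.pi * 440 * t), sr))
```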

1,975 citations

Journal Article
Philippe Esling, Carlos Agon
TL;DR: A survey of the techniques applied for time-series data mining, namely representation techniques, distance measures, and indexing methods, is provided.
Abstract: In almost every scientific field, measurements are performed over time. These observations lead to a collection of organized data called time series. The purpose of time-series data mining is to extract all meaningful knowledge from the shape of the data. Although humans have a natural capacity to perform these tasks, they remain a complex problem for computers. In this article we provide a survey of the techniques applied for time-series data mining. The first part is devoted to an overview of the tasks that have captured most of the interest of researchers. Considering that in most cases time-series tasks rely on the same components for implementation, we divide the literature according to these common aspects, namely representation techniques, distance measures, and indexing methods. The relevant literature is categorized for each of these aspects. Four types of robustness are then formalized, and any kind of distance can be classified accordingly. Finally, the study presents various research trends and avenues that can be explored in the near future. We hope that this article can provide a broad and deep understanding of the time-series data mining research field.
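As a concrete example of the distance-measure component surveyed above, the following Python sketch implements classic dynamic time warping (DTW), one of the standard elastic distance measures in this literature; it is an illustration, not code from the article.

```python
import numpy as np

def dtw_distance(x, y):
    """Classic O(n*m) dynamic time warping distance between two 1-D series."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# Two series with the same shape but different lengths and phases stay close under DTW
a = np.sin(np.linspace(0, 2 * np.pi, 50))
b = np.sin(np.linspace(0, 2 * np.pi, 60) + 0.3)
print(dtw_distance(a, b))
```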

762 citations

Journal Article
TL;DR: The model with latent classes and specificities gave a better fit to the data and made the acoustic correlates of the common dimensions more interpretable, suggesting that musical timbres possess specific attributes not accounted for by these shared perceptual dimensions.
Abstract: To study the perceptual structure of musical timbre and the effects of musical training, timbral dissimilarities of synthesized instrument sounds were rated by professional musicians, amateur musicians, and nonmusicians. The data were analyzed with an extended version of the multidimensional scaling algorithm CLASCAL (Winsberg & De Soete, 1993), which estimates the number of latent classes of subjects, the coordinates of each timbre on common Euclidean dimensions, a specificity value of unique attributes for each timbre, and a separate weight for each latent class on each of the common dimensions and the set of specificities. Five latent classes were found for a three-dimensional spatial model with specificities. Common dimensions were quantified psychophysically in terms of log-rise time, spectral centroid, and degree of spectral variation. The results further suggest that musical timbres possess specific attributes not accounted for by these shared perceptual dimensions. Weight patterns indicate that the perceptual salience of dimensions and specificities varied across classes. Biographical factors associated with degree of musical training and activity were not clearly related to the class structure, though musicians gave more precise and coherent judgments than did nonmusicians or amateurs. The model with latent classes and specificities gave a better fit to the data and made the acoustic correlates of the common dimensions more interpretable.
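CLASCAL itself estimates latent classes, class weights, and specificities; as a much simpler illustration of how a dissimilarity matrix can be embedded on common Euclidean dimensions, here is a Python sketch of classical (Torgerson) metric MDS, with random dissimilarities standing in for actual ratings.

```python
import numpy as np

def classical_mds(D, n_dims=3):
    """Classical (Torgerson) MDS: embed items in n_dims Euclidean dimensions
    from a symmetric dissimilarity matrix D. A far simpler relative of
    CLASCAL (no latent classes, weights, or specificities)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared dissimilarities
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_dims]   # keep the largest eigenvalues
    scale = np.sqrt(np.clip(eigvals[order], 0, None))
    return eigvecs[:, order] * scale             # coordinates on the common dimensions

# Example with random symmetric dissimilarities among 18 hypothetical timbres
rng = np.random.default_rng(0)
D = rng.random((18, 18))
D = (D + D.T) / 2
np.fill_diagonal(D, 0.0)
print(classical_mds(D, n_dims=3).shape)          # (18, 3)
```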

599 citations

Journal Article
TL;DR: A three-dimensional space with arousal and emotional valence as the primary dimensions was found to provide a good fit to the data; emotional responses to music were very stable within and between participants and only weakly influenced by musical expertise and excerpt duration.
Abstract: Musically trained and untrained listeners were required to listen to 27 musical excerpts and to group those that conveyed a similar emotional meaning (Experiment 1). The groupings were transformed into a matrix of emotional dissimilarity that was analysed through multidimensional scaling methods (MDS). A 3-dimensional space was found to provide a good fit to the data, with arousal and emotional valence as the primary dimensions. Experiments 2 and 3 confirmed the consistency of this 3-dimensional space using excerpts of only 1-second duration. The overall findings indicate that emotional responses to music are very stable within and between participants, and are only weakly influenced by musical expertise and excerpt duration. These findings are discussed in light of a cognitive account of musical emotion.
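As an illustration of the first analysis step described above (turning free groupings into a dissimilarity matrix), the following Python sketch counts how often two excerpts are grouped together across listeners; it is a generic reconstruction under assumed conventions, not the authors' exact procedure. The resulting matrix could then be fed to an MDS routine such as the one sketched earlier.

```python
import numpy as np

def grouping_dissimilarity(groupings, n_items):
    """Dissimilarity from free-grouping data: excerpts grouped together by a
    listener count as similar, so dissimilarity = 1 - co-occurrence proportion.
    A generic reconstruction, not the authors' exact procedure."""
    co = np.zeros((n_items, n_items))
    for groups in groupings:          # one listener's partition of the excerpts
        for group in groups:
            for i in group:
                for j in group:
                    co[i, j] += 1
    co /= len(groupings)              # proportion of listeners grouping i with j
    D = 1.0 - co
    np.fill_diagonal(D, 0.0)
    return D

# Hypothetical example: two listeners partitioning 5 excerpts
listeners = [
    [[0, 1], [2, 3, 4]],
    [[0, 1, 2], [3, 4]],
]
print(grouping_dissimilarity(listeners, n_items=5))
```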

361 citations

Journal Article
TL;DR: This analysis suggests ten classes of relatively independent audio descriptors, showing that the Timbre Toolbox is a multidimensional instrument for the measurement of the acoustical structure of complex sound signals.
Abstract: The analysis of musical signals to extract audio descriptors that can potentially characterize their timbre has been disparate and often too focused on a particular small set of sounds. The Timbre Toolbox provides a comprehensive set of descriptors that can be useful in perceptual research, as well as in music information retrieval and machine-learning approaches to content-based retrieval in large sound databases. Sound events are first analyzed in terms of various input representations (short-term Fourier transform, harmonic sinusoidal components, an auditory model based on the equivalent rectangular bandwidth concept, the energy envelope). A large number of audio descriptors are then derived from each of these representations to capture temporal, spectral, spectrotemporal, and energetic properties of the sound events. Some descriptors are global, providing a single value for the whole sound event, whereas others are time-varying. Robust descriptive statistics are used to characterize the time-varying descriptors. To examine the information redundancy across audio descriptors, correlational analysis followed by hierarchical clustering is performed. This analysis suggests ten classes of relatively independent audio descriptors, showing that the Timbre Toolbox is a multidimensional instrument for the measurement of the acoustical structure of complex sound signals.
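To give a concrete sense of the kind of descriptor involved, here is a minimal Python sketch of one spectral descriptor (a time-varying spectral centroid computed from a short-term Fourier transform) summarized with robust statistics (median and interquartile range); it is only an analogous example, not the Timbre Toolbox's own implementation, and the frame and hop sizes are assumptions.

```python
import numpy as np

def spectral_centroid_series(x, sr, frame=2048, hop=512):
    """Time-varying spectral centroid from an STFT, one descriptor in the
    spirit of the Timbre Toolbox (sketch only)."""
    window = np.hanning(frame)
    freqs = np.fft.rfftfreq(frame, 1.0 / sr)
    centroids = []
    for start in range(0, len(x) - frame, hop):
        mag = np.abs(np.fft.rfft(x[start:start + frame] * window))
        if mag.sum() > 0:
            centroids.append(np.sum(freqs * mag) / np.sum(mag))
    return np.array(centroids)

def robust_summary(series):
    """Robust statistics (median, interquartile range) summarizing a
    time-varying descriptor over the whole sound event."""
    q25, q75 = np.percentile(series, [25, 75])
    return {"median": float(np.median(series)), "iqr": float(q75 - q25)}

# Example on a short synthetic tone (440 Hz plus a weaker octave)
sr = 44100
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 880 * t)
print(robust_summary(spectral_centroid_series(x, sr)))
```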

309 citations


Authors

319 author results listed.

Network Information
Related Institutions (5)
Google: 39.8K papers, 2.1M citations (75% related)
Facebook: 10.9K papers, 570.1K citations (74% related)
Microsoft: 86.9K papers, 4.1M citations (74% related)
Carnegie Mellon University: 104.3K papers, 5.9M citations (73% related)

Performance Metrics
No. of papers from the Institution in previous years:

Year    Papers
2023    2
2022    12
2021    15
2020    26
2019    39
2018    25