Author

Slim Essid

Bio: Slim Essid is an academic researcher from Télécom ParisTech. The author has contributed to research topics including non-negative matrix factorization and feature learning. The author has an h-index of 25 and has co-authored 128 publications receiving 2,081 citations. Previous affiliations of Slim Essid include Nanyang Technological University and the University of Paris.


Papers
Proceedings ArticleDOI
09 Aug 2010
TL;DR: In this paper, a new audio feature extraction software package, YAAFE, is presented and compared to widely used libraries; its main advantage is a significantly lower complexity due to the appropriate exploitation of redundancy in the feature calculation.
Abstract: Music Information Retrieval systems are commonly built on a feature extraction stage. For applications involving automatic classification (e.g. speech/music discrimination, music genre or mood recognition, ...), traditional approaches will consider a large set of audio features to be extracted on a large dataset. In some cases, this will lead to computationally intensive systems and there is, therefore, a strong need for efficient feature extraction. In this paper, a new audio feature extraction software package, YAAFE, is presented and compared to widely used libraries. The main advantage of YAAFE is a significantly lower complexity due to the appropriate exploitation of redundancy in the feature calculation. YAAFE remains easy to configure and each feature can be parameterized independently. Finally, the YAAFE framework and most of its core feature library are released in source code under the GNU Lesser General Public License.

175 citations
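The efficiency argument here is about sharing intermediate computations across features rather than recomputing them per descriptor. The following is a minimal sketch of that idea in plain NumPy, not the YAAFE API: a single magnitude spectrogram is computed once and three spectral descriptors are derived from it. The feature definitions and parameters are simplified placeholders.

```python
import numpy as np

def magnitude_spectrogram(x, block=1024, step=512):
    """Compute |STFT| once; every feature below reuses it."""
    frames = np.lib.stride_tricks.sliding_window_view(x, block)[::step]
    win = np.hanning(block)
    return np.abs(np.fft.rfft(frames * win, axis=1))

def spectral_centroid(mag, sr):
    freqs = np.fft.rfftfreq((mag.shape[1] - 1) * 2, d=1.0 / sr)
    return (mag * freqs).sum(axis=1) / (mag.sum(axis=1) + 1e-12)

def spectral_rolloff(mag, sr, pct=0.95):
    cum = np.cumsum(mag, axis=1)
    idx = (cum >= pct * cum[:, -1:]).argmax(axis=1)
    return idx * sr / ((mag.shape[1] - 1) * 2)   # bin index -> frequency in Hz

def spectral_flux(mag):
    d = np.diff(mag, axis=0, prepend=mag[:1])
    return np.sqrt((d ** 2).sum(axis=1))

# One spectrogram pass, three features derived from it.
sr = 22050
x = np.random.randn(sr)            # stand-in for one second of audio
mag = magnitude_spectrogram(x)
features = {
    "centroid": spectral_centroid(mag, sr),
    "rolloff": spectral_rolloff(mag, sr),
    "flux": spectral_flux(mag),
}
```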

Journal ArticleDOI
TL;DR: This study focuses on a single music genre but combines a variety of instruments among which are percussion and singing voice, and obtains a taxonomy of musical ensembles which is used to efficiently classify possible combinations of instruments played simultaneously.
Abstract: We propose a new approach to instrument recognition in the context of real music orchestrations ranging from solos to quartets. The strength of our approach is that it does not require prior musical source separation. Thanks to a hierarchical clustering algorithm exploiting robust probabilistic distances, we obtain a taxonomy of musical ensembles which is used to efficiently classify possible combinations of instruments played simultaneously. Moreover, a wide set of acoustic features is studied including some new proposals. In particular, signal to mask ratios are found to be useful features for audio classification. This study focuses on a single music genre (i.e., jazz) but combines a variety of instruments among which are percussion and singing voice. Using a varied database of sound excerpts from commercial recordings, we show that the segmentation of music with respect to the instruments played can be achieved with an average accuracy of 53%.

133 citations
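As a rough illustration of the clustering step described above, the sketch below models each labelled ensemble with a diagonal-covariance Gaussian over its frame features, measures pairwise dissimilarity with a symmetrised Kullback-Leibler divergence, and builds an agglomerative tree from the resulting distance matrix. The distance, the Gaussian assumption, the average linkage and the toy data are illustrative choices, not the paper's exact probabilistic distances or features.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform

def sym_kl_diag_gauss(m1, v1, m2, v2):
    """Symmetrised KL divergence between two diagonal-covariance Gaussians."""
    kl12 = 0.5 * np.sum(np.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)
    kl21 = 0.5 * np.sum(np.log(v1 / v2) + (v2 + (m1 - m2) ** 2) / v1 - 1.0)
    return kl12 + kl21

# Toy feature sets: one (n_frames, n_features) array per ensemble class.
rng = np.random.default_rng(0)
names = ["piano", "piano+bass", "trio", "quartet"]
ensembles = {n: rng.normal(loc=i, size=(200, 8)) for i, n in enumerate(names)}
stats = {n: (X.mean(axis=0), X.var(axis=0) + 1e-6) for n, X in ensembles.items()}

# Pairwise probabilistic distances -> condensed matrix -> agglomerative tree.
D = np.zeros((len(names), len(names)))
for i, a in enumerate(names):
    for j, b in enumerate(names):
        if j > i:
            D[i, j] = D[j, i] = sym_kl_diag_gauss(*stats[a], *stats[b])

tree = linkage(squareform(D), method="average")
print(dendrogram(tree, labels=names, no_plot=True)["ivl"])  # leaf order of the taxonomy
```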

Journal ArticleDOI
C. Joder, Slim Essid, Gael Richard
TL;DR: A number of methods for early and late temporal integration are proposed and an in-depth experimental study on their interest for the task of musical instrument recognition on solo musical phrases is provided.
Abstract: Nowadays, it appears essential to design automatic indexing tools which provide meaningful and efficient means to describe the musical audio content. There is in fact a growing interest in music information retrieval (MIR) applications, amongst which the most popular are related to music similarity retrieval, artist identification, and musical genre or instrument recognition. Current MIR-related classification systems usually do not take into account the mid-term temporal properties of the signal (over several frames) and rely on the assumption that the observations of the features in different frames are statistically independent. The aim of this paper is to demonstrate the usefulness of the information carried by the evolution of these characteristics over time. To that purpose, we propose a number of methods for early and late temporal integration and provide an in-depth experimental study of their interest for the task of musical instrument recognition on solo musical phrases. In particular, the impact of the time horizon over which the temporal integration is performed is assessed both for fixed and variable frame-length analysis. Also, a number of proposed alignment kernels are used for late temporal integration. For all experiments, the results are compared to a state-of-the-art musical instrument recognition system.

129 citations
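To make the early/late distinction concrete: early integration summarises frame-level features over a longer texture window before classification, whereas late integration lets a classifier decide per frame and then fuses the frame-level outputs. The sketch below uses simple stand-ins (mean/variance statistics and posterior averaging) on toy data; the alignment kernels proposed in the paper are not reproduced.

```python
import numpy as np

def early_integration(frame_feats, win=40, hop=20):
    """Stack mean and variance of frame features over each texture window."""
    segs = []
    for start in range(0, len(frame_feats) - win + 1, hop):
        w = frame_feats[start:start + win]
        segs.append(np.concatenate([w.mean(axis=0), w.var(axis=0)]))
    return np.asarray(segs)          # one integrated vector per window

def late_integration(frame_posteriors):
    """Combine per-frame class posteriors by averaging, then decide once."""
    return frame_posteriors.mean(axis=0).argmax()

# Toy data: 200 frames of 13-dimensional features, 5 candidate instruments.
rng = np.random.default_rng(1)
feats = rng.normal(size=(200, 13))
early_vectors = early_integration(feats)        # feed these to any classifier
posteriors = rng.dirichlet(np.ones(5), size=200)
decision = late_integration(posteriors)         # single label for the phrase
```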

Journal ArticleDOI
TL;DR: It is shown that higher recognition rates can be reached with pairwise optimized subsets of features in association with SVM classification using a radial basis function kernel.
Abstract: Musical instrument recognition is an important aspect of music information retrieval. In this paper, statistical pattern recognition techniques are utilized to tackle the problem in the context of solo musical phrases. Ten instrument classes from different instrument families are considered. A large sound database is collected from excerpts of musical phrases acquired from commercial recordings, covering different instrument instances, performers, and recording conditions. More than 150 signal processing features are studied, including new descriptors. Two feature selection techniques, inertia ratio maximization with feature space projection and genetic algorithms, are considered in a class-pairwise manner whereby the most relevant features are selected for each instrument pair. For the classification task, experimental results are provided using Gaussian mixture models (GMMs) and support vector machines (SVMs). It is shown that higher recognition rates can be reached with pairwise optimized subsets of features in association with SVM classification using a radial basis function kernel.

111 citations
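A hedged sketch of the class-pairwise strategy using scikit-learn stand-ins: for every pair of instrument classes, keep the k features that best separate that pair (here a simple ANOVA-based SelectKBest instead of the paper's inertia ratio maximization or genetic algorithm) and fit an RBF-kernel SVM on them; a majority vote over all pairwise classifiers gives the final label. The feature dimensionality, k, and the random data are placeholders.

```python
import numpy as np
from itertools import combinations
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(size=(600, 150))                  # 150 candidate descriptors
y = rng.integers(0, 10, size=600)                # 10 instrument classes

# One (feature selector, RBF-SVM) pair per instrument pair.
pairwise = {}
for a, b in combinations(range(10), 2):
    mask = np.isin(y, [a, b])
    selector = SelectKBest(f_classif, k=20).fit(X[mask], y[mask])
    clf = SVC(kernel="rbf", gamma="scale").fit(selector.transform(X[mask]), y[mask])
    pairwise[(a, b)] = (selector, clf)

def predict(x):
    """Majority vote over all one-vs-one classifiers with pair-specific features."""
    votes = np.zeros(10, dtype=int)
    for (a, b), (selector, clf) in pairwise.items():
        votes[clf.predict(selector.transform(x.reshape(1, -1)))[0]] += 1
    return votes.argmax()

print(predict(X[0]))
```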

Journal ArticleDOI
TL;DR: It is shown that the unsupervised learning methods provide better representations of acoustic scenes than the best conventional hand-crafted features on both datasets, and that the introduction of a novel nonnegative supervised matrix factorization model and deep neural networks trained on spectrograms allows for further improvements.
Abstract: In this paper, we study the usefulness of various matrix factorization methods for learning features to be used for the specific acoustic scene classification (ASC) problem. A common way of addressing ASC has been to engineer features capable of capturing the specificities of acoustic environments. Instead, we show that better representations of the scenes can be automatically learned from time-frequency representations using matrix factorization techniques. We mainly focus on extensions including sparse, kernel-based, convolutive and a novel supervised dictionary learning variant of principal component analysis and nonnegative matrix factorization. An experimental evaluation is performed on two of the largest ASC datasets available in order to compare and discuss the usefulness of these methods for the task. We show that the unsupervised learning methods provide better representations of acoustic scenes than the best conventional hand-crafted features on both datasets. Furthermore, the introduction of a novel nonnegative supervised matrix factorization model and deep neural networks trained on spectrograms allows us to reach further improvements.

93 citations
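A minimal sketch of the unsupervised feature-learning pipeline with scikit-learn's NMF, one of the factorization variants the paper surveys: learn a nonnegative dictionary from the pooled time-frequency frames of the training recordings, encode each recording as the averaged activations of that dictionary, and feed the pooled codes to a standard classifier. The dimensions, pooling choice, and toy data are illustrative, and the supervised and convolutive variants are not shown.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
# Toy corpus: 50 recordings, each a (frames x frequency-bins) magnitude spectrogram.
spectrograms = [np.abs(rng.normal(size=(120, 257))) for _ in range(50)]
labels = rng.integers(0, 4, size=50)             # 4 acoustic scene classes

# Learn one nonnegative dictionary from all training frames pooled together.
nmf = NMF(n_components=32, init="nndsvda", max_iter=300)
nmf.fit(np.vstack(spectrograms))

# Encode each recording as the average activation of the learned components.
codes = np.array([nmf.transform(S).mean(axis=0) for S in spectrograms])

clf = LogisticRegression(max_iter=1000).fit(codes, labels)
print(clf.score(codes, labels))                  # training accuracy on toy data
```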


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: Probability distributions and linear models for regression and classification are covered in this book, along with neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: It is shown that the improved performance stems from the combination of a deep, high-capacity model and an augmented training set: this combination outperforms both the proposed CNN without augmentation and a “shallow” dictionary learning model with augmentation.
Abstract: The ability of deep convolutional neural networks (CNNs) to learn discriminative spectro-temporal patterns makes them well suited to environmental sound classification. However, the relative scarcity of labeled data has impeded the exploitation of this family of high-capacity models. This study has two primary contributions: first, we propose a deep CNN architecture for environmental sound classification. Second, we propose the use of audio data augmentation for overcoming the problem of data scarcity and explore the influence of different augmentations on the performance of the proposed CNN architecture. Combined with data augmentation, the proposed model produces state-of-the-art results for environmental sound classification. We show that the improved performance stems from the combination of a deep, high-capacity model and an augmented training set: this combination outperforms both the proposed CNN without augmentation and a “shallow” dictionary learning model with augmentation. Finally, we examine the influence of each augmentation on the model's classification accuracy for each class, and observe that the accuracy for each class is influenced differently by each augmentation, suggesting that the performance of the model could be improved further by applying class-conditional data augmentation.

996 citations
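A hedged sketch of the two ingredients combined in this paper, waveform-level augmentation and a convolutional network over log-mel patches, using librosa and PyTorch; the augmentation parameters, mel/patch sizes, and layer sizes are placeholders rather than the architecture or deformation settings from the paper.

```python
import numpy as np
import librosa
import torch
import torch.nn as nn

def augment(y, sr):
    """Waveform-level augmentations of the kind studied: stretch, pitch, noise."""
    return [
        librosa.effects.time_stretch(y, rate=1.2),
        librosa.effects.pitch_shift(y, sr=sr, n_steps=2),
        y + 0.005 * np.random.randn(len(y)),
    ]

def logmel(y, sr, n_mels=64, n_frames=128):
    """Fixed-size log-mel patch (cropped or zero-padded along time)."""
    S = librosa.power_to_db(librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels))
    S = S[:, :n_frames]
    if S.shape[1] < n_frames:
        S = np.pad(S, ((0, 0), (0, n_frames - S.shape[1])))
    return torch.tensor(S, dtype=torch.float32)

# Small CNN over log-mel patches (a placeholder, not the paper's exact layers).
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),   # 10 sound classes
)

sr = 22050
y = np.random.randn(2 * sr).astype(np.float32)   # stand-in for one labelled clip
batch = torch.stack([logmel(a, sr).unsqueeze(0) for a in [y] + augment(y, sr)])
logits = cnn(batch)                              # original + augmented copies -> class scores
```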

Book ChapterDOI
01 Jan 1998

885 citations
