
Antonio Bonafonte

Researcher at Polytechnic University of Catalonia

Publications: 131
Citations: 4,409

Antonio Bonafonte is an academic researcher from the Polytechnic University of Catalonia. He has contributed to research in speech synthesis and speech processing, has an h-index of 27, and has co-authored 128 publications receiving 3,823 citations. His previous affiliations include Amazon.com and the University of Southern California.

Papers
Proceedings ArticleDOI

SEGAN: Speech Enhancement Generative Adversarial Network

TL;DR: This work proposes the use of generative adversarial networks for speech enhancement. The model operates at the waveform level and is trained end-to-end, incorporating 28 speakers and 40 different noise conditions into the same model so that model parameters are shared across them.
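The adversarial setup the abstract describes can be sketched in miniature: a generator maps a noisy waveform to an enhanced one, and a discriminator scores (waveform, noisy-conditioning) pairs. This is a toy numpy illustration of the training signal only; the single linear layers, frame length, and variable names are illustrative stand-ins, not SEGAN's convolutional encoder-decoder architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(noisy, W):
    # one linear layer standing in for SEGAN's conv encoder-decoder;
    # maps a noisy waveform frame to an enhanced frame
    return np.tanh(noisy @ W)

def discriminator(wave, cond, V):
    # scores a (waveform, noisy conditioning) pair, as in conditional GANs;
    # sigmoid output in (0, 1): "real clean pair" vs "generated pair"
    return 1.0 / (1.0 + np.exp(-(np.concatenate([wave, cond]) @ V)))

T = 16                                 # waveform frame length (toy value)
W = rng.normal(0.0, 0.1, (T, T))       # generator parameters
V = rng.normal(0.0, 0.1, 2 * T)        # discriminator parameters

clean = np.sin(np.linspace(0.0, 3.0, T))
noisy = clean + 0.3 * rng.normal(size=T)

enhanced = generator(noisy, W)
d_real = discriminator(clean, noisy, V)     # discriminator sees the real pair
d_fake = discriminator(enhanced, noisy, V)  # and the generator's output
```

In the full system the generator would be updated to push `d_fake` toward "real" while the discriminator learns to separate the two; sharing one model across all speakers and noise types is what lets its parameters generalize across conditions.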
Proceedings Article

Speech emotion recognition using hidden Markov models

TL;DR: This paper introduces a first approach to emotion recognition using RAMSES, the UPC's speech recognition system, based on standard speech recognition technology using semi-continuous hidden Markov models.
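The standard HMM-based recognition recipe behind this kind of system is to train one model per emotion and pick the model with the highest likelihood for an utterance. A minimal numpy sketch with discrete toy observations (the two-state models, their parameters, and the emotion labels here are all illustrative, not RAMSES's actual semi-continuous models):

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    # forward algorithm with per-step normalization:
    # returns log P(obs | HMM) for a discrete-observation HMM
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()
    return ll

# one toy 2-state HMM per emotion: (initial probs, transitions, emissions)
models = {
    "neutral": (np.array([0.6, 0.4]),
                np.array([[0.7, 0.3], [0.3, 0.7]]),
                np.array([[0.8, 0.2], [0.2, 0.8]])),
    "angry":   (np.array([0.5, 0.5]),
                np.array([[0.4, 0.6], [0.6, 0.4]]),
                np.array([[0.2, 0.8], [0.8, 0.2]])),
}

obs = [0, 0, 1, 0, 0]  # quantized acoustic features (toy sequence)
best = max(models, key=lambda e: forward_loglik(obs, *models[e]))
```

Classification is then simply the argmax over per-emotion likelihoods, exactly as a speech recognizer picks the best word model.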
Proceedings ArticleDOI

Learning Problem-Agnostic Speech Representations from Multiple Self-Supervised Tasks.

TL;DR: This paper proposes an improved self-supervised method in which a single neural encoder is followed by multiple workers that jointly solve different self-supervised tasks; the consensus required across tasks naturally imposes meaningful constraints on the encoder, helping it discover general representations and minimizing the risk of learning superficial ones.
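The "one encoder, many workers" structure can be sketched as a shared feature extractor feeding several task heads whose losses are summed. This toy numpy version uses a single linear-ReLU encoder and two illustrative workers (waveform reconstruction and log-energy prediction); the paper's actual encoder and worker set differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(wave, W):
    # single shared encoder: one linear + ReLU layer stands in
    # for the paper's convolutional feature extractor
    return np.maximum(0.0, wave @ W)

# each worker solves a different self-supervised task on the SAME features;
# these two heads are illustrative examples, not the paper's exact workers
def waveform_worker(z, U):
    return z @ U                      # try to reconstruct the input waveform

def energy_worker(z, u):
    return z @ u                      # try to predict the log frame energy

T, H = 16, 8                          # frame length, feature dim (toy values)
W = rng.normal(0.0, 0.1, (T, H))
U = rng.normal(0.0, 0.1, (H, T))
u = rng.normal(0.0, 0.1, H)

wave = np.sin(np.linspace(0.0, 3.0, T))
z = encoder(wave, W)

losses = {
    "waveform": np.mean((waveform_worker(z, U) - wave) ** 2),
    "energy":   (energy_worker(z, u) - np.log(np.mean(wave ** 2))) ** 2,
}
total_loss = sum(losses.values())     # encoder gradient must serve all tasks
```

Because every worker backpropagates through the same `z`, features that only help one superficial task get penalized by the others, which is the "consensus" constraint the abstract refers to.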
Journal ArticleDOI

Voice Conversion Based on Weighted Frequency Warping

TL;DR: Compared to standard probabilistic systems, Weighted Frequency Warping yields a significant increase in quality scores, while the conversion scores remain almost unaltered.
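The core operation in frequency warping for voice conversion is reading the source spectral envelope at warped frequency positions. A minimal numpy sketch of that step only, with an arbitrary warping curve (the real system derives class-dependent warping functions and weights them probabilistically, which is omitted here):

```python
import numpy as np

def warp_spectrum(spec, warp):
    # for each target frequency bin, read the source spectrum at its
    # warped normalized frequency (linear interpolation between bins)
    f = np.linspace(0.0, 1.0, len(spec))   # normalized frequency axis
    return np.interp(warp, f, spec)

n = 64
spec = np.exp(-np.linspace(0.0, 4.0, n))   # toy source spectral envelope
# illustrative monotonic warping curve; a real warping function would be
# estimated from aligned source/target speaker data
warp = np.linspace(0.0, 1.0, n) ** 1.2
converted = warp_spectrum(spec, warp)
```

Because warping only moves spectral energy along the frequency axis instead of regenerating it from an averaged statistical model, fine spectral detail survives, which is consistent with the quality gain reported above.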
Posted Content

SEGAN: Speech Enhancement Generative Adversarial Network

TL;DR: In this paper, a generative adversarial network (GAN) is proposed for speech enhancement. The model operates at the waveform level and is trained end-to-end, incorporating 28 speakers and 40 different noise conditions into a single model so that model parameters are shared across them.