Roberto Barra-Chicote
Researcher at Amazon.com
Publications - 66
Citations - 984
Roberto Barra-Chicote is an academic researcher at Amazon.com. He has contributed to research topics including speech synthesis and computer science. He has an h-index of 17 and has co-authored 58 publications receiving 707 citations. His previous affiliations include the Technical University of Madrid.
Papers
Proceedings ArticleDOI
Improvements to Prosodic Alignment for Automatic Dubbing
TL;DR: In this paper, the prosodic alignment component of the dubbing architecture is improved. Compared to previous work, the enhanced prosodic alignment significantly improves prosodic accuracy and produces segmentations that are perceptibly better than, or on par with, manually annotated reference segmentations.
Proceedings ArticleDOI
From Speech-to-Speech Translation to Automatic Dubbing
Marcello Federico, Robert Enyedi, Roberto Barra-Chicote, Ritwik Giri, Umut Isik, Arvindh Krishnaswamy, Hassan Sawaf +6 more
TL;DR: In this paper, the authors present enhancements to a speech-to-speech translation pipeline in order to perform automatic dubbing of TED Talks from English into Italian. They measure the perceived naturalness of the automatic dubbing and the relative contribution of each proposed enhancement.
Posted Content
Using VAEs and Normalizing Flows for One-shot Text-To-Speech Synthesis of Expressive Speech.
TL;DR: The proposed text-to-speech method creates an unseen expressive style from a single utterance of expressive speech of around one second. It provides a 22% KL-divergence reduction while jointly improving perceptual metrics over the state of the art.
Book ChapterDOI
Towards Cross-Lingual Emotion Transplantation
TL;DR: The aim is to learn the nuances of emotional speech in a source language for which there is enough data to adapt an emotional model of acceptable quality by means of CSMAPLR adaptation, and then to convert the adaptation function so it can be applied to a target language and a different target speaker, maintaining the speaker identity while adding emotional information.
Proceedings ArticleDOI
Using VAEs and Normalizing Flows for One-Shot Text-To-Speech Synthesis of Expressive Speech
TL;DR: This article proposes a text-to-speech method to create an unseen expressive style using one utterance of expressive speech of around one second. It enhances the disentanglement capabilities of a state-of-the-art sequence-to-sequence system with a VAE and a Householder Flow.
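The Householder Flow mentioned in this TL;DR composes volume-preserving reflections on top of the VAE posterior sample. As a minimal NumPy sketch only (not the authors' implementation; the latent size, number of reflections, and function names here are illustrative assumptions), each reflection H = I - 2vvᵀ/‖v‖² is orthogonal, so it adds flexibility to the posterior at zero log-det-Jacobian cost:

```python
import numpy as np

def householder_flow(z, vs):
    """Apply a sequence of Householder reflections to a latent vector z.

    Each reflection H = I - 2 v v^T / ||v||^2 is orthogonal (|det H| = 1),
    so the flow reshapes the VAE posterior without any log-det-Jacobian term.
    Illustrative sketch; in the paper the vectors would be learned parameters.
    """
    for v in vs:
        v = v / np.linalg.norm(v)          # unit reflection vector
        z = z - 2.0 * v * (v @ z)          # reflect z across the hyperplane
    return z

rng = np.random.default_rng(0)
z = rng.standard_normal(8)                 # sample from the VAE encoder (assumed dim 8)
vs = rng.standard_normal((4, 8))           # 4 reflection vectors (hypothetical)
z_flow = householder_flow(z, vs)
# Orthogonal maps preserve the norm of z.
print(np.allclose(np.linalg.norm(z), np.linalg.norm(z_flow)))
```

Because each reflection is norm-preserving, the printed check holds regardless of the sampled vectors; stacking several reflections is what gives the flow its expressiveness.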