
Ramon Sanabria

Researcher at Carnegie Mellon University

Publications - 40
Citations - 751

Ramon Sanabria is an academic researcher from Carnegie Mellon University. The author has contributed to research on topics including language models and computer science. The author has an h-index of 12 and has co-authored 36 publications receiving 532 citations. Previous affiliations of Ramon Sanabria include the University of Edinburgh.

Papers
Proceedings Article

How2: A Large-scale Dataset for Multimodal Language Understanding

TL;DR: How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations, is introduced, and integrated sequence-to-sequence baselines for machine translation, automatic speech recognition, spoken language translation, and multi-modal summarization are presented.
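To make the baseline setup concrete, here is a minimal sketch of an encoder-decoder sequence-to-sequence model of the kind such baselines build on, assuming PyTorch; this is an illustrative sketch, not the paper's released code, and all class and variable names are hypothetical. Attention over the encoder outputs is omitted for brevity.

```python
# Minimal sequence-to-sequence sketch (illustrative, not the How2 code):
# a bidirectional LSTM encoder over feature sequences and an LSTM
# decoder trained with teacher forcing.
import torch
import torch.nn as nn

class Seq2SeqBaseline(nn.Module):
    def __init__(self, feat_dim, vocab_size, hidden=256):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True,
                               bidirectional=True)
        self.embed = nn.Embedding(vocab_size, hidden)
        self.decoder = nn.LSTM(hidden, 2 * hidden, batch_first=True)
        self.out = nn.Linear(2 * hidden, vocab_size)

    def forward(self, feats, tokens):
        # feats: (batch, time, feat_dim) speech/video/text features
        # tokens: (batch, len) target token ids (teacher forcing)
        enc_out, (h, c) = self.encoder(feats)
        # merge the two directions' final states to init the decoder
        h0 = torch.cat([h[0], h[1]], dim=-1).unsqueeze(0)
        c0 = torch.cat([c[0], c[1]], dim=-1).unsqueeze(0)
        dec_out, _ = self.decoder(self.embed(tokens), (h0, c0))
        return self.out(dec_out)  # (batch, len, vocab_size) logits
```

The same skeleton serves each How2 task by swapping the input features (speech frames, video features, or source-language text) and the target vocabulary.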

The IWSLT 2019 Evaluation Campaign

TL;DR: The IWSLT 2019 evaluation campaign featured three tasks: speech translation of TED talks and How2 instructional videos from English into German and Portuguese, and text translation of TED talks from English into Czech.
Proceedings Article

Sequence-Based Multi-Lingual Low Resource Speech Recognition

TL;DR: The authors showed that end-to-end multi-lingual training of sequence models is effective on context independent models trained using Connectionist Temporal Classification (CTC) loss and showed that the trained model can be adapted cross-lingually to an unseen language using just 25% of the target data.
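As a rough illustration of the technique, the sketch below shows a single CTC-trained encoder shared across languages, assuming PyTorch; the module names, the label-set union, and the fine-tuning comment are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of multilingual CTC training: one encoder shared
# across languages, with a CTC output layer over the union of the
# languages' label sets (index 0 reserved for the CTC blank).
import torch
import torch.nn as nn

class SharedCTCModel(nn.Module):
    def __init__(self, feat_dim, num_labels, hidden=320):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=4,
                               batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_labels)

    def forward(self, feats):
        enc, _ = self.encoder(feats)
        return self.out(enc).log_softmax(-1)  # (B, T, num_labels)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)

def ctc_step(model, feats, feat_lens, targets, target_lens):
    # Batches from all training languages pass through the same model.
    # Cross-lingual adaptation would later fine-tune these weights on
    # a small fraction (e.g. 25%) of unseen-language data.
    log_probs = model(feats).transpose(0, 1)  # CTCLoss wants (T, B, C)
    return ctc(log_probs, targets, feat_lens, target_lens)
```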
Proceedings Article

Hierarchical Multitask Learning With CTC

TL;DR: This paper shows how Hierarchical Multitask Learning can encourage the formation of useful intermediate representations by performing Connectionist Temporal Classification at different levels of the network with targets of different granularity.
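A hedged sketch of the idea, assuming PyTorch and illustrative layer sizes: an auxiliary CTC head on a lower encoder block with finer-grained targets (e.g. phonemes) and the main CTC head on the top block with coarser targets (e.g. characters). This is a sketch of the technique, not the paper's code.

```python
# Hierarchical multitask CTC sketch: CTC supervision at two depths of
# the encoder, with targets of different granularity at each level.
import torch.nn as nn

class HierarchicalCTC(nn.Module):
    def __init__(self, feat_dim, n_phones, n_chars, hidden=320):
        super().__init__()
        self.lower = nn.LSTM(feat_dim, hidden, num_layers=2,
                             batch_first=True, bidirectional=True)
        self.upper = nn.LSTM(2 * hidden, hidden, num_layers=2,
                             batch_first=True, bidirectional=True)
        self.phone_out = nn.Linear(2 * hidden, n_phones)  # auxiliary head
        self.char_out = nn.Linear(2 * hidden, n_chars)    # main head

    def forward(self, feats):
        low, _ = self.lower(feats)    # intermediate representation
        high, _ = self.upper(low)     # top-level representation
        return (self.phone_out(low).log_softmax(-1),
                self.char_out(high).log_softmax(-1))

# Training sums the two CTC losses, weighting the auxiliary term, e.g.:
#   loss = ctc(char_lp.transpose(0, 1), chars, lens, char_lens) \
#        + aux_weight * ctc(phone_lp.transpose(0, 1), phones, lens, phone_lens)
```

The auxiliary loss on the lower block is what encourages that layer to form useful intermediate representations rather than leaving all of the modeling burden to the top of the network.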