Showing papers by "Enrique Alfonseca" published in 2018


Posted Content
TL;DR: This study follows a rigorous evaluation protocol using a large set of previously used and novel automatic and human evaluation metrics, applied to both generated samples and reconstructions; the authors hope it will become the new evaluation standard when comparing neural generative models for text.
Abstract: In this paper, we study recent neural generative models for text generation related to variational autoencoders. Previous works have employed various techniques to control the prior distribution of the latent codes in these models, which is important for sampling performance, but little attention has been paid to reconstruction error. In our study, we follow a rigorous evaluation protocol using a large set of previously used and novel automatic and human evaluation metrics, applied to both generated samples and reconstructions. We hope that it will become the new evaluation standard when comparing neural generative models for text.

46 citations


Book Chapter
TL;DR: This paper offers a set of prosodic modifications that highlight potentially important parts of the answer using various acoustic cues, and shows that some of these modifications lead to better comprehension at the expense of only slightly degraded naturalness of the audio.
Abstract: Many popular form factors of digital assistants, such as Amazon Echo, Apple HomePod, or Google Home, enable the user to hold a conversation with these systems based only on the speech modality. The lack of a screen presents unique challenges. To satisfy the information need of a user, the presentation of the answer needs to be optimized for such voice-only interactions. In this paper, we propose a task of evaluating the usefulness of audio transformations (i.e., prosodic modifications) for voice-only question answering. We introduce a crowdsourcing setup where we evaluate the quality of our proposed modifications along multiple dimensions corresponding to the informativeness, naturalness, and ability of the user to identify key parts of the answer. We offer a set of prosodic modifications that highlight potentially important parts of the answer using various acoustic cues. Our experiments show that some of these prosodic modifications lead to better comprehension at the expense of only slightly degraded naturalness of the audio.

17 citations