Attention is All you Need

Open Access Proceedings Article

Ashish Vaswani, +7 more

- Vol. 30, pp 5998-6008

TLDR

This paper proposes a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.

Abstract:

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder and decoder configuration. The best performing such models also connect the encoder and decoder through an attention mechanism. We propose a novel, simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our single model with 165 million parameters achieves 27.5 BLEU on English-to-German translation, improving over the existing best ensemble result by over 1 BLEU. On English-to-French translation, we outperform the previous single state-of-the-art model by 0.7 BLEU, achieving a BLEU score of 41.1.
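
The abstract names the attention mechanism without describing it. As a minimal sketch, assuming the scaled dot-product form softmax(QK^T / sqrt(d_k)) V given in the full paper, attention over a handful of vectors could look like the following. The function names and toy inputs are illustrative assumptions, not taken from this page, and the sketch omits multi-head projections, masking, and batching.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for plain-Python lists of vectors.

    queries: n_q vectors of size d_k
    keys:    n_k vectors of size d_k
    values:  n_k vectors of size d_v
    (Illustrative helper; names are assumptions, not from the paper page.)
    """
    d_k = len(keys[0])
    scale = 1.0 / math.sqrt(d_k)
    d_v = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by 1/sqrt(d_k).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(d_v)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy example: 2 queries attending over 3 key/value pairs.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
outputs, weights = scaled_dot_product_attention(Q, K, V)
```

Each row of `weights` sums to 1, so every output is a convex combination of the value vectors. Because each query is compared against all keys independently, the per-query loop has no sequential dependency, which is consistent with the parallelizability the abstract claims.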

Citations

Posted Content

Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Jaehyeon Kim, +3 more

- 22 May 2020 -

arXiv: Audio and Speech Processing

TL;DR: This article proposes Glow-TTS, a flow-based generative model for parallel text-to-speech (TTS) that does not require any external aligner and achieves an order-of-magnitude speedup over the autoregressive model Tacotron 2 at synthesis, with comparable speech quality.

Proceedings Article

Deep Learning for Depression Detection of Twitter Users

Ahmed Husseini Orabi, +3 more

TL;DR: This work identifies the most effective deep neural network architecture among a few selected architectures that have been used successfully in natural language processing tasks, and uses it to detect users with signs of mental illness given limited unstructured text data extracted from the Twitter social media platform.

Posted Content

Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition

Yu Zhang, +7 more

- 20 Oct 2020 -

arXiv: Audio and Speech Processing

TL;DR: This work combines recent developments in semi-supervised learning for automatic speech recognition to obtain state-of-the-art results on LibriSpeech, utilizing the unlabeled audio of the Libri-Light dataset.

Proceedings Article

Music Gesture for Visual Sound Separation

Chuang Gan, +4 more

TL;DR: This work proposes "Music Gesture," a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music, which adopts a context-aware graph network to integrate visual semantic context with body dynamics and applies an audio-visual fusion model to associate body movements with the corresponding audio signals.

Proceedings Article

Cross-Modality Person Re-Identification With Shared-Specific Feature Transfer

Yan Lu, +6 more

TL;DR: This paper proposes a cross-modality shared-specific feature transfer algorithm (termed cm-SSFT) that explores the potential of both the modality-shared information and the modality-specific characteristics to boost re-ID performance.

Related Papers (5)

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Adam: A Method for Stochastic Optimization

Deep Residual Learning for Image Recognition

Long short-term memory

Bleu: a Method for Automatic Evaluation of Machine Translation