
Kaitao Song

Researcher at Nanjing University of Science and Technology

Publications: 40
Citations: 2694

Kaitao Song is an academic researcher from Nanjing University of Science and Technology. The author has contributed to research in the topics of Computer science and Engineering, has an h-index of 9, and has co-authored 27 publications receiving 1056 citations. Previous affiliations of Kaitao Song include Microsoft.

Papers
Posted Content

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions

TL;DR: This paper proposes the Pyramid Vision Transformer (PVT), a simple backbone network useful for many dense prediction tasks without convolutions, which achieves state-of-the-art performance on the COCO dataset.
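The idea that keeps PVT affordable at dense-prediction resolutions is spatial-reduction attention, where keys and values are computed from a downsampled copy of the token grid. Below is a minimal PyTorch sketch of that idea; the class name, hyperparameters, and the strided-convolution reduction are illustrative assumptions, not the paper's reference implementation.

```python
import torch
import torch.nn as nn

class SpatialReductionAttention(nn.Module):
    """Illustrative sketch of PVT-style spatial-reduction attention (SRA).

    Keys/values come from a spatially downsampled feature map, cutting the
    attention cost by roughly a factor of sr_ratio**2.
    """
    def __init__(self, dim, num_heads=8, sr_ratio=4):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, dim * 2)
        self.proj = nn.Linear(dim, dim)
        # Strided conv downsamples the K/V token grid before attention.
        self.sr = nn.Conv2d(dim, dim, kernel_size=sr_ratio, stride=sr_ratio)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x, H, W):
        B, N, C = x.shape  # N == H * W flattened patch tokens
        q = self.q(x).reshape(B, N, self.num_heads, self.head_dim).transpose(1, 2)

        # Reduce the spatial resolution of keys/values: (B, N, C) -> (B, N', C)
        x_ = x.transpose(1, 2).reshape(B, C, H, W)
        x_ = self.sr(x_).reshape(B, C, -1).transpose(1, 2)
        x_ = self.norm(x_)
        k, v = self.kv(x_).reshape(B, -1, 2, self.num_heads,
                                   self.head_dim).permute(2, 0, 3, 1, 4)

        attn = (q @ k.transpose(-2, -1)) * self.head_dim ** -0.5
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

# e.g. a 56x56 token grid, as produced by the first stage of a pyramid backbone
x = torch.randn(2, 56 * 56, 64)
print(SpatialReductionAttention(64, num_heads=1, sr_ratio=8)(x, 56, 56).shape)
```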
Proceedings Article

MASS: Masked Sequence to Sequence Pre-training for Language Generation

TL;DR: This work proposes MAsked Sequence to Sequence pre-training (MASS) for encoder-decoder based language generation tasks, which achieves state-of-the-art accuracy on unsupervised English-French translation, even beating an early attention-based supervised model.
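As a concrete illustration of the MASS objective, the sketch below builds one training pair at the word level: a contiguous fragment (roughly half the sentence, per the paper) is replaced by mask tokens on the encoder side, and the decoder's target is exactly that fragment. The function name and mask ratio here are assumptions for illustration.

```python
import random

MASK = "[MASK]"

def mass_example(tokens, mask_frac=0.5, seed=0):
    """Build one MASS-style training pair: the encoder sees the sentence with
    a contiguous fragment masked out; the decoder reconstructs the fragment."""
    rng = random.Random(seed)
    k = max(1, int(len(tokens) * mask_frac))    # fragment length (~50% in the paper)
    start = rng.randrange(len(tokens) - k + 1)  # where the masked fragment begins
    enc_input = tokens[:start] + [MASK] * k + tokens[start + k:]
    dec_target = tokens[start:start + k]        # decoder predicts only this span
    return enc_input, dec_target

enc, dec = mass_example("we propose masked sequence to sequence pre-training".split())
print(enc)  # the sentence with one contiguous [MASK] span
print(dec)  # the masked fragment the decoder must generate
```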
Proceedings Article

MPNet: Masked and Permuted Pre-training for Language Understanding

TL;DR: This paper proposes MPNet, a novel pre-training method that inherits the advantages of BERT and XLNet while avoiding their limitations, and achieves better results on downstream language understanding tasks than previous state-of-the-art pre-trained methods.
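The sketch below illustrates, at the data-construction level, how MPNet-style inputs combine the two objectives: tokens are permuted and the tail of the permutation becomes the prediction target (as in permuted language modeling), while position ids for the entire sentence remain visible (as in masked language modeling). The function name and the 15% prediction ratio are illustrative assumptions, not the paper's exact procedure.

```python
import random

def mpnet_example(tokens, pred_frac=0.15, seed=0):
    """Sketch of MPNet-style input construction: permute the sentence, treat
    the tail of the permutation as prediction targets, and keep position ids
    for the whole sentence visible (so, unlike XLNet, full positional info is
    seen; unlike BERT, predicted tokens condition on earlier predictions)."""
    rng = random.Random(seed)
    n = len(tokens)
    perm = list(range(n))
    rng.shuffle(perm)
    num_pred = max(1, int(n * pred_frac))  # ~15% of tokens are predicted
    context_idx, pred_idx = perm[:n - num_pred], perm[n - num_pred:]
    # Each input is a (token, original-position) pair; predicted slots show a
    # [MASK] placeholder but still expose their position id.
    inputs = [(tokens[i], i) for i in context_idx] + [("[MASK]", i) for i in pred_idx]
    targets = [(tokens[i], i) for i in pred_idx]
    return inputs, targets

inputs, targets = mpnet_example(
    "masked and permuted pre-training for language understanding".split())
print(inputs)
print(targets)
```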
Posted Content

MASS: Masked Sequence to Sequence Pre-training for Language Generation

TL;DR: The authors propose MASS, which adopts the encoder-decoder framework to reconstruct a sentence fragment given the remaining part of the sentence, and achieves state-of-the-art performance on unsupervised English-French translation.
Posted Content

PVTv2: Improved Baselines with Pyramid Vision Transformer

TL;DR: This paper improves the original Pyramid Vision Transformer (PVTv1) by adding three designs, namely overlapping patch embedding, convolutional feed-forward networks, and linear-complexity attention layers, yielding PVTv2.
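Of the three designs, overlapping patch embedding is the simplest to show in isolation: a strided convolution whose kernel is wider than its stride, so neighboring patches share pixels instead of tiling the image disjointly. The PyTorch sketch below is illustrative; the class name and the kernel-7/stride-4 choice are assumptions, not the paper's reference code.

```python
import torch
import torch.nn as nn

class OverlapPatchEmbed(nn.Module):
    """Illustrative sketch of PVTv2-style overlapping patch embedding: a
    strided conv whose kernel exceeds its stride, so adjacent patches overlap
    and local continuity is preserved (v1 used non-overlapping patches)."""
    def __init__(self, in_chans=3, embed_dim=64, patch_size=7, stride=4):
        super().__init__()
        self.proj = nn.Conv2d(in_chans, embed_dim, kernel_size=patch_size,
                              stride=stride, padding=patch_size // 2)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, x):
        x = self.proj(x)                  # (B, C, H/stride, W/stride)
        B, C, H, W = x.shape
        x = x.flatten(2).transpose(1, 2)  # (B, H*W, C) token sequence
        return self.norm(x), H, W

tokens, H, W = OverlapPatchEmbed()(torch.randn(1, 3, 224, 224))
print(tokens.shape, H, W)  # torch.Size([1, 3136, 64]) 56 56
```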