Kaitao Song
Researcher at Nanjing University of Science and Technology
Publications: 40
Citations: 2,694
Kaitao Song is an academic researcher at Nanjing University of Science and Technology. The author has contributed to research in the topics of Computer science and Engineering, has an h-index of 9, and has co-authored 27 publications receiving 1,056 citations. Previous affiliations of Kaitao Song include Microsoft.
Papers
Posted Content
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao +8 more
TL;DR: Wang et al. propose the Pyramid Vision Transformer (PVT), a simple backbone network useful for many dense prediction tasks without convolutions, which achieves state-of-the-art performance on the COCO dataset.
Proceedings Article
MASS: Masked Sequence to Sequence Pre-training for Language Generation
TL;DR: This work proposes MAsked Sequence to Sequence pre-training (MASS) for encoder-decoder-based language generation tasks, which achieves state-of-the-art accuracy on unsupervised English-French translation, even beating early attention-based supervised models.
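The MASS objective described above can be illustrated with a small sketch: a contiguous fragment of the sentence is masked on the encoder side, and the decoder is trained to reconstruct exactly that fragment. The function name and token strings below are illustrative, not taken from the paper's code.

```python
def mass_mask(tokens, start, length, mask_token="[MASK]"):
    """Split a sentence into an encoder input (with a contiguous
    fragment masked out) and a decoder target (the fragment itself),
    as in MASS-style masked sequence-to-sequence pre-training."""
    fragment = tokens[start:start + length]
    encoder_input = tokens[:start] + [mask_token] * length + tokens[start + length:]
    return encoder_input, fragment

enc, tgt = mass_mask(["the", "cat", "sat", "on", "the", "mat"], start=1, length=3)
# enc: ["the", "[MASK]", "[MASK]", "[MASK]", "the", "mat"]
# tgt: ["cat", "sat", "on"]
```

The encoder thus never sees the fragment's content, forcing the decoder to generate it conditioned on the surrounding context, which is what makes the objective suitable for generation tasks.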
Proceedings Article
MPNet: Masked and Permuted Pre-training for Language Understanding
TL;DR: This paper proposes MPNet, a novel pre-training method that inherits the advantages of BERT and XLNet while avoiding their limitations, achieving better results on a range of language understanding tasks than previous state-of-the-art pre-training methods.
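A rough sketch of the permuted setup MPNet builds on (the function name and layout are illustrative assumptions, not the paper's implementation): token positions are shuffled, the last few positions in the permuted order become prediction targets, and, unlike XLNet, position information for the whole sentence remains visible at every prediction step.

```python
import random

def mpnet_permute(tokens, num_predicted, seed=0):
    """Permute token indices; the last `num_predicted` positions in the
    permuted order are the prediction targets. Position ids for the FULL
    sentence stay visible (MPNet's key difference from XLNet), while the
    content of the predicted tokens is masked during pre-training."""
    rng = random.Random(seed)
    order = list(range(len(tokens)))
    rng.shuffle(order)
    non_predicted = order[:-num_predicted]   # content + position visible
    predicted = order[-num_predicted:]       # content masked, position visible
    visible_positions = sorted(order)        # all position ids are exposed
    return non_predicted, predicted, visible_positions
```

Exposing all position ids lets the model know the sentence length and where the masked tokens sit, which reduces the pretrain-finetune discrepancy that permuted language modeling alone suffers from.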
Posted Content
MASS: Masked Sequence to Sequence Pre-training for Language Generation
TL;DR: The authors proposed MASS, which adopts the encoder-decoder framework to reconstruct a sentence fragment given the remaining part of the sentence, and achieves state-of-the-art performance on unsupervised English-French translation.
Posted Content
PVTv2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao +8 more
TL;DR: Wang et al. improve on the original Pyramid Vision Transformer (PVTv1) by adding three designs: overlapping patch embedding, convolutional feed-forward networks, and linear-complexity attention layers, yielding PVTv2.
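The attention-cost reduction mentioned above can be sketched with a toy single-head example: keys and values are average-pooled over the spatial grid before attention, so the attention matrix scales with the reduced sequence rather than the full one. This is a shapes-only sketch in the spirit of PVT's spatial-reduction attention; `sr_attention` is an illustrative name, and the learned projections of the real model are omitted.

```python
import numpy as np

def sr_attention(x, H, W, ratio):
    """Toy spatial-reduction attention (single head, no learned weights).
    x has shape (N, C) with N == H * W tokens; keys/values are average-
    pooled by `ratio` along each spatial axis before attention, shrinking
    the attention matrix from (N, N) to (N, N / ratio**2)."""
    N, C = x.shape
    q = x                                                    # (N, C)
    kv = x.reshape(H, W, C)
    kv = kv.reshape(H // ratio, ratio, W // ratio, ratio, C).mean(axis=(1, 3))
    kv = kv.reshape(-1, C)                                   # (N / ratio^2, C)
    scores = q @ kv.T / np.sqrt(C)                           # (N, N / ratio^2)
    scores = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn = scores / scores.sum(axis=-1, keepdims=True)       # row-wise softmax
    return attn @ kv                                         # (N, C)
```

Pooling the keys and values keeps every query attending to a summary of the whole feature map while cutting the quadratic cost, which is what makes the backbone practical at the high resolutions dense prediction requires.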