Sixiao Zheng
Researcher at Fudan University
Publications - 9
Citations - 1812
Sixiao Zheng is an academic researcher at Fudan University. The author has contributed to research in the topics Computer science and Encoder, and has an h-index of 2, having co-authored 6 publications that have received 208 citations.
Papers
Proceedings ArticleDOI
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip H. S. Torr, Li Zhang
TL;DR: This paper proposes a pure transformer that encodes an image as a sequence of patches and can be combined with a simple decoder to form a powerful segmentation model.
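The encode-as-a-patch-sequence idea is simple enough to sketch. Below is a minimal, illustrative PyTorch version, not the authors' released model: the `PatchSegmenter` name, the layer sizes, and the bilinear-upsampling decoder are assumptions chosen to keep the example self-contained.

```python
# Minimal sketch of the idea: embed image patches as tokens, run a pure
# transformer encoder, then decode per-token class logits back to pixels.
import torch
import torch.nn as nn

class PatchSegmenter(nn.Module):
    def __init__(self, img_size=224, patch=16, dim=256, depth=4, n_classes=21):
        super().__init__()
        self.patch = patch
        n_patches = (img_size // patch) ** 2
        # Linear patch embedding: each patch x patch x 3 region becomes one token.
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, n_patches, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # "Simple decoder": per-token classification + bilinear upsampling.
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):                                   # x: (B, 3, 224, 224)
        b, _, h, w = x.shape
        tokens = self.embed(x).flatten(2).transpose(1, 2)   # (B, N, dim)
        tokens = self.encoder(tokens + self.pos)
        logits = self.head(tokens)                          # (B, N, n_classes)
        gh, gw = h // self.patch, w // self.patch
        logits = logits.transpose(1, 2).reshape(b, -1, gh, gw)
        return nn.functional.interpolate(
            logits, size=(h, w), mode="bilinear", align_corners=False)

model = PatchSegmenter()
out = model(torch.randn(1, 3, 224, 224))   # (1, 21, 224, 224) segmentation logits
```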
Proceedings ArticleDOI
NMS-Loss: Learning with Non-Maximum Suppression for Crowded Pedestrian Detection
TL;DR: This paper proposes a pull loss that pulls predictions with the same target close to each other, and a push loss that pushes predictions with different targets away from each other.
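As a rough illustration of that pull/push objective, here is a hedged PyTorch sketch. It uses a generic margin-based formulation over predicted box centers rather than the paper's exact NMS-driven construction; `box_centers`, `target_ids`, and `margin` are illustrative assumptions.

```python
# Generic pull/push sketch (not the paper's exact NMS-loss): predictions matched
# to the same ground-truth target are pulled together, others are pushed apart.
import torch

def pull_push_loss(box_centers, target_ids, margin=1.0):
    """box_centers: (N, 2) predicted centers; target_ids: (N,) GT index per box."""
    pull, push, n_pull, n_push = 0.0, 0.0, 0, 0
    n = box_centers.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            d = torch.norm(box_centers[i] - box_centers[j])
            if target_ids[i] == target_ids[j]:
                pull = pull + d                               # same target: pull close
                n_pull += 1
            else:
                push = push + torch.clamp(margin - d, min=0)  # different: push apart
                n_push += 1
    return pull / max(n_pull, 1) + push / max(n_push, 1)
```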
Journal ArticleDOI
Visual Representation Learning with Transformer: A Sequence-to-Sequence Perspective
TL;DR: This paper treats visual representation learning generally as a sequence-to-sequence prediction task and formulates a family of Hierarchical Local-Global (HLG) Transformers, characterized by local attention within windows and global attention across windows in a hierarchical, pyramidal architecture.
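To make the local-within-window, global-across-window pattern concrete, here is a speculative PyTorch sketch. It is not the HLG architecture itself: the `LocalGlobalAttention` name, the mean-pooled per-window summary tokens, and all hyper-parameters are assumptions chosen only to illustrate the two attention scopes.

```python
# Illustrative sketch of the two attention scopes: local attention inside each
# window, then global attention across one summary token per window.
import torch
import torch.nn as nn

class LocalGlobalAttention(nn.Module):
    def __init__(self, dim=96, heads=4, window=7):
        super().__init__()
        self.window = window
        self.local = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.global_ = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        # x: (B, H, W, C) feature map; H and W assumed divisible by the window size.
        b, h, w, c = x.shape
        ws = self.window
        # Local attention: tokens attend only within their own window.
        win = x.reshape(b, h // ws, ws, w // ws, ws, c).permute(0, 1, 3, 2, 4, 5)
        win = win.reshape(-1, ws * ws, c)                  # (B*num_windows, ws*ws, C)
        win, _ = self.local(win, win, win)
        # Global attention: one summary token per window attends across windows.
        summaries = win.mean(dim=1).reshape(b, -1, c)      # (B, num_windows, C)
        summaries, _ = self.global_(summaries, summaries, summaries)
        # Broadcast each globally mixed summary back into its window.
        win = win + summaries.reshape(-1, 1, c)
        win = win.reshape(b, h // ws, w // ws, ws, ws, c).permute(0, 1, 3, 2, 4, 5)
        return win.reshape(b, h, w, c)

x = torch.randn(1, 14, 14, 96)
y = LocalGlobalAttention()(x)   # same shape, locally and globally mixed
```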