Enze Xie
Researcher at University of Hong Kong
Publications - 67
Citations - 3714
Enze Xie is an academic researcher at the University of Hong Kong. The author has contributed to research on the topics of computer science and object detection. The author has an h-index of 14 and has co-authored 44 publications receiving 1228 citations. Previous affiliations of Enze Xie include Nanjing University of Science and Technology and Tongji University.
Papers
Posted Content
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
TL;DR: The authors propose Pyramid Vision Transformer (PVT), a simple convolution-free backbone network useful for many dense prediction tasks, and report state-of-the-art performance on the COCO dataset.
Proceedings Article
Shape Robust Text Detection With Progressive Scale Expansion Network
TL;DR: A novel Progressive Scale Expansion Network (PSENet) is proposed, which can precisely detect text instances of arbitrary shape and effectively separates closely adjacent text instances, making it easier to apply segmentation-based methods to arbitrary-shaped text detection.
Proceedings Article
PolarMask: Single Shot Instance Segmentation With Polar Representation
TL;DR: PolarMask formulates instance segmentation as predicting the contour of an instance through instance center classification and dense distance regression in polar coordinates, and it can be easily embedded into most off-the-shelf detection methods.
Proceedings Article
BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers
TL;DR: A new framework, termed BEVFormer, is proposed. It learns unified BEV representations with spatiotemporal transformers to support multiple autonomous driving perception tasks, and achieves a new state-of-the-art 56.9% NDS on the nuScenes test set.
Proceedings Article
Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network
TL;DR: The authors propose an efficient and accurate arbitrary-shaped text detector, termed Pixel Aggregation Network (PAN), which is equipped with a low computational-cost segmentation head and learnable post-processing.