Enze Xie

Researcher at University of Hong Kong

Publications - 67
Citations - 3714

Enze Xie is an academic researcher at the University of Hong Kong. The author has contributed to research on topics including computer science and object detection. The author has an h-index of 14 and has co-authored 44 publications receiving 1228 citations. Previous affiliations of Enze Xie include Nanjing University of Science and Technology and Tongji University.

Papers
Posted Content

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions

TL;DR: This paper proposes the Pyramid Vision Transformer (PVT), a simple convolution-free backbone network useful for many dense prediction tasks, and reports state-of-the-art performance on the COCO dataset.
Proceedings Article

Shape Robust Text Detection With Progressive Scale Expansion Network

TL;DR: This paper proposes a novel Progressive Scale Expansion Network (PSENet), which precisely detects text instances of arbitrary shape and effectively separates closely spaced text instances, making it easier to apply segmentation-based methods to arbitrary-shaped text detection.
Proceedings Article

PolarMask: Single Shot Instance Segmentation With Polar Representation

TL;DR: PolarMask formulates instance segmentation as predicting the contour of an instance through instance center classification and dense distance regression in polar coordinates, and it can be embedded easily into most off-the-shelf detection methods.
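
To make the polar representation concrete, the following is a minimal sketch of how a predicted center and a set of ray distances could be decoded into contour points. It is not the authors' implementation: the function name `polar_to_contour`, the assumption of uniformly spaced ray angles starting at 0 radians, and the four-ray example are illustrative choices only.

```python
import math


def polar_to_contour(center, distances):
    """Decode a polar mask into Cartesian contour points.

    Assumption (illustrative): ray i points at angle 2*pi*i/n, where n is
    the number of distances, measured from the positive x-axis.
    """
    cx, cy = center
    n = len(distances)
    points = []
    for i, d in enumerate(distances):
        theta = 2.0 * math.pi * i / n
        # Walk distance d from the center along ray i.
        points.append((cx + d * math.cos(theta), cy + d * math.sin(theta)))
    return points


# Hypothetical example: four rays of length 5 around the center (10, 20).
contour = polar_to_contour((10.0, 20.0), [5.0, 5.0, 5.0, 5.0])
```

With more rays the decoded polygon follows the instance boundary more closely, at the cost of more regression targets per center.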
Proceedings Article

BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers

TL;DR: BEVFormer is a new framework that learns unified bird's-eye-view (BEV) representations with spatiotemporal transformers to support multiple autonomous driving perception tasks; it achieves a new state-of-the-art 56.9% on the NDS metric on the nuScenes test set.
Proceedings Article

Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network

TL;DR: This paper proposes an efficient and accurate arbitrary-shaped text detector, termed Pixel Aggregation Network (PAN), which is equipped with a low-computational-cost segmentation head and learnable post-processing.