Author

Jiajun Liang

Bio: Jiajun Liang is an academic researcher from Tsinghua University. The author has contributed to research in the topics Computer science and Pipeline (computing). The author has an h-index of 7 and has co-authored 9 publications receiving 1,242 citations.

Papers

Proceedings Article•DOI•

EAST: An Efficient and Accurate Scene Text Detector

Xinyu Zhou, Cong Yao, He Wen, Yuzhi Wang, Shuchang Zhou, He Weiran, Jiajun Liang

01 Jul 2017

TL;DR: This work proposes a simple yet powerful pipeline that yields fast and accurate text detection in natural scenes, and significantly outperforms state-of-the-art methods in terms of both accuracy and efficiency.

Abstract: Previous approaches for scene text detection have already achieved promising performances across various benchmarks. However, they usually fall short when dealing with challenging scenarios, even when equipped with deep neural network models, because the overall performance is determined by the interplay of multiple stages and components in the pipelines. In this work, we propose a simple yet powerful pipeline that yields fast and accurate text detection in natural scenes. The pipeline directly predicts words or text lines of arbitrary orientations and quadrilateral shapes in full images, eliminating unnecessary intermediate steps (e.g., candidate aggregation and word partitioning), with a single neural network. The simplicity of our pipeline allows concentrating efforts on designing loss functions and neural network architecture. Experiments on standard datasets including ICDAR 2015, COCO-Text and MSRA-TD500 demonstrate that the proposed algorithm significantly outperforms state-of-the-art methods in terms of both accuracy and efficiency. On the ICDAR 2015 dataset, the proposed algorithm achieves an F-score of 0.7820 at 13.2fps at 720p resolution.

1,161 citations

Posted Content•

EAST: An Efficient and Accurate Scene Text Detector

Xinyu Zhou, Cong Yao, He Wen, Yuzhi Wang, Shuchang Zhou, He Weiran, Jiajun Liang

11 Apr 2017-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper proposes a simple yet powerful pipeline that yields fast and accurate text detection in natural scenes, eliminating unnecessary intermediate steps (e.g., candidate aggregation and word partitioning) with a single neural network.

409 citations

Journal Article•DOI•

Scene Text Recognition from Two-Dimensional Perspective

Minghui Liao¹, Jian Zhang, Zhaoyi Wan, Fengming Xie, Jiajun Liang, Pengyuan Lyu², Cong Yao, Xiang Bai¹

Huazhong University of Science and Technology¹, Tencent²

17 Jul 2019

TL;DR: This paper proposes a Character Attention Fully Convolutional Network (CA-FCN) that performs scene text recognition with a semantic segmentation network in which an attention mechanism for characters is adopted.

Abstract: Inspired by speech recognition, recent state-of-the-art algorithms mostly consider scene text recognition as a sequence prediction problem. Though achieving excellent performance, these methods usually neglect an important fact: text in images is actually distributed in two-dimensional space. This is quite different in nature from speech, which is essentially a one-dimensional signal. In principle, directly compressing features of text into a one-dimensional form may lose useful information and introduce extra noise. In this paper, we approach scene text recognition from a two-dimensional perspective. A simple yet effective model, called Character Attention Fully Convolutional Network (CA-FCN), is devised for recognizing text of arbitrary shapes. Scene text recognition is realized with a semantic segmentation network, where an attention mechanism for characters is adopted. Combined with a word formation module, CA-FCN can simultaneously recognize the script and predict the position of each character. Experiments demonstrate that the proposed algorithm outperforms previous methods on both regular and irregular text datasets. Moreover, it is shown to be more robust to imprecise localizations in the text detection phase, which are very common in practice.

152 citations

Proceedings Article•DOI•

Decoupled Knowledge Distillation

Bo-Rui Zhao, Quan Cui, Ren-Jie Song, Yiyu Qiu, Jiajun Liang

16 Mar 2022

TL;DR: Decoupled Knowledge Distillation (DKD) is presented, enabling TCKD and NCKD to play their roles more efficiently and flexibly. Compared with complex feature-based methods, it achieves comparable or even better results with better training efficiency on the CIFAR-100, ImageNet, and MS-COCO datasets for image classification and object detection tasks.

Abstract: State-of-the-art distillation methods are mainly based on distilling deep features from intermediate layers, while the significance of logit distillation is greatly overlooked. To provide a novel viewpoint to study logit distillation, we re-formulate the classical KD loss into two parts, i.e., target class knowledge distillation (TCKD) and non-target class knowledge distillation (NCKD). We empirically investigate and prove the effects of the two parts: TCKD transfers knowledge concerning the “difficulty” of training samples, while NCKD is the prominent reason why logit distillation works. More importantly, we reveal that the classical KD loss is a coupled formulation, which (1) suppresses the effectiveness of NCKD and (2) limits the flexibility to balance these two parts. To address these issues, we present Decoupled Knowledge Distillation (DKD), enabling TCKD and NCKD to play their roles more efficiently and flexibly. Compared with complex feature-based methods, our DKD achieves comparable or even better results and has better training efficiency on CIFAR-100, ImageNet, and MS-COCO datasets for image classification and object detection tasks. This paper proves the great potential of logit distillation, and we hope it will be helpful for future research. The code is available at https://github.com/megviiresearch/mdistiller.
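
The split into TCKD and NCKD described in this abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption: TCKD is taken to be the KL divergence between the teacher's and student's binary (target vs. non-target) distributions, NCKD the KL divergence between their distributions renormalized over the non-target classes, and the classical KD loss is taken to couple the two by weighting NCKD with one minus the teacher's target-class probability. The function names and the `alpha`, `beta`, and `temperature` parameters are hypothetical and are not taken from the authors' code.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def decoupled_kd_terms(teacher_logits, student_logits, target, temperature=1.0):
    """Return (tckd, nckd, classical_kd, teacher_target_prob)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    pt, ps = p_teacher[target], p_student[target]

    # TCKD (assumed form): KL between binary target / non-target distributions.
    tckd = kl_divergence([pt, 1.0 - pt], [ps, 1.0 - ps])

    # NCKD (assumed form): KL between distributions over non-target classes only.
    nt_teacher = [p / (1.0 - pt) for i, p in enumerate(p_teacher) if i != target]
    nt_student = [p / (1.0 - ps) for i, p in enumerate(p_student) if i != target]
    nckd = kl_divergence(nt_teacher, nt_student)

    # Classical KD loss over the full class distribution, for comparison.
    classical_kd = kl_divergence(p_teacher, p_student)
    return tckd, nckd, classical_kd, pt


def dkd_loss(teacher_logits, student_logits, target, alpha=1.0, beta=1.0,
             temperature=1.0):
    """Decoupled loss: independently weighted sum of the two terms."""
    tckd, nckd, _, _ = decoupled_kd_terms(
        teacher_logits, student_logits, target, temperature)
    return alpha * tckd + beta * nckd


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]
    student = [1.5, 0.3, 0.8]
    tckd, nckd, kd, pt = decoupled_kd_terms(teacher, student, target=0)
    # Under the assumptions above, kd equals tckd + (1 - pt) * nckd.
    print(kd, tckd + (1.0 - pt) * nckd)
    print(dkd_loss(teacher, student, target=0, alpha=1.0, beta=2.0))
```

Under these assumptions, the NCKD weight in the classical loss shrinks as the teacher becomes more confident in the target class, which is one way to read the abstract's remark that the coupled formulation suppresses NCKD. Replacing that fixed weight with free `alpha` and `beta` weights is what makes the two parts independently tunable in this sketch.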

105 citations

Proceedings Article•DOI•

Big Data Application in Education: Dropout Prediction in Edx MOOCs

Jiajun Liang¹, Jian Yang², Yongji Wu¹, Chao Li¹, Li Zheng¹

Tsinghua University¹, University of Science and Technology Beijing²

20 Apr 2016

TL;DR: This work describes a complete approach to the dropout prediction task, including data extraction from the Edx platform, data preprocessing, feature engineering, and performance tests on several supervised classification models such as SVM, Logistic Regression, Random Forest, and Gradient Boosting Decision Tree.

Abstract: Educational Data Mining and Learning Analytics are two growing fields of study that try to make sense of education data and to improve the teaching and learning experience. We study dropout prediction in Massive Open Online Courses (MOOCs), where the goal is, given one month of a student's learning behavior log data, to predict whether the student will drop out in the next ten days. We collect data for 39 courses from the XuetangX platform, which is based on the open-source Edx platform. We describe our complete approach to the dropout prediction task, including data extraction from the Edx platform, data preprocessing, feature engineering, and performance tests on several supervised classification models such as SVM, Logistic Regression, Random Forest, and Gradient Boosting Decision Tree. We achieve 88% accuracy on the dropout prediction task with the GBDT model.

85 citations

Cited by

Journal Article•

Study Report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”

杉山拓海

12 Sep 2017-Computers & Graphics

3,940 citations

Proceedings Article•DOI•

FCOS: Fully Convolutional One-Stage Object Detection

Zhi Tian¹, Chunhua Shen¹, Hao Chen¹, Tong He¹

University of Adelaide¹

02 Apr 2019

TL;DR: For the first time, a much simpler and flexible detection framework achieving improved detection accuracy is demonstrated, and it is hoped that the proposed FCOS framework can serve as a simple and strong alternative for many other instance-level tasks.

Abstract: We propose a fully convolutional one-stage object detector (FCOS) to solve object detection in a per-pixel prediction fashion, analogous to semantic segmentation. Almost all state-of-the-art object detectors such as RetinaNet, SSD, YOLOv3, and Faster R-CNN rely on pre-defined anchor boxes. In contrast, our proposed detector FCOS is anchor box free, as well as proposal free. By eliminating the pre-defined set of anchor boxes, FCOS completely avoids the complicated computation related to anchor boxes such as calculating overlapping during training. More importantly, we also avoid all hyper-parameters related to anchor boxes, which are often very sensitive to the final detection performance. With non-maximum suppression (NMS) as the only post-processing step, FCOS with ResNeXt-64x4d-101 achieves 44.7% in AP with single-model and single-scale testing, surpassing previous one-stage detectors with the advantage of being much simpler. For the first time, we demonstrate a much simpler and flexible detection framework achieving improved detection accuracy. We hope that the proposed FCOS framework can serve as a simple and strong alternative for many other instance-level tasks. Code is available at: https://tinyurl.com/FCOSv1
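
The abstract names non-maximum suppression (NMS) as FCOS's only post-processing step. As an illustration, here is a minimal sketch of standard greedy NMS in plain Python. It is not taken from the FCOS code base: the `(x1, y1, x2, y2)` box format, the `iou_threshold` default, and the function names are all assumptions made for this example.

```python
def box_area(box):
    """Area of an axis-aligned box given as (x1, y1, x2, y2)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    # The second box overlaps the first with IoU 81/119, so it is suppressed.
    print(nms(boxes, scores))
```

This quadratic-time loop is written for readability; practical detectors usually rely on a vectorized or GPU implementation and run the suppression separately per class.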

2,244 citations

Posted Content•

FCOS: Fully Convolutional One-Stage Object Detection

Zhi Tian¹, Chunhua Shen¹, Hao Chen¹, Tong He¹

University of Adelaide¹

02 Apr 2019-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this paper, a fully convolutional one-stage object detector (FCOS) is proposed to solve object detection in a per-pixel prediction fashion, analogous to semantic segmentation.

Abstract: We propose a fully convolutional one-stage object detector (FCOS) to solve object detection in a per-pixel prediction fashion, analogous to semantic segmentation. Almost all state-of-the-art object detectors such as RetinaNet, SSD, YOLOv3, and Faster R-CNN rely on pre-defined anchor boxes. In contrast, our proposed detector FCOS is anchor box free, as well as proposal free. By eliminating the predefined set of anchor boxes, FCOS completely avoids the complicated computation related to anchor boxes such as calculating overlapping during training. More importantly, we also avoid all hyper-parameters related to anchor boxes, which are often very sensitive to the final detection performance. With non-maximum suppression (NMS) as the only post-processing step, FCOS with ResNeXt-64x4d-101 achieves 44.7% in AP with single-model and single-scale testing, surpassing previous one-stage detectors with the advantage of being much simpler. For the first time, we demonstrate a much simpler and flexible detection framework achieving improved detection accuracy. We hope that the proposed FCOS framework can serve as a simple and strong alternative for many other instance-level tasks. Code is available at: this https URL

2,160 citations

Proceedings Article•DOI•

PIXOR: Real-time 3D Object Detection from Point Clouds

Bin Yang¹, Wenjie Luo¹, Raquel Urtasun¹

University of Toronto¹

18 Jun 2018

TL;DR: PIXOR, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions, is proposed; it surpasses other state-of-the-art methods notably in terms of Average Precision (AP) while still running at 10 FPS.

Abstract: We address the problem of real-time 3D object detection from point clouds in the context of autonomous driving. Speed is critical as detection is a necessary component for safety. Existing approaches are, however, expensive in computation due to the high dimensionality of point clouds. We utilize the 3D data more efficiently by representing the scene from the Bird's Eye View (BEV), and propose PIXOR, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions. The input representation, network architecture, and model optimization are specially designed to balance high accuracy and real-time efficiency. We validate PIXOR on two datasets: the KITTI BEV object detection benchmark, and a large-scale 3D vehicle detection benchmark. On both datasets we show that the proposed detector surpasses other state-of-the-art methods notably in terms of Average Precision (AP), while still running at 10 FPS.

1,033 citations

Posted Content•

Object Detection in 20 Years: A Survey

Zhengxia Zou¹, Zhenwei Shi², Yuhong Guo, Jieping Ye¹

University of Michigan¹, Beihang University²

13 May 2019-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper extensively reviews 400+ papers on object detection in the light of its technical evolution, spanning over a quarter-century (from the 1990s to 2019), and makes an in-depth analysis of their challenges as well as technical improvements in recent years.

Abstract: Object detection, as one of the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of the cold weapon era. This paper extensively reviews 400+ papers on object detection in the light of its technical evolution, spanning over a quarter-century (from the 1990s to 2019). A number of topics are covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed-up techniques, and the recent state-of-the-art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, text detection, etc., and makes an in-depth analysis of their challenges as well as technical improvements in recent years.

802 citations
