Author

Yangdong Deng

Other affiliations: Carnegie Mellon University

Bio: Yangdong Deng is an academic researcher from Tsinghua University. The author has contributed to research on topics including very-large-scale integration (VLSI) and CUDA. The author has an h-index of 22 and has co-authored 89 publications that have received 1,840 citations. Previous affiliations of Yangdong Deng include Carnegie Mellon University.

Papers published on a yearly basis (chart; years shown: 2001 to 2005, 2008 to 2021, and 2023)

Papers

Posted Content

Light-Head R-CNN: In Defense of Two-Stage Object Detector

Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun

20 Nov 2017, arXiv: Computer Vision and Pattern Recognition

TL;DR: The authors' ResNet-101-based Light-Head R-CNN outperforms state-of-the-art object detectors on COCO while remaining time-efficient; with a tiny backbone, it significantly outperforms fast single-stage detectors like YOLO and SSD on both speed and accuracy.

Abstract: In this paper, we first investigate why typical two-stage methods are not as fast as single-stage, fast detectors like YOLO and SSD. We find that Faster R-CNN and R-FCN perform an intensive computation after or before RoI warping. Faster R-CNN involves two fully connected layers for RoI recognition, while R-FCN produces large score maps. Thus, these networks are slow because of the heavy-head design in the architecture. Even if we significantly reduce the base model, the computation cost cannot be decreased by much. We propose a new two-stage detector, Light-Head R-CNN, to address this shortcoming in current two-stage approaches. In our design, we make the head of the network as light as possible by using a thin feature map and a cheap R-CNN subnet (pooling and a single fully connected layer). Our ResNet-101-based Light-Head R-CNN outperforms state-of-the-art object detectors on COCO while keeping time efficiency. More importantly, simply replacing the backbone with a tiny network (e.g., Xception), our Light-Head R-CNN gets 30.7 mmAP at 102 FPS on COCO, significantly outperforming the single-stage, fast detectors like YOLO and SSD on both speed and accuracy. Code will be made publicly available.

273 citations
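
To illustrate the heavy-head versus light-head contrast described in the abstract above, here is a minimal sketch that counts the weights of a per-RoI head made of fully connected layers. The layer sizes (a 7×7 pooling grid, 256 or 10 channels, 1024- or 2048-wide layers) are illustrative assumptions rather than values stated on this page, and `fc_params` and `head_params` are hypothetical helper names.

```python
from typing import Sequence


def fc_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


def head_params(pooled_features: int, fc_widths: Sequence[int]) -> int:
    """Total parameters of a stack of fully connected layers applied to one RoI."""
    total = 0
    in_features = pooled_features
    for width in fc_widths:
        total += fc_params(in_features, width)
        in_features = width
    return total


if __name__ == "__main__":
    pool = 7 * 7  # assumed RoI pooling grid (hypothetical)

    # Assumed "heavy" head: thick pooled feature map, two FC layers.
    heavy = head_params(pool * 256, [1024, 1024])

    # Assumed "light" head: thin pooled feature map, single FC layer.
    light = head_params(pool * 10, [2048])

    print(f"heavy head parameters: {heavy:,}")
    print(f"light head parameters: {light:,}")
    print(f"ratio: {heavy / light:.1f}x")
```

Because the head runs once per region of interest, its cost is multiplied by the number of proposals, which is why a lighter head can matter even when the backbone is already small.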

Posted Content

DetNet: A Backbone network for Object Detection

Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun

17 Apr 2018, arXiv: Computer Vision and Pattern Recognition

TL;DR: State-of-the-art results have been obtained for both object detection and instance segmentation on the MSCOCO benchmark based on the DetNet (4.8G FLOPs) backbone.

Abstract: Recent CNN-based object detectors, whether one-stage methods like YOLO, SSD, and RetinaNet or two-stage detectors like Faster R-CNN, R-FCN, and FPN, usually fine-tune directly from ImageNet pre-trained models designed for image classification. There has been little work discussing backbone feature extractors specifically designed for object detection. More importantly, there are several differences between the tasks of image classification and object detection. 1. Recent object detectors like FPN and RetinaNet usually involve extra stages compared with image classification networks to handle objects of various scales. 2. Object detection needs not only to recognize the category of the object instances but also to locate them spatially. A large downsampling factor brings a large valid receptive field, which is good for image classification but compromises the ability to localize objects. Because of this gap between image classification and object detection, we propose DetNet in this paper, a novel backbone network specifically designed for object detection. DetNet includes the extra stages relative to a traditional backbone network for image classification while maintaining high spatial resolution in deeper layers. Without any bells and whistles, state-of-the-art results have been obtained for both object detection and instance segmentation on the MSCOCO benchmark based on our DetNet (4.8G FLOPs) backbone. The code will be released for reproduction.

238 citations
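
The trade-off noted in the abstract above, where a larger downsampling factor helps classification but compromises localization, comes down to simple arithmetic on feature-map sizes. The sketch below is illustrative only: the input size, object size, and strides are assumed values, and `feature_map_size` and `cells_covering` are hypothetical helper names.

```python
def feature_map_size(input_size: int, stride: int) -> int:
    """Side length of a feature map after downsampling by `stride` (ceiling division)."""
    return -(-input_size // stride)


def cells_covering(object_size: int, stride: int) -> float:
    """Approximate number of feature-map cells an object spans along one axis."""
    return object_size / stride


if __name__ == "__main__":
    input_size = 512   # assumed input resolution in pixels (hypothetical)
    object_size = 24   # assumed small-object width in pixels (hypothetical)

    for stride in (16, 32, 64):
        side = feature_map_size(input_size, stride)
        span = cells_covering(object_size, stride)
        print(f"stride {stride:>2}: {side}x{side} map, object spans {span:.2f} cells")
```

Once an object spans less than one cell along an axis, its position within that cell can no longer be read directly from the feature map, which is the localization cost the abstract refers to.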

Book Chapter

DetNet: Design Backbone for Object Detection

Zeming Li¹, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng¹, Jian Sun

Tsinghua University¹

08 Sep 2018

TL;DR: DetNet is a novel backbone network designed specifically for object detection; it includes extra stages relative to traditional image-classification backbones while maintaining high spatial resolution in deeper layers.

Abstract: Recent CNN-based object detectors, either one-stage methods like YOLO, SSD, and RetinaNet, or two-stage detectors like Faster R-CNN, R-FCN, and FPN, usually fine-tune directly from ImageNet pre-trained models designed for the task of image classification. However, there has been little work discussing backbone feature extractors specifically designed for the task of object detection. More importantly, there are several differences between the tasks of image classification and object detection. (i) Recent object detectors like FPN and RetinaNet usually involve extra stages compared with image classification networks to handle objects of various scales. (ii) Object detection needs not only to recognize the category of the object instances but also to locate them spatially. Large downsampling factors bring a large valid receptive field, which is good for image classification but compromises the ability to localize objects. Because of this gap between image classification and object detection, we propose DetNet in this paper, a novel backbone network specifically designed for object detection. DetNet includes the extra stages relative to a traditional backbone network for image classification while maintaining high spatial resolution in deeper layers. Without any bells and whistles, state-of-the-art results have been obtained for both object detection and instance segmentation on the MSCOCO benchmark based on our DetNet (4.8G FLOPs) backbone. Code will be released (https://github.com/zengarden/DetNet).

233 citations

Proceedings Article

Interconnect characteristics of 2.5-D system integration scheme

Yangdong Deng¹, Wojciech Maly¹

Carnegie Mellon University¹

01 Apr 2001

TL;DR: This paper compares wire length distributions obtained for 2-D and 2.5-D implementations of benchmark circuits and observes significant reductions in both total wirelength and worst-case wirelength for the systems implemented as 2.5-D ICs.

Abstract: The growing number of excessively long on-chip wires in modern monolithic ICs is a byproduct of growing chip size. To address this problem, instead of placing all system components in one layer (i.e., in 2-D space), one can use a stack of single-layer monolithic ICs (called here a 2.5-D integrated IC). To assess the potential benefits of such a 2.5-D integration scheme, this paper compares wire length distributions obtained for 2-D and 2.5-D implementations of benchmark circuits. Two newly developed floorplanning and placement tools were used in the assessment. Significant reductions in both total wirelength and worst-case wirelength were observed for the systems implemented as 2.5-D ICs.

137 citations

Journal Article

A Two-Hop Wireless Power Transfer System With an Efficiency-Enhanced Power Receiver for Motion-Free Capsule Endoscopy Inspection

Tianjia Sun¹, Xiang Xie¹, Guolin Li¹, Yingke Gu¹, Yangdong Deng¹, Zhihua Wang¹

Tsinghua University¹

29 Jun 2012, IEEE Transactions on Biomedical Engineering

TL;DR: A two-hop wireless power transfer system for motion-free capsule endoscopy inspection is presented that makes patients much more comfortable and eliminates the reliability issues arising from the moving cable and connectors.

Abstract: This paper presents a wireless power transfer system for motion-free capsule endoscopy inspection. Conventionally, a wireless power transmitter in a specifically designed jacket has to be connected to a strong power source with a long cable. To avoid the power cable and allow patients to walk freely in a room, this paper proposes a two-hop wireless power transfer system. First, power is transferred from a floor to a power relay in the patient's jacket via strong coupling. Next, power is delivered from the power relay to the capsule via loose coupling. Besides making patients much more comfortable, the proposed techniques eliminate the reliability issues arising from the moving cable and connectors. In the capsule, it is critical to enhance the power conversion efficiency. This paper develops a switch-mode rectifier (rectifying efficiency of 93.6%) and a power combination circuit (which enhances combining efficiency by 18%). Thanks to the two-hop transfer mechanism and the novel circuit techniques, this system is able to transfer an average power of 24 mW and a peak power of 90 mW from the floor to a 13 mm × 27 mm capsule over a distance of 1 m with a maximum dc-to-dc power efficiency of 3.04%.

131 citations
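
The power figures in the abstract above can be related with a short calculation. The sketch below is a rough illustration under stated assumptions: it treats the two hops and the rectifier as stages in series whose efficiencies multiply, and it applies the reported 3.04% maximum dc-to-dc efficiency at the 24 mW average operating point, which the abstract does not state. The per-hop efficiencies in the example are hypothetical placeholders; only the 93.6% rectifier figure comes from the abstract.

```python
from typing import Iterable


def cascade_efficiency(stage_efficiencies: Iterable[float]) -> float:
    """End-to-end efficiency of stages connected in series (product of stages)."""
    total = 1.0
    for eff in stage_efficiencies:
        if not 0.0 < eff <= 1.0:
            raise ValueError("each efficiency must be in (0, 1]")
        total *= eff
    return total


def required_input_mw(delivered_mw: float, efficiency: float) -> float:
    """Source power needed to deliver `delivered_mw` at a given end-to-end efficiency."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return delivered_mw / efficiency


if __name__ == "__main__":
    # Hypothetical hop efficiencies; 0.936 is the rectifier figure from the abstract.
    stages = [0.40, 0.10, 0.936]
    print(f"example cascade efficiency: {cascade_efficiency(stages):.4f}")

    # Assumption: the 3.04% maximum efficiency holds at the 24 mW average load.
    print(f"input power for 24 mW: {required_input_mw(24.0, 0.0304):.1f} mW")
```

The multiplication makes the design pressure visible: any loss in one hop scales the whole chain, so improving the in-capsule conversion stages raises the end-to-end figure directly.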

Cited by

Posted Content

YOLOv4: Optimal Speed and Accuracy of Object Detection

Alexey Bochkovskiy, Chien-Yao Wang¹, Hong-Yuan Mark Liao¹

Academia Sinica¹

23 Apr 2020, arXiv: Computer Vision and Pattern Recognition

TL;DR: This work uses new features (WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss) and combines some of them to achieve state-of-the-art results: 43.5% AP on the MS COCO dataset at a real-time speed of ~65 FPS on a Tesla V100.

Abstract: There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy. Practical testing of combinations of such features on large datasets, and theoretical justification of the result, is required. Some features operate on certain models exclusively and for certain problems exclusively, or only for small-scale datasets; while some features, such as batch-normalization and residual-connections, are applicable to the majority of models, tasks, and datasets. We assume that such universal features include Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT) and Mish-activation. We use new features: WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, CmBN, DropBlock regularization, and CIoU loss, and combine some of them to achieve state-of-the-art results: 43.5% AP (65.7% AP50) for the MS COCO dataset at a realtime speed of ~65 FPS on Tesla V100. Source code is at this https URL

5,709 citations
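
Of the features listed in the abstract above, Mish activation is the simplest to show in isolation. The sketch below uses the standard definition, mish(x) = x · tanh(softplus(x)), in plain Python; it is not taken from the YOLOv4 source code, and the numerically stable `softplus` helper is an implementation choice made here.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"mish({value:+.1f}) = {mish(value):+.6f}")
```

For large positive inputs the function approaches the identity, and for large negative inputs it approaches zero from below, so it behaves like a smooth variant of ReLU that lets small negative values through.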

Book Chapter

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng¹, Jian Sun

Tsinghua University¹

08 Sep 2018

TL;DR: This work proposes evaluating the direct metric on the target platform rather than considering only FLOPs, derives several practical guidelines for efficient network design from a series of controlled experiments, and presents a new architecture, ShuffleNet V2.

Abstract: Currently, neural network architecture design is mostly guided by the indirect metric of computation complexity, i.e., FLOPs. However, the direct metric, e.g., speed, also depends on other factors such as memory access cost and platform characteristics. Thus, this work proposes to evaluate the direct metric on the target platform, beyond only considering FLOPs. Based on a series of controlled experiments, this work derives several practical guidelines for efficient network design. Accordingly, a new architecture is presented, called ShuffleNet V2. Comprehensive ablation experiments verify that our model is the state of the art in terms of the speed and accuracy tradeoff.

3,393 citations

The C Programming Language

Brian W. Kernighan¹, Dennis M. Ritchie¹

AT&T¹

01 Jan 1978

TL;DR: This ebook is the first authorized digital version of Kernighan and Ritchie's 1988 classic, The C Programming Language (2nd Ed.), and is a "must-have" reference for every serious programmer's digital library.

Abstract: This ebook is the first authorized digital version of Kernighan and Ritchie's 1988 classic, The C Programming Language (2nd Ed.). One of the best-selling programming books published in the last fifty years, "K&R" has been called everything from the "bible" to "a landmark in computer science" and it has influenced generations of programmers. Available now for all leading ebook platforms, this concise and beautifully written text is a "must-have" reference for every serious programmer's digital library. As modestly described by the authors in the Preface to the First Edition, this "is not an introductory programming manual; it assumes some familiarity with basic programming concepts like variables, assignment statements, loops, and functions. Nonetheless, a novice programmer should be able to read along and pick up the language, although access to a more knowledgeable colleague will help."

2,120 citations

Proceedings Article

CSPNet: A New Backbone that can Enhance Learning Capability of CNN

Chien-Yao Wang¹, Hong-Yuan Mark Liao¹, Yueh-Hua Wu¹, Ping-Yang Chen², Jun-Wei Hsieh², I-Hau Yeh

Academia Sinica¹, National Chiao Tung University²

14 Jun 2020

TL;DR: Cross Stage Partial Network (CSPNet) as discussed by the authors integrates feature maps from the beginning and the end of a network stage to mitigate the problem of duplicate gradient information within network optimization.

Abstract: Neural networks have enabled state-of-the-art approaches to achieve incredible results on computer vision tasks such as object detection. However, such success greatly relies on costly computation resources, which hinders people with cheap devices from appreciating the advanced technology. In this paper, we propose Cross Stage Partial Network (CSPNet) to mitigate the problem that previous works require heavy inference computations from the network architecture perspective. We attribute the problem to the duplicate gradient information within network optimization. The proposed networks respect the variability of the gradients by integrating feature maps from the beginning and the end of a network stage, which, in our experiments, reduces computations by 20% with equivalent or even superior accuracy on the ImageNet dataset, and significantly outperforms state-of-the-art approaches in terms of AP50 on the MS COCO object detection dataset. The CSPNet is easy to implement and general enough to cope with architectures based on ResNet, ResNeXt, and DenseNet.

1,991 citations

Posted Content

Group Normalization

Yuxin Wu, Kaiming He

22 Mar 2018, arXiv: Computer Vision and Pattern Recognition

TL;DR: Group Normalization can outperform its BN-based counterparts for object detection and segmentation in COCO, and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks.

Abstract: Batch Normalization (BN) is a milestone technique in the development of deep learning, enabling various networks to train. However, normalizing along the batch dimension introduces problems: BN's error increases rapidly when the batch size becomes smaller, caused by inaccurate batch statistics estimation. This limits BN's usage for training larger models and transferring features to computer vision tasks including detection, segmentation, and video, which require small batches constrained by memory consumption. In this paper, we present Group Normalization (GN) as a simple alternative to BN. GN divides the channels into groups and computes within each group the mean and variance for normalization. GN's computation is independent of batch sizes, and its accuracy is stable in a wide range of batch sizes. On ResNet-50 trained in ImageNet, GN has 10.6% lower error than its BN counterpart when using a batch size of 2; when using typical batch sizes, GN is comparably good with BN and outperforms other normalization variants. Moreover, GN can be naturally transferred from pre-training to fine-tuning. GN can outperform its BN-based counterparts for object detection and segmentation in COCO, and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks. GN can be easily implemented by a few lines of code in modern libraries.

1,924 citations
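
The grouping step described in the abstract above (divide the channels into groups, then normalize with each group's own mean and variance) can be sketched in plain Python. This is a minimal illustration for a single sample, not the authors' implementation: the function name `group_norm`, the list-of-channels layout, and the `eps` default are assumptions, and the learnable per-channel scale and shift are omitted.

```python
import math
from typing import List


def group_norm(x: List[List[float]], num_groups: int, eps: float = 1e-5) -> List[List[float]]:
    """Normalize one sample given as channels, each a flat list of spatial values.

    Statistics are computed per group of channels, so no batch dimension is involved.
    """
    num_channels = len(x)
    if num_groups <= 0 or num_channels % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    per_group = num_channels // num_groups

    out: List[List[float]] = []
    for g in range(num_groups):
        channels = x[g * per_group:(g + 1) * per_group]
        values = [v for channel in channels for v in channel]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        for channel in channels:
            out.append([(v - mean) * scale for v in channel])
    return out


if __name__ == "__main__":
    # Four channels with two spatial positions each, split into two groups.
    sample = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
    for channel in group_norm(sample, num_groups=2):
        print([round(v, 4) for v in channel])
```

Because the mean and variance come from a single sample's channels, the result does not change with batch size, which is the property the abstract highlights.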
