Author

Chien-Hsiang Huang

Bio: Chien-Hsiang Huang is an academic researcher at National Tsing Hua University. The author has contributed to research on inference and artificial neural networks, has an h-index of 3, and has co-authored 3 publications receiving 123 citations.

Papers

Proceedings Article

HarDNet: A Low Memory Traffic Network

Ping Chao¹, Chao-Yang Kao², Yu-Shan Ruan², Chien-Hsiang Huang², Youn-Long Lin²

University of Michigan¹, National Tsing Hua University²

01 Oct 2019

TL;DR: This paper proposes a Harmonic Densely Connected Network (HarDNet) that achieves high efficiency in terms of both low MACs and low memory traffic for real-time object detection and semantic segmentation.

Abstract: State-of-the-art neural network architectures such as ResNet, MobileNet, and DenseNet have achieved outstanding accuracy over low MACs and small model size counterparts. However, these metrics might not be accurate for predicting the inference time. We suggest that memory traffic for accessing intermediate feature maps can be a factor dominating the inference latency, especially in such tasks as real-time object detection and semantic segmentation of high-resolution video. We propose a Harmonic Densely Connected Network to achieve high efficiency in terms of both low MACs and memory traffic. The new network achieves 35%, 36%, 30%, 32%, and 45% inference time reduction compared with FC-DenseNet-103, DenseNet-264, ResNet-50, ResNet-152, and SSD-VGG, respectively. We use tools including Nvidia profiler and ARM Scale-Sim to measure the memory traffic and verify that the inference latency is indeed proportional to the memory traffic consumption and the proposed network consumes low memory traffic. We conclude that one should take memory traffic into consideration when designing neural network architectures for high-resolution applications at the edge.

238 citations

Posted Content

HarDNet: A Low Memory Traffic Network

Ping Chao¹, Chao-Yang Kao², Yu-Shan Ruan², Chien-Hsiang Huang², Youn-Long Lin²

University of Michigan¹, National Tsing Hua University²

03 Sep 2019-arXiv: Computer Vision and Pattern Recognition

TL;DR: It is suggested that memory traffic for accessing intermediate feature maps can be a factor dominating the inference latency, especially in such tasks as real-time object detection and semantic segmentation of high-resolution video.

33 citations

Posted Content

HarDNet-MSEG: A Simple Encoder-Decoder Polyp Segmentation Neural Network that Achieves over 0.9 Mean Dice and 86 FPS

Chien-Hsiang Huang, Hung-Yu Wu, Youn-Long Lin

18 Jan 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper proposes HarDNet-MSEG, a convolutional neural network for polyp segmentation that consists of a backbone and a decoder and achieves SOTA in both accuracy and inference speed on five popular datasets.

Abstract: We propose a new convolution neural network called HarDNet-MSEG for polyp segmentation. It achieves SOTA in both accuracy and inference speed on five popular datasets. For Kvasir-SEG, HarDNet-MSEG delivers 0.904 mean Dice running at 86.7 FPS on a GeForce RTX 2080 Ti GPU. It consists of a backbone and a decoder. The backbone is a low memory traffic CNN called HarDNet68, which has been successfully applied to various CV tasks including image classification, object detection, multi-object tracking and semantic segmentation, etc. The decoder part is inspired by the Cascaded Partial Decoder, known for fast and accurate salient object detection. We have evaluated HarDNet-MSEG using those five popular datasets. The code and all experiment details are available at Github. this https URL

13 citations

Cited by

Posted Content

YOLOv4: Optimal Speed and Accuracy of Object Detection

Alexey Bochkovskiy, Chien-Yao Wang¹, Hong-Yuan Mark Liao¹

Academia Sinica¹

23 Apr 2020-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work uses new features (WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss) and combines some of them to achieve state-of-the-art results: 43.5% AP for the MS COCO dataset at a real-time speed of ~65 FPS on Tesla V100.

Abstract: There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy. Practical testing of combinations of such features on large datasets, and theoretical justification of the result, is required. Some features operate on certain models exclusively and for certain problems exclusively, or only for small-scale datasets; while some features, such as batch-normalization and residual-connections, are applicable to the majority of models, tasks, and datasets. We assume that such universal features include Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT) and Mish-activation. We use new features: WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, CmBN, DropBlock regularization, and CIoU loss, and combine some of them to achieve state-of-the-art results: 43.5% AP (65.7% AP50) for the MS COCO dataset at a realtime speed of ~65 FPS on Tesla V100. Source code is at this https URL

5,709 citations

Proceedings Article

CSPNet: A New Backbone that can Enhance Learning Capability of CNN

Chien-Yao Wang¹, Hong-Yuan Mark Liao¹, Yueh-Hua Wu¹, Ping-Yang Chen², Jun-Wei Hsieh², I-Hau Yeh

Academia Sinica¹, National Chiao Tung University²

14 Jun 2020

TL;DR: Cross Stage Partial Network (CSPNet) integrates feature maps from the beginning and the end of a network stage to mitigate the problem of duplicate gradient information within network optimization.

Abstract: Neural networks have enabled state-of-the-art approaches to achieve incredible results on computer vision tasks such as object detection. However, such success greatly relies on costly computation resources, which hinders people with cheap devices from appreciating the advanced technology. In this paper, we propose Cross Stage Partial Network (CSPNet) to mitigate the problem that previous works require heavy inference computations from the network architecture perspective. We attribute the problem to the duplicate gradient information within network optimization. The proposed networks respect the variability of the gradients by integrating feature maps from the beginning and the end of a network stage, which, in our experiments, reduces computations by 20% with equivalent or even superior accuracy on the ImageNet dataset, and significantly outperforms state-of-the-art approaches in terms of AP 50 on the MS COCO object detection dataset. The CSPNet is easy to implement and general enough to cope with architectures based on ResNet, ResNeXt, and DenseNet.

1,991 citations

Posted Content

Scaled-YOLOv4: Scaling Cross Stage Partial Network

Chien-Yao Wang¹, Alexey Bochkovskiy², Hong-Yuan Mark Liao¹

Academia Sinica¹, Intel²

16 Nov 2020-arXiv: Computer Vision and Pattern Recognition

TL;DR: It is shown that the YOLOv4 object detection neural network based on the CSP approach scales both up and down and is applicable to small and large networks while maintaining optimal speed and accuracy.

Abstract: We show that the YOLOv4 object detection neural network based on the CSP approach, scales both up and down and is applicable to small and large networks while maintaining optimal speed and accuracy. We propose a network scaling approach that modifies not only the depth, width, resolution, but also structure of the network. YOLOv4-large model achieves state-of-the-art results: 55.4% AP (73.3% AP50) for the MS COCO dataset at a speed of 15 FPS on Tesla V100, while with the test time augmentation, YOLOv4-large achieves 55.8% AP (73.2 AP50). To the best of our knowledge, this is currently the highest accuracy on the COCO dataset among any published work. The YOLOv4-tiny model achieves 22.0% AP (42.0% AP50) at a speed of 443 FPS on RTX 2080Ti, while by using TensorRT, batch size = 4 and FP16-precision the YOLOv4-tiny achieves 1774 FPS.

513 citations

Journal Article

FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking

Yifu Zhang¹, Chunyu Wang², Xinggang Wang¹, Wenjun Zeng², Wenyu Liu¹

Huazhong University of Science and Technology¹, Microsoft²

04 Apr 2020-arXiv: Computer Vision and Pattern Recognition

TL;DR: A simple approach consisting of two homogeneous branches that predict pixel-wise objectness scores and re-ID features allows FairMOT to obtain high levels of detection and tracking accuracy and to outperform the previous state of the art by a large margin on several public datasets.

Abstract: There has been remarkable progress on object detection and re-identification (re-ID) in recent years which are the key components of multi-object tracking. However, little attention has been focused on jointly accomplishing the two tasks in a single network. Our study shows that the previous attempts ended up with degraded accuracy mainly because the re-ID task is not fairly learned which causes many identity switches. The unfairness lies in two-fold: (1) they treat re-ID as a secondary task whose accuracy heavily depends on the primary detection task. So training is largely biased to the detection task but ignores the re-ID task; (2) they use ROI-Align to extract re-ID features which is directly borrowed from object detection. However, this introduces a lot of ambiguity in characterizing objects because many sampling points may belong to disturbing instances or background. To solve the problems, we present a simple approach FairMOT which consists of two homogeneous branches to predict pixel-wise objectness scores and re-ID features. The achieved fairness between the tasks allows FairMOT to obtain high levels of detection and tracking accuracy and outperform previous state-of-the-arts by a large margin on several public datasets. The source code and pre-trained models are released at this https URL.

507 citations

Book Chapter

TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation

Yundong Zhang, Huiye Liu¹, Qiang Hu

Georgia Institute of Technology¹

27 Sep 2021

TL;DR: TransFuse combines Transformers and CNNs in a parallel style, so that both global dependency and low-level spatial details can be efficiently captured in a much shallower manner.

Abstract: Medical image segmentation - the prerequisite of numerous clinical needs - has been significantly prospered by recent advances in convolutional neural networks (CNNs). However, it exhibits general limitations on modeling explicit long-range relation, and existing cures, resorting to building deep encoders along with aggressive downsampling operations, leads to redundant deepened networks and loss of localized details. Hence, the segmentation task awaits a better solution to improve the efficiency of modeling global contexts while maintaining a strong grasp of low-level details. In this paper, we propose a novel parallel-in-branch architecture, TransFuse, to address this challenge. TransFuse combines Transformers and CNNs in a parallel style, where both global dependency and low-level spatial details can be efficiently captured in a much shallower manner. Besides, a novel fusion technique - BiFusion module is created to efficiently fuse the multi-level features from both branches. Extensive experiments demonstrate that TransFuse achieves the newest state-of-the-art results on both 2D and 3D medical image sets including polyp, skin lesion, hip, and prostate segmentation, with significant parameter decrease and inference speed improvement.

365 citations
