Hang Zhao
Researcher at Tsinghua University
Publications - 108
Citations - 19405
Hang Zhao is an academic researcher at Tsinghua University. His research focuses on computer science and artificial neural networks. He has an h-index of 32 and has co-authored 83 publications receiving 12696 citations. Previous affiliations of Hang Zhao include Zhejiang University and Nvidia.
Papers
Proceedings ArticleDOI
CVC: Contrastive Learning for Non-Parallel Voice Conversion
TL;DR: A contrastive learning-based adversarial approach to voice conversion that requires only an efficient one-way GAN training, taking advantage of contrastive learning.
Posted Content
CLOUD: Contrastive Learning of Unsupervised Dynamics
Jianren Wang,Yujie Lu,Hang Zhao +2 more
TL;DR: This work proposes to learn forward and inverse dynamics in a fully unsupervised manner via contrastive estimation in the feature space of states and actions with data collected from random exploration.
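Contrastive estimation of this kind is typically built on an InfoNCE-style objective that pulls a predicted feature toward its true counterpart while pushing it away from in-batch negatives. A minimal NumPy sketch of that objective (the function name and toy features are illustrative, not the paper's code):

```python
import numpy as np

def info_nce(query, positives, temperature=0.1):
    """InfoNCE-style contrastive loss over a batch.

    query[i] is matched to positives[i]; all other rows of
    `positives` act as in-batch negatives.
    """
    # Normalize so the dot product is cosine similarity.
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = q @ p.T / temperature               # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The correct pairing sits on the diagonal.
    return -np.mean(np.diag(log_probs))

# Toy usage: predicted next-state features vs. encoded true next states.
rng = np.random.default_rng(0)
pred = rng.normal(size=(8, 16))                 # e.g. f(s_t, a_t) from a forward model
true = pred + 0.01 * rng.normal(size=(8, 16))   # near-matching positives
matched = info_nce(pred, true)
mismatched = info_nce(pred, rng.normal(size=(8, 16)))
```

With matched positives the loss is near zero; with random "positives" it tends toward log(batch size), which is what makes the objective usable without labels.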
Journal ArticleDOI
Training-Free Robust Multimodal Learning via Sample-Wise Jacobian Regularization
TL;DR: A training-free robust late-fusion method that exploits a conditional independence assumption and Jacobian regularization to minimize the Frobenius norm of a Jacobian matrix; the resulting optimization problem is relaxed to a tractable Sylvester equation.
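The tractable form mentioned above is a Sylvester equation AX + XB = Q. The actual A, B, Q in the paper depend on the fusion setup; as a generic illustration only, here is a minimal NumPy solver using the vectorization identity (I ⊗ A + Bᵀ ⊗ I) vec(X) = vec(Q):

```python
import numpy as np

def solve_sylvester(A, B, Q):
    """Solve the Sylvester equation A X + X B = Q by vectorization:
    (I_m kron A + B^T kron I_n) vec(X) = vec(Q), with column-major vec."""
    n, m = Q.shape
    K = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))
    x = np.linalg.solve(K, Q.flatten(order="F"))  # column-major vec(Q)
    return x.reshape(n, m, order="F")

# Toy instance (shifted by 4*I so eigenvalue sums stay away from zero,
# keeping the linear system nonsingular for the demo).
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 4 * np.eye(4)
B = rng.normal(size=(3, 3)) + 4 * np.eye(3)
Q = rng.normal(size=(4, 3))
X = solve_sylvester(A, B, Q)
residual = np.linalg.norm(A @ X + X @ B - Q)  # ~ machine precision
```

The dense Kronecker approach is O((nm)^3) and only suitable for small problems; production code would use a Bartels-Stewart-style solver such as `scipy.linalg.solve_sylvester`.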
Journal ArticleDOI
PAND: Precise Action Recognition on Naturalistic Driving
TL;DR: An effective activity temporal localization and classification method that localizes the temporal boundaries and predicts the class labels of driver activities in naturalistic driving; it ranks 6th on Test-A2 of track 3 in the 6th AI City Challenge.
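A basic building block of temporal localization pipelines like this is collapsing per-frame class predictions into (start, end, label) segments. A minimal sketch of that post-processing step (the function name and toy inputs are illustrative, not the paper's code):

```python
def frames_to_segments(frame_labels, fps=1.0, background=0):
    """Collapse per-frame class predictions into (start_sec, end_sec, label)
    temporal segments, skipping the background class."""
    segments, start, cur = [], None, background
    # Append a trailing background frame so the final segment is closed.
    for i, lab in enumerate(list(frame_labels) + [background]):
        if lab != cur:
            if cur != background:
                segments.append((start / fps, i / fps, cur))
            start, cur = i, lab
    return segments

# Toy usage: 7 frames at 1 fps with two activities (classes 1 and 2).
segs = frames_to_segments([0, 1, 1, 0, 2, 2, 2])
# segs == [(1.0, 3.0, 1), (4.0, 7.0, 2)]
```

Real systems add smoothing of the frame-level scores and merging of nearby segments before this step, but the boundary extraction itself reduces to this run-length scan.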
Proceedings ArticleDOI
Depth Estimation Matters Most: Improving Per-Object Depth Estimation for Monocular 3D Detection and Tracking
Longlong Jing,Ruichi Yu,Henrik Kretzschmar,Kang Li,Charles R. Qi,Hang Zhao,Alper Ayvaci,Xu Chen,Dillon Cower,Yingwei Li,Yurong You,Han Deng,Congcong Li,Dragomir Anguelov +13 more
TL;DR: This work proposes a multi-level fusion method that combines different representations (RGB and pseudo-LiDAR) and temporal information across multiple frames for objects (tracklets) to enhance per-object depth estimation. It demonstrates that simply replacing estimated depth with fusion-enhanced depth yields significant improvements in monocular 3D perception tasks, including detection and tracking.