Gao Huang

Researcher at Tsinghua University

Publications -  164
Citations -  43663

Gao Huang is an academic researcher from Tsinghua University. The author has contributed to research in topics: Computer science & Feature (computer vision). The author has an h-index of 37 and has co-authored 124 publications receiving 26697 citations. Previous affiliations of Gao Huang include Cornell University & University of Science and Technology of China.

Papers
Journal ArticleDOI

Fine-grained few shot learning with foreground object transformation

TL;DR: The authors propose a novel method named foreground object transformation (FOT), which is composed of a foreground object extractor and a posture transformation generator. The method removes the image background, which tends to increase the difficulty of fine-grained image classification because it amplifies intra-class variance while reducing inter-class variance.
Proceedings ArticleDOI

Contrastive Language-Image Pre-Training with Knowledge Graphs

TL;DR: This paper proposes a knowledge-based pre-training framework, dubbed Knowledge-CLIP, which injects semantic information into the widely used CLIP model. The framework can semantically align the representations in vision and language with higher quality and enhance the reasoning ability across scenarios and modalities.
Journal ArticleDOI

Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information

TL;DR: In this article, the authors propose a general multi-modal mutual information formula as a unified optimization target and demonstrate that all existing pre-training strategies are special cases of their framework.
Posted Content

Towards Learning Spatially Discriminative Feature Representations

TL;DR: The authors propose a loss function, termed CAM-loss, that constrains the embedded feature maps with the class activation maps (CAMs), which indicate the spatially discriminative regions of an image for particular categories.
Posted Content

FSD-10: A Dataset for Competitive Sports Content Analysis

TL;DR: A keyframe-based temporal segment network (KTSN) is proposed for classification and achieves remarkable performance. It is motivated by the idea that domain knowledge is of great concern in the sports field.