Author

Huaibo Huang

Other affiliations: Anhui University, Center for Excellence in Education

Bio: Huaibo Huang is an academic researcher from the Chinese Academy of Sciences. The author has contributed to research on the topics of computer science and autoencoders. The author has an h-index of 11 and has co-authored 52 publications receiving 747 citations. Previous affiliations of Huaibo Huang include Anhui University and the Center for Excellence in Education.

Papers published on a yearly basis (chart; years covered: 2014, 2015, and 2017 to 2023)

Papers

Proceedings Article•DOI•

Wavelet-SRNet: A Wavelet-Based CNN for Multi-scale Face Super Resolution

Huaibo Huang¹, Ran He¹, Zhenan Sun¹, Tieniu Tan¹

Chinese Academy of Sciences¹

01 Oct 2017

TL;DR: Presents a wavelet-based CNN that can ultra-resolve a very low-resolution face image of 16 × 16 pixels or smaller to multiple scaling factors in a unified framework, trained with three types of loss: wavelet prediction loss, texture loss, and full-image loss.

Abstract: Most modern face super-resolution methods resort to convolutional neural networks (CNN) to infer high-resolution (HR) face images. When dealing with very low resolution (LR) images, the performance of these CNN-based methods greatly degrades. Meanwhile, these methods tend to produce over-smoothed outputs and miss some textural details. To address these challenges, this paper presents a wavelet-based CNN approach that can ultra-resolve a very low resolution face image of 16 × 16 pixels or smaller to its larger version at multiple scaling factors (2×, 4×, 8× and even 16×) in a unified framework. Unlike conventional CNN methods that directly infer HR images, our approach first learns to predict the LR image's corresponding series of HR wavelet coefficients before reconstructing HR images from them. To capture both global topology information and local texture details of human faces, we present a flexible and extensible convolutional neural network with three types of loss: wavelet prediction loss, texture loss and full-image loss. Extensive experiments demonstrate that the proposed approach achieves more appealing results both quantitatively and qualitatively than state-of-the-art super-resolution methods.

369 citations
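
To make the idea of "predicting wavelet coefficients, then reconstructing the image from them" concrete, here is a minimal sketch in plain Python. It assumes a single-level Haar wavelet, which the abstract does not specify, and the function names (`haar_dwt2`, `haar_idwt2`, `coefficient_mse`) are hypothetical. It only illustrates that an image splits into four half-size sub-bands that reconstruct it exactly, and that a loss can be measured on the coefficients; it is not the paper's network or its actual loss definitions.

```python
def haar_dwt2(img):
    """One level of a 2D Haar transform on a list-of-lists image with even sides.

    Returns four half-size sub-bands: a local average (ll) and three detail
    bands holding column, row, and diagonal differences (lh, hl, hh).
    """
    h, w = len(img), len(img[0])
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ll[i // 2][j // 2] = (a + b + c + d) / 2.0  # smooth approximation
            lh[i // 2][j // 2] = (a - b + c - d) / 2.0  # difference across columns
            hl[i // 2][j // 2] = (a + b - c - d) / 2.0  # difference across rows
            hh[i // 2][j // 2] = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2: rebuild the full-size image from the four sub-bands."""
    h, w = len(ll), len(ll[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, x, y, z = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            img[2 * i][2 * j] = (s + x + y + z) / 2.0
            img[2 * i][2 * j + 1] = (s - x + y - z) / 2.0
            img[2 * i + 1][2 * j] = (s + x - y - z) / 2.0
            img[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2.0
    return img


def coefficient_mse(pred_bands, true_bands):
    """Mean squared error over all sub-band coefficients (an assumed stand-in
    for a wavelet prediction loss; the abstract gives no formula)."""
    total, count = 0.0, 0
    for p_band, t_band in zip(pred_bands, true_bands):
        for p_row, t_row in zip(p_band, t_band):
            for p, t in zip(p_row, t_row):
                total += (p - t) ** 2
                count += 1
    return total / count


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
bands = haar_dwt2(image)
restored = haar_idwt2(*bands)
```

Because the transform is invertible, a model that predicts the sub-bands accurately also determines the high-resolution image, while the detail bands give the training signal direct access to texture information.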

Proceedings Article•

IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis

Huaibo Huang¹, Zhihang Li¹, Ran He¹, Zhenan Sun¹, Tieniu Tan¹

Chinese Academy of Sciences¹

17 Jul 2018

TL;DR: A novel introspective variational autoencoder (IntroVAE) model for synthesizing high-resolution photographic images that is capable of self-evaluating the quality of its generated samples and improving itself accordingly and requires no extra discriminators.

Abstract: We present a novel introspective variational autoencoder (IntroVAE) model for synthesizing high-resolution photographic images. IntroVAE is capable of self-evaluating the quality of its generated samples and improving itself accordingly. Its inference and generator models are jointly trained in an introspective way. On one hand, the generator is required to reconstruct the input images from the noisy outputs of the inference model as normal VAEs. On the other hand, the inference model is encouraged to classify between the generated and real samples while the generator tries to fool it as GANs. These two famous generative frameworks are integrated in a simple yet efficient single-stream architecture that can be trained in a single stage. IntroVAE preserves the advantages of VAEs, such as stable training and nice latent manifold. Unlike most other hybrid models of VAEs and GANs, IntroVAE requires no extra discriminators, because the inference model itself serves as a discriminator to distinguish between the generated and real samples. Experiments demonstrate that our method produces high-resolution photo-realistic images (e.g., CELEBA images at $1024^{2}$), which are comparable to or better than the state-of-the-art GANs.

178 citations
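
The IntroVAE abstract describes an inference model that both encodes images as in a normal VAE and separates real from generated samples. A minimal sketch of the standard VAE pieces this builds on is given below: the closed-form KL divergence of a diagonal Gaussian from a standard normal, and the reparameterized "noisy" latent sample. The `hinge` helper and the margin value are assumptions added for illustration of how a KL value could double as a real/fake score; the abstract itself states no formulas, and all function names here are hypothetical.

```python
import math
import random


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1): the noisy encoder output."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def hinge(margin, kl_value):
    """Assumed margin term: positive only while the KL of a sample is below margin."""
    return max(0.0, margin - kl_value)


rng = random.Random(0)

# Hypothetical encoder outputs for one real image and one generated image.
kl_real = kl_to_standard_normal([0.1, -0.2], [0.0, 0.1])
kl_fake = kl_to_standard_normal([0.3, 0.4], [-0.1, 0.0])

# Illustrative adversarial terms (an assumption, not quoted from the source):
# the inference model keeps real KL low and pushes generated KL above a margin,
# while the generator tries to make its samples' KL low.
encoder_term = kl_real + hinge(2.0, kl_fake)
generator_term = kl_fake

z = reparameterize([0.1, -0.2], [0.0, 0.1], rng)
```

Under this reading, no separate discriminator network is needed because the quantity used to tell real from generated samples is computed from the encoder's own outputs.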

Posted Content•

IntroVAE: Introspective Variational Autoencoders for Photographic Image Synthesis

Huaibo Huang¹, Zhihang Li¹, Ran He¹, Zhenan Sun¹, Tieniu Tan¹

Chinese Academy of Sciences¹

17 Jul 2018-arXiv: Learning

TL;DR: In this paper, an introspective variational autoencoder (IntroVAE) model is proposed to synthesize high-resolution photographic images, which is capable of self-evaluating the quality of its generated samples and improving itself accordingly.

91 citations

Journal Article•DOI•

Disentangled Variational Representation for Heterogeneous Face Recognition

Xiang Wu¹, Huaibo Huang¹, Vishal M. Patel², Ran He¹, Zhenan Sun¹

Chinese Academy of Sciences¹, Johns Hopkins University²

17 Jul 2019

TL;DR: In this paper, a disentangled variational representation (DVR) was proposed for cross-modal matching, where a variational lower bound was employed to optimize the approximate posterior for NIR and VIS representations.

Abstract: Visible (VIS) to near infrared (NIR) face matching is a challenging problem due to the significant discrepancy between the domains and a lack of sufficient data for training cross-modal matching algorithms. Existing approaches attempt to tackle this problem by either synthesizing visible faces from NIR faces, extracting domain-invariant features from these modalities, or projecting heterogeneous data onto a common latent space for cross-modal matching. In this paper, we take a different approach in which we make use of the Disentangled Variational Representation (DVR) for cross-modal matching. First, we model a face representation with intrinsic identity information and its within-person variations. By exploring the disentangled latent variable space, a variational lower bound is employed to optimize the approximate posterior for NIR and VIS representations. Second, aiming at obtaining a more compact and discriminative disentangled latent space, we impose a minimization of the identity information for the same subject and a relaxed correlation alignment constraint between the NIR and VIS modality variations. An alternative optimization scheme is proposed for the disentangled variational representation part and the heterogeneous face recognition network part. The mutual promotion between these two parts effectively reduces the NIR and VIS domain discrepancy and alleviates over-fitting. Extensive experiments on three challenging NIR-VIS heterogeneous face recognition databases demonstrate that the proposed method achieves significant improvements over the state-of-the-art methods.

73 citations

Proceedings Article•

Dual Variational Generation for Low-Shot Heterogeneous Face Recognition

Chaoyou Fu¹, Xiang Wu¹, Yibo Hu¹, Huaibo Huang¹, Ran He¹

Chinese Academy of Sciences¹

25 Mar 2019

TL;DR: This paper considers HFR as a dual generation problem, and proposes a novel Dual Variational Generation (DVG) framework that generates large-scale new paired heterogeneous images with the same identity from noise, for the sake of reducing the domain gap of HFR.

Abstract: Heterogeneous Face Recognition (HFR) is a challenging issue because of the large domain discrepancy and a lack of heterogeneous data. This paper considers HFR as a dual generation problem, and proposes a novel Dual Variational Generation (DVG) framework. It generates large-scale new paired heterogeneous images with the same identity from noise, for the sake of reducing the domain gap of HFR. Specifically, we first introduce a dual variational autoencoder to represent a joint distribution of paired heterogeneous images. Then, in order to ensure the identity consistency of the generated paired heterogeneous images, we impose a distribution alignment in the latent space and a pairwise identity preserving in the image space. Moreover, the HFR network reduces the domain discrepancy by constraining the pairwise feature distances between the generated paired heterogeneous images. Extensive experiments on four HFR databases show that our method can significantly improve state-of-the-art results. When using the generated paired images for training, our method gains more than 18% True Positive Rate improvements over the baseline model when False Positive Rate is at $10^{-5}$.

58 citations
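
The DVG abstract reports results as True Positive Rate at a fixed False Positive Rate. The sketch below shows one common way to compute that metric from similarity scores: choose the acceptance threshold from the impostor (different-identity) scores so that at most the target fraction is accepted, then measure how many genuine (same-identity) pairs pass it. The function name, the strict `>` acceptance rule, and the toy scores are assumptions for illustration, not the paper's evaluation protocol.

```python
def tpr_at_fpr(genuine, impostor, target_fpr):
    """True Positive Rate at a threshold that accepts at most target_fpr of impostors.

    genuine:  similarity scores of same-identity pairs
    impostor: similarity scores of different-identity pairs
    A pair is accepted when its score is strictly above the threshold.
    """
    ranked = sorted(impostor, reverse=True)
    # Number of impostor pairs we may accept without exceeding the target rate.
    allowed = int(target_fpr * len(ranked))
    # Threshold sits at the first impostor score that must be rejected.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    accepted = sum(1 for score in genuine if score > threshold)
    return accepted / len(genuine)


# Toy scores, made up for the example.
genuine_scores = [0.35, 0.5, 0.9, 0.05]
impostor_scores = [0.1, 0.2, 0.3, 0.4]

rate = tpr_at_fpr(genuine_scores, impostor_scores, 0.25)
```

With a target as strict as $10^{-5}$, at least 100,000 impostor pairs are needed before even one false accept is allowed, which is why this operating point is a demanding test of a recognition model.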

Cited by

Journal Article•

Study Report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”

杉山拓海

12 Sep 2017-Computers & Graphics

3,940 citations

Proceedings Article•

A morphable model for the synthesis of 3D faces

Matthew Turk

01 Jan 1999

2,010 citations

Journal Article•DOI•

Deep Learning for Image Super-Resolution: A Survey

Zhihao Wang¹, Jian Chen¹, Steven C. H. Hoi²

South China University of Technology¹, Salesforce.com²

01 Oct 2021-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A survey on recent advances of image super-resolution techniques using deep learning approaches in a systematic way, which can roughly group the existing studies of SR techniques into three major categories: supervised SR, unsupervised SR, and domain-specific SR.

Abstract: Image Super-Resolution (SR) is an important class of image processing techniques to enhance the resolution of images and videos in computer vision. Recent years have witnessed remarkable progress of image super-resolution using deep learning techniques. This article aims to provide a comprehensive survey on recent advances of image super-resolution using deep learning approaches. In general, we can roughly group the existing studies of SR techniques into three major categories: supervised SR, unsupervised SR, and domain-specific SR. In addition, we also cover some other important issues, such as publicly available benchmark datasets and performance evaluation metrics. Finally, we conclude this survey by highlighting several future directions and open issues which should be further addressed by the community in the future.

837 citations

Proceedings Article•DOI•

FSRNet: End-to-End Learning Face Super-Resolution with Facial Priors

Yu Chen¹, Ying Tai², Xiaoming Liu³, Chunhua Shen⁴, Jian Yang¹

Nanjing University of Science and Technology¹, Tencent², Michigan State University³, University of Adelaide⁴

18 Jun 2018

TL;DR: Proposes a deep end-to-end trainable face super-resolution network (FSRNet), which makes use of the geometry prior, i.e., facial landmark heatmaps and parsing maps, to super-resolve very low-resolution (LR) face images without requiring them to be well aligned.

Abstract: Face Super-Resolution (SR) is a domain-specific super-resolution problem. The facial prior knowledge can be leveraged to better super-resolve face images. We present a novel deep end-to-end trainable Face Super-Resolution Network (FSRNet), which makes use of the geometry prior, i.e., facial landmark heatmaps and parsing maps, to super-resolve very low-resolution (LR) face images without requiring them to be well aligned. Specifically, we first construct a coarse SR network to recover a coarse high-resolution (HR) image. Then, the coarse HR image is sent to two branches: a fine SR encoder and a prior information estimation network, which extract the image features and estimate landmark heatmaps/parsing maps, respectively. Both image features and prior information are sent to a fine SR decoder to recover the HR image. To generate realistic faces, we also propose the Face Super-Resolution Generative Adversarial Network (FSRGAN) to incorporate the adversarial loss into FSRNet. Further, we introduce two related tasks, face alignment and parsing, as the new evaluation metrics for face SR, which address the inconsistency of classic metrics w.r.t. visual perception. Extensive experiments show that FSRNet and FSRGAN significantly outperform the state of the art for very LR face SR, both quantitatively and qualitatively.

415 citations

Posted Content•

A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications

Jie Gui¹, Zhenan Sun, Yonggang Wen², Dacheng Tao³, Jieping Ye⁴

Southeast University¹, Nanyang Technological University², University of Sydney³, University of Michigan⁴

20 Jan 2020-arXiv: Learning

TL;DR: This paper attempts to provide a review of various GAN methods from the perspectives of algorithms, theory, and applications, and compares the commonalities and differences of these GAN methods.

Abstract: Generative adversarial networks (GANs) have recently become a hot research topic. GANs have been widely studied since 2014, and a large number of algorithms have been proposed. However, there are few comprehensive studies explaining the connections among different GAN variants and how they have evolved. In this paper, we attempt to provide a review of various GAN methods from the perspectives of algorithms, theory, and applications. Firstly, the motivations, mathematical representations, and structure of most GAN algorithms are introduced in detail. Furthermore, GANs have been combined with other machine learning algorithms for specific applications, such as semi-supervised learning, transfer learning, and reinforcement learning. This paper compares the commonalities and differences of these GAN methods. Secondly, theoretical issues related to GANs are investigated. Thirdly, typical applications of GANs in image processing and computer vision, natural language processing, music, speech and audio, the medical field, and data science are illustrated. Finally, the future open research problems for GANs are pointed out.

344 citations
