Author

Jun Xing

Bio: Jun Xing is an academic researcher from the Institute for Creative Technologies. The author has contributed to research in topics including Deep learning & Rendering (computer graphics). The author has an h-index of 14 and has co-authored 30 publications receiving 732 citations. Previous affiliations of Jun Xing include the University of Science and Technology of China and Adobe Systems.

Papers
Proceedings ArticleDOI
01 Jun 2019
TL;DR: This paper provides a simple and uniform way to quantize both weights and activations by formulating quantization as a differentiable non-linear function, shedding new light on the interpretation of neural network quantization.
Abstract: Although deep neural networks are highly effective, their high computational and memory costs severely hinder their application to portable devices. As a consequence, low-bit quantization, which converts a full-precision neural network into a low-bitwidth integer version, has been an active and promising research topic. Existing methods formulate the low-bit quantization of networks as an approximation or optimization problem. Approximation-based methods confront the gradient mismatch problem, while optimization-based methods are only suitable for quantizing weights and can introduce high computational cost during the training stage. In this paper, we provide a simple and uniform way to quantize both weights and activations by formulating quantization as a differentiable non-linear function. The quantization function is represented as a linear combination of several Sigmoid functions with learnable biases and scales that can be learned in a lossless and end-to-end manner via continuous relaxation of the steepness of the Sigmoid functions. Extensive experiments on image classification and object detection tasks show that our quantization networks outperform state-of-the-art methods. We believe that the proposed method will shed new light on the interpretation of neural network quantization.
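
A minimal sketch of the core idea described above, assuming a simple PyTorch module with illustrative level counts and initial biases (not the authors' released code): the quantization function is a sum of shifted, scaled sigmoids whose steepness is controlled by a temperature, so it stays differentiable during training and approaches a hard staircase as the temperature grows.

```python
import torch
import torch.nn as nn

class SigmoidQuantizer(nn.Module):
    """Illustrative sketch, not the paper's code: a quantization function built
    as a linear combination of sigmoids with learnable biases and scales."""
    def __init__(self, num_levels=4):
        super().__init__()
        # one soft "step" per transition between adjacent quantization levels
        self.scales = nn.Parameter(torch.ones(num_levels - 1))
        self.biases = nn.Parameter(torch.linspace(-1.0, 1.0, num_levels - 1))

    def forward(self, x, temperature=1.0):
        # sum of scaled, shifted sigmoids; differentiable for any finite temperature,
        # approaching a hard staircase as the temperature increases
        steps = torch.sigmoid(temperature * (x.unsqueeze(-1) - self.biases))
        return (self.scales * steps).sum(dim=-1)

x = torch.randn(8)
q = SigmoidQuantizer()(x, temperature=10.0)  # soft-quantized values of x
```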

224 citations

Journal ArticleDOI
TL;DR: This work produces state-of-the-art quality image and video synthesis, and is, to our knowledge, the first able to generate a dynamically textured avatar with a mouth interior, all from a single image.
Abstract: With the rising interest in personalized VR and gaming experiences comes the need to create high quality 3D avatars that are both low-cost and variegated. Due to this, building dynamic avatars from a single unconstrained input image is becoming a popular application. While previous techniques that attempt this require multiple input images or rely on transferring dynamic facial appearance from a source actor, we are able to do so using only one 2D input image without any form of transfer from a source image. We achieve this using a new conditional Generative Adversarial Network design that allows fine-scale manipulation of any facial input image into a new expression while preserving its identity. Our photoreal avatar GAN (paGAN) can also synthesize the unseen mouth interior and control the eye-gaze direction of the output, as well as produce the final image from a novel viewpoint. The method is even capable of generating fully-controllable temporally stable video sequences, despite not using temporal information during training. After training, we can use our network to produce dynamic image-based avatars that are controllable on mobile devices in real time. To do this, we compute a fixed set of output images that correspond to key blendshapes, from which we extract textures in UV space. Using a subject's expression blendshapes at run-time, we can linearly blend these key textures together to achieve the desired appearance. Furthermore, we can use the mouth interior and eye textures produced by our network to synthesize on-the-fly avatar animations for those regions. Our work produces state-of-the-art quality image and video synthesis, and is the first to our knowledge that is able to generate a dynamically textured avatar with a mouth interior, all from a single image.
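
The runtime blending step described above can be sketched roughly as follows, assuming NumPy, hypothetical texture shapes, and a simple weight normalization that the paper does not necessarily use:

```python
import numpy as np

def blend_avatar_texture(key_textures, blend_weights):
    """Sketch of the runtime step: linearly blend K precomputed key textures
    (K x H x W x 3, UV space) by the subject's current blendshape weights.
    The normalization below is an illustrative choice, not from the paper."""
    w = np.asarray(blend_weights, dtype=np.float32)
    w = w / max(w.sum(), 1e-8)
    return np.tensordot(w, key_textures, axes=1)  # (K,) . (K,H,W,3) -> (H,W,3)

key_textures = np.random.rand(4, 256, 256, 3).astype(np.float32)  # hypothetical data
frame_texture = blend_avatar_texture(key_textures, [0.1, 0.6, 0.2, 0.1])
```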

184 citations

Book ChapterDOI
08 Sep 2018
TL;DR: This work focuses on the task of template-free, per-frame 3D surface reconstruction from as few as three RGB sensors, for which conventional visual hull or multi-view stereo methods fail to generate plausible results.
Abstract: We present a deep learning based volumetric approach for performance capture using a passive and highly sparse multi-view capture system. State-of-the-art performance capture systems require either pre-scanned actors, a large number of cameras, or active sensors. In this work, we focus on the task of template-free, per-frame 3D surface reconstruction from as few as three RGB sensors, for which conventional visual hull or multi-view stereo methods fail to generate plausible results. We introduce a novel multi-view Convolutional Neural Network (CNN) that maps 2D images to a 3D volumetric field, and we use this field to encode the probabilistic distribution of surface points of the captured subject. By querying the resulting field, we can instantiate the clothed human body at arbitrary resolutions. Our approach scales to different numbers of input images, yielding increased reconstruction quality when more views are used. Although only trained on synthetic data, our network can generalize to handle real footage from body performance capture. Our method is suitable for high-quality, low-cost full-body volumetric capture solutions, which are gaining popularity for VR and AR content creation. Experimental results demonstrate that our method is significantly more robust and accurate than existing techniques when only very sparse views are available.
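
To illustrate the "query the field at arbitrary resolution" step, here is a rough sketch assuming a placeholder predict_prob callable standing in for the multi-view CNN; the resulting dense probability grid could then be meshed with a standard iso-surface extractor:

```python
import numpy as np

def query_occupancy_grid(predict_prob, resolution=128, bounds=1.0):
    """Evaluate a learned surface-probability field on a dense grid so the clothed
    body can be instantiated at any chosen resolution. predict_prob is a
    placeholder mapping (M, 3) query points to (M,) probabilities."""
    axis = np.linspace(-bounds, bounds, resolution, dtype=np.float32)
    xs, ys, zs = np.meshgrid(axis, axis, axis, indexing="ij")
    points = np.stack([xs, ys, zs], axis=-1).reshape(-1, 3)
    probs = predict_prob(points)
    return probs.reshape(resolution, resolution, resolution)

# Dummy field (a sphere of radius 0.5) just to exercise the query path:
grid = query_occupancy_grid(lambda p: (np.linalg.norm(p, axis=1) < 0.5).astype(np.float32), 64)
# An iso-surface at 0.5 could then be extracted, e.g. with skimage.measure.marching_cubes.
```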

127 citations

Proceedings ArticleDOI
01 Jun 2018
TL;DR: This work proposes to encode fine details in high-resolution displacement maps, which are learned through a hybrid network combining a state-of-the-art image-to-image translation network and a super-resolution network, enabling the full range of facial detail to be modeled.
Abstract: We present a learning-based approach for synthesizing facial geometry at medium and fine scales from diffusely-lit facial texture maps. When applied to an image sequence, the synthesized detail is temporally coherent. Unlike current state-of-the-art methods [17, 5], which assume "dark is deep", our model is trained with measured facial detail collected using polarized gradient illumination in a Light Stage [20]. This enables us to produce plausible facial detail across the entire face, including where previous approaches may incorrectly interpret dark features as concavities, such as at moles, hair stubble, and occluded pores. Instead of directly inferring 3D geometry, we propose to encode fine details in high-resolution displacement maps, which are learned through a hybrid network adopting the state-of-the-art image-to-image translation network [29] and super-resolution network [43]. To effectively capture geometric detail at both mid and high frequencies, we factorize the learning into two separate sub-networks, enabling the full range of facial detail to be modeled. Results from our learning-based approach compare favorably with a high-quality active facial scanning technique, and require only a single passive lighting condition without a complex scanning setup.
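
A loose sketch of the two-branch factorization described above, with hypothetical mid_net and high_net standing in for the translation and super-resolution sub-networks; the way the branches are actually combined follows the paper, not this sketch, which only illustrates the mid/high frequency split:

```python
import torch
import torch.nn.functional as F

def synthesize_displacement(texture, mid_net, high_net):
    """texture: (N, C, H, W) diffusely-lit texture map. mid_net and high_net are
    hypothetical stand-ins for the two sub-networks: the mid branch predicts
    coarse displacement at reduced resolution, the high branch adds fine detail
    (pores, stubble) at full resolution, and the two are summed."""
    low = F.interpolate(texture, scale_factor=0.25, mode="bilinear", align_corners=False)
    mid = F.interpolate(mid_net(low), size=texture.shape[-2:], mode="bilinear", align_corners=False)
    high = high_net(texture)
    return mid + high
```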

82 citations

Journal ArticleDOI
26 Oct 2015
TL;DR: The key idea is to extend the local similarity method in [Xing et al. 2014], which handles only low-level spatial repetitions such as hatches within a single frame, to a global similarity that can capture high-level structures across multiple frames such as dynamic objects.
Abstract: Hand-drawn animation is a major art form and communication medium, but can be challenging to produce. We present a system to help people create frame-by-frame animations through manual sketches. We design our interface to be minimalistic: it contains only a canvas and a few controls. When users draw on the canvas, our system silently analyzes all past sketches and predicts what might be drawn in the future across spatial locations and temporal frames. The interface also offers suggestions to beautify existing drawings. Our system can reduce manual workload and improve output quality without compromising natural drawing flow and control: users can accept, ignore, or modify such predictions visualized on the canvas by simple gestures. Our key idea is to extend the local similarity method in [Xing et al. 2014], which handles only low-level spatial repetitions such as hatches within a single frame, to a global similarity that can capture high-level structures across multiple frames such as dynamic objects. We evaluate our system through a preliminary user study and confirm that it can enhance both users' objective performance and subjective satisfaction.
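
As a very loose illustration of the prediction idea, and not the paper's actual similarity algorithm, one could imagine matching the layout of the last few strokes against the earlier stroke history and proposing the stroke that followed the best match:

```python
import numpy as np

def suggest_next_stroke(history, context_len=3):
    """Illustrative sketch only: treat the centroids of the last few strokes as a
    context, find the most similar earlier context, and suggest the stroke that
    followed it, translated to the current position. Each stroke is an (N, 2) array."""
    if len(history) < context_len + 2:
        return None
    centroids = np.array([s.mean(axis=0) for s in history])
    context = centroids[-context_len:] - centroids[-1]   # recent layout, position-invariant
    best, best_dist = None, np.inf
    for i in range(context_len - 1, len(history) - 1):
        past = centroids[i - context_len + 1:i + 1] - centroids[i]
        dist = np.linalg.norm(past - context)
        if dist < best_dist:
            best, best_dist = i, dist
    # propose the stroke that followed the best-matching context, shifted to "now"
    return history[best + 1] + (centroids[-1] - centroids[best])
```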

66 citations


Cited by

Book
01 Dec 1988
TL;DR: In this paper, the spectral energy distribution of the reflected light from an object made of a specific real material is obtained and a procedure for accurately reproducing the color associated with the spectrum is discussed.
Abstract: This paper presents a new reflectance model for rendering computer synthesized images. The model accounts for the relative brightness of different materials and light sources in the same scene. It describes the directional distribution of the reflected light and a color shift that occurs as the reflectance changes with incidence angle. The paper presents a method for obtaining the spectral energy distribution of the light reflected from an object made of a specific real material and discusses a procedure for accurately reproducing the color associated with the spectral energy distribution. The model is applied to the simulation of a metal and a plastic.
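
For reference, the specular term commonly associated with this reflectance model combines a microfacet distribution D, a Fresnel term F, and a geometric attenuation factor G as D·F·G / (4 (N·L)(N·V)); the sketch below uses standard stand-ins (Beckmann D, Schlick F) rather than a transcription of the paper:

```python
import numpy as np

def cook_torrance_specular(n, l, v, roughness, f0):
    """Sketch of the commonly cited specular term; n, l, v are unit vectors,
    f0 is the normal-incidence reflectance, roughness the Beckmann parameter."""
    h = (l + v) / np.linalg.norm(l + v)                                # half vector
    nl, nv, nh, vh = np.dot(n, l), np.dot(n, v), np.dot(n, h), np.dot(v, h)
    m2 = roughness ** 2
    d = np.exp((nh ** 2 - 1.0) / (m2 * nh ** 2)) / (np.pi * m2 * nh ** 4)  # Beckmann distribution
    f = f0 + (1.0 - f0) * (1.0 - vh) ** 5                              # Schlick Fresnel approximation
    g = min(1.0, 2.0 * nh * nv / vh, 2.0 * nh * nl / vh)               # geometric attenuation
    return d * f * g / max(4.0 * nl * nv, 1e-8)

n = np.array([0.0, 0.0, 1.0])
l = np.array([0.0, 0.6, 0.8])     # unit light direction
v = np.array([0.0, -0.6, 0.8])    # unit view direction
spec = cook_torrance_specular(n, l, v, roughness=0.3, f0=0.9)
```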

1,401 citations

Proceedings ArticleDOI
15 Jun 2019
TL;DR: In this paper, an implicit field is used to assign a value to each point in 3D space, so that a shape can be extracted as an iso-surface, and a binary classifier is trained to perform this assignment.
Abstract: We advocate the use of implicit fields for learning generative models of shapes and introduce an implicit field decoder, called IM-NET, for shape generation, aimed at improving the visual quality of the generated shapes. An implicit field assigns a value to each point in 3D space, so that a shape can be extracted as an iso-surface. IM-NET is trained to perform this assignment by means of a binary classifier. Specifically, it takes a point coordinate, along with a feature vector encoding a shape, and outputs a value which indicates whether the point is outside the shape or not. By replacing conventional decoders by our implicit decoder for representation learning (via IM-AE) and shape generation (via IM-GAN), we demonstrate superior results for tasks such as generative shape modeling, interpolation, and single-view 3D reconstruction, particularly in terms of visual quality. Code and supplementary material are available at https://github.com/czq142857/implicit-decoder.
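
A minimal sketch of an implicit-field decoder in this spirit, with illustrative layer sizes (see the linked repository for the authors' implementation): a point coordinate is concatenated with a shape feature vector, and a small MLP classifies whether the point lies inside the shape.

```python
import torch
import torch.nn as nn

class ImplicitDecoder(nn.Module):
    """Sketch of an implicit-field decoder: concatenate a 3D point with a shape
    feature vector and predict an inside/outside probability for that point."""
    def __init__(self, feature_dim=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + feature_dim, hidden), nn.LeakyReLU(),
            nn.Linear(hidden, hidden), nn.LeakyReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),   # in/out probability
        )

    def forward(self, points, feature):
        # points: (N, 3); feature: (feature_dim,) broadcast to every query point
        f = feature.unsqueeze(0).expand(points.shape[0], -1)
        return self.net(torch.cat([points, f], dim=-1)).squeeze(-1)

decoder = ImplicitDecoder()
occ = decoder(torch.rand(1024, 3) * 2 - 1, torch.randn(128))
# A mesh could then be extracted from the predicted field as the 0.5 iso-surface.
```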

1,261 citations

Journal ArticleDOI
TL;DR: This paper presents a comprehensive review of recent progress in deep learning methods for point clouds, covering three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation.
Abstract: Point cloud learning has lately attracted increasing attention due to its wide applications in many areas, such as computer vision, autonomous driving, and robotics. As a dominating technique in AI, deep learning has been successfully used to solve various 2D vision problems. However, deep learning on point clouds is still in its infancy due to the unique challenges faced by the processing of point clouds with deep neural networks. Recently, deep learning on point clouds has become increasingly thriving, with numerous methods being proposed to address different problems in this area. To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds. It covers three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation. It also presents comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions.

1,021 citations