Home
/
Authors
/
Pablo Garrido

Author

Pablo Garrido

Other affiliations: Federico Santa María Technical University, Universidad Francisco de Vitoria, University of Concepción ...read more

Bio: Pablo Garrido is an academic researcher from Max Planck Society. The author has contributed to research in topics: Face (geometry) & Heuristics. The author has an hindex of 18, co-authored 30 publications receiving 2521 citations. Previous affiliations of Pablo Garrido include Federico Santa María Technical University & Universidad Francisco de Vitoria.

Topics: Face (geometry), Heuristics, Psychology, Autoencoder, 3D reconstruction ...read more

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Deep video portraits

[...]

Hyeongwoo Kim¹, Pablo Garrido, Ayush Tewari¹, Weipeng Xu¹, Justus Thies², Matthias Niessner², Patrick Pérez, Christian Richardt³, Michael Zollhöfer⁴, Christian Theobalt¹ - Show less +6 more•Institutions (4)

Max Planck Society¹, Technische Universität München², University of Bath³, Stanford University⁴

30 Jul 2018

TL;DR: In this paper, a generative neural network with a novel space-time architecture is proposed to transfer the full 3D head position, head rotation, face expression, eye gaze, and eye blinking from a source actor to a portrait video of a target actor.

...read moreread less

Abstract: We present a novel approach that enables photo-realistic re-animation of portrait videos using only an input video. In contrast to existing approaches that are restricted to manipulations of facial expressions only, we are the first to transfer the full 3D head position, head rotation, face expression, eye gaze, and eye blinking from a source actor to a portrait video of a target actor. The core of our approach is a generative neural network with a novel space-time architecture. The network takes as input synthetic renderings of a parametric face model, based on which it predicts photo-realistic video frames for a given target actor. The realism in this rendering-to-video transfer is achieved by careful adversarial training, and as a result, we can create modified target videos that mimic the behavior of the synthetically-created input. In order to enable source-to-target video re-animation, we render a synthetic target video with the reconstructed head animation parameters from a source video, and feed it into the trained network - thus taking full control of the target. With the ability to freely recombine source and target parameters, we are able to demonstrate a large variety of video rewrite applications without explicitly modeling hair, body or background. For instance, we can reenact the full head using interactive user-controlled editing, and realize high-fidelity visual dubbing. To demonstrate the high quality of our output, we conduct an extensive series of experiments and evaluations, where for instance a user study shows that our video edits are hard to detect.

...read moreread less

611 citations

Posted Content•

MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

[...]

Ayush Tewari¹, Michael Zollhöfer¹, Hyeongwoo Kim¹, Pablo Garrido¹, Florian Bernard¹, Patrick Pérez², Christian Theobalt¹ - Show less +3 more•Institutions (2)

Max Planck Society¹, Valeo²

30 Mar 2017-arXiv: Computer Vision and Pattern Recognition

TL;DR: A novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image and can be trained end-to-end in an unsupervised manner, which renders training on very large real world data feasible.

...read moreread less

Abstract: In this work we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image. To this end, we combine a convolutional encoder network with an expert-designed generative model that serves as decoder. The core innovation is our new differentiable parametric decoder that encapsulates image formation analytically based on a generative model. Our decoder takes as input a code vector with exactly defined semantic meaning that encodes detailed face pose, shape, expression, skin reflectance and scene illumination. Due to this new way of combining CNN-based with model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image. For the first time, a CNN encoder and an expert-designed generative model can be trained end-to-end in an unsupervised manner, which renders training on very large (unlabeled) real world data feasible. The obtained reconstructions compare favorably to current state-of-the-art approaches in terms of quality and richness of representation.

...read moreread less

355 citations

Proceedings Article•DOI•

MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

[...]

Ayush Tewari¹, Michael Zollhöfer¹, Hyeongwoo Kim¹, Pablo Garrido¹, Florian Bernard¹, Patrick Pérez², Christian Theobalt¹ - Show less +3 more•Institutions (2)

Max Planck Society¹, Valeo²

01 Oct 2017

...read moreread less

Abstract: In this work we propose a novel model-based deep convolutional autoencoder that addresses the highly challenging problem of reconstructing a 3D human face from a single in-the-wild color image. To this end, we combine a convolutional encoder network with an expert-designed generative model that serves as decoder. The core innovation is the differentiable parametric decoder that encapsulates image formation analytically based on a generative model. Our decoder takes as input a code vector with exactly defined semantic meaning that encodes detailed face pose, shape, expression, skin reflectance and scene illumination. Due to this new way of combining CNN-based with model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image. For the first time, a CNN encoder and an expert-designed generative model can be trained end-to-end in an unsupervised manner, which renders training on very large (unlabeled) real world data feasible. The obtained reconstructions compare favorably to current state-of-the-art approaches in terms of quality and richness of representation.

...read moreread less

316 citations

Proceedings Article•DOI•

Self-Supervised Multi-level Face Model Learning for Monocular Reconstruction at Over 250 Hz

[...]

Ayush Tewari¹, Michael Zollhöfer¹, Pablo Garrido¹, Florian Bernard¹, Hyeongwoo Kim¹, Patrick Pérez, Christian Theobalt¹ - Show less +3 more•Institutions (1)

Max Planck Society¹

18 Jun 2018

TL;DR: This first approach that jointly learns a regressor for face shape, expression, reflectance and illumination on the basis of a concurrently learned parametric face model is presented, which compares favorably to the state-of-the-art in terms of reconstruction quality, better generalizes to real world faces, and runs at over 250 Hz.

...read moreread less

Abstract: The reconstruction of dense 3D models of face geometry and appearance from a single image is highly challenging and ill-posed. To constrain the problem, many approaches rely on strong priors, such as parametric face models learned from limited 3D scan data. However, prior models restrict generalization of the true diversity in facial geometry, skin reflectance and illumination. To alleviate this problem, we present the first approach that jointly learns 1) a regressor for face shape, expression, reflectance and illumination on the basis of 2) a concurrently learned parametric face model. Our multi-level face model combines the advantage of 3D Morphable Models for regularization with the out-of-space generalization of a learned corrective space. We train end-to-end on in-the-wild images without dense annotations by fusing a convolutional encoder with a differentiable expert-designed renderer and a self-supervised training loss, both defined at multiple detail levels. Our approach compares favorably to the state-of-the-art in terms of reconstruction quality, better generalizes to real world faces, and runs at over 250 Hz.

...read moreread less

275 citations

Journal Article•DOI•

Reconstruction of Personalized 3D Face Rigs from Monocular Video

[...]

Pablo Garrido¹, Michael Zollhöfer¹, Dan Casas¹, Levi Valgaerts¹, Kiran Varanasi, Patrick Pérez, Christian Theobalt¹ - Show less +3 more•Institutions (1)

Max Planck Society¹

18 May 2016-ACM Transactions on Graphics

TL;DR: A novel approach for the automatic creation of a personalized high-quality 3D face rig of an actor from just monocular video data, based on three distinct layers that allow the actor's facial shape as well as capture his person-specific expression characteristics at high fidelity, ranging from coarse-scale geometry to fine-scale static and transient detail on the scale of folds and wrinkles.

...read moreread less

Abstract: We present a novel approach for the automatic creation of a personalized high-quality 3D face rig of an actor from just monocular video data (e.g., vintage movies). Our rig is based on three distinct layers that allow us to model the actor’s facial shape as well as capture his person-specific expression characteristics at high fidelity, ranging from coarse-scale geometry to fine-scale static and transient detail on the scale of folds and wrinkles. At the heart of our approach is a parametric shape prior that encodes the plausible subspace of facial identity and expression variations. Based on this prior, a coarse-scale reconstruction is obtained by means of a novel variational fitting approach. We represent person-specific idiosyncrasies, which cannot be represented in the restricted shape and expression space, by learning a set of medium-scale corrective shapes. Fine-scale skin detail, such as wrinkles, are captured from video via shading-based refinement, and a generative detail formation model is learned. Both the medium- and fine-scale detail layers are coupled with the parametric prior by means of a novel sparse linear regression formulation. Once reconstructed, all layers of the face rig can be conveniently controlled by a low number of blendshape expression parameters, as widely used by animation artists. We show captured face rigs and their motions for several actors filmed in different monocular video formats, including legacy footage from YouTube, and demonstrate how they can be used for 3D animation and 2D video editing. Finally, we evaluate our approach qualitatively and quantitatively and compare to related state-of-the-art methods.

...read moreread less

267 citations

1
2
3
4
…
5
6
7
8

Collapse

Cited by

PDF

Open Access

More filters

Computer vision : a modern approach = 计算机视觉 : 一种现代的方法

[...]

David Forsyth, Jean Ponce

01 Jan 2004

TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.

...read moreread less

Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms broken down and illustrated in pseudo code. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

...read moreread less

3,627 citations

Proceedings Article•

A morphable model for the synthesis of 3D faces

[...]

Matthew Turk

01 Jan 1999

2,010 citations

Journal Article•DOI•

Biogenic amines: their importance in foods

[...]

M.H.Silla Santos¹•Institutions (1)

Spanish National Research Council¹

01 Apr 1996-International Journal of Food Microbiology

TL;DR: A better knowledge of the factors controlling the formation of amines is necessary in order to improve the quality and safety of food as discussed by the authors, which can be found in both raw and processed foods.

...read moreread less

1,283 citations

Journal Article•DOI•

Hyper-heuristics: a survey of the state of the art

[...]

Edmund K. Burke¹, Michel Gendreau², Matthew Hyde³, Graham Kendall⁴, Gabriela Ochoa¹, Ender Özcan⁴, Rong Qu⁴ - Show less +3 more•Institutions (4)

University of Stirling¹, Université de Montréal², University of East Anglia³, University of Nottingham⁴

10 Jul 2013-Journal of the Operational Research Society

TL;DR: A critical discussion of the scientific literature on hyper-heuristics including their origin and intellectual roots, a detailed account of the main types of approaches, and an overview of some related areas are presented.

...read moreread less

Abstract: Hyper-heuristics comprise a set of approaches that are motivated (at least in part) by the goal of automating the design of heuristic methods to solve hard computational search problems. An underlying strategic research challenge is to develop more generally applicable search methodologies. The term hyper-heuristic is relatively new; it was first used in 2000 to describe heuristics to choose heuristics in the context of combinatorial optimisation. However, the idea of automating the design of heuristics is not new; it can be traced back to the 1960s. The definition of hyper-heuristics has been recently extended to refer to a search method or learning mechanism for selecting or generating heuristics to solve computational search problems. Two main hyper-heuristic categories can be considered: heuristic selection and heuristic generation. The distinguishing feature of hyper-heuristics is that they operate on a search space of heuristics (or heuristic components) rather than directly on the search space of solutions to the underlying problem that is being addressed. This paper presents a critical discussion of the scientific literature on hyper-heuristics including their origin and intellectual roots, a detailed account of the main types of approaches, and an overview of some related areas. Current research trends and directions for future research are also discussed.

...read moreread less

1,023 citations

Proceedings Article•DOI•

Face2Face: Real-Time Face Capture and Reenactment of RGB Videos

[...]

Justus Thies¹, Michael Zollhöfer², Marc Stamminger¹, Christian Theobalt², Matthias NieBner³ - Show less +1 more•Institutions (3)

University of Erlangen-Nuremberg¹, Max Planck Society², Stanford University³

27 Jun 2016

TL;DR: A novel approach for real-time facial reenactment of a monocular target video sequence (e.g., Youtube video) that addresses the under-constrained problem of facial identity recovery from monocular video by non-rigid model-based bundling and re-render the manipulated output video in a photo-realistic fashion.

...read moreread less

Abstract: We present a novel approach for real-time facial reenactment of a monocular target video sequence (e.g., Youtube video). The source sequence is also a monocular video stream, captured live with a commodity webcam. Our goal is to animate the facial expressions of the target video by a source actor and re-render the manipulated output video in a photo-realistic fashion. To this end, we first address the under-constrained problem of facial identity recovery from monocular video by non-rigid model-based bundling. At run time, we track facial expressions of both source and target video using a dense photometric consistency measure. Reenactment is then achieved by fast and efficient deformation transfer between source and target. The mouth interior that best matches the re-targeted expression is retrieved from the target sequence and warped to produce an accurate fit. Finally, we convincingly re-render the synthesized target face on top of the corresponding video stream such that it seamlessly blends with the real-world illumination. We demonstrate our method in a live setup, where Youtube videos are reenacted in real time.

...read moreread less

1,011 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse