
Showing papers by "William A. P. Smith published in 2021"


Journal ArticleDOI
TL;DR: The most complete 3DMM of the human head to date is presented that includes face, cranium, ears, eyes, teeth and tongue and is used to reconstruct full head representations from single, unconstrained images allowing us to parameterize craniofacial shape and texture.
Abstract: Three-dimensional morphable models (3DMMs) are powerful statistical tools for representing the 3D shapes and textures of an object class. Here we present the most complete 3DMM of the human head to date that includes face, cranium, ears, eyes, teeth and tongue. To achieve this, we propose two methods for combining existing 3DMMs of different overlapping head parts: (i) use a regressor to complete missing parts of one model using the other, and (ii) use the Gaussian process framework to blend covariance matrices from multiple models. Thus, we build a new combined face-and-head shape model that blends the variability and facial detail of an existing face model (the LSFM) with the full head modelling capability of an existing head model (the LYHM). Then we construct and fuse a highly-detailed ear model to extend the variation of the ear shape. Eye and eye region models are incorporated into the head model, along with basic models of the teeth, tongue and inner mouth cavity. The new model achieves state-of-the-art performance. We use our model to reconstruct full head representations from single, unconstrained images, allowing us to parameterize craniofacial shape and texture, along with the ear shape, eye gaze and eye color.
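
As a rough illustration of method (i), completing one model's missing part from another via a regressor can be sketched as an ordinary least-squares fit between paired training shapes. This is a minimal sketch under assumed names and array shapes, not the paper's implementation.

```python
import numpy as np

# Minimal sketch of regressor-based model completion (method (i) above):
# learn a linear map from face-region vertices to full-head vertices from
# paired training shapes, then complete a new face-only shape with it.
# All names and array shapes here are illustrative assumptions.
def fit_completion_regressor(face_shapes, head_shapes):
    # face_shapes: (k, 3*n_face), head_shapes: (k, 3*n_head), k paired samples
    W, *_ = np.linalg.lstsq(face_shapes, head_shapes, rcond=None)
    return W  # (3*n_face, 3*n_head) linear regressor

def complete_head(face_shape, W):
    # Predict full-head vertices from a flattened face-only shape vector.
    return face_shape @ W
```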

73 citations


Journal ArticleDOI
TL;DR: In this article, a measurement procedure for marine-terminating glaciers using structure-from-motion is proposed, covering proper survey planning, control point design and model alignment. The procedure is effective for documenting small-scale glacier dynamics and shows the importance of appropriate alignment strategies for models with poorly distributed control points. Two SfM tools (Agisoft Metashape and Bentley ContextCapture) are compared, concluding that ContextCapture offers around 17% lower error, 25% faster processing and better reconstruction of fine details and shadowed concavities.
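
For context on the model-alignment step, a standard way to register a photogrammetric model to surveyed ground control points is a similarity (Procrustes/Umeyama) transform. The sketch below is a generic illustration under that assumption, not the paper's exact workflow; all function names are hypothetical.

```python
import numpy as np

# Generic similarity alignment (Umeyama) of model points to surveyed
# control points; an illustrative stand-in for the alignment step above.
def similarity_align(src, dst):
    # src, dst: (n, 3) corresponding model and control points
    mu_s, mu_d = src.mean(0), dst.mean(0)
    X, Y = src - mu_s, dst - mu_d
    U, S, Vt = np.linalg.svd(Y.T @ X / len(src))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # reflection guard
    R = U @ D @ Vt                                  # optimal rotation
    s = np.trace(np.diag(S) @ D) / X.var(0).sum()   # optimal scale
    t = mu_d - s * R @ mu_s                         # optimal translation
    return s, R, t

# Residuals at independent check points yield error figures of the kind
# compared between the two tools above.
def rmse(src, dst, s, R, t):
    return np.sqrt(((dst - (s * src @ R.T + t)) ** 2).sum(1).mean())
```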

12 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed several alternative photo-polarimetric constraints that depend upon the partial derivatives of the surface and showed how to express them in a unified system of partial differential equations.
Abstract: In this paper we present methods for estimating shape from polarisation and shading information, i.e. photo-polarimetric shape estimation, under varying, but unknown, illumination, i.e. in an uncalibrated scenario. We propose several alternative photo-polarimetric constraints that depend upon the partial derivatives of the surface and show how to express them in a unified system of partial differential equations of which previous work is a special case. By careful combination and manipulation of the constraints, we show how to eliminate non-linearities such that a discrete version of the problem can be solved using linear least squares. We derive a minimal, combinatorial approach for two source illumination estimation which we use with RANSAC for robust light direction and intensity estimation. We also introduce a new method for estimating a polarisation image from multichannel data and provide methods for estimating albedo and refractive index. We evaluate lighting, shape, albedo and refractive index estimation methods on both synthetic and real-world data showing improvements over existing state-of-the-art.
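
To make the minimal-sample RANSAC idea concrete, here is a generic single-source simplification: a Lambertian light vector fits three (normal, intensity) pairs exactly, so 3-point samples can seed RANSAC. The paper's two-source, photo-polarimetric formulation is richer; the names and tolerances below are assumptions.

```python
import numpy as np

# Generic minimal-sample RANSAC for one Lambertian light source, I = n . s,
# illustrating the robust estimation strategy described above.
def ransac_light(normals, intensities, iters=500, tol=0.02, seed=0):
    # normals: (n, 3) unit surface normals, intensities: (n,) observed shading
    rng = np.random.default_rng(seed)
    best_s, best_count = None, 0
    for _ in range(iters):
        idx = rng.choice(len(normals), 3, replace=False)  # minimal sample
        try:
            s = np.linalg.solve(normals[idx], intensities[idx])
        except np.linalg.LinAlgError:
            continue  # degenerate sample, resample
        count = (np.abs(normals @ s - intensities) < tol).sum()
        if count > best_count:
            best_count, best_s = count, s
    if best_s is None:
        raise ValueError("no non-degenerate sample found")
    inliers = np.abs(normals @ best_s - intensities) < tol
    # Refine on the consensus set with linear least squares.
    s, *_ = np.linalg.lstsq(normals[inliers], intensities[inliers], rcond=None)
    return s / np.linalg.norm(s), np.linalg.norm(s)  # direction, intensity
```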

5 citations


Proceedings ArticleDOI
01 Jan 2021
TL;DR: In this article, shape, restricted to a single object class via a 3D morphable model, is estimated solely from a semantic segmentation of a single 2D image, using a novel loss based on a probabilistic, vertex-wise projection of the 3D model to the image plane.
Abstract: In this paper, we show how to estimate shape (restricted to a single object class via a 3D morphable model) using solely a semantic segmentation of a single 2D image. We propose a novel loss function based on a probabilistic, vertex-wise projection of the 3D model to the image plane. We represent both these projections and pixel labels as mixtures of Gaussians and compute the discrepancy between the two based on the geometric Rényi divergence. The resulting loss is differentiable and has a wide basin of convergence. We propose both classical, direct optimisation of this loss ("analysis-by-synthesis") and its use for training a parameter regression CNN. We show significant advantages over existing segmentation losses used in the state-of-the-art differentiable renderers Soft Rasterizer and Neural Mesh Renderer.
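
Mixture-of-Gaussians discrepancies have convenient closed forms for isotropic components, since the product of two Gaussians integrates to a Gaussian evaluation. The sketch below computes a Cauchy-Schwarz-style divergence between two 2D point sets treated as equal-weight isotropic mixtures; it is an illustration in the spirit of the geometric Rényi divergence used here, with all names assumed. A larger sigma smooths the loss surface, which is one route to the wide basin of convergence mentioned above; training would port the same expression to an autodiff framework.

```python
import numpy as np

# Overlap integral of two equal-weight isotropic Gaussian mixtures in 2D:
# integral of N(x; a, s^2 I) N(x; b, s^2 I) dx = N(a; b, 2 s^2 I).
def gm_overlap(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    return np.exp(-d2 / (4 * sigma**2)).mean() / (4 * np.pi * sigma**2)

# Cauchy-Schwarz-style divergence between mixtures with centres P and Q
# (e.g. projected vertices vs. labelled pixels); zero iff the mixtures coincide.
def mixture_divergence(P, Q, sigma=2.0):
    return -np.log(gm_overlap(P, Q, sigma)) + 0.5 * (
        np.log(gm_overlap(P, P, sigma)) + np.log(gm_overlap(Q, Q, sigma))
    )
```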

1 citation


Posted Content
TL;DR: In this article, a self-supervised approach for outdoor scene relighting is proposed, which is trained only on corpora of images collected from the internet without any user supervision.
Abstract: Outdoor scene relighting is a challenging problem that requires a good understanding of the scene geometry, illumination and albedo. Current techniques are completely supervised, requiring high quality synthetic renderings to train a solution. Such renderings are synthesized using priors learned from limited data. In contrast, we propose a self-supervised approach for relighting. Our approach is trained only on corpora of images collected from the internet without any user supervision. This virtually endless source of training data allows training a general relighting solution. Our approach first decomposes an image into its albedo, geometry and illumination. A novel relighting is then produced by modifying the illumination parameters. Our solution captures shadows using a dedicated shadow prediction map, and does not rely on accurate geometry estimation. We evaluate our technique subjectively and objectively using a new dataset with ground-truth relighting. Results show the ability of our technique to produce photo-realistic and physically plausible results that generalize to unseen scenes.
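
A minimal recomposition along the lines described (albedo times shading, modulated by a predicted shadow map) can be sketched as follows, assuming a second-order spherical harmonic illumination model. The function names and the 9-coefficient grey-lighting parameterisation are assumptions, not the paper's code.

```python
import numpy as np

# Re-render a decomposed scene under new lighting:
# image ~ albedo * SH shading(normals, coeffs) * shadow.
def sh_basis(n):
    # First 9 real SH basis functions at unit normals n: (..., 3) -> (..., 9)
    x, y, z = n[..., 0], n[..., 1], n[..., 2]
    return np.stack([
        0.2821 * np.ones_like(x),
        0.4886 * y, 0.4886 * z, 0.4886 * x,
        1.0925 * x * y, 1.0925 * y * z,
        0.3154 * (3 * z**2 - 1),
        1.0925 * x * z, 0.5463 * (x**2 - y**2),
    ], axis=-1)

def relight(albedo, normals, coeffs, shadow=None):
    # albedo: (h, w, 3), normals: (h, w, 3), coeffs: (9,), shadow: (h, w) in [0, 1]
    shading = np.clip(sh_basis(normals) @ coeffs, 0.0, None)  # non-negative shading
    out = albedo * shading[..., None]
    return out * shadow[..., None] if shadow is not None else out
```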

Posted Content
TL;DR: In this article, a fully convolutional neural network is trained using large uncontrolled multiview and timelapse image collections without ground truth to recover shape, reflectance and lighting from a single, uncontrolled image.
Abstract: In this paper we show how to perform scene-level inverse rendering to recover shape, reflectance and lighting from a single, uncontrolled image using a fully convolutional neural network. The network takes an RGB image as input, regresses albedo, shadow and normal maps from which we infer least squares optimal spherical harmonic lighting coefficients. Our network is trained using large uncontrolled multiview and timelapse image collections without ground truth. By incorporating a differentiable renderer, our network can learn from self-supervision. Since the problem is ill-posed we introduce additional supervision. Our key insight is to perform offline multiview stereo (MVS) on images containing rich illumination variation. From the MVS pose and depth maps, we can cross project between overlapping views such that Siamese training can be used to ensure consistent estimation of photometric invariants. MVS depth also provides direct coarse supervision for normal map estimation. We believe this is the first attempt to use MVS supervision for learning inverse rendering. In addition, we learn a statistical natural illumination prior. We evaluate performance on inverse rendering, normal map estimation and intrinsic image decomposition benchmarks.
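
The "least squares optimal spherical harmonic lighting coefficients" step admits a closed-form sketch: divide out the predicted albedo to get shading, then regress it onto the SH basis of the predicted normals. This reuses the sh_basis helper from the previous sketch and assumes a single grey channel; it is illustrative, not the network's actual code.

```python
import numpy as np

# Infer least-squares optimal 9-dim SH lighting from predicted albedo and
# normal maps (sh_basis as in the previous sketch); single grey channel assumed.
def infer_sh_lighting(image, albedo, normals, eps=1e-4):
    # image, albedo: (h, w); normals: (h, w, 3) unit normals
    shading = image / np.clip(albedo, eps, None)  # per-pixel shading estimate
    B = sh_basis(normals).reshape(-1, 9)          # (h*w, 9) SH design matrix
    coeffs, *_ = np.linalg.lstsq(B, shading.reshape(-1), rcond=None)
    return coeffs
```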

Journal ArticleDOI
TL;DR: In this article, an algorithm for rotational motion in molecular dynamics simulations is described, which requires neither quaternions nor Euler angles and works by updating the local Cartesian axes of the r...
Abstract: We describe an algorithm for rotational motion in molecular dynamics simulations. The algorithm requires neither quaternions nor Euler angles and works by updating the local Cartesian axes of the r...
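
The abstract is cut off above, but a common quaternion-free scheme updates a body's local Cartesian axes by applying the Rodrigues rotation formula to the axis vectors each timestep. The sketch below illustrates that generic scheme under stated assumptions, not necessarily the paper's exact algorithm.

```python
import numpy as np

# Generic quaternion-free rotational update: rotate a body's local axes
# (rows of an orthonormal 3x3 matrix) by angular velocity omega over dt,
# via the Rodrigues formula. Illustrative only.
def rotate_axes(axes, omega, dt):
    theta = np.linalg.norm(omega) * dt
    if theta < 1e-12:
        return axes  # negligible rotation this step
    k = omega / np.linalg.norm(omega)  # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])  # cross-product matrix
    R = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
    return axes @ R.T  # each axis row rotated; orthonormality preserved
```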