Institution

Institute for Creative Technologies

About: The Institute for Creative Technologies is a University of Southern California research institute based in Los Angeles, California. It is known for research contributions in the topics of Virtual reality and Rendering (computer graphics). The organization has 201 authors who have published 347 publications receiving 11,861 citations.


Papers
Proceedings ArticleDOI
01 Jul 2017
TL;DR: This work proposes a multi-scale neural patch synthesis approach based on joint optimization of image content and texture constraints, which not only preserves contextual structures but also produces high-frequency details by matching and adapting patches with the most similar mid-layer feature correlations of a deep classification network.
Abstract: Recent advances in deep learning have shown exciting promise in filling large holes in natural images with semantically plausible and context-aware details, impacting fundamental image manipulation tasks such as object removal. While these learning-based methods are significantly more effective in capturing high-level features than prior techniques, they can only handle very low-resolution inputs due to memory limitations and difficulty in training. Even for slightly larger images, the inpainted regions appear blurry and unpleasant boundaries become visible. We propose a multi-scale neural patch synthesis approach based on joint optimization of image content and texture constraints, which not only preserves contextual structures but also produces high-frequency details by matching and adapting patches with the most similar mid-layer feature correlations of a deep classification network. We evaluate our method on the ImageNet and Paris Streetview datasets and achieve state-of-the-art inpainting accuracy. We show our approach produces sharper and more coherent results than prior methods, especially for high-resolution images.

780 citations
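As a rough illustration of the joint optimization the abstract describes, the sketch below (Python/PyTorch) minimizes a content term plus a texture term that matches mid-layer feature patches inside the hole to their nearest neighbors outside it. This is a minimal sketch, not the authors' code: the names vgg_mid, content_target, and patch_hole_mask are assumptions for illustration.

import torch
import torch.nn.functional as F

def texture_loss(feat, patch_hole_mask, patch=3):
    # feat: (1, C, H, W) mid-layer features of the image being optimized.
    # patch_hole_mask: boolean (L,) marking feature patches overlapping the hole (assumed given).
    patches = F.unfold(feat, patch).squeeze(0).t()            # (L, C*patch*patch)
    inside, outside = patches[patch_hole_mask], patches[~patch_hole_mask]
    # Nearest neighbor (cosine similarity) in the known region for each hole patch.
    sim = F.normalize(inside, dim=1) @ F.normalize(outside, dim=1).t()
    nn = sim.argmax(dim=1)
    return F.mse_loss(inside, outside[nn].detach())           # pull hole patches toward their matches

def joint_loss(x, content_target, patch_hole_mask, vgg_mid, w_tex=1e-2):
    # Content constraint: stay close to a coarse content-network prediction.
    # Texture constraint: hole features must be explainable by known-region patches.
    return F.mse_loss(x, content_target) + w_tex * texture_loss(vgg_mid(x), patch_hole_mask)

In the multi-scale scheme, an objective of this kind would be solved coarse-to-fine, with each scale's result upsampled to initialize the next.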

Journal ArticleDOI
TL;DR: FLAME (Faces Learned with an Articulated Model and Expressions) is low-dimensional but more expressive than the FaceWarehouse model and the Basel Face Model, and is compared to these models by fitting each to static 3D scans and 4D sequences using the same optimization method.
Abstract: The field of 3D face modeling has a large gap between high-end and low-end methods. At the high end, the best facial animation is indistinguishable from real humans, but this comes at the cost of extensive manual labor. At the low end, face capture from consumer depth sensors relies on 3D face models that are not expressive enough to capture the variability in natural facial shape and expression. We seek a middle ground by learning a facial model from thousands of accurately aligned 3D scans. Our FLAME model (Faces Learned with an Articulated Model and Expressions) is designed to work with existing graphics software and be easy to fit to data. FLAME uses a linear shape space trained from 3800 scans of human heads. FLAME combines this linear shape space with an articulated jaw, neck, and eyeballs, pose-dependent corrective blendshapes, and additional global expression blendshapes. The pose- and expression-dependent articulations are learned from 4D face sequences in the D3DFACS dataset along with additional 4D sequences. We accurately register a template mesh to the scan sequences and make the D3DFACS registrations available for research purposes. In total the model is trained from over 33,000 scans. FLAME is low-dimensional but more expressive than the FaceWarehouse model and the Basel Face Model. We compare FLAME to these models by fitting them to static 3D scans and 4D sequences using the same optimization method. FLAME is significantly more accurate and is available for research purposes (http://flame.is.tue.mpg.de).

629 citations
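The composition the abstract describes (a linear identity shape space, global expression blendshapes, and pose-dependent correctives, followed by skinning over jaw, neck, and eyeball joints) can be pictured as below. This is a hedged sketch, not the released model code; the array shapes (300 shape and 100 expression components, 36 pose correctives from four joints, 5023 vertices) are assumptions based on the published model.

import numpy as np

N_VERTS = 5023  # vertex count of the FLAME template (assumed here)
template   = np.zeros((N_VERTS, 3))        # mean head mesh
shape_dirs = np.zeros((N_VERTS, 3, 300))   # linear identity shape space
expr_dirs  = np.zeros((N_VERTS, 3, 100))   # global expression blendshapes
pose_dirs  = np.zeros((N_VERTS, 3, 36))    # pose-dependent correctives (4 joints x 9 rotation entries)

def flame_rest_pose(beta, psi, pose_feature):
    # Compose the rest-pose mesh before linear blend skinning (LBS):
    # template + identity offsets + expression offsets + pose correctives.
    v = template.copy()
    v += shape_dirs @ beta          # identity: (N, 3, 300) @ (300,) -> (N, 3)
    v += expr_dirs @ psi            # expression offsets
    v += pose_dirs @ pose_feature   # corrective blendshapes
    return v  # LBS with the articulated jaw, neck, and eyeballs is applied to this mesh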

Proceedings ArticleDOI
03 Apr 2019
TL;DR: This work proposes a truly differentiable rendering framework that directly renders colorized meshes using differentiable functions and back-propagates efficient supervision signals to mesh vertices and their attributes from various forms of image representations, including silhouette, shading and color images.
Abstract: Rendering bridges the gap between 2D vision and 3D scenes by simulating the physical process of image formation. By inverting such a renderer, one can think of a learning approach to infer 3D information from 2D images. However, standard graphics renderers involve a fundamental discretization step called rasterization, which prevents the rendering process from being differentiable, and hence from being learned. Unlike state-of-the-art differentiable renderers, which only approximate the rendering gradient during back-propagation, we propose a truly differentiable rendering framework that is able to (1) directly render colorized meshes using differentiable functions and (2) back-propagate efficient supervision signals to mesh vertices and their attributes from various forms of image representations, including silhouette, shading and color images. The key to our framework is a novel formulation that views rendering as an aggregation function that fuses the probabilistic contributions of all mesh triangles with respect to the rendered pixels. Such a formulation enables our framework to flow gradients to occluded and far-range vertices, which cannot be achieved by previous state-of-the-art methods. We show that by using the proposed renderer, one can achieve significant improvement in 3D unsupervised single-view reconstruction both qualitatively and quantitatively. Experiments also demonstrate that our approach is able to handle the challenging tasks in image-based shape fitting, which remain nontrivial for existing differentiable renderers. Code is available at https://github.com/ShichenLiu/SoftRas.

566 citations
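The aggregation idea, each pixel's color as a softmax-style fusion of every triangle's probabilistic contribution plus a background term, can be sketched per pixel as follows. This is a simplified reading of the paper's formulation; the variable names and the unit background weight are assumptions, not the released SoftRas code.

import numpy as np

def soft_pixel_color(sq_dists, signs, depths, colors, bg, sigma=1e-4, gamma=1e-4):
    # sq_dists: (T,) squared screen-space distance from the pixel to each triangle
    # signs:    (T,) +1 if the pixel falls inside the triangle's projection, else -1
    # depths:   (T,) normalized depth of each triangle at this pixel (larger = closer)
    # colors:   (T, 3) per-triangle color at this pixel; bg: (3,) background color
    prob = 1.0 / (1.0 + np.exp(-signs * sq_dists / sigma))  # soft coverage per triangle
    w = prob * np.exp(depths / gamma)   # closer triangles dominate the fusion, but
    w_bg = 1.0                          # every triangle (and the background) keeps some
    total = w.sum() + w_bg              # weight, so gradients also reach occluded and
                                        # far-range vertices
    return (w[:, None] * colors).sum(axis=0) / total + (w_bg / total) * bg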

Journal ArticleDOI
TL;DR: An architecture and toolkit for building dialogue managers currently being developed in the TRINDI project is introduced, based on the notions of information state and dialogue move engine, to make implementation of dialogue processing theories easier.
Abstract: We introduce an architecture and toolkit for building dialogue managers currently being developed in the TRINDI project, based on the notions of information state and dialogue move engine. The aim is to provide a framework for experimenting with implementations of different theories of information state, information state update and dialogue control. A number of dialogue managers are currently being built using the toolkit, and we present overviews of two of them. We believe that this framework will make implementation of dialogue processing theories easier and will facilitate comparison of different types of dialogue systems, thus helping to establish a best practice for developing the dialogue management component of a spoken dialogue system.

541 citations
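A minimal sketch of the information-state idea: the dialogue move engine holds a shared information state and applies update rules, each a precondition plus an effect, as dialogue moves are observed. The rule below is an invented example for illustration, not one of the TRINDI toolkit's rules.

def precond_integrate_answer(s):
    # Applicable when the latest move answers the question under discussion (QUD).
    return bool(s["qud"]) and s["last_move"] is not None and s["last_move"][0] == "answer"

def effect_integrate_answer(s):
    # Pop the QUD and record the resolved question among the shared commitments.
    question = s["qud"].pop(0)
    s["shared"].add((question, s["last_move"][1]))

UPDATE_RULES = [(precond_integrate_answer, effect_integrate_answer)]

def dialogue_move_engine(state, move):
    # Integrate one observed dialogue move, then fire every applicable update rule.
    state["last_move"] = move
    for precond, effect in UPDATE_RULES:
        if precond(state):
            effect(state)
    return state

state = {"qud": ["favorite-color?"], "shared": set(), "last_move": None}
dialogue_move_engine(state, ("answer", "blue"))  # shared now holds ("favorite-color?", "blue")

Swapping in a different theory of information state then means changing the state structure and the rule set, not the engine, which is the comparison the toolkit is meant to enable.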

Proceedings ArticleDOI
05 May 2014
TL;DR: SimSensei Kiosk is an implemented virtual human interviewer designed to create an engaging face-to-face interaction where the user feels comfortable talking and sharing information; the design effort culminated in a fully automatic virtual interviewer able to engage users in 15-25 minute interactions.
Abstract: We present SimSensei Kiosk, an implemented virtual human interviewer designed to create an engaging face-to-face interaction where the user feels comfortable talking and sharing information. SimSensei Kiosk is also designed to create interactional situations favorable to the automatic assessment of distress indicators, defined as verbal and nonverbal behaviors correlated with depression, anxiety or post-traumatic stress disorder (PTSD). In this paper, we summarize the design methodology, performed over the past two years, which is based on three main development cycles: (1) analysis of face-to-face human interactions to identify potential distress indicators, dialogue policies and virtual human gestures, (2) development and analysis of a Wizard-of-Oz prototype system where two human operators were deciding the spoken and gestural responses, and (3) development of a fully automatic virtual interviewer able to engage users in 15-25 minute interactions. We show the potential of our fully automatic virtual human interviewer in a user study, and situate its performance in relation to the Wizard-of-Oz prototype.

459 citations


Authors

Showing the top 15 of 201 authors, ranked by H-index

Name                      H-index   Papers   Citations
Louis-Philippe Morency    74        392      18,307
Albert Rizzo              67        331      16,040
Stacy Marsella            58        276      12,969
Jonathan Gratch           57        393      14,163
Hao Li                    56        221      10,232
Paul Debevec              56        240      18,468
Michele T. Pato           52        181      29,841
David Traum               50        303      10,800
Thomas D. Parsons         47        213      8,034
Paul S. Rosenbloom        41        173      11,230
Mark Bolas                38        147      5,238
Stefan Scherer            38        149      5,766
David V. Pynadath         37        101      4,614
William R. Swartout       34        73       6,189
Kenji Sagae               32        102      3,699
Performance Metrics
No. of papers from the Institution in previous years
Year   Papers
2022   1
2021   16
2020   19
2019   23
2018   44
2017   28