
Showing papers in "Computer Graphics Forum in 2018"


Journal ArticleDOI
TL;DR: In this article, a patch-based learning method is proposed for estimating local shape properties in point clouds, where a series of local patches at multiple scales around each point is encoded in a structured manner.
Abstract: In this paper, we propose PCPNET, a deep-learning based approach for estimating local 3D shape properties in point clouds. In contrast to the majority of prior techniques that concentrate on global or mid-level attributes, e.g., for shape classification or semantic labeling, we suggest a patch-based learning method, in which a series of local patches at multiple scales around each point is encoded in a structured manner. Our approach is especially well-adapted for estimating local shape properties such as normals (both unoriented and oriented) and curvature from raw point clouds in the presence of strong noise and multi-scale features. Our main contributions include both a novel multi-scale variant of the recently proposed PointNet architecture with emphasis on local shape information, and a series of novel applications in which we demonstrate how learning from training data arising from well-structured triangle meshes, and applying the trained model to noisy point clouds can produce superior results compared to specialized state-of-the-art techniques. Finally, we demonstrate the utility of our approach in the context of shape reconstruction, by showing how it can be used to extract normal orientation information from point clouds.

253 citations
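
For context, here is a minimal sketch of the classical PCA baseline that learned approaches like PCPNET are typically compared against: the unoriented normal at a query point is the smallest-eigenvalue eigenvector of the local neighborhood covariance, probed here at several hand-picked radii. All names, radii, and the random cloud are illustrative, not the paper's code.

```python
import numpy as np

def pca_normal(points, query, radius):
    """Unoriented normal at `query`: the eigenvector of the neighborhood
    covariance with the smallest eigenvalue."""
    nbrs = points[np.linalg.norm(points - query, axis=1) < radius]
    if len(nbrs) < 3:
        return None  # not enough support at this scale
    centered = nbrs - nbrs.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(centered.T @ centered)
    return eigvecs[:, 0]  # eigh returns eigenvalues in ascending order

# Hand-tuned multi-scale probing; PCPNET instead learns a structured
# encoding of such multi-scale patches end-to-end.
rng = np.random.default_rng(0)
cloud = rng.uniform(size=(2000, 3))
normals = {r: pca_normal(cloud, cloud[0], r) for r in (0.05, 0.1, 0.2)}
```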


Journal ArticleDOI
TL;DR: Long‐standing efforts in the computer graphics and vision communities to build computerized tools for reconstructing, tracking, and analyzing human faces based on visual input have led to novel and powerful algorithms that obtain impressive results even in the very challenging case of reconstruction from a single RGB or RGB‐D camera.
Abstract: The computer graphics and vision communities have dedicated long standing efforts in building computerized tools for reconstructing, tracking, and analyzing human faces based on visual input. Over the past years rapid progress has been made, which led to novel and powerful algorithms that obtain impressive results even in the very challenging case of reconstruction from a single RGB or RGB‐D camera. The range of applications is vast and steadily growing as these technologies are further improving in speed, accuracy, and ease of use.

251 citations


Journal ArticleDOI
TL;DR: This report explains, compares, and critically analyzes in detail the common underlying algorithmic concepts that enabled recent developments in RGB‐D scene reconstruction, and shows how algorithms are designed to best exploit the benefits of RGB‐D data while suppressing their often non‐trivial data distortions.
Abstract: The advent of affordable consumer grade RGB‐D cameras has brought about a profound advancement of visual scene reconstruction methods. Both computer graphics and computer vision researchers spend significant effort to develop entirely new algorithms to capture comprehensive shape models of static and dynamic scenes with RGB‐D cameras. This led to significant advances of the state of the art along several dimensions. Some methods achieve very high reconstruction detail, despite limited sensor resolution. Others even achieve real‐time performance, yet possibly at lower quality. New concepts were developed to capture scenes at larger spatial and temporal extent. Other recent algorithms flank shape reconstruction with concurrent material and lighting estimation, even in general scenes and unconstrained conditions. In this state‐of‐the‐art report, we analyze these recent developments in RGB‐D scene reconstruction in detail and review essential related work. We explain, compare, and critically analyze the common underlying algorithmic concepts that enabled these recent advancements. Furthermore, we show how algorithms are designed to best exploit the benefits of RGB‐D data while suppressing their often non‐trivial data distortions. In addition, this report identifies and discusses important open research questions and suggests relevant directions for future work.

248 citations
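
One representative concept such reports cover is truncated signed distance (TSDF) fusion, the weighted running average used by KinectFusion-style pipelines to integrate noisy depth observations into a voxel grid. The sketch below is a hedged illustration; array shapes, the truncation band, and the random "observations" are assumptions, not any specific system's code.

```python
import numpy as np

def integrate_tsdf(tsdf, weight, signed_dist, trunc=0.05, max_weight=64.0):
    """One fusion step: clamp each voxel's signed distance to the
    truncation band and blend it into the volume by weighted average."""
    d = np.clip(signed_dist / trunc, -1.0, 1.0)   # normalized, truncated SDF
    valid = signed_dist > -trunc                  # skip voxels far behind the surface
    w_new = valid.astype(np.float64)
    tsdf_new = (weight * tsdf + w_new * d) / np.maximum(weight + w_new, 1e-9)
    return np.where(valid, tsdf_new, tsdf), np.minimum(weight + w_new, max_weight)

vol = np.zeros((64, 64, 64)); w = np.zeros_like(vol)
obs = np.random.default_rng(0).uniform(-0.2, 0.2, vol.shape)  # stand-in distances
vol, w = integrate_tsdf(vol, w, obs)
```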


Journal ArticleDOI
TL;DR: This survey presents a comprehensive review of the IK problem and the solutions developed over the years from the computer graphics point of view, and suggests which IK family of solvers is best suited for particular problems.
Abstract: Inverse kinematics (IK) is the use of kinematic equations to determine the joint parameters of a manipulator so that the end effector moves to a desired position; IK can be applied in many areas, including robotics, engineering, computer graphics and video games. In this survey, we present a comprehensive review of the IK problem and the solutions developed over the years from the computer graphics point of view. The paper starts with the definitions of forward and inverse kinematics and their mathematical formulations, and explains how to distinguish the unsolvable cases, indicating when a solution is available. The IK literature in this report is divided into four main categories: the analytical, the numerical, the data-driven and the hybrid methods. A timeline illustrating key methods is presented, explaining how the IK approaches have progressed over the years. The most popular IK methods are discussed with regard to their performance, computational cost and the smoothness of their resulting postures, and we suggest which IK family of solvers is best suited for particular problems. Finally, we indicate the limitations of the current IK methodologies and propose future research directions.

171 citations
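
To make the numerical family concrete, here is a minimal damped-least-squares (Levenberg-Marquardt) IK sketch for a planar chain, one of the standard numerical solvers such surveys discuss. Link lengths, damping, and the target are chosen purely for illustration.

```python
import numpy as np

def fk(thetas, lengths):
    """End-effector position of a planar joint chain (forward kinematics)."""
    angles = np.cumsum(thetas)
    return np.array([np.sum(lengths * np.cos(angles)),
                     np.sum(lengths * np.sin(angles))])

def jacobian(thetas, lengths):
    """Column j: effect of joint j on the end-effector position."""
    angles = np.cumsum(thetas)
    J = np.zeros((2, len(thetas)))
    for j in range(len(thetas)):
        J[0, j] = -np.sum(lengths[j:] * np.sin(angles[j:]))
        J[1, j] =  np.sum(lengths[j:] * np.cos(angles[j:]))
    return J

def dls_ik(target, thetas, lengths, damping=0.1, iters=100):
    """Damped least squares: dtheta = J^T (J J^T + lambda^2 I)^-1 e."""
    for _ in range(iters):
        e = target - fk(thetas, lengths)
        J = jacobian(thetas, lengths)
        thetas = thetas + J.T @ np.linalg.solve(
            J @ J.T + damping**2 * np.eye(2), e)
    return thetas

lengths = np.array([1.0, 1.0])
print(fk(dls_ik(np.array([1.2, 0.8]), np.zeros(2), lengths), lengths))
```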


Journal ArticleDOI
TL;DR: This paper presents a method for generating HDR content from LDR content based on deep Convolutional Neural Networks (CNNs) termed ExpandNet, which accepts LDR images as input and generates images with an expanded range in an end‐to‐end fashion.
Abstract: High dynamic range (HDR) imaging provides the capability of handling real world lighting as opposed to the traditional low dynamic range (LDR) which struggles to accurately represent images with higher dynamic range. However, most imaging content is still available only in LDR. This paper presents a method for generating HDR content from LDR content based on deep Convolutional Neural Networks (CNNs) termed ExpandNet. ExpandNet accepts LDR images as input and generates images with an expanded range in an end-to-end fashion. The model attempts to reconstruct missing information that was lost from the original signal due to quantization, clipping, tone mapping or gamma correction. The added information is reconstructed from learned features, as the network is trained in a supervised fashion using a dataset of HDR images. The approach is fully automatic and data driven; it does not require any heuristics or human expertise. ExpandNet uses a multiscale architecture which avoids the use of upsampling layers to improve image quality. The method performs well compared to expansion/inverse tone mapping operators quantitatively on multiple metrics, even for badly exposed inputs.

164 citations
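
For contrast with the learned approach, a naive non-learned expansion (inverse display gamma followed by a fixed linear stretch) looks like the sketch below; it recovers none of the information lost to clipping or quantization, which is exactly what ExpandNet is trained to reconstruct. The gamma and scale values are assumptions for illustration.

```python
import numpy as np

def naive_expand(ldr, gamma=2.2, scale=10.0):
    """Inverse-gamma linearization plus a fixed range expansion; a crude
    baseline, not ExpandNet itself."""
    linear = np.power(np.clip(ldr, 0.0, 1.0), gamma)  # undo display gamma
    return linear * scale                              # stretch to an HDR-ish range

ldr = np.random.rand(4, 4, 3).astype(np.float32)  # stand-in for an LDR image in [0, 1]
hdr = naive_expand(ldr)
```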


Journal ArticleDOI
TL;DR: This survey attempts to report, categorize and unify the diverse understandings and aims to establish a common vocabulary that will enable a wide audience to understand their differences and subtleties.
Abstract: The visualization community has developed to date many intuitions and understandings of how to judge the quality of views in visualizing data. The computation of a visualization's quality and usefulness ranges from measuring clutter and overlap, up to the existence and perception of specific (visual) patterns. This survey attempts to report, categorize and unify the diverse understandings and aims to establish a common vocabulary that will enable a wide audience to understand their differences and subtleties. For this purpose, we present a commonly applicable quality metric formalization that should detail and relate all constituting parts of a quality metric. We organize our corpus of reviewed research papers along the data types established in the information visualization community: multi‐ and high‐dimensional, relational, sequential, geospatial and text data. For each data type, we select the visualization subdomains in which quality metrics are an active research field and report their findings, reason on the underlying concepts, describe goals and outline the constraints and requirements. One central goal of this survey is to provide guidance on future research opportunities for the field and outline how different visualization communities could benefit from each other by applying or transferring knowledge to their respective subdomain. Additionally, we aim to motivate the visualization community to compare computed measures to the perception of humans.

126 citations


Journal ArticleDOI
TL;DR: This report analyzes current research contributions through the lens of three categories of sports data: box score data, tracking data, and meta‐data (data about the sport and its participants but not necessarily a given game), identifying critical research gaps and valuable opportunities for the visualization community.
Abstract: In this report, we organize and reflect on recent advances and challenges in the field of sports data visualization. The exponentially-growing body of visualization research based on sports data is a prime indication of the importance and timeliness of this report. Sports data visualization research encompasses the breadth of visualization tasks and goals: exploring the design of new visualization techniques; adapting existing visualizations to a novel domain; and conducting design studies and evaluations in close collaboration with experts, including practitioners, enthusiasts, and journalists. Frequently this research has impact beyond sports in both academia and in industry because it is i) grounded in realistic, highly heterogeneous data, ii) applied to real-world problems, and iii) designed in close collaboration with domain experts. In this report, we analyze current research contributions through the lens of three categories of sports data: box score data (data containing statistical summaries of a sport event such as a game), tracking data (data about in-game actions and trajectories), and meta-data (data about the sport and its participants but not necessarily a given game). We conclude this report with a high-level discussion of sports visualization research informed by our analysis—identifying critical research gaps and valuable opportunities for the visualization community. More information is available at the STAR’s website: https://sportsdataviz.github.io/.

103 citations


Journal ArticleDOI
TL;DR: This paper presents a coherent survey of methods that utilize Monte Carlo integration for estimating light transport in scenes containing participating media, and includes earlier methods that are key for building light transport paths in a stochastic manner.
Abstract: The wide adoption of path‐tracing algorithms in high‐end realistic rendering has stimulated many diverse research initiatives. In this paper we present a coherent survey of methods that utilize Monte Carlo integration for estimating light transport in scenes containing participating media. Our work complements the volume‐rendering state‐of‐the‐art report by Cerezo et al. [ CPP*05 ]; we review publications accumulated since its publication over a decade ago, and include earlier methods that are key for building light transport paths in a stochastic manner. We begin by describing analog and non‐analog procedures for free‐path sampling and discuss various expected‐value, collision, and track‐length estimators for computing transmittance. We then review the various rendering algorithms that employ these as building blocks for path sampling. Special attention is devoted to null‐collision methods that utilize fictitious matter to handle spatially varying densities; we import two “next‐flight” estimators originally developed in nuclear sciences. Whenever possible, we draw connections between image‐synthesis techniques and methods from particle physics and neutron transport to provide the reader with a broader context.

96 citations
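
One of the null-collision building blocks such surveys cover can be sketched compactly: ratio tracking estimates transmittance through a spatially varying medium by free-flight sampling in a majorant (real plus fictitious) medium and multiplying in the null-collision probability at each tentative collision. The extinction function and majorant below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigma_t(x):
    """Illustrative spatially varying extinction along a ray in [0, 1]."""
    return 1.0 + np.sin(6.0 * x) ** 2   # bounded by SIGMA_MAX below

SIGMA_MAX = 2.0  # majorant: real + fictitious (null) matter

def ratio_tracking_T(x0=0.0, x1=1.0):
    """Single-sample null-collision estimate of T = exp(-int sigma_t)."""
    T, x = 1.0, x0
    while True:
        x += -np.log(1.0 - rng.random()) / SIGMA_MAX  # free flight in majorant medium
        if x >= x1:
            return T
        T *= 1.0 - sigma_t(x) / SIGMA_MAX  # weight by null-collision probability

est = np.mean([ratio_tracking_T() for _ in range(10000)])
```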


Journal ArticleDOI
TL;DR: This contribution describes the background of sentiment analysis, introduces a categorization for sentiment visualization techniques that includes 7 groups with 35 categories in total, and discusses 132 techniques from peer‐reviewed publications together with an interactive web‐based survey browser.
Abstract: Visualization of sentiments and opinions extracted from or annotated in texts has become a prominent topic of research over the last decade. From basic pie and bar charts used to illustrate customer reviews to extensive visual analytics systems involving novel representations, sentiment visualization techniques have evolved to deal with complex multidimensional data sets, including temporal, relational, and geospatial aspects. This contribution presents a survey of sentiment visualization techniques based on a detailed categorization. We describe the background of sentiment analysis, introduce a categorization for sentiment visualization techniques that includes 7 groups with 35 categories in total, and discuss 132 techniques from peer-reviewed publications together with an interactive web-based survey browser. Finally, we discuss insights and opportunities for further research in sentiment visualization. We expect this survey to be useful for visualization researchers whose interests include sentiment or other aspects of text data as well as researchers and practitioners from other disciplines in search of efficient visualization techniques applicable to their tasks and data.

89 citations



Journal ArticleDOI
TL;DR: This work finds that colored scatterplots (with positionally‐coded quantities and color‐coded categories) perform well for comparing individual points, but perform poorly for summary tasks as the number of categories increases, and suggests improved approaches for automated design.
Abstract: In addition to the choice of visual encodings, the effectiveness of a data visualization may vary with the analytical task being performed and the distribution of data values. To better assess these effects and create refined rankings of visual encodings, we conduct an experiment measuring subject performance across task types (e.g., comparing individual versus aggregate values) and data distributions (e.g., with varied cardinalities and entropies). We compare performance across 12 encoding specifications of trivariate data involving 1 categorical and 2 quantitative fields, including the use of x, y, color, size, and spatial subdivision (i.e., faceting). Our results extend existing models of encoding effectiveness and suggest improved approaches for automated design. For example, we find that colored scatterplots (with positionally‐coded quantities and color‐coded categories) perform well for comparing individual points, but perform poorly for summary tasks as the number of categories increases.

Journal ArticleDOI
TL;DR: This work argues that the main goal of doing visual analytics is to build a mental and/or formal model of a certain piece of reality reflected in data, and proposes a detailed conceptual framework in which the visual analytics process is considered as a goal‐oriented workflow producing a model as a result.
Abstract: To complement the currently existing definitions and conceptual frameworks of visual analytics, which focus mainly on activities performed by analysts and types of techniques they use, we attempt to define the expected results of these activities. We argue that the main goal of doing visual analytics is to build a mental and/or formal model of a certain piece of reality reflected in data. The purpose of the model may be to understand, to forecast or to control this piece of reality. Based on this model-building perspective, we propose a detailed conceptual framework in which the visual analytics process is considered as a goal-oriented workflow producing a model as a result. We demonstrate how this framework can be used for performing an analytical survey of the visual analytics research field and identifying the directions and areas where further research is needed.

Journal ArticleDOI
TL;DR: This report summarizes novel enhancements that simplify the 3D geometric calibration task, which can now be reliably carried out either interactively or automatically using self‐calibration methods, together with improvements regarding radiometric calibration and compensation.
Abstract: This State‐of‐the‐Art‐Report covers the recent advances in research fields related to projection mapping applications. We summarize the novel enhancements to simplify the 3D geometric calibration task, which can now be reliably carried out either interactively or automatically using self‐calibration methods. Furthermore, improvements regarding radiometric calibration and compensation as well as the neutralization of global illumination effects are summarized. We then introduce computational display approaches to overcome technical limitations of current projection hardware in terms of dynamic range, refresh rate, spatial resolution, depth‐of‐field, view dependency, and color space. These technologies contribute towards creating new application domains related to projection‐based spatial augmentations. We summarize these emerging applications, and discuss new directions for industries.

Journal ArticleDOI
TL;DR: The method is based on the previously‐proposed persistence diagrams associated with real‐valued functions and on the analysis of the derivatives of these diagrams with respect to changes in the function values, which allows continuous optimization techniques to modify a given function while optimizing an energy based purely on the values in the persistence diagrams.
Abstract: We present a novel approach for optimizing real-valued functions based on a wide range of topological criteria. In particular, we show how to modify a given function in order to remove topological noise and to exhibit prescribed topological features. Our method is based on using the previously-proposed persistence diagrams associated with real-valued functions, and on the analysis of the derivatives of these diagrams with respect to changes in the function values. This analysis allows us to use continuous optimization techniques to modify a given function, while optimizing an energy based purely on the values in the persistence diagrams. We also present a procedure for aligning persistence diagrams of functions on different domains, without requiring a mapping between them. Finally, we demonstrate the utility of these constructions in the context of the functional map framework, by first giving a characterization of functional maps that are associated with continuous point-to-point correspondences, directly in the functional domain, and then by presenting an optimization scheme that helps to promote the continuity of functional maps, when expressed in the reduced basis, without imposing any restrictions on metric distortion. We demonstrate that our approach is efficient and can lead to improvement in the accuracy of maps computed in practice.
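
As background for readers unfamiliar with the machinery, here is a minimal sketch of how 0-dimensional persistence pairs of a function sampled on a line graph can be computed with union-find: components are born at local minima, and at each merge the younger component dies at the merging value (the elder rule). This is the standard construction, not the paper's optimization code.

```python
import numpy as np

def persistence_pairs_1d(values):
    """0-dim sublevel-set persistence of a function on a line graph."""
    parent, birth, pairs = {}, {}, []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in np.argsort(values):           # sweep vertices by function value
        i = int(i)
        parent[i], birth[i] = i, values[i]
        for j in (i - 1, i + 1):           # line-graph neighbors already alive
            if j in parent:
                ri, rj = find(i), find(j)
                if ri != rj:
                    young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                    if birth[young] < values[i]:   # skip zero-persistence pairs
                        pairs.append((birth[young], values[i]))
                    parent[young] = old
    return pairs  # (birth, death); the global minimum's class never dies

print(persistence_pairs_1d(np.array([0.0, 2.0, 0.5, 3.0, 1.0])))
# -> [(0.5, 2.0), (1.0, 3.0)]
```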

Journal ArticleDOI
TL;DR: This paper provides a literature survey cataloguing the tasks for which word vector embeddings are employed across a broad range of applications, and presents visual interactive designs that address many of these tasks.
Abstract: Word vector embeddings are an emerging tool for natural language processing. They have proven beneficial for a wide variety of language processing tasks. Their utility stems from the ability to encode word relationships within the vector space. Applications range from components in natural language processing systems to tools for linguistic analysis in the study of language and literature. In many of these applications, interpreting embeddings and understanding the encoded grammatical and semantic relations between words is useful, but challenging. Visualization can aid in such interpretation of embeddings. In this paper, we examine the role for visualization in working with word vector embeddings. We provide a literature survey to catalogue the range of tasks where the embeddings are employed across a broad range of applications. Based on this survey, we identify key tasks and their characteristics. Then, we present visual interactive designs that address many of these tasks. The designs integrate into an exploration and analysis environment for embeddings. Finally, we provide example use cases for them and discuss domain user feedback.
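
The primitive behind many of the exploration tasks the survey catalogues is cosine-similarity neighborhood lookup. A minimal sketch with a made-up vocabulary and random stand-in vectors (a trained embedding matrix would be loaded instead):

```python
import numpy as np

def nearest_neighbors(E, vocab, word, k=5):
    """k nearest words to `word` by cosine similarity in embedding space."""
    v = E[vocab[word]]
    sims = E @ v / (np.linalg.norm(E, axis=1) * np.linalg.norm(v) + 1e-9)
    inv = {i: w for w, i in vocab.items()}
    best = np.argsort(-sims)
    return [(inv[int(i)], float(sims[i])) for i in best[1:k + 1]]  # skip the query itself

rng = np.random.default_rng(0)
vocab = {w: i for i, w in enumerate(['king', 'queen', 'man', 'woman', 'apple'])}
E = rng.normal(size=(len(vocab), 50))   # stand-in for trained embeddings
print(nearest_neighbors(E, vocab, 'king', k=3))
```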

Journal ArticleDOI
TL;DR: A review of the use of crowdsourcing for evaluation in visualization research, analyzing 190 crowdsourcing experiments reported in 82 papers published in major visualization conferences and journals between 2006 and 2017.
Abstract: Visualization researchers have been increasingly leveraging crowdsourcing approaches to overcome a number of limitations of controlled laboratory experiments, including small participant sample sizes and narrow demographic backgrounds of study participants. However, as a community, we have little understanding on when, where, and how researchers use crowdsourcing approaches for visualization research. In this paper, we review the use of crowdsourcing for evaluation in visualization research. We analyzed 190 crowdsourcing experiments, reported in 82 papers that were published in major visualization conferences and journals between 2006 and 2017. We tagged each experiment along 36 dimensions that we identified for crowdsourcing experiments. We grouped our dimensions into six important aspects: study design & procedure, task type, participants, measures & metrics, quality assurance, and reproducibility. We report on the main findings of our review and discuss challenges and opportunities for improvements in conducting crowdsourcing studies for visualization research.

Journal ArticleDOI
TL;DR: In this paper, a neural network is used to generate small-scale splashes for the fluid-implicit-particle method using training data acquired from physically parametrized, high-resolution simulations.
Abstract: This paper proposes a new data-driven approach to model detailed splashes for liquid simulations with neural networks. Our model learns to generate small-scale splash detail for the fluid-implicit-particle method using training data acquired from physically parametrized, high resolution simulations. We use neural networks to model the regression of splash formation using a classifier together with a velocity modifier. For the velocity modification, we employ a heteroscedastic model. We evaluate our method for different spatial scales, simulation setups, and solvers. Our simulation results demonstrate that our model significantly improves visual fidelity with a large amount of realistic droplet formation and yields splash detail much more efficiently than finer discretizations.
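
The heteroscedastic idea mentioned above can be illustrated in a few lines: the model predicts both a mean and a per-sample log-variance and is trained with the Gaussian negative log-likelihood, so samples with predicted high uncertainty are penalized less for large errors. A sketch with made-up numbers, not the paper's network:

```python
import numpy as np

def heteroscedastic_nll(y, mu, log_var):
    """Gaussian NLL (up to a constant) with input-dependent variance."""
    return np.mean(0.5 * (np.exp(-log_var) * (y - mu) ** 2 + log_var))

# Two predictions with equal error but different predicted confidence.
y = np.array([1.0, 1.0])
mu = np.array([0.8, 0.8])
print(heteroscedastic_nll(y, mu, np.array([-2.0, 0.0])))  # confident vs. not
```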

Journal ArticleDOI
TL;DR: This work considers five categories of data reduction techniques based on their information loss, surveys available techniques in each, summarizes their properties from a practical point of view and discusses relative merits within each category.
Abstract: Data reduction is increasingly being applied to scientific data for numerical simulations, scientific visualizations and data analyses. It is most often used to lower I/O and storage costs, and sometimes to lower in‐memory data size as well. With this paper, we consider five categories of data reduction techniques based on their information loss: (1) truly lossless, (2) near lossless, (3) lossy, (4) mesh reduction and (5) derived representations. We then survey available techniques in each of these categories, summarize their properties from a practical point of view and discuss relative merits within a category. We believe, in total, this work will enable simulation scientists and visualization/data analysis scientists to decide which data reduction techniques will be most helpful for their needs.

Journal ArticleDOI
TL;DR: GazeDirector changes where people are looking without person‐specific training data and with full articulation, i.e. new gaze directions can be precisely specified in 3D.
Abstract: We present GazeDirector, a new approach for eye gaze redirection that uses model-fitting. Our method first tracks the eyes by fitting a multi-part eye region model to video frames using analysis-by-synthesis, thereby recovering eye region shape, texture, pose, and gaze simultaneously. It then redirects gaze by 1) warping the eyelids from the original image using a model-derived flow field, and 2) rendering and compositing synthesized 3D eyeballs onto the output image in a photorealistic manner. GazeDirector allows us to change where people are looking without person-specific training data, and with full articulation, i.e. we can precisely specify new gaze directions in 3D. Quantitatively, we evaluate both model-fitting and gaze synthesis, with experiments for gaze estimation and redirection on the Columbia gaze dataset. Qualitatively, we compare GazeDirector against recent work on gaze redirection, showing better results especially for large redirection angles. Finally, we demonstrate gaze redirection on YouTube videos by introducing new 3D gaze targets and by manipulating visual behavior.

Journal ArticleDOI
TL;DR: A learning‐based method is proposed to recover low‐frequency scene illumination, represented as spherical harmonic functions, from paired photos taken by the rear and front cameras on mobile devices; it produces visually and quantitatively superior results compared to the state of the art.
Abstract: Illumination estimation is an essential problem in computer vision, graphics and augmented reality. In this paper, we propose a learning based method to recover low‐frequency scene illumination represented as spherical harmonic (SH) functions from paired photos taken by the rear and front cameras on mobile devices. An end‐to‐end deep convolutional neural network (CNN) structure is designed to process images on symmetric views and predict SH coefficients. We introduce a novel Render Loss to improve the rendering quality of the predicted illumination. A high quality high dynamic range (HDR) panoramic image dataset was developed for training and evaluation. Experiments show that our model produces visually and quantitatively superior results compared to the state of the art. Moreover, our method is practical for mobile‐based applications.
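
For readers unfamiliar with the representation, 9 SH lighting coefficients can be turned into diffuse irradiance analytically via Ramamoorthi and Hanrahan's per-band factors. The sketch below shows the evaluation a renderer would perform with coefficients like those such a network predicts; the constant environment at the end is a made-up input.

```python
import numpy as np

def sh_basis(n):
    """First 9 real spherical harmonics at unit direction n = (x, y, z)."""
    x, y, z = n
    return np.array([
        0.282095,
        0.488603 * y, 0.488603 * z, 0.488603 * x,
        1.092548 * x * y, 1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z, 0.546274 * (x * x - y * y),
    ])

# Band-wise convolution with the clamped-cosine kernel turns SH radiance
# into irradiance; factors for bands l = 0, 1, 2.
A = np.array([np.pi] + [2.0 * np.pi / 3.0] * 3 + [np.pi / 4.0] * 5)

def irradiance(sh_coeffs, normal):
    """Diffuse irradiance at a surface with the given normal."""
    n = normal / np.linalg.norm(normal)
    return float(np.dot(A * sh_coeffs, sh_basis(n)))

L = np.zeros(9); L[0] = 1.0            # hypothetical constant environment
print(irradiance(L, np.array([0.0, 0.0, 1.0])))
```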

Journal ArticleDOI
TL;DR: An interactive tool that automatically generates a procedural grammar for a complete building from a single photograph, using convolutional neural networks and optimization to select and parameterize procedural grammars that reproduce the building elements of the picture.
Abstract: Creating a virtual city is demanded for computer games, movies, and urban planning, but it takes a lot of time to create numerous 3D building models. Procedural modeling has become popular in recent years to overcome this issue, but creating a grammar to get a desired output is difficult and time consuming even for expert users. In this paper, we present an interactive tool that allows users to automatically generate such a grammar from a single image of a building. The user selects a photograph and highlights the silhouette of the target building as input to our method. Our pipeline automatically generates the building components, from large-scale building mass to fine-scale windows and doors geometry. Each stage of our pipeline combines convolutional neural networks (CNNs) and optimization to select and parameterize procedural grammars that reproduce the building elements of the picture. In the first stage, our method jointly estimates camera parameters and building mass shape. Once known, the building mass enables the rectification of the facades, which are given as input to the second stage that recovers the facade layout. This layout allows us to extract individual windows and doors that are subsequently fed to the last stage of the pipeline that selects procedural grammars for windows and doors. Finally, the grammars are combined to generate a complete procedural building as output. We devise a common methodology to make each stage of this pipeline tractable. This methodology consists in simplifying the input image to match the visual appearance of synthetic training data, and in using optimization to refine the parameters estimated by CNNs. We used our method to generate a variety of procedural models of buildings from existing photographs.
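
As a toy illustration of what "selecting and parameterizing a grammar" means, the sketch below splits a facade into floors and window tiles; the paper's pipeline chooses among much richer grammar snippets and fits their parameters with CNNs and optimization. All names and numbers here are invented.

```python
def facade_grammar(width, height, floor_h=3.0, tile_w=2.5):
    """A toy split grammar: facade -> floors -> window tiles.
    Returns (label, y, x) placements; parameters are illustrative."""
    floors = int(height // floor_h)
    tiles = int(width // tile_w)
    return [[('window', f * floor_h, t * tile_w) for t in range(tiles)]
            for f in range(floors)]

print(facade_grammar(10.0, 9.0))
```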

Journal ArticleDOI
TL;DR: A hierarchical framework for the generation of building interiors based on a mixed integer quadratic programming (MIQP) formulation that can be used for residential building layouts and can be scaled up to large layouts such as office buildings, shopping malls, and supermarkets.
Acknowledgments: This work was supported by the KAUST Office of Sponsored Research (OSR) under Award No. OCRF-2014-CGR3-62140401, and the Visual Computing Center at KAUST. Ligang Liu is supported by the National Natural Science Foundation of China (61672482, 61672481, 11626253) and the One Hundred Talent Project of the Chinese Academy of Sciences. We would like to thank Virginia Unkefer for proofreading the paper.

Journal ArticleDOI
TL;DR: It is argued that the use of function products can have a wide‐reaching effect in extending the power of functional maps in a variety of applications, in particular by enabling the transfer of high‐frequency functions without changing the representation size or complexity.
Abstract: In this paper, we consider the problem of information transfer across shapes and propose an extension to the widely used functional map representation. Our main observation is that in addition to the vector space structure of the functional spaces, which has been heavily exploited in the functional map framework, the functional algebra (i.e., the ability to take pointwise products of functions) can significantly extend the power of this framework. Equipped with this observation, we show how to improve one of the key applications of functional maps, namely transferring real-valued functions without conversion to point-to-point correspondences. We demonstrate through extensive experiments that by decomposing a given function into a linear combination consisting not only of basis functions but also of their pointwise products, both the representation power and the quality of the function transfer can be improved significantly. Our modification, while computationally simple, allows us to achieve higher transfer accuracy while keeping the size of the basis and the functional map fixed. We also analyze the computational complexity of optimally representing functions through linear combinations of products in a given basis and prove NP-completeness in some general cases. Finally, we argue that the use of function products can have a wide-reaching effect in extending the power of functional maps in a variety of applications, in particular by enabling the transfer of high-frequency functions without changing the representation size or complexity.
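
The core observation translates into a very small experiment: appending pointwise products of basis functions to the dictionary can only reduce the least-squares residual when encoding a function. The random matrices below merely stand in for Laplace-Beltrami eigenfunctions sampled on mesh vertices.

```python
import numpy as np

def extended_dictionary(Phi):
    """Columns of Phi are basis functions on vertices; append all pointwise
    products Phi_i * Phi_j (the functional-algebra extension)."""
    n, k = Phi.shape
    prods = [Phi[:, i] * Phi[:, j] for i in range(k) for j in range(i, k)]
    return np.column_stack([Phi] + prods)

rng = np.random.default_rng(0)
Phi = rng.normal(size=(200, 8))        # stand-in for eigenfunctions
f = rng.normal(size=200)               # function to encode

# Compare residuals: plain basis vs. basis plus pointwise products.
for D in (Phi, extended_dictionary(Phi)):
    coeffs, *_ = np.linalg.lstsq(D, f, rcond=None)
    print(D.shape[1], np.linalg.norm(D @ coeffs - f))
```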

Journal ArticleDOI
TL;DR: This work implements and evaluates an alternative data exploration metaphor where the user remains seated and viewpoint change is only realisable through physical movements, and demonstrates that the prototype setup, named VirtualDesk, presents excellent results regarding user comfort and immersion, and performs equally or better in all analytical tasks.
Abstract: 3D representations are potentially useful under many circumstances, but suffer from long known perception and interaction challenges. Current immersive technologies, which combine stereoscopic displays and natural interaction, are being progressively seen as an opportunity to tackle this issue, but new guidelines and studies are still needed, especially regarding information visualization. Many proposed approaches are impractical for actual usage, resulting in user discomfort or requiring too much time or space. In this work, we implement and evaluate an alternative data exploration metaphor where the user remains seated and viewpoint change is only realisable through physical movements. All manipulation is done directly by natural mid‐air gestures, with the data being rendered at arm's reach. The virtual reproduction of the analyst's desk aims to increase immersion and enable tangible interaction with controls and two dimensional associated information. A comparative user study was carried out against a desktop‐based equivalent, exploring a set of 9 perception and interaction tasks based on previous literature and a multidimensional projection use case. We demonstrate that our prototype setup, named VirtualDesk, presents excellent results regarding user comfort and immersion, and performs equally or better in all analytical tasks, while adding minimal or no time overhead and amplifying user subjective perceptions of efficiency and engagement. Results are also contrasted to a previous experiment employing artificial flying navigation, with significant observed improvements.

Journal ArticleDOI
TL;DR: This paper presents an approach to directly estimate an HDR light probe from a single LDR photograph, shot outdoors with a consumer camera, without specialized calibration targets or equipment, and shows that relighting objects with HDR light probes estimated by the method yields realistic results in a wide variety of settings.
Abstract: Image‐based lighting has allowed the creation of photo‐realistic computer‐generated content. However, it requires the accurate capture of the illumination conditions, a task neither easy nor intuitive, especially to the average digital photography enthusiast. This paper presents an approach to directly estimate an HDR light probe from a single LDR photograph, shot outdoors with a consumer camera, without specialized calibration targets or equipment. Our insight is to use a person's face as an outdoor light probe. To estimate HDR light probes from LDR faces we use an inverse rendering approach which employs data‐driven priors to guide the estimation of realistic, HDR lighting. We build compact, realistic representations of outdoor lighting both parametrically and in a data‐driven way, by training a deep convolutional autoencoder on a large dataset of HDR sky environment maps. Our approach can recover high‐frequency, extremely high dynamic range lighting environments. For quantitative evaluation of lighting estimation accuracy and relighting accuracy, we also contribute a new database of face photographs with corresponding HDR light probes. We show that relighting objects with HDR light probes estimated by our method yields realistic results in a wide variety of settings.


Journal ArticleDOI
TL;DR: This work identifies computational building blocks of user strategies, formalizes them, and investigates their potential for different machine learning tasks in systematic experiments, observing that data‐based user strategies work considerably well in early phases, while model‐based user strategies perform better during later phases.
Abstract: The labeling of data sets is a time‐consuming task, which is, however, an important prerequisite for machine learning and visual analytics. Visual‐interactive labeling (VIAL) provides users an active role in the process of labeling, with the goal to combine the potentials of humans and machines to make labeling more efficient. Recent experiments showed that users apply different strategies when selecting instances for labeling with visual‐interactive interfaces. In this paper, we contribute a systematic quantitative analysis of such user strategies. We identify computational building blocks of user strategies, formalize them, and investigate their potentials for different machine learning tasks in systematic experiments. The core insights of our experiments are as follows. First, we identified that particular user strategies can be used to considerably mitigate the bootstrap (cold start) problem in early labeling phases. Second, we observed that they have the potential to outperform existing active learning strategies in later phases. Third, we analyzed the identified core building blocks, which can serve as the basis for novel selection strategies. Overall, we observed that data‐based user strategies (clusters, dense areas) work considerably well in early phases, while model‐based user strategies (e.g., class separation) perform better during later phases. The insights gained from this work can be applied to develop novel active learning approaches as well as to better guide users in visual interactive labeling.
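
One of the data-based strategies described above can be sketched directly: seed the labeling process with the instances nearest to cluster centroids, which requires no trained model and hence sidesteps the cold-start phase. A hedged sketch using scikit-learn; dataset and cluster count are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

def centroid_seed_selection(X, n_labels):
    """Pick the instance nearest each k-means centroid as the first
    labeling batch (a cluster-based, model-free selection strategy)."""
    km = KMeans(n_clusters=n_labels, n_init=10, random_state=0).fit(X)
    return [int(np.argmin(np.linalg.norm(X - c, axis=1)))
            for c in km.cluster_centers_]

X = np.random.default_rng(0).normal(size=(500, 16))
print(centroid_seed_selection(X, n_labels=10))
```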

Journal ArticleDOI
TL;DR: A novel physically consistent implicit solver for the simulation of highly viscous fluids using the Smoothed Particle Hydrodynamics (SPH) formalism and it is demonstrated that the solver outperforms former approaches in terms of physical accuracy and memory consumption while it is comparable in Terms of computational performance.
Abstract: In this paper, we present a novel physically consistent implicit solver for the simulation of highly viscous fluids using the Smoothed Particle Hydrodynamics (SPH) formalism. Our method is the result of a theoretical and practical in‐depth analysis of the most recent implicit SPH solvers for viscous materials. Based on our findings, we developed a list of requirements that are vital to produce a realistic motion of a viscous fluid. These essential requirements include momentum conservation, a physically meaningful behavior under temporal and spatial refinement, the absence of ghost forces induced by spurious viscosities and the ability to reproduce complex physical effects that can be observed in nature. On the basis of several theoretical analyses, quantitative academic comparisons and complex visual experiments we show that none of the recent approaches is able to satisfy all requirements. In contrast, our proposed method meets all demands and therefore produces realistic animations in highly complex scenarios. We demonstrate that our solver outperforms former approaches in terms of physical accuracy and memory consumption while it is comparable in terms of computational performance. In addition to the implicit viscosity solver, we present a method to simulate melting objects. Therefore, we generalize the viscosity model to a spatially varying viscosity field and provide an SPH discretization of the heat equation.
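
The case for an implicit formulation can be shown in one dimension: backward-Euler diffusion of a velocity field remains stable for any time step, while an explicit step diverges once dt·nu/dx² exceeds 1/2 (here it is about 410). The sketch below uses a dense grid Laplacian purely for illustration; an SPH solver discretizes the same operator over particle neighborhoods.

```python
import numpy as np

n, dx, nu, dt = 64, 1.0 / 64, 10.0, 0.01
# 1D Laplacian with Dirichlet boundaries.
L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / dx**2

v = np.sin(np.linspace(0, np.pi, n))   # initial velocity profile
A = np.eye(n) - dt * nu * L            # backward Euler: (I - dt*nu*L) v_new = v
for _ in range(100):
    v = np.linalg.solve(A, v)          # unconditionally stable step
```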

Journal ArticleDOI
TL;DR: An overview of the state of the art in multi‐modal medical data visualization techniques is given, analysing the specific challenges that arise and how recent works, often using smart visibility techniques, aim to solve them.
Abstract: Multi-modal data of the complex human anatomy contain a wealth of information. To visualize and explore such data, techniques for emphasizing important structures and controlling visibility are essential. Such fused overview visualizations guide physicians to suspicious regions to be analysed in detail, e.g. with slice-based viewing. We give an overview of the state of the art in multi-modal medical data visualization techniques. Multi-modal medical data consist of multiple scans of the same subject using various acquisition methods, often combining multiple complementary types of information. Three-dimensional visualization techniques for multi-modal medical data can be used in diagnosis, treatment planning, doctor–patient communication as well as interdisciplinary communication. Over the years, multiple techniques have been developed in order to cope with the various associated challenges and present the relevant information from multiple sources in an insightful way. We present an overview of these techniques and analyse the specific challenges that arise in multi-modal data visualization and how recent works aimed to solve these, often using smart visibility techniques. We provide a taxonomy of these multi-modal visualization applications based on the modalities used and the visualization techniques employed. Additionally, we identify unsolved problems as potential future research directions.

Journal ArticleDOI
Sen-Zhe Xu1, Jun Hu1, Miao Wang1, Tai-Jiang Mu1, Shi-Min Hu1 
TL;DR: A novel online deep learning framework that learns the stabilization transformation for each unsteady frame given historical steady frames; it is composed of a generative network with spatial transformer networks embedded in different layers, and generates a stable frame for the incoming unstable frame by computing an appropriate affine transformation.
Abstract: Video stabilization is necessary for many hand‐held shot videos. In the past decades, although various video stabilization methods were proposed based on the smoothing of 2D, 2.5D or 3D camera paths, hardly have there been any deep learning methods to solve this problem. Instead of explicitly estimating and smoothing the camera path, we present a novel online deep learning framework to learn the stabilization transformation for each unsteady frame, given historical steady frames. Our network is composed of a generative network with spatial transformer networks embedded in different layers, and generates a stable frame for the incoming unstable frame by computing an appropriate affine transformation. We also introduce an adversarial network to determine the stability of a piece of video. The network is trained directly using pairs of steady and unsteady videos. Experiments show that our method can produce results similar to those of traditional methods; moreover, it is capable of handling challenging unsteady video of low quality, where traditional methods fail, such as video with heavy noise or multiple exposures. Our method runs in real time, which is much faster than traditional methods.
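
For contrast, the traditional path-smoothing baseline the abstract refers to fits in a few lines: estimate a camera trajectory, low-pass filter it, and warp each frame by the difference. The synthetic shaky path below stands in for a real per-frame motion estimate.

```python
import numpy as np

def smooth_path(path, window=15):
    """Low-pass filter a camera trajectory and return the per-frame
    compensating offsets (smoothed minus original path)."""
    kernel = np.ones(window) / window
    pad = window // 2
    padded = np.pad(path, ((pad, pad), (0, 0)), mode='edge')
    smoothed = np.column_stack([
        np.convolve(padded[:, d], kernel, mode='valid')
        for d in range(path.shape[1])
    ])
    return smoothed - path

# Random walk as a stand-in for an estimated shaky 2D camera path.
shaky = np.cumsum(np.random.default_rng(0).normal(size=(120, 2)), axis=0)
offsets = smooth_path(shaky)   # apply as a per-frame translation warp
```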