Visualization and Visual Analysis of Multifaceted Scientific Data: A Survey
01 Mar 2013 - IEEE Transactions on Visualization and Computer Graphics (IEEE) - Vol. 19, Iss. 3, pp. 495-513
TL;DR: This survey studies existing methods for visualization and interactive visual analysis of multifaceted scientific data and suggests new solutions for multirun and multimodel data as well as techniques that support a multitude of facets.
Abstract: Visualization and visual analysis play important roles in exploring, analyzing, and presenting scientific data. In many disciplines, data and model scenarios are becoming multifaceted: data are often spatiotemporal and multivariate; they stem from different data sources (multimodal data), from multiple simulation runs (multirun/ensemble data), or from multiphysics simulations of interacting phenomena (multimodel data resulting from coupled simulation models). Also, data can be of different dimensionality or structured on various types of grids that need to be related or fused in the visualization. This heterogeneity of data characteristics presents new opportunities as well as technical challenges for visualization research. Visualization and interaction techniques are thus often combined with computational analysis. In this survey, we study existing methods for visualization and interactive visual analysis of multifaceted scientific data. Based on a thorough literature review, a categorization of approaches is proposed. We cover a wide range of fields and discuss to which degree the different challenges are matched with existing solutions for visualization and visual analysis. This leads to conclusions with respect to promising research directions, for instance, to pursue new solutions for multirun and multimodel data as well as techniques that support a multitude of facets.
Citations
••
TL;DR: This work provides guidance for data practitioners to navigate through a modular view of the recent advances in high-dimensional data visualization, inspiring the creation of new visualizations along the enriched visualization pipeline, and identifying future opportunities for visualization research.
Abstract: Massive simulations and arrays of sensing devices, in combination with increasing computing resources, have generated large, complex, high-dimensional datasets used to study phenomena across numerous fields of study. Visualization plays an important role in exploring such datasets. We provide a comprehensive survey of advances in high-dimensional data visualization that focuses on the past decade. We aim at providing guidance for data practitioners to navigate through a modular view of the recent advances, inspiring the creation of new visualizations along the enriched visualization pipeline, and identifying future opportunities for visualization research.
253 citations
••
TL;DR: This work introduces a framework for supervised segmentation based on multiple modality intensity, geometry, and asymmetry feature sets that interface the supervised learning capabilities of the random forest model with regularized probabilistic segmentation using the recently developed ANTsR package.
Abstract: Segmenting and quantifying gliomas from MRI is an important task for diagnosis, planning intervention, and for tracking tumor changes over time. However, this task is complicated by the lack of prior knowledge concerning tumor location, spatial extent, shape, possible displacement of normal tissue, and intensity signature. To accommodate such complications, we introduce a framework for supervised segmentation based on multiple modality intensity, geometry, and asymmetry feature sets. These features drive a supervised whole-brain and tumor segmentation approach based on random forest-derived probabilities. The asymmetry-related features (based on optimal symmetric multimodal templates) demonstrate excellent discriminative properties within this framework. We also gain performance by generating probability maps from random forest models and using these maps for a refining Markov random field regularized probabilistic segmentation. This strategy allows us to interface the supervised learning capabilities of the random forest model with regularized probabilistic segmentation using the recently developed ANTsR package--a comprehensive statistical and visualization interface between the popular Advanced Normalization Tools (ANTs) and the R statistical project. The reported algorithmic framework was the top-performing entry in the MICCAI 2013 Multimodal Brain Tumor Segmentation challenge. The challenge data were widely varying consisting of both high-grade and low-grade glioma tumor four-modality MRI from five different institutions. Average Dice overlap measures for the final algorithmic assessment were 0.87, 0.78, and 0.74 for "complete", "core", and "enhanced" tumor components, respectively.
245 citations
Cites background from "Visualization and Visual Analysis o..."
...Similarly, concomitant with the era of “big data” (specifically with respect to neuroimaging (VanHorn and Toga 2013)) are new visualization needs and challenges (Childs et al. 2013; Kehrer and Hauser 2013)....
[...]
01 Jan 2015
TL;DR: A comprehensive survey of advances in high-dimensional data visualization that focuses on the past decade is provided in this article, with guidance for data practitioners to navigate through a modular view of the recent advances, inspiring the creation of new visualizations along the enriched visualization pipeline, and identifying future opportunities for visualization research.
Abstract: Massive simulations and arrays of sensing devices, in combination with increasing computing resources, have generated large, complex, high-dimensional datasets used to study phenomena across numerous fields of study. Visualization plays an important role in exploring such datasets. We provide a comprehensive survey of advances in high-dimensional data visualization that focuses on the past decade. We aim at providing guidance for data practitioners to navigate through a modular view of the recent advances, inspiring the creation of new visualizations along the enriched visualization pipeline, and identifying future opportunities for visualization research.
220 citations
••
TL;DR: This work characterizes the input data space, projection techniques, and the quality of projections by several quantitative metrics, and samples these three spaces according to these metrics, aiming at good coverage with bounded effort.
Abstract: Dimensionality reduction methods, also known as projections, are frequently used in multidimensional data exploration in machine learning, data science, and information visualization. Tens of such techniques have been proposed, aiming to address a wide set of requirements, such as ability to show the high-dimensional data structure, distance or neighborhood preservation, computational scalability, stability to data noise and/or outliers, and practical ease of use. However, it is far from clear for practitioners how to choose the best technique for a given use context. We present a survey of a wide body of projection techniques that helps answering this question. For this, we characterize the input data space, projection techniques, and the quality of projections, by several quantitative metrics. We sample these three spaces according to these metrics, aiming at good coverage with bounded effort. We describe our measurements and outline observed dependencies of the measured variables. Based on these results, we draw several conclusions that help comparing projection techniques, explain their results for different types of data, and ultimately help practitioners when choosing a projection for a given context. Our methodology, datasets, projection implementations, metrics, visualizations, and results are publicly open, so interested stakeholders can examine and/or extend this benchmark.
188 citations
Cites background or methods from "Visualization and Visual Analysis o..."
...As such, high-dimensional visualization has become an important sub-field of Information Visualization (InfoVis) [1], [2], [3], [4]....
[...]
...[3] present a survey of methods for visualization of so-called “multi-faceted” scientific data....
[...]
••
TL;DR: This study attempts to employ visual analytics that combines the state-of-the-art mining and visualization techniques to tackle the problem of formulating solutions immediately and comparing them rapidly for billboard placements using large-scale GPS trajectory data.
Abstract: The problem of formulating solutions immediately and comparing them rapidly for billboard placements has plagued advertising planners for a long time, owing to the lack of efficient tools for in-depth analyses to make informed decisions. In this study, we attempt to employ visual analytics that combines the state-of-the-art mining and visualization techniques to tackle this problem using large-scale GPS trajectory data. In particular, we present SmartAdP, an interactive visual analytics system that deals with the two major challenges including finding good solutions in a huge solution space and comparing the solutions in a visual and intuitive manner. An interactive framework that integrates a novel visualization-driven data mining model enables advertising planners to effectively and efficiently formulate good candidate solutions. In addition, we propose a set of coupled visualizations: a solution view with metaphor-based glyphs to visualize the correlation between different solutions; a location view to display billboard locations in a compact manner; and a ranking view to present multi-typed rankings of the solutions. This system has been demonstrated using case studies with a real-world dataset and domain-expert interviews. Our approach can be adapted for other location selection problems such as selecting locations of retail stores or restaurants using trajectory data.
165 citations
References
•
[...]
TL;DR: The Self-Organising Map (SOM) algorithm was introduced by the author in 1981; its theory and many applications form one of the major approaches to the contemporary artificial neural networks field, and new technologies have already been based on it.
Abstract: The Self-Organising Map (SOM) algorithm was introduced by the author in 1981. Its theory and many applications form one of the major approaches to the contemporary artificial neural networks field, and new technologies have already been based on it. The most important practical applications are in exploratory data analysis, pattern recognition, speech analysis, robotics, industrial and medical diagnostics, instrumentation, and control, and literally hundreds of other tasks. In this monograph the mathematical preliminaries, background, basic ideas, and implications are expounded in a manner which is accessible without prior expert knowledge.
12,920 citations
••
TL;DR: Chapter 11 includes more case studies in other areas, ranging from manufacturing to marketing research, and a detailed comparison with other diagnostic tools, such as logistic regression and tree-based methods.
Abstract: Chapter 11 includes more case studies in other areas, ranging from manufacturing to marketing research. Chapter 12 concludes the book with some commentary about the scientific contributions of MTS. The Taguchi method for design of experiment has generated considerable controversy in the statistical community over the past few decades. The MTS/MTGS method seems to lead another source of discussions on the methodology it advocates (Montgomery 2003). As pointed out by Woodall et al. (2003), the MTS/MTGS methods are considered ad hoc in the sense that they have not been developed using any underlying statistical theory. Because the “normal” and “abnormal” groups form the basis of the theory, some sampling restrictions are fundamental to the applications. First, it is essential that the “normal” sample be uniform, unbiased, and/or complete so that a reliable measurement scale is obtained. Second, the selection of “abnormal” samples is crucial to the success of dimensionality reduction when OAs are used. For example, if each abnormal item is really unique in the medical example, then it is unclear how the statistical distance MD can be guaranteed to give a consistent diagnosis measure of severity on a continuous scale when the larger-the-better type S/N ratio is used. Multivariate diagnosis is not new to Technometrics readers and is now becoming increasingly more popular in statistical analysis and data mining for knowledge discovery. As a promising alternative that assumes no underlying data model, The Mahalanobis–Taguchi Strategy does not provide sufficient evidence of gains achieved by using the proposed method over existing tools. Readers may be very interested in a detailed comparison with other diagnostic tools, such as logistic regression and tree-based methods. Overall, although the idea of MTS/MTGS is intriguing, this book would be more valuable had it been written in a rigorous fashion as a technical reference. There is some lack of precision even in several mathematical notations. Perhaps a follow-up with additional theoretical justification and careful case studies would answer some of the lingering questions.
11,507 citations
••
TL;DR: A computer movie simulating urban growth in the Detroit region, published in Economic Geography in 1970.
Abstract: (1970). A Computer Movie Simulating Urban Growth in the Detroit Region. Economic Geography: Vol. 46, PROCEEDINGS International Geographical Union Commission on Quantitative Methods, pp. 234-240.
7,533 citations