DataSite: Proactive visual data exploration with computation of insight-based recommendations

doi:10.1177/1473871618806555

Journal Article•DOI•

Zhe Cui¹, Sriram Karthik Badam¹, M. Adil Yalçin, Niklas Elmqvist¹•Institutions (1)

University of Maryland, College Park¹

01 Apr 2019-Information Visualization (SAGE Publications, Sage UK: London, England)-Vol. 18, Iss: 2, pp 251-267

TL;DR: In this paper, the authors propose that effective data analysis ideally requires the analyst to have high expertise as well as high knowledge of the data, even with such familiarity, manually pursuing all potential hypotheses and explor...

Abstract: Effective data analysis ideally requires the analyst to have high expertise as well as high knowledge of the data. Even with such familiarity, manually pursuing all potential hypotheses and explori...

Citations

Journal Article•DOI•

Augmenting Visualizations with Interactive Data Facts to Facilitate Interpretation and Communication

Arjun Srinivasan¹, Steven M. Drucker², Alex Endert¹, John Stasko¹•Institutions (2)

Georgia Institute of Technology¹, Microsoft²

01 Jan 2019-IEEE Transactions on Visualization and Computer Graphics

TL;DR: Voder is presented, a system that lets users interact with automatically-generated data facts to explore both alternative visualizations to convey a data fact as well as a set of embellishments to highlight a fact within a visualization.

Abstract: Recently, an increasing number of visualization systems have begun to incorporate natural language generation (NLG) capabilities into their interfaces. NLG-based visualization systems typically leverage a suite of statistical functions to automatically extract key facts about the underlying data and surface them as natural language sentences alongside visualizations. With current systems, users are typically required to read the system-generated sentences and mentally map them back to the accompanying visualization. However, depending on the features of the visualization (e.g., visualization type, data density) and the complexity of the data fact, mentally mapping facts to visualizations can be a challenging task. Furthermore, more than one visualization could be used to illustrate a single data fact. Unfortunately, current tools provide little or no support for users to explore such alternatives. In this paper, we explore how system-generated data facts can be treated as interactive widgets to help users interpret visualizations and communicate their findings. We present Voder, a system that lets users interact with automatically-generated data facts to explore both alternative visualizations to convey a data fact as well as a set of embellishments to highlight a fact within a visualization. Leveraging data facts as interactive widgets, Voder also facilitates data fact-based visualization search. To assess Voder's design and features, we conducted a preliminary user study with 12 participants having varying levels of experience with visualization tools. Participant feedback suggested that interactive data facts aided them in interpreting visualizations. Participants also stated that the suggestions surfaced through the facts helped them explore alternative visualizations and embellishments to communicate individual data facts.

132 citations

Journal Article•DOI•

DataShot: Automatic Generation of Fact Sheets from Tabular Data

Yun Wang¹, Zhida Sun², Haidong Zhang¹, Weiwei Cui¹, Ke Xu², Xiaojuan Ma², Dongmei Zhang¹•Institutions (2)

Microsoft¹, Hong Kong University of Science and Technology²

01 Jan 2020-IEEE Transactions on Visualization and Computer Graphics

TL;DR: This work presents DataShot, the first automated system that creates fact sheets automatically from tabular data, and proposes a fact sheet generation pipeline, consisting of fact extraction, fact composition, and presentation synthesis, for the auto-generation workflow.

Abstract: Fact sheets with vivid graphical design and intriguing statistical insights are prevalent for presenting raw data. They help audiences understand data-related facts effectively and make a deep impression. However, designing a fact sheet requires both data and design expertise and is a laborious and time-consuming process. One needs to not only understand the data in depth but also produce intricate graphical representations. To assist in the design process, we present DataShot which, to the best of our knowledge, is the first automated system that creates fact sheets automatically from tabular data. First, we conduct a qualitative analysis of 245 infographic examples to explore general infographic design space at both the sheet and element levels. We identify common infographic structures, sheet layouts, fact types, and visualization styles during the study. Based on these findings, we propose a fact sheet generation pipeline, consisting of fact extraction, fact composition, and presentation synthesis, for the auto-generation workflow. To validate our system, we present use cases with three real-world datasets. We conduct an in-lab user study to understand the usage of our system. Our evaluation results show that DataShot can efficiently generate satisfactory fact sheets to support further customization and data presentation.

110 citations

Journal Article•DOI•

Text-to-Viz: Automatic Generation of Infographics from Proportion-Related Natural Language Statements

Weiwei Cui¹, Xiaoyu Zhang², Yun Wang¹, He Huang¹, Bei Chen¹, Lei Fang¹, Haidong Zhang¹, Jian-Guan Lou¹, Dongmei Zhang¹•Institutions (2)

Microsoft¹, University of California, Davis²

01 Jan 2020-IEEE Transactions on Visualization and Computer Graphics

TL;DR: This paper presents a proof-of-concept system, informed by a preliminary study of the infographic design space, that automatically converts statements about simple proportion-related statistics into a set of infographics with pre-designed styles.

Abstract: Combining data content with visual embellishments, infographics can effectively deliver messages in an engaging and memorable manner. Various authoring tools have been proposed to facilitate the creation of infographics. However, creating a professional infographic with these authoring tools is still not an easy task, requiring much time and design expertise. Therefore, these tools are generally not attractive to casual users, who are either unwilling to take time to learn the tools or lacking in proper design expertise to create a professional infographic. In this paper, we explore an alternative approach: to automatically generate infographics from natural language statements. We first conducted a preliminary study to explore the design space of infographics. Based on the preliminary study, we built a proof-of-concept system that automatically converts statements about simple proportion-related statistics to a set of infographics with pre-designed styles. Finally, we demonstrated the usability and usefulness of the system through sample results, exhibits, and expert reviews.

79 citations

Journal Article•DOI•

Calliope: Automatic Visual Data Story Generation from a Spreadsheet

Danqing Shi¹, Xinyue Xu¹, Fuling Sun¹, Yang Shi¹, Nan Cao¹•Institutions (1)

Tongji University¹

28 Jan 2021-IEEE Transactions on Visualization and Computer Graphics

TL;DR: This paper introduces a novel visual data story generating system, Calliope, which creates visual data stories from an input spreadsheet through an automatic process and facilitates the easy revision of the generated story based on an online story editor.

Abstract: Visual data stories, shown in the form of narrative visualizations such as a poster or a data video, are frequently used in data-oriented storytelling to facilitate the understanding and memorization of the story content. Although useful, technical barriers, such as data analysis, visualization, and scripting, make the generation of a visual data story difficult. Existing authoring tools rely on users' skills and experiences, which are usually inefficient and still difficult. In this paper, we introduce a novel visual data story generating system, Calliope, which creates visual data stories from an input spreadsheet through an automatic process and facilitates the easy revision of the generated story based on an online story editor. Particularly, Calliope incorporates a new logic-oriented Monte Carlo tree search algorithm that explores the data space given by the input spreadsheet to progressively generate story pieces (i.e., data facts) and organize them in a logical order. The importance of data facts is measured based on information theory, and each data fact is visualized in a chart and captioned by an automatically generated description. We evaluate the proposed technique through three example stories, two controlled experiments, and a series of interviews with 10 domain experts. Our evaluation shows that Calliope is beneficial to efficient visual data story generation.

77 citations

Proceedings Article•DOI•

QuickInsights: Quick and Automatic Discovery of Insights from Multi-Dimensional Data

Rui Ding¹, Shi Han¹, Yong Xu¹, Haidong Zhang¹, Dongmei Zhang¹•Institutions (1)

Microsoft¹

25 Jun 2019

TL;DR: This work proposes a unified formulation of interesting patterns, called insights, and designs a systematic mining framework to discover high-quality insights efficiently, and demonstrates the effectiveness and efficiency of QuickInsights through evaluation on 447 real datasets as well as user studies on both expert users and non-expert users.

Abstract: Discovering interesting data patterns is a common and important analytical need in data, with increasing user demand for automated discovery abilities. However, automatically discovering interesting patterns from multi-dimensional data remains challenging. Existing techniques focus on mining individual types of patterns. There is a lack of unified formulation for different pattern types, as well as general mining frameworks to derive them effectively and efficiently. We present a novel technique QuickInsights, which quickly and automatically discovers interesting patterns from multi-dimensional data. QuickInsights proposes a unified formulation of interesting patterns, called insights, and designs a systematic mining framework to discover high-quality insights efficiently. We demonstrate the effectiveness and efficiency of QuickInsights through our evaluation on 447 real datasets as well as user studies on both expert users and non-expert users. QuickInsights is released in Microsoft Power BI.

63 citations

References

Journal Article•DOI•

Random effects structure for confirmatory hypothesis testing: Keep it maximal

Dale J. Barr¹, Roger Levy², Christoph Scheepers¹, Harry Tily•Institutions (2)

University of Glasgow¹, University of California, San Diego²

01 Apr 2013-Journal of Memory and Language

TL;DR: It is argued that researchers using LMEMs for confirmatory hypothesis testing should minimally adhere to the standards that have been in place for many decades, and it is shown that LMEMs generalize best when they include the maximal random effects structure justified by the design.

6,878 citations

Journal Article•DOI•

Evaluating collaborative filtering recommender systems

Jonathan L. Herlocker¹, Joseph A. Konstan², Loren Terveen², John Riedl²•Institutions (2)

Oregon State University¹, University of Minnesota²

01 Jan 2004-ACM Transactions on Information Systems

TL;DR: The key decisions in evaluating collaborative filtering recommender systems are reviewed: the user tasks being evaluated, the types of analysis and datasets being used, the ways in which prediction quality is measured, the evaluation of prediction attributes other than quality, and the user-based evaluation of the system as a whole.

Abstract: Recommender systems have been evaluated in many, often incomparable, ways. In this article, we review the key decisions in evaluating collaborative filtering recommender systems: the user tasks being evaluated, the types of analysis and datasets being used, the ways in which prediction quality is measured, the evaluation of prediction attributes other than quality, and the user-based evaluation of the system as a whole. In addition to reviewing the evaluation strategies used by prior researchers, we present empirical results from the analysis of various accuracy metrics on one content domain where all the tested metrics collapsed roughly into three equivalence classes. Metrics within each equivalency class were strongly correlated, while metrics from different equivalency classes were uncorrelated.

5,686 citations

Journal Article•DOI•

D³ Data-Driven Documents

Michael Bostock¹, Vadim Ogievetsky¹, Jeffrey Heer¹•Institutions (1)

Stanford University¹

01 Dec 2011-IEEE Transactions on Visualization and Computer Graphics

TL;DR: This work shows how representational transparency improves expressiveness and better integrates with developer tools than prior approaches, while offering comparable notational efficiency and retaining powerful declarative components.

Abstract: Data-Driven Documents (D3) is a novel representation-transparent approach to visualization for the web. Rather than hide the underlying scenegraph within a toolkit-specific abstraction, D3 enables direct inspection and manipulation of a native representation: the standard document object model (DOM). With D3, designers selectively bind input data to arbitrary document elements, applying dynamic transforms to both generate and modify content. We show how representational transparency improves expressiveness and better integrates with developer tools than prior approaches, while offering comparable notational efficiency and retaining powerful declarative components. Immediate evaluation of operators further simplifies debugging and allows iterative development. Additionally, we demonstrate how D3 transforms naturally enable animation and interaction with dramatic performance improvements over intermediate representations.

2,550 citations

Journal Article•DOI•

Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods

William S. Cleveland¹, Robert McGill¹•Institutions (1)

Bell Labs¹

01 Sep 1984-Journal of the American Statistical Association

TL;DR: The approach is based on graphical perception—the visual decoding of information encoded on graphs—and it includes both theory and experimentation to test the theory, providing a guideline for graph construction.

Abstract: The subject of graphical methods for data analysis and for data presentation needs a scientific foundation. In this article we take a few steps in the direction of establishing such a foundation. Our approach is based on graphical perception—the visual decoding of information encoded on graphs—and it includes both theory and experimentation to test the theory. The theory deals with a small but important piece of the whole process of graphical perception. The first part is an identification of a set of elementary perceptual tasks that are carried out when people extract quantitative information from graphs. The second part is an ordering of the tasks on the basis of how accurately people perform them. Elements of the theory are tested by experimentation in which subjects record their judgments of the quantitative information on graphs. The experiments validate these elements but also suggest that the set of elementary tasks should be expanded. The theory provides a guideline for graph construction...

1,545 citations

Journal Article•DOI•

Automating the design of graphical presentations of relational information

Jock D. Mackinlay¹•Institutions (1)

Stanford University¹

01 Apr 1986-ACM Transactions on Graphics

TL;DR: APT (A Presentation Tool) is an application-independent presentation tool that automatically designs effective graphical presentations (such as bar charts, scatter plots, and connected graphs) of relational information, based on the view that graphical presentations are sentences of graphical languages.

Abstract: The goal of the research described in this paper is to develop an application-independent presentation tool that automatically designs effective graphical presentations (such as bar charts, scatter plots, and connected graphs) of relational information. Two problems are raised by this goal: The codification of graphic design criteria in a form that can be used by the presentation tool, and the generation of a wide variety of designs so that the presentation tool can accommodate a wide variety of information. The approach described in this paper is based on the view that graphical presentations are sentences of graphical languages. The graphic design issues are codified as expressiveness and effectiveness criteria for graphical languages. Expressiveness criteria determine whether a graphical language can express the desired information. Effectiveness criteria determine whether a graphical language exploits the capabilities of the output medium and the human visual system. A wide variety of designs can be systematically generated by using a composition algebra that composes a small set of primitive graphical languages. Artificial intelligence techniques are used to implement a prototype presentation tool called APT (A Presentation Tool), which is based on the composition algebra and the graphic design criteria.

1,483 citations