scispace - formally typeset
Search or ask a question
Author

Bum Chul Kwon

Bio: Bum Chul Kwon is an academic researcher from IBM. The author has contributed to research in topics: Visual analytics & Visualization. The author has an hindex of 18, co-authored 49 publications receiving 1377 citations. Previous affiliations of Bum Chul Kwon include Purdue University & University of Konstanz.


Papers
More filters
Journal ArticleDOI
TL;DR: A knowledge generation model for visual analytics is proposed that ties together these diverse frameworks, yet retains previously developed models (e.g., KDD process) to describe individual segments of the overall visual analytic processes.
Abstract: Visual analytics enables us to analyze huge information spaces in order to support complex decision making and data exploration. Humans play a central role in generating knowledge from the snippets of evidence emerging from visual data analysis. Although prior research provides frameworks that generalize this process, their scope is often narrowly focused so they do not encompass different perspectives at different levels. This paper proposes a knowledge generation model for visual analytics that ties together these diverse frameworks, yet retains previously developed models (e.g., KDD process) to describe individual segments of the overall visual analytic processes. To test its utility, a real world visual analytics system is compared against the model, demonstrating that the knowledge generation process model provides a useful guideline when developing and evaluating such systems. The model is used to effectively compare different data analysis systems. Furthermore, the model provides a common language and description of visual analytic processes, which can be used for communication between researchers. At the end, our model reflects areas of research that future researchers can embark on.

340 citations

Journal ArticleDOI
TL;DR: This paper unpacks the uncertainties that propagate through visual analytics systems, illustrates how human's perceptual and cognitive biases influence the user's awareness of such uncertainties, and how this affects the users' trust building.
Abstract: Visual analytics supports humans in generating knowledge from large and often complex datasets. Evidence is collected, collated and cross-linked with our existing knowledge. In the process, a myriad of analytical and visualisation techniques are employed to generate a visual representation of the data. These often introduce their own uncertainties, in addition to the ones inherent in the data, and these propagated and compounded uncertainties can result in impaired decision making. The user's confidence or trust in the results depends on the extent of user's awareness of the underlying uncertainties generated on the system side. This paper unpacks the uncertainties that propagate through visual analytics systems, illustrates how human's perceptual and cognitive biases influence the user's awareness of such uncertainties, and how this affects the user's trust building. The knowledge generation model for visual analytics is used to provide a terminology and framework to discuss the consequences of these aspects in knowledge construction and though examples, machine uncertainty is compared to human trust measures with provenance. Furthermore, guidelines for the design of uncertainty-aware systems are presented that can aid the user in better decision making.

238 citations

Journal ArticleDOI
TL;DR: This study designs a visual analytics solution to increase interpretability and interactivity of RNNs via a joint effort of medical experts, artificial intelligence scientists, and visual analytics researchers, and demonstrates how it made substantial changes to the state-of-the-art RNN model called RETAIN in order to make use of temporal information and increase interactivity.
Abstract: We have recently seen many successful applications of recurrent neural networks (RNNs) on electronic medical records (EMRs), which contain histories of patients' diagnoses, medications, and other various events, in order to predict the current and future states of patients. Despite the strong performance of RNNs, it is often challenging for users to understand why the model makes a particular prediction. Such black-box nature of RNNs can impede its wide adoption in clinical practice. Furthermore, we have no established methods to interactively leverage users' domain expertise and prior knowledge as inputs for steering the model. Therefore, our design study aims to provide a visual analytics solution to increase interpretability and interactivity of RNNs via a joint effort of medical experts, artificial intelligence scientists, and visual analytics researchers. Following the iterative design process between the experts, we design, implement, and evaluate a visual analytics tool called RetainVis, which couples a newly improved, interpretable, and interactive RNN-based model called RetainEX and visualizations for users' exploration of EMR data in the context of prediction tasks. Our study shows the effective use of RetainVis for gaining insights into how individual medical codes contribute to making risk predictions, using EMRs of patients with heart failure and cataract symptoms. Our study also demonstrates how we made substantial changes to the state-of-the-art RNN model called RETAIN in order to make use of temporal information and increase interactivity. This study will provide a useful guideline for researchers that aim to design an interpretable and interactive visual analytics tool for RNNs.

185 citations

Journal ArticleDOI
TL;DR: This work systematically developed a visualization literacy assessment test (VLAT), especially for non-expert users in data visualization, by following the established procedure of test development in Psychological and Educational Measurement.
Abstract: The Information Visualization community has begun to pay attention to visualization literacy; however, researchers still lack instruments for measuring the visualization literacy of users. In order to address this gap, we systematically developed a visualization literacy assessment test (VLAT), especially for non-expert users in data visualization, by following the established procedure of test development in Psychological and Educational Measurement: (1) Test Blueprint Construction, (2) Test Item Generation, (3) Content Validity Evaluation, (4) Test Tryout and Item Analysis, (5) Test Item Selection, and (6) Reliability Evaluation. The VLAT consists of 12 data visualizations and 53 multiple-choice test items that cover eight data visualization tasks. The test items in the VLAT were evaluated with respect to their essentialness by five domain experts in Information Visualization and Visual Analytics (average content validity ratio = 0.66). The VLAT was also tried out on a sample of 191 test takers and showed high reliability (reliability coefficient omega = 0.76). In addition, we demonstrated the relationship between users' visualization literacy and aptitude for learning an unfamiliar visualization and showed that they had a fairly high positive relationship (correlation coefficient = 0.64). Finally, we discuss evidence for the validity of the VLAT and potential research areas that are related to the instrument.

124 citations

Journal ArticleDOI
TL;DR: Clustervision is a visual analytics tool that helps ensure data scientists find the right clustering among the large amount of techniques and parameters available and empowers users to choose an effective representation of their complex data.
Abstract: Clustering, the process of grouping together similar items into distinct partitions, is a common type of unsupervised machine learning that can be useful for summarizing and aggregating complex multi-dimensional data. However, data can be clustered in many ways, and there exist a large body of algorithms designed to reveal different patterns. While having access to a wide variety of algorithms is helpful, in practice, it is quite difficult for data scientists to choose and parameterize algorithms to get the clustering results relevant for their dataset and analytical tasks. To alleviate this problem, we built Clustervision , a visual analytics tool that helps ensure data scientists find the right clustering among the large amount of techniques and parameters available. Our system clusters data using a variety of clustering techniques and parameters and then ranks clustering results utilizing five quality metrics. In addition, users can guide the system to produce more relevant results by providing task-relevant constraints on the data. Our visual user interface allows users to find high quality clustering results, explore the clusters using several coordinated visualization techniques, and select the cluster result that best suits their task. We demonstrate this novel approach using a case study with a team of researchers in the medical domain and showcase that our system empowers users to choose an effective representation of their complex data.

120 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: Reading a book as this basics of qualitative research grounded theory procedures and techniques and other references can enrich your life quality.

13,415 citations

Christopher M. Bishop1
01 Jan 2006
TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: In this large, community-based sample, increased body-mass index was associated with an increased risk of heart failure and strategies to promote optimal body weight may reduce the population burden ofheart failure.

1,388 citations