An assessment of the state and historic development of evaluation practices reported in papers published at the IEEE Visualization conference found that evaluations assessing resulting images and algorithm performance are the most prevalent, and that studies reporting requirements analyses and domain-specific work practices are generally reported too informally.
Abstract:
We present an assessment of the state and historic development of evaluation practices as reported in papers published at the IEEE Visualization conference. Our goal is to reflect, on a meta-level, on evaluation in our community through a systematic understanding of the characteristics and goals of presented evaluations. For this purpose, we conducted a systematic review of ten years of evaluations in the published papers using and extending a coding scheme previously established by Lam et al. [2012]. The results of our review include an overview of the most common evaluation goals in the community, how they evolved over time, and how they contrast with or align with those of the IEEE Information Visualization conference. In particular, we found that evaluations specific to assessing resulting images and algorithm performance are the most prevalent (with consistently 80–90% of all papers since 1997). However, especially over the last six years, there has been a steady increase in evaluation methods that include participants, either by evaluating their performance and subjective feedback or by evaluating their work practices and their improved analysis and reasoning capabilities using visual tools. Up to 2010, this trend in the IEEE Visualization conference was much more pronounced than in the IEEE Information Visualization conference, which only showed an increasing percentage of evaluation through user performance and experience testing. Since 2011, however, papers at the IEEE Information Visualization conference have also shown such an increase in evaluations of work practices and of analysis and reasoning using visual tools. Further, we found that the studies reporting requirements analyses and domain-specific work practices are generally reported too informally, which hinders cross-comparison and lowers external validity.
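As a hedged illustration of how per-year shares such as the 80–90% figure above could be tallied from a coded corpus, the sketch below counts, for each year, the percentage of papers carrying a given evaluation-scenario code. This is not the authors' actual tooling; the data, function name, and the scenario labels used here are hypothetical placeholders loosely following the Lam et al. [2012] scenario names.

```python
from collections import defaultdict

# Hypothetical coded corpus: (year, set of evaluation-scenario codes for one paper).
coded_papers = [
    (1997, {"QRI", "AP"}),
    (1997, {"QRI"}),
    (1998, {"AP", "UP"}),
    (2012, {"VDAR", "UWP"}),
    (2012, {"QRI", "AP", "UE"}),
]

def scenario_share_per_year(papers, scenario):
    """Return {year: percentage of that year's papers coded with `scenario`}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for year, codes in papers:
        totals[year] += 1
        if scenario in codes:
            hits[year] += 1
    return {year: 100.0 * hits[year] / totals[year] for year in sorted(totals)}

print(scenario_share_per_year(coded_papers, "QRI"))
# e.g. {1997: 100.0, 1998: 0.0, 2012: 50.0}
```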
TL;DR: In this article, the authors present a textbook-style introduction to the required processes, intended for use in university teaching and for self-study by people working in the field of IT system development.
TL;DR: This work systematically studies the visual analytics and visualization literature to investigate how analysts interact with automatic dimensionality reduction (DR) techniques, and proposes a “human in the loop” process model that provides a general lens for the evaluation of visual interactive DR systems.
TL;DR: This organization is intended to serve as a framework to help researchers specify types of provenance and coordinate design knowledge across projects and can be used to guide the selection of evaluation methodology and the comparison of study outcomes in provenance research.
TL;DR: The framework is based on the author's own experience and a structured analysis of the visualization literature, and contains a data flow model that helps to abstractly describe visual parameter space analysis problems independent of their application domain.
TL;DR: A novel, nonparametric method for summarizing ensembles of 2D and 3D curves is presented, extending data depth, a method from descriptive statistics, to curves; the result generalizes traditional whisker plots or boxplots to multidimensional curves.
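To make the data-depth idea mentioned above concrete (this is not the cited paper's exact algorithm), the following minimal sketch computes a modified band depth for an ensemble of sampled 1D curves: a curve is deeper the larger the fraction of sample points at which it lies inside the pointwise band spanned by pairs of other curves, and the deepest curve plays the role of the boxplot "median". All data here are synthetic and for illustration only.

```python
import numpy as np
from itertools import combinations

def modified_band_depth(curves):
    """Modified band depth (j = 2) for an (n_curves, n_samples) array of curves.

    A curve's depth is the average, over all pairs of other curves, of the
    fraction of sample points at which it lies inside the pointwise [min, max]
    band spanned by that pair.
    """
    n = len(curves)
    depths = np.zeros(n)
    for i, f in enumerate(curves):
        proportions = []
        for a, b in combinations(range(n), 2):
            if i in (a, b):
                continue
            lo = np.minimum(curves[a], curves[b])
            hi = np.maximum(curves[a], curves[b])
            proportions.append(np.mean((lo <= f) & (f <= hi)))
        depths[i] = np.mean(proportions)
    return depths

# Toy ensemble of noisy sine curves; the deepest one acts as the "median" curve.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
ensemble = np.array([np.sin(2 * np.pi * x) + rng.normal(0, 0.1, x.size)
                     for _ in range(20)])
depths = modified_band_depth(ensemble)
print("deepest curve index:", int(np.argmax(depths)))
```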
TL;DR: The book covers: History; Conceptual Foundations; Uses and Kinds of Inference; The Logic of Content Analysis Designs; Unitizing; Sampling; Recording; Data Languages; Constructs for Inference; Analytical Techniques; The Use of Computers; Reliability; Validity; and A Practical Guide.
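Reliability is one of the topics listed above; in a coding study such as the review described in the abstract, it is typically checked by having two coders label the same papers and computing an agreement statistic. The sketch below uses Cohen's kappa (a simpler alternative to Krippendorff's alpha, not the measure the authors necessarily used); the coder data and scenario labels are made up for illustration.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders' nominal labels on the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c]
                   for c in set(labels_a) | set(labels_b)) / n ** 2
    return (observed - expected) / (1 - expected)

# Two coders assigning one evaluation-scenario code per paper (hypothetical data).
coder_1 = ["QRI", "AP", "AP", "UP", "VDAR", "QRI", "UE", "AP"]
coder_2 = ["QRI", "AP", "UP", "UP", "VDAR", "QRI", "UE", "QRI"]
print(round(cohens_kappa(coder_1, coder_2), 2))  # ≈ 0.69
```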
TL;DR: This discussion focuses on the design of the methodology section of a qualitative research study, which involves mining data from documents and artifacts and dealing with validity, reliability, and ethics.
TL;DR: In this paper, the authors present a methodology for the collection and reporting of qualitative data from documents, dealing with validity, reliability, and ethics issues in a qualitative research study, with a focus on qualitative case studies.
TL;DR: The Open Society and Its Enemies, as discussed by the authors, is regarded as one of Popper's most enduring books and contains insights and arguments that still demand to be read today.
TL;DR: The Sixth Edition of Designing the User Interface provides a comprehensive, authoritative, and up-to-date introduction to the dynamic field of human-computer interaction and user experience (UX) design.
Q1. What have the authors contributed in "A systematic review on the practice of evaluating visualization"?
The authors present an assessment of the state and historic development of evaluation practices as reported in papers published at the IEEE Visualization conference. For this purpose the authors conducted a systematic review of ten years of evaluations in the published papers using and extending a coding scheme previously established by Lam et al. [2012]. In particular, the authors found that evaluations specific to assessing resulting images and algorithm performance are the most prevalent (with consistently 80–90% of all papers since 1997). Further, the authors found that the studies reporting requirements analyses and domain-specific work practices are generally reported too informally, which hinders cross-comparison and lowers external validity.
Q2. What future work have the authors mentioned in the paper "A systematic review on the practice of evaluating visualization"?
It will also certainly be interesting to extend this analysis to papers published at other visualization venues such as the Eurographics Conference on Visualization (EuroVis), a venue that does not draw a dedicated distinction between ‘scientific’ and ‘information’ visualization, to see whether similar trends exist. Also, a separate analysis of the IEEE Conference on Visual Analytics Science and Technology (VAST) would be of value to see whether there is a stronger emphasis on UWP and VDAR processes as well as on collaborative visualization (CTV/CDA), as suggested by its name and agenda. Still, in a few years, comparing VAST to their data will give an even more complete picture of evaluation practices in the visualization community.
Q3. What is the importance of reporting on study protocols?
Reporting on study protocols: Especially for AP, UP, and UE studies it is important to follow established reporting protocols to facilitate reproducibility and comparability.
Q4. What is the common pitfall in UE evaluations?
One prevalent pitfall the authors observed, related to the issue of rigor, was considering positive subjective judgments from domain experts to be a sufficient form of evaluation.
Q5. What is the main argument for the use of domain expert feedback?
In addition, Tory and Möller [66] argue that domain expert feedback can be a viable complement to controlled studies, both for heuristic evaluation of usability as well as for understanding the support of high-level cognitive tasks.
Q6. What do the authors argue against regarding evaluation studies?
In particular, the authors argue against considering evaluation only as controlled quantitative studies with null hypothesis significance tests (NHST).
Q7. What evaluation methods are used to study UWP, VDAR, CTV, and CDA?
Evaluation methods used to study UWP, VDAR, CTV, and CDA are often qualitative in nature, such as interviews with domain experts, observations of work practices, or longitudinal case studies of newly proposed tools.
Q8. What are the APA’s general guidelines for reporting studies?
The APA (American Psychological Association) provides general guidelines for reporting studies and statistical test results [71].
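As a hedged illustration of the kind of result reporting such guidelines cover (this is not the APA text itself), the snippet below runs an independent-samples t-test with SciPy and prints the outcome in the commonly used "t(df) = …, p = …" form; the task-completion data and condition names are made up.

```python
import numpy as np
from scipy import stats

# Hypothetical task-completion times (seconds) for two interface conditions.
condition_a = np.array([41.2, 38.5, 44.0, 39.8, 42.3, 40.1, 43.7, 37.9])
condition_b = np.array([46.8, 45.1, 48.3, 44.9, 47.2, 46.0, 49.5, 45.6])

# Independent-samples t-test (equal variances assumed).
t_stat, p_value = stats.ttest_ind(condition_a, condition_b)
df = len(condition_a) + len(condition_b) - 2

# APA-style summary string such as "t(df) = X.XX, p = .XXX".
p_text = "p < .001" if p_value < 0.001 else f"p = {p_value:.3f}"
print(f"t({df}) = {t_stat:.2f}, {p_text}")
```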