Journal Article

ConVis: A Visual Text Analytic System for Exploring Blog Conversations

01 Jun 2014 - Vol. 33, Iss: 3, pp 221-230
TL;DR: A visual text analytic system that tightly integrates interactive visualization with novel text mining and summarization techniques to fulfill information needs of users in exploring conversations is presented.
Abstract: Today it is quite common for people to exchange hundreds of comments in online conversations, e.g., blogs. Often, it can be very difficult to analyze and gain insights from such long conversations. To address this problem, we present a visual text analytic system that tightly integrates interactive visualization with novel text mining and summarization techniques to fulfill information needs of users in exploring conversations. First, we perform a user requirement analysis for the domain of blog conversations to derive a set of design principles. Following these principles, we present an interface that visualizes a combination of various metadata and textual analysis results, helping the user interactively explore the blog conversations. We conclude with an informal user evaluation, which provides anecdotal evidence about the effectiveness of our system and directions for further design.
Citations
Proceedings Article
01 Oct 2017
TL;DR: This paper presents a visual analytics method for understanding and comparing RNN models for NLP tasks, and proposes a glyph-based sequence visualization based on aggregate information to analyze the behavior of an RNN’s hidden state at the sentence level.
Abstract: Recurrent neural networks (RNNs) have been successfully applied to various natural language processing (NLP) tasks and achieved better results than conventional methods. However, the lack of understanding of the mechanisms behind their effectiveness limits further improvements on their architectures. In this paper, we present a visual analytics method for understanding and comparing RNN models for NLP tasks. We propose a technique to explain the function of individual hidden state units based on their expected response to input texts. We then co-cluster hidden state units and words based on the expected response and visualize co-clustering results as memory chips and word clouds to provide more structured knowledge on RNNs’ hidden states. We also propose a glyph-based sequence visualization based on aggregate information to analyze the behavior of an RNN’s hidden state at the sentence level. The usability and effectiveness of our method are demonstrated through case studies and reviews from domain experts.

170 citations


Cites background from "ConVis: A Visual Text Analytic Syst..."

  • ...Here the circular layout of word clouds is designed to reduce visual clutter of the edges between bipartite clusters [21]....

    [...]

  • ...PivotPath [11], ConVis [21] and NameClarifier [39] have inspired our layout design, which reduces visual clutter across different entities and provides an easy interface for interaction design....

    [...]

Journal Article
TL;DR: A taxonomy of visual analytics techniques is built, which includes three first-level categories: techniques before model building, techniques during model building, and techniques after model building.
Abstract: Visual analytics for machine learning has recently evolved as one of the most exciting areas in the field of visualization. To better identify which research topics are promising and to learn how to apply relevant techniques in visual analytics, we systematically review 259 papers published in the last ten years together with representative works before 2010. We build a taxonomy, which includes three first-level categories: techniques before model building, techniques during model building, and techniques after model building. Each category is further characterized by representative analysis tasks, and each task is exemplified by a set of recent influential works. We also discuss and highlight research challenges and promising future research opportunities useful for visual analytics researchers.

150 citations


Additional excerpts

  • ...[106] E....

    [...]

  • ...After Model Building Understanding Static Data Analysis Results (43) [4], [15], [22], [27], [29], [37], [43], [47], [57], [66], [72], [75], [81], [83], [85], [87], [89], [92], [100], [105], [106], [107], [112], [113], [114], [117], [121], [126], [128], [129], [146], [160], [161], [167], [200], [203], [206], [225], [248], [251], [277], [294], [296]...

    [...]

Proceedings Article
18 Apr 2015
TL;DR: Analysis of patterns of communication within an online diabetes community, TuDiabetes, suggests that members of TuDiabetes often construct shared meaning through deep discussions, back and forth negotiation of perspectives, and resolution of conflicts in opinions.
Abstract: Online health communities collect vast amounts of information and opinions regarding health and wellness management. However, these opinions are usually stored within lengthy and loosely structured discussion threads; synthesizing information in these threads can be challenging. In this mixed-methods study, grounded in the theoretical perspective of collective sensemaking, we examined patterns of communication within an online diabetes community, TuDiabetes. The results of the study suggest that members of TuDiabetes often construct shared meaning through deep discussions, back and forth negotiation of perspectives, and resolution of conflicts in opinions. However, unlike participants of other sensemaking communities, members of TuDiabetes often value multiplicity of opinions rather than consensus. We use study results to draw implications for the design of computing platforms for facilitating collective sensemaking that promote construction of shared knowledge yet embrace diversity of opinions.

94 citations


Cites methods from "ConVis: A Visual Text Analytic Syst..."

  • ...Taking these trends further, Hoque et al. used a combination of NLP and data visualization techniques to visualize salient properties of discussion threads, such as topics discussed and sentiment towards these topics [15]....

    [...]

Journal Article
TL;DR: This contribution describes the background of sentiment analysis, introduces a categorization for sentiment visualization techniques that includes 7 groups with 35 categories in total, and discusses 132 techniques from peer‐reviewed publications together with an interactive web‐based survey browser.
Abstract: Visualization of sentiments and opinions extracted from or annotated in texts has become a prominent topic of research over the last decade. From basic pie and bar charts used to illustrate customer reviews to extensive visual analytics systems involving novel representations, sentiment visualization techniques have evolved to deal with complex multidimensional data sets, including temporal, relational, and geospatial aspects. This contribution presents a survey of sentiment visualization techniques based on a detailed categorization. We describe the background of sentiment analysis, introduce a categorization for sentiment visualization techniques that includes 7 groups with 35 categories in total, and discuss 132 techniques from peer-reviewed publications together with an interactive web-based survey browser. Finally, we discuss insights and opportunities for further research in sentiment visualization. We expect this survey to be useful for visualization researchers whose interests include sentiment or other aspects of text data as well as researchers and practitioners from other disciplines in search of efficient visualization techniques applicable to their tasks and data.

89 citations

Journal Article
TL;DR: This research comprehensively analyzed 263 visualization papers and 4,346 mining papers published between 1992 and 2017 in two fields (visualization and text mining), derived around 300 concepts, and built a taxonomy for each type of concept.
Abstract: Visual text analytics has recently emerged as one of the most prominent topics in both academic research and the commercial world. To provide an overview of the relevant techniques and analysis tasks, as well as the relationships between them, we comprehensively analyzed 263 visualization papers and 4,346 mining papers published between 1992 and 2017 in two fields: visualization and text mining. From the analysis, we derived around 300 concepts (visualization techniques, mining techniques, and analysis tasks) and built a taxonomy for each type of concept. The co-occurrence relationships between the concepts were also extracted. Our research can be used as a stepping-stone for other researchers to 1) understand a common set of concepts used in this research topic; 2) facilitate the exploration of the relationships between visualization techniques, mining techniques, and analysis tasks; 3) understand the current practice in developing visual text analytics tools; 4) seek potential research opportunities by narrowing the gulf between visualization and mining techniques based on the analysis tasks; and 5) analyze other interdisciplinary research areas in a similar way. We have also contributed a web-based visualization tool for analyzing and understanding research trends and opportunities in visual text analytics.

88 citations


Additional excerpts

  • ...radial visualizations (15) [73], [100], [102], [108], [126], [144], [155], [169], [174], [176], [182], [194], [247], [248], [249]...

    [...]

References
Journal Article
TL;DR: The Semantic Orientation CALculator (SO-CAL) uses dictionaries of words annotated with their semantic orientation (polarity and strength), and incorporates intensification and negation, and is applied to the polarity classification task.
Abstract: We present a lexicon-based approach to extracting sentiment from text. The Semantic Orientation CALculator (SO-CAL) uses dictionaries of words annotated with their semantic orientation (polarity and strength), and incorporates intensification and negation. SO-CAL is applied to the polarity classification task, the process of assigning a positive or negative label to a text that captures the text's opinion towards its main subject matter. We show that SO-CAL's performance is consistent across domains and in completely unseen data. Additionally, we describe the process of dictionary creation, and our use of Mechanical Turk to check dictionaries for consistency and reliability.

2,798 citations

Journal Article
TL;DR: Since the lexical chains are computable, and exist in non-domain-specific text, they provide a valuable indicator of text structure, and provide a semantic context for interpreting words, concepts, and sentences.
Abstract: In text, lexical cohesion is the result of chains of related words that contribute to the continuity of lexical meaning. These lexical chains are a direct result of units of text being "about the same thing," and finding text structure involves finding units of text that are about the same thing. Hence, computing the chains is useful, since they will have a correspondence to the structure of the text. Determining the structure of text is an essential step in determining the deep meaning of the text. In this paper, a thesaurus is used as the major knowledge base for computing lexical chains. Correspondences between lexical chains and structural elements are shown to exist. Since the lexical chains are computable, and exist in non-domain-specific text, they provide a valuable indicator of text structure. The lexical chains also provide a semantic context for interpreting words, concepts, and sentences.

1,026 citations

Journal Article
TL;DR: A nested model for the visualization design and validation with four layers: characterize the task and data in the vocabulary of the problem domain, abstract into operations and data types, design visual encoding and interaction techniques, and create algorithms to execute techniques efficiently.
Abstract: We present a nested model for the visualization design and validation with four layers: characterize the task and data in the vocabulary of the problem domain, abstract into operations and data types, design visual encoding and interaction techniques, and create algorithms to execute techniques efficiently. The output from a level above is input to the level below, bringing attention to the design challenge that an upstream error inevitably cascades to all downstream levels. This model provides prescriptive guidance for determining appropriate evaluation approaches by identifying threats to validity unique to each level. We also provide three recommendations motivated by this model: authors should distinguish between these levels when claiming contributions at more than one of them, authors should explicitly state upstream assumptions at levels above the focus of a paper, and visualization venues should accept more papers on domain characterization.

790 citations

Journal Article
TL;DR: A theoretical model and an associated unobtrusive method are proposed for researching the relationship between online spaces and the behavior they host; an analysis of over 2.65 million postings to 600 Usenet newsgroups supports the assertion that individual strategies for coping with information overload have an observable impact on large-scale online group discourse.
Abstract: Online spaces that enable shared public interpersonal communications are of significant social, organizational, and economic importance. In this paper, a theoretical model and associated unobtrusive method are proposed for researching the relationship between online spaces and the behavior they host. The model focuses on the collective impact that individual information-overload coping strategies have on the dynamics of open, interactive public online group discourse. Empirical research was undertaken to assess the validity of both the method and the model, based on the analysis of over 2.65 million postings to 600 Usenet newsgroups over a 6-month period. Our findings support the assertion that individual strategies for coping with "information overload" have an observable impact on large-scale online group discourse. Evidence was found for the hypotheses that: (1) users are more likely to respond to simpler messages in overloaded mass interaction; (2) users are more likely to end active participation as the overloading of mass interaction increases; and (3) users are more likely to generate simpler responses as the overloading of mass interaction grows. The theoretical model outlined offers insight into aspects of computer-mediated communication tool usability and technology design, and provides a road map for future empirical research.

684 citations

Journal Article
TL;DR: The aim is to provide a succinct summary of the state-of-the-art interface schemes, to illuminate both successful and unsuccessful interface strategies, and to identify potentially fruitful areas for further work.
Abstract: There are many interface schemes that allow users to work at, and move between, focused and contextual views of a dataset. We review and categorize these schemes according to the interface mechanisms used to separate and blend views. The four approaches are overview+detail, which uses a spatial separation between focused and contextual views; zooming, which uses a temporal separation; focus+context, which minimizes the seam between views by displaying the focus within the context; and cue-based techniques which selectively highlight or suppress items within the information space. Critical features of these categories, and empirical evidence of their success, are discussed. The aim is to provide a succinct summary of the state-of-the-art, to illuminate both successful and unsuccessful interface strategies, and to identify potentially fruitful areas for further work.

666 citations