Author

Jishang Wei

Bio: Jishang Wei is an academic researcher from University of California, Davis. The author has contributed to research in topics: Cluster analysis & Data visualization. The author has an h-index of 7, co-authored 7 publications receiving 229 citations.

Papers
Proceedings ArticleDOI
14 Oct 2012
TL;DR: Introduces a visual analytics system for exploring the user behavior patterns reflected by distinct clickstream clusters, which presents an overview of the clusters using a Self-Organizing Map with Markov chain models.
Abstract: Web clickstream data are routinely collected to study how users browse the web or use a service. It is clear that the ability to recognize and summarize user behavior patterns from such data is valuable to e-commerce companies. In this paper, we introduce a visual analytics system to explore the various user behavior patterns reflected by distinct clickstream clusters. In a practical analysis scenario, the system first presents an overview of clickstream clusters using a Self-Organizing Map with Markov chain models. Then the analyst can interactively explore the clusters through an intuitive user interface. He can either obtain summarization of a selected group of data or further refine the clustering result. We evaluated our system using two different datasets from eBay. Analysts who were working on the same data have confirmed the system's effectiveness in extracting user behavior patterns from complex datasets and enhancing their ability to reason.

97 citations

Proceedings ArticleDOI
02 Mar 2010
TL;DR: This work employs an automatic clustering method to generate field-line templates for the user to locate subfields of interest and leverages the user's knowledge about the flow field through intuitive user interaction, resulting in a promising alternative to existing flow visualization solutions.
Abstract: In flow visualization, field lines are often used to convey both global and local structure and movement of the flow. One challenge is to find and classify the representative field lines. Most existing solutions follow an automatic approach that generates field lines characterizing the flow and arranges these lines into a single picture. In our work, we advocate a user-centric approach to exploring 3D vector fields. Our method allows the user to sketch 2D curves for pattern matching in 2D and field lines clustering in 3D. Specifically, a 3D field line whose view-dependent 2D projection is most similar to the user drawing will be identified and utilized to extract all similar 3D field lines. Furthermore, we employ an automatic clustering method to generate field-line templates for the user to locate subfields of interest. This semi-automatic process leverages the user's knowledge about the flow field through intuitive user interaction, resulting in a promising alternative to existing flow visualization solutions. With our sketch-based interface, the user can effectively dissect the flow field and make more structured visualization for analysis or presentation.

47 citations

Proceedings ArticleDOI
13 Dec 2012
TL;DR: This work introduces a two-tier visual analysis system, TrailExplorer2, to discover knowledge from massive log data by visualizing a sorted list of web sessions' temporal patterns and enables data exploration at different levels of details.
Abstract: Tracking and recording users' browsing behaviors on the web down to individual mouse clicks can create massive web session logs. While such web session data contains valuable information about user behaviors, the ever-increasing data size has placed a big challenge to analyzing and visualizing the data. An efficient data analysis framework requires both powerful computational analysis and interactive visualization. Following the visual analytics mantra “Analyze first, show the important, zoom, filter and analyze further, details on demand”, we introduce a two-tier visual analysis system, TrailExplorer2, to discover knowledge from massive log data. The system supports a visual analysis process iterating between two steps: querying web sessions and visually analyzing the retrieved data. The query happens at the lower tier where terabytes of web session data are processed in a cluster. At the upper tier, the extracted web sessions with much smaller scale are visualized on a personal computer for interactive exploration. Our system visualizes a sorted list of web sessions' temporal patterns and enables data exploration at different levels of details. The query-visualization-exploration process iterates until a satisfactory conclusion is achieved. We present two case studies of TrailExplorer2 using real world session data from eBay to demonstrate the system's effectiveness.

32 citations

Proceedings ArticleDOI
01 Mar 2011
TL;DR: A dual-space method is introduced to analyze particle data in combustion simulations, starting by clustering the time series curves in the phase space of the data, and then visualizing the corresponding trajectories of each cluster in the physical space.
Abstract: Current simulations of turbulent flames are instrumented with particles to capture the dynamic behavior of combustion in next-generation engines. Categorizing the set of many millions of particles, each of which is featured with a history of its movement positions and changing thermo-chemical states, helps understand the turbulence mechanism. We introduce a dual-space method to analyze such data, starting by clustering the time series curves in the phase space of the data, and then visualizing the corresponding trajectories of each cluster in the physical space. To cluster time series curves, we adopt a model-based clustering technique in a two-stage scheme. In the first stage, the characteristics of shape and relative position are particularly concerned in classifying the time series curves, and in the second stage, within each group of curves, clustering is further conducted based on how the curves change over time. In our work, we perform the model-based clustering in a semi-supervised manner. Users' domain knowledge is integrated through intuitive interaction tools to steer the clustering process. Our dual-space method has been used to analyze particle data in combustion simulations and can also be applied to other scientific simulations involving particle trajectory analysis work.

21 citations

Proceedings ArticleDOI
13 Dec 2012
TL;DR: This paper presents the first full group tracking framework, which tracks groups (clusters) of features in time-varying 3D fluid flow simulations and uses a clustering algorithm to group interacting features.
Abstract: The ability to visually extract and track features is appealing to scientists in many simulations including flow fields. However, as the resolution of the simulation becomes higher, the number of features to track increases and so does the cost in large-scale simulations. Since many of these features act in groups, it seems more cost-effective to follow groups of features rather than individual ones. Very little work has been done for tracking groups of features. In this paper, we present the first full group tracking framework in which we track groups (clusters) of features in time-varying 3D fluid flow simulations. Our framework uses a clustering algorithm to group interacting features. We demonstrate the use of our framework on data output from a 3D simulation of wall bounded turbulent flow.

17 citations


Cited by
01 Jan 2002

9,314 citations

Book ChapterDOI
16 Jul 2014
TL;DR: This paper aims to analyze some of the different analytics methods and tools which can be applied to big data, as well as the opportunities provided by the application of big data analytics in various decision domains.
Abstract: In the information era, enormous amounts of data have become available on hand to decision makers. Big data refers to datasets that are not only big, but also high in variety and velocity, which makes them difficult to handle using traditional tools and techniques. Due to the rapid growth of such data, solutions need to be studied and provided in order to handle and extract value and knowledge from these datasets. Furthermore, decision makers need to be able to gain valuable insights from such varied and rapidly changing data, ranging from daily transactions to customer interactions and social network data. Such value can be provided using big data analytics, which is the application of advanced analytics techniques on big data. This paper aims to analyze some of the different analytics methods and tools which can be applied to big data, as well as the opportunities provided by the application of big data analytics in various decision domains. Keywords: big data, data mining, analytics, decision making.

299 citations

Proceedings ArticleDOI
07 May 2016
TL;DR: An unsupervised system to capture dominating user behaviors from clickstream data, and visualize the detected behaviors in an intuitive manner, which effectively identifies previously unknown behaviors, e.g., dormant users, hostile chatters.
Abstract: Online services are increasingly dependent on user participation. Whether it's online social networks or crowdsourcing services, understanding user behavior is important yet challenging. In this paper, we build an unsupervised system to capture dominating user behaviors from clickstream data (traces of users' click events), and visualize the detected behaviors in an intuitive manner. Our system identifies "clusters" of similar users by partitioning a similarity graph (nodes are users; edges are weighted by clickstream similarity). The partitioning process leverages iterative feature pruning to capture the natural hierarchy within user clusters and produce intuitive features for visualizing and understanding captured user behaviors. For evaluation, we present case studies on two large-scale clickstream traces (142 million events) from real social networks. Our system effectively identifies previously unknown behaviors, e.g., dormant users, hostile chatters. Also, our user study shows people can easily interpret identified behaviors using our visualization tool.

211 citations

Journal ArticleDOI
TL;DR: A taxonomy of visual analytics techniques is built, which includes three first-level categories: techniques before model building, techniques during model building, and techniques after model building.
Abstract: Visual analytics for machine learning has recently evolved as one of the most exciting areas in the field of visualization. To better identify which research topics are promising and to learn how to apply relevant techniques in visual analytics, we systematically review 259 papers published in the last ten years together with representative works before 2010. We build a taxonomy, which includes three first-level categories: techniques before model building, techniques during model building, and techniques after model building. Each category is further characterized by representative analysis tasks, and each task is exemplified by a set of recent influential works. We also discuss and highlight research challenges and promising potential future research opportunities useful for visual analytics researchers.

150 citations

Proceedings ArticleDOI
18 Apr 2015
TL;DR: Grounded in the real-world characteristics of web clickstream data, MatrixWave is designed, a matrix-based representation that allows analysts to get an overview of differences in traffic patterns and interactively explore paths through the website.
Abstract: Event sequence data analysis is common in many domains, including web and software development, transportation, and medical care. Few have investigated visualization techniques for comparative analysis of multiple event sequence datasets. Grounded in the real-world characteristics of web clickstream data, we explore visualization techniques for comparison of two clickstream datasets collected on different days or from users with different demographics. Through iterative design with web analysts, we designed MatrixWave, a matrix-based representation that allows analysts to get an overview of differences in traffic patterns and interactively explore paths through the website. We use color to encode differences and size to offer context over traffic volume. User feedback on MatrixWave is positive. Our study participants made fewer errors with MatrixWave and preferred it over the more familiar Sankey diagram.

128 citations