Research

Visualization Surrogates for Ensemble Simulations

In the scientific community, the simulation of phenomena with a broad range of potential outcomes is a common practice. These simulations are designed to determine the parameters that generate results that are consistent with empirical observations. Running many simulations is expensive, however, because both computational time and storage for the output can be prohibitively large. Recent advancements in deep learning methods offer a new and innovative approach to parameter space exploration in scientific applications. Through the application of deep learning techniques, the exploration of parameter space can be framed as either a generative or regression problem. Our research group, the GRAVITY lab, is actively investigating two distinct categories of deep learning models for this purpose: image-based and data-based surrogate models. Image-based surrogates directly predict 2D visualization images, while data-based surrogates synthesize 3D visual data, such as volumetric data. Image-based surrogates are often trained with predefined visual parameters, such as view angles and visual mappings, and typically require relatively low training costs. Data-based surrogates offer greater flexibility in terms of 3D interactions and post-processing operations, such as isosurfacing and feature extraction. 

Publications:

Deep Learning based Data Representation

Computational resources such as node-hours, storage space, memory, and bandwidth are often in limited supply in scientific computing, which pushes scientists and researchers to develop new strategies that perform the desired tasks faster and with a smaller storage footprint. At GRAVITY lab, we have proposed various tools and techniques to reduce the computational resources required. For example, we use a neural-network-based hierarchical super-resolution algorithm to upscale low-resolution data, and we transform data into a more compact latent space to support importance-driven scientific data exploration and to reduce data that are not deemed important. We have also proposed a particle latent representation method for efficient feature analysis and tracking.

Publications:

Text analysis + NLP

The increasing availability of large volumes of text data has spurred the development of natural language processing (NLP) techniques for extracting useful information from unstructured text. NLP has been applied to various fields, including sentiment analysis, topic modeling, and entity recognition, among others. While these techniques can reveal valuable insights, they often produce large and complex outputs, which can be difficult to interpret and analyze.

To overcome these challenges, there has been growing interest in combining NLP with text visualization techniques to create a more intuitive representation of the data. Text visualization is the process of representing text data visually to enable more effective exploration and interpretation. By combining NLP with text visualization, researchers can analyze large volumes of text data more efficiently and gain a better understanding of the underlying trends and patterns.

Publications:

Graph Analysis, Inference, and Visualization

A graph is a mathematical structure used to model networks (e.g., social networks, transportation networks) in many different applications. Because graphs are non-Euclidean data structures, modeling graph data remained a challenging task until Graph Neural Networks (GNNs) emerged. GNNs have significantly advanced the performance of machine learning tasks on graphs. Applying GNNs to graph visualization is an important but still under-explored topic. In the GRAVITY lab, we aim to visualize graphs with diverse aesthetic goals via GNNs, such that the topological characteristics of graphs can be clearly identified. In addition to applying GNNs in the visualization field, we also focus on visualizing and explaining the decision-making process of GNNs, because their lack of self-explainability is a serious obstacle to applying them to real-world problems. In summary, our ultimate goal is not only to visualize graphs with the most advanced deep learning techniques, but also to open the black box (i.e., the graph-based deep learning model) by disclosing its decision-making process.

Publications:

Understanding Deep Learning Models with Visual Analytics (ML+VA) 

Machine learning, especially deep learning with neural networks, has achieved unprecedented success in a variety of disciplines, such as object recognition with convolutional neural networks (CNN), speech recognition with recurrent neural networks (RNN), and image generation with generative adversarial networks (GAN). However, to date, there is no clear understanding of why these complicated neural networks perform so well, or how they might be improved. At GRAVITY lab, we turn to visual analytics approaches to fill the gap between the success of deep learning models and the deficiency in model interpretations. In collaboration with domain scientists, we develop integrated visual analytics systems to demonstrate model details, explore training dynamics at different levels through user-friendly interactions, and propose potential solutions to improve the performance of machine learning models.

Publications:

In Situ Analysis, Summarization, and Visualization of Extreme-scale Data Sets

Traditional post-processing-based analysis is not always readily applicable to big data problems, since storing all the raw data for off-line analysis is becoming prohibitive due to the bottleneck created by slow disk I/O and extreme-scale data sizes. Therefore, to enable flexible exploration of extreme-scale data sets, in this project we explore in situ analysis techniques, which have emerged as one of the frontiers in big data analysis and visualization. In situ analysis encapsulates simulation-time data analysis, triage, and summarization while the data still resides in computer memory. It ensures minimal data movement while maximizing the utilization of computational resources. This body of research aims at developing practical and scalable solutions for data analysis and summarization that are suitable for in situ environments, and at demonstrating that the reduced and compact data summaries can be used flexibly during post-hoc analysis to perform scalable, uncertainty-aware visual analysis for feature exploration.
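As a minimal sketch of summarization while the data is still in memory, the example below keeps a per-cell running mean and variance over time steps using Welford's online update, so each raw time step can be discarded after it is folded into the summary. This is an illustration under stated assumptions, not the lab's actual in situ pipeline: the class name `RunningMoments`, the grid size, and the random fields standing in for simulation output are all hypothetical.

```python
import numpy as np


class RunningMoments:
    """Hypothetical in situ summary: per-cell running mean and variance."""

    def __init__(self, shape):
        self.count = 0
        self.mean = np.zeros(shape)
        self.m2 = np.zeros(shape)  # running sum of squared deviations

    def update(self, field):
        """Fold one time step into the summary (Welford's online update)."""
        self.count += 1
        delta = field - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (field - self.mean)

    def variance(self):
        """Population variance per cell over the time steps seen so far."""
        return self.m2 / self.count


rng = np.random.default_rng(1)
summary = RunningMoments((16, 16))
for _ in range(50):
    # Synthetic stand-in for one simulation time step held in memory.
    field = rng.normal(size=(16, 16))
    summary.update(field)
    # The raw field is not stored; only the compact summary survives.
```

The summary occupies two arrays regardless of the number of time steps, which is the property that makes this kind of reduction attractive when writing every time step to disk is not feasible.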

Publications:

Distribution-based Representation, Analysis, and Visualization for Large-Scale Datasets 

As it becomes more difficult to analyze large-scale simulation output at full resolution, users will have to review and identify regions of interest by transforming data into compact information descriptors that characterize simulation results and allow detailed analysis on demand. Among many different feature descriptors, the statistical information derived from data samples is a promising way to tame the big data avalanche, because data distributions computed from a population can compactly describe the presence and characteristics of salient features with minimal data movement. The ability to computationally summarize and process data using distributions provides an efficient and representative capture of the information content of a large-scale data set. At GRAVITY lab, we aim to develop novel and compact distribution-based data representations that, on the one hand, significantly reduce the size of the overall data via statistical summarization techniques and, on the other hand, allow for compact and efficient stochastic data analysis and visualization for feature discovery.
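One simple form of distribution-based summarization, sketched below, is to partition a volume into blocks and store one normalized histogram per block. A question such as "how likely is a value in a given range inside this block?" can then be answered from the histograms alone. The function names, block size, bin count, and the random volume are assumptions made for illustration; the lab's representations may differ.

```python
import numpy as np


def block_histograms(volume, block, bins, value_range):
    """Summarize a 3D array as one normalized histogram per block."""
    bz, by, bx = block
    nz, ny, nx = (s // b for s, b in zip(volume.shape, block))
    # Crop to a whole number of blocks, then gather each block's samples.
    v = volume[: nz * bz, : ny * by, : nx * bx]
    v = v.reshape(nz, bz, ny, by, nx, bx).transpose(0, 2, 4, 1, 3, 5)
    v = v.reshape(nz, ny, nx, -1)
    edges = np.linspace(value_range[0], value_range[1], bins + 1)
    hists = np.empty((nz, ny, nx, bins))
    for idx in np.ndindex(nz, ny, nx):
        counts, _ = np.histogram(v[idx], bins=edges)
        hists[idx] = counts / v[idx].size
    return hists, edges


def range_probability(hists, edges, lo, hi):
    """Per-block probability mass of the bins lying entirely inside [lo, hi]."""
    inside = (edges[:-1] >= lo) & (edges[1:] <= hi)
    return hists[..., inside].sum(axis=-1)


rng = np.random.default_rng(0)
volume = rng.random((32, 32, 32))  # synthetic stand-in for simulation output
hists, edges = block_histograms(volume, (8, 8, 8), bins=16, value_range=(0.0, 1.0))
prob = range_probability(hists, edges, 0.5, 1.0)  # one probability per block
```

Here 32,768 samples are reduced to 64 histograms of 16 bins each. Blocks with a high `prob` are candidates for loading at full resolution.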

Publications:

Large-Scale Data Exploration based on Query-Driven Visualization  

Query-driven visualization has been applied to efficiently analyze and visualize large-scale data sets by focusing on a smaller subset of the raw data. To reduce data exploration time, scientists usually focus only on the interesting or important part of the data that matches some specified criteria for further analysis and decision making. By highlighting only part of the raw data, query-driven visualization constrains the computational complexity of data visualization and provides much faster data exploration. To rapidly retrieve the subset of data queried by the user, query-driven visualization usually incorporates particular data structures, such as trees or indexing data structures. At GRAVITY lab, we are developing novel approaches to provide efficient and qualitative query-driven data analysis and visualization.
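To make the role of the index concrete, the sketch below builds one of the simplest possible indexing structures: cell ids sorted by scalar value, so that a range query becomes two binary searches instead of a scan of the whole field. The class `SortedValueIndex` and the tiny example field are hypothetical. Production systems typically use more elaborate structures, such as bitmap indices or trees.

```python
import numpy as np


class SortedValueIndex:
    """Hypothetical index: cell ids sorted by scalar value for range queries."""

    def __init__(self, values):
        flat = np.asarray(values).ravel()
        self.order = np.argsort(flat, kind="stable")  # cell ids in value order
        self.sorted_values = flat[self.order]

    def query(self, lo, hi):
        """Return the flat ids of cells whose value v satisfies lo <= v <= hi."""
        start = np.searchsorted(self.sorted_values, lo, side="left")
        stop = np.searchsorted(self.sorted_values, hi, side="right")
        return np.sort(self.order[start:stop])


field = np.array([[0.1, 0.9, 0.4],
                  [0.7, 0.2, 0.8]])
index = SortedValueIndex(field)
hits = index.query(0.5, 0.85)  # flat ids 3 and 5 (values 0.7 and 0.8)
```

Only the cells in `hits` would need to be passed on to rendering, which is what keeps the cost of visualization proportional to the size of the query result rather than the size of the data.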

Publications:

Human-Computer Interaction (HCI) and Virtual Reality (VR)

Human-Computer Interaction techniques have been shown in various research fields to improve the efficiency of data exploration and analysis. In an immersive environment, users get the feeling that datasets share the same world with them, and all communication with the data is conducted directly in 3D space, bypassing the traditional 2D screen and mouse/keyboard. In our lab, we study (1) innovative interactions to manipulate the data, through not only traditional devices, but also tactile input (such as using a touch screen) and body-gesture input (such as using a motion camera); and (2) how information perception from the data can be enhanced by stereoscopic rendering (such as when using a head-mounted device).

Publications:

Uncertainty Analysis & Visualization in Ensemble Data Sets

Ensemble simulations are one of the primary sources of uncertain data sets in scientific studies. When modeling and measuring a real-world phenomenon via simulations, the lack of knowledge about the ground truth compels scientists to use multiple initial conditions and/or different input parameters to estimate the possible outcomes. The resulting ensemble data sets are used for decision making in the real world and are thus of prime importance to weather scientists and geoscientists. At GRAVITY lab, we have proposed various tools and techniques to analyze and visualize such ensemble data sets. Using information-theoretic measures, we quantify and visualize the uncertainty of ensemble features such as isosurfaces and streamlines. We also develop effective visual analytics solutions to study the effect of input parameters and initial conditions on the ensemble results by performing various types of sensitivity analysis.
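One way to read "information-theoretic measures" for isosurfaces is sketched below. At each grid point, estimate from the ensemble the probability that the value is at or above the isovalue, then take the binary Shannon entropy of that event. Entropy is zero where all members agree and reaches one bit where they split evenly. The function name and the toy ensemble are assumptions for illustration, and this simplified measure is not necessarily the one used in the lab's publications.

```python
import numpy as np


def exceedance_entropy(ensemble, isovalue):
    """Binary Shannon entropy (bits) of the event value >= isovalue, per grid point.

    `ensemble` has shape (members, ...grid).
    """
    p = (np.asarray(ensemble) >= isovalue).mean(axis=0)
    q = 1.0 - p
    # Suppress warnings from log2(0); those entries are replaced by 0 below.
    with np.errstate(divide="ignore", invalid="ignore"):
        h_p = np.where(p > 0, p * np.log2(p), 0.0)
        h_q = np.where(q > 0, q * np.log2(q), 0.0)
    return -(h_p + h_q)


# Four ensemble members on a three-point grid (toy data).
ensemble = np.array([[0.0, 0.0, 1.0],
                     [0.0, 1.0, 1.0],
                     [0.0, 1.0, 1.0],
                     [0.0, 0.0, 1.0]])
entropy = exceedance_entropy(ensemble, isovalue=0.5)
# Points 0 and 2: all members agree, entropy 0. Point 1: a 2-2 split, entropy 1 bit.
```

Mapping `entropy` to color or opacity gives a direct picture of where the ensemble disagrees about the location of the isosurface.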

Analyzing and Visualizing Uncertain Flow Fields

Uncertain flow analysis is becoming prevalent in various scientific and engineering domains, such as computational fluid dynamics, aerodynamics, climate, and weather research. In uncertain flow fields, a spatial location often contains a distribution of possible vector directions, which makes traditional flow analysis techniques difficult to apply. In this project, we proposed various techniques to analyze and visualize uncertain flow behaviors, including uncertain Finite-Time Lyapunov Exponent (FTLE) calculation and visualization, uncertain Lagrangian Coherent Structure (LCS) extraction, and density estimation of uncertain stream surfaces.
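For reference, the deterministic FTLE at a point is computed from the gradient of the flow map over an integration time T: with J the flow-map Jacobian and C = JᵀJ the Cauchy-Green tensor, FTLE = ln(sqrt(λ_max(C))) / |T|. The sketch below evaluates this for a saddle flow with a closed-form flow map, then mimics uncertainty by sampling the flow's strain rate from a normal distribution and collecting the resulting FTLE values. The Monte Carlo sampling, the saddle flow, and all function names are assumptions chosen for illustration, not the techniques proposed in this project.

```python
import numpy as np


def ftle(flow_map, x, y, T, h=1e-4):
    """FTLE at (x, y) from a central-difference gradient of the flow map."""
    # Columns of J are the derivatives of the flow map w.r.t. x and y.
    dx = (np.array(flow_map(x + h, y, T)) - np.array(flow_map(x - h, y, T))) / (2 * h)
    dy = (np.array(flow_map(x, y + h, T)) - np.array(flow_map(x, y - h, T))) / (2 * h)
    J = np.column_stack([dx, dy])
    C = J.T @ J  # right Cauchy-Green deformation tensor
    lam_max = np.linalg.eigvalsh(C)[-1]  # eigenvalues are returned in ascending order
    return np.log(np.sqrt(lam_max)) / abs(T)


def saddle_flow_map(x, y, T, rate=1.0):
    """Closed-form flow map of dx/dt = rate * x, dy/dt = -rate * y."""
    return x * np.exp(rate * T), y * np.exp(-rate * T)


# Deterministic case: the FTLE of this saddle flow equals the strain rate.
value = ftle(saddle_flow_map, 0.3, -0.2, T=2.0)

# Hypothetical uncertain case: the strain rate is only known as a distribution.
rng = np.random.default_rng(0)
rates = rng.normal(loc=1.0, scale=0.1, size=200)
samples = np.array([
    ftle(lambda x, y, T: saddle_flow_map(x, y, T, rate=r), 0.3, -0.2, T=2.0)
    for r in rates
])
```

In the uncertain case the result at each location is a distribution of FTLE values (`samples`) rather than a single number, which is why visualization techniques designed for scalar FTLE fields do not carry over directly.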

Information-theoretic Framework for Visualization 

The goal of this project is to develop a quantitative data analysis framework to facilitate effective visualization of large-scale scientific data sets. By considering the process of visualization as a communication channel, we can quantitatively model the information flow between the data input and the visualization output. With information theory as the theoretical foundation, we are developing a framework to evaluate and optimize the quality of visualization based on the information content of the input data, the visualization output, and the discrepancy between the two. The framework can systematically guide the visual analysis process by iteratively optimizing the visualization result so that the information gap between the two ends of the visual analysis pipeline is quickly narrowed.

The project is supported in part by the National Science Foundation [NSF project page] and the Department of Energy.

Publications:

Information Visualization 

Information visualization (InfoVis) uses visual elements to represent abstract data. It communicates information to people by making use of human vision, which is recognized as having the widest bandwidth of all the senses. The goal of InfoVis is to seamlessly integrate visual representations and exploratory interfaces, providing users with an informative, convenient, and pleasant environment for exploring and communicating data.

Our research in information visualization focuses on structured data visualization and query-driven interaction. Graphs and trees are the most classic types of structures. We study the visual representations and layout algorithms of these structures to achieve the desired visual properties. As interactive querying becomes an indispensable means of data analysis, we study the visual operations that can help the user glean insight from the data.

Publications: