In Situ Analysis, Summarization, and Visualization of Large-scale Data Sets

Traditional analysis based on post-processing is not always applicable to big data problems: storing all the raw data for offline analysis is becoming prohibitive because of the bottleneck created by slow disk I/O and extreme-scale data sizes. To enable flexible exploration of extreme-scale data sets, this project explores in situ analysis techniques, which have emerged as one of the frontiers in big data analysis and visualization. In situ analysis encompasses data analysis, triage, and summarization performed at simulation time, while the data still resides in memory. It minimizes data movement while maximizing the utilization of computational resources. This body of research aims to develop practical and scalable solutions for data analysis and summarization that are suitable for in situ environments, and to demonstrate that the reduced, compact data summaries can be used flexibly during post hoc analysis to perform scalable, uncertainty-aware visual analysis for feature exploration.
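The general idea of summarizing data while it is still in memory can be illustrated with a minimal sketch. It is not the method of any publication listed on this page. It assumes, purely for illustration, that each block of a simulation field is reduced to a fixed-size histogram and that the raw values are then discarded. The function names, the value range, the bin count, and the synthetic Gaussian "simulation" are all hypothetical.

```python
import random


def summarize_block(values, lo, hi, n_bins):
    """In situ step: reduce one in-memory block to per-bin counts."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = int((v - lo) / width)
        # Clamp so out-of-range values land in the first or last bin.
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1
    return counts


def prob_above(counts, lo, hi, threshold):
    """Post hoc query: estimated fraction of block values >= threshold.

    Values are assumed uniform inside each bin, so the result is an
    approximation whose accuracy depends on the bin width.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    width = (hi - lo) / len(counts)
    mass = 0.0
    for i, c in enumerate(counts):
        left = lo + i * width
        right = left + width
        if threshold <= left:
            mass += c  # whole bin lies above the threshold
        elif threshold < right:
            mass += c * (right - threshold) / width  # partial bin
    return mass / total


def run_simulation(n_steps, n_blocks, block_size, seed=0):
    """Hypothetical stand-in for a simulation loop with in situ summarization."""
    rng = random.Random(seed)
    summaries = []
    for _ in range(n_steps):
        step_summaries = []
        for b in range(n_blocks):
            # Synthetic block of field values (assumption: Gaussian noise).
            raw = [rng.gauss(0.1 * b, 1.0) for _ in range(block_size)]
            step_summaries.append(summarize_block(raw, -5.0, 5.0, 16))
            # Only the 16-bin summary is kept; the raw block is not stored.
        summaries.append(step_summaries)
    return summaries
```

In this sketch, a post hoc call such as `prob_above(summaries[t][b], -5.0, 5.0, 1.5)` returns a per-block probability rather than an exact count. That is the sense in which an analysis built on such summaries is uncertainty-aware: a feature query yields a likelihood per block that can be visualized or thresholded without access to the raw data.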

Publications:

  • Soumya Dutta, Han-Wei Shen, and Jen-Ping Chen: In Situ Prediction Driven Feature Analysis in Jet Engine Simulations, IEEE PacificVis 2018
  • Tzu-Hsuan Wei, Soumya Dutta, and Han-Wei Shen: Information Guided Data Sampling and Recovery using Bitmap Indexing, IEEE PacificVis 2018
  • Soumya Dutta, Jonathan Woodring, Han-Wei Shen, Jen-Ping Chen, and James P. Ahrens: Homogeneity Guided Probabilistic Data Summaries for Analysis and Visualization of Large-Scale Data Sets, IEEE PacificVis 2017: 111-120
  • Soumya Dutta, Chun-Ming Chen, Gregory Heinlein, Han-Wei Shen, and Jen-Ping Chen: In Situ Distribution Guided Analysis and Visualization of Transonic Jet Engine Simulations, IEEE Trans. Vis. Comput. Graph. 23(1): 811-820 (2017). [Best Paper Honorable Mention award, SciVis 2016]