Information Visualization via Data Signatures

Visual analytics is the formation of abstract visual metaphors, combined with human discourse, that helps in confirming the expected and discovering the unexpected in a massive and dynamically changing information space. It combines the art of human intuition with the science of mathematical deduction to directly perceive patterns and derive knowledge and insight from them. This technology applies to almost all fields but is being driven by the needs of biology and national security. As the data sets used in computation grow in size and complexity, the technology available to deal with them becomes less effective. Moreover, common visualization operations can exhaust the capability of a desktop workstation as the data size approaches the machine's limits. While building new machines with more storage would help with data size, analyzing the data efficiently and effectively remains hard.

In this project, we explore the concept of the data signature to capture the meaning of a data set in a compact format. A data signature, as defined in [1], is a mathematical data vector that captures the essence of a large data set in a small fraction of its original size. These signatures allow us to conduct analysis at a higher level of abstraction while still reflecting the results we would obtain from the original data.

We derive data signatures suited to extracting optimal representations of periodic time series data sets, and we produce visualizations based on these signatures that highlight interesting information in the data. By formulating techniques to improve data signature construction, we further discover previously unknown behavior of the data set. These techniques also allow us to further reduce the dimensionality of the data signatures, addressing the need for a significantly smaller representation of the original high-dimensional data set.
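To make the idea of a compact signature for periodic data concrete, here is a minimal sketch. The text above does not state how the signatures are constructed, so the choice of low-order discrete Fourier transform magnitudes is purely an assumption for illustration, as are the function name `dft_signature`, the parameter `n_coeffs`, and the synthetic hourly series (it is not the traffic data).

```python
import cmath
import math


def dft_signature(series, n_coeffs):
    """Hypothetical helper: magnitudes of the first n_coeffs DFT
    coefficients of the series, each scaled by 1/N."""
    n = len(series)
    if not 0 < n_coeffs <= n:
        raise ValueError("n_coeffs must be between 1 and len(series)")
    signature = []
    for k in range(n_coeffs):
        # Direct O(N) evaluation of the k-th DFT coefficient.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(series)
        )
        signature.append(abs(coeff) / n)
    return signature


# Synthetic stand-in for one week of hourly counts (assumed, not real data):
# a constant level plus a 24-hour cycle.
hours = 24 * 7
series = [100 + 40 * math.sin(2 * math.pi * t / 24) for t in range(hours)]

# Reduce 168 values to a 12-element signature.
signature = dft_signature(series, n_coeffs=12)

# Index of the strongest component, ignoring the constant term at index 0.
dominant = max(range(1, len(signature)), key=lambda k: signature[k])
print(len(series), "values reduced to", len(signature))
print("dominant non-constant component index:", dominant)
```

In this synthetic case the strongest non-constant component sits at index 7, matching the seven daily cycles in one week, so the 12-element vector still exposes the periodic structure of the 168 original values. A real signature would need to be chosen and validated against the actual data.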

[1] P.C. Wong, H. Foote, R. Leung, D. Adams and J. Thomas: Data Signatures and Visualization of Scientific Data Sets, Pacific Northwest National Laboratory, USA, IEEE 2000.

Data Set Under Study

2006 North Luzon Expressway Traffic Volume