Project Description

Most research on Big Data has focused on its automated processing (machine learning, etc.), leaving a knowledge and technological gap in tools and methods for its use in analytical scenarios by human decision makers. Indeed, while many well-established techniques from Visual Analytics can be used for regular datasets, their applicability to large volumes of data is questionable. Compounding this problem, Streaming Big Data (SBD), constantly updating in real time, poses additional challenges: an adequate solution must not only cope with the incoming barrage of data, but also highlight relevant trends and changes in the data stream in a timely fashion, so that relevant decisions can be taken.

The scale of SBD, with its volume and velocity, poses new challenges and creates the need for real-time, flexible, interactive, and dynamic visualization techniques that go beyond the limitations of existing approaches. In VisBig we will research and develop solutions for the problem of SBD Visual Analytics.

We will investigate techniques and methods that allow the clear and understandable visualization of SBD. This will include: (a) efficient processing and consumption of streaming data; (b) automated detection of relevant changes in the data stream, highlighting entities that merit a detailed analysis; (c) selection and development of visualization idioms adequate for SBD; (d) appropriate use of idiom transformations to allow for the real-time visualization of changes in the stream.

At the most fundamental level, the system must cope with the data that keeps pouring in. Then, we will detect points of interest in the data. An uninteresting stream can suddenly become relevant when the data therein changes in nature, due perhaps to some external event. Those changes must be highlighted to the analyst lest they go unnoticed. We will employ machine learning to do this and address how the use of domain-specific knowledge can lead to more robust solutions. Having identified interesting changes to the data stream, the question of how to visualize them remains. In a typical Business Intelligence scenario, the dataset and relevant analyses are known beforehand, allowing the creation of custom-tailored dashboards matching the data and the questions analysts have. With SBD, especially in rapidly evolving, complex domains, situations may arise in which different visualization idioms become more relevant, raising the question of how to (semi-)automatically adapt visualization idioms depending on the data and its properties at each moment, to highlight facets relevant for analysis. How to transition between different states of the visualization while maintaining the context of the analysis is also an open question that this project will address.

VisBig's results, both scientific and as a product, will have an impact on the Visual Analytics and Business Intelligence communities, where the need for solutions for SBD Analytics is increasingly dire.