Due to ubiquitous sensors (GPS, accelerometers), easy-to-use apps (Facebook, Twitter, etc.), the prevalence of audio and video recording devices, and greater internet connectivity, the key characteristics of raw data are changing. This “new” data can be characterized by the four V’s: Volume, Velocity, Variety, and Veracity. Moreover, given the popular trend of crowdsourcing, or citizen sensing, it is reasonable to assume that people will provide multiple pieces of evidence for the same event using different data types. For example, during a football match, some people will tweet about goals, penalties, and so on, while others will take a picture and upload it. Although the underlying modalities differ (text and image), the data describe the same event. Such multimodal evidence should be used to strengthen the belief in the underlying physical event. Finally, each data point carries inherent uncertainty. This uncertainty can arise from inconsistent, incomplete, or ambiguous data, as well as from the trustworthiness of the user. Similarly, some sources are more reliable than others, which also affects overall reliability. Volume, velocity, and variety are measurable and observable; however, there is no measure of truthfulness.

This workshop will be held in conjunction with the SIAM International Conference on Data Mining 2013 (SDM 2013).

Featured speakers include:

Dan Miranker, UT Austin
Dan Wolfson, IBM Software Group
Latifur Khan, UT Dallas
Galina Rogova, SUNY Buffalo
Soundarrajan Srinivasan, Bosch Research Center
Pontus Svenson, Defence Research Agency, Sweden