Moderator: Shashi Shekhar (UMN)
Topic: Understanding and narrowing gaps between Data Science (e.g., Data Mining, Machine Learning, Statistics) and Mechanistic theories in physical sciences (e.g., underlying processes driving patterns, extrapolating beyond observed conditions)
Background: Data science methods have found success in analyzing many complicated systems, such as social networks. However, their success has been more limited for physical systems (e.g., epidemics, climate). For example, articles in Science [2], Nature [3], and PLOS [4] noted failures of Google Flu Trends, and a New York Times article [5] said, "no scientist thinks you can solve this problem by crunching data alone, no matter how powerful the statistical analysis; you will always need to start with an analysis that relies on an understanding of physics and biochemistry". A 2014 Geophysical Research Letters paper [1] added: "failure to account for dependence between [Physical] models, variables, locations and seasons yield misleading results".
[1] Statistical significance of climate sensitivity predictors obtained by data mining, P. M. Caldwell et al., Geophys. Res. Lett., 41:1803-1808, 2014.
[2] The Parable of Google Flu: Traps in Big Data Analysis, David Lazer et al., AAAS Science, 343, March 14, 2014.
[3] When Google got flu wrong: US outbreak foxes a leading web-based method for tracking seasonal flu, D. Butler, Nature, 494(7436), 155-6, 2013.
[4] Reassessing Google Flu Trends Data for Detection of Seasonal and Pandemic Influenza: A Comparative Epidemiological Study at Three Geographic Scales, D. Olson, K. Konty, M. Paladini, C. Viboud, L. Simonsen, PLOS Comp. Biology, 9, Oct. 17th, 2013.
[5] Eight (No, Nine!) Problems With Big Data, G. Marcus and E. Davis, New York Times, April 6th, 2014.
Questions for Panelists:
Panelists:
Chid Apte, IBM
Imme Ebert-Uphoff, Colorado State
Joydeep Ghosh, UT Austin
Anuj Karpatne, UMN