Title: Accelerating Data-Driven Discovery by Outsourcing the Mundane (slides)

Ian Foster
Computation Institute, Argonne National Laboratory & University of Chicago


Dr. Ian Foster is Director of the Computation Institute and Professor in the Department of Computer Science at the University of Chicago. He is also a Senior Scientist and Distinguished Fellow at Argonne National Laboratory. He is well known for the Globus Project, a research and development effort that addresses computational and communications problems in collaborative computing.

We have made much progress over the past decade toward effectively harnessing the collective power of IT resources distributed across the globe. In fields such as high-energy physics, astronomy, and climate, thousands benefit daily from tools that manage and analyze large quantities of data produced and consumed by large collaborative teams.

But we now face a far greater challenge: exploding data volumes and powerful simulation tools mean that far more (ultimately most?) researchers will soon require capabilities not so different from those used by these big-science teams. How is the general population of researchers and institutions to meet these needs? Must every lab be filled with computers loaded with sophisticated software, and every researcher become an information technology (IT) specialist? Can we possibly afford to equip our labs in this way, and where would we find the experts to operate them?

Consumers and businesses face similar challenges, and industry has responded by moving IT out of homes and offices to so-called cloud providers (e.g., Gmail, Google Docs, Salesforce), slashing costs and complexity. I suggest that by similarly moving research IT out of the lab, we can realize comparable economies of scale and reductions in complexity. More importantly, we can free researchers from the burden of managing IT, giving them back their time to focus on research and empowering them to go beyond what was previously possible.

I describe work we are doing at the Computation Institute to realize this approach, focusing initially on research data lifecycle management. I present promising results obtained to date with the Globus Online system and suggest a path toward large-scale delivery of these capabilities.