Delivering a Dynamic Network-Centric Platform for Data-Driven Science (DyNamo)
Computational science today depends on many complex, data-intensive applications operating on datasets that originate from a variety of scientific instruments and data stores. A major challenge for data-driven science applications is the integration of data into the scientist’s workflow. Recent advances in dynamic, networked cloud infrastructures like NSF GENI and in Software-Defined Networking (SDN) provide the building blocks to construct integrated, reconfigurable, end-to-end infrastructure that has the potential to increase scientific productivity. However, applications and workflows have seldom taken advantage of these advanced capabilities. By leveraging prior work in programmatic control of dynamic infrastructure and SDN-aware network services, the DyNamo project aims to develop new algorithms, policies, and mechanisms in a network-centric platform that will enable high-performance, adaptive data flows and coordinated access to multi-campus cyberinfrastructure (CI) facilities and community data repositories for observational scientists. Coupled with innovations in network-aware workflow management, DyNamo will bridge a crucial gap between data-driven, observational science workflows and new network mechanisms and services through engagement with domain scientists from the Collaborative and Adaptive Sensing of the Atmosphere (CASA) and Ocean Observatories Initiative (OOI) communities. The project draws on the DyNamo partners’ experience in advancing dynamic infrastructure management solutions for domain science, and on the expertise of domain scientists and their CI enablers, building on a track record of prior collaborations. Its synergy with previous NSF investments in ExoGENI, the Pegasus workflow management system, and CC-NIE ADAMANT yields a template and model implementation for developing a high-performance, end-to-end platform for observational science data flows and workflow processing.
The network-centric algorithms, policies, and mechanisms developed in DyNamo will enable programmable, on-demand access to high-bandwidth, configurable network paths from community data repositories to national CI facilities, and help satisfy the data, computational, and storage requirements of CASA and OOI workflows. This will enable researchers to test new algorithms and models in real time with live streaming data, which is currently not possible in many scientific domains, including those served by CASA and OOI. Through enhanced interactions between Pegasus and the network-centric platform, and through new network-aware workflow scheduling algorithms in Pegasus, CASA and OOI workflows will benefit from workflow automation and data management over dynamically provisioned infrastructure. Application-level data quality-of-service (QoS) expectations will be translated into transparent, SDN-aware infrastructure actions, aided by real-time analysis of end-to-end monitoring data.
- Renaissance Computing Institute (RENCI) (Lead)
- University of Southern California Information Sciences Institute (USC/ISI)
- University of Massachusetts, Amherst
- Rutgers University
The DyNamo Project is supported by the National Science Foundation under Grant 1826997. The views expressed do not necessarily reflect the views of the National Science Foundation or any other organization.