Use Cases

The Panorama project is developing several workflow application use cases to investigate new technologies for managing large-scale workflow applications. These workflows are real applications developed both to help DOE scientists automate their computational problems and to help computer scientists investigate analytical modeling and diagnostic monitoring.

Spallation Neutron Source (SNS)

The Spallation Neutron Source (SNS) is a DOE research facility at Oak Ridge National Laboratory that provides pulsed neutron beams for scientific and industrial research. SNS uses a particle accelerator to strike a mercury-filled target with short proton pulses, producing neutrons through the process of spallation. A wide variety of experimental instruments provide different capabilities for researchers across a broad range of disciplines, including physics, chemistry, materials science, and biology.

SNS hosts hundreds of researchers every year who conduct experiments within short reservations of a few days to a few weeks. Providing these researchers with efficient, user-friendly, and highly configurable workflows that reduce the turnaround time from data collection to analysis and back is essential to the success of SNS. Figure 1 shows the data flow for a typical SNS instrument, in this case NOMAD. Neutron events scattered from the scientific sample under investigation are collected by an array of detectors. These raw events are processed into a representation familiar to the domain scientist, which depends on the type of experiment; for NOMAD, the reduced form is a powder diffraction pattern. This reduced data is then analyzed and compared to materials simulations to extract scientific information.

Figure 1: SNS example workflow for NOMAD.

In addition to workflows for processing data from SNS experiments, there are also workflows for data analysis and simulation that support and guide SNS experiments and validate computer models against experimental data. These workflows automate tedious manual processes to reduce time to solution and improve researcher productivity. In collaboration with the Center for Accelerating Materials Modeling (CAMM) of SNS data, we are adapting a workflow that executes simulations to support experimental design and the validation of molecular models as a use case for the Panorama project. The workflow executes an ensemble of molecular dynamics and neutron scattering simulations to optimize the value of a model parameter. It is currently being used to investigate temperature and hydrogen charge parameters for models of water molecules, and the results are compared to data from quasi-elastic neutron scattering (QENS) experiments using the BASIS instrument at SNS.

An illustration of this parameter refinement workflow is shown in Figure 2. The workflow executes one batch job to unpack the database, along with five jobs for each set of parameters. First, each parameter set is fed into a pair of molecular dynamics simulations using NAMD, with the parameter sets running in parallel: the first simulation equilibrates the system, and the second takes that equilibrium state as input and computes the production dynamics. Each NAMD simulation runs on 288 cores for 1 to 6 hours. The MD trajectories then have their global translation and rotation removed using AMBER's ptraj utility and are passed to Sassena, which computes the coherent and incoherent neutron scattering intensities from the trajectories. Each Sassena simulation runs on 144 cores for up to 6 hours. The final outputs of the workflow are transferred to the user's desktop and loaded into Mantid for analysis and visualization.

Figure 2: The SNS refinement workflow executes a parameter sweep of molecular dynamics and neutron scattering simulations to optimize the value for a target parameter to fit experimental data.
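The per-parameter-set job structure described above can be sketched as a small dependency graph. This is an illustrative sketch only, not the project's actual workflow code; all job names and the function itself are hypothetical, and it assumes five jobs per parameter set (two NAMD runs, one ptraj cleanup, and coherent/incoherent Sassena runs) downstream of a single unpack job.

```python
def build_refinement_dag(parameter_sets):
    """Return (jobs, deps) for the parameter-refinement workflow sketch.

    jobs: list of job names; deps: dict mapping each job to its parent jobs.
    One unpack job, then five jobs per parameter set.
    """
    jobs = ["unpack_db"]
    deps = {"unpack_db": []}
    for i, _params in enumerate(parameter_sets):
        eq = f"namd_equilibrium_{i}"
        prod = f"namd_production_{i}"
        ptraj = f"ptraj_{i}"
        sas_coh = f"sassena_coherent_{i}"
        sas_inc = f"sassena_incoherent_{i}"
        jobs += [eq, prod, ptraj, sas_coh, sas_inc]
        deps[eq] = ["unpack_db"]   # equilibration starts once the database is unpacked
        deps[prod] = [eq]          # production dynamics continues from the equilibrium state
        deps[ptraj] = [prod]       # remove global translation and rotation
        deps[sas_coh] = [ptraj]    # scattering intensities computed from the cleaned trajectory
        deps[sas_inc] = [ptraj]
    return jobs, deps

# Two hypothetical parameter sets give 1 + 2*5 = 11 jobs in total.
jobs, deps = build_refinement_dag([{"temperature": 290}, {"temperature": 300}])
```

In a real deployment this graph would be handed to a workflow management system, which schedules the independent parameter sets in parallel while respecting the chain within each set.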

Accelerated Climate Modeling for Energy (ACME)

The Accelerated Climate Modeling for Energy (ACME) project uses coupled models of the ocean, land, atmosphere, and ice to study the complex interactions between climate change and societal energy requirements. One of the flagship workflows of this effort is the fully coupled climate model running at high resolution.

The complete workflow for ACME is illustrated in Figure 3. The ACME climate modeling effort involves the interaction of many different software and hardware components distributed across computing resources at several DOE laboratories. As part of the ACME project, many of the workflow activities that were previously done manually are being automated. The goal is to have an automated, end-to-end workflow that provides full data provenance.

Figure 3: The complete Accelerated Climate Modeling for Energy (ACME) workflow includes many interacting components distributed across DOE labs.

One important step towards that goal is to automate the small portion of the full workflow that involves running the ACME climate model. The Panorama project is developing a workflow that automates the manual effort involved in monitoring and resubmitting the model code in case of failures, and that provides periodic reporting for validation of science outputs. The workflow, illustrated in Figure 4, divides a large climate simulation into several stages. Each stage completes a portion of the total target simulation time; for example, a 40-year simulation may be divided into eight 5-year stages. This enables each stage of the workflow to be completed within the maximum walltime permitted for batch jobs on the target DOE leadership-class computing system. Restart files generated at the end of each stage are used as input to the next stage in order to continue the simulation. Each stage also produces history files, which the workflow uses to automatically compute summary data called climatologies. This climatology data can be reviewed periodically by project scientists to ensure that the simulation is progressing as expected, so that problems can be identified and corrections made before computing resources are wasted. Both the history files and the climatologies are transferred to HPSS and CADES (open infrastructure) for long-term storage and future analysis.

Figure 4: The ACME workflow runs one climate simulation in several stages. The output of each stage is used to compute climatologies for validation. All outputs are stored in HPSS and CADES for archiving and further analysis.
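The staging scheme described above can be sketched as simple bookkeeping: split the target simulation length into fixed-size stages, each of which fits within a batch job's walltime limit and restarts from the previous stage's restart file. This is a hypothetical sketch of the idea only, not the actual ACME workflow code; the restart file naming is an assumption for illustration.

```python
def plan_stages(total_years, years_per_stage):
    """Split a long climate simulation into fixed-length stages.

    Each stage covers a slice of the total simulated time and chains to the
    next via a restart file written at the end of the stage.
    """
    assert total_years % years_per_stage == 0, "stages must divide evenly"
    stages = []
    restart = None  # the first stage starts from initial conditions
    for i in range(total_years // years_per_stage):
        start = i * years_per_stage
        end = start + years_per_stage
        stages.append({
            "stage": i,
            "sim_years": (start, end),
            "restart_input": restart,                  # output of the previous stage
            "restart_output": f"restart_{end:03d}.nc",  # hypothetical naming scheme
        })
        restart = stages[-1]["restart_output"]
    return stages

# The example from the text: a 40-year run split into eight 5-year stages.
stages = plan_stages(total_years=40, years_per_stage=5)
```

After each stage's batch job completes, the workflow would compute climatologies from that stage's history files and archive both to HPSS and CADES, while a monitoring loop resubmits any stage that fails.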