GridH1N1Comments

The H1N1 response has leveraged a few existing grid components that were open source, flexible, and already tested by external partners:

1) GIPSE format

2) GIPSE Store database

3) SOAP API ("GIPSEService")

* (GridViewer is not officially in scope and was not used for H1N1, so I'm not listing it as used, but it is an available capability.)

1) The GIPSE format was an XML format developed with the CoE partners (including UW's DiSTRIBuTE team) for querying federated surveillance data using 'Indicators' and 'Stratifiers'. This format was drastically simplified and adjusted to match the existing DiSTRIBuTE format, but it kept the idea of using program-specific indicator sets to conduct surveillance at an aggregate level. Two indicator sets were initially selected: (a) BioSense's top 30 subsyndromes plus ILI (influenza-like illness) subsyndromes by temperature, and (b) DiSTRIBuTE's ILI conditions (EDVisit total, ILI-broad, ILI-narrow) by temperature.

2) The GIPSE Store database was a very lightweight star schema database for collecting aggregate values (counts, ratios or percentages) from identified data sources, keyed by a specific set of stratifying variables: state, zip code, age group, disposition, gender, facility, and indicator set. For the H1N1 response, the store used only the zip3 (three-digit zip code prefix), age group and disposition stratifying variables, with an indicator set that included temperature. The H1N1 technical team extended this data store to include multiple instances (loading, staging, master, public), performance tuning options to decrease query time, and import and export routines based on SQL stored procedures.

3) The SOAP API (aka GIPSEService) is a Java-based web service built on the Globus grid middleware stack and generated using the caBIG Introduce service toolkit. It uses a standard Java SQL mapping framework (iBATIS) to run a set of 12 basic queries against the GIPSE Store, providing a secure, access-controlled, fast remote interface to the database. The service is easily configured to point at any JDBC/ODBC database view. It has been tested at external partners such as Denver Public Health, as well as with RODS, ESSENCE, and the NEDSS Base System. The H1N1 technical team enhanced the basic service software project with performance tuning to reduce the response time of queries made through the SOAP API.
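As a rough illustration of how items 2) and 3) fit together, here is a minimal sketch in Python using an in-memory SQLite database. It is not the actual GIPSE Store schema or any of the 12 GIPSEService queries: every table name, column name, function name and sample value below is hypothetical. It only shows the general pattern of a star schema (a fact table of aggregate counts keyed by stratifiers) and one parameterized query of the kind a service layer might run against it.

```python
import sqlite3

# Hypothetical star schema: one fact table of aggregate counts plus small
# dimension tables. All names and values are illustrative assumptions,
# not the real GIPSE Store layout.
SCHEMA = """
CREATE TABLE dim_indicator (indicator_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE dim_age_group (age_group_id INTEGER PRIMARY KEY, label TEXT UNIQUE);
CREATE TABLE fact_count (
    report_date  TEXT NOT NULL,      -- ISO date the aggregate applies to
    zip3         TEXT NOT NULL,      -- three-digit zip code prefix
    age_group_id INTEGER NOT NULL REFERENCES dim_age_group(age_group_id),
    disposition  TEXT NOT NULL,
    indicator_id INTEGER NOT NULL REFERENCES dim_indicator(indicator_id),
    value        INTEGER NOT NULL    -- aggregate count only
);
"""


def build_store():
    """Create the in-memory store and load a few made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO dim_indicator VALUES (?, ?)",
                     [(1, "EDVisit total"), (2, "ILI-broad")])
    conn.executemany("INSERT INTO dim_age_group VALUES (?, ?)",
                     [(1, "0-17"), (2, "18-64")])
    conn.executemany(
        "INSERT INTO fact_count VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("2009-10-01", "303", 1, "admitted", 2, 4),
            ("2009-10-01", "303", 2, "discharged", 2, 6),
            ("2009-10-01", "303", 2, "discharged", 1, 50),
            ("2009-10-02", "303", 1, "discharged", 2, 3),
            ("2009-10-01", "802", 2, "discharged", 2, 7),
        ],
    )
    return conn


def counts_by_zip3(conn, indicator, start, end):
    """Total one indicator per zip3 over an inclusive date range."""
    rows = conn.execute(
        """
        SELECT f.zip3, SUM(f.value)
        FROM fact_count AS f
        JOIN dim_indicator AS i ON i.indicator_id = f.indicator_id
        WHERE i.name = ? AND f.report_date BETWEEN ? AND ?
        GROUP BY f.zip3
        ORDER BY f.zip3
        """,
        (indicator, start, end),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    store = build_store()
    print(counts_by_zip3(store, "ILI-broad", "2009-10-01", "2009-10-02"))
```

Keeping the query parameterized and named, rather than building SQL strings per request, is roughly the role a SQL mapping layer plays in the service; pointing the same query at a different database view would then only change the connection, not the calling code.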

* The team built the Grid Viewer as a demo/test tool. It invokes the various SOAP API services deployed within CDC and at partner sites and shows the results in a simple maps mashup, allowing testers to visualize the performance of each deployed SOAP API. It is not directly used in the H1N1 response, but we've found it to be a valuable tool for testing the SOAP API and GIPSE Store.

Additionally, the response leveraged the grid project's architecture for collaborative development, real-time communication with partners, and open source licensing.

Future steps and applicability really require more thought, which Tom/Ken can provide, but my initial response (to be tempered by Tom/Ken) is:

1) The grid pieces provide a services base for distributed analytics (NLP, GIS) that can be mashed up with existing data services (GIPSE).

2) GIPSE provides a framework for rapidly developing indicator sets based on the aggregate data points that need to be shared. It's rather flexible in that you can use local code sets for indicators and then harmonize them after the fact. There are now a few tools for sharing this indicator-based surveillance data (SOAP API, viewer, PHINMS file transfer).
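The "local code sets, harmonize after the fact" idea in item 2) could look something like the sketch below. This is only an assumption about one way to do it, not how GIPSE itself does it: the site names, local codes, crosswalk tables and function names are all made up for illustration, and only the shared indicator names (EDVisit total, ILI-broad, ILI-narrow) come from the notes above.

```python
from collections import defaultdict

# Hypothetical crosswalks from each site's local indicator codes to a shared
# indicator set. Site names and local codes are invented for illustration.
CROSSWALKS = {
    "site_a": {"FLU1": "ILI-narrow", "RESP_FEVER": "ILI-broad", "ALL": "EDVisit total"},
    "site_b": {"ili_strict": "ILI-narrow", "ili_wide": "ILI-broad", "visits": "EDVisit total"},
}


def harmonize(site, local_counts, crosswalks=CROSSWALKS):
    """Map one site's {local_code: count} onto the shared indicator names.

    Returns (harmonized_counts, unmapped_codes) so that unknown local codes
    are reported instead of being silently dropped.
    """
    mapping = crosswalks[site]
    harmonized = defaultdict(int)
    unmapped = []
    for code, count in local_counts.items():
        shared = mapping.get(code)
        if shared is None:
            unmapped.append(code)
        else:
            harmonized[shared] += count
    return dict(harmonized), sorted(unmapped)


def combine(per_site):
    """Sum already-harmonized counts across sites."""
    total = defaultdict(int)
    for counts in per_site:
        for name, count in counts.items():
            total[name] += count
    return dict(total)


if __name__ == "__main__":
    a, missing_a = harmonize("site_a", {"FLU1": 2, "RESP_FEVER": 5, "ALL": 40, "OTHER": 1})
    b, missing_b = harmonize("site_b", {"ili_wide": 3, "visits": 25})
    print(combine([a, b]), missing_a, missing_b)
```

Because each site keeps its own codes and only the crosswalk changes, a new indicator set can be stood up quickly and reconciled later, which is the flexibility described above.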