AQ SBA Engineering Report

1.1     Scope of this document

This Engineering Report describes how the Air Quality Workgroup used and tested the GEOSS Common Infrastructure (GCI) to register, discover, and access datasets relevant to air quality management during AIP Phase 2 (AIP-2). The air quality data used with the GCI during AIP-2 were served through OGC WMS or WCS services. The AQ Workgroup developed a process to create ISO 19115 metadata records for the AQ Community Catalog from WMS/WCS GetCapabilities documents. Using a service-oriented architecture approach, data and metadata flowed from the data providers through the GCI to the users. This methodology and infrastructure were demonstrated using a scenario entitled "Southern California Smoke", which describes how air quality event managers would use data available through GEOSS to predict and analyze the effect of smoke plumes on air quality during and after the Southern California wildfires of October 2007. This demonstration is just one example of the broad capabilities of the infrastructure, which allows a single dataset to be reused for multiple decision support activities and supports a single decision support activity that needs multiple datasets.

1.2     GEOSS AIP

The GEOSS Architecture Implementation Pilot (AIP) leads the incorporation of contributed components consistent with the GEOSS Architecture using a GEO Web Portal, WAF/CSW, AQ Community Portal, and a Clearinghouse search facility to access services through GEOSS Interoperability Arrangements in support of the GEOSS Societal Benefit Areas.  AIP is a GEO task for elaborating the GEOSS Architecture under the purview of the GEO Architecture and Data Committee. 

This Engineering Report (ER) is a key result of the second phase of AIP.  AIP-2 was conducted from July 2008 to June 2009.  A separate AIP-2 ER describes the overall process and results of AIP-2 and thereby provides a context for this Community SBA ER.[1] 

2.     Community SBA Objectives

The overall objective of the AQ Workgroup during AIP-2 was to test and evaluate the GEOSS Common Infrastructure (GCI) from the perspective of the air quality community and, in the process, define an initial AQ Community Infrastructure that connects with the GCI (Fig.1) to share, find, and use distributed data, visualization and analysis services.  There are numerous Earth Observations that are available and, in principle, useful for air quality applications such as informing the public and enforcing AQ standards. However, connecting a user to the right observations or models is accompanied by an array of hurdles. The GEOSS Common Infrastructure allows the reuse of observations and models for multiple purposes.

Figure 1. GEOSS Architecture applied to the Air Quality Community; Air Quality Community Infrastructure 

The goal of the AIP-2 for the AQ Community is to make better use of data and information from a variety of sources in order to support science, management and other decision-making processes. In particular, the AIP-2 effort focused on a wildfire smoke monitoring, forecasting and assessment scenario in order to:

  • understand conditions (both observed and forecasted) related to wildfires and their smoke
  • compare smoke forecasts with satellite and surface observations
  • provide information useful in assessing whether the regional smoke event is considered an "exceptional event"
  • provide information useful for public health applications. 

3.     Scenario

3.1     Actors

A number of actors process earth observations information upstream of the decision makers, who base their decisions on highly synthesized data.  They are described in more detail in the full scenario [2]. 

For illustration, a “value chain” of actors (Fig.2) involved in the Intercontinental Pollutant Transport events is listed here; similar chains for the other events are described in the full scenario. 
  • End use decision maker: Policy maker negotiating an agreement on intercontinental pollutant transport  
    • Information needed:  Synthetic assessment reports quantifying the impact of long-range pollutant transport
  • Upstream information processor:  Scientific advisory group 
    • Information needed:  Technical assessments of model experiments and synthesized datasets to assess transport
  • Upstream information processor:  Scientific task force assessing long-range transport 
    •  Information needed:  Synthetic description of the atmosphere, using multiple observations and models
  • Upstream information processor: Air quality data analysts 
    • Information needed:  Wide variety of atmospheric observations, synthetic integrations of this data
 Actors Processing and Using Data:  Intercontinental Pollutant Transport Example


Figure 2. Actors in Air Quality Community Infrastructure

       Other Actors: Earth Observations Providers

      The same Earth observations, and therefore the same providers, are generally needed for each set of scenario events.

      • Government agencies (National, State/Provincial/Tribal, and/or Local):
        • Environmental, Meteorological, Land management, Space agencies
      • Industry, Consultants
      • Academic and Other Research Institutes
      • International cooperative fora (e.g. WMO, CEOS, EEA) 

      3.2     Context and pre-conditions

      Data providers create standards-based services for their data and register those services with the GEOSS AQ Community Catalog, thereby making those services accessible via GEOSS Clearinghouses. Data analysts and air quality managers find and access those services via GEOPortals or the AQ Community Portal for subsequent use in analyses. The available services are tested to ensure they are active and that they comply with the relevant standards, so that they can be reliably used with visualization and analysis tools.

      Information Needs:

      • meteorological data, such as observations from ground-based networks, satellites, radiosondes, and forecasts from numerical models at various scales
      •  geographical data (land use, demographics, emissions-related activity, etc.)
      • atmospheric composition (air quality) observations such as surface monitoring networks, satellite observations, radiosondes, ground-based remote sensors, and aircraft measurements
      • numerical air quality chemical transport models (at regional to global scales)
      • WMS compliant data services providing data/visualization that can be cataloged through Community Catalogs and accessed through Community Portals.
      • WCS compliant data services providing data that can be cataloged through Community Catalogs and accessed through Community Portals

      Collaboration Functionality Needs:

      1. Service (WMS, Catalog, WCS) clients that can visualize service responses, interact with end users, and be embedded in AQ portals.
      2. Functionality for standards-based access to spatio-temporal data and metadata, and workflow software for service orchestration.
      3. WAF/CSW-compliant Community Catalog(s) for registering data and services to be harvested by the GEOSS Clearinghouse.
      4. AQ JSR268 Compliant Community Portal(s) for finding and accessing the data and services needed for the execution of the scenario.
      5. Community of Practice Workspace(s) where the actors in the scenario can communicate and coordinate their activities.

       Table 1. Deployment and Registration of AQ Community Components

      Wildfire & Smoke Scenario Steps | Transverse Technology Use Cases | Air Quality Use Cases | Service/Component Instances | AIP-2 Demonstration Storyboard

      Pre-AQi. Register AQ Community Catalog | UC #1 - register resources | Register Catalog | WAF and CSW Catalog: http://eie.cos.gmu.edu/CSWClient/ | Both are registered in GEOSS registry

      Pre-AQii. Create and deploy smoke forecast output WCS | UC #2 - deploy components & services | Create WCS | LandCover WMS/WCS: http://wms.gmu.edu/wms/landcover.jsp | Mention that data providers serve data in standard interfaces, including WCS, OPeNDAP, and others

      Pre-AQiii. Register WMS, WCS in AQ Community Catalog | UC #3 - Publish, Harvest, & Query Metadata via Clearinghouse | Register WCS | Submit GetCapabilities from CALPUFF WCS; Submit GetCapabilities from AIRNow WCS; Register OPeNDAP? | Show how the ISO 19115 metadata document is created for a service using the semi-automated method based on GetCapabilities

      Pre-AQiv. Search for fire occurrence, smoke forecast and air quality observation services | UC #4 - client search of metadata; UC #5 - Present Reachable Services and Alerts to Clients | Search Clearinghouses | Submit search request by key word “air” in CSW client and return metadata with links to visualization if WMS is provided. | Need to define specific metadata searches for the storyboard: 1) Search for community catalog and/or community portal; 2) Search for data services

      Pre-AQv. Search for spatial-temporal data analysis tools | UC #4 - client search of metadata; UC #5 - Present Reachable Services and Alerts to Clients | Search clearinghouses for visualization and analysis tools | CSW Client | (none)

      Pre-AQvi. Test services for reliability and compliance | UC #9 - test services | Test WMS services | Time-enabled WMS Viewer in AQ Community Portal | FGDC Service Checker; ESA; ISO 19115 Validator Tool

       

      3.3     Scenario Events

      There are numerous Earth Observations that are available and, in principle, useful for air quality applications such as informing the public and enforcing AQ standards. However, connecting a user to the right observations or models is accompanied by an array of hurdles. The GEOSS Common Infrastructure allows the reuse of observations and models for multiple purposes. Even in the narrow application of wildfire smoke, observations and models can be reused.

      The cyberinfrastructure envisioned by this scenario will enable analysts to combine a wide range of air quality observations, models, and other information, which will ultimately be used to produce a broad range of decision support products for a number of different audiences. Current projects (see the full version of the scenario [2]), along with the evolving data mediators, are significant building blocks of the needed networks and tools.

      Table 2. Smoke Event Process


       Wildfire & Smoke Scenario Steps

       Transverse Technology Use Cases

       Specialized Use Cases

       Service/Component Instances

       AIP-2 Demonstration Storyboard

       AQ01. A wildfire occurs, and air quality analysts, smoke forecasters and others seek fire occurrence, smoke and particulate matter data that are derived from satellite and surface observations and accessible through SOAP/WSDL and OGC WFS and WCS services.

       UC #6 interact with services and alerts

       Access fire occurrence observations

      Future: Bind - Accessing a Sensor Alert Service, notification sent to a Fire detection Workflow service

       Focus on 2007 S. California Wildfires.

       AQ02. A modeler uses the fire locations to initialize smoke forecast models, which are run to predict downwind impacts 1-3 days in the future and which indicate a regional smoke pollution event.

      UC #6 interact with services and alerts

      Future: Process - Running a forecast 
      (--- Workflow & Processing WG)

       

      No forecast model web services are available at the moment; forecast models are run offline.

       

       AQ03. Smoke forecast products are available to the manager/analyst through OGC WCS and OGC WMS.

      UC #6 interact with services and alerts

       Access smoke forecast output.

      • Make GetMap request to CALPUFF_WMS 
      • Make GetCoverage request to CALPUFF WCS

       Access hourly CALPUFF runs for Oct 21-26, 2007.

       AQ04.   Multiple smoke observation products are available to the air quality manager/analyst through OGC WCS or OGC SOS

       UC #6 interact with services and alerts

       Access smoke observations, including satellite observations of smoke and surface observations of particulate matter concentrations.

       

       Access satellite and surface observations for Oct 21-26, 2007.

       AQ05. The air quality manager/analyst uses spatial-temporal comparison services to visualize differences and similarities in the smoke forecast products.

       UC #7 - Exploit Data Visually & Analytically
       

      Visualize maps of smoke forecasts.

      Visualize maps of smoke observations.

      Visualize tables of particulate matter concentration forecasts and observations.

       

       

       AQ06. The air quality manager/analyst uses the smoke forecasts to assess the need for public health alerts

      UC #7 - Exploit Data Visually & Analytically

       Overlay smoke forecasts with population density and locations of schools and hospitals

       

       AQ07. The air quality manager/analyst issues sensor tasking requests to satellite and UAV-based sensors to collect new data over the predicted smoke-impacted areas before, during and after the event.

       
      Future: Bind - Accessing a Sensor Planning Service, with scheduling requirements

       

       

       

       AQ08. The air quality manager/analyst uses the smoke forecasts to anticipate “exceptional event” waiver requests by States.

       UC #7 - Exploit Data Visually & Analytically

       Visualize smoke forecast in map and time series to identify where and when PM2.5 concentrations are expected to exceed National Ambient Air Quality Standards.

       

       

       AQ09. After the smoke event, the air quality manager/analyst uses spatial-temporal comparison services to compare the forecast and observation data (from satellite and surface sensors).

       

       UC #7 - Exploit Data Visually & Analytically

       

       Compare PM2.5 concentrations from CALPUFF with AIRNow surface measurements and OMI AerAbs??
      spatially, through frequency distributions of values over an area, and in time series.

       

       

       AQ10. The air quality manager/analyst uses multi-source observation data to determine whether “exceptional event” waiver requests should be approved.

       UC #8 - construct and deploy workflow
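
Steps AQ03 and AQ04 in the table above amount to time-stamped OGC requests, one per hour of the event. The following is a minimal sketch of how such request URLs might be built, assuming a WMS 1.1.1 and a WCS 1.0.0 endpoint. The endpoint URLs, the layer and coverage names, the bounding box and the `NetCDF` format label are hypothetical placeholders, not the addresses of the CALPUFF services used in the pilot.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode

# Hypothetical endpoints and names; substitute the values advertised in the
# service's own GetCapabilities response.
WMS_ENDPOINT = "https://example.org/wms"
WCS_ENDPOINT = "https://example.org/wcs"
BBOX = (-125.0, 30.0, -110.0, 40.0)  # west, south, east, north (assumed extent)


def hourly_times(start, end):
    """Yield datetimes from start to end (inclusive) in one-hour steps."""
    current = start
    while current <= end:
        yield current
        current += timedelta(hours=1)


def iso_time(moment):
    """Format a datetime the way OGC TIME parameters usually expect (UTC)."""
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def getmap_url(layer, moment, width=600, height=400):
    """Build a WMS 1.1.1 GetMap URL for one time step."""
    params = {
        "SERVICE": "WMS", "VERSION": "1.1.1", "REQUEST": "GetMap",
        "LAYERS": layer, "STYLES": "", "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in BBOX),
        "WIDTH": width, "HEIGHT": height,
        "FORMAT": "image/png", "TIME": iso_time(moment),
    }
    return f"{WMS_ENDPOINT}?{urlencode(params)}"


def getcoverage_url(coverage, moment, width=600, height=400):
    """Build a WCS 1.0.0 GetCoverage URL for one time step."""
    params = {
        "SERVICE": "WCS", "VERSION": "1.0.0", "REQUEST": "GetCoverage",
        "COVERAGE": coverage, "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in BBOX),
        "WIDTH": width, "HEIGHT": height,
        "FORMAT": "NetCDF",  # format labels are server-specific (assumption)
        "TIME": iso_time(moment),
    }
    return f"{WCS_ENDPOINT}?{urlencode(params)}"


# One request per hour for Oct 21-26, 2007.
times = list(hourly_times(datetime(2007, 10, 21, 0), datetime(2007, 10, 26, 23)))
map_urls = [getmap_url("smoke_forecast", t) for t in times]
coverage_urls = [getcoverage_url("smoke_forecast", t) for t in times]
```

Only the URLs are constructed here; fetching them and handling service exceptions is left to the client.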

       

       

       


      4.     System Model of the Scenario

      Figure 3. Publish-Find-Bind Architecture adopted by Air Quality Community


       
      Figure 4. Context Diagram showing actors and entities external to GEOSS

      Figure 5. Enterprise Specification Diagram showing the enterprise components


      5.     Specialized Use Cases

      5.1 Register Resources    

      The AQ Community registered the AQ Community Catalog (AQComCat) as a Catalog/Component in the GEOSS Registry (Fig. 6). The registered AQComCat is 'found' through AQ Use Case 3, and its catalog content is harvested through the GEOSS Clearinghouse. The main actors in this use case are: (1) AQ Community Catalog Service Provider; (2) GEOSS Components and Services Registry (CSR), the registry of the GEOSS Common Infrastructure. The AQ Community deployed the AQComCat as a Web Accessible Folder (WAF). The catalog was then registered in the GEOSS CSR as a component, and the WAF was registered as a catalog service associated with the AQComCat component.

      Figure 6. AQ Community Catalog Registration in GEOSS CSR

      5.2 Service Deployment 

      This use case describes the conditions and steps to deploy web services for accessing air quality-related datasets. The main actors in this use case are the Data Service Provider/Mediator and the Metadata Service Provider/Mediator.
      1. Service provider implements WCS, WMS.
      2. Service provider configures the information about its service interface as provided in the service Capabilities document.
      3. AQ Community metadata record is created for the service:
        • Using the XSLT, an ISO 19115 metadata web form is auto-filled with the information from the GetCapabilities document.
        • The ISO 19115 metadata elements that are not auto-filled from the GetCapabilities input are completed manually.
        • The ISO record is validated by a metadata validator web service.
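
The semi-automated step above can be illustrated with a short sketch. The workgroup used an XSLT; the Python below is a stand-in that shows the same idea, not that stylesheet. The capabilities document is a trimmed, made-up WMS 1.1.1 example, and both the mapping table and the list of manually completed elements are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed WMS 1.1.1 capabilities document (not a real response).
CAPABILITIES = """<WMT_MS_Capabilities version="1.1.1">
  <Service>
    <Name>OGC:WMS</Name>
    <Title>Example smoke forecast WMS</Title>
    <Abstract>Hourly surface smoke concentration maps.</Abstract>
    <KeywordList>
      <Keyword>air quality</Keyword>
      <Keyword>smoke</Keyword>
    </KeywordList>
  </Service>
  <Capability>
    <Layer>
      <Title>Smoke forecast</Title>
      <LatLonBoundingBox minx="-125" miny="30" maxx="-110" maxy="40"/>
    </Layer>
  </Capability>
</WMT_MS_Capabilities>"""

# ISO 19115 element name -> path in the capabilities document (assumed mapping).
AUTO_FILL_PATHS = {
    "title": "Service/Title",
    "abstract": "Service/Abstract",
}

# Elements GetCapabilities does not supply; this list is illustrative only.
MANUAL_FIELDS = ["pointOfContact", "lineage", "temporalExtent"]


def autofill_iso_record(capabilities_xml):
    """Return (record, fields_needing_manual_entry) for one service."""
    root = ET.fromstring(capabilities_xml)
    record = {
        name: (root.findtext(path) or "").strip()
        for name, path in AUTO_FILL_PATHS.items()
    }
    record["keywords"] = [k.text.strip() for k in root.iter("Keyword") if k.text]

    bbox = root.find("Capability/Layer/LatLonBoundingBox")
    if bbox is not None:
        record["geographicBoundingBox"] = {
            "westBoundLongitude": float(bbox.get("minx")),
            "southBoundLatitude": float(bbox.get("miny")),
            "eastBoundLongitude": float(bbox.get("maxx")),
            "northBoundLatitude": float(bbox.get("maxy")),
        }

    # Anything left empty by the auto-fill also has to be typed in by hand.
    empty = [name for name, value in record.items() if not value]
    return record, MANUAL_FIELDS + empty


record, manual = autofill_iso_record(CAPABILITIES)
```

Here `record` holds the auto-filled elements and `manual` lists what a person would still need to complete before the record goes to the validator.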
         

        5.3 Publish, Harvest and Query Metadata 

        This use case describes the steps to publish, harvest and discover metadata through the AQ Community Catalog, the GEOSS Common Infrastructure (GCI) components (GEOSS Registry and GEOSS Clearinghouse), and the Air Quality uFIND (Fig. 7). The main actors in this use case are: (1) Data Service Provider/Mediator; (2) Metadata Service Provider/Mediator; (3) AQ Community Catalog Provider; (4) GEOSS Clearinghouse; (5) AQ Users. The AQ Community developed a prototype metadata record that will evolve into a convention for the AQ Community. This record has three parts: (1) metadata needed for data access; (2) metadata needed for discovery in the GEOSS Clearinghouse (common to all Earth observations); and (3) air quality specific metadata that is determined by AQ users for sharper queries in the AQ uFIND.
        Figure 7. AQ Community Metadata Records, Publish, Find 

        1. Publish metadata records created by the service provider or mediator in AQ Community Catalog WAF
        2. By GEOSS protocol, the Clearinghouse queries the GEOSS Registries for registered catalog components.
        3. Clearinghouse extracts from GEOSS Registry record for AQComCat harvest policies  
        4. Clearinghouse accesses the AQ WAF and harvests all or part of the available metadata holdings
          1. AQComCat permits harvesting 
          2. Clearinghouse harvests CSR holdings 
          3. Harvest to be repeated periodically (based on GEOSS CSR Registration) 
        5. Clearinghouse extracts discovery metadata records 
          1. Use discovery metadata to coarsely filter clearinghouse contents - (see Use Case 4 for using discovery metadata to filter)
          2. For each 'Discovered' record, returns discovery metadata + link to full submitted metadata 
        6. AQ uFIND consumes the GEOSS Clearinghouse Atom feed, which provides coarsely filtered AQ datasets.
          1. AQ uFIND extracts the AQ-Specific metadata and allows faceted search on space-time-parameter dimensions 
          2. Output of query is available in Atom, JSON, CSV and can be ingested by other client applications   
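
The faceted search in step 6 can be sketched as a filter over AQ-specific metadata records on the space, time and parameter dimensions. This is an illustration under stated assumptions, not the uFIND implementation: the record fields, titles, extents and dates are invented, and only the JSON and CSV outputs are shown.

```python
import csv
import io
import json

# Hypothetical AQ-specific metadata records; all values are made up.
RECORDS = [
    {"title": "Example smoke forecast (model)", "parameter": "PM2.5",
     "platform": "model", "bbox": [-125, 30, -110, 40],
     "start": "2007-10-21", "end": "2007-10-26"},
    {"title": "Example surface PM2.5 network", "parameter": "PM2.5",
     "platform": "surface", "bbox": [-130, 20, -60, 55],
     "start": "2000-01-01", "end": "2009-06-30"},
    {"title": "Example ozone satellite product", "parameter": "O3",
     "platform": "satellite", "bbox": [-180, -90, 180, 90],
     "start": "2004-10-01", "end": "2009-06-30"},
    {"title": "Example European PM2.5 network", "parameter": "PM2.5",
     "platform": "surface", "bbox": [-10, 35, 30, 70],
     "start": "2005-01-01", "end": "2009-06-30"},
]


def overlaps_bbox(a, b):
    """True if two [west, south, east, north] boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def facet_search(records, bbox=None, time_range=None, parameter=None):
    """Keep records matching every facet that was supplied."""
    hits = []
    for rec in records:
        if bbox is not None and not overlaps_bbox(rec["bbox"], bbox):
            continue
        # ISO 8601 dates compare correctly as plain strings.
        if time_range is not None and (
            rec["end"] < time_range[0] or rec["start"] > time_range[1]
        ):
            continue
        if parameter is not None and rec["parameter"] != parameter:
            continue
        hits.append(rec)
    return hits


def to_json(records):
    """Serialize the result set as JSON."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Serialize the scalar fields of the result set as CSV."""
    fields = ["title", "parameter", "platform", "start", "end"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


hits = facet_search(
    RECORDS,
    bbox=[-121, 32, -114, 36],
    time_range=("2007-10-21", "2007-10-26"),
    parameter="PM2.5",
)
```

Because each facet is optional, a client can start with a broad query and narrow it one dimension at a time.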

        This use case describes the conditions and steps for portals and application clients to support the AQ user in using data found through the GEOSS Clearinghouses and Community Catalogs. The actors in this use case are: (1) Client Apps: Portals, WebApps, DeskApps; (2) AQ Community Catalog; (3) AQ users (i.e. Analysts, Public). Client applications can use the output of the uFIND (Fig. 8) or they can build directly on the GEOSS Clearinghouse.
        Figure 8. Air Quality Client Applications 

        Client applications are customized to allow users to search and display data in formats they are comfortable with. Client apps present the user with search criteria based on the queryable properties of the selected catalogs; client developers therefore need to know what type of metadata the community is using. Additional search options could be:
          1. Simple Keyword and Area of Interest/box search
          2. Advanced parameter searches (Organization, Time ...; Region/place of interest; Community Catalogs to be searched, SBA; Keyword in abstract, title, full text; Resource type (e.g. service, workflow, document, client app, portlet, alert, etc.); Other)
          3. More specific Earth-Observation criteria (Row/path; Collection; Cloud cover; Subsetting; Ordering/delivery)
          4. Community-specific search (Thesaurus, Cluster & Pattern matching)
          5. Current Clearinghouse supports: Title; Record type; Full text; Abstract; Identifier; Bbox; Return format (not really, not all)
        The result set is returned and presented to the user with options to:
        • Display total number of results
        • Display basic information about each result (e.g. thumbnail, title, abstract, organization, etc)
        • Expand results to display additional information (e.g. full metadata record) from Community Catalog
        • Group or sort results in categories (e.g. by type of resource, by SBA, etc)
        • Select resources of interest for evaluation and/or use.
        • Outputs of the client applications also may be mashed into other applications and continue down the value chain. 
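
The simple keyword search described above could be sent to a CSW 2.0.2 catalog as a key-value-pair GetRecords request. The sketch below only builds the request URL. The endpoint is a hypothetical placeholder, and the `AnyText` queryable and CQL text constraint are assumptions, since supported queryables and constraint languages vary between catalog implementations.

```python
from urllib.parse import urlencode

# Hypothetical catalog endpoint; replace with a real CSW service URL.
CSW_ENDPOINT = "https://catalog.example.org/csw"


def build_getrecords_url(keyword, max_records=10, endpoint=CSW_ENDPOINT):
    """Build a CSW 2.0.2 GetRecords (KVP) URL for a full-text keyword search."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "summary",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        # Assumed queryable name; check the catalog's GetCapabilities response.
        "constraint": f"AnyText LIKE '%{keyword}%'",
        "maxRecords": str(max_records),
    }
    return f"{endpoint}?{urlencode(params)}"


search_url = build_getrecords_url("air")
```

A portal would fetch `search_url`, parse the returned records, and present the title, abstract and links for each hit.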

        6.1     Demonstration

        The primary goal of the AQ workgroup was to test and evaluate the GEOSS Common Infrastructure for AQ applications such as the analysis of smoke events. Specifically, we were interested in connecting data processing and analysis applications to the variety of air quality data now accessible through the GCI. There are numerous Earth Observations that are available and, in principle, useful for air quality applications such as informing the public and enforcing AQ standards. The GEOSS Common Infrastructure allows the reuse of observations and models for multiple purposes; even in the narrow application of wildfire smoke, observations and models can be reused. However, connecting a user to the right observations or models is accompanied by an array of hurdles:

        “The user cannot find the data;
        if he can find it, he cannot access it;
        if he can access it, he doesn't know how good they are;
        if he finds them good, he cannot merge them with other data.”

         The Users View of IT, NAS 1989

         The GEO Architecture and Data Committee (ADC) and User Interface Committee (UIC) are both participating stakeholders in the functioning of a GEOSS information system that overcomes these hurdles. The UIC is in a position to formulate questions, and the ADC can provide infrastructure that delivers the answers. Data reuse is possible through the service-oriented architecture of GEOSS:
        • Service providers register services in the GEOSS Clearinghouse.
        • Users discover the needed service and access the data.

         The result is a dynamic binding mechanism for the construction of loosely coupled workflow applications. The primary purpose of the metadata is to facilitate finding and accessing the data, which addresses the first two hurdles that users face. Clearly, the air quality specific metadata, such as sampling platform, data domain and measured parameters, need to be defined by air quality users. The hurdles of data quality and multi-sensory data integration are topics for future efforts.

         Finding air quality data is accomplished in two stages: (1) the data are filtered through the generic discovery mechanism of the clearinghouse; (2) air quality specific filters, such as sampling platform and data structure, are then applied. Once the data are accessible through standard service protocols and discoverable through the clearinghouse, they can be incorporated and browsed in any application, including the ESRI and Compusult GEO Portals. The registered datasets are also directly accessible to air quality specific, workflow-based clients, which can perform value-adding data processing and analysis. The loose coupling between the growing data pool in GEOSS and workflow-based air quality client software shows the benefits of the Service Oriented Architecture to the Air Quality and Health Societal Benefit Area.

        6.2     Next Steps

        AIP-2 moved the AQ community forward in its interaction with the GCI. The process of testing and evaluating the GCI, and of developing community-focused components to work with it, identified remaining challenges in effectively using the GCI for air quality applications. The following topics have been identified to be addressed as the AIP moves forward:

        1. Coordinate the operation of Clearinghouses so that the focus can be on the creation of value-adding services.

        A concerted effort is required to identify a core set of metadata elements needed for the Clearinghouses, so that the metadata contents of community catalogs can fulfil those requirements and so that portals and applications submitting searches to the Clearinghouses have a clear understanding of what is retrievable from them.

        2. Define a process for defining standards implementation conventions for a community and test engines to validate standards implementation compliance/conformance.

        Services registered in the GEOSS Registry can claim to adhere to a particular standard. However, implementations of a standard can vary, and when an application or portal wishes to make use of a data service it presently has no confirmation that the service has implemented the standard according to the same convention used in the application/portal. We suggest using compliance/conformance test engines to verify the implementation of standards, and including the designation of standards compliance in metadata records so that it can be used in GEOSS Clearinghouse search criteria. For example, an OGC WCS service that serves netCDF-CF data could go through two compliance tests: one to confirm that the implementation of the WCS follows a particular convention, and a second to confirm that the implementation of netCDF-CF follows a particular convention. This would allow an application to search the Clearinghouse for WCS services that serve netCDF-CF and also filter for those services that have gone through the compliance testing process.

        3. Engage more providers to register data services

        The process for registering and finding data services through the GCI needs to be refined based on data providers' needs and expectations. The next phase of AIP should involve more data providers who can evaluate the progress made in AIP-2 and tune the process to be more relevant to their interests.

        4. Continue engaging the UIC to formulate and evaluate User Requirements on the GCI

        The ultimate goal of GEOSS is to improve the availability and usefulness of data and information for decision makers. The next phase of AIP should bring in more users into the process for evaluating the ability to make better use of air quality data through the GCI and community infrastructure.

        5. Incorporate other features and requirements of the general air quality scenario

        Some of the focus in AIP-2 was on near-real-time events, in this case a wildfire. The general air quality scenario includes other aspects of air quality science and management, including retrospective analysis, that should be incorporated in the next phases of AIP.

        7.     References


        [1] A listing of all AIP-2 Engineering Reports: http://www.ogcnetwork.net/AIP2ERs

        [2] ESIP Air Quality Workgroup, Air Quality Scenario for GEOSS AIP, http://wiki.esipfed.org/index.php/GEOSS_AIP_AQ_Scenario  
