https://6283588568219177751-a-1802744773732722657-s-sites.googlegroups.com/site/geosspilot2/air-quality-and-health-working-group/aq-documents/ESRIGEOPortal-satellitesearch.pdf?attredirects=0&auth=ANoY7cpM6IK6k4An97GcebNpbLkS5nTuz1cvXmtAwOJ01nda0Olyl2uQzkNKgKkoy-ihLlX41NUfmHedIuxR8UD4kpF1mMyzSoApFruz1au4-BP8g1NDBxOabnyYNC9bTHebVQ6eEAqaVN7rojwT7KtBWCrpom1KL2I6MRwjh0jVnHFMDDnrz1stuTNVWHRbqtRkjktfyREC7RW2-U5w6nIV47L-rwpE-5Mfdr6zoBApojA0QG4XpjPK85tjpdwXj6BPZ5TYuvbBPSxi4I7SSHKLCU78M9jSqFZX1LXlJDjv0Ztx7WOAVGg%3DW
Transverse Use Cases 1-10 are activity / capability elements common to most SBA scenarios. This is a workspace for the implementation of this Use Case that is specific to the Air Quality/Health SBA. The implementation of the UC components is accessible through the links.
User perspective of searching for data via the GEO and Community portals
Test searches and successes/limitations with Portals, Clearinghouses and/or ISO metadata content.
Green indicates a search that returns results as desired; Yellow indicates a search that returns results but not as desired; Red indicates a failed search either due to the limitations with the search client or due to limitations in the available metadata.
1) Search the GEOSS Clearinghouses for services that were harvested via the AQ Community Catalog. The AQ Community Catalog was registered in the GEOSS Component and Service Registry as a Catalog WAF. The Clearinghouse finds catalogs through the registry.
1a) Search the GEO Portals (Compusult, ESA, ESRI) for initial discovery queries.
Each GEO portal has different query and return capabilities. The GEO portals were compared to find the minimum set of fields that the metadata must include in order to be searchable by each portal. Below is a table that illustrates the mapping between CSW:record, our metadata, and the USGS and ESRI Clearinghouse Search/Return fields.
1b) Search the GEOSS Clearinghouses (ESRI, FGDC, Compusult) from the AQ Community Portal for a more specific AQ search.
1c) Identify successes and limitations in GEOSS Clearinghouse metadata searching (both in the metadata stored in the GEOSS registry and in the search tools).
2) Identify air quality specific/customized metadata searches that might be appropriate from the AQ Community Portal
Modified by Gianni Sotis 20090210:
Why is there a split here between the GEO Portal and the CRH?
The CRH will be accessed through the GEO Portal.
At least the word "directly" could be removed.
Response by Stefan Falke 20090210:
Removed 'directly' and replaced with 'AQ Community Portal'
Also moved this discussion to Comments section of page below
Semantic search is an emerging approach that uses Semantic Web technologies, such as ontology and reference engines, to address problems that traditional keyword-based search cannot solve. For example, keyword-based search cannot (1) make use of domain knowledge, (2) support scientific understanding in a query, or (3) support the intelligent reasoning and semantic modeling accumulated in scientific domains. By reasoning over a highly interconnected network of data and ontologies, semantic search engines are intended to understand a user's query at both the syntactic and semantic levels and to return more scientifically relevant answers.
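As a rough illustration of the difference from keyword-only matching, the sketch below expands a query term through a tiny hand-written ontology before matching. The ontology entries, function names, and sample title are all hypothetical; this is not the ESIP prototype, only a minimal sketch of ontology-based query expansion.

```python
# Toy ontology: each term maps to narrower or related terms.
# All entries are hypothetical and exist only to illustrate the idea.
TOY_ONTOLOGY = {
    "aerosol": ["particulate matter", "pm2.5", "smoke"],
    "particulate matter": ["pm10"],
}


def expand_query(term, ontology, max_depth=2):
    """Return the term plus related terms reachable within max_depth links."""
    seen = [term.lower()]
    frontier = [term.lower()]
    for _ in range(max_depth):
        next_frontier = []
        for current in frontier:
            for related in ontology.get(current, []):
                if related not in seen:
                    seen.append(related)
                    next_frontier.append(related)
        frontier = next_frontier
    return seen


def keyword_match(text, terms):
    """True if any of the terms occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(t in lowered for t in terms)


title = "Daily PM10 concentrations"
plain_hit = keyword_match(title, ["aerosol"])
expanded_hit = keyword_match(title, expand_query("aerosol", TOY_ONTOLOGY))
```

Here the plain keyword search for "aerosol" misses the record, while the expanded term list reaches "pm10" through two links in the toy ontology and finds it.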
To address the problems of traditional search, semantic search is usually tied to a specific science domain, such as Earth science or Air Quality. We have been working closely with domain experts in the water cycle community to build a knowledge base of ontology-represented knowledge for in-domain search, for projects such as WECHO (http://eie.esipfed.org/c/portal/layout?p_l_id=PUB.1.428). Based on the ontology, we are developing rule-based inference algorithms on top of existing reasoning tools to provide semantic navigation for end users. A prototype that employs these techniques is available through the ESIP portal (eie.esipfed.org). The semantic search engine also connects to multiple major Earth science catalogs, such as GOS, NCDC, GCMD, ECHO, and ESG.
In this AIP work, we are working together to improve the semantic search engine so that it combines reasoning with initial ranking algorithms to sort the search results from the AQ catalog (WAF, Z39.50, and CSW) and other connected catalogs for Air Quality resources. The targets are to 1) develop a simple CSW client for the AQ portal hosted by Stefan, 2) increase the volume of resources by adding other catalogs, and 3) add the previously developed semantic search capabilities to the simple client. The objective is to improve the search experience for the Air Quality community by leveraging knowledge gained through Air Quality communities. The knowledge base will be expanded through close collaboration with Air Quality domain experts, using the SWEET ontology framework and existing ontologies accumulated through the ESIP portal effort.
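For the "simple CSW client" target, a request could be assembled roughly as follows. This is a sketch only: the endpoint is a placeholder, the parameter names follow the CSW 2.0.2 key-value GetRecords binding as commonly implemented, and the queryables and constraint syntax a given catalog accepts should be confirmed against its GetCapabilities response.

```python
from urllib.parse import urlencode


def build_getrecords_url(endpoint, search_text, max_records=10):
    """Build a CSW 2.0.2 GetRecords URL with a full-text CQL constraint.

    `endpoint` is a placeholder for a real CSW base URL (assumption).
    """
    # Double any single quotes so the CQL string literal stays well-formed.
    safe_text = search_text.replace("'", "''")
    params = [
        ("service", "CSW"),
        ("version", "2.0.2"),
        ("request", "GetRecords"),
        ("typeNames", "csw:Record"),
        ("resultType", "results"),
        ("elementSetName", "summary"),
        ("constraintLanguage", "CQL_TEXT"),
        ("constraint_language_version", "1.1.0"),
        ("constraint", "AnyText LIKE '%{}%'".format(safe_text)),
        ("maxRecords", str(max_records)),
    ]
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint, used only to show the shape of the request.
example_url = build_getrecords_url("http://example.org/csw", "ozone")
```

The function only builds the URL; fetching and parsing the response is left out so the sketch does not depend on any particular catalog being online.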
Hmmm... Not really in our case. Below is a list of the elements supported in our REST interface:
Tweak the search by using one or more of the following parameters. Combinations are ‘and-ed’ together:
bbox - bounding box of the query. Under the default spatialRel, each record found has to have its envelope completely enclosed within the defined bounding box. The bounding box is defined by two pairs of coordinates representing the west-south and east-north corners of the envelope, separated by a comma (,). Each corner is a longitude-latitude pair of values separated by a comma (,). Default: -180,-90,180,90 (entire world).
spatialRel - spatial relationship. Possible values: esriSpatialRelWithin (the metadata envelope has to fit completely within the request bounding box), esriSpatialRelOverlaps (the metadata envelope has to at least overlap the request bounding box). Default: esriSpatialRelWithin. Used in conjunction with the bbox parameter.
searchText - text to be searched within metadata.
contains - true to search for any word specified in the searchText parameter, false to perform an exact search. Used in conjunction with the searchText parameter. Also accepts any of the values defined in ISearchFilterKeyword.KeySearchTextOptions. Default: true.
contentType - search metadata for the specific content type. Accepts names defined in SearchEngineCSW.AimsContentTypes. Default: none
dataCategory - search metadata for the specific data category. Accepts any set of the following keywords separated by commas (,): farming, biota, boundaries, climatologyMeteorologyAtmosphere, economy, elevation, environment, geoscientificInformation, health, imageryBaseMapsEarthCover, intelligenceMilitary, inlandWaters, location, oceans, planningCadastre, society, structure, transportation, utilitiesCommunication. Default: empty set.
after - metadata updated after a certain date, given as 'yyyy-mm-dd'.
before - metadata updated before a certain date, given as 'yyyy-mm-dd'.
orderBy - sort parameter. Accepts any of the values defined in SearchFilterSort.OptionsSort. Default: SearchFilterSort.OptionsSort.dateDescending.
max - maximum number of records in the feed. Default: 10.
geometryType - defines how spatial data will be represented in the feed. Possible values are: esriGeometryPoint, esriGeometryPolygon, esriGeometryBox. Default: esriGeometryPolygon.
f - output format. Possible values are: georss, kml, html, htmlfragment. Default: georss. If htmlfragment is selected, a body-less HTML snippet will be generated.
style - style URL. Array of URLs of the Cascading Style Sheet (*.css) files used when the html format is chosen. URLs are separated by commas (,).
target - links target. Possible values are: blank, parent, self, top (just like the HTML "target" attribute, except with no leading underscore '_'). Default: blank. It affects every link generated in the GeoRSS feed. Example: .../rest/find/document?target=self
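The parameters listed above can be combined into a single request URL. The following is a minimal sketch that covers only a subset of them; the base URL is a placeholder (the source shows only the `.../rest/find/document` suffix), and the function and argument names are illustrative.

```python
from urllib.parse import urlencode

# Values taken from the parameter list above.
ALLOWED_SPATIAL_REL = {"esriSpatialRelWithin", "esriSpatialRelOverlaps"}
ALLOWED_FORMATS = {"georss", "kml", "html", "htmlfragment"}


def build_find_url(base_url, search_text=None, bbox=None, spatial_rel=None,
                   data_category=None, after=None, before=None,
                   max_records=None, fmt=None):
    """Assemble a query URL; omitted parameters fall back to server defaults."""
    params = []
    if search_text:
        params.append(("searchText", search_text))
    if bbox is not None:
        # bbox is (west, south, east, north), as longitude/latitude values.
        params.append(("bbox", ",".join(str(v) for v in bbox)))
    if spatial_rel is not None:
        if spatial_rel not in ALLOWED_SPATIAL_REL:
            raise ValueError("unsupported spatialRel: " + spatial_rel)
        params.append(("spatialRel", spatial_rel))
    if data_category:
        params.append(("dataCategory", ",".join(data_category)))
    if after:
        params.append(("after", after))    # 'yyyy-mm-dd'
    if before:
        params.append(("before", before))  # 'yyyy-mm-dd'
    if max_records is not None:
        params.append(("max", str(max_records)))
    if fmt is not None:
        if fmt not in ALLOWED_FORMATS:
            raise ValueError("unsupported output format: " + fmt)
        params.append(("f", fmt))
    return base_url + "?" + urlencode(params)


# Placeholder host; only the /rest/find/document suffix comes from the source.
find_url = build_find_url(
    "http://example.org/geoportal/rest/find/document",
    search_text="ozone",
    bbox=(-125, 24, -66, 50),
    spatial_rel="esriSpatialRelOverlaps",
    data_category=["health", "environment"],
    max_records=25,
    fmt="kml",
)
```

Because the parameters are and-ed together, this example would ask for records that mention "ozone", overlap the given box, and are tagged with both listed data categories, returned as KML.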
USGS Search Info - Conversations between Archie Warnock and Erin Robinson
From: Archie Warnock <email@example.com>
**** Resolution was to include the catalog ID in the metadata records so that we can find them regardless of whether USGS or others happen to tag on collection/component information. We can search for the component ID in the clearinghouse and find our GEOSS-registered components and all data access services too.
*** Resolution: Fixed
Record Type: (March 6)
Erin Robinson wrote: When I import the record, I look at the metadata standard element and version to try to figure it out. Actually, I just look at those to find the string '19115' and if that's there, I assume what it is. Depending on how we wind up interpreting things, that could change. Right now, I'm kinda uninterested in what the metadata record is pointing to - it's all I can do to keep track of the metadata records themselves.
One other quick thing - is there a field in ISO 19115 that you pull for record type?
*** Resolution: Wait and Check again to see if there is a way to connect record type to associated service.
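The '19115' check described in the Record Type exchange above could look roughly like this. It is a sketch, not the clearinghouse's actual import code: it assumes ISO 19139-encoded records that use the standard `gmd` and `gco` namespaces, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Standard ISO 19139 namespace URIs.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def looks_like_iso19115(xml_text):
    """True if the metadata standard name or version mentions '19115'."""
    root = ET.fromstring(xml_text)
    for tag in ("metadataStandardName", "metadataStandardVersion"):
        path = ".//gmd:{}/gco:CharacterString".format(tag)
        for node in root.findall(path, NS):
            if node.text and "19115" in node.text:
                return True
    return False
```

As the reply notes, this only classifies the metadata record itself; it says nothing about what kind of service or dataset the record points to.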
Erin Robinson wrote:
Our WMS records <http://clearinghouse.awcubed.com/cgi-bin/ch_query?cmd=search&count=20&startIndex=0&keyword=Data+Access+Service&title=&type=&collection_name=&rawtext=WebMapService&abstract=&identifier=&bbox=&format=HTML&.submit.x=66&.submit.y=1> have the string. The kind of neat thing is that if you search full text for WebMapService, the return is a few other records that are using the WMS URN and a lot of WCS records include the URN for WCS. I "faked" the service type as a queryable parameter in the search interface we've copied from you: http://capita.wustl.edu/AQ_CommCat_Client_Ed.htm
I don't track service type on a record-by-record basis since for some (like FGDC records) it doesn't make sense. Not everything in the clearinghouse is a service record. That's a clever hack, though.
Erin Robinson wrote:
rawtext is what I use for full-text search. I take the XML record and strip out all of the tags and attributes. What's left is the full text of the document (the Registry records are an exception since most of their useful text is actually in attributes). The value should be whatever search terms you want.
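The tag-stripping step described above can be sketched with the Python standard library. This is an illustration of the idea rather than the clearinghouse's implementation, and the sample record is invented.

```python
import xml.etree.ElementTree as ET


def raw_text(xml_text):
    """Return the text content of an XML record with tags and attributes removed."""
    root = ET.fromstring(xml_text)
    # itertext() yields only text nodes, so element names and attribute
    # values never reach the output; split/join collapses whitespace runs.
    return " ".join(" ".join(root.itertext()).split())


sample = '<rec id="42"><title>Ozone WMS</title><abstract>Daily  maps</abstract></rec>'
text = raw_text(sample)
```

The attribute value `42` is dropped along with the tags, which is also why records whose useful text lives mostly in attributes (the Registry records mentioned above) need separate handling.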
It can return (right now) HTML, Atom and an internal XML result format (that's pretty easy to use). Plain text and Dublin Core are on the to-do list.
The two-step to get the harvested XML is to use the basic search to find the local document URN and then use "cmd=present&uuid=NNNNN". I'm still tweaking the interface, though - better (when I get it running) to use the CSW one or OpenSearch. Sheesh... I should write a manual someday ;-)
*** Resolution - follow up to see if we can have multiple raw text searches.
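The two-step retrieval described above (search first, then `cmd=present` with the document's uuid) can be sketched as a pair of URL builders. The base URL and parameter names are copied from the search link quoted earlier in this thread; the function names are illustrative, only the `HTML` format value appears in the source, and the interface was described as still changing, so treat this as a sketch.

```python
from urllib.parse import urlencode

# Base URL as it appears in the search link quoted in this thread.
CH_QUERY = "http://clearinghouse.awcubed.com/cgi-bin/ch_query"


def search_url(rawtext, keyword="", count=20, start_index=0, fmt="HTML"):
    """Step 1: full-text search to locate the local document identifier."""
    params = [
        ("cmd", "search"),
        ("count", count),
        ("startIndex", start_index),
        ("keyword", keyword),
        ("rawtext", rawtext),
        ("format", fmt),
    ]
    return CH_QUERY + "?" + urlencode(params)


def present_url(uuid):
    """Step 2: request the harvested XML for one document."""
    return CH_QUERY + "?" + urlencode([("cmd", "present"), ("uuid", uuid)])


step1 = search_url("WebMapService", keyword="Data Access Service")
step2 = present_url("NNNNN")  # NNNNN stands in for a uuid found in step 1
```

Both functions only build URLs; extracting the uuid from the step 1 response depends on the result format and is not shown.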