Parameters and their values are the essence of data in a data repository. One of the main purposes of a parameter is the condensation of a complex reality into a single value. Its utility lies in comparison with the ‘same’ parameter in similar situations elsewhere. Within a data source (file), however, the ‘meaning’ of a parameter is only implied by its context, i.e. by the data set in which it is found, with further information implied by its name. Such implied concepts rely on human interpretation, especially when they are to be related to ‘similar’ or even the same parameters in other data sets.
The goal of the knowledge base of CHEMCONNECT is to formalize these concepts (through the use of ontologies) and to take one step closer to automating the resolution of ambiguity and the comparison of parameters coming from different sources.
To illustrate, let us look at a parameter giving a temperature value. Which temperature is intended can usually be interpreted (not necessarily automatically) from the name of the parameter, such as ‘experimental temperature’, ‘water bath temperature’, ‘measured temperature’, etc., or from the context, i.e. the data source in which it is found, which provides information as to the intention and meaning of the parameter. Though the name label implies meaning, there are no universal standards, and even within a domain community there is rarely a consensus. For example, the label ‘temperature’ alone could be given as simply as ‘T’ or ‘Temp’, or spelled out completely. The label could also be complicated by including the units. This brings up another source of ambiguity, namely the units used for the value of the parameter. The difficulty lies not in the ambiguity of the value itself, but in the comparison with other similar reported results, which may not be in the same units. Though SI units should be the units of choice, sometimes they are not used, for historical reasons or for convenience within a community. And even if an SI unit is used, there is still a choice of, for example, joules, millijoules, kilojoules, etc.
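The comparison problem described above can be made concrete with a minimal sketch. The table `TO_JOULE` and the function `to_joule` are hypothetical names introduced only for illustration and are not part of CHEMCONNECT; the sketch assumes that two values become comparable once both are expressed in a common base unit.

```python
import math

# Hypothetical conversion factors from each label to the base SI unit (joule).
TO_JOULE = {"J": 1.0, "mJ": 1e-3, "kJ": 1e3}


def to_joule(value: float, unit: str) -> float:
    """Convert an energy value to joules; reject unit labels we do not know."""
    try:
        return value * TO_JOULE[unit]
    except KeyError:
        raise ValueError(f"unknown energy unit: {unit!r}") from None


# The same quantity reported by two sources in different SI multiples.
first = to_joule(2.5, "kJ")
second = to_joule(2_500_000, "mJ")

# Compare with a tolerance, since the scaled values are floating point.
comparable = math.isclose(first, second)
```

Without the explicit unit label attached to each value, the two numbers 2.5 and 2500000 could not be recognized as the same quantity.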
What the knowledge base of CHEMCONNECT does (through the ontology representation) is give the parameter a (standardized) context and relationships to other information, and add supplementary information to the parameter. The context of the parameter is given by the parameter’s placement in a hierarchy of concepts. For example, the pressure measurements in an RCM experiment (RCMCompressionPressure) can be found in the hierarchy under RCMPressureMeasurement (all pressure measurements of a rapid compression machine), which in turn is under PressureParameter (all pressure measurements). In addition, the purpose of the RCMCompressionPressure is labeled as a fundamental experimental measurement (FundamentalExperimentalMeasurement), and the general concept associated with the parameter is that it is an experimental measurement of the rapid compression machine (RCMExperimentalMeasurement). The unit expected for the parameter is given in the parameter specification as a unit class, in this case pressure (PressureOrStressUnit). It is only when the actual parameter is given that the specific units are specified, for example bar or atmospheres. The knowledge base helps in conversion by providing conversions between the different units of the class.
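As a rough illustration of the two ideas in this paragraph, the sketch below stores the concept hierarchy as parent links and converts between two units of the pressure class. It is not the CHEMCONNECT ontology or its API: the names `PARENT`, `ancestors`, `is_a`, `PRESSURE_TO_PASCAL` and `convert_pressure` are assumptions made for this example, and only the concept names themselves are taken from the text.

```python
# Parent links for the concepts named in the text (child -> parent).
PARENT = {
    "RCMCompressionPressure": "RCMPressureMeasurement",
    "RCMPressureMeasurement": "PressureParameter",
}


def ancestors(concept: str) -> list:
    """Return the chain of parent concepts, nearest parent first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def is_a(concept: str, ancestor: str) -> bool:
    """True if `concept` is `ancestor` or lies below it in the hierarchy."""
    return concept == ancestor or ancestor in ancestors(concept)


# Units of the pressure class, each with its factor to pascal
# (1 bar = 100000 Pa, 1 standard atmosphere = 101325 Pa).
PRESSURE_TO_PASCAL = {"bar": 1.0e5, "atm": 101325.0}


def convert_pressure(value: float, from_unit: str, to_unit: str) -> float:
    """Convert between two units of the same class via the common base unit."""
    return value * PRESSURE_TO_PASCAL[from_unit] / PRESSURE_TO_PASCAL[to_unit]
```

With these definitions, `is_a("RCMCompressionPressure", "PressureParameter")` holds, and a value reported in atmospheres can be compared with one reported in bar after calling `convert_pressure`.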
A parameter is used and defined on several levels:
· Parameter Specification: Within the knowledge base, a parameter type with a specific label is defined with the unit type (unit class), uncertainty value type, a purpose, a concept and whether it is an input (dimension) or an output (measure).
· Attribute: Within a device definition as a description or in an observation as a parameter, the parameter concept is specified.
· Value Specification: In the definition of the observation specification, the parameter is specified further by giving the specific unit of the parameter value and its correspondence to the source parameter.
· Value: The actual value corresponding to the specification.
The set of parameters used in the knowledge base ontology is found within a tree of parameter concepts. At the top level, there are two fundamental types of parameters:
· Fixed Parameter: This is where the label given in the specification is used as is and identifies the parameter.
· Dynamic Parameter: This is where the label is specified dynamically in its use. A typical example is a species parameter with the labels being the species names.
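The distinction between the two types can be sketched as follows. The classes `FixedParameter` and `DynamicParameter` and the argument `names_in_use` are hypothetical and only illustrate the idea that a fixed parameter carries its own label while a dynamic parameter receives its labels where it is used; the species names in the usage note are likewise placeholders.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class FixedParameter:
    label: str  # the label itself identifies the parameter

    def labels(self, names_in_use: Sequence[str] = ()) -> List[str]:
        # A fixed parameter ignores the context and always yields its own label.
        return [self.label]


@dataclass(frozen=True)
class DynamicParameter:
    concept: str  # what kind of thing the labels will name, e.g. species

    def labels(self, names_in_use: Sequence[str] = ()) -> List[str]:
        # A dynamic parameter takes its labels from where it is used.
        return list(names_in_use)
```

For instance, `DynamicParameter("species").labels(["CH4", "O2"])` yields one label per species name, whereas `FixedParameter("RCMCompressionPressure").labels()` always yields the single label it was defined with.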
The parameter specification is:
· Label: This is the label used to identify the type of parameter, for example, within other specifications such as observation and device parameter specifications.
· Unit Class: This is the class of units of the parameter.
· Typical Value: A typical (default) value and unit for the parameter is given.
· Purpose: The purpose of the parameter.
· Concept: The general concept represented by the parameter.
· Type: Fixed or dynamic parameter.
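The fields listed above map naturally onto a small record type. The following is a minimal sketch, not the CHEMCONNECT data model: the class name `ParameterSpecification`, its field names and the `kind` values are assumptions. The example instance uses the pressure concepts named earlier in the text; treating it as a fixed parameter is an assumption, and its typical value is left unset because the text does not state one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParameterSpecification:
    label: str        # identifies the type of parameter
    unit_class: str   # class of units, not a specific unit
    purpose: str
    concept: str
    kind: str         # "fixed" or "dynamic"
    typical_value: Optional[float] = None  # default value, if one is given
    typical_unit: Optional[str] = None     # default unit, if one is given

    def __post_init__(self) -> None:
        # Only the two top-level parameter types are allowed.
        if self.kind not in ("fixed", "dynamic"):
            raise ValueError(f"kind must be 'fixed' or 'dynamic', got {self.kind!r}")


rcm_pressure = ParameterSpecification(
    label="RCMCompressionPressure",
    unit_class="PressureOrStressUnit",
    purpose="FundamentalExperimentalMeasurement",
    concept="RCMExperimentalMeasurement",
    kind="fixed",  # assumption: not stated explicitly in the text
)
```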
In the specification of the attribute descriptions of a catalog object, such as a device or protocol specification, the attribute parameter specification is given. In the instantiation of the catalog object, for example in defining a device characterization such as a heat flux burner (HeatFluxBurner), the specific unit and value of each attribute are given. For example, for the burner plate diameter (BurnerPlateDiameter) of the heat flux burner, a default unit of centimeter and a default value of 3 are given within the specification.
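One way to picture the relation between the specification and an instantiation is a lookup that falls back to the specified defaults. This is only a sketch under that assumption; `AttributeSpec`, `HEAT_FLUX_BURNER_SPEC` and `instantiate` are hypothetical names, and only the attribute BurnerPlateDiameter with its default of 3 centimeter comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class AttributeSpec:
    label: str
    default_value: float
    default_unit: str


# Attribute specifications of the HeatFluxBurner catalog object.
HEAT_FLUX_BURNER_SPEC = {
    "BurnerPlateDiameter": AttributeSpec("BurnerPlateDiameter", 3.0, "centimeter"),
}


def instantiate(
    spec: Dict[str, AttributeSpec],
    given: Optional[Dict[str, Tuple[float, str]]] = None,
) -> Dict[str, Tuple[float, str]]:
    """Return {label: (value, unit)}, using the defaults where nothing is given."""
    given = given or {}
    unknown = set(given) - set(spec)
    if unknown:
        raise KeyError(f"attributes not in the specification: {sorted(unknown)}")
    return {
        label: given.get(label, (s.default_value, s.default_unit))
        for label, s in spec.items()
    }
```

Calling `instantiate(HEAT_FLUX_BURNER_SPEC)` yields the default diameter, while passing `{"BurnerPlateDiameter": (30.0, "millimeter")}` records the value and unit supplied for a particular burner.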
In an observation specification, a set of parameter specifications is given, representing the standard values for that particular observation. Within the observation, they are classified as input parameters (dimension) or output parameters (measure). For example, in the definition of the standard reporting from a rapid compression machine, the input parameters represent the experimental conditions (pressure and temperature) and the output parameters represent the measured values, including ignition delay times.
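The dimension/measure classification can be sketched as a role attached to each parameter of the observation. The mapping `OBSERVATION_SPEC`, the function `split_observation`, the plain-text parameter names and the numbers in `row` are all made up for illustration; only the roles of pressure, temperature and ignition delay time follow the text.

```python
# Role of each parameter in a rapid compression machine observation.
OBSERVATION_SPEC = {
    "pressure": "dimension",           # input: experimental condition
    "temperature": "dimension",        # input: experimental condition
    "ignition delay time": "measure",  # output: measured value
}


def split_observation(row, spec=OBSERVATION_SPEC):
    """Split one observation row into (inputs, outputs) according to the spec."""
    inputs = {name: v for name, v in row.items() if spec[name] == "dimension"}
    outputs = {name: v for name, v in row.items() if spec[name] == "measure"}
    return inputs, outputs


# Hypothetical values, for illustration only.
row = {"pressure": 15.0, "temperature": 900.0, "ignition delay time": 12.5}
inputs, outputs = split_observation(row)
```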
Whether a parameter appears as an attribute or in an observation, the specifications set forth in the catalog specifications are used to interpret the values given. In a catalog object instantiation, such as a device description, the values, with the correct corresponding units, are given through the interface. In an observation, the set of parameter values is given through the data source.
A set of observations corresponding to a protocol entails, for each parameter, a specification (the Value Specification) of the data to be found in the data source file (the Value), in particular the units and the correspondence within the source file. The parameter specification (parameter concept) is found in the observation specification, which lies within the protocol specification in the knowledge base.
In the standard reporting of ignition delay times from rapid compression machines, for example from the files given at the University of Connecticut (http://combdiaglab.engr.uconn.edu/database/rcm-database), two blocks of data are specified in the Rapid Compression Machine ignition delay time reporting protocol (RapidCompressionMachineReportingProtocol): the fuel composition and several parameters. In the value specification (Value Specification) of the ignition delay time (IgnitionDelayTime), the time units are milliseconds and the value corresponds to the column of the spreadsheet labeled ‘Ignition Delay (msec)’. In the parameter concept of the ignition delay time (ExperimentalIgnitionDelayTime), the units are specified as time units (TimeUnits), with the specific units being milliseconds (MilliSecond). In addition, the purpose is given as a fundamental experimental measurement (FundamentalExperimentalMeasurement).
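How a value specification ties a parameter to a column of the source file can be sketched with the standard `csv` module. The class `ValueSpecification`, the function `read_values` and the sample rows in `SAMPLE` (including the ‘Temperature (K)’ column and all numbers) are assumptions for illustration and do not reproduce the actual University of Connecticut files; only the parameter name, the unit and the column label ‘Ignition Delay (msec)’ come from the text.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueSpecification:
    parameter: str      # parameter the values belong to
    unit: str           # specific unit of the values in the source
    source_column: str  # column label in the source file


IGNITION_DELAY = ValueSpecification(
    parameter="IgnitionDelayTime",
    unit="MilliSecond",
    source_column="Ignition Delay (msec)",
)

# Hypothetical file content, for illustration only.
SAMPLE = "Temperature (K),Ignition Delay (msec)\n900,12.5\n950,6.0\n"


def read_values(text: str, spec: ValueSpecification):
    """Read the column named in the specification as (value, unit) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(row[spec.source_column]), spec.unit) for row in reader]


delays = read_values(SAMPLE, IGNITION_DELAY)
```

Each value read this way carries its unit explicitly, so it can later be converted to whatever unit another data set uses.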
An important aspect of scientific observations is the units of the individual pieces of data. More often than not, the units used are implicit. A community usually reports using a ‘standard’ unit common for the domain. Problems arise with the word ‘usually’, and when the data are to be used in another context or by another community. Explicit knowledge about the units and the conversions between them is supplied by the QUDT (Quantities, Units, Dimensions and Data Types) ontologies. Within CHEMCONNECT, the QUDT ontologies have been expanded where needed.
In CHEMCONNECT, a parameter specification lists the type of data, such as qudt:TimeUnit, in the properties of the specification as a qudt:SystemUnit. Since it is common within a domain to have a ‘typical’ unit, a specific unit in the class, for example qudt:MilliSecond, can be listed under the annotations of the parameter specification as a skos:example property, which, in turn, can be annotated with a skos:example value, such as 10.
Within the interface, when a specific unit type is selected, the example specific unit is shown initially. However, the list of other possible specific units is made available through a pull-down menu.
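The behavior of the pull-down menu can be sketched as a simple reordering that puts the example unit first. This is not the CHEMCONNECT interface code: `menu_options` and `TIME_UNITS` are hypothetical, and apart from qudt:MilliSecond, which appears in the text, the unit names in the list are assumed placeholders.

```python
def menu_options(units_in_class, example_unit):
    """Return the menu entries with the example unit first, the rest in order."""
    if example_unit not in units_in_class:
        raise ValueError(f"{example_unit!r} is not a unit of this class")
    return [example_unit] + [u for u in units_in_class if u != example_unit]


# Units assumed to belong to the time unit class (names other than
# qudt:MilliSecond are placeholders).
TIME_UNITS = ["qudt:SecondTime", "qudt:MilliSecond", "qudt:MicroSecond"]

options = menu_options(TIME_UNITS, "qudt:MilliSecond")
```

The first entry of `options` is what the interface would show initially; the remaining entries are the alternatives offered in the menu.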