Parameters are central to the storage of data within CHEMCONNECT, and how they are managed is another typical example of how CHEMCONNECT defines and uses the ontology on different conceptual levels. The following example shows how the catalog object ObservationCorrespondenceSpecification, together with a concept template defining a specific set of observations for a domain, is used to create a correspondence between a matrix of input values from the user and the standardized domain parameters defined in CHEMCONNECT.
When reading in repository data, the user is not required to use the standard names (including unit names) for the data values (typically a matrix of values). To provide meaning and expand the context of the repository data, a correspondence needs to be established between the CHEMCONNECT parameter concepts and those found in the user’s input file. This is accomplished by the catalog object ObservationCorrespondenceSpecification, where a one-to-one connection is made between the specifications of the record object, through ObservationSpecification, and the set of parameters in the user input. If the user input is a matrix, the ObservationSpecification can be viewed as the specification of the parameter in each column. The ObservationSpecification entity is defined as having multiple input (DimensionParameterSpecification) and multiple output (MeasureParameterSpecification) parameter specifications. Both DimensionParameterSpecification and MeasureParameterSpecification are derived from ParameterSpecification, where the individual specifications of the parameter, such as units and labels, are defined.
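The relationship between these record objects can be sketched as a small set of Python data classes. This is only an illustration of the structure described above, not the CHEMCONNECT implementation: the field names (label, unit, dimensions, measures) and the helper column_labels are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParameterSpecification:
    """Illustrative stand-in: the individual specifications of one parameter."""
    label: str  # assumed field: the parameter name
    unit: str   # assumed field: the unit name


@dataclass
class DimensionParameterSpecification(ParameterSpecification):
    """An input parameter of an observation."""


@dataclass
class MeasureParameterSpecification(ParameterSpecification):
    """An output parameter of an observation."""


@dataclass
class ObservationSpecification:
    """Multiple input and multiple output parameter specifications."""
    dimensions: List[DimensionParameterSpecification] = field(default_factory=list)
    measures: List[MeasureParameterSpecification] = field(default_factory=list)

    def column_labels(self) -> List[str]:
        # Viewed as a matrix: one column per parameter, inputs first.
        return [p.label for p in self.dimensions + self.measures]
```

Viewed this way, an ObservationSpecification with one dimension and one measure describes a two-column matrix.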
The record object ParameterSpecification and the catalog and record objects mentioned previously each define a data object with fields. Such a data object has a one-to-one correspondence with a data object in the database, but it does not hold any domain information about a specific parameter. The domain information comes from the template concept.
Associated with the ParameterSpecification record object is a hierarchy (under the ontology concept ChemConnectParameters) of templates representing domain information about specific parameters. The name of the parameter specification concept reflects what the parameter describes in the domain. For example, ExperimentalTemperature describes the temperature conditions of the experiment. Further information about the parameter is found in the properties, cube:concept, giving another concept keyword describing the parameter, and hasPurpose, a more specific keyword specifying the purpose of the parameter. These two properties bind the parameter to keyword concepts (standardized ontology concepts within a hierarchy of concept and purpose keywords). For example, ExperimentalTemperature has the concept TemperatureOfExperiment and a purpose of ExperimentalCondition.
A third property of the template concept lists the units of the parameter (qudt:unitSystem). This property points to the unit ontology entity within the QUDT hierarchy (which has CHEMCONNECT annotations as required by the domain knowledge). The properties and instances of the QUDT unit definition provide all the necessary examples of specific units and conversions between specific units. The qudt:unitSystem of ExperimentalTemperature is qudt:TemperatureUnit, which has instances of temperature units such as qudt:Kelvin and qudt:Celcius.
In the template within the ontology, only the unit class is specified. If a specific parameter is to be described, then the specific unit should be chosen. This is done by creating an instance derived from the template specifying the unit. This instance is then stored in the database. This is the next level of parameter specification. This level would be used, for example, to describe a column in a matrix of data (see next example).
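The step from template to instance can be illustrated with a minimal Python sketch. The class names ParameterTemplate and ParameterInstance, the function instantiate, and the in-memory UNIT_CLASSES table are hypothetical; in CHEMCONNECT the unit instances would come from the QUDT hierarchy in the ontology rather than from a hard-coded dictionary.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the instances of a QUDT unit class.
# The identifiers are the ones named in the text.
UNIT_CLASSES = {
    "qudt:TemperatureUnit": {"qudt:Kelvin", "qudt:Celcius"},
}


@dataclass(frozen=True)
class ParameterTemplate:
    """Template concept: domain information, with only the unit class fixed."""
    name: str
    concept: str
    purpose: str
    unit_class: str


@dataclass(frozen=True)
class ParameterInstance:
    """Instance derived from a template, with one specific unit chosen."""
    template: ParameterTemplate
    unit: str


def instantiate(template: ParameterTemplate, unit: str) -> ParameterInstance:
    """Choose a specific unit, which must be an instance of the unit class."""
    allowed = UNIT_CLASSES.get(template.unit_class, set())
    if unit not in allowed:
        raise ValueError(f"{unit} is not an instance of {template.unit_class}")
    return ParameterInstance(template, unit)


EXPERIMENTAL_TEMPERATURE = ParameterTemplate(
    name="ExperimentalTemperature",
    concept="TemperatureOfExperiment",
    purpose="ExperimentalCondition",
    unit_class="qudt:TemperatureUnit",
)
```

Calling instantiate(EXPERIMENTAL_TEMPERATURE, "qudt:Kelvin") yields the kind of object that would be stored in the database, while a unit outside the template's unit class is rejected.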
A typical data set, or observation (coming from qb:Observation in the data cube ontology), involves several parameters. For example, a pressure versus time graph involves two parameters, time and pressure. To describe experimental conditions in a chemical experiment, typically three or more parameters are needed: temperature, pressure and each of the species concentrations. A single data set can be thought of as a matrix of data, where each column is a parameter.
To describe a matrix of data, a ParameterSpecification for each column is needed. In CHEMCONNECT an entire matrix specification is described using the ObservationSpecification record object. As described previously, this involves two sets of specifications, one for input and one for output (both subclasses of ParameterSpecification). To describe complete data sets, the templates list the parameters involved as qb:dimension properties for inputs and qb:measure properties for outputs. For example, the observation PressureTrace has TimeInEvent as a qb:dimension property and ExperimentalPressure as a qb:measure property. The ObservationSpecification also has the qb:concept and hasPurpose properties.
As before, the template concept for ObservationSpecification describes which parameters are to be found in the type of observation. But, to describe a specific matrix, all the ParameterSpecification entities must be filled in with the specific unit to be used and stored in the database. Once again, in CHEMCONNECT the ObservationSpecification is the data-type, the corresponding template provides the generic domain information and the database object describes a specific matrix of observations.
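As a rough illustration of this step, the PressureTrace template and the filling in of specific units might look as follows in Python. The dictionary layout, the function fill_in_units and the unit labels "s" and "Pa" are assumptions made for this sketch and are not taken from CHEMCONNECT.

```python
# Template for the PressureTrace observation, as described in the text:
# only the parameters and their roles are given, not the specific units.
PRESSURE_TRACE = {
    "qb:dimension": ["TimeInEvent"],
    "qb:measure": ["ExperimentalPressure"],
}


def fill_in_units(template, units):
    """Return one (role, parameter, unit) entry per matrix column.

    `units` maps each parameter name to the specific unit chosen for it.
    Every parameter named in the template must receive a unit.
    """
    columns = []
    for role in ("qb:dimension", "qb:measure"):
        for parameter in template[role]:
            if parameter not in units:
                raise KeyError(f"no unit chosen for {parameter}")
            columns.append((role, parameter, units[parameter]))
    return columns


# Placeholder unit labels; a real specification would refer to QUDT units.
specific = fill_in_units(
    PRESSURE_TRACE,
    {"TimeInEvent": "s", "ExperimentalPressure": "Pa"},
)
```

The resulting list plays the role of the database object: it fixes, column by column, which standardized parameter is present and in which unit.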
The database object for ObservationSpecification has the column information for a matrix. However, the ObservationSpecification uses the standardized domain names from the ontology for the parameters, and the columns of the input matrix do not necessarily have these names. The specification names therefore have to be assigned to the corresponding columns in the input matrix. This is the job of the ObservationCorrespondenceSpecification data type, which is set up in the user interface. The column headings are extracted, the user assigns each to a specification parameter name, and this information is stored in an ObservationCorrespondenceSpecification database object.
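The correspondence itself is essentially a one-to-one mapping from user column headings to standardized parameter names. A minimal Python sketch of the idea follows; the function names build_correspondence and standardize and the sample headings are hypothetical, and the actual assignment in CHEMCONNECT is made through the user interface.

```python
def build_correspondence(headings, assignments):
    """Map each extracted column heading to a standardized parameter name.

    `assignments` holds the user's choices. The mapping must cover every
    heading and must be one-to-one.
    """
    missing = [h for h in headings if h not in assignments]
    if missing:
        raise ValueError(f"unassigned column headings: {missing}")
    names = [assignments[h] for h in headings]
    if len(set(names)) != len(names):
        raise ValueError("two columns were assigned the same parameter")
    return dict(zip(headings, names))


def standardize(headings, rows, correspondence):
    """Relabel each row of the input matrix with the standardized names."""
    names = [correspondence[h] for h in headings]
    return [dict(zip(names, row)) for row in rows]


# Hypothetical laboratory file with its own column headings.
headings = ["t", "P"]
rows = [[0.0, 1.0], [0.5, 1.2]]
correspondence = build_correspondence(
    headings, {"t": "TimeInEvent", "P": "ExperimentalPressure"}
)
records = standardize(headings, rows, correspondence)
```

Because the correspondence depends only on the column headings, the same mapping can be applied to every file that follows the same layout.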
The database object provides the key, so to speak, for interpreting a specific matrix. Typically a laboratory (or domain consortium) would have a standardized way of presenting data. The ObservationCorrespondenceSpecification database object would only have to be defined once for all the data produced by the laboratory.