JSON data object definitions
The data objects in JThermodynamicsCloud have a JSON-like structure, i.e. a property name with a corresponding value. This design decision allows the same data representation to be used even though the syntactic form differs within each of the system components. The Java objects in the background services are based on com.google.gson.Gson, the data objects in the Angular Material interface are TypeScript JSON objects, the body of each RESTful service call is a JSON object, and in the database the JSON objects translate directly to the map structure used by Google Firebase Firestore. This common format makes the transitions between user interface, backend computation/manipulation and database access fairly seamless. The ontology definition also gives the software engineer managing and programming the system a common reference.
The JSON format is untyped and free-format, so the only documentation of the objects themselves is the keywords of the fields. In JThermodynamicsCloud, the ontology definitions provide machine- and human-interpretable documentation and context for the JSON data objects. There is a one-to-one correspondence between each JSON object and an ontology class. The JSON objects are modeled in the ontology after the DCAT ontology. The keywords of the JSON objects are defined in the dcterms:identifier field in the annotations. The annotations also give human-interpretable labels (rdfs:label) and comments (rdfs:comment) about the objects. The position of the object in the ontology hierarchy gives added context. From a programming perspective, the ontology object definition is an invaluable tool for keeping track of an object's structure. Since a JSON object is inherently free-format, it can be difficult to keep track of which fields are available or which fields must be filled in. The programming environment, which often handles this task for typed objects, cannot do so for JSON objects. Using the ontology as a reference facilitates this task. The additional documentation within the ontology also helps the programming process.
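As a rough illustration of using the ontology as a field reference, the sketch below checks a JSON object against a table of field identifiers. The ONTOLOGY_FIELDS table is a hand-written stand-in for what would be read from the ontology. The identifier strings (class name prefixed with the dataset namespace), the labels, the "required" flag and the function name are all assumptions made for this example and are not taken from the JThermodynamicsCloud code.

```python
# Hypothetical stand-in for field definitions read from the ontology.
# Keys play the role of dcterms:identifier values; the labels and the
# "required" flag are assumptions made for this sketch.
ONTOLOGY_FIELDS = {
    "dataset:CatalogObjectKey": {"label": "Catalog object key", "required": True},
    "dataset:CatalogObjectOwner": {"label": "Owner", "required": True},
    "dataset:DataObjectType": {"label": "Object type", "required": False},
}


def check_fields(json_object, ontology_fields):
    """Return (missing, unknown) identifiers for a JSON-like dict."""
    missing = sorted(
        name
        for name, info in ontology_fields.items()
        if info["required"] and name not in json_object
    )
    unknown = sorted(name for name in json_object if name not in ontology_fields)
    return missing, unknown


# A deliberately faulty object: one required field is absent and one
# property name is misspelled.
example = {"dataset:CatalogObjectKey": "example-key", "dataset:Ownr": "user-1"}
missing, unknown = check_fields(example, ONTOLOGY_FIELDS)
print("missing:", missing)
print("unknown:", unknown)
```

A check of this kind catches misspelled or forgotten property names that a compiler would normally catch for typed objects.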
Because the ontology is machine-readable, a certain degree of automation is possible. This is particularly useful when a generalized algorithm can read the ontology and base its manipulation on the ontology definition. Updates to include more data can then be made by only updating the ontology, without modifying the program. One common example of this is the pull-down lists in the user interface. More choices can be added within the ontology without touching the program. This feature is discussed further in the operations by type section.
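One way to picture such a generalized algorithm is a function that builds an empty JSON object from a class definition. The sketch below is only an illustration: the ONTOLOGY dictionary is a hypothetical in-memory stand-in for the real ontology, with "parts" playing the role of string-valued components and "records" the role of nested objects, and every class name in it is invented.

```python
# Hypothetical in-memory stand-in for class definitions read from the ontology.
# "parts" are string-valued components, "records" are nested JSON objects.
ONTOLOGY = {
    "dataset:ExampleCatalogObject": {
        "parts": ["dataset:ExampleTitle"],
        "records": ["dataset:ExampleRecord"],
    },
    "dataset:ExampleRecord": {
        "parts": ["dataset:ExampleValue", "dataset:ExampleUnit"],
        "records": [],
    },
}


def empty_template(class_name, ontology):
    """Build an empty JSON-like dict for a class from its definition."""
    definition = ontology[class_name]
    # Components start out as empty strings.
    template = {part: "" for part in definition["parts"]}
    # Records are filled in recursively from their own definitions.
    for record in definition["records"]:
        template[record] = empty_template(record, ontology)
    return template


template = empty_template("dataset:ExampleCatalogObject", ONTOLOGY)
print(template)
```

Adding another entry to the "parts" list of a class changes the generated template without any change to empty_template itself, which is the point of driving the program from the ontology.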
Annotations
Annotations in the CHEMCONNECT ontology, from which JThermodynamicsCloud is derived, are an integral part of the documentation and setup of the data. Every data object, whether it is a component, a record or a catalog object, has a set of standard (and optional) annotations. The purpose of the annotations is twofold: to provide identifiers for the JSON objects and to provide human-readable documentation for the objects. The human-readable elements are also used in the GUI. The following annotations are required for (or can be expected within) each data object:
dcterms:identifier: This is the identifier, i.e. the property name, used in a JSON object to indicate that the corresponding value is an object of this class. The identifier includes the namespace (the namespace of CHEMCONNECT and JThermodynamicsCloud objects is dataset).
skos:altLabel: This is a short label identifying the object. It is often (though not necessarily) the same as the identifier without the namespace, and it is often used as an abbreviated name of the class of object.
rdfs:label: This is a short label for the object (it can be a few words separated by spaces). This is what appears, for example, when the object is shown in the user interface.
rdfs:comment: This is an abstract describing the data object. It is a longer explanation. In the user interface this can be used to supplement the label information.
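To make the roles of these annotations concrete, here is a minimal sketch of one annotation set and two small helpers. The annotation values are invented for illustration and are not taken from the actual ontology. The helper names are hypothetical as well, and the way a user interface would really consume the label and comment is an assumption.

```python
# Hypothetical annotation set for one ontology class; all values are
# invented for illustration.
annotations = {
    "dcterms:identifier": "dataset:ExampleObject",
    "rdfs:label": "Example object",
    "rdfs:comment": "A longer description of what the example object holds.",
}


def strip_namespace(identifier):
    """Drop the namespace prefix (everything up to the first colon), if any."""
    return identifier.split(":", 1)[-1]


def ui_text(annotation_set):
    """Pick the texts a user interface could display for an object."""
    return {
        "label": annotation_set["rdfs:label"],
        "tooltip": annotation_set["rdfs:comment"],
    }


# The short label is often the identifier without its namespace.
short_label = strip_namespace(annotations["dcterms:identifier"])
print(short_label)
print(ui_text(annotations))
```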
DCAT Ontology
All the data objects of JThermodynamicsCloud are based on the DCAT ontology. This concept stems from the use of ontologies in ChemConnect (JThermodynamicsCloud is another use case of the CHEMCONNECT system, especially with regard to ontologies).
dcat:Catalog
All objects stored in the database are called catalog objects and stem from dcat:Catalog of the DCAT ontology. The JThermodynamicsCloud database catalog objects are subclasses of the dataset:ChemConnectThermodynamicsDatabase class. This class is within the ChemConnect ontology hierarchy:
dcat:Catalog -> dataset:SimpleCatalogObject -> dataset:SimpleDatabaseObjectStructure -> dataset:ChemConnectDataStructure -> dataset:ChemConnectDatabaseBaseElement -> dataset:ChemConnectThermodynamicsDatabase
In JThermodynamicsCloud, a field property can point to one of two types of values. The first is essentially a string value. Within the ontology, these values are subclasses of the dataset:Component class. If the value is of a type other than string, for example numeric, the system translates the string to the proper type. Within the ontology, these different types are defined in the dataset:ChemConnectPrimitiveDataStructure class hierarchy. The second type of value is another JSON object. Within the ontology, these values are subclasses of the dcat:CatalogRecord class.
dcat:CatalogRecord
A dcat:CatalogRecord is a compound object, just like a dcat:Catalog. The semantic difference is that dcat:CatalogRecord objects are not database objects. The dcat:CatalogRecord objects of JThermodynamicsCloud are the same as those of CHEMCONNECT and are subclasses of:
dcat:CatalogRecord -> dataset:ChemConnectCompoundDataStructure -> dataset:ChemConnectCompoundBase
The dataset:ChemConnectCompoundBase class is the top level of the base records. Other branches exist, for example dataset:ChemConnectCompoundExpData, for other specialized compound objects. Within catalog objects and other records, a record is specified with dcat:record.
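The class chains given for catalog objects and records can be used to decide what kind of object a class describes. The sketch below stores the parent links from those chains in a plain dictionary and walks upward. The dictionary is a hand-copied stand-in for a real ontology query, and the function names are hypothetical.

```python
# Parent links copied from the class chains given in the text.
PARENT = {
    "dataset:ChemConnectThermodynamicsDatabase": "dataset:ChemConnectDatabaseBaseElement",
    "dataset:ChemConnectDatabaseBaseElement": "dataset:ChemConnectDataStructure",
    "dataset:ChemConnectDataStructure": "dataset:SimpleDatabaseObjectStructure",
    "dataset:SimpleDatabaseObjectStructure": "dataset:SimpleCatalogObject",
    "dataset:SimpleCatalogObject": "dcat:Catalog",
    "dataset:ChemConnectCompoundBase": "dataset:ChemConnectCompoundDataStructure",
    "dataset:ChemConnectCompoundDataStructure": "dcat:CatalogRecord",
}


def ancestors(class_name, parent):
    """Return the superclasses of class_name, nearest first."""
    chain = []
    while class_name in parent:
        class_name = parent[class_name]
        chain.append(class_name)
    return chain


def kind_of(class_name, parent):
    """Classify a class as a catalog object, a record, or unknown."""
    chain = ancestors(class_name, parent)
    if "dcat:Catalog" in chain:
        return "catalog object"
    if "dcat:CatalogRecord" in chain:
        return "record"
    return "unknown"


print(kind_of("dataset:ChemConnectThermodynamicsDatabase", PARENT))
print(kind_of("dataset:ChemConnectCompoundBase", PARENT))
```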
dataset:ChemConnectPrimitiveDataStructure
A dataset:ChemConnectPrimitiveDataStructure specifies a string keyword element. Within catalog objects and records, components are specified with dcterms:hasPart. The different data types are specified within the ontology hierarchy under dataset:ChemConnectPrimitiveDataStructure. Subclasses of this include classifications (dataset:Classification), single-word keys (dataset:ShortStringKey), single lines of text (dataset:OneLine), paragraphs (dataset:Paragraph), booleans (dataset:BooleanDataType), numbers (dataset:ShortStringAsNumber) and others. Within this hierarchy, there are also domain-specific classes, such as dataset:JThermodynamicsDisassociationEnergyValue, and more exact type specifications, such as dataset:FileSourceFormat.
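Since every component is stored as a string, the primitive class determines how the string should be interpreted. A minimal sketch of such a translation follows. The converter table, the function names and the assumption that booleans are written as "true" or "false" are all hypothetical, and only two of the primitive classes named above are handled specially.

```python
def to_bool(text):
    """Interpret a stored string as a boolean (assumes 'true'/'false' text)."""
    return text.strip().lower() == "true"


# Hypothetical mapping from primitive ontology classes to converters.
CONVERTERS = {
    "dataset:BooleanDataType": to_bool,
    "dataset:ShortStringAsNumber": float,
}


def convert(value, primitive_class):
    """Convert a stored string according to its primitive class.

    Classes without a dedicated converter are returned as plain strings.
    """
    return CONVERTERS.get(primitive_class, str)(value)


print(convert("3.5", "dataset:ShortStringAsNumber"))
print(convert("True", "dataset:BooleanDataType"))
print(convert("one line of text", "dataset:OneLine"))
```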
One special case of dataset:ChemConnectPrimitiveDataStructure values is the set of subclasses of dataset:Classification. An extra field in the annotations, rdfs:isDefinedBy, points to a class (a subclass of dataset:ChemConnectClassifications, which itself is a subclass of skos:Concept) whose subclasses are the possible values. Within each choice, the rdfs:label and rdfs:comment fields give string values that can be used in the user interface.
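The lookup described here, from a classification field through rdfs:isDefinedBy to the list of choices, can be sketched as follows. The two tables are hypothetical stand-ins for the ontology, every class name and text in them is invented, and the function name is an assumption.

```python
# Hypothetical ontology fragment; all class names and texts are invented.
FIELD_ANNOTATIONS = {
    "dataset:ExampleUnitChoice": {"rdfs:isDefinedBy": "dataset:ExampleUnits"},
}
CHOICE_CLASSES = [
    {"class": "dataset:ExampleUnitA", "parent": "dataset:ExampleUnits",
     "rdfs:label": "Unit A", "rdfs:comment": "First example unit."},
    {"class": "dataset:ExampleUnitB", "parent": "dataset:ExampleUnits",
     "rdfs:label": "Unit B", "rdfs:comment": "Second example unit."},
    {"class": "dataset:OtherChoice", "parent": "dataset:OtherChoices",
     "rdfs:label": "Other", "rdfs:comment": "Belongs to another list."},
]


def pulldown_options(field, field_annotations, choice_classes):
    """List (value, label, comment) for the choices of a classification field."""
    # rdfs:isDefinedBy names the class whose subclasses are the choices.
    choice_root = field_annotations[field]["rdfs:isDefinedBy"]
    return [
        (c["class"], c["rdfs:label"], c["rdfs:comment"])
        for c in choice_classes
        if c["parent"] == choice_root
    ]


options = pulldown_options("dataset:ExampleUnitChoice", FIELD_ANNOTATIONS, CHOICE_CLASSES)
for value, label, comment in options:
    print(f"{label}: {comment} ({value})")
```

Appending another entry with the same parent to CHOICE_CLASSES extends the list of options without touching pulldown_options.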
Catalog Object Hierarchy
The JThermodynamicsCloud catalog data stems from the CHEMCONNECT dataset:ChemConnectDatabaseBaseElement class, which lies in the ontology hierarchy below dataset:SimpleCatalogObject, which supplies the data regarding accessibility, and dataset:SimpleDatabaseObjectStructure, which has the FirebaseID that gives the exact location of the catalog object within the database document hierarchy.
SimpleCatalogObject
The SimpleCatalogObject has the fundamental information for each database/catalog object. This basically gives information about access to the object, through the key and the FirestoreID, and about its type:
CatalogObjectAccessModify: This is who can modify and delete the object. The value is a username (UID) or consortium.
CatalogObjectAccessRead: This is who can access the catalog object. The value is a username (UID), consortium or public.
CatalogObjectKey: This is the key string name of the object. It is a unique identifier used to identify and access the object within the collection.
CatalogObjectOwner: This is the UID of the owner. A unique UID is used to identify every user.
DataObjectType: This is the ontology class of the object itself.
TransactionID: With this FirestoreID the transaction that created this catalog object can be directly retrieved from the database.
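The access fields above suggest a simple permission check. The sketch below is only an illustration of that idea. The property names are assumed to be the class names prefixed with the dataset namespace, the values are invented, the transaction reference is left out, and the comparison rules in can_read and can_modify are an assumption about how the access values might be evaluated, not the system's actual logic.

```python
# Hypothetical catalog object; property names are assumed to be the class
# names with the dataset namespace, and all values are invented.
catalog_object = {
    "dataset:CatalogObjectAccessModify": "user-123",
    "dataset:CatalogObjectAccessRead": "public",
    "dataset:CatalogObjectKey": "example-key",
    "dataset:CatalogObjectOwner": "user-123",
    "dataset:DataObjectType": "dataset:ChemConnectThermodynamicsDatabase",
}


def can_read(obj, uid, consortia=()):
    """Assumed rule: readable if public, or if the UID or a consortium matches."""
    allowed = obj["dataset:CatalogObjectAccessRead"]
    return allowed == "public" or allowed == uid or allowed in consortia


def can_modify(obj, uid, consortia=()):
    """Assumed rule: modifiable only if the UID or a consortium matches."""
    allowed = obj["dataset:CatalogObjectAccessModify"]
    return allowed == uid or allowed in consortia


print(can_read(catalog_object, "someone-else"))
print(can_modify(catalog_object, "someone-else"))
print(can_modify(catalog_object, "user-123"))
```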
SimpleDatabaseObjectStructure
This is the top class of all objects appearing in the database. For this reason, the dataset:FirebaseID record is a record of this class. The FirebaseID gives the exact location in the hierarchy where the catalog object can be found.
ChemConnectDataStructure
This is the top class for a catalog database object. This introduces several sets of records for references and links. These record objects link this database object with other references:
dataset:FirestoreCatalogIDForTransaction: This is the FirestoreID address of the transaction that created the object.
dataset:DataObjectLink: This set of links connects this catalog object with other catalog objects. It can, for example, give reference to the catalog objects that were used to derive this catalog object.
dataset:DataSetReference: This is a set of bibliographic references for the catalog object, once again giving a link to the object's source.
dataset:ObjectSiteReference: This is a set of links to websites, together with how each link relates to the object.
ChemConnectDatabaseBaseElement
This is the top class of all Firestore database objects. The immediate set of subclasses represents the domains of the catalog objects. For example, for the JThermodynamicsCloud system, the immediate subclass is the dataset:ChemConnectThermodynamicsDatabase class.
ChemConnectThermodynamicsDatabase
This is the top class of the database objects of JThermodynamicsCloud.