JThermodynamicsCloud

Documentation

JThermodynamicsCloud, currently under development, is a cloud-based service (SaaS, Software as a Service) that not only calculates temperature-dependent thermodynamics based on 2D-graphical Lewis structures of species and molecules, but also manages the database of fundamental quantities needed to perform the calculation. JThermodynamicsCloud is a concrete example within ChemConnect, where the major goal of the database is to promote FAIR data practices and accountability, in that each data point is traceable through its evolution and source. The database is not static; it is meant to be flexible and configurable and to allow version-controlled updates. The primary users of JThermodynamicsCloud come from the combustion community, but the structure of the database is also applicable to similar database applications. The driving force behind the data within the database is the ChemConnect ontology.

This documentation is a living document reflecting the current status of the software:

Overview: An overview of JThermodynamics, its origins and its intent is given on this page.
User Interface: These are screenshots and explanations of the availability and use of the interface.
Concepts and Components: For efficient use of JThermodynamicsCloud, the user should be aware of the basic concepts and components of the software. This is particularly necessary for managing data within the system.
Developmental Timeline: JThermodynamicsCloud is under active development; this gives a hint of future developments and when they can be expected.
Implementation Details: This is an outline of the software technologies behind JThermodynamicsCloud. Some of these technologies are unique to JThermodynamicsCloud and some have been inherited from previous versions, particularly JThermodynamics.
Data Sources: These are source examples of the standard database for the fundamental data objects needed for the thermodynamic calculations.

JThermodynamicsCloud and Temperature dependent thermodynamics

The fundamental calculation in this class of algorithms for temperature-dependent thermodynamics is the use of Benson rules, as developed by Benson, or, as a variation for radical species, the HBI method of Bozzelli. The fundamental representation of the molecule in these calculations is 2D-graphical, essentially a valence bond representation. In this representation each atom is represented by its Lewis structure, namely the atomic number and its bonding (single, double, triple, aromatic, or radical).

JThermodynamicsCloud can use both Benson rules and the Hydrogen Bond Increment (HBI) method. These algorithms belong to the class of additivity rules, which are based on the assumption that the total thermodynamics is a sum of the thermodynamics of the bonding environments of the individual atoms. Each 'rule' consists of the center atom (including its Lewis bonding) and the list of atoms it is connected to (also in Lewis bonding format). For example, a secondary carbon would be represented as a singly bonded carbon as the center atom, connected to two hydrogens and two singly bonded carbons. The attraction of this class of methods is that the calculation can be evaluated relatively quickly for a molecular species from its 2D-graphical representation. This is particularly interesting for the automatic generation of reaction mechanisms, which can produce a large variety of molecular structures of which some (or even the majority) have no known values for the temperature-dependent thermodynamics.
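The additivity idea can be illustrated with a minimal Java sketch. It is not the JThermodynamicsCloud implementation: the class name `GroupAdditivitySketch`, the method `baseValue` and the numerical values are assumptions made for illustration only (the numbers are placeholders, not real Benson values). Groups are written in the usual Benson notation, e.g. `C-(C)2(H)2` for the secondary carbon described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class GroupAdditivitySketch {

    /** Sums the group contributions, each weighted by how often the group occurs. */
    static double baseValue(Map<String, Integer> groupCounts, Map<String, Double> groupValues) {
        double total = 0.0;
        for (Map.Entry<String, Integer> entry : groupCounts.entrySet()) {
            Double value = groupValues.get(entry.getKey());
            if (value == null) {
                // No rule is available for this bonding environment.
                throw new IllegalArgumentException("No rule for group " + entry.getKey());
            }
            total += entry.getValue() * value;
        }
        return total;
    }

    public static void main(String[] args) {
        // Placeholder enthalpy contributions (NOT real Benson values).
        Map<String, Double> groupValues = new LinkedHashMap<>();
        groupValues.put("C-(C)(H)3", -10.0);  // primary carbon
        groupValues.put("C-(C)2(H)2", -5.0);  // secondary carbon

        // Propane: two primary carbons and one secondary carbon.
        Map<String, Integer> propane = new LinkedHashMap<>();
        propane.put("C-(C)(H)3", 2);
        propane.put("C-(C)2(H)2", 1);

        System.out.println(baseValue(propane, groupValues));
    }
}
```

A missing rule is reported as an error rather than silently treated as zero, since an absent group would otherwise produce a plausible-looking but wrong total.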

The sum of these contributions gives the base value. The more 'global' behavior of the thermodynamics of the molecule is then accounted for by 'corrections' to this base value.

These corrections include:

Internal Symmetry
External Symmetry
Optical Symmetry
Ring strain
Steric Interactions
Nearest neighbor interactions

For the calculation of radicals, TherGas differs from the HBI method in that it starts from the base Benson rule thermodynamics of the parent species (obtained by adding a hydrogen to the radical) and introduces corrections based on the hydrogen loss, namely symmetry changes (entropy), the dissociation energy of the hydrogen (enthalpy), the spin correction (entropy) and the translational energy change.

Given that the representation of the molecules is 2D-graphical, i.e. Lewis structures, these corrections are found by identifying substructures representing the correction within the molecule. This is done using the graph-theoretical algorithm of (sub)graph isomorphism. The molecular graphs are colored graphs, meaning each node carries the atomic information. For example, the correction for the ring strain of cyclobutane involves a 2D-graphical substructure representing the cyclobutane ring and an associated value for the ring strain. If that cyclobutane structure is matched within the molecule, then that ring strain is added to the enthalpy.
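A substructure search of this kind can be sketched with a brute-force backtracking matcher. This is only an illustration of the idea, not the algorithm used by JThermodynamicsCloud: the class `SubstructureMatchSketch` and its `Graph` type are hypothetical, nodes are "colored" only by element symbol, and bond orders and hydrogens are ignored for brevity.

```java
import java.util.Arrays;

final class SubstructureMatchSketch {

    /** A tiny colored graph: one element symbol per node plus an adjacency matrix. */
    static final class Graph {
        final String[] labels;
        final boolean[][] bonded;

        Graph(String[] labels, int[][] bonds) {
            this.labels = labels;
            this.bonded = new boolean[labels.length][labels.length];
            for (int[] b : bonds) {
                bonded[b[0]][b[1]] = true;
                bonded[b[1]][b[0]] = true;
            }
        }
    }

    /** True if every node and bond of the pattern can be mapped onto the molecule. */
    static boolean contains(Graph molecule, Graph pattern) {
        int[] mapping = new int[pattern.labels.length];
        Arrays.fill(mapping, -1);
        return extend(molecule, pattern, mapping, new boolean[molecule.labels.length], 0);
    }

    private static boolean extend(Graph mol, Graph pat, int[] mapping, boolean[] used, int next) {
        if (next == pat.labels.length) {
            return true; // all pattern nodes are mapped
        }
        for (int cand = 0; cand < mol.labels.length; cand++) {
            if (used[cand] || !mol.labels[cand].equals(pat.labels[next])) {
                continue; // already taken, or wrong element
            }
            boolean consistent = true;
            for (int prev = 0; prev < next && consistent; prev++) {
                // Every pattern bond to an already mapped node must exist in the molecule.
                if (pat.bonded[prev][next] && !mol.bonded[mapping[prev]][cand]) {
                    consistent = false;
                }
            }
            if (!consistent) {
                continue;
            }
            mapping[next] = cand;
            used[cand] = true;
            if (extend(mol, pat, mapping, used, next + 1)) {
                return true;
            }
            mapping[next] = -1; // backtrack
            used[cand] = false;
        }
        return false;
    }

    public static void main(String[] args) {
        String[] c4 = {"C", "C", "C", "C"};
        String[] c5 = {"C", "C", "C", "C", "C"};
        Graph fourRing = new Graph(c4, new int[][] {{0, 1}, {1, 2}, {2, 3}, {3, 0}});
        Graph methylcyclobutane = new Graph(c5, new int[][] {{0, 1}, {1, 2}, {2, 3}, {3, 0}, {0, 4}});
        Graph pentane = new Graph(c5, new int[][] {{0, 1}, {1, 2}, {2, 3}, {3, 4}});

        System.out.println(contains(methylcyclobutane, fourRing)); // ring present
        System.out.println(contains(pentane, fourRing));           // no ring
    }
}
```

In this sketch a successful match would trigger adding the ring strain value stored with the pattern. A production implementation would also compare bond types and would typically rely on a chemistry toolkit rather than a hand-written matcher.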

Thus an essential part of the temperature-dependent thermodynamics calculation is the database of structures that need to be recognized in order to apply the contributions and corrections. These structures and their associated contributions have to be made available to the software so that it can determine whether the contributions apply to the molecular species being analyzed. In the algorithm's evolution from manual application of Benson rules, to TherGas (mid-1990s), JTherGas and JThermodynamics 1.0 (2010) and, finally, JThermodynamics 2.0, different methods for storing and handling this database of information have been used. Originally, using the tables and the original definitions given by Benson, the method consisted of manually going through the tables in the appendix.

TherGas, in the 1990s, made the advance of automating this process with a FORTRAN program. The values were read from files and translated to an intermediate form which was used directly to perform the calculation. Changes to the database involved changing the input text files and recompiling.

JTherGas and later JThermodynamics (based on the same software) used a database approach, in this case MySQL. The structures and the associated information are stored in MySQL tables. Here, the database and the algorithms using the database are decoupled: the database of information can change without changes to the software algorithm. Management of the database can be done with the multitude of tools associated with MySQL tables or by the programs themselves. A MySQL server can be accessed over a network, which also allows the information to be web-based.

In the management of databases, one principle being strived for is the application of FAIR data practices. FAIR data means that data should be Findable (the F), Accessible (the A), Interoperable (the I) and Reusable (the R). Keeping the molecular information in an external database promotes these FAIR practices. The information is findable in that it can be searched with all the available MySQL database tools. The information is accessible in that, being decoupled from the software and web-based, it can be reached both locally and globally. The interoperability of the data is assured by the decoupling of the data from the program: since MySQL has an abundance of interfaces spanning a multitude of programming languages, the data can easily be used within other applications. This also promotes reusability.

In terms of using the associated software, meaning JTherGas and JThermodynamics, one advantage of decoupling the database from the software is that several databases can exist. For example, each user could have their own database or, as the database information evolves, i.e. as new structural information is added, a new version of the database could be substituted. This is important because, especially with new ab initio calculations, new compilations of Benson or HBI rules keep becoming available. One important aspect that is lost in these implementations is keeping track of this evolution. More importantly, knowledge about the source of the information, especially of new information which replaces old information, is not available. Thus the traceability of the information, meaning its source and evolution, is lost.

The goal of JThermodynamicsCloud is to continue to apply the advantages of a decoupled database and to enhance data traceability through its data structures and its methods of database management. Database evolution and data source traceability are promoted by the use of 'Transactions'. A transaction registers each change to the database. Each transaction has a set of prerequisite transactions, which identify the source of the prerequisite data. The evolution of data within the database can therefore be followed by tracing the chain of connected transactions, which promotes version control.
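Tracing the chain of connected transactions can be pictured with the following Java sketch. It only assumes what is stated above, that each transaction knows its prerequisite transactions. The class `TransactionTraceSketch`, the `Transaction` type and the transaction identifiers are hypothetical and do not reflect the actual ontology classes or the stored format.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class TransactionTraceSketch {

    /** A transaction with an identifier and the transactions it depends on. */
    static final class Transaction {
        final String id;
        final List<Transaction> prerequisites;

        Transaction(String id, Transaction... prerequisites) {
            this.id = id;
            this.prerequisites = List.of(prerequisites);
        }
    }

    /** Returns the ids of the start transaction and everything it depends on, depth-first. */
    static List<String> trace(Transaction start) {
        Set<String> seen = new LinkedHashSet<>();
        collect(start, seen);
        return new ArrayList<>(seen);
    }

    private static void collect(Transaction transaction, Set<String> seen) {
        if (!seen.add(transaction.id)) {
            return; // already visited through another path
        }
        for (Transaction prerequisite : transaction.prerequisites) {
            collect(prerequisite, seen);
        }
    }

    public static void main(String[] args) {
        // Hypothetical chain: a file is uploaded, read into a dataset, then used in a collection.
        Transaction upload = new Transaction("upload-file");
        Transaction read = new Transaction("read-hbi-file", upload);
        Transaction collection = new Transaction("create-collection", read);

        System.out.println(trace(collection));
    }
}
```

Starting from any transaction, the returned list names every earlier transaction that contributed to it, which is the information needed to answer "where did this data come from?".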

In JThermodynamicsCloud both the fundamental database, i.e. all the pieces of information needed to make the thermodynamic calculation, and the dataset collection, i.e. the current set of fundamental data (such as which Benson rules, symmetry data, steric and ring corrections, etc.) to be used for the calculation, are highly configurable. This facilitates updating the data used for the calculation. For example, when a new set of data, such as newly calculated HBI structures, is read in, a new HBI dataset is created. A new dataset collection can then be made which is like the original, i.e. with the other fundamental data unchanged, but which points to the new HBI dataset. This new dataset collection would then be used for the calculation. In addition, two calculations could be made, one with the original and one with the new dataset collection, and the results compared.
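The idea of a dataset collection that points to individual datasets can be sketched in Java as follows. The class `DatasetCollectionSketch`, the method `withDataset` and the dataset identifiers are assumptions for illustration; the real objects are defined by the ontology and stored in the database.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class DatasetCollectionSketch {
    final String name;
    /** Maps a kind of fundamental data (e.g. "HBI") to the identifier of the dataset to use. */
    final Map<String, String> datasetByKind;

    DatasetCollectionSketch(String name, Map<String, String> datasetByKind) {
        this.name = name;
        // Defensive, read-only copy so that an existing collection never changes.
        this.datasetByKind = Collections.unmodifiableMap(new LinkedHashMap<>(datasetByKind));
    }

    /** Creates a new collection like this one, except that one kind points to another dataset. */
    DatasetCollectionSketch withDataset(String newName, String kind, String datasetId) {
        Map<String, String> copy = new LinkedHashMap<>(datasetByKind);
        copy.put(kind, datasetId);
        return new DatasetCollectionSketch(newName, copy);
    }

    public static void main(String[] args) {
        Map<String, String> standard = new LinkedHashMap<>();
        standard.put("BensonRules", "benson-standard");   // hypothetical dataset ids
        standard.put("HBI", "hbi-standard");
        standard.put("RingStrain", "ring-standard");

        DatasetCollectionSketch original = new DatasetCollectionSketch("standard", standard);
        DatasetCollectionSketch updated = original.withDataset("standard-new-hbi", "HBI", "hbi-new");

        System.out.println(original.datasetByKind);
        System.out.println(updated.datasetByKind);
    }
}
```

Because the original collection is left untouched, the same molecule can be calculated with both collections and the results compared.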

Another unique feature of JThermodynamicsCloud is that all data structures, transactions, procedures and concepts used are defined within an ontology. The ontology provides the information that 'drives' the software. This includes the handling of parameter units. There is no 'standard' unit within the database; instead, each parameter has its units associated with it. Units are specified both for the input of data (meaning the data can be read in using the original units with no conversion) and for the calculation.
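Keeping the unit together with each value, and converting only when needed, can be sketched as below. In JThermodynamicsCloud the unit definitions come from the ontology; the hard-coded `EnergyUnit` enum, the `Parameter` class and the method `convertTo` are simplifying assumptions of this sketch.

```java
final class ParameterUnitsSketch {

    /** A few molar energy units with their conversion factor to kJ/mol. */
    enum EnergyUnit {
        KJ_PER_MOL(1.0),
        J_PER_MOL(0.001),
        KCAL_PER_MOL(4.184); // thermochemical calorie

        final double kjPerMolFactor;

        EnergyUnit(double kjPerMolFactor) {
            this.kjPerMolFactor = kjPerMolFactor;
        }
    }

    /** A value that always carries its own unit. */
    static final class Parameter {
        final double value;
        final EnergyUnit unit;

        Parameter(double value, EnergyUnit unit) {
            this.value = value;
            this.unit = unit;
        }

        /** Returns the same quantity expressed in the target unit. */
        Parameter convertTo(EnergyUnit target) {
            return new Parameter(value * unit.kjPerMolFactor / target.kjPerMolFactor, target);
        }
    }

    public static void main(String[] args) {
        // Stored as entered, in the units of the original source.
        Parameter stored = new Parameter(-10.0, EnergyUnit.KCAL_PER_MOL);
        // Converted only when the calculation asks for another unit.
        Parameter forCalculation = stored.convertTo(EnergyUnit.KJ_PER_MOL);
        System.out.println(forCalculation.value + " " + forCalculation.unit);
    }
}
```

Storing the original value avoids accumulating rounding differences from repeated conversions and keeps the data identical to its source.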

Software Tools within JThermodynamics 2.0

The fundamental calculation software of JThermodynamics 2.0 is based on JThermodynamics 1.0; however, this version has a web interface and its database is based on Google's Firestore and Google Storage. This version is also 'data-driven' from an ontology defining all the structures, relationships and procedures of the calculations.

The fundamental software tools and cloud-based software differ considerably from JThermodynamics 1.0. JThermodynamics 2.0 is an application based on the following cloud-based tools:

Google Storage: Google Storage on the cloud is used to store all repository files. These are (usually) input files for database setup.
Google Firestore: All the data is stored as JsonObjects within a hierarchy in the Google Firestore cloud system, which is a NoSQL database. The advantage of a NoSQL database is scalability, meaning large amounts of data can be stored.
Web-Based APIs: All operations which do not modify the database, such as calculations and database and ontology access, are defined through web-accessible APIs. The input to the API is a JsonObject with the required input information. As with a transaction, the output is a standardized 'response' which includes the primary structure to be given as output.
Angular GUI: Angular is a platform for building mobile and desktop web applications.

JThermodynamicsCloud is based on the following software tools:

JAVA-based software: This is the programming language which does all the calculations. This is a common language for web-based applications. It is also platform independent.
Ontology: All objects, including database objects (and intermediate structures), procedures, transactions, API procedures and fundamental concepts, are defined within an ontology. This ontology gives not only a structure and relationships between the objects, but also associated with each object is documentation of its meaning.
JsonObject: Within the software and database all data objects are represented as JsonObjects. Each JsonObject has a one-to-one relationship with an ontology object and a JAVA class in the server.
Chemistry Development Kit (CDK): This is a set of predefined JAVA algorithms and data structures for manipulating chemical information.

Within JThermodynamicsCloud

Catalog, Record and Component objects: These are the fundamental objects found in the database. A catalog object is the object stored in the database. It is made up of components, each a single piece of data like a string or a number, and records, which are sets of components and other records. In JThermodynamicsCloud, the catalog object is a Json object in the database and in the interface, and a JAVA class in the JAVA server. The structure of these objects is specified by the ontology.
Transactions: All modifications to the database are made using transactions (defined within the ontology under Event/TransactionBase). Associated with each transaction are a set of prerequisites, which provide information about the previously created data objects needed by the transaction, the link to the software that performs the transaction, the input information needed to steer the transaction and the output object(s) of the transaction. The output is a standardized 'response' which includes the primary structure to be given as output. The full definition of the transaction, meaning the data objects needed for inputs and outputs, is given within the ontology. The ontology also provides the relationships between the transactions.

Timeline and Milestones

End of December (beginning of January): A demonstration version of JThermodynamicsCloud will be available on the cloud. This version should demonstrate the look and feel of the website. Calculations using the 'standard' database should be possible.
End of January: A demonstration version, particularly for partners, where users can produce and use their own database. This is a typical beta version: if the site is used 'properly' and the data read in is in absolutely correct format, the system should function properly. Error detection and recovery from errors is not guaranteed at this point (and may require administrator intervention). Each user would have their own account.
End of January: Organization of 'standard' data file as a basis of a complete database (also to be used as patterns for new data).
February-April: The robustness of the website experience should improve (again in cooperation with partners).
March: Improvement of the visualization of individual pieces of fundamental data. This includes the visualization of 'related' and 'traceable' data. This would also include outputting files containing the database elements.
February: Before this point, data input is largely done by reading in of blocks of data in input files. During this development time, updating (and maybe even creating) individual pieces of fundamental
February: The ability to transfer fundamental data from one dataset to another (this includes making a clone of an existing database).
March and beyond: Incorporation of new data introduced by partners.

For the most part, until Summer 2023,

Ontology Organization

dcterms:Event->TransactionEvent

prov:Agent->prov:SoftwareAgent->DatabaseServiceBase

prov:Agent->prov:SoftwareAgent->DataObjectManipulation
