The two-dimensional graphical representation is based on a 'colored graph': a list of nodes with properties, together with connections between the nodes that also carry properties. In the molecular representation, the nodes are the atoms of the molecule and the connections are its bonds. In this representation, the three-dimensional information is implied by the combination of atoms and bonds.
The essential information for the atom graphical node is first and foremost the atomic number. Additional information, such as the charge or whether the atom is a radical, is also given. To facilitate calculations and manipulations, further information may be attached to the atom (graphical node). In this document, the graphical atom node is referred to simply as an atom.
In the graphical description of molecular bonds, the two bonded atoms and the type of bond (single, double or triple) are specified; the class of molecules being described is usually restricted to organic compounds with covalent bonds. In the basic 2D graphical description there is no information about the length or direction of the connection. Resonance and aromaticity are described with supplementary information attached to the atom and/or bond description.
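As an illustration of the colored graph described above, the following is a minimal sketch in plain Java. It is not the CDK Molecule class and not JTHERGAS code; the names GraphAtom, GraphBond, MoleculeGraph and the helper ethene() are hypothetical and chosen only for this example. Atoms carry the atomic number, charge and radical flag; bonds carry the two bonded atoms and the bond order, with no length or direction.

```java
import java.util.ArrayList;
import java.util.List;

/** Node of the colored graph: an atom and its properties (hypothetical class). */
final class GraphAtom {
    final int atomicNumber;  // essential property
    final int formalCharge;  // additional property
    final boolean radical;   // additional property

    GraphAtom(int atomicNumber, int formalCharge, boolean radical) {
        this.atomicNumber = atomicNumber;
        this.formalCharge = formalCharge;
        this.radical = radical;
    }
}

/** Edge of the colored graph: two atom indices and a bond order (hypothetical class). */
final class GraphBond {
    final int first;
    final int second;
    final int order;  // 1 = single, 2 = double, 3 = triple

    GraphBond(int first, int second, int order) {
        if (order < 1 || order > 3) {
            throw new IllegalArgumentException("bond order must be 1, 2 or 3");
        }
        this.first = first;
        this.second = second;
        this.order = order;
    }
}

/** A molecule as a list of atoms and a list of bonds; no 3D coordinates are stored. */
final class MoleculeGraph {
    private final List<GraphAtom> atoms = new ArrayList<>();
    private final List<GraphBond> bonds = new ArrayList<>();

    /** Adds an atom and returns its index in the atom list. */
    int addAtom(int atomicNumber, int formalCharge, boolean radical) {
        atoms.add(new GraphAtom(atomicNumber, formalCharge, radical));
        return atoms.size() - 1;
    }

    /** Adds a bond between two existing, distinct atoms. */
    void addBond(int first, int second, int order) {
        if (first < 0 || second < 0 || first >= atoms.size()
                || second >= atoms.size() || first == second) {
            throw new IllegalArgumentException("invalid atom indices");
        }
        bonds.add(new GraphBond(first, second, order));
    }

    int atomCount() { return atoms.size(); }

    int bondCount() { return bonds.size(); }

    /** Sum of the bond orders of all bonds that touch the given atom. */
    int bondOrderSum(int atomIndex) {
        int sum = 0;
        for (GraphBond b : bonds) {
            if (b.first == atomIndex || b.second == atomIndex) {
                sum += b.order;
            }
        }
        return sum;
    }

    /** Example: ethene (C2H4) with explicit hydrogens. */
    static MoleculeGraph ethene() {
        MoleculeGraph m = new MoleculeGraph();
        int c1 = m.addAtom(6, 0, false);
        int c2 = m.addAtom(6, 0, false);
        m.addBond(c1, c2, 2);  // C=C double bond
        for (int carbon : new int[] {c1, c2}) {
            for (int i = 0; i < 2; i++) {
                int h = m.addAtom(1, 0, false);
                m.addBond(carbon, h, 1);  // C-H single bond
            }
        }
        return m;
    }
}
```

Calling MoleculeGraph.ethene() builds a graph with six atoms and five bonds. Supplementary properties such as resonance or aromaticity flags would be further fields on the atom or bond classes; they are left out here to keep the sketch short.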
The internal representation of the molecule within JTHERGAS is the Molecule class from the Chemistry Development Kit (CDK). This allows the use of many supplemental algorithms, including more complicated ones such as graph isomorphism and ring recognition. The ASCII representation of the molecule (for input and output and for storage in the SQL database) takes two forms. The first is the Chemical Markup Language[47] (CML), which is an extension of the more general XML used throughout the network. The second is an ASCII linear format, here called the Nancy Linear Form (NLF), which is an extension of the SMILES representation of a molecule. This format is used, for example, for ASCII input of a molecule, radical or substructure.