This is the interface for reading in a data source file and creating a set of catalog objects in a fundamental data component dataset. This page is just to give the 'Look and Feel' of the steps involved in uploading and interpreting a data source file. More complete documentation is forthcoming.
Reading in data is done with 3 transactions:
Upload file: Upload the source file. This transaction creates one catalog object which points to the uploaded source data. The uploaded file is stored at a unique address in JThermodynamicsCloud file storage.
Partition: This transaction takes the single source file and partitions it into individual component text blocks. The transaction creates one catalog object for each text block created.
Interpret: This transaction interprets each component text block and creates the corresponding catalog object of fundamental data. There is a one-to-one correspondence between the text block catalog object and the interpreted catalog object. The interpreted catalog object is what is used in the calculation.
Step 1, Upload, and Step 2, Partition, create preliminary data which is kept separate from the catalog object data used for the calculation. Associated with the preliminary data is a 'Unique Generic Name' which identifies the entire set of preliminary data. The catalog objects used for the calculation do not use this 'Unique Generic Name', but instead use a dataset name. All the fundamental data catalog objects under this dataset name are used for the calculation. A dataset name is used because the objects in the total dataset could come from several reads. For example, the fundamental data could be distributed over several source files where there is a natural division of the data. In addition, a new subset of data could be read in later and added to the previously created dataset objects.
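The sketch below illustrates this naming convention; the class, names, and values are hypothetical (not the actual JThermodynamicsCloud code). Each read carries its own 'Unique Generic Name' for its preliminary data, while the fundamental data objects it yields are collected under the one dataset name the calculation looks up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DatasetNamingSketch {

    // A fundamental data catalog object remembers which read produced it.
    record FundamentalData(String uniqueGenericName, String content) {}

    public static void main(String[] args) {
        // Two separate reads (e.g. two source files) contribute to the same dataset.
        List<FundamentalData> fromFirstRead = List.of(
                new FundamentalData("BensonTables-Read1", "text block 1"),
                new FundamentalData("BensonTables-Read1", "text block 2"));
        List<FundamentalData> fromSecondRead = List.of(
                new FundamentalData("BensonTables-Read2", "text block 3"));

        // The calculation looks up objects by dataset name, not by the generic names.
        Map<String, List<FundamentalData>> datasetCollection =
                Map.of("ThermoDataset", new ArrayList<>());
        datasetCollection.get("ThermoDataset").addAll(fromFirstRead);
        datasetCollection.get("ThermoDataset").addAll(fromSecondRead);

        System.out.println("Objects used for the calculation: "
                + datasetCollection.get("ThermoDataset").size());
    }
}
```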
In the upload data interface there are 6 tabs, two for each step, representing the stages of converting a text file into a set of catalog objects:
Step 1: File Upload: This is the setup for the initial upload of the file into file storage. Additional reference information can be given, such as links to other objects, websites, or bibliographic references.
Upload Transaction: This is the result of the upload. In this menu the catalog object is shown. This catalog object is the first object of preliminary data that is formed with the given 'Unique Generic Name'. The catalog object includes the supplementary data along with the location of the source file in file storage. In this menu, the transaction, and hence the catalog object associated with reading the file, can be deleted.
Step 2: Partition File: This is the setup for partitioning the source file into blocks, where each block represents one piece of fundamental data.
Partition Transaction: This is the result of the partitioning. It lists the preliminary data catalog object created for each text block isolated from the source file.
Step 3: Interpret Partition: This is the setup for interpreting each text block into fundamental data.
Interpret Transaction: This is the result of the interpretation. It lists the fundamental data catalog objects that are added to the dataset.
This is the setup page where the file to be uploaded is specified along with supplementary data, such as the title (one line), a description of the file (abstract), the file source format, the 'Unique Generic Name', and the dataset name.
When the data is filled in, the transaction can be started. The result is listed in the next tab.
Supplementary data given in the file upload setup also includes links to other objects, links to other sites on the web, and bibliographic links. This example shows a bibliographic link to Benson's book on thermodynamics, which is the source of the data.
In addition to the 'Unique Generic Name', which identifies the preliminary data, the dataset reference information, such as the dataset name, status, and version, is also given.
The format type determines how the input is handled in the following upload steps.
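As an illustration, the information collected on this setup page could be modeled as in the sketch below; the field names and values are hypothetical and only mirror the items listed above.

```java
import java.util.List;

public class UploadSetupSketch {

    // Hypothetical container for the setup data described above.
    record UploadSetup(
            String filePath,           // file to be uploaded
            String title,              // one-line title
            String description,        // abstract describing the file
            String sourceFormat,       // determines how the later steps parse the file
            String uniqueGenericName,  // identifies this set of preliminary data
            String datasetName,        // dataset the fundamental data will be added to
            List<String> links) {}     // object, website, or bibliographic links

    public static void main(String[] args) {
        UploadSetup setup = new UploadSetup(
                "benson_tables.txt",
                "Benson thermodynamic tables",
                "Fundamental thermodynamic data transcribed from Benson's book",
                "BensonTableFormat",
                "BensonTables-Read1",
                "ThermoDataset",
                List.of("bibliographic: Benson's book on thermodynamics"));
        System.out.println("Ready to submit the upload transaction for: " + setup.title());
    }
}
```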
After the upload transaction is submitted, this interface shows the catalog object that was produced as the first piece of preliminary data toward creating fundamental data for use in calculations.
The catalog object has the address of the associated transaction. In this interface, it is possible to retract this transaction.
The link data that was included in the setup is incorporated in the catalog object.
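A rough sketch of what the resulting upload catalog object holds is given below; the field names and addresses are hypothetical. It carries the supplementary data from the setup, the storage address of the uploaded file, and the address of the transaction that can be retracted.

```java
public class UploadCatalogObjectSketch {

    // Hypothetical shape of the upload catalog object created by the transaction.
    record UploadFileCatalogObject(
            String uniqueGenericName,   // identifies the preliminary data of this read
            String fileStorageAddress,  // unique address of the uploaded file
            String transactionAddress,  // transaction that created this object (can be retracted)
            String bibliographicLink) { // link data carried over from the setup
    }

    public static void main(String[] args) {
        UploadFileCatalogObject obj = new UploadFileCatalogObject(
                "BensonTables-Read1",
                "filestore://uploads/0001/benson_tables.txt",  // illustrative address
                "transactions/upload/0001",                    // illustrative address
                "Benson's book on thermodynamics");
        System.out.println("Source file stored at: " + obj.fileStorageAddress());
    }
}
```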
In this part of the interface, the setup for partitioning the file into blocks is given. The data is already filled in from the previous setup. The data format, which was given in the upload setup, determines how the file is to be parsed. Each block represents one piece of fundamental data.
Submitting the partition file transaction creates a set of preliminary data catalog objects. In this interface, a reference to each catalog object is given.
Each of the objects listed in the partition file transaction tab can be visualized.
The partition data is the text block that was isolated from the uploaded file. The position of this block within the file is also given.
In the next transaction, each of these blocks will be interpreted to produce a catalog object.
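The sketch below gives a rough idea of the partition step. The block boundaries used here (blank lines) are only an assumption for illustration; in practice the format chosen at upload dictates how the file is split.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionSketch {

    // One preliminary 'Parsed Block' catalog object per isolated text block.
    record ParsedBlock(int position, String text) {}

    // Blocks are assumed to be separated by blank lines; a real format
    // definition would dictate the actual block boundaries.
    static List<ParsedBlock> partition(String sourceFile) {
        List<ParsedBlock> blocks = new ArrayList<>();
        int position = 0;
        for (String text : sourceFile.split("\\n\\s*\\n")) {
            blocks.add(new ParsedBlock(position++, text.trim()));
        }
        return blocks;
    }

    public static void main(String[] args) {
        String source = "C2H6  -20.0  54.8\n\nC2H4   12.5  52.4\n";  // made-up content
        partition(source).forEach(b ->
                System.out.println("block " + b.position() + ": " + b.text()));
    }
}
```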
This is the setup for the interpretation of the text blocks that were isolated in the last transaction. The result of this transaction will be fundamental data that will be inserted into the dataset for this calculational component. The data from this source file adds to the data currently in the dataset for the component.
The setup data is filled in already from the previous setup information. The file format determines how the text block will be interpreted.
The resulting transaction holds the references to each of the fundamental data catalog objects that were created from the interpretation of the individual text blocks.
As in the previous step, the actual catalog object can be visualized.
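The sketch below gives a rough idea of the interpretation of a single text block. The block layout and the fields of the fundamental data record are made up for illustration; the real interpretation is driven by the file format chosen at upload.

```java
public class InterpretSketch {

    // Hypothetical fundamental data record: species name, enthalpy of formation,
    // and standard entropy, as they might be transcribed from a thermodynamics table.
    record FundamentalData(String species, double enthalpyOfFormation, double entropy) {}

    // Assumes a block of the form "<species> <dHf> <S>"; illustrative only.
    static FundamentalData interpret(String block) {
        String[] fields = block.trim().split("\\s+");
        return new FundamentalData(fields[0],
                Double.parseDouble(fields[1]),
                Double.parseDouble(fields[2]));
    }

    public static void main(String[] args) {
        FundamentalData data = interpret("C2H6  -20.0  54.8");  // made-up values
        System.out.println(data.species() + ": dHf=" + data.enthalpyOfFormation()
                + ", S=" + data.entropy());
    }
}
```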
A dataset can be composed of several sources. This diagram shows the building of a dataset from three separate sources. Reading in a source file involves three transactions leading to the addition to the dataset: upload the file, partition the file into blocks and, finally, interpret each block into fundamental data.
The dataset is used to make a calculation. The dataset name is used within the dataset collection to reference the particular dataset.
The catalog objects that are created during the process of reading in data:
Upload File catalog object: There is one for each source file uploaded.
Parsed Block catalog object: There is one for each text block isolated from the source file.
Data catalog object: There is one fundamental data catalog object for each interpreted text block; the objects under the dataset name are those used in the calculation.
Transaction catalog object: There is one for each transaction (upload, partition, interpret); retracting a transaction deletes the catalog objects it created.
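One possible way to picture the relationships between these catalog objects is sketched below; the types and addresses are hypothetical and only summarize the one-per-file, one-per-block, and one-per-transaction correspondences described above.

```java
import java.util.List;

public class CatalogObjectsSketch {

    record Transaction(String address) {}                                    // one per upload/partition/interpret
    record UploadFile(String fileAddress, Transaction from) {}               // one per uploaded source file
    record ParsedBlock(int position, String text, Transaction from) {}       // one per isolated text block
    record Data(String interpreted, ParsedBlock source, Transaction from) {} // one per interpreted block

    public static void main(String[] args) {
        Transaction upload = new Transaction("transactions/upload/0001");
        Transaction partition = new Transaction("transactions/partition/0001");
        Transaction interpret = new Transaction("transactions/interpret/0001");

        UploadFile file = new UploadFile("filestore://uploads/0001", upload);
        ParsedBlock block = new ParsedBlock(0, "C2H6  -20.0  54.8", partition);
        Data data = new Data("C2H6", block, interpret);

        List.of(file, block, data).forEach(System.out::println);
    }
}
```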