Documentation

Part 1: Introduction and Preliminary Information

The toXML library runs the whole conversion process, but it first needs to know the XML format it is to produce. That knowledge is held in the instructor, which the library creates once, on the first run, and simply loads on every later run. Understanding how and when the instructor is created is therefore crucial, and readers are advised to skim "Part 3" below before going on.

The workflow of using toXML, in general, is as follows:

1. The user
  1. creates a copy of the toXML sample spreadsheet on their Drive
  2. defines the XML format (explained in "Part 2" below)
  3. builds up their dataset(s) (explained in "Part 4" below)
  4. sets their options for the conversion process (explained in "Part 5" below)
  5. invokes the library via the toXML menu with the "Start" command.
2. The library, in turn,
  1. analyzes the XML definition
  2. creates the instructor (explained in "Part 3" below)
  3. loads the instructor
  4. analyzes the data in the dataset(s)
  5. starts converting the data (explained in "Part 6" below).
3. The next time the user wants to either
  1. recreate their XML file
  2. or start a new spreadsheet to create XML files with the same XML format,

they can start with step #1.3 and the library starts with step #2.4.

Part 2: Defining the XML Format

The XML is to be defined in the settings sheet whose name can be set with the "appSettSheetName" constant in the attached script.

If you delete the constant, the script will not work. If you set it to undefined or null, or if no sheet with the specified name can be found, the library will attempt to convert the dataset to a KML file.

You therefore do not need the settings sheet to create KML files. Even so, the toXML sample spreadsheet includes one, named "..", that contains the definition of the KML elements to give you an idea of how an XML format should be defined.

The settings sheet must include two blocks of information (the list of elements, and the separator and indicator characters) and a cell storing the path of the instructor.

The list of elements must start in the cell whose address can be set with the "appSettSheet_Xml" constant in the attached script.

If you delete the constant, the library will not work. If you set it to undefined or null, if its value is not a valid cell address, or if the cell in question is blank, the library will attempt to convert the dataset to a KML file, and the entire settings sheet is ignored.

The path of the instructor is explained in "Part 3" below. Again, if the library cannot find this particular cell, it will attempt to convert the dataset to a KML file, and the entire settings sheet is ignored.

The separator and indicator characters are a set of characters used, while building up the list of elements, to separate elements from each other and to indicate which features the elements have.

In conclusion, when the user invokes the library via the toXML menu, the library can analyze the XML definitions only if (1) the value of the "appSettSheetName" constant is valid and a sheet with that name exists, (2) the value of the "appSettSheet_Xml" constant is a valid cell address and that cell is not blank, and (3) the cell that stores the path of the instructor is found. Otherwise, it will assume the user wants to create a KML file based on their dataset.
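
The three conditions can be read as a small decision function. The following is only a sketch in plain JavaScript, not the library's code: it assumes a hypothetical in-memory model in which a spreadsheet is an object mapping sheet names to 2D arrays of cell values, and the names parseA1 and chooseTargetFormat are invented for illustration. It also assumes the "cF_1=" anchor (see "Part 3") may sit anywhere in the settings sheet.

```javascript
// Hypothetical model: a spreadsheet is { sheetName: [[cell, cell, ...], ...] }.
// None of these function names come from toXML; they only illustrate the decision.

// Convert an A1-style address such as "C1" to zero-based [row, col]; null if invalid.
function parseA1(address) {
  const m = /^([A-Z]+)([1-9][0-9]*)$/.exec(String(address || ""));
  if (!m) return null;
  let col = 0;
  for (const ch of m[1]) col = col * 26 + (ch.charCodeAt(0) - 64);
  return [Number(m[2]) - 1, col - 1];
}

function chooseTargetFormat(spreadsheet, appSettSheetName, appSettSheet_Xml) {
  // (1) the settings sheet must exist
  const sheet = spreadsheet[appSettSheetName];
  if (!Array.isArray(sheet)) return "KML";

  // (2) the start cell must be a valid address and must not be blank
  const pos = parseA1(appSettSheet_Xml);
  if (!pos) return "KML";
  const first = (sheet[pos[0]] || [])[pos[1]];
  if (first === undefined || first === null || first === "") return "KML";

  // (3) the anchor of the cell of instructor path must be present (assumed: anywhere)
  const hasPathAnchor = sheet.some(row => row.includes("cF_1="));
  return hasPathAnchor ? "XML" : "KML";
}
```

With a sheet named ".." whose cell C1 holds "Element1" and which contains a "cF_1=" cell, chooseTargetFormat(ss, "..", "C1") yields "XML"; any failed condition yields "KML".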

How to define the XML elements:

Starting from the cell whose address is set with the "appSettSheet_Xml" constant in the attached script, the XML elements must be defined down the same column, one per row, each together with its attributes and children. Example:

appSettSheet_Xml = 'C1'

Cell C1: Element1

Cell C2: Element2

...

Entries after a blank cell will be ignored.

Each row corresponds to one parent followed by its attributes and children, separated by the primary separator. Example (space character is the primary separator):

Parent1 Attribute Child

Parent2

...

The primary separator: The library gets the primary separator from the cell next to the cell with the value "cI_2=". This cell is optional: if it is removed, the primary separator falls back to its default value, the space character.

For convenience, instead of being separated with the primary separator, each child and attribute, or any combination of them, can also be entered in the cells to the right. There must be no blank cell in between, though. Example (space character is the primary separator):

appSettSheet_Xml = 'C1'

Column C Column D Column E Column F

Row 1: Parent1 Attribute Child1 Child2

Row 2: Parent2 Attribute1 Attribute2 Child1 Child2

Row 3: Parent3

...
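
The reading rules above (one parent per row, entries split by the primary separator or spread over the cells to the right, nothing read past a blank cell) could be sketched as follows. This is an illustration under assumptions, not the library's implementation: rows are plain arrays, tokenizeRow and readDefinitions are hypothetical names, and stopping at the first blank cell within a row is an assumed reading of "there must be no blank cell in between".

```javascript
// Illustrative only: `rowCells` is a plain array holding one sheet row and
// `startCol` is the zero-based column of the appSettSheet_Xml cell.
function tokenizeRow(rowCells, startCol, primarySeparator = " ") {
  const tokens = [];
  for (let c = startCol; c < rowCells.length; c++) {
    const cell = String(rowCells[c] ?? "").trim();
    if (cell === "") break; // assumed: nothing is read past a blank cell
    // A cell may hold one entry or several entries joined by the separator.
    for (const part of cell.split(primarySeparator)) {
      if (part.trim() !== "") tokens.push(part.trim()); // drop empty pieces
    }
  }
  return tokens;
}

// Read definition rows downwards until the first blank cell in the start column.
function readDefinitions(rows, startRow, startCol, primarySeparator = " ") {
  const result = [];
  for (let r = startRow; r < rows.length; r++) {
    const tokens = tokenizeRow(rows[r], startCol, primarySeparator);
    if (tokens.length === 0) break; // entries after a blank cell are ignored
    const [parent, ...members] = tokens;
    result.push({ parent, members });
  }
  return result;
}
```

Whether an entry is typed into one cell with separators or spread over several cells, the resulting token list is the same, which is the point of the convenience described above.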

Further, every element is defined by one, two or three blocks separated by the secondary separator. The first block is the label of the element, preceded where needed by the attribute indicator if the element is an attribute. For children, the number of the row where the child in question is the parent can be entered instead of its label, which keeps the list cleaner. Example (space character is the primary separator, ">" is the secondary separator, quote is the attribute indicator):

12th row: Parent1>block2>block3 Attribute>block2

random row: Parent2>block2 12>block2 Attribute

...

The secondary separator: The library gets the secondary separator from the cell next to the cell with the value "cI_3=". This cell is optional: if it is removed, the secondary separator falls back to its default value, the ">" character.

The attribute indicator: The library gets the attribute indicator from the cell next to the cell with the value "cI_1=". This cell is optional: if it is removed, the attribute indicator falls back to its default value, the quote character.
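
The anchor-and-default behaviour of these three cells can be sketched in a few lines. The snippet below is a hypothetical illustration: the sheet is modelled as a 2D array, readSpecialCharacters is an invented name, "next to" is assumed to mean the cell immediately to the right of the anchor, the quote character is assumed to be the double quote, and a blank value cell is assumed to fall back to the default.

```javascript
// Defaults for the three characters described so far, as stated in the text.
const DEFAULTS = { "cI_1=": '"', "cI_2=": " ", "cI_3=": ">" };

// `sheet` is a hypothetical 2D array of cell values.
function readSpecialCharacters(sheet) {
  const found = { ...DEFAULTS };
  for (const row of sheet) {
    for (let c = 0; c < row.length - 1; c++) {
      const anchor = row[c];
      if (Object.prototype.hasOwnProperty.call(DEFAULTS, anchor)) {
        const value = row[c + 1]; // assumed: the value sits right of the anchor
        if (typeof value === "string" && value !== "") found[anchor] = value;
      }
    }
  }
  return {
    attributeIndicator: found["cI_1="],
    primarySeparator: found["cI_2="],
    secondarySeparator: found["cI_3="],
  };
}
```

If an anchor is missing altogether, the loop never overwrites the default, which mirrors the statement that these cells can be removed.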

Label block:

Note that since attributes cannot be parents, only tags can be defined with a row number.

If you prefer to type the label of the child instead of its row number and the child is not a parent in any row, the library will set it as an attribute. However, there can be cases where an attribute and a parent have the same label. If you refer to such an attribute by its label, you have to prepend the attribute indicator to the label; otherwise the library will take it for the parent. Example (space character is the primary separator, ">" is the secondary separator, quote is the attribute indicator):

12th row: AA>block2>block3

random row: BB>block2>block3 "AA>block2

...
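
The resolution rules for the label block can be summarised in code. This is a minimal sketch under assumptions: resolveChild is a hypothetical name, the parents are given as a Map from 1-based row numbers to labels, the quote is taken to be the double quote, and raising an error for a row number without a parent is an assumed behaviour, not something the text states.

```javascript
// `parentLabels` maps 1-based row numbers to the parent label defined in that row.
function resolveChild(label, parentLabels, attributeIndicator = '"') {
  // An explicit attribute indicator always wins.
  if (label.startsWith(attributeIndicator)) {
    return { kind: "attribute", label: label.slice(attributeIndicator.length) };
  }
  // A row number can only refer to a tag, since attributes cannot be parents.
  if (/^[0-9]+$/.test(label)) {
    const row = Number(label);
    if (!parentLabels.has(row)) throw new Error("No parent defined in row " + row);
    return { kind: "tag", row, label: parentLabels.get(row) };
  }
  // A typed label is a tag if it is the parent in some row ...
  for (const [row, parent] of parentLabels) {
    if (parent === label) return { kind: "tag", row, label };
  }
  // ... and an attribute otherwise.
  return { kind: "attribute", label };
}
```

For the example above, with "AA" defined as the parent in row 12, both "12" and "AA" resolve to the tag, while the entry with the prepended attribute indicator resolves to an attribute that merely shares the label.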

Datatype block:

Optionally, one of the other two blocks can be a string that stands for the datatype of the element's content. The string you type to indicate the datatype must contain at least one lowercase letter. Example:

AA>angle180

...

Features block:

The other block, if needed, is composed of the characters indicating the features of the element:

The abstract and the not-abstract indicators: Whether the element is abstract or not

The library gets these indicators from the cells next to the cells with the values "cI_4yes=" and "cI_4no=". These cells are optional: if they are removed, the indicators fall back to their default values, the "A" and the "X" characters, respectively.

In fact, defining parents as not-abstract is redundant. The not-abstract indicator is meaningful only where an element is defined as abstract in the row where it is the parent but must be treated as not-abstract when it is a child of another parent.

The required and the not-required indicators: Whether the element is required or not

The library gets these indicators from the cells next to the cells with the values "cI_5yes=" and "cI_5no=". These cells are optional: if they are removed, the indicators fall back to their default values, the "R" and the "N" characters, respectively.

In fact, defining parents as not-required is redundant. The not-required indicator is meaningful only where an element is defined as required in the row where it is the parent but must be treated as not-required when it is a child of another parent.

The unpaired and the paired indicators: Whether the element is unpaired or paired

The library gets these indicators from the cells next to the cells with the values "cI_6yes=" and "cI_6no=". These cells are optional: if they are removed, the indicators fall back to their default values, the "U" and the "P" characters, respectively.

Normally, the library detects which elements are paired and which are not. With these indicators, however, the user can define a paired element as unpaired when it is a child, or vice versa, if their XML format requires it. This means that if a paired parent is defined as unpaired when it is a child of another parent, its children will not be included under it in the final XML file, nor will it be allowed to store any data even if it normally could.

The repetition indicator, and use of an integer greater than "1": How many times the element can or must be repeated

The library gets this indicator from the cell next to the cell with the value "cI_7=". This cell is optional: if it is removed, the indicator falls back to its default value, the "+" character.

You can define repetition info for parents, as well.

In the string that forms the block (when the element is defined as required, repetition is a "must"; otherwise, it is a "can"):

- a repetition indicator followed by an integer means that the element can or must be repeated that many times at most.
- a repetition indicator preceded by an integer means that the element can or must be repeated that many times at least.
- a repetition indicator without a preceding or following integer means that the element can or must be repeated as many times as the user wants.
- an integer without a preceding or following repetition indicator means that the element can or must be repeated exactly that many times.
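
The four cases in the list above map naturally onto a pair of bounds. The sketch below is an illustration only: parseRepetition and parseCount are hypothetical names, the function receives just the repetition part of the features block, and treating a missing lower bound as 1 is an assumption drawn from the fact that an explicit "1" is described as superfluous. Whether the bounds are a "can" or a "must" depends on the required indicator, which is deliberately left out here.

```javascript
// Parse the repetition fragment of a features block, e.g. "+2", "4+", "+", "3" or "".
function parseRepetition(fragment, indicator = "+") {
  const at = fragment.indexOf(indicator);
  if (at === -1) {
    // No indicator: a bare integer means "exactly n"; nothing at all means once.
    const exact = fragment === "" ? 1 : parseCount(fragment);
    return { atLeast: exact, atMost: exact };
  }
  const before = fragment.slice(0, at);                 // integer preceding the indicator
  const after = fragment.slice(at + indicator.length);  // integer following the indicator
  return {
    atLeast: before === "" ? 1 : parseCount(before),    // assumed default lower bound
    atMost: after === "" ? Infinity : parseCount(after),
  };
}

// Accept only plain non-negative integers.
function parseCount(text) {
  if (!/^[0-9]+$/.test(text)) throw new Error("Not a count: " + text);
  return Number(text);
}
```

Under these assumptions "+2" gives an upper bound of 2, "4+" a lower bound of 4 with no upper bound, "+" no upper bound at all, and "3" exactly three repetitions.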

The group-of-alternative-elements indicator: Whether the element is a member of a group of alternative elements

The library gets this indicator from the cell next to the cell with the value "cI_8=". This cell is optional: if it is removed, the indicator falls back to its default value, the "-" character.

You can define alternative groups for parents, as well.

In the string that forms the block, the indicator must be followed by a single character that is neither lowercase nor numeric, which stands for the group of alternative elements within the particular parent.

All these indicators must be a single character that is neither numeric nor lowercase, and each must be unique. None of them may be null or equal to the label of any XML element. Their order within the block does not matter.
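
These constraints are easy to check mechanically. The following sketch shows one possible check; validateIndicators is a hypothetical name and the property names used for the indicators are invented for illustration, while the default characters are the ones given in the text.

```javascript
// Return a list of problems found in a set of indicator characters.
// `indicators` is a plain object such as { abstract: "A", required: "R", ... }.
function validateIndicators(indicators, elementLabels = []) {
  const problems = [];
  const seen = new Set();
  for (const [name, value] of Object.entries(indicators)) {
    if (typeof value !== "string" || [...value].length !== 1) {
      problems.push(name + ": must be exactly one character");
      continue;
    }
    const isLowercase = value !== value.toUpperCase() && value === value.toLowerCase();
    if (/[0-9]/.test(value)) problems.push(name + ": must not be numeric");
    if (isLowercase) problems.push(name + ": must not be lowercase");
    if (seen.has(value)) problems.push(name + ": duplicates another indicator");
    if (elementLabels.includes(value)) problems.push(name + ": equals an element label");
    seen.add(value);
  }
  return problems;
}
```

An empty result means the set is usable; the default characters "A", "X", "R", "N", "U", "P", "+" and "-" pass this check.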

Since the datatype block must contain at least one lowercase letter while the features block contains none, the script can tell which block identifies the datatype and which block sets the features of the element. As neither block is compulsory, their order does not matter, either.
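
The way the two optional blocks can be told apart is shown in the sketch below. It is an illustration, not the script's actual code: splitElementDefinition is a hypothetical name, only ASCII lowercase letters are considered, and rejecting two blocks of the same kind is an assumed behaviour.

```javascript
// Split "Label>block>block" into its label, datatype and features blocks.
function splitElementDefinition(definition, secondarySeparator = ">") {
  // Empty pieces caused by redundant separators are dropped.
  const blocks = definition.split(secondarySeparator).filter(b => b !== "");
  const [label, ...rest] = blocks;
  const result = { label, datatype: null, features: null };
  for (const block of rest) {
    // A lowercase letter marks the datatype; anything else is the features block.
    const key = /[a-z]/.test(block) ? "datatype" : "features";
    if (result[key] !== null) throw new Error("Two " + key + " blocks in " + definition);
    result[key] = block;
  }
  return result;
}
```

Because the classification looks only at the content of each block, "Elem>angle180>R" and "Elem>R>angle180" produce the same result.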

Examples (with default values of the indicators):

Elem>A            Elem is an abstract element
Elem>angle180>R   Elem is a required element containing an "angle180" datatype
Elem>+            Elem can be repeated as many times as wanted
Elem>R+           Elem must be repeated as many times as wanted
Elem>+2           Elem can be repeated twice at most (it may be left out, or included once or twice)
Elem>R4+          Elem must be repeated four times at least
Elem1>-A          Elem1 and Elem2 are alternatives to each other (only one of them can be included in the particular parent)
Elem2>-A          Elem1 and Elem2 are alternatives to each other (only one of them can be included in the particular parent)

Examples of redundant entries:

Elem>1     is actually Elem      "1" is redundant
Elem>+1    is actually Elem      "+1" is redundant
Elem>1+    is actually Elem>+    "1" is redundant
Elem>R1    is actually Elem>R    "1" is redundant
Elem>R+1   is actually Elem>R    "+1" is redundant
Elem>R1+   is actually Elem>R+   "1" is redundant
Parent>X                         "X" is redundant for parents
Parent>N                         "N" is redundant for parents

Examples of the not-abstract and the not-required indicators:

12th row: Elem1>AR            Elem1 is abstract and required when it is the parent in the XML

random row: Elem2 12>XN       Elem1 is not abstract and not required when it is a child of parent Elem2

...

Example of other usages of the indicators when defining children:

12th row: Elem1 8             When it is the parent in the XML, Elem1 has a child and is therefore paired

random row: Elem2 12>U ...    Treat Elem1 as unpaired when it is a child of parent Elem2

Invalid definitions, such as defining an attribute as paired or abstract, will be detected as well, and the library will terminate with a dialog message so that the user can correct the definition.

Critical facts (some are already mentioned above):

The primary separator, the secondary separator and the space character may appear consecutively, or at the beginning or the end of a row. The script removes such redundant characters in any case.

You can set the datatype as the second block and the features as the third or vice versa. But the first block must be the label of the element.

If you type the label of a child instead of the number of the row where it is the parent, the library will find the row where it is the parent and replace it with the row's number. If it cannot find any row where the child is the parent, the element in question will be set as an attribute.

However, there can be cases where a parent and an attribute have the same label, in which case the library would take both for the child tag. To avoid this, prepend the attribute indicator to the label of the attribute.

You do not need to define every element as unpaired or paired, since the script detects which elements are which. Use these indicators only where a paired element must be unpaired when it is a child, or vice versa.

To the left of the cells storing the separators and indicators are labels, which have no effect whatsoever and can be changed or removed (like the label of the cell of instructor path in the same sheet, explained in "Part 3"). The values "cI_1=", "cI_2=" and the like, however, are anchors that let the library find the special characters the user has chosen. If they are changed or removed, the library cannot find those characters and will go on with the default values, which may well be what the user wants.

Part 3: All about the Instructor

The instructor is composed of script functions, classes and algorithms that instruct the library on how to process a dataset while converting it to the specific XML file that the user aims at. Therefore, if the instructor for the desired XML format has not been created before or cannot be found, the library has to create it.

When the user invokes the library via the toXML menu with the "Start" command, the library checks a special cell with the value "cF_1=" in the settings sheet and reads the value of the cell to the right of it (from now on called the "cell of instructor path").¹

If the settings sheet cannot be found, or it can be found but the list of the XML elements (explained in "Part 2") cannot, the library will not create the instructor but will instead start converting the dataset to a KML file. The instructor for KML files resides in the library.

If the settings sheet and the list of the XML elements (explained in "Part 2") can be found and
- the cell is blank, or the cell is not blank but its value is not a valid path (i.e. the library cannot find the instructor in the user's Drive), the user is asked where in the Drive and under what name the instructor should be saved.
  - If the user enters nothing, it is assumed that the user has cancelled, and the library terminates.
  - If the user enters a string representing the path², the library (1) analyzes the definitions, (2) creates the instructor, (3) stores the path in the cell of instructor path so that the instructor need not be created every time the library is invoked, (4) loads the instructor and (5) starts converting the dataset in the active sheet to the desired XML format.
- the cell is not blank and the library can find the instructor in the user's Drive, the library (1) loads the instructor and (2) starts converting the dataset in the active sheet to the desired XML format.
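
The branches above can be condensed into a single planning function. This is a sketch under stated assumptions, not the library's code: planRun is a hypothetical name, and Drive access and the user prompt are replaced by two hypothetical callbacks (instructorExists and askUserForPath) so that the example runs on its own.

```javascript
// All inputs are plain values or callbacks, so no Drive access is needed:
//   settingsFound    - settings sheet and element list were both found
//   cellValue        - content of the cell of instructor path ("" if blank)
//   instructorExists - hypothetical callback: is there an instructor at this path?
//   askUserForPath   - hypothetical callback returning the user's answer
function planRun({ settingsFound, cellValue, instructorExists, askUserForPath }) {
  if (!settingsFound) return { steps: ["convert to KML"] };

  // Known and valid path: the definitions are not analyzed again.
  if (cellValue !== "" && instructorExists(cellValue)) {
    return { path: cellValue, steps: ["load instructor", "convert"] };
  }

  // Blank or invalid path: ask the user where to save the instructor.
  const answer = askUserForPath();
  if (answer === "") return { steps: ["terminate"] };
  return {
    path: answer,
    steps: ["analyze definitions", "create instructor", "store path in cell",
            "load instructor", "convert"],
  };
}
```

Keeping the decision separate from the side effects makes it easy to see that the expensive steps (analysis and creation) only occur on the last branch.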

If the user later either changes the cell content or moves or removes the instructor in the Drive, the library will not be able to find the file the next time it runs and will create a new one.

Google Drive allows duplicate folder and file names. If several files share the path of the instructor, the library will load the most recently created one.
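
Choosing among duplicates then amounts to taking the maximum by creation date. A minimal sketch, assuming each candidate is represented by a plain object with a created date (pickLatest and the object shape are hypothetical):

```javascript
// `files` is an array of { name, created } objects, where `created` is a Date.
function pickLatest(files) {
  if (files.length === 0) return null;
  // Keep whichever candidate has the later creation date.
  return files.reduce((latest, f) => (f.created > latest.created ? f : latest));
}
```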

As long as the user does not change the value of the cell of instructor path and does not move or remove the instructor in their Drive, the library will not process the XML definitions again. The user can therefore remove the XML definitions, or even the settings sheet, after moving the two cells for the instructor path to, say, the options sheet. If the target XML is a KML file, those cells can be removed as well.

Similarly, the user can copy the same instructor path to other toXML spreadsheets.

¹ The string representing the path is expected to contain the names of the folders and of the instructor file in the user's Drive, separated by slashes². It is either
- typed in directly by the user, or
- set by the library, after the library asks for it.

² The folder names and the file name must be separated by slashes. If the user wants slash characters in the names themselves, an additional slash must be added, which works as an escape character. (For one slash, use two slashes; for two slashes, use three slashes; ... As a result of this rule, only the root folder name can start with a single slash and only the file name can end with a single slash.)
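
One plausible reading of the slash rule is that a single slash separates names, while a run of n slashes (n greater than one) stands for n minus one literal slashes inside a name. The sketch below implements that reading only; it is an assumption about the rule, parseInstructorPath is a hypothetical name, and the library may treat edge cases differently.

```javascript
// Split a path string into folder/file names, treating a lone "/" as a
// separator and a run of n >= 2 slashes as n - 1 literal slashes.
function parseInstructorPath(path) {
  const segments = [""];
  // Break the string into alternating runs of slashes and non-slashes.
  const runs = path.match(/\/+|[^/]+/g) || [];
  for (const run of runs) {
    if (run === "/") {
      segments.push("");                               // separator: start a new name
    } else if (run[0] === "/") {
      segments[segments.length - 1] += run.slice(1);   // escaped slashes
    } else {
      segments[segments.length - 1] += run;            // ordinary characters
    }
  }
  return segments;
}
```

Under this reading, "Folder/Sub//Dir/file" denotes the folder "Folder", the folder "Sub/Dir" and the file "file". It also shows why a name in the middle of the path cannot begin or end with a slash: the escape would merge with the neighbouring separator into one longer run.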

To the left of the cell of instructor path is a label, which has no effect whatsoever and can be changed or removed (like the labels of the other cells that store constants in the same sheet, explained in "Part 2"). The value "cF_1=", however, must in no way be changed or removed; otherwise the library would fall back to attempting to convert the dataset to a KML file.

Part 4: Building up Datasets

Part 5: Setting the Options for the Datasets and for the Conversion Process

Part 6: All about the Conversion Process