The Filechecker plugin makes it possible to:
- Read an FFF (Fixed Field File) in either of its two formats: binary or text.
- Identify the leafrecords and composite records of an FFF.
- Validate the syntax of the fields and the structure of an FFF.
- Verify the values of the fields.
The first 3 points require an XML file called the 'FFF descriptor'. Verifying the values of the fields requires, in addition to the FFF descriptor, an XPath queries file.
Terminology about the FFF
An FFF is made up of leafrecords, and each leafrecord is made up of fields. For instance:
This file is made up of 2 leafrecords, each consisting of 3 fields.
A textual FFF (an FFF of type text) has one record per line (records are separated by a line-break character), so records are accessed sequentially. Within a record, the position and the number of characters of each field are known. (In our example, the civility field is 3 characters long, whereas the name and the first name are 10 characters each.)
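As an illustration of the paragraph above, here is a minimal Python sketch of reading a textual FFF. It is not the plugin's code: the field names, the helper functions and the sample values are invented for the example, and only the widths (3, 10 and 10 characters) come from the text.

```python
# Hypothetical layout built from the widths quoted in the text:
# civility = 3 characters, name = 10, first name = 10.
FIELDS = [("civility", 3), ("name", 10), ("first_name", 10)]

def parse_record(line, fields=FIELDS):
    """Slice one fixed-width line into a dict of field values."""
    expected = sum(width for _, width in fields)
    if len(line) != expected:
        raise ValueError(f"expected {expected} characters, got {len(line)}")
    record, pos = {}, 0
    for name, width in fields:
        # Each field sits at a known position and has a known length.
        record[name] = line[pos:pos + width].rstrip()
        pos += width
    return record

def parse_text_fff(text):
    """Read a textual FFF sequentially, one record per line."""
    return [parse_record(line) for line in text.splitlines()]

def make_line(civility, name, first_name):
    """Build a padded sample line (sample data is invented)."""
    return civility.ljust(3) + name.ljust(10) + first_name.ljust(10)

sample = "\n".join([make_line("Mr", "Smith", "John"),
                    make_line("Mrs", "Doe", "Jane")])
records = parse_text_fff(sample)
```

A line whose length differs from the sum of the field widths is rejected here, which is one simple way to catch a structural error early.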
- The same holds for a binary file. All records have the same number of bytes, which makes it possible to distinguish them from each other. Inside a record, the position and the number of bytes of each field are known.
- If all the records of an FFF are of the same type (as in our example), we speak of a mono-recording file. If the file contains several kinds of records, we speak of a multi-recording file, and each record has one or more identifier fields. For instance:
In this example of a multi-recording file, the identifier value '00' identifies a physical (natural) person and the identifier value '01' identifies a moral (legal) person.
- If a file contains only leafrecords, we speak of a 'flat file'.
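The role of identifier fields in a multi-recording file can be sketched as follows. This is only an illustration: it assumes that the identifier is the first 2 characters of each record and that the two values are '00' (physical person) and '01' (moral person). The function names and sample lines are hypothetical.

```python
# Assumption: the identifier field occupies the first 2 characters.
RECORD_TYPES = {"00": "physical person", "01": "moral person"}

def identify(line, id_start=0, id_length=2):
    """Return the record type selected by the identifier field."""
    key = line[id_start:id_start + id_length]
    if key not in RECORD_TYPES:
        raise ValueError(f"unknown identifier {key!r}")
    return RECORD_TYPES[key]

def is_mono_recording(lines):
    """A file is mono-recording when every record has the same type."""
    return len({identify(line) for line in lines}) == 1

# Invented sample lines.
lines = ["00Mr Smith     John", "01ACME Ltd"]
kinds = [identify(line) for line in lines]
```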
When a sequence of leafrecords forms a unit, we speak of a composite record. For instance:
In this example, each composite record is made up of 3 leafrecords:
- The leafrecord '00' for the civility.
- The leafrecord '01' for the address.
- The leafrecord '03' for the phone number.
Among the leafrecords that make up a composite record, we can distinguish 3 kinds:
- The opening record, which is the first leafrecord of a composite record.
- The closing record, which is the last leafrecord of a composite record. It makes it possible to detect the end of a composite record, but it is not mandatory (in the previous example, there are no closing records).
- The other leafrecords are called children leafrecords.
Remark: Generally, the sequence of leafrecords in a composite record is subject to business rules. For instance, a person must have a civil status AND a phone number.
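The grouping of leafrecords into composite records, and the kind of rule mentioned in the remark, can be sketched as below. The sketch assumes a 2-character identifier at the start of each record, '00' as the opening record and no closing record, as in the example. The function names and sample lines are invented.

```python
def group_composites(lines, opening_id="00", closing_id=None, id_length=2):
    """Group leafrecords into composite records.

    A new composite starts at each opening record. If a closing record
    is defined, it ends the current composite explicitly.
    """
    composites, current = [], None
    for line in lines:
        rec_id = line[:id_length]
        if rec_id == opening_id:
            if current is not None:
                composites.append(current)
            current = [line]
        elif current is None:
            raise ValueError(f"record {line!r} appears outside a composite")
        else:
            current.append(line)
            if closing_id is not None and rec_id == closing_id:
                composites.append(current)
                current = None
    if current is not None:
        composites.append(current)
    return composites

def satisfies_rule(composite, required_ids=("00", "03"), id_length=2):
    """Example rule: a civil status ('00') AND a phone number ('03')."""
    present = {line[:id_length] for line in composite}
    return all(rec_id in present for rec_id in required_ids)

# Invented sample: the second person has no phone number.
lines = ["00Mr Smith", "011 Main Street", "030123456789",
         "00MrsDoe", "012 High Street"]
composites = group_composites(lines)
```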
Structure of the FFF descriptor
An FFF descriptor is an XML file with the following structure:
The <ROOT> tag is the root element of the FFF descriptor. It is used to:
- State the schema used to validate the FFF descriptor.
- Describe the general characteristics of the FFF to verify.
It contains 2 tags:
- The <sequences> tag (optional): it contains the sequence definitions used for auto-incremented fields.
- The <records> tag: it contains the description of the records of the file and is made up of:
  - A <leaves> tag, which contains n <leafRecord> tags (they describe the leafrecord types of the FFF to verify).
  - A <composites> tag, which contains n <compositeRecord> tags (they describe the composite record types of the FFF to verify).
The <root> tag must have the following attributes:
The attributes of the <root> tag are explained below:
The 'xmlns' attribute declares the URL of the default namespace. This means that the XML elements used in the FFF descriptor must be defined in this namespace.
This namespace provides several attributes that are used to declare the schema against which the file is validated.
The 'schemaLocation' attribute of the namespace "http://www.w3.org/2001/XMLSchema-instance" declares the XSD schema to use for the validation and associates it with the previous namespace.
- The other attributes of the <root> tag describe the general characteristics of the FFF to verify:
  - The 'name' attribute: the name of the file to use.
  - The 'binary' attribute: a boolean specifying whether the FFF is binary or not.
  - The 'encoding' attribute: the encoding of the file. Encoding names are those of the Java class 'java.nio.charset.Charset'. For a binary file this attribute is mandatory, whereas for a text file it is optional: if it is not specified, the encoding of the Java Virtual Machine is used.
  - The 'bytesPerLine' attribute: for a binary file, the number of bytes per record.
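To make the role of these attributes concrete, the following Python sketch reads them from a hypothetical descriptor root and uses 'bytesPerLine' and 'encoding' to split a binary FFF into records. The attribute values and sample data are invented, and the namespace attributes are left out for brevity. The platform's preferred encoding is used as a stand-in for the JVM default. "ISO-8859-1" is chosen because it is a valid name both for java.nio.charset.Charset and for Python codecs.

```python
import locale
import xml.etree.ElementTree as ET

# Hypothetical descriptor root: values are invented for illustration.
DESCRIPTOR = """
<root name="persons.dat" binary="true" encoding="ISO-8859-1" bytesPerLine="23">
  <records/>
</root>
"""

def read_settings(xml_text):
    """Extract the general characteristics of the FFF from the root tag."""
    root = ET.fromstring(xml_text)
    binary = root.get("binary", "false").lower() == "true"
    encoding = root.get("encoding")
    if encoding is None:
        if binary:
            raise ValueError("'encoding' is mandatory for a binary FFF")
        # Stand-in for the JVM default encoding mentioned in the text.
        encoding = locale.getpreferredencoding(False)
    bytes_per_line = root.get("bytesPerLine")
    return {
        "name": root.get("name"),
        "binary": binary,
        "encoding": encoding,
        "bytes_per_line": int(bytes_per_line) if bytes_per_line else None,
    }

def split_binary(data, bytes_per_line):
    """Cut a binary FFF into records that all have the same length."""
    if len(data) % bytes_per_line:
        raise ValueError("file length is not a multiple of bytesPerLine")
    return [data[i:i + bytes_per_line]
            for i in range(0, len(data), bytes_per_line)]

settings = read_settings(DESCRIPTOR)
# Invented binary content: two records of 3 + 10 + 10 = 23 bytes.
raw = b"".join(
    (civ.ljust(3) + name.ljust(10) + first.ljust(10)).encode(settings["encoding"])
    for civ, name, first in [("Mr", "Smith", "John"), ("Mrs", "Doe", "Jane")]
)
records = [chunk.decode(settings["encoding"])
           for chunk in split_binary(raw, settings["bytes_per_line"])]
```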
<sequences> and <sequence> elements
The <sequences> tag contains a list of <sequence> tags. Sequences are counters. They are used to increment fields of type 'autonumber'.
Each <leafRecord> tag describes a leafrecord. Inside each <leafRecord> tag, <fields> and <field> tags describe the fields of that leafrecord.
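The idea of a sequence acting as a counter for an 'autonumber' field can be sketched as follows. The text does not list the attributes of the <sequence> tag, so the start value and the step used here are assumptions, as is the way the counter is compared with the field values.

```python
from itertools import count

def check_autonumber(values, start=1, step=1):
    """Return the indexes of records whose autonumber field breaks the sequence.

    `start` and `step` are hypothetical parameters of the counter.
    """
    counter = count(start, step)
    return [index for index, value in enumerate(values)
            if int(value) != next(counter)]

# Invented field values: the third record skips a number.
errors = check_autonumber(["001", "002", "004"])
```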
The <compositeRecord> tag contains 3 tags:
- The <openingRecord> tag, which defines the opening leafRecord. The text of this tag must match the value of the 'name' attribute of the corresponding <leafRecord> tag.
- The <closingRecord> tag, which defines the closing leafRecord. The text of this tag must match the value of the 'name' attribute of the corresponding <leafRecord> tag.
- The <children> tag which contains the list of the children records.
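The link between <openingRecord> / <closingRecord> and the 'name' attribute of a <leafRecord> can be checked with a few lines of Python. The descriptor below is a hypothetical fragment: the record names are invented and the namespace is omitted. With a default namespace declared, ElementTree would need namespace-qualified tag names.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor fragment; 'footer' is deliberately undefined.
DESCRIPTOR = """
<root>
  <records>
    <leaves>
      <leafRecord name="civility"><fields/></leafRecord>
      <leafRecord name="address"><fields/></leafRecord>
      <leafRecord name="phone"><fields/></leafRecord>
    </leaves>
    <composites>
      <compositeRecord>
        <openingRecord>civility</openingRecord>
        <closingRecord>footer</closingRecord>
        <children/>
      </compositeRecord>
    </composites>
  </records>
</root>
"""

def dangling_references(xml_text):
    """List opening/closing records that name no declared leafRecord."""
    root = ET.fromstring(xml_text)
    leaf_names = {leaf.get("name") for leaf in root.iter("leafRecord")}
    problems = []
    for composite in root.iter("compositeRecord"):
        for tag in ("openingRecord", "closingRecord"):
            node = composite.find(tag)
            if node is None:
                continue
            target = (node.text or "").strip()
            if target not in leaf_names:
                problems.append((tag, target))
    return problems

problems = dangling_references(DESCRIPTOR)
```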
Succession of the composite Record's children
The sequence of a composite record's children is defined by a pattern built by combining 'and', 'or' and 'repeat' clauses.
The 'and' clause indicates that an A-type leafRecord AND a B-type leafRecord (AND a C-type leafRecord...) must be present. The number of records included in an 'and' clause must be greater than or equal to 2.
The 'or' clause indicates that an A-type leafRecord OR a B-type leafRecord (OR a C-type leafRecord...) must be present. The number of records included in an 'or' clause must be greater than or equal to 2.
The 'repeat' clause indicates that a leafRecord must be present a number of times between a defined minimum and maximum (min>=0, min<max<unbounded).
Combination of the 'AND', 'OR' and 'REPEAT' clauses
The 'and', 'or' and 'repeat' clauses can be combined recursively.
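One possible way to evaluate such a recursive pattern is sketched below in Python. This is not the plugin's algorithm: the tuple representation and function names are invented, 'and' is assumed to mean "in the listed order", and a maximum of None stands for an unbounded repeat.

```python
def match(pattern, records, start=0):
    """Return the set of positions where `pattern` can end, starting at `start`.

    Patterns are nested tuples (hypothetical representation):
      ("leaf", name), ("and", p1, p2, ...), ("or", p1, p2, ...),
      ("repeat", minimum, maximum, p) with maximum=None for unbounded.
    """
    kind = pattern[0]
    if kind == "leaf":
        found = start < len(records) and records[start] == pattern[1]
        return {start + 1} if found else set()
    if kind == "and":
        # Assumption: the sub-patterns must appear in the listed order.
        positions = {start}
        for sub in pattern[1:]:
            positions = {end for pos in positions
                         for end in match(sub, records, pos)}
        return positions
    if kind == "or":
        return {end for sub in pattern[1:]
                for end in match(sub, records, start)}
    if kind == "repeat":
        _, minimum, maximum, sub = pattern
        ends = {start} if minimum == 0 else set()
        frontier, done = {start}, 0
        while frontier and (maximum is None or done < maximum):
            # Iterations that consume no record are ignored, so the loop
            # always terminates; a repeated sub-pattern is assumed to
            # consume at least one record.
            frontier = {end for pos in frontier
                        for end in match(sub, records, pos) if end > pos}
            done += 1
            if done >= minimum:
                ends |= frontier
        return ends
    raise ValueError(f"unknown clause {kind!r}")

def is_valid(pattern, records):
    """The children are valid when the pattern consumes all of them."""
    return len(records) in match(pattern, records)

# An A-type record followed by 1 to 3 records of type B or C.
children = ("and", ("leaf", "A"),
            ("repeat", 1, 3, ("or", ("leaf", "B"), ("leaf", "C"))))
```

Returning a set of end positions instead of a single one lets an enclosing clause try every way an inner 'repeat' or 'or' could have matched, which keeps the combination of clauses correct without explicit backtracking.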
Validation of the FFF descriptor
When an FFF descriptor is loaded by the FileChecker, it must be validated. Its structure is validated against the XSD schema that it declares, and further complementary validations ensure the functional coherence of the file.