file processing commands



File Processing and I/O

Accessing and manipulating disk-resident data are tasks that must be performed by any application that has long-term information storage requirements. Almost all business applications utilize or manipulate external information in some form, and many scientific programs also have data input and output, so any good programming language must incorporate commands to enable the processing of data that is external to the program.

All native CobolScript data processing is done with ASCII text files, commonly referred to as flat files; this flat file processing is the primary focus of this chapter. CobolScript will correctly process data files that are either fixed field width or single-character delimited. If the data in the file is delimited, the parsing of the fields is handled internally by CobolScript.

The data records in CobolScript data files are stored sequentially, meaning one after another. Sequential organization is the most straightforward approach to organizing records within a file; the operations that can be performed on such a file are necessarily basic, and in CobolScript, input and output commands are restricted to entire-file operations (OPEN and CLOSE), entire-record operations (READ, WRITE, REWRITE), and an operation that moves the file pointer (POSITION). If you have previously dealt only with relational database access methods to retrieve or modify data, you should pay special attention to this chapter: data access methods such as direct SQL calls are strictly a CobolScript Professional Edition feature and are not available from within CobolScript Standard Edition.

It is, however, possible for CobolScript Standard Edition to interact with a relational database, provided that three conditions are met: the RDBMS (relational database management system) supports stored procedures, these procedures can be called from the system prompt, and the RDBMS is able to direct the output from stored procedure calls to flat files. Our interaction technique, which uses a combination of stored procedure calls and intermediate flat files, is described in the last section of this chapter. Since your actual technique will vary depending on the relational database that you use and any firewall that may exist on your network, the information in that section is presented at a more conceptual level than the other sections in the chapter.

If you are programming with CobolScript Professional Edition, and you want to directly interact with a relational database using CobolScript LinkMaker™’s embedded SQL capability, refer to Appendixes G and H for instructions on configuring and using LinkMaker™.

Describing Files and Defining Data Records

Before any processing can be done on a data file, you must first describe it using an FD statement, and you must create a record variable that defines the individual fields within each data record. See the Data and Copybook Files section of Chapter 3 for more details on describing a file and defining a data record.

Opening Files

Before you can begin reading data from a file or writing data to a file, you must first open the file. Opening a file lets the operating system know that you intend to perform an input or output operation on that file, and prepares the file for subsequent operations. You can open a file in CobolScript for reading, writing, updating, or appending.

If you open a file for writing and the file already exists, its contents will be destroyed and a new file created in its place. Opening a file for reading, updating, or appending, however, will not destroy the file’s contents.

The DELIMITED WITH clause can be added to an OPEN statement to indicate that a data file is delimited, meaning that fields are separated with a single-character delimiter that is specified after the WITH keyword. The absence of the DELIMITED WITH phrase indicates that the data file has fixed width fields, which will be separated based on the individual field sizes in the record definition.

Below are some examples of each variation of the OPEN statement, with and without the DELIMITED WITH clause:

OPEN test_file FOR READING.

OPEN `test.dat` FOR READING DELIMITED WITH `|`.

OPEN test_file FOR WRITING.

OPEN test_file FOR WRITING DELIMITED WITH `,`.

OPEN `test.dat` FOR APPENDING.

OPEN test_file FOR APPENDING DELIMITED WITH `,`.

If you’re working in a Unix environment, you must have the appropriate permissions set for your data files; specifically, read as well as write permissions must be set on all data files for all file processing options. Even files that are only opened for reading must have Unix write permissions set, because early versions of CobolScript used OPEN FOR READING to update records as well as to read them; to be backward compatible, current versions of CobolScript still support this format.

Closing Files

After you have finished working with a file, you must close it. Closing a file releases the file descriptor to the operating system; failing to close a file will cause the file to be locked and appear unavailable to other applications. Here is an example of the CLOSE statement:

CLOSE `test.dat`.

In the following CobolScript program, we simply open and close a file. Since it is opened for writing, the file will be created if it does not already exist, or overwritten if it does already exist.

1 io_file PIC X(n) value `IO.DAT`.

FD io_file RECORD IS 100 BYTES.

OPEN io_file FOR WRITING.

CLOSE io_file.

Reading Records From Files

The READ statement reads one data record from the data file and loads it into the target record variable. A single READ will read data until it reaches a line terminator, at which point it stops. The line terminator is the ASCII character or character combination that is used by your operating system to indicate the end of a line, usually either the carriage return or carriage return and linefeed characters in combination. The line terminator is not included in the record data.

The AT END clause of the READ statement traps the end-of-file condition: when a READ reaches the end-of-file marker, the statement that follows AT END is executed. The example below uses a MOVE statement; any simple one-line statement, such as DISPLAY or COMPUTE, could be substituted for the MOVE. The clause should be used in most cases; if the AT END clause is not specified, reaching the end of a data file will cause a CobolScript error.

Once a data record has been read and the target record variable populated, the component fields of the record variable can be used like any other variable. Below is some example code that utilizes the READ statement:

1 test_file PIC X(n) VALUE `TEST.DAT`.

FD test_file RECORD IS 100 BYTES.

1 input_record.

5 ir_component_1 PIC X(50).

5 ir_component_2 PIC X(50).

1 eof PIC 9 VALUE 0.

OPEN test_file FOR READING.

PERFORM UNTIL eof

READ test_file INTO input_record

AT END MOVE 1 TO eof

DISPLAY `Record component 1 is: ` & ir_component_1

END-PERFORM.

CLOSE test_file.
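
The READ behavior described in this section can also be modeled outside of CobolScript. The following Python sketch is an illustration only and is not CobolScript's implementation: the function name read_records is hypothetical, and we assume that the line terminator is either a linefeed or a carriage return and linefeed in combination. It shows that each read returns one record without its line terminator, and that the end-of-file condition is detected by the read that finds no more data, which is the condition that AT END traps.

```python
import io


def read_records(stream):
    """Yield one record per line from a binary stream, without the line terminator."""
    while True:
        line = stream.readline()
        if not line:
            # No more data: this is the condition that AT END traps.
            break
        # Strip only CR and LF so that trailing spaces in the record are preserved.
        yield line.rstrip(b"\r\n")


# Two 9-byte records: one ends with CR+LF, the other with LF only.
data = io.BytesIO(b"1   test \r\ntest1    \n")
records = list(read_records(data))
print(records)
```

Because only the terminator characters are stripped, the trailing spaces that pad each record to its full length remain part of the record data.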

Overwriting a File

To overwrite a file, just open it for writing and write the new output to the file using the WRITE statement. Writing will put data from a source literal or variable into a single record in the file. In this example, the fields comprising record_variable are assumed to have already been populated:

OPEN test_file FOR WRITING DELIMITED WITH `|`.

WRITE record_variable TO test_file.

CLOSE test_file.

Appending New Records to an Existing File

To append records to the end of an existing file, open the file for appending and write each record to the file using the WRITE statement. Each WRITE statement will add the source record to the file as the last sequential data record. Here’s the code for several appends to a delimited data file:

1 test_file PIC X(n) VALUE `test.dat`.

1 bytes_num PIC 99 VALUE 10.

FD test_file record is bytes_num bytes.

OPEN test_file FOR APPENDING DELIMITED WITH `,`.

WRITE `12345` TO test_file.

WRITE `1234` TO test_file.

WRITE `123` TO test_file.

CLOSE test_file.

The following output will be written to the file test.dat. Each record is shown between backquotes, which are not part of the file, so that the trailing spaces are visible:

`12345,    `

`1234,     `

`123,      `

Each of the three records above is made up of three components: the source literal from the WRITE statement that created that record, followed by the comma delimiter, and then followed by enough spaces to make the total length of the record equal to ten characters. Note that even when files are opened as delimited files, CobolScript still right-pads the record with spaces until it is the total length declared in the FD statement (in this case, ten bytes). This padding is an intentional feature of CobolScript, because it simplifies the task of individually updating delimited data records. This also has relevance if you intend to update delimited data records created outside of CobolScript; see the next section on updating records for more information.
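
The padding rule can be sketched in a few lines of Python. This is a model of the behavior described above and not CobolScript code; the function name format_delimited is hypothetical, and raising an error when the data does not fit is our own assumption, since the text does not say what CobolScript does in that case.

```python
def format_delimited(fields, delimiter, record_length):
    """Follow each field with the delimiter, then right-pad with spaces."""
    body = "".join(field + delimiter for field in fields)
    if len(body) > record_length:
        # Assumption: overflow handling is not described in the text.
        raise ValueError("fields and delimiters exceed the declared record length")
    return body.ljust(record_length)


# The three literals from the WRITE statements, with a 10-byte record length.
for literal in ("12345", "1234", "123"):
    record = format_delimited([literal], ",", 10)
    print(repr(record), len(record))
```

Every record comes out at exactly ten characters, regardless of the length of the source literal.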

If the DELIMITED WITH option is absent from our code block, as in the following:

OPEN test_file FOR APPENDING.

Then, assuming that the FD statement and everything else in our original block of code does not change, the following output will be written to test.dat (the backquotes mark the record boundaries and are not part of the file):

`12345     `

`1234      `

`123       `

Now let’s look at a slightly more complex case with a record variable that is made up of two fields. First, we’ll describe the file and define the record variable:

1 test_file PIC X(n) VALUE `test.dat`.

1 bytes_num PIC 99 VALUE 9.

FD test_file record is bytes_num bytes.

1 record_var.

5 field_1 PIC X(4).

5 field_2 PIC X(5).

Next, we’ll open the file and write some records. Note that this is a fixed width file, because there is no DELIMITED WITH clause in our OPEN statement:

OPEN `test.dat` FOR APPENDING.

MOVE `1` TO field_1.

MOVE `test` TO field_2.

WRITE record_var TO test_file.

MOVE `test` TO field_1.

MOVE `1` TO field_2.

WRITE record_var TO test_file.

CLOSE test_file.

The code above would produce the following output in the file test.dat (the backquotes mark the record boundaries and are not part of the file):

`1   test `

`test1    `

Note that each field inside a fixed width file has, not surprisingly, a fixed width. Therefore, the second field in the above example always begins at the fifth character of the record, regardless of the size of the data in the first field.

Now let’s take a look at what happens if we append delimited records instead of fixed width ones. We’ll first modify the original OPEN statement to handle comma-delimited data:

OPEN test_file FOR APPENDING DELIMITED WITH `,`.

Our record should be two bytes larger than the fixed width record to account for the two comma delimiters that will be in each record, so we must also modify the VALUE clause in our bytes_num variable declaration:

1 bytes_num PIC 99 VALUE 11.

We could also have changed our bytes_num value with a MOVE statement, so long as it preceded our FD. Either way, with the two above modifications, our code would write the following to test.dat (the backquotes mark the record boundaries and are not part of the file):

`1,test,    `

`test,1,    `

You can see that, unlike the fixed width file, the starting position of each individual field within a delimited record varies.
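
The difference in field positions can be made concrete with a short Python sketch. It is an illustration of the two layouts only, not of how CobolScript parses records internally; both function names are hypothetical.

```python
def fixed_field_starts(widths):
    """Return the 1-based starting column of each field in a fixed width record."""
    starts, column = [], 1
    for width in widths:
        starts.append(column)
        column += width
    return starts


def delimited_field_starts(record, delimiter, field_count):
    """Return the 1-based starting column of each field in a delimited record."""
    starts, column = [], 1
    for value in record.split(delimiter)[:field_count]:
        starts.append(column)
        column += len(value) + len(delimiter)
    return starts


# Fixed width: field widths of 4 and 5, as in record_var.
print(fixed_field_starts([4, 5]))
# Delimited: the two 11-byte records written in the example.
print(delimited_field_starts("1,test,    ", ",", 2))
print(delimited_field_starts("test,1,    ", ",", 2))
```

In the fixed width case the starting columns depend only on the record definition, while in the delimited case they depend on the data in each record.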

Writing to a File by Updating Existing Records

In certain situations, you will probably want to update a record that already exists in a data file without appending an additional record to the file. To update a record in a data file, you should first open the file for update using the UPDATING keyword, as in:

OPEN test_file FOR UPDATING.

Next, you should perform reads until you have read the record that you wish to update. Then, using the REWRITE statement, you can overwrite the old record, as in the following:

REWRITE record_variable TO test_file.

Here’s some code that demonstrates this technique more completely:

1 eof PIC 9 VALUE 0.

1 rec_found PIC 9 VALUE 0.

1 rec_position PIC 999999.

1 test_file PIC X(n) VALUE `TEST.DAT`.

FD test_file record is 9 bytes.

1 record_var.

5 field_1 PIC X(4).

5 field_2 PIC X(5).

1 customer_of_interest PIC X(n) VALUE `Dave`.

1 new_field_2_val PIC X(n) VALUE `Davie`.

OPEN test_file FOR UPDATING.

PERFORM VARYING rec_position FROM 1 BY 1 UNTIL eof OR rec_found

READ test_file INTO record_var

AT END MOVE 1 TO eof

IF field_1 = customer_of_interest

MOVE 1 TO rec_found

MOVE new_field_2_val TO field_2

REWRITE record_var TO test_file

END-IF

END-PERFORM.

CLOSE test_file.

IF eof

DISPLAY `Customer record of interest was not found.`

END-IF.

Because CobolScript right-pads delimited records with spaces, each record is the exact number of bytes specified in the length argument to the initial FD statement. This allows any CobolScript data record, whether fixed format or delimited, to be updated efficiently with a simple record overlay, without requiring any complex file reorganization for each update. However, if you process a delimited data file created with another application, such as a Microsoft Excel® CSV (comma-separated values) file, CobolScript updates to this file will usually not work properly, since the records in the file will generally have different byte lengths (reads and appends to the unmodified file will work correctly, however). The data must be copied to a different file via a CobolScript program before records can be individually updated. Here’s an example of a program that does this (available in the sample program RECCOPY.CBL):

1 input_file PIC X(n) value `INPUT.CSV`.

FD input_file RECORD IS 100 BYTES.

1 input_record.

5 ir_input_1 PIC X(33).

5 ir_input_2 PIC X(32).

5 ir_input_3 PIC X(30).

5 ir_input_4 PIC X.

1 output_file PIC X(n) value `OUTPUT.CSV`.

FD output_file RECORD IS 100 BYTES.

1 eof PIC 9 VALUE 0.

OPEN input_file FOR READING DELIMITED WITH `,`.

OPEN output_file FOR WRITING DELIMITED WITH `,`.

PERFORM UNTIL eof

READ input_file INTO input_record AT END MOVE 1 TO eof

WRITE input_record TO output_file

END-PERFORM.

CLOSE input_file.

CLOSE output_file.

GOBACK.
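
Before attempting individual record updates on a delimited file, it can be useful to confirm that every record already has the length declared in the FD statement. The Python sketch below is one way to make that check outside of CobolScript; the function name and the sample data are hypothetical, and the sketch assumes that records end with a linefeed or with a carriage return and linefeed.

```python
def records_with_wrong_length(data, record_length):
    """Return the 1-based numbers of records whose length differs from record_length."""
    wrong = []
    for number, line in enumerate(data.splitlines(), start=1):
        if len(line) != record_length:
            wrong.append(number)
    return wrong


# A file as another application might write it: no padding, varying lengths.
unpadded = b"Dave,Smith\r\nAl,Jones\r\n"
# The same data after being copied through a program that pads to 15 bytes.
padded = b"Dave,Smith,    \r\nAl,Jones,      \r\n"

print(records_with_wrong_length(unpadded, 15))
print(records_with_wrong_length(padded, 15))
```

An empty list means that every record has the declared length, so a record overlay will not disturb neighboring records.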

Relative and Absolute File Positioning

If you regularly process a large number of records in flat files, you’re probably aware of the time-consuming nature of sequential searches. As your file sizes increase, sequential search times increase by a proportional amount; if file sizes grow unchecked, search times will eventually become unacceptably long. In fact, this is perhaps the most critical limitation of flat file databases, and it is what prompts many organizations to opt instead for relational databases, more so than data granularity, manageability, or other considerations.

In CobolScript, flat file search times can be reduced by using the POSITION statement. This statement positions the file pointer at the beginning of a particular record within a text data file in a single step. If a data file uses a sequential numeric value as the record key value, a record within the file can be randomly (directly) accessed given that key value.

For COBOL developers, the POSITION statement functionality is similar to relative file processing.

POSITION works with standard text data files. The POSITION statement has two forms:

POSITION data_file AT RECORD record_number.

POSITION data_file RELATIVE OFFSET number_of_records.

The record_number value in the AT RECORD clause must be a positive integer in the range:

(1 <= record_number <= total number of records in file)

The record_number value (and hence the number of records in your data file) cannot exceed 2,147,483,647.

The number_of_records value used with the RELATIVE OFFSET clause must be an integer. This value indicates the number of records, counting from the current record, that the file pointer should be moved. Thus, a value of 1 will shift the file pointer one record forward in the data file; a value of -1 will shift the file pointer one record back. The number_of_records value must fall within the absolute range:

(-2,147,483,647 <= number_of_records <= 2,147,483,647)

Furthermore, a number_of_records value that causes the file pointer to be positioned before the beginning of the data file or after the end of the data file will cause a CobolScript error.
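
The bounds rule for relative moves can be modeled as simple arithmetic on record numbers. The Python sketch below is an illustration and not CobolScript's implementation; the function name is hypothetical, and we assume that the valid targets are the record numbers 1 through the total number of records, matching the range given for AT RECORD.

```python
def apply_relative_offset(current_record, number_of_records, total_records):
    """Return the record number the file pointer rests on after a relative move."""
    target = current_record + number_of_records
    # Assumption: valid targets are record numbers 1 through total_records.
    if target < 1 or target > total_records:
        raise ValueError("offset would move the file pointer outside the data file")
    return target


# After record 24331 has been read, the pointer rests on record 24332,
# so an offset of -2 lands on record 24330.
print(apply_relative_offset(24332, -2, 30000))
```

Note that a READ advances the file pointer by one record, so returning to the record before the one just read takes an offset of -2 rather than -1.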

When using the POSITION statement, the number of bytes specified in the BYTES clause of the FD statement for your file must exactly match the number of bytes in the data file record; this value is used to reposition the file pointer, and a BYTES value that is larger or smaller than the actual data record size will cause the file pointer to be incorrectly positioned.
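
To see why the BYTES value must be exact, consider how a record number translates into a byte offset. The Python sketch below is a model under stated assumptions, not a description of CobolScript internals: we assume that the offset is computed as the record number minus one, multiplied by the record length plus the line terminator length. The function names and sample data are hypothetical.

```python
import io


def record_offset(record_number, bytes_length, terminator_length):
    """Byte offset of the start of a 1-based record number."""
    if record_number < 1:
        raise ValueError("record numbers start at 1")
    return (record_number - 1) * (bytes_length + terminator_length)


def read_record(stream, record_number, bytes_length, terminator_length):
    """Seek to the computed offset and read one record's worth of bytes."""
    stream.seek(record_offset(record_number, bytes_length, terminator_length))
    return stream.read(bytes_length)


# Three 10-byte records, each followed by a one-byte line terminator.
records = [b"00001AAAAA", b"00002BBBBB", b"00003CCCCC"]
data = io.BytesIO(b"".join(r + b"\n" for r in records))

good = read_record(data, 3, 10, 1)  # record length matches the file
bad = read_record(data, 3, 9, 1)    # record length is one byte too small
print(good)
print(bad)
```

With the correct length, the third record is returned intact. With a length that is off by one byte, the error accumulates with each record skipped, and the read begins in the middle of the preceding record.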

The following POSITION example uses the AT RECORD clause to access a particular record based on a sequential key value. The record is then read and displayed. After this, the file pointer is repositioned to the record prior to the record first read by using the RELATIVE OFFSET clause of POSITION:

1 filename_var PIC X(n) VALUE `datafile.txt`.

1 bytes_num PIC 99 VALUE 50.

FD filename_var RECORD IS bytes_num BYTES.

1 record_variable.

5 order_nbr PIC 99999.

5 data_var PIC X(45).

1 key_val PIC 99999 VALUE 24331.

OPEN filename_var FOR READING.

POSITION filename_var AT RECORD key_val.

READ filename_var INTO record_variable.

IF order_nbr = key_val

DISPLAY `For order number ` & order_nbr & `, data = ` & data_var

ELSE

DISPLAY `Problem with order_nbr values in data file; check file.`

END-IF.

POSITION filename_var RELATIVE OFFSET -2.

READ filename_var INTO record_variable.

IF order_nbr = (key_val-1)

DISPLAY `For order number ` & order_nbr & `, data = ` & data_var

ELSE

DISPLAY `Problem with order_nbr values in data file; check file.`

END-IF.

CLOSE filename_var.

STOP RUN.





FD

Command:

FD

Syntax:

FD <filename> RECORD IS <bytes-length> BYTES.

Description:

The FD statement describes a data file’s location and its record length to CobolScript. This statement is a necessary precursor to all flat (text) file data processing work.

The filename is a literal or variable that includes the name of the data file as well as any path information, which is necessary if the file is not in the current working directory of the program. The bytes-length is a numeric variable or literal that indicates the record length, in bytes, of the file record. The bytes-length value should account for any delimiters that are in the record but should not account for end-of-line characters; these end-of-line characters vary between Windows and Unix platforms, and this variation is automatically accounted for by CobolScript. The bytes-length value must be exact for statements that rely on this value, such as POSITION, to work correctly.
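
As a quick check on the bytes-length arithmetic, the rule can be written as a small Python helper. This is a sketch with a hypothetical function name; it assumes one single-character delimiter after every field, including the last, which is the layout shown in the delimited output examples in Chapter 4.

```python
def fd_bytes_length(field_widths, delimited=False):
    """Record length for the FD statement, excluding end-of-line characters."""
    total = sum(field_widths)
    if delimited:
        # One single-character delimiter follows each field.
        total += len(field_widths)
    return total


# A record with fields of PIC X(4) and PIC X(5).
print(fd_bytes_length([4, 5]))
print(fd_bytes_length([4, 5], delimited=True))
```

The two results correspond to the 9-byte fixed width record and the 11-byte comma-delimited record used in the append examples.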

Once a data file has been described, it may be opened and further processed. For further information on describing files, see the Data and Copybook Files section of Chapter 3, CobolScript Language Constructs. For more information on data file processing, see Chapter 4, File Processing and I/O.


Example Usage:

Example with literal arguments:

FD `test.dat` RECORD IS 50 BYTES.

Example with variable arguments, which are defined prior to the FD:

1 test_file PIC X(n) VALUE `test.dat`.

1 bytes_length PIC 99 VALUE 50.

FD test_file RECORD IS bytes_length BYTES.

Example that includes path information for a Windows® machine:

1 test_file PIC X(n) VALUE `c:\windows\desktop\test.dat`.

1 bytes_length PIC 99 VALUE 50.

FD test_file RECORD IS bytes_length BYTES.

Example that includes path information for a Unix machine:

1 test_file PIC X(n) VALUE `/usr/cscript/test.dat`.

1 bytes_length PIC 99 VALUE 50.

FD test_file RECORD IS bytes_length BYTES.

See Also:

CLOSE

OPEN

POSITION

READ

REWRITE

WRITE

Sample Program:

FTP.CBL