
Program Format

This page discusses the "object code" of CBM BASIC.  Whenever you enter a line that starts with an unsigned number (no leading + or -), BASIC stores the line as part of a program, provided the number is between 0 and 63999, in a compressed format known as "object code".
 
Below are some hexadecimal memory dumps of "object code".  A program file (on cassette or disk, for example) stores exactly the same sequence of bytes, except that the file also begins with two extra bytes; those bytes give the start address in RAM from which the program was saved.  When loading a BASIC program, all versions of BASIC except v1.0 ignore those first two bytes.
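The file layout described above (two start-address bytes, then the object code) can be sketched in Python.  This is only an illustration: the function name split_prg is hypothetical, and the start address $1C01 is an assumption based on the C128 example on this page (the byte right after the leading null).

```python
import struct

def split_prg(data: bytes):
    """Split a program file image into (start_address, object_code)."""
    if len(data) < 2:
        raise ValueError("a program file needs at least the 2-byte start address")
    # The start address is a 16-bit word stored low byte first.
    (start_address,) = struct.unpack_from("<H", data, 0)
    return start_address, data[2:]

# Object code of the example program: bytes $1C01-$1C27 of the C128 dump.
OBJECT_CODE = bytes.fromhex(
    "10 1C 0A 00 4E 24 20"
    "B2 20 22 48 32 4F 22 00"
    "26 1C 14 00 99 20 22 48"
    "45 4C 4C 4F 20 22 3B 3A"
    "20 99 20 4E 24 00 00 00"
)

# Assumed file image: start address $1C01, then the object code.
prg_image = struct.pack("<H", 0x1C01) + OBJECT_CODE

start_address, body = split_prg(prg_image)
print(f"start address ${start_address:04X}, {len(body)} bytes of object code")
```

Note that the leading null at $1C00 is not part of the file image, which is why the assumed start address is $1C01 and not $1C00.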
 
The program format ("object code") is quite simple.  Each program line stored in RAM consists of 4 parts:
  1. Line-link; this is a 16-bit word (RAM address) that points to the next line of the program (points to the next line-link)
  2. Line-number; this is a 16-bit word that gives the line number, 0 to 63999, of the current program line (see #3)
  3. The tokens (compressed keywords) and parameters that form the current program line.
  4. A null (zero byte) to denote the end of the program line
A strange thing about CBM BASIC is that the byte preceding the first line MUST be a zero byte... otherwise errors will occur.  The sample memory dumps show this zero byte.
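The four-part layout above can be followed in a few lines of Python.  This is a minimal sketch, not how the BASIC ROM does it: walk_lines and its parameters are hypothetical names, and the bytes are copied from the C128 example dump on this page.

```python
def walk_lines(memory: bytes, base: int, start: int):
    """Yield (line_number, body_bytes) by following the line-links.

    memory: raw bytes of a RAM image
    base:   RAM address of memory[0]
    start:  RAM address of the first line-link
    """
    if memory[start - 1 - base] != 0:
        raise ValueError("the byte preceding the first line must be zero")
    addr = start
    while True:
        off = addr - base
        link = memory[off] | (memory[off + 1] << 8)          # part 1: line-link
        if link == 0:                                        # zero link: end of program
            return
        number = memory[off + 2] | (memory[off + 3] << 8)    # part 2: line number
        end = memory.index(0, off + 4)                       # part 4: null terminator
        yield number, memory[off + 4:end]                    # part 3: tokens and parameters
        addr = link

# RAM image $1C00-$1C27 from the C128 example (leading null included).
IMAGE = bytes.fromhex(
    "00 10 1C 0A 00 4E 24 20"
    "B2 20 22 48 32 4F 22 00"
    "26 1C 14 00 99 20 22 48"
    "45 4C 4C 4F 20 22 3B 3A"
    "20 99 20 4E 24 00 00 00"
)

lines = list(walk_lines(IMAGE, base=0x1C00, start=0x1C01))
for number, body in lines:
    print(number, body.hex(" ").upper())
```

The sketch trusts the links it reads, so a corrupted link could send it anywhere in the image; a more careful version would bounds-check every address.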
 
Example Source Code:
NEW

READY.
10 N$ = "H2O"
20 PRINT "HELLO ";: PRINT N$

RUN
HELLO H2O

READY.
 
Example Object Code (C128):
MONITOR

MONITOR
    PC  SR AC XR YR SP
; FB000 00 00 00 00 F8
M 1C00
>01C00 00 10 1C 0A 00 4E 24 20 : .....N$ 
>01C08 B2 20 22 48 32 4F 22 00 : . "H2O".
>01C10 26 1C 14 00 99 20 22 48 : ..... "H
>01C18 45 4C 4C 4F 20 22 3B 3A : ELLO ";:
>01C20 20 99 20 4E 24 00 00 00 :  . N$...
...                memory dump continues...
 
The null terminators (zero bytes) are shown with a red background: $1C0F and $1C25, plus the strange leading null at $1C00, which is NOT saved to a file.  The line-links are shown with a yellow background (10 1C at $1C01 and 26 1C at $1C10).  The 16-bit line numbers are shown with a green background (0A 00 at $1C03 and 14 00 at $1C12).  Tokens are shown with a blue background (B2 for = at $1C08, and 99 for PRINT at $1C14 and $1C21).  Everything else is punctuation/parameters (in ASCII-X).  Also note that the end of the program is identified by a double-zero line-link (00 00 at $1C26).  Because this is normally preceded by the null terminator of the last line, the end of a BASIC program (in object format) may be identified by a "triple null" (three zero bytes in a row).
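For readers without the colored dump, the same labelling can be reproduced by a small Python sketch.  The function name classify and the role names are hypothetical.  The sketch also makes a simplifying assumption: any byte of $80 or above outside quotes is treated as a token, which holds for this example but ignores special cases such as text after REM.

```python
def classify(memory: bytes, base: int, start: int):
    """Return a list of (address, byte, role) for every byte of the program."""
    out = [(start - 1, memory[start - 1 - base], "null (leading)")]
    off = start - base
    while True:
        link = memory[off] | (memory[off + 1] << 8)
        role = "line-link" if link else "end-of-program"
        out += [(base + off + i, memory[off + i], role) for i in (0, 1)]
        if link == 0:
            return out
        out += [(base + off + i, memory[off + i], "line-number") for i in (2, 3)]
        pos, quoted = off + 4, False
        while memory[pos] != 0:
            b = memory[pos]
            if b == 0x22:                       # a double quote toggles string mode
                quoted = not quoted
            is_token = b >= 0x80 and not quoted  # assumption: high bit set means token
            out.append((base + pos, b, "token" if is_token else "text"))
            pos += 1
        out.append((base + pos, 0, "null"))
        off = link - base

# RAM image $1C00-$1C27 from the C128 example (leading null included).
IMAGE = bytes.fromhex(
    "00 10 1C 0A 00 4E 24 20"
    "B2 20 22 48 32 4F 22 00"
    "26 1C 14 00 99 20 22 48"
    "45 4C 4C 4F 20 22 3B 3A"
    "20 99 20 4E 24 00 00 00"
)

roles = classify(IMAGE, base=0x1C00, start=0x1C01)
for address, value, role in roles:
    if role != "text":
        print(f"${address:04X} {value:02X} {role}")
```

The last three entries returned are the null terminator of line 20 followed by the two zero bytes of the end-of-program link, which is the "triple null" described above.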

NOTE that some people (who are desperate to save every possible byte) may fudge the standard and set the next-to-last line-link (the last non-zero line-link) to point to the terminating zero of the final line... in these rare cases, the program ends with only two null bytes.  It's an ugly hack; in case that sentence is unclear, below is the same program using that "save 1 byte hack":

Ugly Hack to Save 1 Byte (C128):
MONITOR

MONITOR
    PC  SR AC XR YR SP
; FB000 00 00 00 00 F8
M 1C00
>01C00 00 10 1C 0A 00 4E 24 20 : .....N$ 
>01C08 B2 20 22 48 32 4F 22 00 : . "H2O".
>01C10 25 1C 14 00 99 20 22 48 : ..... "H
>01C18 45 4C 4C 4F 20 22 3B 3A : ELLO ";:
>01C20 20 99 20 4E 24 00 00 .. :  . N$...
...                memory dump continues...
Only two things were changed in the hack: the link at $1C10 was changed from $1C26 to $1C25, and the program is 1 byte shorter (only 2 null bytes at the end, not 3).  This hack may make the program difficult to edit with the BASIC ROM editor.  The hack is uncommon.  Much more common, and able to save many more bytes, is omitting (almost) every space that is not inside quotes... this common practice makes a BASIC program listing difficult for a human to read!  See below...

Source Code with no (unquoted) spaces:
10 N$="H2O"
20 PRINT"HELLO ";:PRINTN$
5 bytes were saved without breaking compatibility.  (Combining this method with the previous hack will save 6 bytes.)
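The count of 5 bytes can be checked with a short Python sketch.  The function name strip_unquoted_spaces is hypothetical, and the sketch removes EVERY unquoted space, which is a simplification: as noted above, only "(almost)" every space can be dropped in practice.  The line numbers are left out because they are stored as 16-bit words, not as text.

```python
def strip_unquoted_spaces(body: str) -> str:
    """Remove every space that is not between double quotes."""
    out, quoted = [], False
    for ch in body:
        if ch == '"':
            quoted = not quoted        # entering or leaving a string literal
        if ch == " " and not quoted:
            continue                   # drop the unquoted space
        out.append(ch)
    return "".join(out)

# Line bodies of the example program (the text after each line number).
ORIGINAL = ['N$ = "H2O"', 'PRINT "HELLO ";: PRINT N$']

stripped = [strip_unquoted_spaces(body) for body in ORIGINAL]
saved = sum(len(a) - len(b) for a, b in zip(ORIGINAL, stripped))

for body in stripped:
    print(body)
print("bytes saved:", saved)
```

Counting characters works here because each removed space is stored as exactly one byte ($20), and the keywords tokenize the same way with or without the spaces.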

* NEW * If you (like me) examine the two hex dumps shown above, it may be VERY difficult to see the difference!!
To be crystal clear, here is the one row that differs, first normal and then hacked:
>01C10 26 1C 14 00 99 20 22 48 : ..... "H ;normal link
>01C10 25 1C 14 00 99 20 22 48 : ..... "H ;hacked link

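The "hacked" link points at the end-of-line null of line 20 ($1C25) instead of the "standard" next-line address ($1C26), so that null also serves as the first byte of the double-zero end-of-program link.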
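To see why the hacked program still ends cleanly, here is a small Python sketch that applies the hack to the normal image and follows the links in both versions.  The name line_numbers is hypothetical, and this only models reading the links; it says nothing about how the BASIC ROM editor copes with such a program.

```python
def line_numbers(memory: bytes, base: int, start: int):
    """Follow the line-links and return the line numbers found."""
    numbers, addr = [], start
    while True:
        off = addr - base
        link = memory[off] | (memory[off + 1] << 8)
        if link == 0:                  # double-zero link: end of program
            return numbers
        numbers.append(memory[off + 2] | (memory[off + 3] << 8))
        addr = link

# Normal RAM image $1C00-$1C27 from the C128 example.
NORMAL = bytes.fromhex(
    "00 10 1C 0A 00 4E 24 20"
    "B2 20 22 48 32 4F 22 00"
    "26 1C 14 00 99 20 22 48"
    "45 4C 4C 4F 20 22 3B 3A"
    "20 99 20 4E 24 00 00 00"
)

# Apply the hack: drop the final null and retarget the link at $1C10.
patched = bytearray(NORMAL[:-1])
patched[0x1C10 - 0x1C00] = 0x25        # link $1C26 becomes $1C25
HACKED = bytes(patched)

print(line_numbers(NORMAL, 0x1C00, 0x1C01), len(NORMAL))
print(line_numbers(HACKED, 0x1C00, 0x1C01), len(HACKED))
```

In the hacked image the word read at $1C25 is made of the terminator of line 20 and the single null after it, so it still reads as a zero link.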

Very, very sorry if that is confusing to you... please Contact Me for more details...


© H2Obsession, 2014, 2015