Somebody asked me, what does an assembler program look like? So I wrote a couple samples. This is the first one. It just copies a file.
Maybe a little background first. On IBM mainframe systems, an address, where you find something, like a print line, consists of the address in a 'base register' and an offset past the address in the base register. That allows my program to be loaded anywhere in memory. There are 16 general purpose registers, 0-15 that I can use. Since someone called me, the first thing I do is save their registers.
Convention is that reg-13 points to the active user's save area. Always. SO, I first save my caller's registers - in his save area. Next, I set my base register, and tell the assembler what my base register is, with a USING instruction. Then I get the address of my save area, and, I link, forward and backwards, my save area, and my caller's save area. I have my base register set, so that when instructions are assembled into executable code, the assembler knows where the base register points, and when it needs an address, calculates the offset from the base register, to, maybe, my print line. So, in general, every instruction will have a base register and an offset to reference a location.
In my code there will be DCBs (data control block) that's used to keep track of where I am in the files that I'm reading and/or writing. And I use GET and PUT instructions to read and write records.
Besides using reg-13 to point to a save area, convention is that, when control comes to me, reg-15 points to the start of my program, and reg-14 is where I should return to when I'm all done. One more thing: When my program starts, reg-1 can contain the address of fields that are passed to me. In simple programs, when they start, reg-1 points to another address, that then points to the PARM field, which can contain information that I want to use while my program is running. In these examples, the PARM field will contain 2 bytes of length, and then the character string I'm going to use.
OKAY, so first, we'll simply copy a file. Save caller's registers, set the base and our save area, and OPEN the input and output files. Then GET (read) and PUT (write) records until we're done. Last, CLOSE the files, reload caller's registers, and exit.
COPYFILE START 0
USING *,12
STM 14,12,12(13) SAVE CALLER'S REGISTERS
LR 12,15 OUR BASE REGISTER
ST 13,4(15) AND SET UP OUR SAVE AREA
ST 15,8(13)
LA 13,SAVEAREA
OPEN OPEN (IN,INPUT,OUT,OUTPUT)
GET GET IN READ
LR 0,1 LOAD RECORD ADDR INTO REG-0
PUT OUT,0 WRITE
B GET LOOP
Z CLOSE (IN,,OUT) ALL DONE, CLOSE FILES
L 13,4(13) LOAD ADDR OF CALLER'S SAVE AREA
LM 14,12,12(13) LOAD CALLERS REGISTERS
SR 15,15 RETURN CODE=0 MEANS IT'S GOOD
BR 14 EXIT
SAVEAREA DC 18F'0' OUR SAVE AREA.
*
* A DCB IS A DATA CONTROL BLOCK, THAT CONNECTS THIS
* PROGRAM TO THE FILES THAT ARE BEING READ AND WRITTEN.
*
IN DCB DDNAME=IN,DSORG=PS,EODAD=Z,
MACRF=GL,LRECL=3990,RECFM=FT
OUT DCB DDNAME=OUT,DSORG=PS,LRECL=133,
RECFM=FT,MACRF=PM
END COPYFILE
Next we'll copy only specific records in the file we're reading. If you read the first description, I mentioned the PARM field, where the system can pass me some data -- in this case, a character string that I'm interested in seeing. What we'll do is read a record, and then see if that character string is somewhere in the record. If not, we'll just forget it, and go read the next record. If we do find the character string, then we'll write the record, and go read the next, and repeat the process.
In the '70s, my boss wrote a COBOL program to do that. This one is assembler, but does that function. Read a record, then index down the record and see if the PARM string matches anything in the record. If not, skip it. If it does copy the record to the OUTPUT file.
SLOWSCAN START 0
USING *,12
STM 14,12,12(13) SAVE CALLER'S REGISTERS
LR 12,15 OUR BASE REGISTER
LA 2,SAVEAREA AND SET UP OUR SAVE AREA
ST 2,8(13)
ST 13,4(2)
LR 13,2
*
L 1,0(1)
LH 2,0(1)
LR 3,2
SH 3,=H'1'
EX 3,MVCPARM MOVE PASSED CHAR STRING TO OUR WORKAREA.
OPEN (IN,INPUT,OUT,OUTPUT)
B GET
*
MVCPARM MVC PARM(0),2(1)
CLC CLC PARM(0),0(R4)
*
PUT LR 0,1
PUT OUT,(0)
*
GET GET IN READ
LH R4,DCBLRECL-IHADCB+IN
SR R4,R2
LOOP EX R3,CLC THIS PROGRAM IS SLOW BECAUSE IT COMPARES
BE PUT THE STRING TO EVERY LOCATION IN THE
LA 1,1(1) RECORD. SO IF THE RECORD IS 5000 BYTES
BCT 2,LOOP LONG, THEN IT WILL DO ALMOST 5000
B GET COMPARES FOR EACH RECORD.
*
Z CLOSE (IN,,OUT) ALL DONE, CLOSE FILES
L 13,4(13) LOAD ADDR OF CALLER'S SAVE AREA
LM 14,12,12(13) LOAD CALLERS REGISTERS
SR 15,15 RETURN CODE=0 MEANS IT'S GOOD
BR 14 EXIT
SAVEAREA DC 18F'0' OUR SAVE AREA.
*
IN DCB DDNAME=IN,DSORG=PS,EODAD=Z,
MACRF=GL,LRECL=3990,RECFM=FT
OUT DCB DDNAME=OUT,DSORG=PS,LRECL=133,
RECFM=FT,MACRF=PM
END SLOWSCAN
So we've got the basics down. We want it to run faster, so what we'll do is find a character in the string, and only when we have a valid first character, then do the compare.
IBM has this neat instruction, TRT (translate and test). Keep in mind that, with 8 bits, there are 256 possible characters. The way TRT works is that you set up a 256 byte table that is all zeros. You then make 1 of the bytes non-zero. EG, in hex, the character "A" is a x'C1' (offset 193 in the table) so to look for "A" make that byte non-zero. Then you can do a TRT, and scan many bytes. TRT will stop when it finds a byte in your record that matches the "A" which you set. So if you're looking for "ABCDE" TRT will stop at the A, and put it's address in register-1. You can then do a compare to see if you found ABCDE. In our example, when you find the string, then you write the entire record. IF there isn't an "A" in the record, then TRT will indicate that the character is not found, and you can go read the next record. Today, there are hundreds of extra instructions (many to make sort run faster) but TRT was one of the original instructions.
There is a TR (translate) instruction that does change one character to another. TRT doesn't change anything, and at first, that is confusing. Once you get the hang of it, it's pretty cool.
In this case, we're going to look for the first character in the string that specified in the parm field. EG for "EXIT" we'll look for "E". (In the fancier version, we'll look for "X" because there aren't nearly as many of them.)
OH, there's another 'fancy' instruction, EX (execute). EG
EX 3,instruction.
Execute the specified instruction, substituting (sort of) the low byte of reg-3 for the 2nd byte (usually the length) of the "instruction". So I can load 200 into the low byte of reg-3 and scan 200 bytes of a field I'm looking at. OR, I can load the length of the passed string into reg-3 and move the string to my internal work area. In general, execute allows you to change the length of the instruction you're executing.
This is kind of the scan I wrote back around 1975, except in that case, I could search for several different strings, all at the same time.
SCAN1975 START 0
USING *,12
STM 14,12,12(13) SAVE CALLER'S REGISTERS
LR 12,15 OUR BASE REGISTER
ST 13,4(15) AND SET UP OUR SAVE AREA
ST 15,8(13)
*
L 1,9(1)
LH 2,0(1)
LR 3,2
SH 3,=H'1'
EX 3,MVCPARM
SR 14,14
IC 14,PARM LOAD THE 1ST CHAR
LA 15,TRTTBL(14) CALC THE OFFSET IN THE TRT TABLE
STC 14,0(15) AND MAKE THAT LOCATION NON-ZERO
OPEN (IN,INPUT,OUT,OUTPUT)
B GET
*
MVCPARM MVC PARM(0),2(1)
CLC CLC PARM(0),0(R4)
*
PUT PUT OUT,(0)
*
GET GET IN READ
LR 0,1
LH R4,DCBLRECL-IHADCB+IN
SR R4,R2
SH R4,=H'1'
*
LOOP LR R5,R4
SR R5,R1
BNP GET
EX R5,TRT
BZ GET
EX R2,CLC
BE PUT
LA 1,1(1)
B LOOP
*
TRT TRT 0(0,1),TRTTBL
CLC CLC 0(0,1),PARM
*
Z CLOSE (IN,,OUT) ALL DONE, CLOSE FILES
L 13,4(13) LOAD ADDR OF CALLER'S SAVE AREA
LM 14,12,12(13) LOAD CALLERS REGISTERS
SR 15,15 RETURN CODE=0 MEANS IT'S GOOD
BR 14 EXIT
SAVEAREA DC 18F'0' OUR SAVE AREA.
TRTTBL DC 256X'00'
*
IN DCB DDNAME=IN,DSORG=PS,EODAD=Z,
MACRF=GL,LRECL=3990,RECFM=FT
OUT DCB DDNAME=OUT,DSORG=PS,LRECL=133,
RECFM=FT,MACRF=PM
END SCAN1975
Okay, this is the program that I've been working on lately, just to keep my hand into coding. It's similar to the program description above, except that it finds the least frequently use character in the passed string and searches for that. When found, it backs up to the start of the proposed string and does the compare. As above, if we find the "X" in EXIT, then we back up 1 byte and do the compare.
FIND2 START 0
YREGS
USING *,13 NORMAL START STUFF, USING R13 AS BOTH
STM 14,12,12(13) SAVEAREA AND ALSO BASE. NOT SHORT OF
ST 13,4(15) REGS TODAY, BUT WHAT THE HECK.
ST 15,8(13)
LR 13,15
L R1,0(R1)
LH R2,0(R1)
SH R2,=H'1'
BM NOPARM
BZ NOPARM
ST R2,PARMLENX SAVE PARM LENGTH-1 AND PARM AND
MVC PARM(0),2(R1) ANOTHER COPY OF PARM TO GET
EX R2,*-6 CHAR FREQUENCY VALUES.
LR R0,R2
MVC LINE(99),PARM
TR LINE(99),FREQTBL
LA R1,LINE SCAN THROUGH PARM FREQ VALUES
LA R14,1(R1) TO FIND LEAST OFTEN USED CHAR
PARMCLC CLC 0(1,R1),0(R14) WHICH WE'LL SET IN THE TRT TABLE.
BNL *+6
LR R1,R14
LA R14,1(R14)
BCT 0,PARMCLC
*
S R1,=A(LINE) CALC OFFSET FO CHAR
ST R1,PRE
CVD R1,16(R13)
OI 23(R13),X'0F'
UNPK IDMSG+66(3),22(2,R13) REPORT IT IN ID MSG.
LR R0,R2
SR R0,R1
ST R0,POST SAVE CHARS AFTER OUR SELECTED CHAR
* SO WE DON'T COMPARE PAST END OF REC.
LA R3,PARM(R1)
SR R4,R4 SET UP ID MSG SO WE KNOW WHAT
IC R4,0(R3) WE THINK WE'RE DOING,
STC R4,IDMSG+53 AND SEE IF THAT MATCHES
MVC IDMSG+74(0),PARM WHAT WE WERE HOPING.
EX R2,*-6
LA R1,IDMSG+75(R2)
MVI 0(R1),C'"'
STC R4,TRTTBL(R4)
*
BAL R9,OPENSYSP OPEN FILES WRITE ID MSG
LA R2,IN
BAL R9,OPENIN
CLC =H'0',DCBLRECL-IHADCB+OUT
BNE *+16
MVC DCBLRECL-IHADCB+OUT,DCBLRECL-IHADCB+IN
MVC DCBRECFM-IHADCB+OUT,DCBRECFM-IHADCB+IN
LA R2,OUT
BAL R9,OPENOUT
SR R8,R8
L R6,PARMLENX
B GET
* ------------------ SET UP DONE. WORK AREAS, THEN OPEN ROUTINES.
RECADDR DC F'0'
PARMLENX DC F'0'
PRE DC F'0'
POST DC F'0'
*
DS 0D
#BYTES DC PL8'0',CL16' BYTES READ'
#IN DC PL8'0',CL16'RECORDS READ'
#OUT DC PL8'0',CL16'RECORDS COPIED'
#TRT DC PL8'0',CL16'TRT INSTS USED'
#CLC DC PL8'0',CL16'CLC INSTS USED'
DC X'FF'
*
PUSH PRINT --- OPEN FILES ---
PRINT NOGEN
OPENSYSP TM DCBOFLGS-IHADCB+SYSPRINT,DCBOFOPN
BOR R9
OPEN (SYSPRINT,OUTPUT)
PUT SYSPRINT,IDMSG
MVI LINE-1,C' '
MVC LINE,LINE-1
BR R9
*
OPENMSG DC C'OUT OPENED FOR OUTPUT, RECFM=.., LRECL=.....'
*
USING IHADCB,2 (USING I HAD A DCB ...)
OPENOUT TM DCBOFLGS,DCBOFOPN
BOR R9
MVC OPENMSG(3),DCBDDNAM
MVC OPENMSG+15(3),=C'OUT'
OPEN ((2),OUTPUT)
B OPENM
OPENIN TM DCBOFLGS,DCBOFOPN
BOR R9
MVC OPENMSG(3),DCBDDNAM
MVC OPENMSG+15(3),=C' IN'
OPEN ((2),INPUT)
POP PRINT
*
OPENM UNPK OPENMSG+29(3),DCBRECFM(2)
TR OPENMSG+29(2),HEX-240
MVI OPENMSG+31,C','
L R0,DCBLRECL
CVD R0,16(R13)
OI 23(R13),X'0F'
UNPK OPENMSG+39(5),21(3,R13)
MVC LINE(L'OPENMSG),OPENMSG
PUTLINE PUT SYSPRINT,LINE-1
MVC LINE,LINE-1
BR R9
* =============== DONE WITH SETUP. ===== THIS IS THE BUSINESS END =======
PUT L R0,RECADDR
PUT OUT,(0)
AP #OUT,P1
*
GET GET IN
AP #IN,P1
ST R1,RECADDR
C R8,=X'7FF0000'
BNL *+8
AH R8,DCBLRECL-IHADCB+IN
*
LH R5,DCBLRECL-IHADCB+IN
LA R5,0(R1) HAVE TO BE CAREFUL TO ONLY SCAN
S R5,POST WITHIN THE RECORD, LEAVING SPACE
A R1,PRE FOR CHARACTERS BEFORE AND AFTER
* THE SCAN CHAR.
LOOP LR R4,R5
SR R4,R1 CALC LENGTH (LEFT) TO SCAN
BNP GET Q. NONE?
CH R4,=H'256' Q. MORE OR LESS THAN 256 BYTES
BL SHORT LESS, GO DO THAT.
*
AP #TRT,P1
TRT 0(256,R1),TRTTBL
BNZ FOUND
LA R1,256(R1)
B LOOP
*
TRT 0(0,R1),TRTTBL
CLC 0(0,R15),PARM
*
SHORT AP #TRT,P1
EX R4,SHORT-12
BZ GET
FOUND LR R15,R1
LA R1,1(R1)
S R15,PRE
AP #CLC,P1
EX R6,SHORT-6
BNE LOOP
B PUT
* ================================ ALL DONE ===================
USING IHADCB,2
CLOSE TM DCBOFLGS,DCBOFOPN
BZR R9
CLOSE ((2))
CLI DCBDDNAM,C'S'
BER R9
MVC LINE(3),DCBDDNAM
MVC LINE+5(6),=C'CLOSED'
B PUTLINE
*
ED15 DC X'402020206B2020206B2020206B2020206B212020'
COUNTS LA R2,#BYTES
CVD R8,#BYTES
COUNTSL MVC LINE(L'ED15),ED15
ED LINE(L'ED15),0(R2)
MVC LINE+L'ED15+2(16),8(R2)
PUT SYSPRINT,LINE-1
LA R2,24(R2)
CLI 0(R2),X'FF'
BL COUNTSL
MVC LINE,LINE-1
BR R9
*
Z BAL R9,COUNTS
LA R2,IN
BAL R9,CLOSE
LA R2,OUT
BAL R9,CLOSE
LA R2,SYSPRINT
BAL R9,CLOSE
B EXIT
NOPARM WTO 'MISSING OR SHORT PARM',ROUTCDE=11
OI RC,12
*
EXIT SR 15,15
IC 15,RC
L 13,4(13)
L 14,12(13)
LM 0,12,20(13)
BR 14
LTORG
P1 DC X'1C'
RC DC X'00'
HEX DC C'0123456789ABCDEF'
IDMSG DC CL133' FIND2, ASSEMBLED &SYSDATE AT &SYSTIME, COPY RECS W'
WITH "?" AT OFFSET ... IN " '
LINE DC CL133' '
PARM DC CL100' '
PUSH PRINT
PRINT NOGEN
*
*** EXLST=OPENEXIT <- ADD TO //OUT
OPENEXIT DC 0F'0',X'87',AL3(OPENEXIT+4)
PUSH USING
USING *,15
CLC =H'0',DCBLRECL-IHADCB(R1)
BNZR 14
MVC DCBLRECL-IHADCB(2,R1),DCBLRECL-IHADCB+IN
MVC DCBRECFM-IHADCB(1,R1),DCBRECFM-IHADCB+IN
BR R14
POP USING
LTORG
*
IN DCB DDNAME=IN,DSORG=PS,EODAD=Z,MACRF=GL,LRECL=3990,RECFM=FT
OUT DCB DDNAME=OUT,DSORG=PS,LRECL=133,RECFM=FT,MACRF=PM
SYSPRINT DCB DDNAME=SYSPRINT,DSORG=PS,LRECL=133,RECFM=FT,MACRF=PM
POP PRINT
*
FREQTBL DC X'222120',253X'02' ---- FROM WIKIPEDIA ----
ORG FREQTBL+X'81'
DC AL1(82,15,28,43,97,22,20,61,70)
ORG FREQTBL+X'91'
DC AL1(02,07,40,24,67,75,19,01,60)
ORG FREQTBL+X'A2'
DC AL1(63,91,28,10,24,01,20,01)
ORG FREQTBL+C'A'
DC AL1(82,15,28,43,97,22,20,61,70)
ORG FREQTBL+C'J'
DC AL1(02,07,40,24,67,75,19,01,60)
ORG FREQTBL+C'S'
DC AL1(63,91,28,10,24,01,20,01)
ORG FREQTBL+C'0'
DC AL1(89,88,87,86,85,84,83,82,81,80)
ORG FREQTBL+C' '
DC AL1(99)
ORG FREQTBL+C'.'
DC AL1(9)
ORG FREQTBL+C','
DC AL1(9)
ORG FREQTBL+C'/'
DC AL1(9)
*
ORG FREQTBL+X'20' ------ ASCII CHARACTERS -----
DC AL1(98)
ORG FREQTBL+X'30'
DC AL1(89,88,87,86,85,84,83,82,81,80)
ORG FREQTBL+X'40'
DC AL1(82,15,28,43,97,22,20,61,70)
DC AL1(02,07,40,24,67,75,19,01,60)
DC AL1(63,91,28,10,24,01,20,01)
ORG FREQTBL+X'60'
DC AL1(82,15,28,43,97,22,20,61,70)
DC AL1(02,07,40,24,67,75,19,01,60)
DC AL1(63,91,28,10,24,01,20,01)
*
TRTTBL DC XL256'00'
*
END FIND2
Just to keep track of what I'm doing, there's a report that shows the string, the search character, and then record counts and instruction counts.
FIND2, ASSEMBLED 04/24/25 AT 22.32, COPY RECS WITH "D" AT OFFSET 000 IN "DCBOF"
IN OPENED FOR INPUT, RECFM=A0, LRECL=88640
OUT OPENED FOR OUTPUT, RECFM=A0, LRECL=16288
1,967,070 BYTES READ
493 RECORDS READ
15 RECORDS COPIED
1,128 TRT INSTS USED
650 CLC INSTS USED
IN CLOSED
OUT CLOSED