SCRUB
Routine used to update records in a file, replacing sensitive customer data. There is one entry, containing location and length, for each field to be replaced. One field is the "key" field, e.g., the account number, and the file should be sorted on that field. We generate a simple sequence # and put it in that field. If there's an SS# in the record, it is replaced with the same sequence #. Other fields, such as customer name, are typically replaced with a literal such as "name" followed by the sequence #.
While doing this, we create a 'control' file containing the old account #, the new account #, and all of the other new strings that we create and use. This allows other, coordinated files to be updated with the same information.
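The idea can be modelled in a few lines of Python. This is only a sketch of the technique as described above, not a translation of the assembler: the names `Field`, `put`, and `scrub` are hypothetical, records are assumed to be fixed-width character strings, numeric fields are handled in character format only (binary and packed are left out), and the control file is assumed to be a simple old-key to sequence-number mapping.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Field:
    loc: int                      # 1-based start column, as on the control card
    length: int                   # field length in characters
    key: bool = False             # True for the single key field
    label: Optional[str] = None   # e.g. "NAME"; None means a numeric field


def put(record: str, field: Field, value: str) -> str:
    """Overlay value on the field, padded or truncated to the field length."""
    start = field.loc - 1
    value = value[:field.length].ljust(field.length)
    return record[:start] + value + record[start + field.length:]


def scrub(records: List[str], fields: List[Field],
          control: Dict[str, int]) -> List[str]:
    """Replace every listed field; record old key -> sequence # in control."""
    key = next(f for f in fields if f.key)
    out = []
    for rec in records:
        old = rec[key.loc - 1:key.loc - 1 + key.length]
        # Assumption: a key already in the control mapping keeps its number,
        # which is what lets a second, related file line up with the first.
        seq = control.setdefault(old, len(control) + 1)
        for f in fields:
            if f.label is None:
                text = str(seq).zfill(f.length)   # numeric: the sequence #
            else:
                text = f"{f.label}{seq:05d}"      # string: label + sequence #
            rec = put(rec, f, text)
        out.append(rec)
    return out


# Made-up sample data: account # (cols 1-6), SS# (7-15), name (16-27).
control: Dict[str, int] = {}
master = ["123456987654321JOHN SMITH  ", "222222111223333JANE DOE    "]
layout = [Field(1, 6, key=True), Field(7, 9), Field(16, 12, label="NAME")]
scrubbed = scrub(master, layout, control)

# A related file with the account # in cols 3-8 reuses the same mapping.
related = scrub(["XX222222YY"], [Field(3, 6, key=True)], control)
```

After the first call, `control` holds the old-to-new account mapping, so the second call gives account 222222 the same replacement number in the related file.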
It's not finished, but this is the source at the moment.
DSCRUB DSECT 0
DSLOC DS F
DSLEN DS F
DS#LOC DS A
DSFORMAT DS C B,C,P
DSKEY DS C
DSQUOTE DS C
DS C
DSTRLEN DS H
DSTRINGO DS CL24
DSTRING DS CL24
LDSCRUB EQU *-DSCRUB
* ------------- THIS IS WHAT A CONTROL CARD LOOKS LIKE -------------
* SCRUB=(##,##,KEY,BIN/KEY,CHAR/KEY,PACK,##,##,'NAME',+
* ##,##,'STREET',##,##,PACKED,##,##,DECIMAL,##,##,BINARY)
* -------------------------------------------------------------
* THIS ROUTINE SCRUBS CUSTOMER INFORMATION FROM A FILE.
* IT ALSO CREATES A 2ND (CONTROL) FILE THAT CAN BE USED TO SCRUB
* CUSTOMER INFORMATION FROM ANOTHER RELATED FILE, AND HAVE
* ACCT #S, NAMES, ETC. MATCH BETWEEN THE 2 DIFFERENT FILES.
* IN FACT, WITH APPROPRIATE PLANNING, THERE COULD BE MULTIPLE RELATED
* FILES THAT ARE MADE TO MATCH.
*
* YOU COULD, OF COURSE, WRITE A PROGRAM TO DO THAT. THIS PROGRAM
* ALSO ALLOWS YOU TO SELECT SPECIFIC RECORDS, SO THAT IF YOUR SOURCE
* FILE IS REALLY LARGE, YOU CAN HAVE A REASONABLE SET OF TEST DATA,
* THAT'S COORDINATED.
*
* ------------- AND THE DESCRIPTION OF PARSING THE CC --------------
* THIS IS NOT PARTICULARLY ELEGANT, BUT IT IS THE KISS METHOD.
* STRAIGHTFORWARD "WHAT'S NEXT?"
*
* FIRST, GETMAIN THE TABLE TO SAVE THE PARM LIST.
* MAKE SURE IT'S SCRUB=( AND LOOK FOR (
* ALL PARMS START WITH LOC,LEN (##,##) SCRUBA DOES THAT.
*
* ONLY THE FIRST PARM CAN HAVE "KEY", WHICH IS REQUIRED.
* SAVE THE FLAG AND MAKE SURE IT'S THE FIRST.
*
* SCRUBX -- IS THE COMMON ERROR ROUTINE. R6=WHERE WE ARE.
*
* AFTER ##,## AND KEY IF IT'S THERE, WE CAN HAVE EITHER
* A NUMERIC FIELD DATA TYPE, OR A CHARACTER STRING.
* FOR A CHARACTER STRING, WE JUST SAVE THAT.
*
* A DATA TYPE IMPLIES A NUMERIC FIELD, EG SS# OR USER-ID.
* DATA TYPES ARE B-INARY, C-HARACTER, OR P-ACKED.
* SAVE THE TYPE, AND SKIP TO COMMA, AND THE NEXT PARAM, OR THE END)
*------- END OF CC PARSING DESCRIPTION ----------------------------
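* ------------- WORKED EXAMPLE (ILLUSTRATIVE VALUES ONLY) ----------
* ASSUMING GET# RETURNS THE NUMBER IN R1, A CARD SUCH AS
*    SCRUB=(1,9,KEY,CHAR,21,30,'NAME')
* IS INTENDED TO BUILD TWO DSCRUB ENTRIES:
*    ENTRY 1: DSLOC=0  (LOC-1)   DSLEN=8  (LEN-1)
*             DSKEY=C'K'         DSFORMAT=C'C'
*    ENTRY 2: DSLOC=20 (LOC-1)   DSLEN=29 (LEN-1)
*             DSQUOTE=C''''      DSTRLEN=3 (LENGTH-1 OF NAME)
*             DSTRING=C'NAME'
* -------------------------------------------------------------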
*
DC 8F'0'
SCRUBSET STM R14,R5,SCRUBSET-32 SAVE REGS
LM R4,R5,LSCRUB LOAD LENGTH+ADDR OF TABLE
LTR R5,R5 Q. FIRST TIME?
BNZ SCRUBA NO, GO LOAD PARAMS
* ----- PUT THIS SOMEWHERE, MAYBE WITH DD= PROCESSING -----------
* ----- THAT MAKES SENSE, WITH THESE CONTROL CARDS IN THAT
* ----- AREA OF THE //SYSIN FILE.
* ------------- FIND = IN SCRUB=(
USING DSCRUB,R5
TRT 0(8,R6),FINDEQ
BZ SCRUBX
CLI 1(R1),C'(' Q. SCRUB=(##
BNE SCRUBX NO, ERR
SCRUBA XC 0(LDSCRUB,R5),0(R5)
LA R6,2(R1)
CLI 0(R6),C'0' Q. NUMERIC?
BL SCRUBX NO, ERR
BAL R14,GET# GO GET LOC
SH R1,=H'1' SUBT 1 TO GET OFFSET NOT LOC
BM SCRUBX Q. LOC=0 = ERROR
ST R1,DSLOC
BAL R14,QCOMMA
BAL R14,GET# GET LENGTH
SH R1,=H'1' CALC LENGTH-1 FOR EXECUTE INST
BM SCRUBX Q. LENGTH=0 SPEC? ERROR
ST R1,DSLEN
BAL R14,QCOMMA BUMP TO KEY / 'STR' / DATA TYP
*
CLC =C'KEY,',0(R6) Q. KEY ENTRY?
BNE SCRUBAA NO.
C R5,ASCRUB Q. FIRST TABLE ENTRY?
BNE SCRUBX NO, ERROR
MVI DSKEY,C'K' YES, SAVE FLAG
LA R6,4(R6) BUMP PAST KEY,
*
* NEXT WE CHECK FOR 'STR' OR ## DATA TYPE
*
SCRUBAA CLI 0(R6),C'B' Q. # TYPE OR 'STR'
BL SCRUAC 'STR', GO DO THAT
BE SCRUAB # BIN
CLI 0(R6),C'C' Q. #,CHAR
BE SCRUAB
CLI 0(R6),C'P' Q. #,PACKED
** BE SCRUAB
BNE SCRUBX
SCRUAB MVC DSFORMAT,0(R6) SAVE ## DATA TYPE
TRT 0(8,R6),FINDCOMA AND FIND COMMA
BZ SCRUBX (NO COMMA? WEIRD)
LR R6,R1 POINT R6 AT THE COMMA TRT FOUND
B SCRUAD
*
* STRAIGHTFORWARD 'STR'. CHARACTER IS IMPLIED.
* AND THE ' CAN BE ANYTHING LOWER THAN C'A'
*
SCRUAC MVC DSQUOTE,0(R6)
LA R1,1(R6)
LA R1,1(R1) FIND CLOSE QUOTE
CLC 0(1,R1),0(R6)
BH SCRUAC+10 HIGH = STRING CHAR, KEEP SCANNING
BNE SCRUBX Q. LOW CHAR, NOT QUOTE, YES, ERROR
LA R2,1(R1) SAVE ADDR FOLLOWING QUOTE
SR R1,R6 'ABC' CALC LENGTH OF STRING
SH R1,=H'2' CALC LENGTH-1 OF DATA
STH R1,DSTRLEN SAVE THAT
MVC DSTRING(0),1(R6) AND SAVE THE STRING
EX R1,*-6
LR R6,R2 POINT TO COMMA AFTER '
*
SCRUAD ST R5,ESCRUB SAVE LOC OF END
CLI 0(R6),C')' Q. END OF LIST
BE SCRUBAF YES, EXIT
CLI 0(R6),C',' Q. ANOTHER PARM
BNE SCRUBX NO, ERROR
LA R6,1(R6)
CLI 0(R6),C'+' Q. CONTINUED ON NEXT REC?
BNE *+8 NO
BAL R14,GETCARD YES, GO GET REC
CLI 0(R6),C'0' Q. NUMERIC?
BL SCRUBX NO, ERROR
LA R5,LDSCRUB(R5) YES, POINT TO NEXT TBL ENTRY
B SCRUBA+4 AND LOOP.
* --------------------------- END OF CONTROL CARD -----------
SCRUBAF L R5,ASCRUB FIRST ENTRY
LA R1,SCR#CHAR
CLI DSFORMAT,C'C' Q. CHAR? YES, GO CALC BEG AND SAVE
*
BE *+20
BH *+12 Q. HIGH = PACKED, GO GET THAT ADDR
SH R1,=H'16' Q. LOW, MUST BE B-INARY
B *+8 SAVE THAT
LA R1,16(R1) POINT TO PACKED STRING
*
LR R2,R1
*--- L R0,DSLEN
LR R0,R1
S R1,DSLEN
ST R1,DS#LOC
*
SCRUBD LM R14,R5,SCRUBSET-32
BR R14
DROP R5
*
SCR#BIN DC 2D'0'
SCR#CHAR DC CL16'0'
SCR#PACK DC PL16'0'
*
SCRUBADD AP SCR#PACK,P1
CVB R0,SCR#PACK+8
ST R0,SCR#BIN+12
UNPK 12(16,R13),SCR#PACK+8(8) UNPACK TO 16 ZONED DIGITS
OI 27(R13),C'0'
MVC SCR#CHAR,12(R13)
BR R14
*
SCREDIT STM R14,R5,SCRUBSET-32
LM R4,R5,ASCRUB ADDR AND END
BAL R14,SCRUBADD
USING DSCRUB,R4
SCREDL LH R14,DSTRLEN LENGTH-1 OF STRING (HALFWORD)
MVC 12(0,R13),DSTRING
EX R14,*-6
LA R14,13(R13,R14)
MVC 0(5,R14),SCR#CHAR
LA R1,6(R14)
MVC 5(43,R14),PARM+88
LM R14,R15,DSLOC
LA R14,0(R14,R6)
MVC 0(0,R14),12(R13)
EX R15,*-6
LA R4,LDSCRUB(R4)
CR R4,R5
BL SCREDL