FIRST. and LAST. automatic variables:
When SAS dataset has got sorted by some variables then at the time datastep execution SAS creates two automatic variables called FIRST. and LAST.
For example lets consider the example of ready made dataset CARS available in SASHELP library. From PROC CONTENTS of this dataset we can get the information that this dataset is already sorted by MAKE and TYPE.
So if you execute a data step on this data with BY statement while execution you will get FIRST.MAKE and LAST.MAKE as well as FIRST.TYPE and LAST.TYPE variables automatically getting created.
These variables will have values either 0 or 1; for first record within same set of MAKE the FIRST.MAKE will have value 1 otherwise 0; likewise for last record within same set of MAKE the LAST.MAKE will have value 1 otherwise 0.
We can logically use these variables for achieving required results.
RETAIN statement:
At the time of datastep execution we know that after each iteration all the variable/columns present in PDV get assigned to MISSING values. So every every variable gets new value from the new record. There is no way to refer the old value for some variable from previous record and make some decision based on that. Here RETAIN statement comes to an rescue.
With RETAIN statement you can persist the value of any variable for next iteration till the end of datastep. You can use any existing variable or create a new one to store the value of existing variable. You can also initialize the value to that variable in the RETAIN statement. Just remember that if this is going to be a new variable it will become permanent member of the dataset until you explicitly DROP it.
lets illustrate both of these concepts by following example.
There are various makes like TOYOTA, HONDA etc. present in SASHELP.CARS dataset and each of these MAKES have various types like SEDAN, SUV etc.; Now lets consider our requirement is to calculate the total of INVOICE amount for each type under each make and total of INVOICE amount for each MAKE as well.
Lets see how can we code this.
SAS Code:
DATA CARS_SUMMARY (KEEP=MAKE TYPE TOT_TYPE_INVOICE TOT_MAKE_INVOICE);
SET SASHELP.CARS;
RETAIN TOT_TYPE_INVOICE TOT_MAKE_INVOICE 0;
BY MAKE TYPE;
IF FIRST.MAKE THEN TOT_MAKE_INVOICE = 0;
IF FIRST.TYPE THEN TOT_TYPE_INVOICE = 0;
TOT_MAKE_INVOICE = TOT_MAKE_INVOICE + INVOICE;
TOT_TYPE_INVOICE = TOT_TYPE_INVOICE + INVOICE;
IF LAST.MAKE THEN OUTPUT;
RUN;
Output: Sample Output (of TOTAL 114 records)