SAS IN= option

Dataset Option IN=

If one is dealing with multiple datasets in SET statement then IN= dataset option gives power to the programmer to identify the constituting dataset of the corresponding record.

SAS creates temporary variable with the name specified with IN= option and sets its value to 1 if the record under processing corresponds to that particular dataset else its value will be set to 0.

Hence IN= dataset option is generally used in conjunction with if/else conditioning.

Lets get it clarified with the example below.

Scenario: Consider two datasets QTR1 and QTR2.

QTR1 got the deposits from the customers for the month of MAY, JUN and JUL.

QTR2 got the deposits from the customers for the month of AUG, SEP and OCT.

Now we need to create the another dataset TOT_BAL which will contain only total_balance for corresponding accounts.

Lets see how we can achieve this using IN= dataset option.

SAS code

/* prepare input dataset QTR1 and QTR 2 */

data QTR1;

input acc_no $ APR_BAL MAY_BAL JUN_BAL;

cards;

Q101101 500 300 400

Q101201 340 230 400

;

data QTR2;

input acc_no $ AUG_BAL SEP_BAL OCT_BAL;

cards;

Q201101 800 300 400

Q201201 840 230 400

;

/* Please note that because column names are different

in both datasets those needs to be treated separately */

data TOT_BAL (keep=acc_no total_balance);

set QTR1(IN=inq1) QTR2(IN=inq2);

total_balance = 0;

if inq1 then total_balance = sum(total_balance, APR_BAL,MAY_BAL,JUN_BAL);

if inq2 then total_balance = sum(total_balance, AUG_BAL,SEP_BAL,OCT_BAL);

/* Examine the contents of PDV */

put _ALL_;

run;

proc print data =TOT_BAL;

run;

It is clear from the LOG below that IN dataset option returns 1 if the current observation is contributed by that specific dataset else it returns 0.

LOG:

81

82 data TOT_BAL (keep=acc_no total_balance);

83 set QTR1(IN=inq1) QTR2(IN=inq2);

84 total_balance = 0;

85 if inq1 then total_balance = sum(total_balance, APR_BAL,MAY_BAL,JUN_BAL);

86 if inq2 then total_balance = sum(total_balance, AUG_BAL,SEP_BAL,OCT_BAL);

87

88 /* Examine the contents of PDV */

89 put _ALL_;

90 run;

inq1=1 inq2=0 acc_no=Q101101 APR_BAL=500 MAY_BAL=300 JUN_BAL=400 AUG_BAL=. SEP_BAL=. OCT_BAL=.

total_balance=1200 _ERROR_=0 _N_=1

inq1=1 inq2=0 acc_no=Q101201 APR_BAL=340 MAY_BAL=230 JUN_BAL=400 AUG_BAL=. SEP_BAL=. OCT_BAL=.

total_balance=970 _ERROR_=0 _N_=2

inq1=0 inq2=1 acc_no=Q201101 APR_BAL=. MAY_BAL=. JUN_BAL=. AUG_BAL=800 SEP_BAL=300 OCT_BAL=400

total_balance=1500 _ERROR_=0 _N_=3

inq1=0 inq2=1 acc_no=Q201201 APR_BAL=. MAY_BAL=. JUN_BAL=. AUG_BAL=840 SEP_BAL=230 OCT_BAL=400

total_balance=1470 _ERROR_=0 _N_=4

NOTE: There were 2 observations read from the data set WORK.QTR1.

NOTE: There were 2 observations read from the data set WORK.QTR2.

NOTE: The data set WORK.TOT_BAL has 4 observations and 2 variables.

NOTE: DATA statement used (Total process time):

real time 0.10 seconds

cpu time 0.04 seconds

91

92 proc print data =TOT_BAL;

93 run;

NOTE: There were 4 observations read from the data set WORK.TOT_BAL.

NOTE: PROCEDURE PRINT used (Total process time):

real time 0.01 seconds

cpu time 0.01 seconds

One can note that Two temporary variables inq1 and inq2 have got created which got value 1 OR 0 depending on the current records corresponds to the QTR1 or QTR2.

Below is the output of the code above; as expected.

Output:

The SAS System 19:39 Saturday, October 31, 2009 3

total_

Obs acc_no balance

1 Q101101 1200

2 Q101201 970

3 Q201101 1500

4 Q201201 1470