SAS Merge Statement - Combining Datasets

2. MERGING :

Merging is nothing but combining two or more SAS datasets horizontally

In one to one merging dataset are not required to be sorted

But for match merging sorted data sets are mandatory.

We can use MERGE statement for this.

Two types of merging can be done

a. One to One Merge

b. Match Merge

While merging data sets SAS writes one new observation for each observation in the data set with the largest number of observation in the by group.

In match merging if the relation is one to many we get different results than what we would get with sql join. This has been illustrated with following example.

ILLUSTRATION : SAS Merge Example

SAS Code for Merge with Merge Statement and PROC SQL

data data_set_1;

INPUT ID X $;

CARDS;

1 X1

1 X2

2 X3

3 X4

3 X5

;

RUN;

data data_set_2;

INPUT ID Y $;

CARDS;

1 Y1

2 Y2

2 Y3

3 Y4

3 Y5

;

RUN;

/* Merging Using DATA Step MERGE Statement */

Data DATA_MERGE_OUTPUT;

merge data_set_1 data_set_2;

by ID;

RUN;

/* Merging Using PROC SQL JOIN*/

PROC SQL;

Create TABLE PROC_SQL_OUTPUT

as select data_set_1.ID, data_set_1.X, data_set_2.Y

from data_set_1, data_set_2

where data_set_1.ID = data_set_2.ID;

QUIT;

SAS MERGE code requires that input datasets must be already sorted with variables listed in BY statement. However PROC SQL dont need this; and thus being advantageous.

But as we can see below there is significant difference between the outputs produced by these methods. So one must take precautions while using MERGE statement with many to many relationship.

Output