PROC SORT - NODUPKEY DUPOUT TAGSORT

PROC SORT is one of the most frequently used procedures for data processing. Like other SAS procedures, it offers a wide range of useful options.
Below is the basic syntax with the important options for PROC SORT.
 

SAS Code:

PROC SORT data = input_dataset out = output_dataset
NODUPKEY / NODUPRECS <Other options like TAGSORT FORCE DUPOUT= >;
BY var1 DESCENDING var2...;   /* ascending is the default order; only DESCENDING is a keyword */
RUN;
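As a concrete illustration of the syntax above, here is a minimal sketch. It assumes the SASHELP.CLASS sample data set that ships with SAS is available. The output name work.class_sorted is chosen purely for illustration.

```sas
/* Sort a copy of SASHELP.CLASS; the original data set is left untouched. */
proc sort data=sashelp.class out=work.class_sorted;
    by sex descending age;   /* sex ascending (default), age descending within each sex */
run;

/* Inspect the sorted copy. */
proc print data=work.class_sorted noobs;
run;
```

Because OUT= is specified, SASHELP.CLASS itself is not rewritten. Only the copy in the WORK library is sorted.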
 
 
 
Explanation of options:
  • OUT= dataset_name : Often the user does not want to rearrange the original data set. In that case, the OUT= option redirects the sorted output to another data set and keeps the original data intact.
  • NODUPKEY : If the original data set contains rows with duplicate values in the key columns (those specified in the BY statement) and we wish to keep only unique keys, NODUPKEY keeps one observation for each combination of BY values and drops the rest.
  • NODUPRECS (or NODUP) : Similar to NODUPKEY, but it compares the complete observation rather than only the BY variables. Each observation is compared only with the previous observation written to the output, so identical records that do not end up adjacent after sorting may not be removed.
  • DUPOUT= dataset_name : Collects the duplicate records deleted by NODUPKEY or NODUPRECS into the specified data set.
  • TAGSORT : A resource optimization that helps when temporary disk space is short. For each observation, SAS stores only the key columns specified in the BY statement and the observation number (the "tag") in the temporary files. SAS sorts these tags and, once that is done, retrieves the records from the original data set in the sorted order. This saves the most space when the keys are small relative to the full record, but processing time can be considerably longer.
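The options above can be sketched together as follows. This is only an illustrative example: the data set work.orders, its variables cust_id and amount, and the output data set names are all hypothetical and created here just for the demonstration.

```sas
/* Hypothetical sample data: cust_id 101 appears three times. */
data work.orders;
    input cust_id amount;
    datalines;
101 50
102 20
101 50
103 75
101 90
;
run;

/* NODUPKEY keeps one row per cust_id; the removed rows go to work.orders_dups. */
proc sort data=work.orders out=work.orders_unique
          nodupkey dupout=work.orders_dups;
    by cust_id;
run;

/* TAGSORT gives the same sort order, but only the BY variable and the
   observation numbers are held in the temporary sort files. */
proc sort data=work.orders out=work.orders_tagsorted tagsort;
    by cust_id;
run;
```

With this input, work.orders_unique should contain three observations (one per cust_id) and work.orders_dups the two removed ones. On a data set this small TAGSORT brings no benefit. It is shown only to illustrate where the option goes.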

 