PROC IMPORT

We already have seen how different types of external flat files can be read using INFILE and INPUT statements using data step.

There is yet another way of reading such files is using PROC IMPORT. although it is generally getting used for excel based files, it can be also used to read flat files. Let us observe how to get this done.

Syntax:

PROC IMPORT DATAFILE=“filename” | TABLE=“tablename”

OUT=output_sas_dataset <DBMS=dbtype> <REPLACE>;

<GETNAMES=YES|NO> <SHEET=excel_sheet> <RANGE=excel_range>;

RUN;

DATAFILE = This is fully qualified file name along with path. Otherwise you can use predefined file name.

TABLE = PROC IMPORT can also be used to read data from some external databases like Microsoft Access, PC File servers etc. In that case you provide the source table name here. Please not that to read from external database; you would ACCESS to that specific database installed on your site.

OUT = output dataset name.

DBMS = This is optional parameter; in case if we are dealing with files for which it is not specific that what type of data the file could have then you should provide this option. There are several options which you can use for specific databases/files as follows;

  • ACCESS - Used for Microsoft Access database

  • EXCEL/EXCEL97/XLS - used for excel files for different versions. For some UNIX installations only XLS option would work depending on specific version of SAS you are dealing with.

  • DLM - for delimited files; default delimiter is space.

  • CSV - for comma separated files.

  • TAB - for TAB separated files.

REPLACE - This is important option; if the same named dataset is present in specified library, PROC IMPORT does not replaces it with new data unless you specify this option.

GETNAMES = to read the column names from the files itself; by default it is yes. so you need to be careful to find whether or not the file contains header with column names.

SHEET = You can specify the specific sheet name from a multisheet excel document; that has to be read.

RANGE = You can specify the specific range from a specified sheet name from a multisheet excel document; that has to be read.

Examples: Reading delimited external file using PROC IMPORT

Consider that following type of data has to read from external file present on some location.

Data From External File

account_number,open_dt,acc_type,min_balance

111573691345,10JUL2010,C,0.00

111456234567,23JUN2011,O,1000.00

101345456734,21SEP1999,O,1000.00

101456234561,30OCT2011,C,0.00

111234345566,18APR2010,,0.00

Following code can used to read such type of data.

SAS Code:

proc import datafile='../proc_import_example.sas'

out=temp dbms=csv replace;

getnames=yes;

run;

following is the expected output; you can notice that PROC IMPORT has wisely assumed a missing field upon encountering two continuous commas. Remember you would have required to use DSD option if you have chosen DATA STEP to read this type of file.

Output:

Examples: Reading specific range of data from EXCEL file using PROC IMPORT

Consider following range of data from cell B4 to E9 from excel sheet called 'data_sheet' has to be read into SAS.

Excel Source Data

Following PROC IMPORT step can be used in this case.

SAS Code: PROC IMPORT

proc import datafile='/sasdata/prototype/slm/abt/source_data.xls'

out=exceldata dbms=xls replace;

getnames=yes;

/*sheet="data_sheet";*/

range='data_sheet$B4:E9'n;

run;

And following is the output as expected.

PROC IMPORT: Output