PROC IMPORT


Reading an Excel file into SAS

Suppose that you have an Excel spreadsheet called auto.xls. The data for this spreadsheet are shown below.

MAKE           MPG  WEIGHT PRICE
AMC Concord 22 2930 4099
AMC Pacer 17 3350 4749
AMC Spirit 22 2640 3799
Buick Century 20 3250 4816
Buick Electra 15 4080 7827

Using the Import Wizard is an easy way to import data into SAS.  The Import Wizard can be found on the drop down file menu.  Although the Import Wizard is easy it can be time consuming if used repeatedly.  The very last screen of the Import Wizard gives you the option to save the statements SAS uses to import the data so that they can be used again.  The following is an example that uses common options and also shows that the file was imported correctly.

PROC IMPORT OUT= WORK.auto1 
DATAFILE= "C:\auto.xls"
DBMS=EXCEL REPLACE;
SHEET="auto1";
GETNAMES=YES;
MIXED=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
proc print data=auto1;
run;


Obs MAKE MPG WEIGHT PRICE

1 AMC Concord 22 2930 4099
2 AMC Pacer 17 3350 4749
3 Amc Spirit 22 2640 3799
4 Buick Century 20 3250 4816
5 Buick Electra 15 4080 7827
  • First we use the out= statement to tell SAS where to store the data once they are imported. 
  • Next the datafile= statement tells SAS where to find the file we want to import. 
  • The dbms= statement is used to identify the type of file being imported.  This statement is redundant if the file you want to import already has an appropriate file extension, for example *.xls. 
  • The replace statement will overwrite an existing file.
  • To specify which sheet SAS should import use the sheet="sheetname" statement.  The default is for SAS to read the first sheet.  Note that sheet names can only be 31 characters long.
  • The getnames=yes is the default setting and SAS will automatically use the first row of data as variable names.  If the first row of your sheet does not contain variable names use the getnames=no
  • SAS uses the first eight rows of data to determine whether the variable should be read as character or numeric.  The default setting mixed=no assumes that each variable is either all character or all numeric.  If  you have a variable with both character and numeric values or a variable with missing values use mixed=yes statement to be sure SAS will read it correctly. 
  • Conveniently SAS reads date, time and datetime formats.  The usedate=yes is the default statement and SAS will read date or time formatted data as a date.  When usedate=no SAS will read date and time formatted data with a datetime format.  Keep the default statement scantime=yes to read in time formatted data as long as the variable does not also contain a date format.

 Making a permanent data file

What if you want the SAS data set created from proc import to be permanent? The answer is to use libname statement. Let's say that we have an Excel file called auto.xls in directory "d:\temp" and we want to convert it into a SAS data file (call it myauto) and put it into the directory "c:\dissertation". Here is what we can do.

libname dis "c:\dissertation";
proc import datafile="d:\temp\auto.xls" out=dis.myauto replace;
run;

 Reading in a specific sheet

Sometimes you may only want to read a particular sheet from an Excel file instead of the entire Excel file. Let's say that we have a two-sheet Excel file called auto2.xls. The example below shows how to use the option sheet=sheetname to read the second sheet called page2 in it.

proc import datafile="auto2.xls" out=auto1 replace;
sheet="page2";
run;

 Reading a file without variable names

What if the variables in your Excel file do not have variable names? The answer here is to use the statement getnames=no in proc import. Here is an example showing how to do this.

proc import datafile="a:\faq\auto.xls" out=auto replace;
getnames=no;
run;

Comments