PROC SQL

PROC SQL is a SAS implementation of structured query language.

With PROC SQL one can treat SAS datasets as DBMS tables and can do any sort of SQL operations on it.

Below are some important facts about PROC SQL.

• We can use column number (position number of column in select statement) in order by clause

Select empId, jobcode, sal from employee order by 2;

Here instead of using word jobcode we have used 2 as position number of the column in order by clause

• PROC SQL FEEDBACK;

Select * from sasuser.tablename;

QUIT;

Here FEEDBACK is the debugging option which rewrites the query into the log after expanding ‘*’ into list of column names

• INOBS and OUTOBS – These options control the number of rows to be read in or write out

• DISTINCT – is the sql query keyword which removes duplicates from the table (duplicates are identified on the basis of column names provided to DISTINCT clause)

• Use of CALCULATED keyword

Select *, mark1+mark2 as TOT from table_name where CALCULATED TOT > 40;

In this case if do not use CALCULATED key word SAS returns ERROR as TOT is not a column name present in the table; how ever this arrangement reduces the performance.

• LABEL and FORAMTS in PROC SQL;

PROC SQL;

TITLE “some title string”;

Select sal label = ‘salary’, bonus format = ‘dollar12.2’ from table_name;

Quit;

• NON Co-Related Sub query – It is sub-query which do not references (co-relates) any of the column, table which is used in the outer query. In this case inner query is independent of outer query. e.g.

PROC SQL;

Select * from employee where job_code in (select id from job_desc where desc in (‘CA’,’ENGG’))

quit;

• Co-Related Sub query – When inner query references one the column used in the outer query then its called as co-related sub query. In this case inner query is depend on execution of outer query

e.g. PROC SQL;

Select FName, LName, Bdate from employee where ‘ADMIN’ = (select

dept_name from dept_table where employee.empId = dept_table.empId)

quit;

• NOEXEC & VALIDATE : These options are used to validate or check syntax errors in the sql query

PROC SQL NOEXEC; OR

PROC SQL;

VALIDATE select * from table_name ;

VALIDATE some_sql_query;

QUIT;

• A single inner join can combine 32 tables (i.e. in single query) but only 2 tables can used in outer join. Syntax of using joins is as follows,

PROC SQL;

Select table1.columns, table2.columns from

table1 INNER JOIN / LEFT JOIN / RIGHT JOIN / FULL JOIN table2

on {table1.ID_COLUMNS = table2.ID_COLUMNS}

QUIT;

• SET OPERATORS – Various SET operator are INTERSECT, UNOIN, OUTERUNOIN, EXPECT.

PROC SQL;

Select * from table1

INTERSECT / UNION / OUTERUNION / EXPECT ALL/CORRESPONDING

Select * from table2;

quit;

• INTERSECT is having highest precedence and all others except it are at the

same level of precedence

• ALL Keyword selects all rows, the reason to specify this keyword is that by Default SET operators select only UNIQUE rows rejecting duplicates

• CORRESPODING – selects common column to both tables

• Creating tables using PROC SQL;

1. Create table AS select * from some_other_table…….;

2. create table LIKE another_table;

3. create table (column1 data_type, column2 data_type_ _ _ _ _);

• Inserting records into tables using PROC SQL

1. INSERT INTO tableName

Set column1 = value1,

column2 = value2;

Set column1 = value3,

column2 = value4;

2. INSERT INTO tableName values (‘column1_value1’, ‘column2_value1’), (‘column1_value2’, ‘column2_value2’);

3. INSERT INTO tableName SELECT column1, column2 from some_other_table;

• INTEGRITY CONSTRAINTS

Create table table_name ( ID char(5) PRIMARY KEY, NAME char(20) NOT NULL, GENDER char(1) CHECK (IN (‘M’, ‘F’));

Create Index index_name on lib.data_set_name (var_list_index_needs_to_be_created) ;

• Integrity constraints can also be employed as

Create table table_name ( ID char(5), name char(20), gender char(1),

Constraint ID_PK PRIMARY KEY (ID),

Constraint NAME_NOT_NULL NOT NULL (NAME),

Constraint CHECK_GENDER CHECK (GENDER IN (‘M’,’F’)) );

The General syntax is

Constraint constraint_given_name constraint_type (column_name/s)

• To view the constraints, we can use following query

Describe table constraints table_name/s;

• CASE-WHEN Statement

• CASE case_operand/expression_on_case_operand

WHEN val1 THEN result1

WHEN val2 THEN result2

ELSE result3

END;

Case when statement can be used in select, update, insert statements

• NUMBER/NONUMBER – this option is similar to OBS/NOOBS option in PROC PRINT, it instructs SAS about whether or not to print observation number in output.

• PROC SQL STIMER; STIMER option shows time taken for execution of each query used in PROC SQL, this information can used for comparison or benchmarking to achieve optimization

• RESET option_name; this statement resets the value of specified option to default value.

• GETOPTION( ) – it is very useful function to get the system option while calling a macro and then we can change the settings

• ERRORSTOP/NOERRORSTOP – Tells SAS whether to stop executing batch or remaining SQL queries if any error is encountered