PROC SQL
PROC SQL is a SAS implementation of structured query language.
With PROC SQL one can treat SAS datasets as DBMS tables and can do any sort of SQL operations on it.
Below are some important facts about PROC SQL.
• We can use column number (position number of column in select statement) in order by clause
Select empId, jobcode, sal from employee order by 2;
Here instead of using word jobcode we have used 2 as position number of the column in order by clause
• PROC SQL FEEDBACK;
Select * from sasuser.tablename;
QUIT;
Here FEEDBACK is the debugging option which rewrites the query into the log after expanding ‘*’ into list of column names
• INOBS and OUTOBS – These options control the number of rows to be read in or write out
• DISTINCT – is the sql query keyword which removes duplicates from the table (duplicates are identified on the basis of column names provided to DISTINCT clause)
• Use of CALCULATED keyword
Select *, mark1+mark2 as TOT from table_name where CALCULATED TOT > 40;
In this case if do not use CALCULATED key word SAS returns ERROR as TOT is not a column name present in the table; how ever this arrangement reduces the performance.
• LABEL and FORAMTS in PROC SQL;
PROC SQL;
TITLE “some title string”;
Select sal label = ‘salary’, bonus format = ‘dollar12.2’ from table_name;
Quit;
• NON Co-Related Sub query – It is sub-query which do not references (co-relates) any of the column, table which is used in the outer query. In this case inner query is independent of outer query. e.g.
PROC SQL;
Select * from employee where job_code in (select id from job_desc where desc in (‘CA’,’ENGG’))
quit;
• Co-Related Sub query – When inner query references one the column used in the outer query then its called as co-related sub query. In this case inner query is depend on execution of outer query
e.g. PROC SQL;
Select FName, LName, Bdate from employee where ‘ADMIN’ = (select
dept_name from dept_table where employee.empId = dept_table.empId)
quit;
• NOEXEC & VALIDATE : These options are used to validate or check syntax errors in the sql query
PROC SQL NOEXEC; OR
PROC SQL;
VALIDATE select * from table_name ;
VALIDATE some_sql_query;
QUIT;
• A single inner join can combine 32 tables (i.e. in single query) but only 2 tables can used in outer join. Syntax of using joins is as follows,
PROC SQL;
Select table1.columns, table2.columns from
table1 INNER JOIN / LEFT JOIN / RIGHT JOIN / FULL JOIN table2
on {table1.ID_COLUMNS = table2.ID_COLUMNS}
QUIT;
• SET OPERATORS – Various SET operator are INTERSECT, UNOIN, OUTERUNOIN, EXPECT.
PROC SQL;
Select * from table1
INTERSECT / UNION / OUTERUNION / EXPECT ALL/CORRESPONDING
Select * from table2;
quit;
• INTERSECT is having highest precedence and all others except it are at the
same level of precedence
• ALL Keyword selects all rows, the reason to specify this keyword is that by Default SET operators select only UNIQUE rows rejecting duplicates
• CORRESPODING – selects common column to both tables
• Creating tables using PROC SQL;
1. Create table AS select * from some_other_table…….;
2. create table LIKE another_table;
3. create table (column1 data_type, column2 data_type_ _ _ _ _);
• Inserting records into tables using PROC SQL
1. INSERT INTO tableName
Set column1 = value1,
column2 = value2;
Set column1 = value3,
column2 = value4;
2. INSERT INTO tableName values (‘column1_value1’, ‘column2_value1’), (‘column1_value2’, ‘column2_value2’);
3. INSERT INTO tableName SELECT column1, column2 from some_other_table;
• INTEGRITY CONSTRAINTS
Create table table_name ( ID char(5) PRIMARY KEY, NAME char(20) NOT NULL, GENDER char(1) CHECK (IN (‘M’, ‘F’));
Create Index index_name on lib.data_set_name (var_list_index_needs_to_be_created) ;
• Integrity constraints can also be employed as
Create table table_name ( ID char(5), name char(20), gender char(1),
Constraint ID_PK PRIMARY KEY (ID),
Constraint NAME_NOT_NULL NOT NULL (NAME),
Constraint CHECK_GENDER CHECK (GENDER IN (‘M’,’F’)) );
The General syntax is
Constraint constraint_given_name constraint_type (column_name/s)
• To view the constraints, we can use following query
Describe table constraints table_name/s;
• CASE-WHEN Statement
• CASE case_operand/expression_on_case_operand
WHEN val1 THEN result1
WHEN val2 THEN result2
ELSE result3
END;
Case when statement can be used in select, update, insert statements
• NUMBER/NONUMBER – this option is similar to OBS/NOOBS option in PROC PRINT, it instructs SAS about whether or not to print observation number in output.
• PROC SQL STIMER; STIMER option shows time taken for execution of each query used in PROC SQL, this information can used for comparison or benchmarking to achieve optimization
• RESET option_name; this statement resets the value of specified option to default value.
• GETOPTION( ) – it is very useful function to get the system option while calling a macro and then we can change the settings
• ERRORSTOP/NOERRORSTOP – Tells SAS whether to stop executing batch or remaining SQL queries if any error is encountered