When you are done with this section, you should be able to:
Define: data matrix, row, column, observation and variable (draw an appropriate diagram and indicate what part of the diagram corresponds to each of these terms).
Give the general differences between the formats of data matrices and data reports.
Distinguish between a datum and an observation.
Describe how a data matrix will change when new observations are added.
Properly add identification values to the data matrix.
List the three commonly used measurement scales and give examples of each.
Understand that there are many important types of behavior and observational abilities needed as you approach the problem of data analysis as a process.
Describe the broad characteristics of SAS as a language, including how it is divided into major building blocks and how these can be combined.
Distinguish between the SAS log and SAS output.
Describe the indicators in the SAS log that provide information on whether your data were successfully entered into the data matrix and whether there were any syntax errors.
Enter SAS statements accurately and with good style so that the overall structure of the analysis is clear and each significant operation is clearly structured.
Have some sensitivity in spotting syntax errors without the need for machine assistance (e.g., be able to spot errors in other people's SAS programs even though you aren't particularly aware of the details of their analysis).
Demonstrate how to construct comments within a SAS job and be able to use them effectively.
Describe the major options available with the OPTIONS statement and be able to place an appropriate OPTIONS statement in a SAS job.
Understand the role of a DATA step in a SAS job.
Show how to run SAS on a personal computer.
Control the creation of a named data matrix.
Identify the limitations on variable names and how effective names can be constructed.
Demonstrate how data values are entered on lines and where these should occur in a SAS job.
Describe what is meant by the term "free-format."
Place a PROC PRINT step appropriately in a SAS job.
Describe the use of each of the PROC PRINT options.
Demonstrate how to control what variables are printed, the order in which they are printed, and how to substitute a variable for the OBS number that is ordinarily printed.
Show the proper construction of several title lines and where these should be placed to work with a SAS procedure.
Identify how title lines are changed so that new titles are printed with each procedure.
Put in a title for one procedure and remove it from the next procedure.
List the INPUT statement for a DATA step that uses a variable to read data consisting of a few letters, a few letters with a blank embedded, many letters with no blanks embedded, and many letters with a blank embedded.
Show how the input data lines need to be constructed if one of the variables consists of letters and has a blank embedded.
Demonstrate how SAS can be directed to skip over leading blanks and start reading a variable at its first letter.
Code missing values in a line of input data. Define the term "default."
Convert a specific data value to a SAS missing data value.
Show how a new variable can be created by using an existing variable and performing some common mathematical transformation, such as division or multiplication.
Describe when parentheses are needed in a mathematical expression and demonstrate their use.
Provide examples of the use of mathematical functions, such as taking the square root or sine of a value, and describe what action SAS will take if the value to be transformed lies outside the defined range for the function (such as the square root of a negative number).
Use the multi-value functions, such as MIN and SUM, and describe the result of including a missing value in one of the variables.
Give an example of performing a conditional calculation of a value using the IF ... THEN ... ; ELSE ... ; statement.
List the six relational operators, including GT and LT, and give their alternative SAS symbols.
Show how to perform a conditional test of a character variable.
Use a compound conditional test, such as one that includes an AND or OR linkage of two conditional tests.
Describe under what conditions an output format might be used.
Properly distinguish between the precision with which data are stored versus how they are printed with an output format.
List the main procedures in which output formats might be appropriate.
Provide examples of the ways in which date values can be entered on data input lines and provide appropriate input formats so that these data can be read as SAS date variables. Demonstrate how to take numeric values for a month, day and year and produce a new variable that contains a SAS date value.
Describe the evidence that you need to know that a variable has been read properly as a date value.
Give an example of what statements are needed to print dates in a PROC PRINT step.
Show how the format in which dates are read from the input lines can be changed in an output procedure (such as 09AUG1985 being input and 9/8/85 being output).
Effectively use a PROC FORMAT step.
Give an example of a PROC FORMAT step that will categorize a range of values into a set of discrete units.
Identify what SAS keyword is used in a PROC FORMAT step to identify the bottom and top ends of a range of values when the actual values are not known.
Provide an appropriate format name when the values on the left side of the equal sign in a PROC FORMAT step are not numbers.
List those characters that require special handling as substitution values (those on the right side of the equal sign) in a PROC FORMAT step.
Show the proper SAS code to sort a data matrix by the values of a specific variable.
Identify several situations where you might want to sort a data matrix.
Distinguish between sorting a data matrix so that the sorted version replaces the original data matrix and creating a new data matrix for the results of the sort.
Describe why two variables would be used in a sort specification and what sort of results would be produced.
Comment on the order in which values are sorted (e.g., do numbers come before or after letters).
Show how to categorize SAS dates as major intervals, such as month or year.
Calculate the interval between two dates in terms of the number of days, weeks, months or years.
Provide an example of a situation in which you might use more than one DATA step in a SAS job.
Describe the general structure of the SAS "dataset library."
Indicate how to create a named data matrix.
Provide examples of the names that SAS uses for data matrices that are produced in DATA steps when you don't give your own names.
Show the way to confirm how many observations there are in a specific SAS data matrix.
Give an example of how you would use a named data matrix in a SAS procedure, such as PROC PRINT.
Demonstrate the use of an existing data matrix in a DATA step and provide an example of when you might want to do this.
Describe a situation in which you would produce a new data matrix that is a subset of an existing data matrix and show how to do this in a DATA step; give examples of input and output data matrices.
Show how to use a sub-setting condition that involves criteria from several variables.
Provide the evidence that would tell you that a sub-setting operation on a data matrix produced a new data matrix with no observations.
Give two alternative ways to create a new data matrix that has fewer variables; illustrate this with an example of the starting and ending data matrices.
Illustrate the three general situations involved in joining two data matrices with examples of starting and ending data matrices.
Distinguish between "concatenation" and "merging."
Identify the role of sorting in joining data matrices, clearly indicating when it is needed and when it need not be done.
Show how to add one data matrix to another so that the observations match according to the values of a common identification variable.