UNIVARIATE - Generate statistics
The procedure reports a wide range of descriptive statistics for a numeric variable, such as the mean, standard deviation, range, skewness, and extreme values.
proc univariate data=sashelp.cars;
var horsepower;
run;
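By default, the output includes the Moments, Basic Statistical Measures, Tests for Location, Quantiles, and Extreme Observations tables.
As a reminder, the mean is the arithmetic average, the median is the middle value of the sorted data, and the mode is the value that appears most often.
E.g. for the values 1, 2, 2, 3, 4, 5, 6: mean = (1+2+2+3+4+5+6)/7 = 23/7 ≈ 3.29, median = 3, mode = 2.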
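To check the small example by hand, the seven values can be fed to SAS directly. This is a minimal sketch: the dataset name work.demo and the variable x are made up for illustration, and PROC MEANS is used only because it accepts the three statistics as keywords.

```sas
/* Hypothetical dataset holding the seven example values */
data work.demo;
  input x @@;   /* @@ reads several values from one data line */
  datalines;
1 2 2 3 4 5 6
;
run;

/* Request only the three measures discussed above */
proc means data=work.demo mean median mode maxdec=2;
  var x;
run;
```

The result should show a mean of about 3.29, a median of 3, and a mode of 2, matching the hand calculation.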
ODS (Output Delivery System) can be used to select which output tables are shown.
The following shows only the basic measures and the extreme observations.
ods select BasicMeasures ExtremeObs;
proc univariate data=sashelp.cars;
var horsepower;
run;
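The table names passed to ods select (BasicMeasures, ExtremeObs) are ODS table names, which are not always obvious from the printed titles. One way to find them, sketched here on the same sashelp.cars example, is to turn on ODS tracing and read the names from the log.

```sas
/* Write the name of every output table to the log */
ods trace on;

proc univariate data=sashelp.cars;
  var horsepower;
run;

/* Stop tracing so later steps do not clutter the log */
ods trace off;
```

Each table produced by the procedure is then listed in the log with a Name: entry, and that name is what ods select expects.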
MEANS
By default, this procedure reports the N, Mean, Std Dev, Minimum, and Maximum of each numeric variable in the dataset. The var statement controls which variables are analyzed.
proc means data=TEST;
var gre gpa rank;
run;
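The default statistics can be replaced by listing statistic keywords on the proc means statement, and a class statement splits the results by group. The sketch below uses sashelp.cars rather than the TEST dataset above, since that sample dataset ships with SAS; the choice of variables is only illustrative.

```sas
/* Pick the statistics explicitly and limit output to 1 decimal place */
proc means data=sashelp.cars n mean median stddev min max maxdec=1;
  class origin;               /* one row of statistics per region of origin */
  var horsepower mpg_city;    /* numeric variables to summarize */
run;
```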
FREQ
The FREQ procedure counts how often each distinct value of a variable occurs in the dataset. The tables statement controls which variables appear in the result.
With tables Var1, each value of Var1 is shown as a row:
proc freq data=TEST;
tables admit;
run;
With tables Var1 * Var2, Var1 forms the rows and Var2 forms the columns:
proc freq data=TEST;
tables admit*rank ;
run;
Each cell of the resulting table contains 4 numbers, stacked in this order:
Frequency
Percent
Row Pct
Col Pct
The 1st is the number of observations in the cell. The 2nd is the cell's percentage of the whole dataset. The 3rd is its percentage of the row total. The 4th is its percentage of the column total.
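When only the counts are of interest, the three percentages can be suppressed with options after a slash on the tables statement. A minimal sketch, again assuming the sashelp.cars sample dataset instead of TEST:

```sas
/* Cross-tabulate origin (rows) by type (columns), showing counts only */
proc freq data=sashelp.cars;
  tables origin*type / norow nocol nopercent;
run;
```

Dropping any one of the three options brings the corresponding number back into each cell.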
RANK
This is a helpful proc for dividing a dataset into a chosen number of groups, such as deciles. The following example divides the dataset into 10 groups, ordered by sorted_column, and writes the group number (0 to 9) to rank_column.
proc rank data = work.dataset groups=10 out=work.dataset_ranked;
var sorted_column;
ranks rank_column;
run;
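A concrete version of the same pattern, with a quick check of the result, might look like this. It is a sketch only: the output dataset work.cars_ranked and the variable hp_group are names invented for illustration, and sashelp.cars stands in for work.dataset.

```sas
/* Assign each car to one of 10 groups by horsepower (0 = lowest) */
proc rank data=sashelp.cars groups=10 out=work.cars_ranked;
  var horsepower;
  ranks hp_group;
run;

/* Check the size and horsepower range of each group */
proc means data=work.cars_ranked n min max;
  class hp_group;
  var horsepower;
run;
```

Tied values receive the same group, so the groups are not always exactly equal in size. Adding the descending option to the proc rank statement reverses the order so that group 0 holds the largest values.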