logit
Degrees of Freedom (DF)
For a hypothesis test on the model as a whole, the DF equals the number of predictor variables.
For a hypothesis test on an individual predictor variable: if the variable is numeric, DF = 1; if the variable is categorical with k categories, then the DF equals k - 1.
The reason is that we need k - 1 coded variables to represent the categorical variable.
E.g. Quality = {Good, Bad, Normal}
Then we have 2 coded variables as follows:

                 code1  code2
quality=good       1      0
quality=bad        0      1
quality=normal     0      0
The above coding is also called dummy coding; in PROC LOGISTIC it is enabled by setting param=ref:
proc logistic data=test;
  class quality / param=ref;
  model price = quality;
run;
If there is no param=ref, the default coding is effect coding, as follows:
                 code1  code2
quality=good       1      0
quality=bad        0      1
quality=normal    -1     -1
The two coding schemes generally give the same model fit, but effect coding has some benefits when the model includes an interaction of two categorical variables.
Model Fit Statistics
The "-2 Log L" statistic is minus twice the log-likelihood (L) of the model. The maximum likelihood estimates behind it are computed by iterative algorithms such as Fisher scoring or Newton-Raphson.
AIC (Akaike Information Criterion) and SC (Schwarz Criterion) are both variants of "-2 Log L" that add a penalty for model complexity:
AIC = -2 Log L + 2p and SC = -2 Log L + p * log(n), where p is the number of estimated parameters and n is the number of observations. The penalty discourages overfitting, which makes these criteria fairer when comparing models of different sizes.
For these model fit statistics, smaller is better. A statistic's value is not meaningful on its own, but comparing the values of two models is. In SAS, the value for the intercept-only model is compared with the value for the model with both intercept and covariates (predictor variables); if the latter is smaller, the predictor variables help fit the data better.
Model Fit Statistics, example:
Criterion      Intercept Only    Intercept and Covariates
AIC                233.289               168.236
SC                 236.587               181.430
-2 Log L           231.289               160.236
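The penalty formulas can be checked against the example. A minimal Python sketch, assuming the fitted model has p = 4 estimated parameters (intercept + 3 covariates) and n = 200 observations (values consistent with the table, not stated in it):

```python
import math

neg2_log_l = 160.236   # -2 Log L for the model with intercept and covariates
p = 4                  # estimated parameters: intercept + 3 covariates (assumed)
n = 200                # number of observations (assumed)

aic = neg2_log_l + 2 * p            # AIC adds 2 per parameter
sc = neg2_log_l + p * math.log(n)   # SC's penalty also grows with sample size

print(round(aic, 3), round(sc, 3))  # 168.236 181.429 (table shows 181.430 after rounding)
```

Note how SC penalizes the extra parameters more heavily than AIC once n is in the hundreds.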
Testing Global Null Hypothesis: BETA=0
Empty model (BETA=0): here SAS tests the model as a whole against an empty model (i.e. BETA=0).
The null hypothesis is BETA=0, which means the model has an intercept only; all other coefficients are zero, so the predictor variables do not participate.
The idea of testing the global null hypothesis is to compare the fit of the full model against the fit of the intercept-only model.
If the Chi-Square statistic is large, with "Pr > ChiSq" (p-value) smaller than 0.05 (the 0.05 alpha level), then we say the full model is significantly better than the null model.
What is the Null Hypothesis?
The hypothesis actually to be tested is usually given the symbol H0, and is commonly referred to as the null hypothesis. As is explained more below, the null hypothesis is assumed to be true unless there is strong evidence to the contrary – similar to how a person is assumed to be innocent until proven guilty.
The other hypothesis, which is assumed to be true when the null hypothesis is false, is referred to as the alternative hypothesis, and is often symbolized by HA or H1. Both the null and alternative hypothesis should be stated before any statistical test of significance is conducted. In other words, you technically are not supposed to do the data analysis first and then decide on the hypotheses afterwards.
In SAS, the null hypothesis normally assumes there is no significant difference between the two models being compared.
The alternative hypothesis assumes there is a significant difference between the two models.
Testing Global Null Hypothesis: BETA=0, Example:
Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio       71.0525     3        <.0001
Score                  58.6092     3        <.0001
Wald                   39.8751     3        <.0001
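The Likelihood Ratio statistic can be reproduced from the Model Fit Statistics table: it is simply the drop in -2 Log L when the covariates are added. A quick Python check:

```python
neg2_log_l_null = 231.289  # -2 Log L, intercept only
neg2_log_l_full = 160.236  # -2 Log L, intercept and covariates

# The drop in -2 Log L is the Likelihood Ratio chi-square statistic.
lr_chi_square = neg2_log_l_null - neg2_log_l_full
print(round(lr_chi_square, 3))  # 71.053, agreeing with 71.0525 up to rounding

# With DF = 3 the 0.05 critical value is 7.815, so 71.053 is deep in the
# rejection region, matching Pr > ChiSq < .0001.
assert lr_chi_square > 7.815
```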
Analysis of Maximum Likelihood Estimates
This analysis tests each of the predictor variables individually. The null hypothesis is that a predictor's regression coefficient is zero, so the predictor is not participating.
The test compares the full model with the model without the selected predictor to evaluate that predictor's importance. It calculates the Wald Chi-Square statistic, and if "Pr > ChiSq" (p-value) is smaller than 0.05 / 0.01 (the chosen alpha level), then we say the full model is significantly better than the model without the selected predictor.
Analysis of Maximum Likelihood Estimates, example:
Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept     1    -12.7772        1.9759            41.8176           <.0001
female        1      1.4825        0.4474            10.9799           0.0009
read          1      0.1035        0.0258            16.1467           <.0001
science       1      0.0948        0.0305             9.6883           0.0019
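Each Wald Chi-Square in this table is just (Estimate / Standard Error) squared, which we can verify for the female row:

```python
estimate = 1.4825   # coefficient for female, from the table above
std_err = 0.4474    # its standard error

# Wald chi-square = (estimate / standard error)^2
wald_chi_square = (estimate / std_err) ** 2
print(round(wald_chi_square, 4))  # ~10.9799, matching the table
```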
It also gives each parameter's Estimate (the fitted coefficient) and its Standard Error.
Therefore the model (log odds) is:
Log[p / (1-p)] = b0 + b1 * predictor1 + b2 * predictor2 + b3 * predictor3
               = -12.7772 + 1.4825 * female + 0.1035 * read + 0.0948 * science
We can interpret an Estimate as follows: for a one-unit change in the predictor variable, the log-odds of a positive outcome are expected to change by the respective coefficient, given that the other variables in the model are held constant.
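Plugging a hypothetical student into the fitted equation shows how log-odds translate back into a probability (the covariate values female = 1, read = 50, science = 50 are made up for illustration):

```python
import math

# Coefficients from the Analysis of Maximum Likelihood Estimates table.
b0, b_female, b_read, b_science = -12.7772, 1.4825, 0.1035, 0.0948

# Hypothetical student: female = 1, read = 50, science = 50 (assumed values).
log_odds = b0 + b_female * 1 + b_read * 50 + b_science * 50

# Invert the logit: p = 1 / (1 + e^(-log_odds)).
p = 1 / (1 + math.exp(-log_odds))

print(round(log_odds, 4), round(p, 4))  # -1.3797 0.2011
```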
Odds Ratio Estimates
The above analysis is on the log-odds scale; here we also get the analysis on the odds scale (log-odds = log(odds)).
The Point Estimate for each predictor variable is obtained by exponentiating the Estimate above. E.g. 4.404 = e ^ 1.4825.
Here the interpretation is multiplicative: for a one-unit change in the predictor variable, the odds of a positive outcome are expected to change by a factor of the Point Estimate, given that the other variables in the model are held constant.
Odds Ratio Estimates, Example:
Effect      Point Estimate    95% Wald Confidence Limits
female           4.404             1.832      10.584
read             1.109             1.054       1.167
science          1.099             1.036       1.167
The 95% Wald Confidence Limits give an interval that covers the true odds ratio with 95% confidence. Since the interval for female, [1.832, 10.584], does not contain 1 (an odds ratio of 1 means no effect), the effect of female is significant at the 0.05 level.
Estimate +/- 1.96 * Standard Error gives the 95% confidence interval on the log-odds scale, and exponentiating its endpoints gives the confidence limits for the odds ratio.
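That recipe can be checked in Python for female, using its Estimate and Standard Error from the Analysis of Maximum Likelihood Estimates table:

```python
import math

estimate, std_err = 1.4825, 0.4474  # female row

z = 1.96  # two-sided 95% normal quantile
# Build the interval on the log-odds scale, then exponentiate the endpoints.
lower = math.exp(estimate - z * std_err)
upper = math.exp(estimate + z * std_err)
point = math.exp(estimate)

print(round(point, 3), round(lower, 3), round(upper, 3))
# 4.404 1.832 10.585 -- matching the Odds Ratio Estimates table up to rounding
```

The small gap at the upper limit (10.585 vs 10.584) comes from rounding the normal quantile 1.959964 to 1.96.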