Task :
Classification
Profiling (contribution of each predictor)
Steps :
Estimate the probability of each class
Apply a cutoff to these probabilities to classify each case into one of the classes
Method: Maximum Likelihood Estimation
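A minimal sketch of maximum likelihood estimation for this model, assuming plain gradient ascent on the log-likelihood (statistical packages typically use Newton-type methods instead). The names fit_logistic and log_likelihood, the learning rate, the iteration count, and the toy data are all hypothetical and chosen only for illustration.

```python
import math

def predict(betas, xi):
    """Estimated P(Y = 1) for one case; betas = [b0, b1, ..., bq]."""
    z = betas[0] + sum(b * v for b, v in zip(betas[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(betas, X, y):
    """Sum of y*log(p) + (1 - y)*log(1 - p) over all cases."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict(betas, xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Gradient ascent on the log-likelihood (step size lr is an assumption)."""
    n, q = len(X), len(X[0])
    betas = [0.0] * (q + 1)
    for _ in range(n_iter):
        grad = [0.0] * (q + 1)
        for xi, yi in zip(X, y):
            err = yi - predict(betas, xi)   # gradient contribution of this case
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        betas = [b + lr * g / n for b, g in zip(betas, grad)]
    return betas

# Toy data (not separable, so the maximum likelihood estimate is finite).
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 1, 0, 1, 1]
betas = fit_logistic(X, y)
```

At the fitted coefficients the estimated probabilities sum to the number of observed events, which is a quick sanity check that the ascent has converged.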
Hypothesis Model:
Odds(Y = 1) = e^(β0 + β1 x1 + β2 x2 + ... + βq xq)
Sigmoid function:
Converts the log-odds into a probability: p = 1 / (1 + e^-(β0 + β1 x1 + ... + βq xq))
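A short sketch of how the odds model and the sigmoid fit together. The coefficient values and the helper name predict_probability are hypothetical, used only to make the arithmetic concrete.

```python
import math

def sigmoid(z):
    """Map a log-odds value z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(betas, x):
    """betas = [b0, b1, ..., bq], x = [x1, ..., xq]."""
    log_odds = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
    return sigmoid(log_odds)

# Hypothetical coefficients, for illustration only.
betas = [-1.0, 0.5, 2.0]
x = [2.0, 0.0]

p = predict_probability(betas, x)   # log-odds = -1 + 0.5*2 + 2*0 = 0
odds = p / (1.0 - p)                # odds = e^0 = 1, so p = 0.5
```

Odds and probability carry the same information: odds = p / (1 - p) and p = odds / (1 + odds).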
Model Assumption:
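Misclassification rate:
proportion of cases assigned to the wrong class at the chosen cutoff
the default cutoff is 0.5: cases with an estimated probability above 0.5 are labeled as events (Y = 1)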
Sensitivity: how well the model / cutoff combination detects events
Sensitivity = (# correctly classified events) / (total # of event observations)
Specificity: how well the model / cutoff combination identifies non-events
Specificity = (# correctly classified non-events) / (total # of non-event observations)
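The cutoff rule and the three rates can be sketched as follows. The function names and the toy labels and probabilities are hypothetical; the formulas are the ones stated in these notes.

```python
def classify(probs, cutoff=0.5):
    """Label a case as an event (1) when its probability exceeds the cutoff."""
    return [1 if p > cutoff else 0 for p in probs]

def confusion_metrics(y_true, y_pred):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # events caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # events missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-events kept
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "misclassification_rate": (fp + fn) / len(pairs),
    }

# Toy data, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

at_default = confusion_metrics(y_true, classify(probs))        # cutoff 0.5
at_lower = confusion_metrics(y_true, classify(probs, 0.25))    # cutoff 0.25
```

Lowering the cutoff from 0.5 to 0.25 on this toy data raises sensitivity from 0.75 to 1.0 and lowers specificity from 0.75 to 0.5, which is the usual trade-off.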
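ROC : curve of sensitivity against (1 - specificity) over all possible cutoffs
AUC : area under the ROC curve
A higher AUC (ROC curve closer to the top-left corner) indicates better prediction; an AUC of 0.5 is no better than random guessing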
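A minimal sketch of building the ROC curve by sweeping the cutoff over every observed probability, then taking the area with the trapezoid rule. The function names and the toy data are hypothetical, and the code assumes both classes are present. Here a case counts as an event when its probability is at or above the cutoff.

```python
def roc_points(y_true, probs):
    """Return (1 - specificity, sensitivity) pairs, one per cutoff."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]  # cutoff above every probability: nothing flagged
    for cutoff in sorted(set(probs), reverse=True):
        tp = sum(1 for t, p in zip(y_true, probs) if p >= cutoff and t == 1)
        fp = sum(1 for t, p in zip(y_true, probs) if p >= cutoff and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Trapezoid-rule area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Toy data, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

curve = roc_points(y_true, probs)
area = auc(curve)   # 15 of the 16 event / non-event pairs are ranked correctly
```

The AUC equals the share of (event, non-event) pairs in which the event received the higher probability, which is why 0.5 corresponds to random ranking.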
Fits well:
check whether the fitted relationships are reasonable
Hosmer-Lemeshow (HL) test
H0: the model accurately describes the data
H1: the model does not accurately describe the data
Small p-value: reject H0
Large p-value: fail to reject H0
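A rough sketch of the HL statistic, assuming the common form in which cases are sorted by estimated probability, split into groups of near-equal size, and observed events are compared with expected events in each group. The function name and toy data are hypothetical. The statistic is usually compared with a chi-square distribution with (number of groups - 2) degrees of freedom; the p-value step is left out here.

```python
def hosmer_lemeshow_statistic(y_true, probs, n_groups=10):
    """Sum over groups of (observed - expected)^2 / (n * p_bar * (1 - p_bar))."""
    pairs = sorted(zip(probs, y_true))   # order cases by estimated probability
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue
        n_g = len(chunk)
        observed = sum(t for _, t in chunk)   # events seen in the group
        expected = sum(p for p, _ in chunk)   # events the model expects
        p_bar = expected / n_g
        # Assumes 0 < p_bar < 1; otherwise the denominator is zero.
        stat += (observed - expected) ** 2 / (n_g * p_bar * (1.0 - p_bar))
    return stat

# Toy data, for illustration only (two groups keeps the arithmetic visible).
probs = [0.25] * 4 + [0.75] * 4
well_calibrated = [1, 0, 0, 0, 1, 1, 1, 0]   # 1 of 4 and 3 of 4 events
too_many_events = [1, 1, 0, 0, 1, 1, 1, 1]   # 2 of 4 and 4 of 4 events

stat_good = hosmer_lemeshow_statistic(well_calibrated, probs, n_groups=2)
stat_bad = hosmer_lemeshow_statistic(too_many_events, probs, n_groups=2)
```

A statistic near zero means observed and expected counts agree, which corresponds to a large p-value and no evidence against H0.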
Observations independent :
Pearson residual check
if the residuals show a pattern - observations are dependent
if the residuals show no pattern - observations are independent
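A small sketch of the Pearson residuals used in this check, assuming the standard definition (observed minus estimated probability, divided by the estimated standard deviation of a 0/1 outcome). The function name and toy values are hypothetical. The residuals would then be plotted against observation order to look for a pattern.

```python
import math

def pearson_residuals(y_true, probs):
    """r_i = (y_i - p_i) / sqrt(p_i * (1 - p_i)); assumes 0 < p_i < 1."""
    return [(t - p) / math.sqrt(p * (1.0 - p)) for t, p in zip(y_true, probs)]

# Toy data, for illustration only.
y_true = [1, 0, 1, 0]
probs = [0.8, 0.2, 0.5, 0.5]

residuals = pearson_residuals(y_true, probs)   # approximately [0.5, -0.5, 1.0, -1.0]
```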
Intercept : the log-odds of an event when all predictors are equal to zero
coefficient of the j-th predictor : is the value such that a unit increase in the j-th predictor is associated with the odds of an “event” increasing or decreasing by a multiplicative factor of e^(βj) , holding everything else constant
coefficient of a dummy variable : is the value such that the ratio of the odds for the dummy value to the odds for the reference level is e^(βj) , holding everything else constant
coefficient of an interaction term : βij is the value such that e^(βij) is the multiplicative change in the odds ratio for Xi when Xj moves from the reference level to the dummy value (Xj is a dummy variable)
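The interpretations above can be checked numerically. This is a sketch with hypothetical coefficient values for one continuous predictor x1, one dummy d, and their interaction; none of the numbers come from a fitted model.

```python
import math

# Hypothetical coefficients, for illustration only.
b0, b_x1, b_d, b_x1d = -2.0, 0.7, 1.2, -0.3

def odds(x1, d):
    """Odds(Y = 1) = e^(b0 + b_x1*x1 + b_d*d + b_x1d*x1*d)."""
    return math.exp(b0 + b_x1 * x1 + b_d * d + b_x1d * x1 * d)

baseline_odds = odds(0, 0)                  # e^(b0): all predictors at zero
or_x1_reference = odds(1, 0) / odds(0, 0)   # e^(b_x1): unit increase, d = 0
or_dummy = odds(0, 1) / odds(0, 0)          # e^(b_d): dummy vs reference, x1 = 0
or_x1_dummy = odds(1, 1) / odds(0, 1)       # e^(b_x1 + b_x1d): unit increase, d = 1

# The interaction coefficient scales the x1 odds ratio between the two levels.
interaction_factor = or_x1_dummy / or_x1_reference   # e^(b_x1d)
```

With an interaction in the model, the dummy's odds ratio e^(b_d) applies at x1 = 0 only; at other values of x1 it is multiplied by e^(b_x1d * x1).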