Logistic Regression

Introduction

Task :

Steps :

Estimate the probability of each class
Apply a cutoff to these probabilities to classify each case into one of the classes

Method: Maximum Likelihood Estimation
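
As a rough illustration of maximum likelihood estimation, the sketch below fits a one-predictor logistic model by gradient ascent on the log-likelihood. The data, the learning rate, the iteration count and the function names are all made up for this example; real software typically uses Newton-type iterations instead of plain gradient ascent.

```python
import math

def sigmoid(z):
    """Map a log-odds value z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(beta, xs, ys):
    """Bernoulli log-likelihood of the model logit(p) = beta[0] + beta[1] * x."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(beta[0] + beta[1] * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit_logistic(xs, ys, lr=0.1, n_iter=5000):
    """Maximise the log-likelihood by gradient ascent (illustrative only)."""
    beta = [0.0, 0.0]
    n = len(xs)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            resid = y - sigmoid(beta[0] + beta[1] * x)  # observed minus fitted
            g0 += resid          # gradient with respect to the intercept
            g1 += resid * x      # gradient with respect to the slope
        beta[0] += lr * g0 / n
        beta[1] += lr * g1 / n
    return beta

# Hypothetical toy data: the classes overlap, so the estimates stay finite.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

beta_hat = fit_logistic(xs, ys)
print("estimated coefficients:", beta_hat)
print("log-likelihood at estimate:", log_likelihood(beta_hat, xs, ys))
```

The fitted coefficients are the values under which the observed labels are most probable; the log-likelihood at the estimate should be higher than at the starting point (0, 0).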

Hypothesis Model:

Odds(Y = 1) = e^(β0 + β1 x1 + β2 x2 + ... + βq xq)

Sigmoid function:

Converts the log-odds z = β0 + β1 x1 + ... + βq xq into a probability: P(Y = 1) = 1 / (1 + e^(-z)) = Odds / (1 + Odds)
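
A minimal sketch of how the odds model and the sigmoid function relate. The coefficient values and the predictor values are hypothetical, chosen only to show that the sigmoid of the log-odds equals Odds / (1 + Odds).

```python
import math

def linear_predictor(beta, x):
    """Log-odds: beta[0] + beta[1] * x[0] + ... + beta[q] * x[q-1]."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

def odds(beta, x):
    """Odds(Y = 1) = e^(linear predictor)."""
    return math.exp(linear_predictor(beta, x))

def sigmoid(z):
    """Convert a log-odds value to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

beta = [-1.0, 0.8, 0.5]   # hypothetical coefficients (β0, β1, β2)
x = [1.0, 2.0]            # hypothetical predictor values (x1, x2)

o = odds(beta, x)
p = sigmoid(linear_predictor(beta, x))

print("odds:", o)
print("probability via sigmoid:", p)
print("probability via odds / (1 + odds):", o / (1 + o))
```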

Model Assumption:

Misclassification rate:

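The proportion of cases whose predicted class differs from the observed class. The default cutoff is 0.5: cases with an estimated probability above 0.5 are labelled as events.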
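
The cutoff rule and the resulting misclassification rate can be sketched as follows. The probabilities and labels are invented for illustration, and the function names are not from any particular library.

```python
def classify(probs, cutoff=0.5):
    """Label a case 1 (event) when its probability exceeds the cutoff."""
    return [1 if p > cutoff else 0 for p in probs]

def misclassification_rate(y_true, y_pred):
    """Share of cases whose predicted label differs from the observed one."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)

# Hypothetical estimated probabilities and observed labels
probs = [0.10, 0.40, 0.35, 0.80, 0.65, 0.55, 0.20, 0.90]
y_true = [0, 0, 1, 1, 1, 0, 0, 1]

y_pred = classify(probs)                       # default cutoff of 0.5
rate = misclassification_rate(y_true, y_pred)
print(y_pred)   # [0, 0, 0, 1, 1, 1, 0, 1]
print(rate)     # 0.25 (2 of 8 cases misclassified)
```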

ROC and AUC :

Sensitivity: how sensitive the model / cutoff combination is to detecting events

Sensitivity = (# correctly classified events) / (total # event observations)

Specificity: how well the model / cutoff combination avoids labelling non-events as events

Specificity = (# correctly classified non-events) / (total # non-event observations)
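
A small sketch of both ratios for a given cutoff, using invented probabilities and labels. Lowering the cutoff in this toy example raises sensitivity and lowers specificity, which is the trade-off these two measures capture.

```python
def sensitivity_specificity(y_true, probs, cutoff=0.5):
    """Return (sensitivity, specificity) for one model / cutoff combination."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, probs):
        pred = 1 if p > cutoff else 0
        if t == 1 and pred == 1:
            tp += 1          # event correctly classified
        elif t == 1:
            fn += 1          # event missed
        elif pred == 0:
            tn += 1          # non-event correctly classified
        else:
            fp += 1          # non-event flagged as event
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical estimated probabilities and observed labels
probs = [0.10, 0.40, 0.35, 0.80, 0.65, 0.55, 0.20, 0.90]
y_true = [0, 0, 1, 1, 1, 0, 0, 1]

sens_05, spec_05 = sensitivity_specificity(y_true, probs, cutoff=0.5)
sens_03, spec_03 = sensitivity_specificity(y_true, probs, cutoff=0.3)
print(sens_05, spec_05)   # 0.75 0.75
print(sens_03, spec_03)   # 1.0 0.5
```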

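ROC : curve of sensitivity against (1 - specificity) over all possible cutoffs

AUC : area under the ROC curve

A higher AUC (an ROC curve closer to the top-left corner) indicates better prediction; 0.5 corresponds to random guessing and 1 to perfect separation of events and non-events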
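
One way to trace the ROC curve and compute the AUC by hand is to sweep the cutoff over the observed probabilities and apply the trapezoid rule. This is a sketch on invented data, and it labels a case as an event when its probability is at or above the cutoff, which is a convention chosen for this example.

```python
def roc_points(y_true, probs):
    """(1 - specificity, sensitivity) pairs, one per distinct cutoff."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]    # cutoff above every probability: nothing flagged
    for c in sorted(set(probs), reverse=True):
        tp = sum(1 for t, p in zip(y_true, probs) if p >= c and t == 1)
        fp = sum(1 for t, p in zip(y_true, probs) if p >= c and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical estimated probabilities and observed labels
probs = [0.10, 0.40, 0.35, 0.80, 0.65, 0.55, 0.20, 0.90]
y_true = [0, 0, 1, 1, 1, 0, 0, 1]

points = roc_points(y_true, probs)
area = auc(points)
print(points)
print(area)   # 0.875
```

The same value can be read as the share of (event, non-event) pairs in which the event has the higher estimated probability: 14 of the 16 pairs here.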

Fits well:

Whether the estimated relationships are reasonable or not
HL-test (Hosmer-Lemeshow test)
- H0: the model accurately describes the data; H1: the model does not accurately describe the data
- P-value small - reject H0 (evidence of poor fit)
- P-value large - fail to reject H0 (no evidence of poor fit)
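
A sketch of the Hosmer-Lemeshow statistic, assuming the usual construction: sort cases by estimated probability, split them into groups, and compare observed with expected event counts in each group. The data, the number of groups and the function name are illustrative. The p-value is not computed here; it would come from comparing the statistic with a chi-square distribution with (number of groups - 2) degrees of freedom.

```python
def hosmer_lemeshow_statistic(y_true, probs, n_groups=10):
    """Sum over groups of (observed - expected)^2 / (n_g * p_g * (1 - p_g))."""
    pairs = sorted(zip(probs, y_true), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue                          # more groups than cases
        n_g = len(group)
        observed = sum(y for _, y in group)   # observed events in the group
        expected = sum(p for p, _ in group)   # expected events in the group
        mean_p = expected / n_g               # assumed strictly between 0 and 1
        stat += (observed - expected) ** 2 / (n_g * mean_p * (1 - mean_p))
    return stat

# Hypothetical estimated probabilities and observed labels
probs = [0.10, 0.40, 0.35, 0.80, 0.65, 0.55, 0.20, 0.90]
y_true = [0, 0, 1, 1, 1, 0, 0, 1]

hl = hosmer_lemeshow_statistic(y_true, probs, n_groups=4)
print("HL statistic:", hl)
```

A large statistic (small p-value) means observed and expected counts disagree, which is evidence against H0.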

Observations independent :

Intercept : β0 is the log of the odds when all predictors are equal to zero
coefficient of the j th predictor : βj is the value such that a unit increase in the j th predictor multiplies the odds of an “event” by e^(βj) (an increase if βj > 0, a decrease if βj < 0), holding everything else constant
coefficient of a dummy variable : βj is the value such that the ratio of the odds for the dummy value to the odds for the reference level is e^(βj), holding everything else constant
coefficient of an interaction term : βij is the value such that e^(βij) is the multiplicative increase or decrease in the odds ratio for Xi between the reference level and the dummy value (Xj is a dummy variable)
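
These interpretations can be checked numerically. The sketch below uses a hypothetical model with one continuous predictor ("age"), one dummy variable ("smoker") and their interaction; the names and coefficient values are invented for illustration only.

```python
import math

# Hypothetical coefficients on the log-odds scale
beta = {"intercept": -2.0, "age": 0.05, "smoker": 0.7, "age:smoker": 0.02}

def odds(age, smoker):
    """Odds of an event for a given age and smoker dummy (0 or 1)."""
    z = (beta["intercept"]
         + beta["age"] * age
         + beta["smoker"] * smoker
         + beta["age:smoker"] * age * smoker)
    return math.exp(z)

# Intercept: log-odds when all predictors are zero
log_odds_at_zero = math.log(odds(0, 0))

# Continuous predictor: one more unit multiplies the odds by e^(beta_age)
# (shown at the reference level, smoker = 0)
or_age_reference = odds(41, 0) / odds(40, 0)

# Dummy variable: odds ratio of dummy value to reference level
# (shown at age = 0, because the model includes an interaction)
or_smoker = odds(0, 1) / odds(0, 0)

# Interaction: the per-unit odds ratio for age differs between the two
# levels of the dummy by a factor of e^(beta_interaction)
or_age_smoker = odds(41, 1) / odds(40, 1)
ratio_of_odds_ratios = or_age_smoker / or_age_reference

print(log_odds_at_zero, or_age_reference, or_smoker, ratio_of_odds_ratios)
```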
