Probit and logit models are among the most popular models for a target variable that represents a binary choice. The dependent variable is a binary response, choice, or condition, typically coded as 0 or 1 (for example, survival or not). Other common examples include whether a consumer makes a purchase and whether an individual participates in the labor market. Logistic regression models the relationship between this binary dependent variable and one or more independent variables: applying the logistic function to a linear combination of the regressors yields an estimated probability. Probit regression addresses the same problem, but uses the cumulative distribution function of the standard normal distribution instead of the logistic function.
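To make the distinction concrete, the following is a minimal sketch in R (the language of the textbook referenced below) comparing the two link functions over a range of hypothetical values of the linear index. Both map the real line into (0, 1); the two curves differ only slightly in shape.

```r
# Logit and probit link functions map a linear index to a probability in (0, 1).
# plogis() is the logistic CDF, pnorm() the standard normal CDF.
z <- seq(-4, 4, by = 0.1)          # hypothetical values of the linear index X'beta

p_logit  <- plogis(z)              # logit:  P(Y = 1 | X) = 1 / (1 + exp(-z))
p_probit <- pnorm(z)               # probit: P(Y = 1 | X) = Phi(z)

plot(z, p_logit, type = "l", lty = 1,
     xlab = "linear index", ylab = "P(Y = 1 | X)")
lines(z, p_probit, lty = 2)
legend("topleft", legend = c("logit", "probit"), lty = c(1, 2))
```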
Logistic regression can be viewed as a particular instance of the generalized linear model and is, in that sense, analogous to linear regression. It is, however, premised on quite different assumptions about the relationship between the dependent and independent variables. In logistic regression the conditional distribution of the dependent variable is Bernoulli rather than Gaussian, because the dependent variable is binary. The predicted values are probabilities and are therefore bounded by (0, 1): logistic regression predicts the probability of a particular outcome rather than the outcome itself. We use a dataset developed by Munnell, Tootell, Browne, and McEneaney (1996). For background on the Varian (2014) HMDA example, please follow the link; Varian (2014) provides some interesting perspective on the dataset. Below we follow the approach outlined in Chapter 11 of the online text Introduction to Econometrics with R.
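As a sketch of that chapter's approach, the code below fits a logit and a probit model of mortgage denial on the payments-to-income ratio. It assumes the HMDA data shipped with the AER package; the variable names `deny` and `pirat` follow that package and may differ if the data are loaded from other files.

```r
# Minimal sketch following Chapter 11 of Introduction to Econometrics with R.
# Assumes the HMDA data from the AER package, with `deny` (application denied,
# yes/no) and `pirat` (payments-to-income ratio) named as in that package.
library(AER)
data(HMDA)

# Recode the factor `deny` as a 0/1 variable so fitted values are probabilities.
HMDA$deny <- as.numeric(HMDA$deny == "yes")

# Logit model: P(deny = 1 | pirat) = F(beta0 + beta1 * pirat), F the logistic CDF.
logit_mod <- glm(deny ~ pirat, family = binomial(link = "logit"), data = HMDA)

# Probit model: same specification, with F the standard normal CDF.
probit_mod <- glm(deny ~ pirat, family = binomial(link = "probit"), data = HMDA)

summary(logit_mod)
summary(probit_mod)

# Predicted denial probabilities at hypothetical payments-to-income ratios.
newdata <- data.frame(pirat = c(0.2, 0.4))
predict(logit_mod, newdata = newdata, type = "response")
predict(probit_mod, newdata = newdata, type = "response")
```

Both models are estimated with `glm()` and differ only in the link argument, which makes it straightforward to compare their predicted probabilities side by side.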
The code and data are available on the MIT website.