Linear regression is a statistical technique that establishes a relationship between a dependent variable and one or more independent variables by fitting a linear equation to the data. It predicts continuous outcomes by minimizing the gap between actual and predicted values, typically measured as the mean squared error, so that the fitted line best represents the observed data.
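To make this concrete, here is a minimal sketch, assuming scikit-learn and NumPy on a small synthetic dataset (the slope, intercept, and noise level are made up for illustration), that fits a line and reports the mean squared error the fit minimizes:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))               # one independent variable
y = 3.0 * X.ravel() + 2.0 + rng.normal(0, 1, 100)   # linear signal plus noise

model = LinearRegression().fit(X, y)                # least-squares fit
y_pred = model.predict(X)

print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("MSE:", mean_squared_error(y, y_pred))        # the quantity the fit minimizes
```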
In contrast, logistic regression is used for classification tasks. It models the likelihood of a binary outcome through a logistic, or sigmoid, function. Unlike linear regression, which predicts continuous values, logistic regression generates probabilities, making it ideal for categorical target variables, often limited to binary classes (0 or 1).
Both methods make predictions from a linear combination of the inputs, and both perform best when the data suit that linear structure: an approximately linear relationship for regression, and linearly separable classes for classification. The key difference is that linear regression forecasts continuous values, whereas logistic regression estimates probabilities, making it suitable for classification problems, usually with binary outcomes.
Logistic regression employs the Sigmoid function, also known as the logistic function, to transform its linear predictions into probabilities that lie within a range of 0 to 1. This transformation is crucial, as it allows the model's outputs to be interpreted as probabilities, which is especially useful in classification tasks. The Sigmoid function has an “S”-shaped curve that maps any real-valued input to a value between 0 and 1, effectively compressing extreme values in either direction.
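To see this squashing behavior directly, here is a minimal sketch of the Sigmoid function, assuming NumPy; the sample inputs are arbitrary:

```python
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + exp(-z)) maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

for z in [-10, -2, 0, 2, 10]:
    print(f"sigmoid({z:>3}) = {sigmoid(z):.4f}")
# Large negative inputs approach 0, large positive inputs approach 1,
# and sigmoid(0) is exactly 0.5.
```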
The Sigmoid function's output can then be interpreted as the probability of an instance belonging to a particular class. By setting a threshold, often at 0.5, the model assigns a final classification: if the probability is greater than or equal to 0.5, the instance is classified as one class (e.g., 1), and if it is below 0.5, it is classified as the other class (e.g., 0). This threshold can be adjusted depending on the specific requirements of the classification problem, allowing flexibility in decision-making for cases where misclassification costs may differ.
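As a small illustration, the following sketch (the probabilities are made up) shows how the same predicted probabilities yield different labels under the default 0.5 cutoff and a stricter one:

```python
import numpy as np

probs = np.array([0.15, 0.48, 0.50, 0.73, 0.91])  # hypothetical model outputs

labels_default = (probs >= 0.5).astype(int)  # standard 0.5 threshold
labels_strict = (probs >= 0.7).astype(int)   # stricter threshold, e.g. when
                                             # false positives are costly
print(labels_default)  # [0 0 1 1 1]
print(labels_strict)   # [0 0 0 1 1]
```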
In logistic regression, Maximum Likelihood Estimation (MLE) is applied to estimate the model parameters, or weights, by finding the values that make the observed data most probable. MLE adjusts the parameters to maximize the likelihood function, which represents the probability of the observed outcomes given the input data and the model.

Because the outcomes are binary (0 or 1), the likelihood function is built from the probability of each observed label: for each data point, the Sigmoid function maps the linear prediction to a probability between 0 and 1, and these per-point probabilities are multiplied together into a single likelihood for the entire dataset. In practice, the logarithm of the likelihood is maximized instead, since it turns this product into a sum that is easier to differentiate and numerically more stable. By iteratively adjusting the parameters to maximize the log-likelihood, MLE finds the best-fit weights that make the predicted probabilities align closely with the actual labels, yielding a logistic regression model that is well calibrated to the training data.
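The sketch below, assuming NumPy and synthetic data, implements this idea directly with gradient ascent on the log-likelihood; the learning rate, iteration count, and true weights are arbitrary choices for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([1.5, -2.0])                     # weights that generate the data
y = (sigmoid(X @ true_w) > rng.uniform(size=200)).astype(float)

w = np.zeros(2)                                    # initial parameter guess
lr = 0.1
for _ in range(500):
    p = sigmoid(X @ w)                             # predicted probabilities
    grad = X.T @ (y - p)                           # gradient of the log-likelihood
    w += lr * grad / len(y)                        # ascend the log-likelihood

p = sigmoid(X @ w)
log_lik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
print("estimated weights:", w)
print("log-likelihood:", log_lik)
```

The estimated weights should land near the true weights used to generate the data, and the log-likelihood increases over the iterations as the predicted probabilities align with the labels.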
Logistic Regression accuracy: 79%
Multinomial Naïve Bayes accuracy: 68%
Logistic Regression outperformed Multinomial Naïve Bayes in this experiment, achieving higher accuracy and better identification of class 1. Logistic Regression's ability to model continuous features likely aligns better with the structure of this dataset. Multinomial Naïve Bayes might benefit from further tuning or additional feature engineering, but its lower accuracy suggests that it is less suitable for this dataset than Logistic Regression.
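For reference, a comparison like this is typically run along the following lines with scikit-learn. Since the original dataset is not specified, synthetic non-negative count-style features are generated here purely so the code runs; the 79% and 68% figures above will not be reproduced:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
n = 500
y = rng.integers(0, 2, size=n)                             # binary labels
rates = np.where(y[:, None] == 1, [3.0, 1.0], [1.0, 3.0])  # class-dependent rates
X = rng.poisson(rates)                                     # non-negative count features

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

for name, model in [("Logistic Regression", LogisticRegression(max_iter=1000)),
                    ("Multinomial Naive Bayes", MultinomialNB())]:
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy = {acc:.2f}")
```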