When we get the data, after data cleaning, pre-processing, and wrangling, the first step is to feed it to a model and, of course, get output as probabilities. But hold on: how on earth can we measure the effectiveness of our model?
Logistic regression is a machine learning method used for classification problems, where you need to distinguish one class from another. The simplest case is binary classification. This is like a question we can answer with either "yes" or "no": we only have two classes, a positive class and a negative class.
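A minimal sketch of that binary case, using scikit-learn on a made-up one-feature dataset (the feature values and labels here are invented purely for illustration):

```python
# Binary logistic regression sketch: one feature, two classes (0 = "no", 1 = "yes").
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data, e.g. hours of study vs. pass/fail
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [0, 0, 0, 1, 1, 1]

model = LogisticRegression()
model.fit(X, y)

# predict_proba returns [P(class 0), P(class 1)] for each sample,
# which is the "output in probabilities" the classifier produces
probs = model.predict_proba([[1.5], [5.5]])
print(probs)
print(model.predict([[1.5], [5.5]]))  # the low score falls in class 0, the high one in class 1
```

Thresholding those probabilities (by default at 0.5) is what turns the model's continuous output into a hard "yes"/"no" answer.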
There are various machine learning algorithms that can be put to use for classification problems. One such algorithm is the Decision Tree algorithm, which, apart from classification, can also be used to solve regression problems. Though it is one of the simplest classification algorithms, it can yield remarkably accurate results if its parameters are tuned properly.
Logistic regression is a powerful statistical tool used to solve binary classification problems, such as predicting whether an event will occur or not. It essentially models a linear relationship between one or more independent variables and the log-odds of the dependent variable.
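That linear relationship lives on the log-odds scale: the model computes a linear score and then squashes it through the sigmoid function to get a probability. A small sketch in plain Python, with hypothetical fitted coefficients chosen only for illustration:

```python
import math

def sigmoid(z):
    """Map a linear score z = w*x + b onto a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients; a real model would learn these from data
w, b = 0.8, -2.0

def predict_proba(x):
    # Linear score on the log-odds scale, then sigmoid -> probability
    return sigmoid(w * x + b)

print(predict_proba(2.5))  # score is exactly 0 here, so the probability is 0.5
```

Note the symmetry: a score of 0 maps to probability 0.5, which is why 0.5 is the natural default decision threshold.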
This confusion matrix gives a lot of information about the model's performance: as usual, the diagonal elements are the correctly predicted samples. A total of 145 samples were correctly predicted out of the total 191 samples. Thus, the overall accuracy is 75.92%.
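The arithmetic behind that accuracy figure is just the diagonal sum over the grand total. A sketch with a hypothetical 3-class confusion matrix constructed so its counts match the numbers quoted above (the individual cell values are invented):

```python
import numpy as np

# Hypothetical confusion matrix: rows = true class, columns = predicted class.
# Cell counts are made up, but the diagonal sums to 145 of 191 total samples.
cm = np.array([
    [50,  5,  3],
    [ 8, 45,  6],
    [ 7, 17, 50],
])

correct = np.trace(cm)   # diagonal elements = correctly predicted samples
total = cm.sum()         # all samples
accuracy = correct / total
print(f"{correct}/{total} correct -> accuracy {accuracy:.2%}")
```

145 / 191 ≈ 0.7592, i.e. the 75.92% overall accuracy cited above.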
I like animals, so let's explore KNN by looking at the differences between cats and dogs. They're both four-legged mammals with tails, which some people keep as pets, but there are a few ways in which they differ. Most dogs are bigger than most cats, and most dogs are friendlier than most cats. That's shown on the graph here: the red dots represent dogs and the blue dots represent cats.
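The same cat/dog idea can be sketched in a few lines with scikit-learn's k-nearest-neighbors classifier. The sizes and friendliness scores below are invented for illustration, but they mirror the two axes of the graph:

```python
# KNN sketch on the cat/dog example; feature values are fabricated.
from sklearn.neighbors import KNeighborsClassifier

# Features: [size in kg, friendliness score 0-10]
X = [[4, 3], [5, 2], [3, 4],       # cats: small, less friendly
     [20, 8], [25, 9], [30, 7]]    # dogs: big, friendlier
y = ["cat", "cat", "cat", "dog", "dog", "dog"]

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)

# A 22 kg, fairly friendly animal sits among the dog points,
# so its 3 nearest neighbors are all dogs
print(knn.predict([[22, 8]]))
```

KNN has no real "training" step beyond storing the points; classification is a vote among the k closest stored examples.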
As we already know from our previous discussion on Regression Trees, tree algorithms are greedy in nature, which means they tend to choose the better node now rather than a node that would create a better tree later. Also, in contrast to the regression tree model, where the prediction...
A decision tree is a widely used supervised learning algorithm that is suitable for both classification and regression tasks. Decision trees serve as building blocks for some prominent ensemble learning algorithms such as random forests, GBDT, and XGBoost. A decision tree is built by iteratively asking questions that partition the data. For instance, the following figure represents a decision tree used as a model to predict customer churn.
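A minimal churn-prediction sketch in scikit-learn; the features, thresholds, and labels below are fabricated for illustration and stand in for the figure's tree:

```python
# Decision tree sketch for churn prediction on invented data.
from sklearn.tree import DecisionTreeClassifier

# Features: [monthly_charges, tenure_months]; label 1 = churned, 0 = stayed
X = [[70, 2], [80, 1], [90, 3],      # short-tenure, high-charge customers churned
     [30, 40], [40, 50], [20, 60]]   # long-tenure, low-charge customers stayed
y = [1, 1, 1, 0, 0, 0]

# A shallow tree keeps the learned "questions" easy to read
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# New customers: one resembling the churners, one resembling the stayers
print(tree.predict([[85, 2], [25, 55]]))
```

Exporting the fitted tree (e.g. with `sklearn.tree.export_text`) shows exactly which question each node asks, which is what the churn figure visualizes.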
Decision Trees are powerful machine learning algorithms capable of performing regression and classification tasks. To understand a decision tree, picture an inverted tree-like structure (like that of a family tree). We start at the root of the tree, which contains our training data. From the root, we split our dataset into distinct child nodes according to certain conditions, much like a chain of if/else statements, until we reach the leaf nodes.
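In fact, a fitted tree's splits really are just nested if/else conditions. This hand-written sketch mimics a tiny tree on iris-like flower measurements; the thresholds and class names are chosen for illustration, not taken from a real fitted model:

```python
# A decision tree unrolled into the if/else conditions it encodes.
def predict(petal_length, petal_width):
    if petal_length < 2.5:           # root node split
        return "setosa"              # leaf node
    else:
        if petal_width < 1.75:       # internal (decision) node
            return "versicolor"      # leaf node
        else:
            return "virginica"       # leaf node

print(predict(1.4, 0.2))
print(predict(4.5, 1.3))
print(predict(5.5, 2.0))
```

Training a tree is essentially the process of choosing these conditions and thresholds automatically, so that each leaf ends up as pure as possible.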
As you can see from the diagram above, a decision tree starts with a root node, which does not have any incoming branches. The outgoing branches from the root node then feed into the internal nodes, also known as decision nodes. Based on the available features, both node types conduct evaluations to form homogeneous subsets, which are denoted by leaf nodes, or terminal nodes. The leaf nodes represent all the possible outcomes within the dataset.
A decision tree is a decision-support tool with a tree-like structure that models probable outcomes, cost of resources, utilities, and possible consequences. Decision trees provide a way to present algorithms with conditional control statements. They include branches that represent decision-making steps that can lead to a favorable result.
Recommendation Engines: Using clickstream data from websites, the KNN algorithm has been used to provide automatic recommendations to users on additional content. This research shows that a user is assigned to a particular group, and based on that group's behavior, the user is given a recommendation. However, given the scaling issues with KNN, this approach may not be optimal for larger datasets.