🌞Supervised learning is a fundamental machine learning technique used in various fields. Here’s a brief overview:
✍️Definition:
☘️Supervised learning involves training a model on labeled data to make predictions or decisions based on input features. The algorithm learns a mapping between input and output data, aiming to predict accurate outcomes for new, unseen data.
✍️Examples:
☘️Classification: Assigns labels to input data, often used for tasks with distinct categories. For instance, classifying images of fruits into categories like apples, oranges, and bananas.
☘️Regression: Predicts a continuous numerical value based on input features.
✍️Popular Algorithms:
☘️Support Vector Machines (SVM)
☘️Decision Trees
☘️Random Forests
☘️Logistic Regression
☘️K-Nearest Neighbors (KNN)
✍️Linear Regression:
☘️Linear regression is a type of regression algorithm that is used to predict a continuous output value. It is one of the simplest and most widely used algorithms in supervised learning. In linear regression, the algorithm tries to find a linear relationship between the input features and the output value. The output value is predicted based on the weighted sum of the input features.
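The weighted-sum idea above can be sketched with scikit-learn (assumed available here). The toy data follows y = 3x + 2 exactly, so the learned weight and bias should recover those values:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data generated from y = 3x + 2 with no noise,
# so ordinary least squares should fit it exactly.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = 3.0 * X.ravel() + 2.0

model = LinearRegression()
model.fit(X, y)

print(model.coef_[0])             # learned weight, ~3.0
print(model.intercept_)           # learned bias, ~2.0
print(model.predict([[5.0]])[0])  # ~17.0 for a new input
```

The prediction for a new input is just the weighted sum of its features plus the intercept.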
✍️Logistic Regression:
☘️Logistic regression is a type of classification algorithm that is used to predict a binary output variable. It is commonly used in machine learning applications where the output variable is either true or false, such as in fraud detection or spam filtering. In logistic regression, the algorithm tries to find a linear relationship between the input features and the output variable. The output variable is then transformed using a logistic function to produce a probability value between 0 and 1.
✍️Decision Trees:
☘️A decision tree is a tree-like structure that is used to model decisions and their possible consequences. Each internal node in the tree represents a decision, while each leaf node represents a possible outcome. Decision trees can model complex relationships between input features and output variables.
☘️A decision tree is a type of algorithm that is used for both classification and regression tasks.
👉Decision Trees Regression: Decision trees can be utilized for regression tasks by predicting the value associated with a leaf node, typically the mean target value of the training samples that reach it.
👉Decision Trees Classification: Each leaf node holds a class label; an input is classified by following the feature tests from the root down to a leaf.
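Both uses can be sketched with scikit-learn (assumed available). The fruit features below (weight in grams, texture flag) are invented for illustration:

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification: hypothetical fruit data, [weight_g, texture(0=smooth, 1=bumpy)].
X = [[150, 0], [170, 0], [130, 1], [140, 1]]
y = ["apple", "apple", "orange", "orange"]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)
print(clf.predict([[160, 0]])[0])  # "apple": follows the same branch as the apples

# Regression: each leaf predicts the mean target of the samples that reach it.
Xr = [[1], [2], [3], [4]]
yr = [1.5, 2.5, 3.5, 4.5]
reg = DecisionTreeRegressor(random_state=0)
reg.fit(Xr, yr)
print(reg.predict([[2]])[0])  # 2.5: the leaf containing x=2
```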
✍️Random Forests:
☘️Random forests are made up of multiple decision trees that work together to make predictions. Each tree in the forest is trained on a different subset of the input features and data. The final prediction is made by aggregating the predictions of all the trees in the forest.
☘️Random forests are an ensemble learning technique that is used for both classification and regression tasks.
👉Random Forest Regression: It combines multiple decision trees to reduce overfitting and improve prediction accuracy.
👉Random Forest Classifier: Combines several decision trees to improve the accuracy of classification while minimizing overfitting.
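The aggregation described above can be sketched with scikit-learn (assumed available), using a synthetic dataset so the example stays self-contained:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem for illustration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees, each trained on a bootstrap sample of the data and
# considering a random subset of features at each split; the forest's
# prediction is a majority vote over the trees.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

acc = forest.score(X_test, y_test)
print(acc)  # held-out accuracy
```

Averaging over many decorrelated trees is what reduces the overfitting a single deep tree would exhibit.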
✍️Support Vector Machines (SVM):
☘️The SVM algorithm creates a hyperplane to segregate n-dimensional space into classes and identify the correct category of new data points. The extreme cases that help create the hyperplane are called support vectors, hence the name Support Vector Machine.
☘️A Support Vector Machine is a type of algorithm that is used for both classification and regression tasks.
👉Support Vector Regression: It is an extension of Support Vector Machines (SVM) used for predicting continuous values.
👉Support Vector Classifier: It aims to find the best hyperplane that maximizes the margin between data points of different classes.
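A minimal sketch of a linear support vector classifier with scikit-learn (assumed available), on two invented, well-separated clusters; `support_vectors_` exposes the extreme points that define the margin:

```python
from sklearn.svm import SVC

# Hypothetical 2-D data: one cluster near the origin, one near (3.5, 3.5).
X = [[0, 0], [1, 1], [1, 0], [0, 1], [3, 3], [4, 4], [3, 4], [4, 3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

clf = SVC(kernel="linear")
clf.fit(X, y)

# The support vectors are the training points closest to the separating hyperplane.
print(clf.support_vectors_)
preds = clf.predict([[0.5, 0.5], [3.5, 3.5]])
print(preds)  # one point from each cluster
```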
✍️K-Nearest Neighbors (KNN):
☘️KNN works by finding the k training examples closest to a given input and then predicting the class or value based on the majority class or average value of these neighbors. The performance of KNN is influenced by the choice of k and the distance metric used to measure proximity. It is intuitive, but it can be sensitive to noisy data and requires careful selection of k for optimal results.
☘️K-Nearest Neighbors (KNN) is a type of algorithm that is used for both classification and regression tasks.
👉K-Nearest Neighbors Regression: It predicts continuous values by averaging the outputs of the k closest neighbors.
👉K-Nearest Neighbors Classification: Data points are classified based on the majority class of their k closest neighbors.
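Both uses can be sketched with scikit-learn (assumed available), on a made-up one-feature dataset with two well-separated groups:

```python
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

# Hypothetical data: one group near x=2, another near x=11.
X = [[1], [2], [3], [10], [11], [12]]
y_cls = [0, 0, 0, 1, 1, 1]

# Classification: majority vote among the k=3 nearest neighbors.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y_cls)
print(knn.predict([[2.5]])[0])  # 0: all three nearest points are class 0

# Regression: average of the k nearest targets.
y_reg = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
reg = KNeighborsRegressor(n_neighbors=3)
reg.fit(X, y_reg)
print(reg.predict([[2.0]])[0])  # mean of 1.0, 2.0, 3.0 = 2.0
```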
✍️Gradient Boosting:
☘️Gradient Boosting combines weak learners, like decision trees, to create a strong model. It iteratively builds new models that correct errors made by previous ones. Each new model is trained to minimize residual errors, resulting in a powerful predictor capable of handling complex data relationships.
☘️Gradient Boosting is a type of algorithm that is used for both classification and regression tasks.
👉Gradient Boosting Regression: It builds an ensemble of weak learners to improve prediction accuracy through iterative training.
👉Gradient Boosting Classification: Creates a sequence of classifiers that iteratively enhance the accuracy of predictions.
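The iterative residual-fitting idea can be sketched with scikit-learn (assumed available), on a synthetic regression dataset:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression problem for illustration.
X, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 100 shallow trees is fit to the residual errors of the
# ensemble built so far; learning_rate scales each tree's contribution.
gbr = GradientBoostingRegressor(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=0
)
gbr.fit(X_train, y_train)

r2 = gbr.score(X_test, y_test)
print(r2)  # held-out R^2 score
```

Because each weak learner only needs to correct what the previous ones got wrong, the ensemble can capture complex relationships that a single shallow tree cannot.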
CLASS MATERIALS
🌚Data Sources:
☘️Kaggle: 👉 Link
☘️UCI Machine Learning: 👉 Link
🌚Data Representation: 👉PDF
🌚Time Series Data: 👉PDF
🌚Structured and Unstructured Data: 👉PDF
🌚Basic concept of Supervised learning: 👉PDF
🌚Statistical Learning: 👉PDF
🌚Support Vector Machines (SVMs): 👉PDF
🌚Decision Trees: 👉PDF
🌚Random Forest: 👉PDF
🌚Logistic Regression: 👉PDF