Support Vector Machines (SVMs) are a powerful tool in the world of machine learning, primarily used to categorize data—like sorting emails into spam and not-spam. They work by finding the best boundary that separates different kinds of data. This boundary is chosen so it has the most space around it, making it more reliable when new data comes along.
While SVMs are great with straightforward tasks where the data can be split by a straight line or flat surface, many real-world problems aren't that simple. To solve this, SVMs use a clever trick called the kernel trick. This method transforms the original data into a higher dimension where a simple line or surface can do the job of separation, all without having to compute and understand this new complex space.
In SVMs, the dot product (a mathematical way to measure how much one vector of numbers aligns with another) helps determine how similar data points are to each other. This similarity is crucial when deciding where to place the boundary in more complex spaces.
Polynomial Kernel: This tool lets SVMs draw curves instead of straight lines, allowing them to create more complex boundaries. It’s like fitting a curved path through a park rather than a straight sidewalk.
RBF Kernel (Radial Basis Function): This kernel looks at how close data points are to a central point and can adapt to almost any shape needed, making it very versatile for different scenarios.
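The "without having to compute the new space" idea can be checked numerically. The sketch below, using two hypothetical 2-D points, shows that a degree-2 polynomial kernel evaluated in the original space equals an ordinary dot product taken after an explicit feature expansion:

```python
import numpy as np

# Two hypothetical 2-D points, used only for illustration.
x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

def phi(v):
    # Explicit degree-2 polynomial feature map:
    # phi(v) = (v1^2, sqrt(2)*v1*v2, v2^2)
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])

# Dot product in the expanded 3-D feature space...
explicit = phi(x) @ phi(y)

# ...equals the polynomial kernel computed directly in the original 2-D space.
kernel = (x @ y) ** 2

assert np.isclose(explicit, kernel)  # both equal 121.0
```

The kernel side never builds the higher-dimensional vectors, which is exactly the saving the kernel trick provides.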
Such points are more likely to become linearly separable in the new feature space. This transformation increases the dimensionality and enhances the hyperplane-separation capabilities of linear models (such as support vector machines) by using non-linear combinations of the original features.
This elementary example shows how data that is not linearly separable can be transformed into a space where linear separation is feasible using a polynomial kernel.
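One minimal way to see this in practice is scikit-learn's synthetic concentric-circles dataset (a stand-in for the elementary example, not the document's actual data), where no straight line can separate the two classes but a polynomial kernel can:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not separable by any straight line in 2-D.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# A linear SVM cannot do much better than chance on this data...
linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)

# ...while a degree-2 polynomial kernel separates it almost perfectly,
# because the squared features make the inner ring linearly separable.
poly_acc = SVC(kernel="poly", degree=2, coef0=1).fit(X, y).score(X, y)

print(f"linear: {linear_acc:.2f}, poly: {poly_acc:.2f}")
```

The polynomial kernel succeeds here precisely because the separating surface in the expanded space corresponds to a circle in the original one.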
In supervised learning, a labeled dataset, in which each training sample is paired with the correct output, allows a model to learn to generate predictions from inputs. The model learns by comparing its predictions with the true outputs and adjusting its calculations based on those comparisons.
Data preparation for a support vector machine model is as follows:
Labeled Data: A dataset in which each record contains features (inputs) and a target (output) serves as the starting point. The features can be anything from numerical values to categorical data, and the target is the expected result for every record.
Splitting Data into Training and Testing Sets: A training set and a testing set are created from the full dataset. The training set is used to fit the model. The held-out testing set lets you assess the model's performance on new data after training. This separation is vital for evaluating how well the model generalizes.
Disjoint Sets: The training and testing sets must not overlap in any way. This prevents the model from succeeding on the test set through memorization alone; instead, it is evaluated on its capacity to generalize from the training data.
Numeric Data for SVMs: Support vector machines calculate distances between data points in feature space, so they require numerical inputs. Categorical data can be transformed into numerical form with one-hot encoding or label encoding before being fed to an SVM.
Scaling Numeric Data: SVMs are scale-sensitive because features on larger scales can have an outsized impact on distance calculations. Before training an SVM, the data is therefore typically subjected to feature scaling, such as standardization or normalization.
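The preparation steps above can be sketched as a single scikit-learn pipeline. The toy DataFrame and its column names (`qualifying_position`, `team`, `podium`) are purely hypothetical placeholders, not the document's actual F1 data:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Hypothetical labeled records; column names are illustrative only.
df = pd.DataFrame({
    "qualifying_position": [1, 4, 2, 8, 3, 6, 5, 7],
    "team": ["A", "B", "A", "C", "B", "C", "A", "B"],
    "podium": [1, 0, 1, 0, 1, 0, 0, 0],
})
X, y = df[["qualifying_position", "team"]], df["podium"]

# Disjoint training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Scale the numeric column, one-hot encode the categorical one,
# then fit the SVM -- all in one pipeline so the same preprocessing
# learned on the training set is applied to the test set.
prep = ColumnTransformer([
    ("num", StandardScaler(), ["qualifying_position"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["team"]),
])
model = Pipeline([("prep", prep), ("svm", SVC(kernel="linear", C=0.1))])
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```

Fitting the scaler and encoder inside the pipeline ensures the test set never influences the preprocessing statistics, preserving the disjointness described above.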
The linear kernel:
C = 0.1: Accuracy = 96.5217%
C = 10: Accuracy = 96.5522%
C = 100: Accuracy = 96.5522%
The RBF kernel:
C = 0.1: Accuracy = 94.7826%
C = 10: Accuracy = 94.7826%
C = 100: Accuracy = 94.7826%
The polynomial kernel:
C = 0.1: Accuracy = 94.7826%
C = 10: Accuracy = 94.7826%
C = 100: Accuracy = 94.7826%
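A comparison of this shape can be reproduced with a simple loop over kernels and 'C' values. Since the original F1 dataset is not included here, the sketch below substitutes a synthetic classification dataset; the accuracies it prints will not match the figures above:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the (unavailable) F1 dataset.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Evaluate each kernel at each regularization setting.
for kernel in ("linear", "rbf", "poly"):
    for C in (0.1, 10, 100):
        acc = SVC(kernel=kernel, C=C).fit(X_tr, y_tr).score(X_te, y_te)
        print(f"{kernel:7s} C = {C:<5}: Accuracy = {acc:.4%}")
```

Holding the train/test split fixed across all nine runs is what makes the accuracies directly comparable.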
Measured to four decimal places, the linear kernel yielded the best results across all regularization settings. In Formula One, where every second counts, even a small performance advantage can mean the difference between second and first place.
The RBF and polynomial kernels achieved identical accuracy for every 'C' value tested. This suggests either that both kernels capture the dataset's features equally well, or that the model had sufficient capacity to represent the dataset's complexity regardless of the 'C' range evaluated.
Choosing between models for predicting the outcomes of Formula One races should not rest on accuracy alone. Other considerations include computational efficiency, the model's interpretability, and the cost of inaccurate forecasts for the decisions taken by race strategists and team management.
If all other variables remain constant, the linear model's simplicity and marginally better performance may make it the best option. The RBF or polynomial kernels may nevertheless be better suited if the dataset's properties evolve or new evidence reveals non-linear correlations.
Given the high stakes of real-world applications like Formula One, it is often worthwhile to pursue the best-performing model, even if it improves predictive accuracy only slightly.
Linear Kernel (C = 0.1):
Fewest false negatives: The low number of false negatives indicates the model's ability to accurately forecast the positive class (such as a victory or a podium finish).
One false positive: Wrongly predicting a victory or high finish that does not materialize could lead to poor strategy choices.
A greater 'C' results in a somewhat higher error rate: when a model grows too complex, it risks overfitting and failing to discover patterns that hold outside its training data.
RBF Kernel:
No false positives across all `C` values: This suggests a cautious prediction model that rarely indicates a win or high finish unless it's very likely.
Some false negatives: Could end up being too cautious and missing out on some possible victories or high finishes.
Polynomial Kernel:
The outcomes are the same as those of the RBF kernel: the same cautious outlook in its forecasts, which could undermine bold race strategies.
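The false-positive and false-negative counts discussed above come from a confusion matrix. A minimal sketch of how to extract them with scikit-learn, again using a synthetic stand-in dataset rather than the actual F1 data:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary-classification data standing in for the F1 dataset.
X, y = make_classification(n_samples=300, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

pred = SVC(kernel="linear", C=0.1).fit(X_tr, y_tr).predict(X_te)

# Rows are true labels, columns are predictions:
# [[TN, FP],
#  [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
print(f"false positives: {fp}, false negatives: {fn}")
```

Reading 'FP' as a wrongly predicted win and 'FN' as a missed one maps the matrix directly onto the strategic costs described in the text.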
Linear Kernel (C = 0.1):
Peak accuracy at the lowest complexity indicates that the data does not require intricate models to accurately forecast race outcomes.
RBF and Polynomial Kernels:
Accuracy remaining constant regardless of the value of 'C' suggests that adding complexity to the model does not improve the accuracy of race-outcome predictions.
Linear Kernel (C = 0.1):
Because of its slight advantage in accuracy, this model is best suited to predicting outcomes such as which driver or team could win a race or finish strongly. Its simplicity also allows faster computation, which benefits decisions made in real time during a race weekend.
RBF and Polynomial Kernels:
They may not be required for this dataset at the moment based on their performance, but they could be useful for future data or other F1 prediction tasks, such as non-linear outcome prediction or more granular prediction.
Linear Kernel (C = 0.1): Because false negatives are costly in Formula One (for example, missing a strategic call that could have improved the finishing position), a model with fewer of them is valuable. By reliably indicating when a driver or team is well positioned to succeed, this model helps guide strategy.
Though the RBF and polynomial kernels generate no false positives, their conservative character may be overly cautious, causing teams to underuse their resources or overlook strategic opportunities.
By analyzing the Formula One dataset using Support Vector Machines (SVM) with various kernels and regularization parameters, we discovered that less complex models can be remarkably efficient for some sorts of predictions. The linear kernel, specifically with a low regularization value (C=0.1), fared better than more intricate models, suggesting that the data may have some degree of linear separability. This implies that when it comes to forecasting race results, such as determining the winners or podium finishes, a simple and direct method may often be adequate.
The absence of false positives in the RBF and polynomial kernels indicates a cautious model that may result in missed opportunities if used for strategic decision-making in F1. This underlines the need to select a model that balances precision with the capacity to seize potential advantages.
In summary, this demonstration in predictive modeling for Formula One highlights the importance of matching model complexity to the available data. The results support the idea that more intricate or non-linear models should only be used if they offer a distinct benefit over simpler options. In the dynamic realm of Formula One, where plans are devised and modified instantaneously, the insight from a dependable and uncomplicated prediction model can be a valuable advantage.