The linear SVM draws a straight decision boundary, attempting to divide the two classes (low and high anxiety) with a single hyperplane. The boundary separates the data poorly: the classes are not linearly separable here, the points are widely scattered, and many fall on the wrong side of the boundary. While the linear kernel works well for simpler datasets with clear class separation, it struggles with complex datasets like this one, where anxiety levels are influenced by multiple, interrelated factors. Based on the visualization, this model is not effective at separating the data, and its performance is unlikely to be optimal in this scenario.
The polynomial SVM kernel introduces a higher-degree curve that allows the decision boundary to flex and adapt to non-linear relationships in the data. The resulting curved boundary appears to improve the separation between the classes, but the improvement is limited: many points still sit near the boundary, suggesting the model is better than the linear SVM yet not sufficient to separate the data cleanly. Nonetheless, this kernel may handle the more complex, non-linear relationships in the dataset better than the linear SVM.
The RBF (Radial Basis Function) kernel provides the most promising boundary among the three, as it is designed to handle complex, non-linear decision surfaces. The RBF kernel creates a more flexible boundary that adapts to the non-linear patterns within the data, offering a clearer visual separation between the two anxiety severity classes. This model clusters the data points into more distinct regions, with fewer visible misclassifications than the linear and polynomial SVMs. Its ability to handle non-linearity makes it the cleanest separator among the three in the visualizations, although, as the accuracy figures below show, this visual advantage did not translate into higher classification accuracy.
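The three-kernel comparison above can be sketched as follows. The anxiety dataset is not reproduced here, so this is a minimal illustration using scikit-learn's `make_moons` as a stand-in non-linearly separable dataset; kernel names and the scaling step are standard `SVC` usage, not the report's exact pipeline.

```python
# Sketch: comparing linear, polynomial, and RBF SVM kernels.
# make_moons is a synthetic stand-in for the anxiety features.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

accuracies = {}
for kernel in ("linear", "poly", "rbf"):
    # Scaling matters: the polynomial and RBF kernels are sensitive
    # to feature magnitudes.
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel, degree=3))
    model.fit(X_train, y_train)
    accuracies[kernel] = model.score(X_test, y_test)

print(accuracies)
```

On this toy data the RBF kernel comfortably beats the linear one, which mirrors the visual comparison above; on the actual anxiety features the gap was far smaller.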
The results from the SVM models are suboptimal, primarily due to the inherent complexity of the anxiety dataset and the nature of the features used. Despite extensive preprocessing and transformation of the data, the anxiety severity levels may be influenced by intricate, non-linear patterns that are difficult to capture with standard machine learning models like SVM. The dataset includes factors such as lifestyle, demographics, and physiological traits that do not exhibit a clear margin between classes, even after the kernel transformations that SVMs rely on. Furthermore, while the RBF kernel produced the most flexible decision boundary, it still struggled to separate the classes, indicating that the underlying data complexity might require more advanced techniques, such as ensemble learning or deep learning models. These challenges reflect the limitations of the current dataset and feature set rather than issues with model selection or implementation.
Linear Kernel achieved the highest accuracy at 34.1%.
RBF Kernel followed closely with an accuracy of 32.9%.
Polynomial Kernel performed the lowest among the three, with an accuracy of 31.1%.
All three kernels show accuracies close to random guessing, indicating difficulty in properly classifying the data.
The features may not be well-separated for SVMs to effectively draw decision boundaries.
Linear and non-linear transformations both struggled, suggesting limitations in feature representation.
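One way to verify the "close to random guessing" observation is to compare SVM accuracy against a dummy baseline on the same data. The sketch below is illustrative only: the real features are unavailable, so random (uninformative) features stand in, and three classes are assumed purely because the reported ~31–34% accuracies sit near chance for a three-way problem.

```python
# Sketch: sanity-checking SVM accuracy against a chance-level baseline.
# Random features simulate a feature set with no class signal.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 10))       # uninformative features (assumption)
y = rng.integers(0, 3, size=600)     # three severity classes (assumption)

svm_acc = cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean()
baseline_acc = cross_val_score(
    DummyClassifier(strategy="most_frequent"), X, y, cv=5).mean()

# An SVM that cannot beat the dummy baseline indicates the features
# carry little separable information about the target.
print(f"SVM: {svm_acc:.3f}  baseline: {baseline_acc:.3f}")
```

When a tuned kernel hovers near the dummy baseline, as in the results above, the bottleneck is the feature representation rather than the choice of kernel.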
The SVM results provide important insights into our anxiety severity classification task. Across all three kernels (linear, polynomial, and RBF), the model achieved relatively low accuracies of roughly 30–34%, only slightly better than random guessing. This suggests that the features extracted from the anxiety dataset, even after preprocessing, are not strongly separable by SVM decision boundaries. Since SVMs are powerful for both linear and non-linear classification when there is a clear margin between classes, the poor performance implies that anxiety severity levels are highly overlapping or influenced by complex patterns not easily captured by basic SVM kernels. Moreover, the similarity in performance across kernels indicates that simply introducing non-linearity (through polynomial or RBF kernels) does not resolve the underlying feature limitations. These findings suggest that anxiety prediction may require models capable of capturing deeper feature interactions or hidden structures, such as ensemble methods (Random Forest, XGBoost) or deep learning approaches, combined with more sophisticated feature engineering. In short, while SVM is a strong algorithm in theory, our current features are not sufficient for it to separate low and high anxiety severity effectively.
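The ensemble direction suggested above could be prototyped as follows. This is a hedged sketch, not the project's actual next step: `make_classification` stands in for the anxiety feature matrix, and a random forest is used instead of XGBoost to keep the example dependency-free.

```python
# Sketch: a random-forest baseline for the same kind of task.
# make_classification is a synthetic stand-in for the anxiety features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=12,
                           n_informative=6, n_classes=3,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
print(f"Random forest accuracy: {forest.score(X_test, y_test):.3f}")

# Feature importances can guide the feature-engineering step:
# consistently low-importance features are candidates to drop or rework.
print(forest.feature_importances_.round(3))
```

Beyond raw accuracy, the importance scores give a concrete starting point for the "more sophisticated feature engineering" the conclusion calls for.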
Conclusion