Spaceship Titanic Kaggle Challenge:
In this project, we tackled the Spaceship Titanic Kaggle competition, competing against thousands of participants in the UK. Our goal was to predict which passengers were transported to an alternate dimension, using machine learning techniques.
We implemented and compared multiple models, including Logistic Regression, XGBoost, and LightGBM, to find the one with the best prediction performance. A key focus was hyperparameter tuning, paired with Explainable AI (XAI) techniques that made the models' predictions easier to interpret and helped guide improvements in accuracy.
Through systematic experimentation and rigorous validation, our optimized LightGBM model achieved an overall accuracy of 80.05%, placing us competitively on the leaderboard.
This project highlights the effective application of ensemble methods, hyperparameter optimization, and interpretability techniques to solve complex real-world classification problems.
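The tuning-and-validation loop described above can be sketched as a grid search scored by k-fold cross-validation. This is a minimal, dependency-free illustration rather than the project's actual pipeline: the helper names (`k_fold_indices`, `cross_val_accuracy`, `grid_search`), the toy 1-D data, and the small k-nearest-neighbours stand-in model are all assumptions made for the example. In practice the `fit_predict` callable would wrap a model such as LightGBM.

```python
import itertools
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(fit_predict, X, y, params, k=5, seed=0):
    """Mean held-out accuracy of fit_predict(X_train, y_train, X_test, **params)."""
    scores = []
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in fold], **params)
        correct = sum(p == y[i] for p, i in zip(preds, fold))
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)

def grid_search(fit_predict, X, y, grid, k=5):
    """Try every parameter combination in grid; return (best_params, best_score)."""
    names = sorted(grid)
    best_params, best_score = None, -1.0
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cross_val_accuracy(fit_predict, X, y, params, k)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def knn_fit_predict(X_train, y_train, X_test, n_neighbors=1):
    """Hypothetical stand-in model: majority vote of the nearest 1-D neighbours."""
    preds = []
    for x in X_test:
        order = sorted(range(len(X_train)), key=lambda i: abs(X_train[i] - x))
        votes = [y_train[i] for i in order[:n_neighbors]]
        preds.append(max(set(votes), key=votes.count))
    return preds

# Toy binary data (an assumption for illustration): label is 1 when x > 2.
X = [i / 10 for i in range(40)]
y = [1 if x > 2 else 0 for x in X]

best_params, best_score = grid_search(knn_fit_predict, X, y,
                                      {"n_neighbors": [1, 3, 5]})
print(best_params, round(best_score, 3))
```

Scoring every candidate on held-out folds, rather than on the training data, is what keeps the selected hyperparameters from simply rewarding overfitting.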
Image Segmentation:
This project focuses on image segmentation using the COCO 2017 dataset, aiming to accurately identify and delineate objects in complex images with deep learning. We developed a lightweight U-Net convolutional neural network designed for efficient and precise segmentation, capturing detailed object boundaries while keeping computational cost low. Through careful training and optimization, the model achieved strong performance across multiple object categories, demonstrating the effectiveness of modern segmentation architectures on a challenging dataset and their potential for real-world computer vision applications.
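Performance across object categories is commonly summarized with per-class intersection over union (IoU). The write-up does not say which metric was used, so the sketch below is only an illustration of one reasonable choice; the function names and the tiny 3x3 masks are assumptions made for the example.

```python
def per_class_iou(pred, target, num_classes):
    """IoU per class id for two equally sized 2-D masks of integer labels.

    Returns None for a class that appears in neither mask.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                inter[p] += 1   # pixel counted in both masks for class p
                union[p] += 1
            else:
                union[p] += 1   # predicted as p only
                union[t] += 1   # labelled as t only
    return [inter[c] / union[c] if union[c] else None
            for c in range(num_classes)]

def mean_iou(ious):
    """Average IoU over the classes that actually occur."""
    present = [v for v in ious if v is not None]
    return sum(present) / len(present) if present else 0.0

# Toy masks (hypothetical): 0 = background, 1 and 2 = two object classes.
pred = [[0, 0, 1],
        [0, 1, 1],
        [2, 2, 1]]
target = [[0, 0, 1],
          [0, 0, 1],
          [2, 1, 1]]

ious = per_class_iou(pred, target, num_classes=3)
print(ious, mean_iou(ious))
```

Reporting IoU per class, not just the mean, shows whether a model does well on frequent categories while missing rare ones.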
Personal Medical Cost Prediction Using Quantile Regression and Machine Learning Techniques:
This project centers on predicting individual medical insurance costs using the Medical Cost Personal Dataset, which comprises 1,338 records with six features (age, sex, BMI, number of children, smoking status, and region) and the insurance charges as the prediction target.
To model the distribution of insurance charges more robustly, quantile regression was employed alongside ensemble methods including Random Forest Regression, XGBoost, CatBoost, and LightGBM. The models were evaluated using three metrics: R², RMSE, and MAE, which together show how much of the variance in charges each model explains and how large its errors are in cost terms.
This multi-model approach leveraged the strengths of each algorithm to accurately capture complexity and distributional nuances in insurance cost data, offering a comprehensive and interpretable framework for healthcare cost prediction.
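The evaluation metrics and the loss behind quantile regression can be written out in a few lines. The sketch below is a minimal, standard-library illustration, not the project's code: the function names and the four example charges are assumptions made for the example. The pinball loss penalizes under-prediction by a factor q and over-prediction by 1 - q, which is why minimizing it targets the q-th quantile rather than the mean.

```python
import math

def pinball_loss(y_true, y_pred, q):
    """Average quantile (pinball) loss for a quantile level q in (0, 1)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        # Under-prediction (diff >= 0) costs q * diff;
        # over-prediction costs (1 - q) * |diff|.
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAE for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        # R^2 is undefined when the targets have zero variance.
        "r2": 1 - ss_res / ss_tot if ss_tot else float("nan"),
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
    }

# Hypothetical charges and predictions, for illustration only.
y_true = [1000.0, 2000.0, 3000.0, 4000.0]
y_pred = [1100.0, 1900.0, 3200.0, 3800.0]

metrics = regression_metrics(y_true, y_pred)
loss_q90 = pinball_loss(y_true, y_pred, q=0.9)
loss_q50 = pinball_loss(y_true, y_pred, q=0.5)
print(metrics, loss_q90, loss_q50)
```

At q = 0.5 the pinball loss is exactly half the MAE, so the median model and the MAE metric measure the same thing; higher quantiles such as 0.9 are useful when underestimating a cost is worse than overestimating it.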