Explored the Pima Indians Diabetes Dataset focusing on:
Glucose level, BMI, Age patterns
Correlation analysis, distribution plots
Tools: Python (Pandas, Matplotlib, Seaborn)
Outcome: Identified key health factors influencing diabetes probability using visual analytics.
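The correlation step can be sketched roughly as below. This is a minimal sketch on a tiny invented sample; the column names (`Glucose`, `BMI`, `Age`, `Outcome`) are assumed to match the commonly distributed CSV of the dataset, and the helper `correlation_with_outcome` is hypothetical.

```python
import pandas as pd

# Tiny illustrative sample; the values are invented, not taken from the real dataset.
df = pd.DataFrame({
    "Glucose": [148, 85, 183, 89, 137, 116, 78, 197],
    "BMI": [33.6, 26.6, 23.3, 28.1, 43.1, 25.6, 31.0, 30.5],
    "Age": [50, 31, 32, 21, 33, 30, 26, 53],
    "Outcome": [1, 0, 1, 0, 1, 0, 0, 1],
})


def correlation_with_outcome(frame: pd.DataFrame, target: str = "Outcome") -> pd.Series:
    """Pearson correlation of each feature with the target, strongest first."""
    corr = frame.corr()[target].drop(target)
    # Sort by absolute value so strong negative correlations also rank high.
    return corr.sort_values(key=abs, ascending=False)


ranked = correlation_with_outcome(df)
print(ranked)
```

Passing `frame.corr()` to `seaborn.heatmap` would give the usual correlation heatmap.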
Developed a predictive ML model to understand car ownership behavior.
Features: Income, Age, Location
Techniques: Regression Models, Feature Engineering, Scaling
Tools: Scikit-learn, NumPy
Outcome: Predicted car ownership likelihood based on demographic and socioeconomic data.
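A pipeline of this kind might look like the sketch below. It assumes logistic regression as the regression model and uses synthetic data with hypothetical column names (`income`, `age`, `location`); the project's actual model and data may differ.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, n),
    "age": rng.integers(18, 70, n),
    "location": rng.choice(["urban", "suburban", "rural"], n),
})
# Invented labelling rule: higher income makes ownership more likely.
owns_car = (data["income"] + rng.normal(0, 10_000, n) > 50_000).astype(int)

# Scale the numeric features and one-hot encode the categorical one.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["income", "age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["location"]),
])
model = Pipeline([("prep", preprocess), ("clf", LogisticRegression())])
model.fit(data, owns_car)

# Estimated ownership probability for one hypothetical person.
new_person = pd.DataFrame({"income": [80_000], "age": [40], "location": ["urban"]})
probability = model.predict_proba(new_person)[0, 1]
print(f"Estimated ownership probability: {probability:.2f}")
```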
Performed complete data preprocessing and statistical analysis.
Removed duplicates, handled missing values
Applied summary statistics, outlier detection
Tools: Pandas, Matplotlib, Excel
Outcome: Improved dataset quality and generated practical business insights.
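The cleaning steps above can be sketched as follows. The data frame, its column names, and the helpers `clean` and `iqr_outliers` are hypothetical; median imputation and the 1.5 × IQR rule are assumed here as one common choice for missing values and outlier detection.

```python
import numpy as np
import pandas as pd

# Invented sample with one duplicate row, one missing value, and one extreme value.
raw = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4, 5, 6],
    "amount": [120.0, 95.0, 95.0, np.nan, 110.0, 105.0, 5000.0],
})


def clean(frame: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows and fill missing amounts with the median."""
    out = frame.drop_duplicates().copy()
    out["amount"] = out["amount"].fillna(out["amount"].median())
    return out


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Boolean mask of values lying more than k * IQR outside the quartiles."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


cleaned = clean(raw)
outlier_mask = iqr_outliers(cleaned["amount"])
print(cleaned.describe())          # summary statistics
print(cleaned[outlier_mask])       # rows flagged as outliers
```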
Analyzed synthetic Uber ride data:
Peak hour detection
Zone-wise demand analysis
Visuals: Heatmaps, Bar charts, Pivot tables
Key Insights: Highest demand between 8–10 AM & 6–8 PM; Busiest areas: Gulshan, Dhanmondi, Mirpur.
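Peak-hour and zone-wise demand can be derived with a group count and a cross-tabulation, as in this minimal sketch. The ride records and the column names (`pickup_time`, `zone`) are invented for illustration; only the zone names come from the analysis above.

```python
import pandas as pd

# Invented ride records; the real analysis used a larger synthetic dataset.
rides = pd.DataFrame({
    "pickup_time": pd.to_datetime([
        "2024-01-01 08:15", "2024-01-01 08:40", "2024-01-01 09:05",
        "2024-01-01 13:30", "2024-01-01 18:10", "2024-01-01 18:45",
        "2024-01-01 19:20", "2024-01-01 08:55",
    ]),
    "zone": ["Gulshan", "Dhanmondi", "Gulshan", "Mirpur",
             "Gulshan", "Mirpur", "Dhanmondi", "Gulshan"],
})
rides["hour"] = rides["pickup_time"].dt.hour

# Rides per hour of day; the hour with the largest count is the peak.
rides_per_hour = rides["hour"].value_counts()
peak_hour = rides_per_hour.idxmax()

# Zone x hour table of ride counts, the usual input for a demand heatmap.
demand = pd.crosstab(rides["zone"], rides["hour"])
print(peak_hour)
print(demand)
```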
Implemented a Support Vector Machine (SVM) model for classification.
Dataset: Public binary-class dataset
Applied RBF kernel, scaling, parameter tuning
Tools: Scikit-learn, Matplotlib
Outcome: Achieved stronger accuracy after kernel & hyperparameter optimization.
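A tuning loop of this shape could look like the sketch below. The project's dataset is not named, so scikit-learn's bundled breast cancer dataset stands in as an assumed binary-class example, and the parameter grid is illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in binary dataset (assumption; not necessarily the one used in the project).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training part.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.1],
}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```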
Worked on various regression approaches:
Linear Regression
Polynomial Regression
Visualizations: Scatter plots, Trend lines
Outcome: Showed how scaling & model complexity affect prediction accuracy.
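The effect of model complexity can be illustrated by fitting both models to the same curved data. This is a sketch on synthetic data generated from an assumed quadratic relationship, not the project's own data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic quadratic data with a little noise.
rng = np.random.default_rng(42)
x = np.linspace(-3, 3, 60).reshape(-1, 1)
y = 0.5 * x.ravel() ** 2 + x.ravel() + rng.normal(0, 0.3, 60)

# Straight-line fit.
linear = LinearRegression().fit(x, y)

# Degree-2 polynomial fit; the expanded features are scaled before regression.
poly = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    StandardScaler(),
    LinearRegression(),
).fit(x, y)

r2_linear = r2_score(y, linear.predict(x))
r2_poly = r2_score(y, poly.predict(x))
print(f"linear R^2 = {r2_linear:.3f}, polynomial R^2 = {r2_poly:.3f}")
```

These are in-sample scores; a held-out split would be needed to see overfitting at higher degrees.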
Focused on solving class imbalance issues.
Oversampling (SMOTE)
Undersampling
Model performance comparison
Outcome: Improved classification results with balanced datasets.
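The two resampling directions can be sketched with scikit-learn alone. Note the substitution: plain random oversampling stands in for SMOTE here, since SMOTE lives in the separate imbalanced-learn package and synthesizes new points rather than repeating existing ones. The helper `rebalance` is hypothetical and assumes class 1 is the minority.

```python
import numpy as np
from sklearn.utils import resample

# Synthetic imbalanced data: 90 majority (0) and 10 minority (1) samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.array([0] * 90 + [1] * 10)


def rebalance(X, y, strategy="over", random_state=0):
    """Return a class-balanced copy of (X, y); assumes label 1 is the minority."""
    X_min, X_maj = X[y == 1], X[y == 0]
    if strategy == "over":
        # Random oversampling: repeat minority rows until the classes match.
        X_min = resample(X_min, replace=True, n_samples=len(X_maj),
                         random_state=random_state)
    else:
        # Random undersampling: keep only as many majority rows as minority rows.
        X_maj = resample(X_maj, replace=False, n_samples=len(X_min),
                         random_state=random_state)
    X_bal = np.vstack([X_maj, X_min])
    y_bal = np.array([0] * len(X_maj) + [1] * len(X_min))
    return X_bal, y_bal


X_over, y_over = rebalance(X, y, "over")
X_under, y_under = rebalance(X, y, "under")
print(np.bincount(y_over), np.bincount(y_under))
```

Resampling should be applied to the training split only, so the test set keeps the original class ratio.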
Developed a machine learning model to classify email and SMS messages as spam or not spam.
Text cleaning, TF-IDF vectorization
Algorithms: Naive Bayes / SVM
Outcome: Achieved high accuracy with optimized text preprocessing.
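A TF-IDF plus Naive Bayes classifier can be assembled in a few lines, as in this minimal sketch. The six messages are invented, and the text cleaning is assumed to be limited to lowercasing and English stop-word removal.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy corpus: 1 = spam, 0 = not spam.
messages = [
    "Win a free prize now, click here",
    "Free cash offer, claim your prize",
    "Urgent: claim your free reward now",
    "Are we still meeting for lunch today?",
    "Please send me the report by Monday",
    "Can you call me after the meeting?",
]
labels = [1, 1, 1, 0, 0, 0]

# The vectorizer lowercases text, drops English stop words, and weights terms by TF-IDF.
spam_filter = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    MultinomialNB(),
)
spam_filter.fit(messages, labels)

prediction = spam_filter.predict(["Claim your free prize"])[0]
print("spam" if prediction == 1 else "not spam")
```

Swapping `MultinomialNB()` for `sklearn.svm.LinearSVC()` gives the SVM variant with the same vectorizer.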
Analyzed sales trends and created forecasting models.
Monthly, yearly pattern analysis
Top/Bottom profitable products
Used ARIMA / Time-series forecasting techniques
Applied business metrics
Conducted descriptive & inferential statistics
Outcome: Produced clean datasets ready for decision-making.