Working in a fully Agile environment has made me familiar with every stage of the SDLC and the data pipeline. I can work at any point in the pipeline, from data loading, cleaning and wrangling through statistical analysis (A/B testing) and visualization to deployment. I am skilled in data manipulation using Python libraries such as pandas and NumPy, R libraries such as dplyr and tidyr, SQL (DDL, DCL and DML operations) and Excel. Visualization is vital for conveying patterns in the data: I produce visualizations in Python using matplotlib, plotly and seaborn, in R using ggplot2, and build dashboards in Microsoft Power BI. Once exploratory data analysis is complete, I build models that learn patterns from the data and predict outcomes for incoming data. I build machine learning models in both R and Python, including classification and regression models using scikit-learn (sklearn) in Python and caret and party in R, and deep learning solutions (neural networks) using TensorFlow and Keras. I have built machine learning models in RStudio, Jupyter Notebook and Google Colab, deployed them using Streamlit and maintained them in production.
Data Scientist
December 2022 – Present
Bank of Ireland, Dublin, Ireland
Forecasted stock markets by analyzing real-time market trends using OpenAI and Bing Search APIs.
Leveraged GenAI for news summarization, sentiment extraction and directionality predictions for company stocks.
Developed a Streamlit application integrating sentiment analysis and trend forecasting.
Led the market basket analysis project to extract customer journeys, informing the bank’s 2024 retail strategy.
Developed a recommender system to identify the next best product in a customer’s journey, uplifting sales by 16%.
Designed propensity models to identify high-potential customers, boosting product uptake and revenue by 12%.
Implemented Azure ML and AI Services to develop a model, deployed as an API, reducing manual effort by 14%.
Addressed concept drift and refined ML models using out-of-time testing and PSI (population stability index).
Engineered a Profile Tool to calculate information values, cutting feature selection time by 20%.
Orchestrated GDPR-compliant DSAR processes, ensuring data privacy and regulatory adherence.
Partnered with marketing squads to enhance social media engagement strategies using Pega analytics.
Data Scientist (Systems Engineer)
October 2016 – June 2021
Tata Consultancy Services, Kolkata, India
Applied ANOVA, logistic regression and A/B testing in Python to develop models, curtailing fraud leakage by 17%.
Led digital transformation of rule-based legacy systems to machine learning models, increasing ROI by 12%.
Identified potential high-risk clients by clustering claims with k-means, reducing probability of default (PD) by 3%.
Automated the data collection and cleaning pipeline from 5+ sources, reducing ETL latency from 3 hours to 1 hour.
Project Intern
January 2016 – April 2016
I interned on the project "Social Inclusion through Digital Inclusion" under Dr. Somprakash Banerjee at the Indian Institute of Management, Calcutta. I helped implement an online learning system by involving a group of elderly teachers and training them to use online learning platforms.
National University of Ireland, Galway
First Class Honors (1.1), Master's in Computer Science (Data Analytics)
Techno India Salt Lake, West Bengal University of Technology
Completed Bachelor of Technology (B.Tech) in Electronics and Communication Engineering with a DGPA of 8.11 out of 10.
Sri Aurobindo Institute of Education
Completed ISC (Class 12) with a score of 90.25%
Sri Aurobindo Institute of Education
Completed ICSE (Class 10) with a score of 92.2%