Experience

Ascena Retail, USA                                                                  

Sr. DATA SCIENTIST (2022 - Present)

Skills: Python, Classifications, Clustering, Vision, Marketing

· Developing a computer-vision-based outfit-matching algorithm for Loft inventory.

· Working on XGBoost-based Churn models, CLTV prediction models, and Next Transaction Prediction Models for all brands.

7-Eleven, USA                                                                  

Sr. DATA SCIENTIST (2022 - 2022)

Skills: PySpark, Classifications, Clustering, NLP, BERT, Databricks, NoSQL (Cosmos DB, Pinos)

· Predicted a three-level grouping (PSA, category, sub-category) of 7-Now and 7-Rewards item data using BERT-based NLP similarity techniques and K-means clustering.

· Analyzed and improved the 7-Eleven item data sets using PySpark and Spark SQL.

· Developed an NLP-based model to find the top similar products in a search system.

· Developed a content-based item similarity prediction system using BERT-based NLP techniques.

· Developed an A/B testing validation-metrics script for personalization models, comparing ALS, FP-Growth, and BERT-based similar-item prediction systems.

· Worked on a computer-vision-based object detection project for 7-Eleven shelf data.

AT&T, USA                                                                  

DATA SCIENTIST (2020 - 2021)

Skills: Python, Hadoop (Hive), Teradata, SQL, classification techniques, ensemble methods, Text analysis 

· Worked end to end on a model to predict necessary dispatches among customer-reported tickets that had been marked as not necessary based on MLT testing.

· Extracted data from the data lake and created features using SQL, Hive, and Teradata.

· Developed and validated classification models on different types of data using ML techniques including SVM, Linear Logistic Regression, Random Forest, Gradient Boosting, and Decision Tree.

· Achieved good accuracy by applying ensemble methods and imbalanced-data handling techniques such as Balanced Random Forest, Balanced SVM, and undersampling.

· Validated data sources, improved data quality, and analyzed data to find trends and patterns.

· Worked on an NLP project to predict necessary dispatches from comment text data.


MANTECH, MD, USA (UNITED HEALTH SERVICES, MEDICARE & MEDICAID)                                                                              

DATA SCIENTIST (2019 - 2020)

Skills: R, Spark (R, Scala), H2O, SystemML, DML, PhotonML, Survival Analysis (Cox Model, AFT), AWS (EC2, EMR, S3 data lake)

· Involved in rewriting a legacy R Cox proportional hazards model in a distributed environment to predict the hazard rate of facility providers for the ESRD system and find the standard transfusion rates (STRR) of providers.

· As part of this work, developed the Cox proportional hazards model in Spark using Apache SystemML with Scala and DML.

· Developed the same project in H2O with sparklyr. Introduced artificial weights to a model that does not accept weights as input, and developed a predict function from scratch to bring offsets, weights, etc. into the prediction system.

· Analyzed and compared the results of the legacy R model with those of the new distributed models.

· Worked on data extraction and modification using sparklyr and H2O.

· Involved in EC2 and EMR setup and Spark base setup for distributed models using H2O, PhotonML, reticulate, etc.

· Rewrote the Standard Hospitalization Rate (SHR) model for providers based on the Cox model, and handled large data volumes using techniques such as PCA and improved EMR configuration.

· Involved in rewriting a SAS GEE model in R for the VAT model, and in rewriting the R GLMER Standardized Re-admission Rate (SRR) model in a distributed environment using the PhotonML package.

United Installs, KY           

MANAGER (2019 - 2019)

Skills: Project Management, Agile, Scrum, Jira, Home Improvement

· Managed an on-demand home improvement business built on an eCommerce-style website, a customer app, and a provider app modeled on an Uber-style framework.

· Involved in project planning, architecture design, requirements collection, and user story design.

· Ran scrum meetings, tracked planned and unplanned work, and conducted meetings with the business.

· Mentored and guided the team.

DATA SCIENTIST, TEAM LEAD (2018 - 2019)                                                                                                                                                  

Skills: Python, SQL Server, Unix, AWS (EC2, S3, RDS, Machine Learning, Redshift), Regression, Time series, Clustering, Tableau

· Worked end to end on analytics systems, from data extraction (using APIs and SSIS) to visualization, including business requirements collection, data cleaning, filtering, and developing machine learning algorithms.

· Developed a SARMA time series order management (supply chain) system in Python to predict order counts and the goods that need to be ordered, in order to maintain a continuous workflow and decrease service delivery time.

· Developed an automated, optimized scheduling system in Python using geospatial HDBSCAN clustering to increase installer utilization and decrease service delivery time.

· Led an application team of 6 members. Involved in hiring employees and consulting services, collecting requirements, documenting the project, building the application workflow, guiding the team, and tracking resource utilization.

·       Worked on AWS EC2, S3, Redshift, cloud RDS database.

· Developed Tableau dashboards to visualize goods analysis, service analysis results, employee utilization, etc.

· Performed statistical analyses such as the Dickey–Fuller test, rolling averages, t-tests, and F-tests to check the scope of the project and the performance of different machine learning algorithms on our data.

· Developed a POC on user interest prediction for home improvement business data in Python.

· Involved in a POC on a floorplan prediction system that uses Decision Trees in Python to create visual building plans from customer input such as the number of floors, bedrooms, and bathrooms.

· Developed a POC on a price optimization system that uses a regression algorithm to dynamically change our home improvement service prices according to factors such as supply and demand.

· Developed a barcode detection system to track goods delivery using computer vision techniques.

· Developed a computer vision POC to recognize faces and grant access accordingly.

Cincinnati Children's Hospital, OH

VOLUNTEER DATA SCIENTIST (2018 - 2018)

Skills: Python, Linux, R (ggplot)

· Worked on understanding the mechanisms of gene transcriptional regulation by applying clustering to DNA patterns.

University of Cincinnati IT, OH

DATA ANALYST (2017 - 2018)

Skills: Python, SSIS, Azure, SQL, SPARQL, Controlled Vocabulary, Apache Solr

· Extracted, transformed, and loaded data from different systems of the UC medical campus for further analysis using SQL, SSIS, etc. Worked on data validation, filtering, cleaning, and mapping using Python.

· Developed a Solr search that implements a controlled vocabulary, and visualized the resulting data as network diagrams using Python. Involved in Azure cloud setup.

· Developed an SSIS system to load data from RDF files into an EAV database using SPARQL.

· Developed a Naive Bayes classifier to analyze the university financial system and enrollment rate based on the growth of different departments and the plans implemented. Provided analysis reports to managers showing which departments could help university growth and where money should be spent.

TATA CONSULTANCY SERVICES, India

Research Engineer (Machine Learning), Team Lead (2016 - 2017)

Skills: Python, Spark, Tableau, SQL, AWS, Association Algorithms

· Developed a user interest prediction application in Python based on content-based and collaborative filtering, using user purchase history and plausible requirements. Involved in high-priority production support.

· Worked on data visualization for retail and wholesale data analytics using Tableau.

· Developed a system to predict suitable candidates for a position and filter them for the recruiting process based on resumes available on Indeed, LinkedIn, Glassdoor, etc.

Data Analyst, Head of Fun at Work, Head of a project development team (2015 - 2016)

Skills: Python, Spark, Tableau, SQL, AngularJS, Java, AWS, ID3 Algorithm, Association Algorithms

· Worked on cleaning and filtering large amounts of data.

· Developed an application on student data using the ID3 algorithm and Hadoop frameworks (Spark and Hive) to analyze the impact of factors such as quality of education, research in the organization, faculty experience, sports, and extracurricular activities.

· Presented data patterns to clients and managers to support business decision-making.

· Received numerous awards from the client and TCS. Led a team of 10 people in a project development activity and won the Best Team award and “Best Idea of The Project”.

· Worked as a full-stack developer in the Manufacturing and Retail domains in a collaborative team, using AngularJS, SQL, and Java. Also involved in analyzing and solving high-priority production incidents.

BHEL                                                                                                                 

INTERN                  

· Developed a J2EE and JavaScript web application to calculate power distribution.