Blaise Papa
Database / Warehousing
MongoDB, Oracle, Postgres, Cloud Storage
Machine and Deep Learning
TensorFlow, Keras, AWS Stack, scikit-learn
MLOps
MLflow, DVC, Jenkins, Neptune.ai, Weights & Biases, AWS
Visualization
Matplotlib, Tableau, Domo, Seaborn, Power BI
Data Engineering
Apache Kafka, Apache Airflow, MySQL, Apache Spark
CI/CD
Travis CI
About me
A data engineer with a passion for unearthing the untapped potential in data and using it to fuel smarter, more efficient decisions. I have experience working with real-world data and machine learning algorithms at a production level, and hold a Bachelor's degree in Informatics and Computer Science. I'm also passionate about motorsport and always up for an adventure.
I aim to apply my knowledge to spearhead data-driven decisions by building sustainable pipelines that convert raw data into meaningful insights for better conclusions. Through this, I hope to change how problems facing modern-day society are approached.
Education
- 10 Academy (July 2021 - October 2021)
Machine Learning Engineer
Awarded the best visualization and quality code badges
Projects
Sales forecast
Time-series analysis
Twitter data sentiment analysis
Speech-to-text recognition model
Geospatial Modelling
- Strathmore University (2017 - 2021)
BSc. Informatics and Computer Science
Overall GPA 3.3 out of 4.0
Projects
Early detection and prevention of complications during pregnancy using machine learning and IoT
Alzheimer's disease detection from MRI scans using Deep Learning
IoT-based animal tracking for Kenyan game parks
Spearheaded the development of the African Airlines Line Maintenance pooling portal as team leader and developer. The portal aimed to reduce airlines' operating costs by 30% through member pooling groups in which airlines shared common aircraft parts.
Programmed the creation and automation of the AFRAA database housing information about member airlines, reducing access time by 60% and increasing efficiency by 40%.
Used Python to automate key business tasks, reducing response and filling times.
- Junior Data Scientist, Data Glacier (May 2021 - September 2021)
Developed scalable machine learning pipelines, improving model robustness by 10%
Prepared market analysis and visualizations for G2M Insight Investment, increasing predicted profit by 9%
Conducted time-series analysis for data-driven business decisions
Projects
Swahili Speech-to-Text
This group project aimed to build a speech-to-text system capable of learning from speech audio and transcribing the predicted text with high accuracy and robustness against background noise.
The system used an LSTM model, which is favored for speech and audio analysis. It also used connectionist temporal classification (CTC) built on a simple RNN.
Telecom Industry Review
This project used telecom data to analyze the market share of a mobile company.
It used a clustering algorithm (K-means) to group telecom customers according to their behavior and, from these groups, determine the best KPI for investment.
Pharmtec-Sales
The project used deep learning models to predict customer sales across several stores. It used a simple recurrent neural network and Facebook's Prophet algorithm.
Twitter Sentiment Analysis
As social media platforms have grown, so has the amount of data they generate. With up to 10TB of data generated every second, there is a need to analyze and understand human behavior through social media activity.
We analyzed tweets over a period of time and, using topic modelling, identified the most discussed topics over the quarantine period.