Ethani Caphace
Junior Machine Learning Engineer
Bachelor of Eng. in Electronics and Telecommunication
Dar Es Salaam Institute of Technology (2016-2020)
Email: ethancaphace@gmail.com
Dodoma, Tanzania
Python, C/C++, & Go
Object-Oriented and Functional Programming
MySQL, SQLite, & PostgreSQL
SQL programming
TensorFlow, PyTorch, Scikit-learn, Kafka, Spark, DBT, and Airflow
Machine Learning Libraries, Tools, and Algorithms
Mathematics & Statistics
Probability, Algebra, Calculus, and Statistics
Django, Flask, Superset, and Streamlit Frameworks
Building ML dashboards, APIs, and web applications
Telecommunications
Networking, RF Transmission, Telephone systems, and Mobile Communication
About me
An ambitious, team-oriented Junior Machine Learning Engineer, eager to learn and to apply Machine Learning knowledge and skills to day-to-day problems, with hands-on practice in the software development life cycle.
Deep knowledge of applying Machine Learning algorithms and libraries. Worked on several ML and Data Engineering projects focused on Natural Language Processing, Data Mining (topic modeling and sentiment analysis), A/B hypothesis testing, sales prediction, developing Python packages, and developing an end-to-end data collection pipeline using Kafka, Spark, Airflow, and the Flask web framework. Familiar with the software development life cycle.
Education
10 Academy (July 2021-present)
Data and Machine Learning Engineering
- Intensive hands-on training and experience in solving real-world/industrial problems using
Data Engineering and ML Engineering approaches, which involve:
i. Setting up a project's codebase and version control (Git, DVC, and MLflow).
ii. Performing Exploratory Data Analysis, Feature Extraction, Pre-processing, etc.
iii. Building data pipelines for ETL using Kafka and Spark, and scheduling tasks using Airflow.
iv. Developing, testing, and maintaining ML models using different algorithms.
v. Performing CI/CD using Travis CI, and comparing models using MLflow.
vi. Dockerization, dashboard presentation and visualization using Streamlit and Redash,
and deployment on different platforms, including Heroku, Streamlit, AWS, etc.
Dar Es Salaam Institute of Technology (2016-2020)
Bachelor of Engineering in Electronics and Telecommunication
- Graduated with a 4.2 GPA out of 5.
- Embedded and IoT Systems, Networking, RF Broadcasting and Transmission,
Telephone Systems, and Cellular and Mobile Communications.
Full-stack web application development using the Django framework.
Designed and developed an IoT platform using the open-source DeviceHive.
Researching and developing ML solutions for Interactive Voice Recognition and Optical Character Recognition (OCR) projects using deep neural networks.
Design and implementation of embedded and IoT systems.
Some of My Projects
A regression problem whose main aim is to deliver an end-to-end product that predicts sales across multiple stores of a pharmaceutical company. The performance of three regression models is explored: Linear Regression, XGBoost, and Random Forest. The Random Forest regressor emerges as the best performer, with a Mean Squared Error of 0.056. Streamlit is used for model deployment and visualization.
This project aimed to discover what topics are discussed on Twitter concerning the Covid-19 pandemic in Africa, and to figure out how people feel toward these topics or the Covid-19 pandemic in general.
Using telecommunication data to perform exploratory data analysis and customer overview, user engagement, experience, and satisfaction analysis.
A Python module that interfaces with USGS 3DEP, for data scientists to fetch, visualize, and transform publicly available satellite and LiDAR data.