About me
A junior data engineer with a computer science background who can build scalable data pipelines with ETL, dbt, and Apache frameworks such as Kafka and Spark. Skilled in Git version control and Bash. Able to wrangle, preprocess, and transform data into a usable form, and to perform business intelligence, reporting, and dashboard building.
Python
C++
JavaScript
PHP
Scikit-learn
Colab
PyTorch
TensorFlow
Matplotlib and Plotly
Pandas
SQL, dbt, Airflow
Kafka
Spark
Snowflake
PostgreSQL
AWS
Windows
Linux
Microsoft Excel
Matplotlib and Plotly
Microsoft Power BI
Tableau
Streamlit
GitHub Actions
Docker
MLflow, DVC, CML
Unit-testing
Education
10Academy (Aug 2022 - Nov 2022)
Machine Learning, Data Engineering, and Web3 Engineering Training
Key Courses:
Building data pipelines
Building Machine Learning Models and deployment
MLOps and CI/CD
Data Visualization and Dashboards
Community Building and Career Services
Technical Writing, Reporting and Blogging
Udacity (Jul 2022 - Sep 2022)
Data Analyst Nanodegree
Key Courses:
Introduction to Data Analysis
Data Wrangling
Data Visualization
Technical Writing and Reporting
Addis Ababa University (Oct 2017 - Oct 2022)
MSc. in Computer Science
Key Courses:
Object Oriented Software Engineering
Natural Language Processing
Advanced Information System
Computer Security
Aksum University (Nov 2011 - Jul 2014)
BSc. in Computer Science
Key Courses:
Fundamentals of Programming
Object Oriented Programming
Internet Programming
Fundamental and Advanced Database
Applied Mathematics I and II
Linear Algebra
Software Engineering
Introduction to AI
Probability and Statistics
Operating System
Formal Language
Compiler Design
Work Experience
Courses Taught Include:
Object Oriented Programming
Fundamentals of Software Engineering
Introduction to Artificial Intelligence
Database Management System
Computer Security
Microprocessor and Assembly Language
Wireless Network Communication and Mobile Computing
Role:
Delivering courses, both theory and lab classes
Advising and counseling students
Assessing students at the course level and on projects
Participating in community services
Working on research
Projects
In this project, I built a data pipeline using an ELT framework that covers every stage from data extraction to presentation. The process begins by loading the raw CSV data from the sources into a Postgres database. Next, the raw data is transformed using dbt and visualized using Redash. I used Airflow to orchestrate the tasks and schedule a daily job that syncs the data from the source to the Postgres data warehouse.
In this project, I carried out data visualization work covering every stage from data preprocessing to reporting insights. The process involves reading raw CSV data, preprocessing, creating visualizations, polishing them, and reporting findings. In total, more than 15 visualizations and five polished reports were presented. Finally, a slide deck was prepared to convey the key insights.
Pharmaceutical Sales Prediction across multiple stores
In this project, I predicted daily sales across various stores up to six weeks ahead of time. The process involves preprocessing, data exploration, and building predictive models with machine learning and deep learning techniques. I found that the random forest regression algorithm produced better predictions than the alternatives.
In this project, I analyzed the effectiveness of ads using A/B hypothesis testing. The workflow involves reading the raw CSV data, preprocessing, identifying the control and exposed groups, and presenting the results. I used both classical and sequential A/B testing to measure the effectiveness of the ads. Additionally, I built an ML-based A/B testing pipeline.