Biruk Getaneh Gezmu

Addis Ababa, Ethiopia

Email: bkgetmom@gmail.com

Phone: +251922564087

LinkedIn

Medium

About me

I am a Data Engineer and Analyst with an M.Sc. in Computer Science and expertise in SQL, data preprocessing, transformation, visualization, analytics, feature engineering, and modeling. I have experience building fault-tolerant, distributed, and scalable end-to-end data pipelines, and I am committed to delivering solutions that help organizations use their data to gain meaningful insights and achieve their strategic goals.


Programming Languages

Automation Tools

Data Engineering Tools

Data Analytics Tools 

Machine Learning Tools 

Work Experience 

Big Data Analytics Engineer

Sep 2022 - Dec 2023 | Safaricom, Ethiopia

Led large-scale data management at Safaricom, a major African telecom company, during its strategic expansion into Ethiopia. I addressed data volume challenges by designing and implementing robust ETL pipelines that extract data from the data lake and load it efficiently into the warehouse. I used Apache Airflow for workflow orchestration and Spark (Scala) for data transformations that supported deeper analysis. I also built reports and dynamic dashboards in Superset to support informed decision-making.


Data Engineer

Jan 2023 - Apr 2024 | Exxon Mobil, US

In this remote role, I worked on a customer relationship management initiative built on Salesforce Marketing Cloud and the Snowflake data warehouse. My core responsibilities encompassed data processing, analysis, automated workflow development, scheduled email sends and tracking, and keeping data processes running smoothly. I collaborated with cross-functional teams and supported data-driven decision-making while adhering to industry best practices.


Junior Data Engineer

Jun 2022 - Oct 2022 | 10 Academy, Ethiopia


Data Scientist

Oct 2021 - Jan 2022 | EthioAI, Ethiopia

I contributed to projects aimed at addressing different community issues. Specifically, I worked on the following two projects:


Lecturer

Nov 2015 - Sep 2022 | Haramaya University, Ethiopia

Education

10 Academy (May 2022 - Jul 2022)

Machine Learning, Data Engineering, and Web3 Engineering Training 

Key Courses:


Haramaya University (Oct 2019 - Oct 2021)

M.Sc. in Computer Science 

Key Courses:


Debre Berhan University (Nov 2011 - Jul 2015)

B.Sc. in Information Technology

Key Courses:

Projects

Data Warehouse Tech Stack with Postgres, dbt, and Airflow

In this project, I built an ELT pipeline covering every stage from data extraction to presentation. The pipeline loads raw CSV data from the source into a Postgres database, transforms it with dbt, and presents the output in Redash. Airflow orchestrates the workflow and schedules a daily job to sync data from the source to the Postgres data warehouse.


Speech-to-Text Data Collection 

In this group project, my team and I created a web app that receives audio recordings of a given text displayed to users in our front-end application. A previous project had shown us that the amount of data is a crucial factor in the effectiveness of deep learning models. We therefore built a data collection system that integrates three Apache tools: Kafka, Airflow, and Spark. Kafka serves as the broker, Airflow as the event listener and initiator, and Spark handles data transformation and cleaning.

Scalable Data Migration from PostgreSQL to MySQL Database

In this project, I migrated data from a PostgreSQL database to a MySQL database. The resulting data is explored, queried, and visualized using Apache Superset, while Airflow handles task scheduling and workflow management.


Advertisement Data Analysis

In this project, I used data recorded at different steps of the creative creation and ad placement process to build a data engineering pipeline and a machine learning prediction model. The data elements from the different steps were linked accordingly. After ingesting the data into a data lake, I modeled and merged it into a single unit in the data warehouse and exposed an interface for the machine learning task.