Rafaa Ahmed

Khartoum, Sudan

University of Khartoum (2010-2015)

BSc. Physics

Email: rafaesam0@gmail.com

Twitter | Medium | LinkedIn

Math & Statistics

  • Mathematical Methods

  • Statistics

  • Analytical Skills

Operating Systems

  • Windows

  • Linux

Programming Skills

  • Python

  • HTML

  • TypeScript

CI/CD Tools

  • GitHub Actions

  • CML

Data Engineering Tools

  • Apache Kafka

  • Airflow

  • DBT

  • SQL

  • AWS

Visualization & Other Tools

  • Streamlit

  • DVC

  • Docker

  • MLflow

A junior data engineer with experience building and maintaining end-to-end data pipelines using ETL/ELT frameworks. I hold a Bachelor's degree in Physics and have work experience in the e-commerce and telecom sectors. I am skilled in SQL, Python, GitHub, AWS, and Airflow, and I combine strong analytical and teamwork skills.

Education

  • University of Khartoum (2010-2015)

BSc. (Physics)

    • Theoretical Physics.

    • Mathematical Methods & Statistical Mechanics.

  • The International Centre for Theoretical Physics (2016-2017)

Postgraduate Diploma (Physics)

    • Condensed Matter & Statistical Physics.

    • Quantum Information.

  • 10 Academy Training Program (2022)

Data Engineering, ML & Web3 Technology

  • Designing and building data pipelines using ELT/ETL frameworks.

  • Machine learning modelling.

Work Experience

  • Junior Software Engineer

Awiz Group (June 2021 - October 2022)

    • Worked on developing a store management system that covers all the main aspects of the business, enabling users to manage everything (accounting, inventory/stock, HR, sales, etc.) from a single portal.

  • Internship Trainee

Zain (September 2018 - February 2019)

    • Carried out psychographic analysis of Zain users to derive a better understanding of customer behavior.

Speech-to-text data collection


The purpose of this project was to build a data engineering pipeline for recording millions of Amharic and Swahili speakers reading digital texts via in-app and web platforms. We worked as a team to produce a tool that can be deployed to post and receive text and audio files to and from a data lake, apply transformations in a distributed manner, and load the results into a warehouse in a format suitable for training a speech-to-text model. We used tools such as a Kafka cluster, Apache Airflow, and Spark.

Data Warehouse: Traffic Data


This project aimed to create a scalable data warehouse to host vehicle trajectory data extracted by analyzing footage taken by swarm drones and static roadside cameras. The warehouse had to organize the data so that a number of downstream projects could query it efficiently. We used an Extract-Load-Transform (ELT) framework built with DBT.

Tech stack used: MySQL, DBT, Airflow, and Redash