Desmond Onam
Lead Data Scientist | Expert Machine Learning Engineer and Data Engineer | GenAI | Passionate Mentor
CX Engineer - ML Data Engineering
Email: desmondonam@gmail.com
About me.
I am Desmond Onam, a passionate Generative AI Engineer and Machine Learning Data Expert, dedicated to transforming complex data challenges into actionable insights and innovative solutions. With a Bachelor’s degree in Mathematics and Computer Science from Jomo Kenyatta University of Agriculture and Technology and a diverse portfolio of certifications and projects, I excel in predictive modelling, data mining, and machine learning algorithm development.
My expertise includes handling structured, semi-structured, and unstructured data using Python, R, Spark, and SQL. As an award-winning tutor in Machine Learning, Data Engineering and Web3, I bring a unique combination of technical acumen and mentorship skills to every project.
Whether developing robust data pipelines, designing cutting-edge AI solutions, or delivering impactful training programs, I am driven by the power of data to solve real-world problems and enable business success. Let's create, innovate, and elevate together.
Work Experience
Led the architecture of AI-driven disaster management systems, implemented scalable data pipelines using cloud-native technologies and serverless computing platforms, and optimized data models for real-time analytics. Designed data governance frameworks and integrated edge computing solutions with centralized data platforms.
Key Contributions:
▪ Developed a machine learning-based disaster management system that aided in real-time analysis and visualization of data.
▪ Architected and deployed a scalable synthetic-data pipeline retrieving data from Sentinel-2 and other APIs, used to build the AI disaster management system and to improve the existing model (Leona) for specialized AI agents.
▪ Built an insight-driven dashboard for visualization with Plotly and optimized data models for real-time analytics and predictive insights on supply chain data.
▪ Collaborated in a team of 8 via GitHub to create a fully functional system, deployed in production, that helps detect disasters such as floods, fires, and earthquakes in real time through data fusion, reinforcement learning, and multimodal LLMs.
Tools Used: Python | AWS | MySQL | PostgreSQL | LLMs | Apache Spark | Airflow | MLflow | OpenAI GPT | Dialogflow | GitHub | Jira | Slack
Business Requirements Translation: Collaborated with stakeholders to translate high-level business objectives into actionable data science problems, aligning analytics solutions with customer experience goals.
Data Preparation and Transformation: Extracted, cleaned, and transformed data to address integrated customer experience challenges, optimizing datasets for analysis and reporting.
Data Architecture Development: Partnered with engineering teams to build, test, and maintain scalable data architectures, streamlining data extraction, transformation, and loading (ETL) processes.
Pipeline Optimization: Implemented strategies to improve the reliability, efficiency, and quality of data pipelines, ensuring data consistency across organizational systems.
Machine Learning Model Deployment: Deployed advanced machine learning models across the Ajua Product Stack, delivering actionable insights that enhanced decision-making for clients in diverse sectors.
Generative AI Chatbot Development: Designed and deployed a customer self-service chatbot, leveraging generative AI to enhance interactivity and engagement with the platform, streamlining customer support operations.
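The pipeline-reliability work described above typically rests on data-quality gates that validate each batch before it is loaded downstream. A minimal sketch, with illustrative field names rather than the actual Ajua schema:

```python
# Illustrative data-quality gate: split a batch into valid rows and
# per-row error reports, so bad records are flagged rather than
# silently propagated to downstream systems.
def validate_batch(rows, required_fields=("order_id", "amount")):
    """Return (valid_rows, errors) for a batch of dict records."""
    valid, errors = [], []
    for i, row in enumerate(rows):
        missing = [f for f in required_fields if row.get(f) is None]
        if missing:
            errors.append((i, missing))
        else:
            valid.append(row)
    return valid, errors

batch = [{"order_id": 1, "amount": 9.5},
         {"order_id": 2, "amount": None},   # flagged: missing amount
         {"order_id": 3, "amount": 4.0}]
valid, errors = validate_batch(batch)
print(len(valid), errors)  # 2 [(1, ['amount'])]
```

In a production pipeline the same check would run as a task (e.g. in Prefect or Airflow) and fail the run, rather than just reporting, when the error list is non-empty.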
Tools Used: Python | DBT | MySQL | PostgreSQL | AWS | Apache Spark | Prefect | MLflow | OpenAI GPT | Dialogflow | GitHub | Jira | Slack
Delivered industry-relevant training to 300+ students, focusing on data science and machine learning concepts, with a strong emphasis on employability in leading organizations.
Mentored students on 30+ real-world data science projects, offering hands-on guidance to tackle complex business problems using practical solutions.
Designed and implemented a Masters in Data Science curriculum, integrating emerging technologies and best practices, adopted internationally by College De Paris.
Conducted interactive training sessions covering courses such as:
Complete Data Science with Python & R
Machine Learning and Deep Learning with Python
Human Resource Analytics with Python & R
Marketing Analytics with R
Data Visualization with Tableau, PowerBI, Looker Studio
Boosted student engagement by 35% through innovative teaching methodologies, fostering a collaborative and immersive learning environment, and achieved the highest retention rate in the data science department.
Provided tailored feedback to students, achieving a 92% course completion rate and a 90% job placement success rate within six months post-graduation.
Partnered with industry leaders to align course materials with current trends, ensuring students gained in-demand skills and practical knowledge.
Organized industry-recognized certification programs with prominent tech companies, enhancing students' professional credibility and career opportunities.
Developed an interactive online learning platform featuring:
Video lectures
Hands-on coding exercises
Engaging interactive modules
The platform supports remote learning and provides seamless access to course content.
Directed a cross-functional team of 100+ data scientists and interns, deploying advanced deep learning models using technologies such as TensorFlow, Python, Pandas, SQL, Docker, MLFlow, AWS, pytest, and dbt, resulting in a 22% improvement in model accuracy.
Led end-to-end delivery of data science projects, from scoping and requirement analysis to design, execution, and deployment, achieving a $250,000 annual cost reduction through optimized resource utilization.
Designed and implemented advanced machine learning algorithms to streamline workflows and enhance task automation, boosting team efficiency by 30%.
Conducted hands-on training for interns on deep learning and machine learning techniques, enabling early anomaly detection during project development and improving team productivity by 35%.
Delivered actionable insights through comprehensive data analysis, including EDA and statistical modelling, driving strategic business decisions and operational improvements.
Enhanced stakeholder confidence by presenting detailed reports on project outcomes to senior management, fostering a culture of trust and collaboration.
Optimized data pipelines, reducing lead times by 40% and expediting AI solution deployment.
Applied cutting-edge machine learning techniques to achieve an 18% increase in predictive model accuracy, contributing to business success.
Mentored and upskilled 10+ junior data scientists, advancing their professional growth and strengthening team expertise.
Skills: ETL Tools · Teamwork · SQL · Git · Apache Spark · MySQL · DBT · Docker · Extract, Transform, Load (ETL) · Apache Kafka · Apache Airflow · Pipelines · Data Engineering · Python (Programming Language) · Data Analysis · ClickUp
● Equipped students with solid data ethics knowledge through comprehensive training programs on ethical issues surrounding machine learning and data science.
● Improved the performance of automated data-processing systems by analyzing, testing, and debugging code.
● Received positive accolades from the community for developing open-source data science projects that solved real-world problems.
● Delivered a data visualization course that helped students develop skills in creating compelling and informative visualizations to communicate data insights effectively.
● Trained 37 junior data engineers in building the datasets that underpin machine learning models and in designing a real-time data pipeline that accelerated semi-structured data processing.
● Instituted a project-based learning approach for machine learning and data engineering trainees, enabling them to ingest data from multiple third-party APIs on real-world projects. Supervised and graded students' projects and provided detailed individual feedback.
● Developed top-tier, tech-savvy data engineers through comprehensive, data-driven technical training that equipped them with machine learning, data engineering, and Web3 skills for job readiness.
● Liaised with data engineers to expand and optimize data and pipeline architecture; taught students how to make well-reasoned business decisions fueled by new data.
● Solved complex real-world problems by applying machine learning libraries and algorithms, using statistical modelling techniques, and writing well-structured code.
● Designed machine learning systems and self-running artificial intelligence solutions, delivering fully working, tested code and models.
● Improved data quality and insight reports using data tagger and data wrangler; developed machine learning pipelines and trained models with end-to-end Bayesian segmentation.
● Contributed to solving community challenges by developing, simulating, testing, and improving various machine-learning algorithms.
· Engineered a data pipeline that ingested data from multiple sources using Google Analytics APIs across billions of rows of data.
· Automated ETL processes, making data wrangling easier and cutting upload time and manual workload in half.
· Designed, developed and maintained scalable, insightful data tables, which acted as the primary input for analysis models, reports, and dashboards.
· Delivered superior functionality across data systems by automating pipeline tests and scheduling tasks to validate the organization's assumptions about data, writing logic to prevent issues from propagating downstream.
This is accomplished by:
Creating ETL pipelines with Python and associated technologies.
Using APIs to create complex aggregation pipelines to get data.
Curating, normalizing, and extracting value from large amounts of data.
Creating automated analysis for the data and visualizing the data.
Using the data to predict outcomes of different topics in the community.
Reporting on the data.
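The steps above follow the classic extract-transform-load pattern. A minimal end-to-end sketch, where the record fields and per-user aggregation are illustrative stand-ins for the actual Google Analytics schema and warehouse load:

```python
import json

def extract(raw_records):
    """Extract: in production this would page through an API; here it
    accepts pre-fetched JSON records (illustrative)."""
    return [json.loads(r) if isinstance(r, str) else r for r in raw_records]

def transform(records):
    """Transform: normalize field types and drop incomplete rows."""
    cleaned = []
    for rec in records:
        if rec.get("user_id") is None or rec.get("pageviews") is None:
            continue
        cleaned.append({"user_id": str(rec["user_id"]),
                        "pageviews": int(rec["pageviews"])})
    return cleaned

def load(records, table):
    """Load: aggregate pageviews per user into an in-memory 'table'
    standing in for a warehouse upsert."""
    for rec in records:
        table[rec["user_id"]] = table.get(rec["user_id"], 0) + rec["pageviews"]
    return table

table = {}
raw = [{"user_id": 1, "pageviews": 5},
       {"user_id": 2, "pageviews": None},   # dropped by transform
       {"user_id": 1, "pageviews": 3}]
load(transform(extract(raw)), table)
print(table)  # {'1': 8}
```

At scale, each stage would run as its own scheduled task (e.g. in Airflow), so failures can be retried per stage instead of re-running the whole pipeline.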
Data Science & Machine Learning:
Languages & Frameworks: Python, R, TensorFlow, PyTorch, Scikit-Learn, NLTK, spaCy
Techniques: Machine Learning, Deep Learning (CNNs, RNNs), Natural Language Processing, Computer Vision, Statistical Modeling, A/B Testing, Bayesian Analysis
Tools & Libraries: Pandas, NumPy, Seaborn, SciPy, matplotlib, Jupyter
Data Engineering:
Big Data Tools: Apache Spark, Hadoop, Kafka, Airflow
Databases: PostgreSQL, MySQL, MongoDB, BigQuery, Redshift
Pipeline & Workflow: Apache Airflow, DBT, ETL Processes, Data Collection, Web Scraping
Software Development:
Programming: JavaScript, Bash, PowerShell
Frameworks & Platforms: Django, Node.js, React Native, Flask
Development Practices: MLOps, CI/CD, Docker, Kubernetes, GitHub, Jenkins
Web Development:
Technologies: HTML, CSS, JavaScript
Frameworks: Django, React Native
Tools: Node.js
Visualization & Reporting:
Tools: Tableau, PowerBI, Streamlit, R Shiny, Apache Superset
Cloud & Operations:
Platforms: AWS, Google Cloud Platform, Azure
Operations: MLflow, HyperOpt, Travis CI
Project Management: Agile methodologies, team leadership, project scoping
Communication: Excellent teaching ability, client presentations, detailed technical documentation
Education
10 Academy (July 2021 - October 2021)
- Intensive hands-on training and experience in solving real-world, industrial problems using Data Engineering and ML Engineering approaches, which involved:
i. Setting up project codebases and version control (Git, DVC, and MLflow).
ii. Performing exploratory data analysis, feature extraction, and pre-processing.
iii. Building ETL data pipelines with Kafka and Spark, and scheduling tasks with Airflow.
iv. Developing, testing, and maintaining ML models using different algorithms.
v. Performing CI/CD with Travis CI, and comparing models using MLflow and DagsHub.
vi. Dockerization, dashboard presentation, visualization, and deployment on platforms including Heroku, Streamlit, and AWS.
Bachelor of Science in Mathematics and Computer Science | Jomo Kenyatta University of Agriculture and Technology
- Studied mathematical concepts in calculus and applied mathematics, statistics and its applications in data science, as well as software development, web development, artificial intelligence, and data analysis, including:
i. Application of mathematical concepts in programming.
ii. Software and networking concepts spanning both mathematics and computer science.
iii. Web development and design with Internet application programming.
iv. Databases and their applications.
v. Data structures and algorithms.
Data Science Micro degree | Udemy | 2022
Post Graduate Program in Machine Learning Engineering/ Data Engineering | 10 Academy | 2021
Statistics Fundamental and Its Application | Udemy | 2021
Complete Data Science Course | 365 Team | 2020
Highlights
Developed a hypothesis-testing algorithm to assess the effectiveness of SmartAd's Brand Impact Optimiser (BIO) service, quantifying the impact of ad campaigns on brand awareness. The analysis revealed a significant lift in brand engagement and memorability, demonstrating the success of SmartAd's creative advertising approach and providing measurable value to clients.
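A standard way to quantify ad-driven brand lift of this kind is a two-proportion z-test comparing awareness rates in the exposed and control groups. A minimal sketch with made-up counts (the actual SmartAd data and test design are not reproduced here):

```python
from math import sqrt, erf

def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Two-sided two-proportion z-test: is the exposed group's
    awareness rate significantly different from the control's?"""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# exposed: 308/1000 brand-aware; control: 252/1000 (illustrative counts)
z, p = two_proportion_ztest(308, 1000, 252, 1000)
print(p < 0.05)  # significant lift at the 5% level
```

The same structure extends to sequential or Bayesian variants when the campaign data arrives incrementally.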
Designed and implemented a causal inference framework using Judea Pearl's methodologies to extract actionable insights from observational data. Successfully inferred and validated causal graphs, merging machine learning with causal inference principles to address complex business questions, enhancing decision-making capabilities.
Applied time-series analysis to forecast a company's sales two weeks ahead, based on its historical performance data. Building the forecasting model, and understanding how time-series analysis works, were central to the project.
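The usual starting point for a forecast like this is a seasonal-naive baseline, which any fitted model must beat. A sketch with illustrative daily sales figures:

```python
# Seasonal-naive baseline: forecast each day of the next two weeks
# using the value from the same weekday in the last observed week.
def seasonal_naive_forecast(history, season=7, horizon=14):
    """history: daily sales, oldest first; repeats the final season."""
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]

sales = [120, 135, 128, 150, 170, 210, 190,   # week 1 (illustrative)
         118, 140, 132, 155, 175, 215, 195]   # week 2
forecast = seasonal_naive_forecast(sales)
print(forecast[:7])  # repeats the most recent week's pattern
```

A proper model (ARIMA, Prophet, or similar) is then judged by how much it reduces error relative to this baseline on a held-out window.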
Built a word cloud from Twitter data to surface the most-discussed terms, focusing on Covid-19 as an emerging issue. The approach identifies the words used most frequently on Twitter within a given area.
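Under the hood, a word cloud is rendered from a term-frequency table. A minimal sketch of that counting step, with made-up tweets and a toy stopword list:

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "to", "of", "and", "in", "is", "on", "for"}

def top_terms(tweets, n=3):
    """Tokenize tweets, drop stopwords, and count term frequencies --
    the table a word-cloud library would render by font size."""
    counts = Counter()
    for tweet in tweets:
        for tok in re.findall(r"[a-z']+", tweet.lower()):
            if tok not in STOPWORDS:
                counts[tok] += 1
    return counts.most_common(n)

tweets = ["Lockdown extended in the city",
          "Vaccine rollout starts: lockdown easing soon",
          "New lockdown rules and vaccine updates"]
print(top_terms(tweets))  # 'lockdown' and 'vaccine' dominate
```

A library such as `wordcloud` can then consume these frequencies directly to draw the final image.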
Created a package for data scientists to analyze soil data for maize production. The package processes satellite data gathered by Lidar, analyzing landscape water distribution and its effect on maize production in a given area.
Analyzed a telecommunication company's performance based on its most-used applications and its prospects for future survival, including recommendations on which sectors of the company should be improved and given greater attention.