Projects

Eyerusalem Gebreegzabiher Hadgu

Highlights

A/B testing is a user experience research methodology. An A/B test is a randomized experiment with two variants, A and B, which are identical except for one variation that might affect a user's behavior. It applies statistical hypothesis testing, also called "two-sample hypothesis testing" in the field of statistics.

A/B TESTING

Metric Choice:

  • Invariant metrics: used to check that the experiment setup (the way the change was presented to part of the population) is not inherently flawed, e.g. the number of users in each group.

  • Evaluation metrics: metrics we expect to change and that are relevant to the goals we aim to achieve, e.g. brand awareness.

Hypothesis testing for A/B testing:

  • We use hypothesis testing to test two hypotheses. Null hypothesis: there is no difference in brand awareness between the exposed and control groups. Alternative hypothesis: there is a difference in brand awareness between the exposed and control groups.
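One common way to test these two hypotheses when the outcome is a yes/no response is a two-sided, two-proportion z-test. The sketch below is illustrative only: the source does not state which test was actually used, and the function name and the counts in the usage lines are made up for the example.

```python
import math


def two_proportion_ztest(yes_a, n_a, yes_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    yes_a / n_a: "yes" responses and group size for the control group.
    yes_b / n_b: "yes" responses and group size for the exposed group.
    Returns the z statistic and the two-sided p-value.
    """
    p_a = yes_a / n_a
    p_b = yes_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (yes_a + yes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative counts only, not results from the project.
z, p_value = two_proportion_ztest(yes_a=100, n_a=1000, yes_b=150, n_b=1000)
reject_null = p_value < 0.05
```

A small p-value (below the chosen significance level, 0.05 here) would lead to rejecting the null hypothesis of no difference in brand awareness.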

Machine Learning

  • Carried out three types of classification analysis to predict whether a user responds "yes" to brand awareness: Logistic Regression, Decision Trees, and XGBoost. Then compared the classification models to identify the best-performing one(s).
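The comparison step can be sketched as scoring each model's predictions on the same held-out labels and ranking by a chosen metric. This is a minimal, dependency-free sketch: the helper names are hypothetical, the labels and predictions are invented, and the source does not say which metric was used to pick the best model.

```python
def classification_scores(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = "yes")."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def rank_models(y_true, predictions, metric="f1"):
    """Return (model name, scores) pairs sorted from best to worst."""
    scores = {name: classification_scores(y_true, y_pred)
              for name, y_pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1][metric],
                  reverse=True)


# Invented labels and predictions, for illustration only.
y_true = [1, 0, 1, 0, 1, 0]
predictions = {
    "logistic_regression": [1, 0, 1, 0, 0, 0],
    "decision_tree": [1, 1, 0, 0, 0, 0],
    "xgboost": [1, 0, 1, 0, 1, 0],
}
ranking = rank_models(y_true, predictions, metric="f1")
```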


AgriTech---USGS-LIDAR

The U.S. Geological Survey (USGS) National Geospatial Program developed the 3D Elevation Program (3DEP) to respond to the growing need for high-quality topographic data and for a wide range of other three-dimensional (3D) representations of the Nation's natural and constructed features. 3DEP informs critical decisions that are made across the world every day and that depend on elevation data, ranging from immediate safety of life, property, and environment to long-term planning of infrastructure projects.


PYTHON PACKAGE FOR USGS DATA

Pipeline:

  • A Python class fetches data from AWS using boto3, the AWS SDK for Python. Boto3 makes it easy to integrate a Python application, library, or script with AWS services including Amazon S3, Amazon EC2, and Amazon DynamoDB.

  • PDAL: used to build a custom pipeline that fetches and preprocesses lidar data from the USGS API.

  • LAS and TIF: the pipeline generates a .las file and a .tif file after execution.

  • The raster file is accessed using GDAL, a geospatial data library with Python bindings.

  • The data from the .las file was used to generate a 3D terrain of the area, while the .tif file was used to estimate the area covered.
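A PDAL pipeline of this kind is described as a JSON document listing a reader followed by writers. The sketch below only builds that JSON. It is not the project's actual pipeline: the function name, the EPT URL pattern for the public USGS bucket, the region name, the bounds, and the output paths are all assumptions for illustration.

```python
import json

# Assumed URL pattern for the public USGS 3DEP Entwine (EPT) data on AWS.
EPT_URL = ("https://s3-us-west-2.amazonaws.com/usgs-lidar-public/"
           "{region}/ept.json")


def build_pipeline(region, bounds, las_path, tif_path, resolution=1.0):
    """Build a PDAL pipeline spec that writes a .las and a .tif file.

    bounds is (xmin, ymin, xmax, ymax) in the coordinate system of the
    EPT source (hypothetical values in the usage below).
    """
    xmin, ymin, xmax, ymax = bounds
    return {
        "pipeline": [
            {
                # Read only the points that fall inside the bounding box.
                "type": "readers.ept",
                "filename": EPT_URL.format(region=region),
                "bounds": f"([{xmin}, {xmax}], [{ymin}, {ymax}])",
            },
            # Point cloud output.
            {"type": "writers.las", "filename": las_path},
            {
                # Raster (GeoTIFF) output interpolated from the points.
                "type": "writers.gdal",
                "filename": tif_path,
                "gdaldriver": "GTiff",
                "output_type": "idw",
                "resolution": resolution,
            },
        ]
    }


# Placeholder region name and bounds, for illustration only.
spec = build_pipeline("REGION_NAME", (0, 0, 100, 100),
                      "output.las", "output.tif")
pipeline_json = json.dumps(spec, indent=2)
```

With the PDAL Python bindings installed, a spec like this would typically be run with `pdal.Pipeline(pipeline_json).execute()`; that step is left out here so the sketch has no third-party dependencies.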

Applications

  • The 3D rendering showed which areas of the terrain have high elevation and which have low elevation. This is useful in determining how water flows through the terrain.
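To illustrate how elevation data relates to water flow, a simple steepest-descent rule (often called D8) sends water from each grid cell to whichever of its eight neighbours gives the steepest downhill slope. This sketch is not part of the project's code; the function name and the elevation grid are hypothetical.

```python
import math


def steepest_descent(elevation):
    """For each cell, return the (row, col) of the neighbour water flows to.

    Cells with no lower neighbour (pits or flat areas) map to None.
    """
    rows, cols = len(elevation), len(elevation[0])
    flow = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope, best_cell = 0.0, None
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        # Diagonal neighbours are farther away than
                        # horizontal or vertical ones.
                        distance = math.hypot(dr, dc)
                        drop = elevation[r][c] - elevation[nr][nc]
                        slope = drop / distance
                        if slope > best_slope:
                            best_slope, best_cell = slope, (nr, nc)
            flow[r][c] = best_cell
    return flow


# Hypothetical 3x3 elevation grid sloping down towards the bottom right.
grid = [
    [9, 8, 7],
    [8, 5, 4],
    [7, 4, 1],
]
flow = steepest_descent(grid)
```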


AMHARIC SPEECH TO TEXT MODEL


Worked with a group of 9 people to build a speech-to-text model for Amharic, a language spoken in Ethiopia.


Approach:

  • Applied audio preprocessing techniques such as resampling and data augmentation (adding noise and changing the speed and pitch of the audio).

  • Audio files were trimmed to remove silence.

  • Silent intervals were removed from the audio files.

  • Extracted features from the audio data using log Mel spectrogram.

  • Outliers were also removed from the audio files.
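Some of the preprocessing steps above can be sketched on a plain list of audio samples. This is a simplified, dependency-free illustration and not the project's code: the function names, threshold, and noise level are assumptions, the trimming only handles leading and trailing silence, and the naive resampling changes speed and pitch together. In practice an audio library would normally be used.

```python
import random


def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples quieter than the threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []
    return samples[loud[0]:loud[-1] + 1]


def add_noise(samples, noise_level=0.005, seed=None):
    """Augment audio by adding Gaussian noise to every sample."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, noise_level) for s in samples]


def change_speed(samples, rate):
    """Resample by linear interpolation; rate > 1 makes the audio shorter."""
    if not samples:
        return []
    n_out = max(1, int(len(samples) / rate))
    out = []
    for i in range(n_out):
        pos = i * rate
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        # Blend the two nearest original samples.
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


# Invented samples, for illustration only.
audio = [0.0, 0.0, 0.5, -0.3, 0.0, 0.2, 0.0, 0.0]
trimmed = trim_silence(audio)
noisy = add_noise(trimmed, seed=0)
faster = change_speed(trimmed, rate=2.0)
```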

Model Results:

  • Built a model based on a CNN and a bidirectional RNN that accepts Amharic audio data and transcribes it to text.



Built a tool that can be deployed to post and receive text and audio files to and from a data lake, apply transformations in a distributed manner, and load the results into a warehouse in a format suitable for training a speech-to-text model.