Projects

Daniel Zelalem Zewdie

Highlights

Telecom Data Analysis

This project analyzed a telecommunications company's data, using statistical modeling and exploratory data analysis, to advise whether the company is worth buying or selling.

Method Used

  • Performed univariate and multivariate analysis

  • Performed exploratory data analysis to observe customer behavior in terms of engagement, experience, and satisfaction, using pandas and visualization tools such as Plotly, seaborn, and Matplotlib.

  • Computed metrics for Experience, Engagement and Satisfaction.

  • Developed a Streamlit dashboard to provide the insights gained from the analysis

Machine Learning

  • Using Python and scikit-learn, applied K-means clustering to group customers by their engagement and experience scores, then calculated a satisfaction score for each customer.
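The clustering step above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's code: the two score columns, the choice of two clusters, and the definition of satisfaction as the mean of the two scores are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic per-customer scores (hypothetical): two loose groups of customers.
rng = np.random.default_rng(0)
low = rng.normal(loc=[1.0, 1.0], scale=0.2, size=(50, 2))
high = rng.normal(loc=[4.0, 4.0], scale=0.2, size=(50, 2))
scores = np.vstack([low, high])  # columns: engagement score, experience score

# Standardize so both scores contribute equally to the distance metric.
X = StandardScaler().fit_transform(scores)

# Cluster customers on the two scores (k=2 is an assumption for this sketch).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = kmeans.labels_

# Assumed definition: satisfaction is the mean of engagement and experience.
satisfaction = scores.mean(axis=1)

# Average satisfaction per cluster, useful for naming the clusters afterwards.
cluster_satisfaction = {
    int(k): float(satisfaction[labels == k].mean()) for k in np.unique(labels)
}
print(cluster_satisfaction)
```

In practice the number of clusters would usually be chosen with a diagnostic such as an elbow plot rather than fixed in advance.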



Rossmann Pharmaceutical Sales Prediction

The finance team of Rossmann wants to forecast sales in all their stores across several cities six weeks ahead of time. Managers in individual stores rely on their years of experience as well as their personal judgement to forecast sales.

The task was to build and serve an end-to-end product that delivers this prediction to analysts in the finance team.




Method Used

  • Multivariable sales forecasting: predicts sales from the current variables that influence them.

    • Using Python and scikit-learn, performed feature engineering and trained a Random Forest model that predicts sales from multiple influencing variables.

    • Calculated feature importance to see which variables most affect the number of sales and the number of customers.

    • Deployed the model on a Streamlit dashboard.

  • Historical forecasting (time-series forecasting): a quick way to gather insights from past sales performance. The idea is to predict future sales from historical sales time-series data.

    • Using TensorFlow/Keras, applied an LSTM recurrent neural network that takes seven weeks of historical sales data and predicts future sales.
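Before an LSTM can be trained on a sales series, the series has to be cut into fixed-length input windows with a target value for each. The sketch below shows one common way to do that with NumPy. The function name make_windows, the daily granularity (49 days for seven weeks), and the one-step horizon are assumptions for illustration, not details taken from the project.

```python
import numpy as np

def make_windows(series, window, horizon=1):
    """Turn a 1-D series into (samples, window, 1) inputs and (samples, horizon) targets."""
    series = np.asarray(series, dtype=np.float32)
    n = len(series) - window - horizon + 1
    if n <= 0:
        raise ValueError("series is shorter than window + horizon")
    X = np.stack([series[i:i + window] for i in range(n)])
    y = np.stack([series[i + window:i + window + horizon] for i in range(n)])
    # Recurrent layers expect a trailing feature axis: (samples, timesteps, features).
    return X[..., np.newaxis], y

# Stand-in for a daily sales series (hypothetical values).
daily_sales = np.arange(100, dtype=np.float32)

# Seven weeks of daily history per sample, predicting the next day.
X, y = make_windows(daily_sales, window=49)
print(X.shape, y.shape)
```

The resulting X has the (samples, timesteps, features) layout that Keras recurrent layers take as input.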



Amharic to Ethiopian Sign Language Translator


Designed and implemented a machine learning model that takes a spoken or written Amharic sentence, translates it into the native grammar of Ethiopian Sign Language, and presents it as a 3D animation.
This was a group project.

Method Used

  • Collected and preprocessed training data.

  • Built an application that displays the translated sentences as 3D animation using Unity3D.

Machine Learning

  • Used a pre-trained model called HornMorpho for word lemmatization.

  • Built a word2vec model for word embeddings using TensorFlow.

  • Built a grammar-translation model, based on an encoder-decoder seq2seq architecture in TensorFlow, that translates sentences from Amharic to Ethiopian Sign Language.

  • Built a speech recognition model.

  • Deployed the translation model and the speech recognition model on a Flask server.

Technologies used:

  • Python, C#, TensorFlow, Keras, Flask, Unity game engine, and Blender.



Amharic Speech To Text Model


Developed a model that transcribes Amharic speech audio into text by combining a CNN with bidirectional RNN layers.


Data

  • The dataset is from ALFFA (African Languages in the Field: speech Fundamentals and Automation). It contains transcribed speech data in Amharic, Swahili, and Wolof. The Amharic transcribed speech data were prepared by Solomon Tefera. The data folder has two subfolders, one for training data and one for test data. The training data contains nearly 10,000 audio files in WAV format, with a transcription text file for each audio file.

Data Pre-processing

  • Applied audio data preprocessing techniques using librosa

    • Resampling the sampling frequency of the audio data

    • Data augmentation by adding noise, changing the speed and the pitch of the audio signal

    • Converting all audio signal channels to a mono channel
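Two of the preprocessing steps listed above, mono conversion and noise augmentation, can be sketched in plain NumPy. This is an illustration under assumptions rather than the project's code: the helper names, the (channels, samples) array layout, the 16 kHz sample rate, and the 20 dB signal-to-noise ratio are all hypothetical. The project itself used librosa, which provides librosa.to_mono, librosa.resample, librosa.effects.time_stretch, and librosa.effects.pitch_shift for these operations.

```python
import numpy as np

def to_mono(audio):
    """Average all channels; expects shape (channels, samples) or (samples,)."""
    audio = np.asarray(audio, dtype=np.float32)
    return audio if audio.ndim == 1 else audio.mean(axis=0)

def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise so the result has roughly the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return (signal + noise).astype(np.float32)

# One second of a synthetic two-channel tone (sample rate is an assumption).
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
stereo = np.stack([tone, 0.5 * tone])

mono = to_mono(stereo)
noisy = add_noise(mono, snr_db=20, rng=np.random.default_rng(0))
print(mono.shape, noisy.shape)
```

Specifying the noise level as an SNR rather than a fixed amplitude keeps the augmentation comparable across loud and quiet recordings.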

Feature extraction

  • Implemented feature extraction by generating Log Mel spectrograms for audio signals.

Building Model

  • Using TensorFlow/Keras, built a model based on a CNN and a bidirectional RNN, trained with CTC loss, that takes an Amharic audio signal and converts it to text.
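A model trained with CTC loss emits one class distribution per audio frame, including a special blank class, so its output has to be decoded into text. A minimal sketch of greedy CTC decoding (take the best class per frame, collapse repeats, drop blanks) is shown below. The function name, the tiny character set, and the use of index 0 for the blank are assumptions for illustration. Some frameworks place the blank at the last index instead.

```python
import numpy as np

def ctc_greedy_decode(frame_scores, classes, blank=0):
    """Greedy CTC decoding: best class per frame, collapse repeats, remove blanks."""
    best = np.argmax(frame_scores, axis=-1)
    decoded = []
    prev = blank
    for idx in best:
        # Emit a character only when the class changes and is not the blank.
        if idx != prev and idx != blank:
            decoded.append(classes[idx])
        prev = idx
    return "".join(decoded)

# Hypothetical character set; index 0 is reserved for the CTC blank.
classes = ["", "ሰ", "ላ", "ም"]

# One-hot scores standing in for the per-frame output of a trained model.
frames = [1, 1, 0, 2, 2, 2, 0, 3]
frame_scores = np.eye(len(classes))[frames]

print(ctc_greedy_decode(frame_scores, classes))
```

The blank between two identical characters is what allows a doubled letter to survive the collapse step.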


Smart Ad A/B Testing


A/B testing is a user experience research methodology. An A/B test is a randomized experiment with two variants, A and B, which are identical except for one variation that might affect a user's behavior. It applies statistical hypothesis testing, specifically two-sample hypothesis testing as used in statistics.

A/B TESTING

Metric Choice:

  • Invariant metrics: used to check that the experiment (the way the change was presented to part of the population) is not inherently flawed, e.g. the number of users in both groups.

  • Evaluation metrics: metrics we expect to change and that are relevant to the goals we aim to achieve, e.g. brand awareness.

Hypothesis testing for A/B testing:

  • We use hypothesis testing to test two hypotheses:

    • Null hypothesis: there is no difference in brand awareness between the exposed and control groups.

    • Alternative hypothesis: there is a difference in brand awareness between the exposed and control groups.
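One standard way to test these two hypotheses is a two-sided two-proportion z-test on the share of "yes" responses in each group. The sketch below uses only the Python standard library. The function name and the response counts are made up for illustration, and the source does not state which specific test the project used.

```python
import math

def two_proportion_ztest(yes_a, n_a, yes_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a, p_b = yes_a / n_a, yes_b / n_b
    # Under the null hypothesis both groups share one pooled proportion.
    pooled = (yes_a + yes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: "yes" answers out of all responses per group.
z, p_value = two_proportion_ztest(yes_a=40, n_a=100, yes_b=60, n_b=100)
print(f"z = {z:.3f}, p = {p_value:.4f}")

alpha = 0.05  # significance level, chosen before looking at the data
print("reject null" if p_value < alpha else "fail to reject null")
```

The normal approximation behind this test is reasonable only when each group has a fair number of both "yes" and "no" responses.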

Machine Learning

  • Carried out three types of classification analysis to predict whether a user answers yes on brand awareness: Logistic Regression, Decision Trees, and XGBoost. Then compared the models to identify the best-performing one(s).

  • Used the models to calculate feature importance and determine whether being in the experiment group is an important feature in the prediction.
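The feature-importance check can be sketched with a scikit-learn decision tree. Everything in the example is synthetic and hypothetical: the feature names exposed and hour, the sample size, and the rule that generates the response. The response is deliberately built to depend only on hour, so the output says nothing about the real experiment.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic users (hypothetical features, not the project's dataset).
rng = np.random.default_rng(0)
n = 400
exposed = rng.integers(0, 2, size=n)  # 1 = experiment group, 0 = control
hour = rng.integers(0, 24, size=n)    # hour of day the ad was shown

# Toy response that depends only on the hour, by construction.
y = (hour >= 12).astype(int)

X = np.column_stack([exposed, hour])
feature_names = ["exposed", "hour"]

model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Impurity-based importances; they sum to 1 across features.
importance = dict(zip(feature_names, model.feature_importances_))
for name, value in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

If the group indicator ranked near the bottom on real data, that would be consistent with the exposure having little effect on the response, though it would not replace the hypothesis test.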