Selected Projects:
Currently:
Selection process and criteria for choosing the best Foundation model for specific recommendation use cases.
Creating and designing a lightweight Foundation model for personalized ads.
Designing an AI-agent model to recommend optimal content (ads and organic feeds).
Tuning Foundation/Gen-AI models for page recommendation.
Also:
1. Adaptive Regularization for Deep Speech and Vision
Summary: Deep learning problems, across applications such as NLP, ASR and vision, are inherently ill-conditioned, so regularization must be deployed to achieve accurate and stable solutions. A deep learning model is effectively an ensemble of many different models, so we cannot use the same regularization technique for every model in the ensemble. I have been focusing on adaptive regularization approaches that incorporate different regularization models throughout deep learning architectures.
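As a flavor of the idea, here is a minimal sketch of layer-wise regularization via PyTorch parameter groups; the tiny architecture and the decay values are made-up assumptions for illustration, not the project's actual settings.

```python
# Minimal sketch: layer-wise (adaptive) regularization through PyTorch
# parameter groups. Architecture and decay values are illustrative only.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),   # early vision layers
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 32 * 32, 10),      # classifier head
)

# Assign a different weight-decay strength to each part of the network
# instead of one global penalty for the whole "ensemble" of layers.
optimizer = torch.optim.AdamW([
    {"params": model[0].parameters(), "weight_decay": 1e-4},  # conv block
    {"params": model[3].parameters(), "weight_decay": 1e-2},  # dense head
], lr=1e-3)
```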
2. The Impact of Distance Metrics on Deep Learning Models
Summary: Deep learning models, like all other machine learning models, rely on the application of proper distance metrics. The choice of distance metric depends on many parameters, such as the type of data, the data distribution and the desired structure of the output. For any specific deep learning problem the choice is not unique, and there may be multiple candidate metrics to select from. This project looks at possible combinations of distance metrics to make sure that, for any specific problem, a specialized distance metric is deployed. The correct choice of distance metric often leads to much higher accuracy.
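A small sketch of why the choice matters: on the same (synthetic) data, the nearest neighbour of a query can change from metric to metric.

```python
# Minimal sketch: the nearest neighbour of a query can differ by metric,
# which is why the metric must match the data and the task.
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 16))        # candidate points
q = rng.normal(size=(1, 16))          # query point

for metric in ("euclidean", "cosine", "cityblock"):
    d = cdist(q, X, metric=metric)    # distances under this metric
    print(metric, "-> nearest index:", int(d.argmin()))
```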
3. Optimization of User Experience
Summary: What needs to be done to make a user's online experience delightful, efficient and productive? The answer encompasses many areas, such as the design of the space, interaction approaches, and the types of promotions and offers provided to users. In short, such an "optimal user experience design" needs to address both the immediate and the long-term needs of users. This is fascinating work, as it also involves predicting human behavior and getting ahead of users' future needs.
4. Comprehensive and Complete View of User Experience
Summary: Human experience is a continuous event. Even when we need to focus only on a specific part of the journey, it’s imperative to consider the overall, comprehensive user experience (online and offline) to design an optimal experience for that specific segment. In particular, it’s essential to take into account the interactions among the different segments and parts of the user experience. That is the purpose of this work: to consider a comprehensive view of the user experience and not to treat any part of it in isolation.
5. Predictability of Human Behavior
Summary: For many AI applications, such as e-commerce, finance, social networks and IoT, to name just a few, one major objective is to analyze and predict human behavior, which manifests in actions like human interaction with the digital world, content, phenomena, events, and other humans. This is a nontrivial task, as human behavior is far from a purely well-defined (or rational) structure and includes many unstructured and complex aspects. Modeling human behavior encompasses many domains, such as psychology, sociology, political history and neuroscience. My work focuses on using a data-driven approach to predict and understand this critical issue.
6. Theory of AI
Summary: We have been taking the first, initial steps in AI, and these steps could be described as brute-force approaches. This is not an unusual phenomenon in computer science: we have seen many other practical solutions for which, initially, there was no clear explanation of why and how they work. We need to develop a cohesive and unifying paradigm and basis for AI. I believe this is a necessary requirement for taking AI to the level we envision, and it is one of the areas I have been working on lately.
7. Personalization Project: Personalized Recommendation Systems
Summary: The goal of this project is to construct a hybrid model based on complete personalization of every recommendation, with respect to both the individual user and the individual content item. Lack of personalization is one of the major shortcomings of current systems and leads to an undesirable experience for users, so making sure every user is addressed uniquely is an essential requirement for building sound recommender systems.
8. AI-Based Recommender Systems
Summary: The goal of this project is to create a Recommender System platform that covers the state of the art in recommendation models, including our own models plus various testing and verification algorithms. The recommender models span many approaches, such as collaborative filtering, content-based recommendation, latent factors and deep-learning-based models.
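As a flavor of the latent-factor family listed above, a minimal sketch that factors a toy rating matrix with truncated SVD and recommends the best unseen item; the data and rank are illustrative, and a real system would mask unobserved entries rather than treating them as zeros.

```python
# Minimal latent-factor sketch (one of the model families listed above):
# factor a small rating matrix and score unseen items for a user.
import numpy as np

R = np.array([[5, 4, 0, 1],   # toy user-item ratings; 0 = unobserved
              [4, 5, 1, 0],   # (toy shortcut: zeros enter the SVD here)
              [1, 0, 5, 4],
              [0, 1, 4, 5]], dtype=float)

k = 2                                            # number of latent factors
U, s, Vt = np.linalg.svd(R, full_matrices=False)
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]    # low-rank reconstruction

user = 0
unseen = np.where(R[user] == 0)[0]
best = unseen[R_hat[user, unseen].argmax()]      # best-scoring unseen item
print("recommend item", int(best), "to user", user)
```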
9. A Comprehensive Global View of Individual User
Summary: The aim of this project is to provide a comprehensive view of the user by recognizing the user regardless of the channel and venue being used. One basic idea of AI is to create an intelligent service to optimally serve any individual’s needs. These services may include providing the best courses and learning materials, personalized web design, jobs, travel offers, promotional packages and many other products and services. In all of them, we need to make these services and products personalized for a specific user, and personalization cannot be achieved without understanding each specific user as a unique individual. To reach that goal, we need to collect all data pertinent to the individual user. Since any individual may interact with many online and offline devices and venues (laptop, phone, different browsers, various sites, multiple shopping stores, …), it is imperative to recognize the same user anonymously, with the security and privacy of the individual as priorities, across all these channels, so as to provide a global and comprehensive view of the user. This matters because, though there are overlaps, different channels may capture only a subset of a user’s unique features. By gathering all the different characteristics of a user and forming a comprehensive view, we understand each individual user and can provide personalized experiences, services and products for them.
10. User Recognition: Cross Devices, Channels and Venues
Summary: Using a projection of the data onto a higher-dimensional space and a subsequent transformation onto a lower-dimensional space, this project takes another approach to recognizing an individual user across all the different devices and channels.
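The project's exact transforms are not described here; purely as a stand-in for the up-then-down idea, the sketch below lifts features with a randomized RBF feature map and compresses them with PCA (both choices are my assumptions for illustration).

```python
# Generic stand-in for the up-then-down projection idea: lift the raw
# features into a higher-dimensional nonlinear space, then compress.
# The specific transforms (RBF feature map + PCA) are assumptions.
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))        # per-event user features (synthetic)

up = RBFSampler(n_components=256, random_state=0)    # project up
down = PCA(n_components=8)                           # project down
Z = down.fit_transform(up.fit_transform(X))          # compact signatures
print(Z.shape)                                       # (500, 8)
```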
11. Recognition of Individual Online User and Security Breach Detection
Summary: For this project, I use an “Iterative Clustering” approach to identify online users regardless of the online experiences and venues they may be involved in. Among other benefits, this project serves the specific purpose of detecting account and security breaches.
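The summary does not spell out the algorithm, so the following is only a generic illustration of one possible iterative clustering loop (cluster, flag points far from their centroid, re-cluster), not the project's actual method.

```python
# Generic illustration only: one possible "iterative clustering" loop.
# This is an assumption, not the project's actual algorithm.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))         # session-level user features (toy)

mask = np.ones(len(X), dtype=bool)    # sessions still considered normal
for _ in range(3):                    # a few refinement rounds
    km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X[mask])
    d = np.linalg.norm(X[mask] - km.cluster_centers_[km.labels_], axis=1)
    keep = d < np.quantile(d, 0.9)    # flag the farthest 10% as suspect
    idx = np.flatnonzero(mask)
    mask[idx[~keep]] = False          # candidates for breach review
print("flagged sessions:", int((~mask).sum()))
```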
12. Modern Data
Summary: For almost 20 years, we have been dealing with modern data, which is very high-dimensional, massive, heterogeneous with mixed data types, very sparse and unstructured. The machine learning models, algorithms and tools we are using were developed in the past and aimed at analyzing traditional data. These models and techniques are not readily applicable to “Modern Data”, so we need to unlearn many approaches from the past and develop new machine learning models and algorithms for modern data.
13. Fast Estimation of the Best Campaign
Summary: The objective is to find the winner of a pairwise test as fast as possible, where cost is measured by the number of user visits. As soon as there is statistical significance between the two offers, the winner is chosen and the losing offer is dropped.
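A minimal sketch of this stopping rule, assuming a two-proportion z-test (statsmodels) checked after each batch of visits; the batch size, alpha, and simulated conversion rates are illustrative, and a production version would correct for repeated looks at the data.

```python
# Minimal sketch of the stopping rule: after each batch of visits, run a
# two-proportion z-test and stop once the difference is significant.
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

rng = np.random.default_rng(0)
p_a, p_b = 0.050, 0.065               # true (unknown) conversion rates
conv = np.zeros(2)                    # conversions for offers A and B
n = np.zeros(2)                       # visits for offers A and B

for batch in range(1, 200):
    conv += rng.binomial(500, [p_a, p_b])   # 500 new visits per arm
    n += 500
    stat, pval = proportions_ztest(conv, n)
    if pval < 0.05:                   # statistical significance reached
        winner = "B" if conv[1] / n[1] > conv[0] / n[0] else "A"
        print(f"stop after {int(n.sum())} visits, winner: {winner}")
        break
```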
14. Audience Discovery
Summary:
Phase I. We would like to know which variable(s) drive a user’s specific action, and how those variables explain the action. The objective of this project is to measure the predictive impact of any user feature (in user-based data) or event feature (in event-based data) on any specific metric across all offers. After detecting the driving variables, the process continues with finding the driving sub-features (for example, if “state of residency” is the driving variable for user conversion, then which state is the driving sub-feature, and so on); a sketch of this ranking step appears after Phase II.
Phase II. Along the same line, the project also plans to learn which segment of the population has the most significant impact on a given metric.
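The sketch promised under Phase I: score each feature by its predictive impact on a binary action, here with mutual information as a stand-in impact measure; the feature names and data are made up.

```python
# Hedged sketch of the Phase I ranking step: score each user feature by
# its predictive impact on a binary action (mutual information here).
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

df = pd.DataFrame({
    "state":   [0, 1, 1, 2, 0, 2, 1, 0],   # label-encoded residency state
    "age_bin": [1, 2, 2, 3, 1, 3, 2, 1],
    "device":  [0, 0, 1, 1, 0, 1, 1, 0],
})
converted = [0, 1, 1, 1, 0, 1, 0, 0]        # the action being explained

scores = mutual_info_classif(df, converted, discrete_features=True,
                             random_state=0)
ranking = sorted(zip(df.columns, scores), key=lambda t: -t[1])
print(ranking)   # the top feature is the candidate driving variable
```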
15. Clustering Expansion, Additional Algorithms for Lookalike Modeling
Summary: At Adobe, segments are built using rule-based approaches: customers define criteria using desired traits (features) to form the segments. This project focuses on expanding these segments, called baseline segments, using machine learning models. The expansion adds users that are similar to the baseline segment even though they may lack the specific traits or features that define it.
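One common way to realize such an expansion (an assumption here, not necessarily the project's model) is to fit a classifier on segment membership and admit the highest-scoring outside users; the features, rule and threshold below are illustrative.

```python
# Minimal lookalike sketch: fit a classifier on "in baseline segment"
# vs. the rest, then add the highest-scoring outside users.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))          # user trait features (synthetic)
in_segment = X[:, 0] > 1.2              # stand-in for a rule-based trait

clf = LogisticRegression(max_iter=1000).fit(X, in_segment)
score = clf.predict_proba(X)[:, 1]      # similarity to the segment
expanded = in_segment | (score > 0.5)   # add lookalike users
print(int(in_segment.sum()), "->", int(expanded.sum()))
```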
16. Automatic, Insightful and Physically Interpretable Clustering
Summary: Currently, our customers have to choose a specific feature (of the users’ interaction data) to build a segment. The idea of this project is to use segmentation as a broader tool for providing insights about the audience, and to do so in an automated manner. The segments are to be built based on the amount and quality of actionable insight they provide. The project’s goal is to understand and visualize the concept and meaning of each segment, which helps in gaining insight into the similarity of the audience inside the same segment and the dissimilarity of audiences belonging to different segments. Dividing the entire audience into a few meaningful segments also provides a form of data summarization and dimension reduction, in addition to displaying the trends belonging to each segment.
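A toy sketch of the profiling idea: cluster the audience, then describe each segment by the feature on which it deviates most from the overall mean (the data, feature names and clustering choice are all illustrative).

```python
# Minimal sketch of profiling automatically built segments: cluster the
# audience, then label each segment by its most distinctive feature.
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(600, 4)),
                  columns=["visits", "dwell", "purchases", "recency"])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(df)
overall = df.mean()
for seg in range(3):                    # human-readable segment profile
    delta = (df[labels == seg].mean() - overall).abs().idxmax()
    print(f"segment {seg}: most distinctive feature = {delta}")
```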
17. Estimating Statistical Significance of a Segment of Data Set
Summary: Measuring the statistical significance of a segment test group is the main objective of this project. Another purpose is to provide a testing methodology for determining whether a lift (obtained by using a segment) is statistically significant; this is done using A/B testing on N mutually exclusive segments. The project also provides a visual interface for the test results and confidence intervals so they are easier to understand.
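For a single segment-versus-control comparison, a minimal sketch of the significance test and confidence interval using statsmodels; the counts are made up.

```python
# Minimal sketch: is the lift from a segment significant? Compare
# conversion in the segment vs. control with a z-test, and report a
# confidence interval for the difference in proportions.
from statsmodels.stats.proportion import (proportions_ztest,
                                          confint_proportions_2indep)

conv = [230, 180]                     # conversions: segment, control
n = [2000, 2000]                      # visitors:    segment, control

stat, pval = proportions_ztest(conv, n)
lo, hi = confint_proportions_2indep(conv[0], n[0], conv[1], n[1],
                                    method="wald")
print(f"p-value={pval:.4f}, 95% CI for lift: [{lo:.4f}, {hi:.4f}]")
```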
18. Identifying the Most Influential Data
Summary: Customers are often interested in acquiring third-party data to enhance their own data so they can improve some desired metric using the combined data. Thus, given a specific segment (for example, a segment of users most likely to convert), which third-party data set could potentially, in combination with the customer data, lead to a more accurate result, for example in predicting conversion? The other objective of the project is how to measure and quantify the impact of the complementary third-party data sets.
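One simple way to quantify that impact (an assumption for illustration, not the project's stated method) is the marginal lift in cross-validated accuracy when each third-party data set is joined to the first-party features; the providers and data below are synthetic.

```python
# Minimal sketch: value each third-party data set by the lift in
# validation accuracy when its columns join the first-party features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_own = rng.normal(size=(1000, 5))               # customer's own features
y = (X_own[:, 0] + rng.normal(size=1000)) > 0    # e.g. conversion label
third_party = {"provider_A": rng.normal(size=(1000, 3)),       # noise
               "provider_B": X_own[:, :1] + rng.normal(size=(1000, 1))}

base = cross_val_score(LogisticRegression(max_iter=1000), X_own, y,
                       cv=5).mean()
for name, X3 in third_party.items():
    combo = np.hstack([X_own, X3])
    score = cross_val_score(LogisticRegression(max_iter=1000), combo, y,
                            cv=5).mean()
    print(f"{name}: marginal lift = {score - base:+.4f}")
```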
19. Optimal Computation of Overlaps for Segments and Features
Summary: The aim of this project is to find the overlaps among segments and among the features used to build the segments. This is done by finding efficient sampling models that use smaller sample sizes with higher accuracy. The project also considers other possible, non-sampling, techniques with the same goal of computing the overlaps more accurately and efficiently.
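A minimal sketch of the sampling idea: estimate the Jaccard overlap of two large segments from a uniform sample of users instead of a full scan (the sizes and sample rate are illustrative).

```python
# Minimal sketch of sampling-based overlap estimation between segments.
import numpy as np

rng = np.random.default_rng(0)
universe = np.arange(1_000_000)       # all user ids
seg_a = set(rng.choice(universe, 200_000, replace=False))
seg_b = set(rng.choice(universe, 150_000, replace=False))

sample = rng.choice(universe, 20_000, replace=False)  # small sample
in_a = np.array([u in seg_a for u in sample])
in_b = np.array([u in seg_b for u in sample])
jaccard = (in_a & in_b).sum() / (in_a | in_b).sum()   # overlap estimate
print(f"estimated Jaccard overlap: {jaccard:.4f}")
```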
20. Detection of Security Breach on a User’s Account
Summary: In this project, we divide anomalous behaviors into two types. The first type is not cause for alarm, as human behavior changes organically over time. The objective of this project is to distinguish this type from the second type of anomaly, which represents a security breach and is associated with the behavior of a different individual. Recognizing the undesirable anomalies that represent security breaches is a critical need for Adobe Creative Cloud.
21. Using User Behavior to Detect Customers with a Low Probability of Conversion
Summary: There has been a concern that many users of Adobe cloud products take advantage of free and low-cost promotional offers and use these promotions constantly, under pseudonyms, multiple email addresses and other credentials. The task of this project is to identify these users by learning from their behavior.
22. Data Standardization
Summary: When categorical variables have many classes or categories, direct conversion (one-hot encoding) of these variables leads to the “curse of dimensionality” (COD). This project addresses the problem by creating new models that avoid COD.
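One standard way to avoid this blow-up (shown only as an illustration; the project's new models are not described here) is the hashing trick, which maps arbitrarily many categories into a fixed number of columns.

```python
# Minimal sketch: hash a high-cardinality categorical into a fixed-width
# representation instead of one-hot encoding every category.
from sklearn.feature_extraction import FeatureHasher

users = [{"city": "Springfield"}, {"city": "Shelbyville"},
         {"city": "Ogdenville"}, {"city": "Springfield"}]

hasher = FeatureHasher(n_features=16, input_type="dict")  # fixed width
X = hasher.transform(users)
print(X.shape)            # (4, 16) regardless of how many cities exist
```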
23. Valuation of Data
Summary: Though we often have access to massive amounts of data, not all data is created equal and, even more significantly, access to big data (no matter how large it may be) does not necessarily lead to correct and reliable results. Model performance metrics such as accuracy, correctness and generalizability depend heavily on the model’s access to proper data sets, i.e., data sets that contain the insights the model attempts to decode and analyze. Thus, an important question is: what is the value of any specific data set? In this project, both intrinsic (unsupervised) and metric-oriented (supervised) valuation of data are addressed.
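As one concrete instance of metric-oriented valuation (a leave-one-source-out scheme, assumed here purely for illustration), value each data source by how much a validation metric drops when that source is removed; the sources and labels below are synthetic.

```python
# Minimal sketch of supervised data valuation: leave one data source out
# and measure the drop in the cross-validated metric.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
sources = {s: (rng.normal(size=(300, 4)),            # features per source
               rng.integers(0, 2, size=300))         # labels per source
           for s in ("logs", "crm", "web")}

def score(names):
    X = np.vstack([sources[n][0] for n in names])
    y = np.concatenate([sources[n][1] for n in names])
    return cross_val_score(LogisticRegression(max_iter=1000), X, y,
                           cv=3).mean()

full = score(list(sources))
for name in sources:
    rest = [n for n in sources if n != name]
    print(f"value of {name}: {full - score(rest):+.4f}")
```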
Some more completed projects:
24. Insights and Analysis for Creative Cloud Churn and Conversion
25. Adobe E-Learning Recommendation
26. Forecasting for Adobe Analytics
27. Machine Learning for Intruder Detection in Software Security
28. Phrase Similarity Detection
29. Predictive Analytics for Adobe INSIGHT’s Metrics
30. Estimating Customer Future Value, a Predictive Analytics Approach
31. Feature Selection: Unsupervised Feature Selection
32. Collaborative Filtering, Recommendation Systems, and Missing Data
33. Evaluation of Input Impacts and Key Performance Indicator's Contributors
34. Automatic Data Storytelling: Natural Language Embellishment and Summarization for Q&A Systems
35. Variable Reduction and Variable Importance Ranking; Unsupervised Feature Selection
36. Outlier Detection and Restoration of Erroneous Data
37. Variable Reduction and Variable Importance Ranking