Accepted Posters

poster_neurips_marwa - Marwa Dhiaf.pdf

Continual Self-Supervised Learning for Scalable Multi-script Handwritten Text Recognition

Marwa Dhiaf - Instadeep

Abstract
We explore the potential of continual self-supervised learning to alleviate the catastrophic forgetting problem in handwritten text recognition, as an example of sequence recognition. Our method consists of adding intermediate layers called adapters for each task, and efficiently distilling knowledge from the previous model while learning the current task. Our proposed framework is efficient in both computation and memory complexity. To demonstrate its effectiveness, we evaluate our method by transferring the learned model to diverse text recognition downstream tasks, including Latin and non-Latin scripts.

NAML OAM - Ramzil Galiev.pdf

Neural Network-Based Classification of Vortex Beams for Turbulence-Immune FSO Communication

Ramzil Galiev - Technology Innovation Institute

Abstract
We investigate the characteristics of scalar and vector vortex beams propagating through turbulent atmosphere. We assess parameters such as the beam irradiance profiles and induced crosstalk while varying propagation distances, encompassing scenarios of both weak and strong turbulence. Our results demonstrate that neural networks consistently surpass traditional techniques in terms of classification accuracy, with this performance advantage becoming more conspicuous as propagation distances lengthen. We introduce a novel CycleGAN-assisted Kramers–Kronig receiver, which employs a CycleGAN to effectively eliminate turbulence-induced distortions in interference patterns. This advancement facilitates precise, single-shot measurements of the OAM spectrum utilizing the Kramers-Kronig relations, holding significant implications for real-time optical communication systems.

Grooming_in_darija_postervf - Khaoula Chehbouni.pdf

Unmasking Predators: Safeguarding Vulnerable Moroccan Communities Post-Earthquake

Khaoula Chehbouni - Mila, McGill University

Abstract
During the night of September 8, 2023, a magnitude-7 earthquake struck the Al-Haouz area of Morocco, killing more than 2900 people. After the event, authorities, individuals and the international community mobilized around the impacted community, and help flooded the area. Throughout this emergency, victims turned to social media to ask for help, seek life-saving information, and coordinate the rescue effort. Although these digital platforms are great communication tools, they can also be a nest of predators. After the earthquake, an alarming number of posts in Darija (the Moroccan Arabic dialect) encouraging the grooming of young orphaned girls were reported by the online community. Several pictures and videos showing adults acting inappropriately with orphaned children were also shared on social media platforms, along with worrisome posts encouraging pedophilia and sexual grooming. Despite the strong content moderation filters used by social media platforms, such posts are not flagged and are only deleted after being mass reported by the online community. This is likely because Darija is a low-resource language for which these tools are not adapted. As such, in this project, we aim to offer a first attempt at detecting sexual predation on social media in Darija. To do so, we collect problematic posts and then fine-tune different pre-trained models for the task.

Homograph Poster - Fati.pdf

Homograph Attacks on Maghreb Sentiment Analyzers

Fatima Zahra Qachfar - University of Houston

Abstract
We examine the impact of homograph attacks on the Sentiment Analysis (SA) task for different Arabic dialects from the Maghreb countries of North Africa. Homograph attacks result in a 65.3% relative decrease in transformer classification performance, from an F1-score of 0.95 to 0.33, when data is written in “Arabizi”. The goal of this study is to highlight the weaknesses of LLMs and to prioritize ethical and responsible Machine Learning.
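As a rough illustration of the idea, a homograph attack can be sketched as swapping Latin letters for visually similar characters from another script, so that the text looks unchanged to a reader but maps to different tokens. The substitution table and function names below are hypothetical and are not taken from the poster; the last helper only reproduces the relative-drop arithmetic quoted in the abstract.

```python
# Hypothetical homoglyph table (an assumption, not the poster's):
# Latin letters mapped to look-alike Cyrillic letters.
HOMOGLYPHS = {
    "a": "\u0430",  # Cyrillic small a
    "e": "\u0435",  # Cyrillic small ie
    "o": "\u043e",  # Cyrillic small o
    "c": "\u0441",  # Cyrillic small es
    "p": "\u0440",  # Cyrillic small er
}


def homograph_attack(text: str) -> str:
    """Replace every character that has a look-alike in the table."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def relative_drop(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100


attacked = homograph_attack("salam")
drop = relative_drop(0.95, 0.33)
```

Here `attacked` has the same length as the input and renders almost identically, yet differs at the code-point level, which is what can disrupt a tokenizer trained on clean Latin-script text.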

Nt3awnou Poster for NAML - Nouamane Tazi.pdf

No Village Left Behind: A Moroccan Data-driven Platform for Effective Aid Coordination

Nouamane Tazi - Hugging Face

Abstract
Following the catastrophic earthquake that hit Morocco in September 2023, our platform emerged to optimize relief coordination, efficiently orchestrating resources to aid those in need. This paper presents the various techniques used to collect and process requests and interventions into a clean and actionable dataset, enabling authorities and fellow NGOs to efficiently extend aid to the affected areas.

NAML - OUARDI AMINE - Graph Oriented Attention Networks - Amine OUARDI.pdf

Graph Oriented Attention Networks

Amine Ouardi - Ecole Normale Supérieure de l'Enseignement Technique Mohammedia (ENSET Mohammedia) 

Abstract
Graph Attention Networks (GATs) are a type of neural network architecture designed to effectively model and process data represented as graphs. GATs leverage attention mechanisms to learn the importance of different nodes in a graph when performing tasks such as node classification or link prediction. By assigning attention weights to neighboring nodes, GATs can capture the relevance and influence of different graph elements in a localized manner.

In this poster, I will be presenting Graph Oriented Attention Networks (GOAT) as a novel approach for attention calculation in graph-related tasks, where attention is specifically directed towards a particular destination node during each iteration. This approach aims to enhance the interpretability and control of attention within a graph neural network. By focusing attention on a specific destination node, the model can explicitly consider the influence of that particular node on the different graph nodes, providing insights into its role and importance within the graph structure. This approach is particularly valuable in scenarios where understanding the impact of individual nodes on specific targets is critical: as a use case, to predict a heuristic for the A* search algorithm, the attention will be focused on the destination node.
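For readers unfamiliar with the baseline, the neighbor attention weights of a standard GAT layer can be sketched as below. This illustrates the usual GAT scoring (LeakyReLU of a learned vector applied to concatenated node features, then a softmax over neighbors), not the GOAT variant, whose details the abstract does not give. The function names, the toy graph, and the assumption that features are already linearly transformed are all illustrative.

```python
import math


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def attention_weights(h, neighbors, i, a):
    """Softmax-normalised attention of node i over its neighbours.

    h: dict mapping node -> feature list (assumed already transformed),
    neighbors: dict mapping node -> list of neighbouring nodes,
    a: attention vector of length 2 * feature dimension.
    """
    scores = {}
    for j in neighbors[i]:
        concat = h[i] + h[j]  # list concatenation [h_i || h_j]
        scores[j] = leaky_relu(sum(w * x for w, x in zip(a, concat)))
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}


# Toy graph: node 0 attends over nodes 1 and 2.
features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
adjacency = {0: [1, 2]}
weights = attention_weights(features, adjacency, 0, [0.5, 0.5, 1.0, -1.0])
```

The weights sum to one over the neighborhood, and the neighbor with the higher raw score receives the larger share.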

NAML portrait template - benayad mohamed.pdf

Use of machine learning coupled with remote sensing to map sensitivity to road accidents, Case study Casablanca, Morocco

Mohamed BENAYAD - University Hassan II, Casablanca, Morocco

Abstract
Road accidents, major causes of congestion and substantial human and economic losses, necessitate precise identification of high-risk areas to enhance safety and the sustainability of transportation systems. This study, focused on Casablanca, assesses road accident susceptibility using historical accident data provided by the Waze application, as part of the 'Waze for Cities' program, and employing Geographic Information Systems (GIS) with the machine learning algorithms XGBoost (XGB) and Random Forest (RF). A GIS database was compiled, incorporating ten varied risk factors (such as population density, traffic conditions, proximity to schools and train stations) and 9847 accident points. This data was divided into two sets for training (70%) and validation (30%). The XGB and RF models analyzed the spatial relationships between these factors and incidents, enabling the generation of road accident susceptibility maps. The performance of the models was evaluated using statistical measures and the ROC curve, revealing RF's superiority (AUC = 0.915) over XGB (AUC = 0.907). These susceptibility maps, a result of the synergy between Waze data and GIS analysis, offer an essential tool for optimizing the management of road infrastructures in Casablanca, and for improving planning and management of road safety.
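The AUC figures quoted above summarize how well a model ranks accident locations above non-accident locations. A minimal sketch of that metric, using its pairwise-ranking definition, is shown here; the labels and scores are made-up toy values, not data from the study, and the function name is an assumption.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as half).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = accident point, 0 = non-accident point.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

In this toy case three of the four positive/negative pairs are ranked correctly, so the AUC is 0.75; a value of 0.5 would correspond to random ranking and 1.0 to perfect ranking.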

Knowledge Distillation of BERT Language Model on the Arabic Language_NAML Poster.pptx-2 - Hager Adil.pdf

Knowledge Distillation of BERT Language Model on the Arabic language

Hager Adil - University of Khartoum

Abstract
The absence of good Arabic language models has led to significant setbacks in Arabic language tasks, which lag in robustness and accuracy. While a version of BERT pre-trained on Arabic is available, a smaller distilled version could prove to be highly scalable. In this research paper, we propose the development of DistilBERT for the Arabic language, with the aim of achieving comparable results with significantly fewer computational resources. We employ the Knowledge Distillation compression technique to transfer knowledge from a large (teacher) model to a smaller (student) model so that the student reproduces the behavior of the teacher. Subsequently, we fine-tune the model for the question answering task and measure its performance. The model achieves an F1 score of 62.20%, compared to 67.28% achieved by the teacher model, demonstrating that it retains 92.4% of the teacher model’s performance while reducing model size by 70%. Ultimately, this project aims to break down language barriers and bring greater inclusivity to the Arabic language in NLP applications worldwide. This project serves as a starting point for further research and investigation of the performance of the Arabic DistilBERT model across various NLP tasks.
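The core of knowledge distillation can be sketched as a loss that pulls the student's softened output distribution toward the teacher's. The version below is the generic temperature-scaled KL-divergence term, offered as an assumption about the general technique rather than the exact objective used in this work; the function names and the temperature value are illustrative. The final helper reproduces the retention figure quoted in the abstract.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


def retained_percent(student_score, teacher_score):
    """Share of the teacher's score that the student keeps, in percent."""
    return student_score / teacher_score * 100


same = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
retention = retained_percent(62.20, 67.28)
```

The loss is zero when the student matches the teacher exactly and grows as their distributions diverge; dividing the student F1 by the teacher F1 gives the roughly 92.4% retention reported above.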

Attention Augmented CTC with Bayesian Optimization for Amharic Text Image Recognition - Tariku Adane.pdf

Attention Augmented CTC with Bayesian Optimization for Amharic Text-image Recognition

Tariku Adane Gelaw - Ethiopian Artificial Intelligence Institute

Abstract
In this study, we introduce an attention-augmented Connectionist Temporal Classification (attn-CTC) network designed for the recognition of Amharic text-images. The inherent challenges of this task stem from the distinctive features of the script, characterized by a syllabic writing system, limited resources, and intricate orthographic diacritics. The unique structural aspects of the characters add both interest and difficulty to OCR research. Building upon the success of attention mechanisms, we seamlessly incorporate attention into the CTC network, optimizing hyperparameter values through Bayesian methods. Our holistic model includes an encoder module, attention module, and transcription module. Through experimentation on two Amharic script datasets, our approach proves effective, achieving outstanding character error rates of 1.04% on the ADOCR and 16% on the HHD-Ethiopic test datasets.

Zerouaoui_NAML_Poster - Hasnae Zerouaoui.pdf

Deep Heterogeneous Convolution Neural Networks Ensembles for Pathological Breast Cancer Diagnosis

Hasnae Zerouaoui - UM6P - College of computing

Abstract
This study proposes a deep end-to-end heterogeneous ensemble approach (DEHtE) for breast histopathological images classification. The ensemble approach combines two to seven learners among the following popular deep convolutional neural networks: VGG16, VGG19, ResNet50, Inception V3, Inception ResNet V2, Xception, and MobileNet V2. It is based on three selection criteria (by accuracy, by diversity, and by both accuracy and diversity) and two voting methods (majority voting and weighted voting). An experimental evaluation on the popular BreakHis dataset demonstrates a significant increase in performance compared to the learner ResNet50 used as a baseline with an accuracy rising from 78.14%, 78.57%, 82.80% and 79.43% to 93.80%, 93.40%, 93.30%, and 91.80% through the BreakHis dataset’s four magnification factors: 40X, 100X, 200X, and 400X respectively.
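The two voting schemes named above can be sketched in a few lines. This is a generic illustration of majority and weighted voting over class labels, not the DEHtE implementation; the function names, the toy predictions, and the use of validation accuracies as weights are assumptions.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Return the label predicted by the largest number of learners."""
    return Counter(predictions).most_common(1)[0][0]


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    Each learner's vote counts in proportion to its weight
    (for example, its validation accuracy).
    """
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Three hypothetical learners classify one image.
votes = ["benign", "malignant", "malignant"]
by_majority = majority_vote(votes)
by_weight = weighted_vote(votes, [0.9, 0.3, 0.4])
```

The two schemes can disagree: here the majority picks "malignant", while the weighted vote picks "benign" because the single dissenting learner carries more weight than the other two combined.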

Alhassan_NAML_poster - Wathela Hamed.pdf

Einstein Telescope: detection of binary black holes gravitational wave signals using deep learning

Wathela Alhassan - Nicolaus Copernicus Astronomical Center

Abstract
This work continues our prior study (Alhassan et al. 2022), in which data from a single detector of the Einstein Telescope (ET) was evaluated for the detection of binary black holes (BBHs) using deep learning (DL). Here we explore the detection efficiency of BBHs using data combined from all three proposed detectors of ET, with five different lower frequency cutoffs (Flow): 5 Hz, 10 Hz, 15 Hz, 20 Hz and 30 Hz, and the same previously used SNR ranges: 4-5, 5-6, 6-7, 7-8 and >8. Using the ResNet model (which had the best overall performance on single-detector data), the detection accuracy improved from 60%, 60.5%, 84.5%, 94.5% and 98.5% to 78.5%, 84%, 99.5%, 100% and 100% for sources with SNR of 4-5, 5-6, 6-7, 7-8 and >8 respectively. The results show a great improvement in accuracy for lower SNR ranges: 4-5, 5-6 and 6-7 by 18.5%, 24.5%, 13% respectively, and by 5.5% and 1.5% for higher SNR ranges: 7-8 and >8 respectively. In a qualitative evaluation, the ResNet model was able to detect sources at 86.601 Gpc, with 3.9 averaged SNR (averaged SNR from the three detectors) and 13.632 chirp mass at 5 Hz. It was also shown that the use of the three detectors' combined data is appropriate for near-real-time detection, and can be significantly improved using a more powerful setup.

NeurIPS_2023__Poster____NAML___North_Africans_in_Machine_Learning - Abder-Rahman Ali.pdf

Self-Supervised Learning Meets Liver Ultrasound Imaging

Abder-Rahman Ali - Harvard Medical School/Massachusetts General Hospital

Abstract
In the field of medical ultrasound imaging, conventional B-mode "grey scale" ultrasound and shear wave elastography (SWE) are widely used for chronic liver disease diagnosis and risk stratification. However, many abdominal ultrasound images do not include views of the liver, necessitating a pre-processing liver view detection step before feeding the image to the AI system. To address this, we propose a self-supervised learning method, SimCLR+LR, for image classification that utilizes a large set of unlabeled abdominal ultrasound images to learn image representations. These representations are then fine-tuned to the downstream task of liver view classification. This approach outperforms traditional supervised learning methods and achieves superior performance when compared to state-of-the-art (SOTA) models, ResNet-18 and MLP-Mixer. Once the liver view is detected, the next crucial phase involves the segmentation of the liver region, imperative for obtaining accurate and dependable results in SWE. For this, we present another self-supervised learning approach, SimCLR+ENet, which leverages the learned feature representations and fine-tunes them on the task of liver segmentation, followed by a refinement step using CascadePSP. The proposed approach outperforms the SOTA method U-Net. SimCLR+ENet was also used to detect poor probe contact (i.e., areas where the ultrasound probe/transducer does not have adequate contact with the patient's skin) in liver ultrasound images, an artifact that affects the reliability of SWE. The combination of the proposed self-supervised learning methods for liver view classification, liver segmentation, and poor probe contact detection not only reduces the time and cost associated with data labeling, but also optimizes the liver segmentation workflow and SWE reliability in a real-time setting.

NAMLposter - hamza bouzid.pdf

Decoupling Spatial and Temporal Modeling in 3D Facial Expression Recognition

Hamza Bouzid - LRIT, Mohammed V University in Rabat, Rabat, Morocco

Abstract
This work introduces a novel method for dynamic facial expression recognition using sequences of 3D meshes. Unlike existing methods that use hand-designed features or project faces to the 2D domain, our approach directly extracts spatio-temporal information from 3D meshes. It employs a spatial auto-encoder with spiral convolutions for spatial embedding and a temporal transformer for temporal context and expression classification. Evaluation on MUG and BU-4DFE databases shows promising results, but emphasizes the importance of refining pre-processing and mesh registration for improved accuracy and robustness.

NAML_poster - Oussama Mahfoudhi.pdf

From Humans to Agents: Reinventing Team Dynamics and Leadership in Multi-Agent RL

Oussama Mahfoudhi - InstaDeep Ltd

Abstract
Multi-Agent Reinforcement Learning (MARL) has attracted significant attention in the research community. In cooperative MARL, establishing a collaborative paradigm among agents is a fundamental challenge. This challenge is compounded by the limited scope of available training schemes, which primarily encompass three key options: Independent Learners, Centralized Controller, and Centralized Training with Decentralized Execution (CTDE). Inspired by human team interactions, and in light of the impressive results achieved by MAPPO (PPO trained under the CTDE scheme), we propose incorporating a competitive paradigm among the team's agents to optimize both the team's policy and its ultimate goal under the guidance of a lead SuperAgent (SA). Our novel paradigm has achieved promising results on Multi Particle Environments (MPE), outperforming MAPPO.

Namla Poster - Scaling Down Multilingual Language Models of Code.pptx - ammar nasr.pdf

Scaling Down Multilingual Language Models of Code

Ammar Khiri - University of Edinburgh

Abstract
The democratization of AI and access to code language models is a pivotal goal in the field of artificial intelligence. Large Language Models (LLMs) have shown exceptional capabilities in code intelligence tasks, but at a high computational cost. This paper addresses these challenges by presenting a comprehensive approach to scaling down Code Intelligence LLMs. We focus on training smaller code language models, which lowers the computational cost of inference and training. We extend these models to diverse programming languages, enabling code completion tasks across various domains.

AuCALME Autonomous Collaborative Agents for Learning Methods Exploration - Massinissa Abboud.pdf

AuCALME: Autonomous Collaborative Agents for Learning Methods Exploration

Massi-Nissa ABBOUD - School of AI Algiers / EURECOM

Abstract
The advent of Large Language Models (LLMs) has revolutionized problem-solving methodologies and is widely recognized as a pivotal approach towards achieving Artificial General Intelligence (AGI). Numerous endeavors have sought to harness the reasoning capabilities of LLMs in various domains, employing techniques such as fine-tuning and retrieval-augmented generation, yielding remarkable outcomes. However, the efficacy of LLMs is predominantly observed in language-related tasks, revealing a significant gap in their performance when confronted with optimization-centric challenges inherent in general AI tasks.

This poster introduces AuCALME, a groundbreaking framework designed to exploit the knowledge and reasoning prowess of LLMs through the implementation of autonomous agents that emulate the intricate process of constructing AI models. AuCALME has demonstrated exceptional capabilities in autonomously crafting AI models that exhibit high performance across diverse tasks and datasets. Furthermore, our framework features a Learning Methods Explorator, dedicated to iteratively forecasting the most proficient learning method for model application. This exploratory component represents a promising avenue in automated machine learning (autoML), enabling an intuitive exploration of a textually formatted learning methods search space within the LLM.

NAML portrait template - Salima BOURBIA.pdf

Deep Learning Multi-View Fusion for Blind 3D Point Cloud Quality Assessment

Bourbia Salima - Mohammed V University in Rabat, Morocco

Abstract
Digital representation of 3D content in the form of 3D point clouds (PCs) has gained increasing interest and has emerged in various computer vision applications. However, various degradations may appear on the 3D point cloud during the acquisition, transmission, or treatment steps of the 3D processing pipeline. Therefore, several Full-Reference, Reduced-Reference, and No-Reference metrics have been proposed to estimate the visual quality of a PC. However, Full-Reference and Reduced-Reference metrics require reference information, which is not accessible in real-world applications, and No-Reference metrics still lack precision in evaluating PC quality. In this context, we propose a deep learning-based method for No-Reference Point Cloud Quality Assessment (NR-PCQA) that aims to automatically predict the perceived visual quality of the PC without using the reference content. More specifically, in order to imitate how the human visual system captures geometric and color degradation during PC quality evaluation, we render the PC into different 2D views using a perspective projection. Then, the projected 2D views are divided into patches that are fed to a Convolutional Neural Network (CNN) to learn sophisticated and discriminative visual quality features for evaluating the local quality of each patch. Finally, the overall quality score of the 3D point cloud is obtained by fusing the patch quality scores. We conduct extensive experiments on three benchmark databases: ICIP2020, SJTU, and WPC, and we compare the proposed model to the existing state-of-the-art methods. Based on the experimental results, our proposed model achieves high correlations with the subjective quality scores and outperforms the state-of-the-art methods.

NAML - Saba.pdf

Multilingual Stable Diffusion: Towards more inclusive Text-to-Image Synthesis

Saba Abdulaziz - University of Khartoum

Abstract
The models proposed in CLIP and Stable Diffusion significantly underperform when tested on data in languages other than English. We democratize and enhance the accessibility of text-to-image diffusion models by presenting a multilingual variant of the Stable Diffusion model. To do this we utilise multilingual-CLIP. However, we empirically observe a mismatch between the embedding spaces of the existing multilingual-CLIP model and the original CLIP model, making it unsuitable for use with Stable Diffusion text-to-image generation. We trained an adapter layer to align the output of multilingual-CLIP with the original CLIP, enabling text-to-image generation in up to 40 languages. Crucially, we trained this adapter using only English text examples, where the original CLIP embedding space is well-defined, and our model generalises well to languages beyond English, achieving a 61% improvement in FID score and a 125% improvement in CLIP score over Stable Diffusion when evaluated on Arabic prompts. Text-to-image generative models are becoming increasingly prevalent in many up-and-coming AI-based experiences. We aim to make these experiences more inclusive through this work.

neurIPS_NeuraFirst_symbolic_regression_poster - Smail Alaoui.pdf

Symbolic Regression for Scientific Discovery: an Application to Wind Speed Forecasting

Ismail Alaoui Abdellaoui - NeuraFirst

Abstract
Symbolic regression refers to a set of techniques that uncover an analytical equation from data. Through a closed-form formula, these techniques provide great advantages such as the potential scientific discovery of new laws, as well as explainability, feature engineering and fast inference. The present paper applies a recent end-to-end symbolic regression technique, the equation learner (EQL), to obtain an analytical equation for wind speed forecasting. We show that it is possible to derive an analytical equation that achieves reasonable accuracy for short-term horizon predictions using only a small number of features.

Poster_NAML2023 - Nouhaila Innan.pdf

Quantum Machine Learning for the Traveling Salesman Problem: A Variational Quantum Algorithm Approach

Nouhaila Innan - Hassan II University of Casablanca

Abstract
The Traveling Salesman Problem (TSP) is a classic combinatorial optimization problem with many real-world applications. It involves finding the shortest route that visits multiple cities and returns to the starting point. Existing classical algorithms for solving the TSP are often inefficient, especially for large problem instances.

Quantum machine learning (QML) is a rapidly emerging field that combines the power of quantum computing with machine learning algorithms to solve complex problems. Variational quantum algorithms (VQAs) are a type of QML algorithm that can be used to solve the TSP by approximating the optimal solution.

In this work, we present a VQA for the TSP that uses parameterized quantum circuits, and we update the circuit parameters iteratively using gradient-based methods to minimize the tour length while incorporating classical optimization techniques to enhance the search for optimal parameters.

Our experimental evaluations on a quantum simulator show that the VQA effectively provides competitive solutions compared to state-of-the-art classical approaches for small- to medium-sized TSP instances. This study demonstrates the potential of QML to revolutionize how we solve the TSP and other combinatorial optimization problems.
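To make the objective concrete, the quantity being minimized (the length of a closed tour) and an exact classical baseline for tiny instances can be sketched as follows. This is a brute-force classical illustration under assumed Euclidean distances, not the variational quantum algorithm from the poster; the function names and the four-city instance are hypothetical.

```python
import itertools
import math


def tour_length(cities, order):
    """Length of the closed tour visiting `cities` in `order`."""
    total = 0.0
    for k in range(len(order)):
        a = cities[order[k]]
        b = cities[order[(k + 1) % len(order)]]  # wrap back to the start
        total += math.dist(a, b)
    return total


def brute_force_tsp(cities):
    """Exact solution by enumeration; only feasible for very small inputs."""
    rest = range(1, len(cities))
    # Fix city 0 as the start so rotations of a tour are not re-counted.
    candidates = ((0,) + perm for perm in itertools.permutations(rest))
    best = min(candidates, key=lambda order: tour_length(cities, order))
    return best, tour_length(cities, best)


# Four cities on the corners of a unit square.
square = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
best_order, best_length = brute_force_tsp(square)
```

Enumeration examines (n-1)! orderings, which is why exact search becomes impractical quickly and approximate methods, classical or quantum, are of interest for larger instances.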

Poster_NAML 2023 - Future.pdf

Hate speech detection in Algerian dialect using deep learning

Sifal KLIOUI - OMDENA

Abstract
The surge in hate speech across various formats on social media has led to a notable rise in violence. In recent years, substantial efforts have been focused on addressing this issue by developing methods to detect hate speech in multiple structured languages, such as English, French, and Arabic. However, there has been limited attention given to Arabic dialects, particularly those from Tunisia, Egypt, and, primarily, Algeria.

We propose in this work a complete approach for detecting hate speech in online Algerian messages. Many deep learning architectures have been evaluated on our corpus created from Algerian social networks (Facebook, YouTube, and Twitter). It contains more than 13.5K documents in Algerian dialect written in Arabic, labeled as hateful or non-hateful. Promising results are obtained, which show the efficiency of our approach.

NeuRIPS Poster Feres Jerbi - Feres Jerbi.pdf

A Comparative Study of ML Models and Experts Labeling in Energy Consumption Analysis

Feres Jerbi - Wattnow

Abstract
Automatic detection of energy consumption anomalies plays a critical role in energy saving. Off-the-shelf Machine Learning (ML) methods rely on large, manually annotated training datasets, which are difficult to build. This study addresses that gap by evaluating unsupervised anomaly detection models on three energy consumption anomaly datasets. Results show that ML based on automatic annotation is a feasible alternative to manual annotation.