Publications:
Few-Shot Adversarial Low-Rank Fine-Tuning of Vision-Language Models (Submitted to ICLR)
Authors: Sajjad Ghiasvand*, Haniyeh Ehsani Oskouie*, Mahnoosh Alizadeh, Ramtin Pedarsani
Abstract: Vision-Language Models (VLMs) such as CLIP have shown remarkable performance in cross-modal tasks through large-scale contrastive pre-training. To adapt these large transformer-based models efficiently for downstream tasks, Parameter-Efficient Fine-Tuning (PEFT) techniques like LoRA have emerged as scalable alternatives to full fine-tuning, especially in few-shot scenarios. However, like traditional deep neural networks, VLMs are highly vulnerable to adversarial attacks, where imperceptible perturbations can significantly degrade model performance. Adversarial training remains the most effective strategy for improving model robustness in PEFT. In this work, we propose AdvCLIP-LoRA, the first algorithm designed to enhance the adversarial robustness of CLIP models fine-tuned with LoRA in few-shot settings. Our method formulates adversarial fine-tuning as a minimax optimization problem and provides theoretical guarantees for convergence under smoothness and nonconvex-strong-concavity assumptions. Empirical results across eight datasets using ViT-B/16 and ViT-B/32 models show that AdvCLIP-LoRA significantly improves robustness against common adversarial attacks (e.g., FGSM, PGD), without sacrificing much clean accuracy. These findings highlight AdvCLIP-LoRA as a practical and theoretically grounded approach for robust adaptation of VLMs in resource-constrained settings. The code is available at https://github.com/sajjad-ucsb/AdvCLIP-LoRA.
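A minimal, illustrative sketch of the minimax recipe described in the abstract, assuming a PyTorch setting; the toy LoRA layer, PGD budget, and stand-in classifier below are placeholders, not the AdvCLIP-LoRA implementation (see the linked repository for that):
```python
# Sketch only: inner PGD maximizes the loss over a bounded image perturbation,
# the outer step updates only the low-rank (LoRA) parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (illustrative)."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        self.A = nn.Parameter(torch.zeros(base.out_features, rank))
        self.B = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * F.linear(x, self.A @ self.B)

def pgd_perturb(model, x, y, eps=8/255, step=2/255, iters=5):
    """Inner maximization: L_inf-bounded PGD on the input images."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(iters):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + step * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return delta.detach()

# Toy model standing in for a CLIP image encoder with a classification head.
model = nn.Sequential(nn.Flatten(), LoRALinear(nn.Linear(3 * 32 * 32, 10)))
opt = torch.optim.AdamW([p for p in model.parameters() if p.requires_grad], lr=1e-3)

x, y = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
delta = pgd_perturb(model, x, y)             # inner max over perturbations
loss = F.cross_entropy(model(x + delta), y)  # outer min over LoRA params only
opt.zero_grad(); loss.backward(); opt.step()
```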
Test-Time Defense Against Adversarial Attacks via Stochastic Resonance of Latent Ensembles (Submitted to ICLR)
Authors: Dong Lao, Yuxiang Zhang, Haniyeh Ehsani Oskouie, Yangchao Wu, Alex Wong, Stefano Soatto
Abstract: We propose a test-time defense mechanism against adversarial attacks - imperceptible image perturbations that significantly alter a model’s predictions. Unlike existing methods that rely on feature filtering or smoothing, which can lead to information loss, we propose to "combat noise with noise," which leverages Stochastic Resonance to enhance robustness while minimizing information loss. Our approach introduces small translational perturbations to the input image, aligns the transformed feature embeddings, and aggregates them before mapping back to the original reference frame. This can be expressed in a closed-form formula, which can be deployed on any existing architecture without modification, re-training, or fine-tuning for specific attack types. The method is entirely training-free, architecture-agnostic, and attack-agnostic. In the experiments, beyond demonstrating its effectiveness with image classification, we present test-time defense results on dense prediction tasks such as stereo matching and optical flow, highlighting its versatility and practicality in real-world scenarios. In particular, we reduce the prediction error by as much as 71% on stereo matching and 28% on optical flow, demonstrating the effectiveness of our method.
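As a rough sketch of the idea (my paraphrase, not the paper's released code), a test-time ensemble over small translations could look like the following, assuming a frozen dense-prediction model whose output lives on the same pixel grid as its input:
```python
# Sketch: shift the input by small offsets, run the frozen model on each copy,
# map every prediction back to the original reference frame, and average.
import torch

def translational_ensemble(model, x, max_shift=2):
    outs = []
    for dx in range(-max_shift, max_shift + 1):
        for dy in range(-max_shift, max_shift + 1):
            shifted = torch.roll(x, shifts=(dy, dx), dims=(-2, -1))
            pred = model(shifted)
            # undo the shift so all predictions are aligned before averaging
            outs.append(torch.roll(pred, shifts=(-dy, -dx), dims=(-2, -1)))
    return torch.stack(outs).mean(dim=0)

# e.g. defended = translational_ensemble(model, image_batch)
# with max_shift=2 this uses 25 forward passes per input.
```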
Leveraging Large Language Models and Topic Modeling for Toxicity Classification (IEEE CNC Workshop 2025)
Authors: Haniyeh Ehsani Oskouie*, Christina Chance*, Claire Huang*, Margaret Capetz*, Elizabeth Eyeson*, Majid Sarrafzadeh
Abstract: Content moderation and toxicity classification represent critical tasks with significant social implications. However, studies have shown that major classification models exhibit tendencies to magnify or reduce biases and potentially overlook or disadvantage certain marginalized groups within their classification processes. Researchers suggest that the positionality of annotators influences the gold-standard labels that the models learn from, propagating the annotators' bias. To further investigate the impact of annotator positionality, we fine-tune BERTweet and HateBERT on the dataset while using topic modeling strategies for content moderation. The results indicate that fine-tuning the models on specific topics yields a notable improvement in F1 score compared to the predictions generated by other prominent classification models such as GPT-4, PerspectiveAPI, and RewireAPI. These findings further reveal that state-of-the-art large language models exhibit significant limitations in accurately detecting and interpreting text toxicity when contrasted with earlier methodologies. Code is available at https://github.com/aheldis/Toxicity-Classification.git.
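A minimal sketch of the topic-partitioning step under my reading of the abstract (toy comments and a hypothetical pipeline; the study's actual topic-modeling setup may differ):
```python
# Sketch: assign each comment a dominant LDA topic, then fine-tune a separate
# toxicity classifier (e.g., BERTweet or HateBERT) on each topic's subset.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

texts = ["you are awful", "great game last night", "terrible referee decision",
         "love this community", "worst take I have ever read", "nice shot"]
counts = CountVectorizer().fit_transform(texts)
topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)
dominant = topics.argmax(axis=1)
by_topic = {t: [s for s, d in zip(texts, dominant) if d == t] for t in set(dominant)}
# each by_topic[t] would then be used to fine-tune its own classifier
print(by_topic)
```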
Exploring Cross-model Neuronal Correlations in the Context of Predicting Model Performance and Generalizability
Authors: Haniyeh Ehsani Oskouie, Lionel Levine, and Majid Sarrafzadeh
Abstract: As Artificial Intelligence (AI) models are increasingly integrated into critical systems, the need for a robust framework to establish the trustworthiness of AI is increasingly paramount. While collaborative efforts have established conceptual foundations for such a framework, there remains a significant gap in developing concrete, technically robust methods for assessing AI model quality and performance. A critical drawback in the traditional methods for assessing the validity and generalizability of models is their dependence on internal developer datasets, rendering it challenging to independently assess and verify their performance claims. This paper introduces a novel approach for assessing a newly trained model's performance based on another known model by calculating correlation between neural networks. The proposed method evaluates correlations by determining if, for each neuron in one network, there exists a neuron in the other network that produces similar output. This approach has implications for memory efficiency, allowing for the use of smaller networks when high correlation exists between networks of different sizes. Additionally, the method provides insights into robustness, suggesting that if two highly correlated networks are compared and one demonstrates robustness when operating in production environments, the other is likely to exhibit similar robustness. This contribution advances the technical toolkit for responsible AI, supporting more comprehensive and nuanced evaluations of AI models to ensure their safe and effective deployment. Code is available at https://github.com/aheldis/Cross-model-Correlation.git.
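One way to make the matching criterion concrete is sketched below with NumPy; the activation shapes and the use of Pearson correlation are my assumptions, not necessarily the paper's exact procedure:
```python
# Sketch: collect per-neuron activations from two networks on the same probe
# inputs, then, for every neuron in network A, record the highest correlation
# achieved by any neuron in network B.
import numpy as np

def max_cross_correlation(acts_a: np.ndarray, acts_b: np.ndarray) -> np.ndarray:
    """acts_a: (n_samples, n_neurons_a), acts_b: (n_samples, n_neurons_b)."""
    a = (acts_a - acts_a.mean(0)) / (acts_a.std(0) + 1e-8)
    b = (acts_b - acts_b.mean(0)) / (acts_b.std(0) + 1e-8)
    corr = a.T @ b / len(a)            # (n_neurons_a, n_neurons_b) Pearson matrix
    return np.abs(corr).max(axis=1)    # best match for each neuron in A

# Example with random activations standing in for real probe responses.
rng = np.random.default_rng(0)
acts_a, acts_b = rng.normal(size=(256, 64)), rng.normal(size=(256, 128))
print(max_cross_correlation(acts_a, acts_b).mean())
```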
Attack on Scene Flow using Point Clouds (IEEE MLSP 2024)
Authors: Haniyeh Ehsani Oskouie, Mohammad-Shahram Moin, and Shohreh Kasaei
Abstract: Deep neural networks have made significant advancements in accurately estimating scene flow using point clouds, which is vital for many applications like video analysis, action recognition, and navigation. The robustness of these techniques, however, remains a concern, particularly in the face of adversarial attacks that have been proven to deceive state-of-the-art deep neural networks in many domains. Surprisingly, the robustness of scene flow networks against such attacks has not been thoroughly investigated. To address this problem, the proposed approach aims to bridge this gap by introducing adversarial white-box attacks specifically tailored for scene flow networks. Experimental results show that the generated adversarial examples obtain up to 33.7 relative degradation in average end-point error on the KITTI and FlyingThings3D datasets. The study also reveals the significant impact that attacks targeting point clouds in only one dimension or color channel have on average end-point error. Analyzing the success and failure of these attacks on the scene flow networks and their 2D optical flow network variants shows a higher vulnerability for the optical flow networks. Code is available at https://github.com/aheldis/Attack-on-Scene-Flow-using-Point-Clouds.git.
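A hedged sketch of this style of white-box attack; the network interface, perturbation budget, and step sizes below are placeholders (the released code is linked above):
```python
# Sketch: perturb the first point cloud with projected gradient steps so as to
# maximize the end-point error (EPE) of a differentiable scene-flow network.
import torch

def epe(pred_flow, gt_flow):
    """Average end-point error between predicted and ground-truth flow."""
    return torch.norm(pred_flow - gt_flow, dim=-1).mean()

def attack_scene_flow(flow_net, pc1, pc2, gt_flow, eps=0.05, step=0.01, iters=20):
    """L_inf-bounded perturbation of pc1 that maximizes the EPE of flow_net."""
    delta = torch.zeros_like(pc1, requires_grad=True)
    for _ in range(iters):
        loss = epe(flow_net(pc1 + delta, pc2), gt_flow)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + step * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (pc1 + delta).detach()

# Toy stand-in for a differentiable scene-flow network and random point clouds.
flow_net = lambda a, b: b - a
pc1, pc2, gt = torch.rand(1, 2048, 3), torch.rand(1, 2048, 3), torch.zeros(1, 2048, 3)
adv_pc1 = attack_scene_flow(flow_net, pc1, pc2, gt)
```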
Interpretation of Neural Networks is Susceptible to Universal Adversarial Perturbations (ICASSP 2023)
Authors: Haniyeh Ehsani Oskouie and Farzan Farnia
Abstract: Interpreting neural network classifiers using gradient-based saliency maps has been extensively studied in the deep learning literature. While the existing algorithms manage to achieve satisfactory performance in application to standard image recognition datasets, recent works demonstrate the vulnerability of widely-used gradient-based interpretation schemes to norm-bounded perturbations adversarially designed for every individual input sample. However, such adversarial perturbations are commonly designed using the knowledge of an input sample, and hence perform sub-optimally in application to an unknown or constantly changing data point. In this paper, we show the existence of a Universal Perturbation for Interpretation (UPI) for standard image datasets, which can alter a gradient-based feature map of neural networks over a significant fraction of test samples. To design such a UPI, we propose a gradient-based optimization method as well as a principal component analysis (PCA)-based approach to compute a UPI which can effectively alter a neural network's gradient-based interpretation on different samples. We support the proposed UPI approaches by presenting several numerical results of their successful applications to standard image datasets.
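A small sketch of the PCA-based aggregation step as I read it from the abstract; the per-sample directions are assumed to be precomputed, and this is not the authors' implementation:
```python
# Sketch: given per-sample perturbation directions that each alter one sample's
# gradient-based saliency map, take the dominant singular direction of the
# stacked directions as a single universal perturbation, rescaled to the budget.
import numpy as np

def pca_upi(per_sample_dirs: np.ndarray, eps: float) -> np.ndarray:
    """per_sample_dirs: (n_samples, d) flattened perturbation directions."""
    _, _, vt = np.linalg.svd(per_sample_dirs, full_matrices=False)
    upi = vt[0]                               # top right-singular vector
    return eps * upi / np.linalg.norm(upi)    # scale to the L2 budget

rng = np.random.default_rng(0)
upi = pca_upi(rng.normal(size=(100, 3 * 32 * 32)), eps=0.5)
```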
Collaboration on a paper: Back to the Future: Toward a Hybrid Architecture for Ad Hoc Teamwork (AAAI 2023)
Authors: Hasra Dodampegama, Mohan Sridharan
Abstract: State-of-the-art methods for ad hoc teamwork, i.e., for collaboration without prior coordination, often use a long history of prior observations to model the behavior of other agents (or agent types) and to determine the ad hoc agent's behavior. In many practical domains, it is difficult to obtain large training datasets, and necessary to quickly revise the existing models to account for changes in team composition or domain attributes. Our architecture builds on the principles of step-wise refinement and ecological rationality to enable an ad hoc agent to perform non-monotonic logical reasoning with prior commonsense domain knowledge and models learned rapidly from limited examples to predict the behavior of other agents. In the simulated multiagent collaboration domain Fort Attack, we experimentally demonstrate that our architecture enables an ad hoc agent to adapt to changes in the behavior of other agents, and provides enhanced transparency and better performance than a state-of-the-art data-driven baseline.
I helped explore different heuristic methods, implemented in Python.
Collaboration on a paper: The Psychological Effects of the Home Environment during Self-Quarantine: a Web-based Cross-Sectional Survey in Iran (IJAUP 2023)
Authors: Jamal-E-Din Mahdi Nejad, Hamidreza Azemati, Seyede Fereshteh Ehsani Oskouei, and Zinat Aminifar
Abstract: During the COVID-19 outbreak in Iran, self-quarantine was a measure to slow the spread of the infection. We conducted this cross-sectional study to explore the psychological effects of the home environment while people had to stay at home for a long time; 536 individuals took part in the survey. Data were collected via an online questionnaire with three sections: (1) demographic characteristics and general information; (2) home environment features; and (3) negative psychological experiences (NPE), defined as (a) feelings of sadness and depression, (b) feelings of stress and anxiety, and (c) experiencing domestic violence during quarantine. For data analysis, we first present descriptive information about the participants and then use a logistic regression model, a standard machine-learning classification algorithm, to investigate the association between home environment features and NPE during self-quarantine. The results indicate that the home environment affects NPE differently among men and women. Generally, individuals who were more satisfied with their home's performance during quarantine, who considered the lighting quality of their home appropriate, and who experienced less noise disturbance had a better mood during this period. Conversely, the lack of opportunity for indoor exercise and the feeling of living in a crowded house increased the level of NPE.
I helped with implementing the methods in Python (using Scikit-learn and Pandas).
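For illustration, a minimal version of that analysis might look like the sketch below; the column names and data are made up, not the survey's actual variables:
```python
# Sketch: logistic regression relating home-environment features to a binary
# negative-psychological-experience (NPE) indicator.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Placeholder data standing in for the survey responses.
df = pd.DataFrame({
    "light_quality": [1, 0, 1, 1, 0, 1, 0, 0],
    "noise_disturbance": [0, 1, 0, 1, 1, 0, 1, 1],
    "crowded": [0, 1, 0, 0, 1, 1, 1, 0],
    "npe": [0, 1, 0, 0, 1, 1, 1, 1],
})
clf = LogisticRegression().fit(df.drop(columns="npe"), df["npe"])
print(dict(zip(df.columns[:-1], clf.coef_[0])))  # direction/strength of each association
```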
Experience:
Gen AI Contributor at Scale AI + Outlier
May 2025 - December 2026
Description: Dataset generation, data quality improvements, performance evaluation on LLMs, and training MLE agents for phase 4 of AIRA.
AI Trainer (Machine Learning Expert) at Handshake AI + OpenAI
July 2025 - September 2025
Description: Developing and answering machine learning prompts, and evaluating responses from LLMs.
Research Intern at Bright Flourishing Health
June 2025 - August 2025
Description: Correcting exercise posture and performing pose estimation through video processing and frequency domain denoising.
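A hedged sketch of the frequency-domain denoising step; the sampling rate, cutoff, and keypoint layout are assumptions, not the internship's code:
```python
# Sketch: low-pass filter each joint's trajectory with an FFT so posture
# metrics are computed on smoothed keypoints rather than noisy per-frame detections.
import numpy as np

def lowpass_keypoints(traj: np.ndarray, fps: float = 30.0, cutoff_hz: float = 3.0):
    """traj: (n_frames, n_joints, 2) pixel coordinates over time."""
    spec = np.fft.rfft(traj, axis=0)
    freqs = np.fft.rfftfreq(traj.shape[0], d=1.0 / fps)
    spec[freqs > cutoff_hz] = 0                       # drop high-frequency jitter
    return np.fft.irfft(spec, n=traj.shape[0], axis=0)

# e.g. smoothed = lowpass_keypoints(raw_keypoint_trajectories)
```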
Research Assistant and Instructor at the University of California, Los Angeles
September 2023 - Present
Supervisor: Prof. Majid Sarrafzadeh
Junior Research Assistant at the Chinese University of Hong Kong & Imperial College London
July 2023 - September 2023
Supervisor: Prof. Farzan Farnia and Prof. Seyed Mohsen Moosavi-Dezfooli
Description: In this research, we aimed to discover a universal adversarial perturbation for the interpretation (UPI) of neural networks while ensuring that the perturbations do not alter the classification decisions of the models.
Undergraduate Research Assistant at the University of Birmingham
March 2022 - April 2023
Supervisor: Prof. Mohan Sridharan
Description: Our goal in this research was to optimize an ad hoc teamwork (AHT) problem in the FortAttack domain. An important observation in AHT is that deep neural networks have not demonstrated a significant advantage over simpler machine learning approaches, so we employed Multinomial Logistic Regression with the STEW loss function. This approach allowed us to effectively model and improve the performance of the AHT system in the FortAttack domain.
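A hedged sketch of that modeling choice; the data shapes, penalty strength, and my reading of STEW as a shrinkage-toward-equal-weights penalty are assumptions, not the project's code:
```python
# Sketch: multinomial logistic regression trained with a STEW-style penalty
# that shrinks feature weights toward each other rather than toward zero.
import torch
import torch.nn.functional as F

def stew_penalty(W):
    """Sum of squared pairwise differences between feature weights, per class."""
    diffs = W.unsqueeze(2) - W.unsqueeze(1)   # (classes, features, features)
    return (diffs ** 2).sum() / 2

n_features, n_classes = 12, 4
X = torch.randn(512, n_features)              # placeholder feature matrix
y = torch.randint(0, n_classes, (512,))       # placeholder behavior labels
W = torch.zeros(n_classes, n_features, requires_grad=True)
b = torch.zeros(n_classes, requires_grad=True)
opt = torch.optim.Adam([W, b], lr=0.05)

for _ in range(200):
    loss = F.cross_entropy(X @ W.T + b, y) + 1e-3 * stew_penalty(W)
    opt.zero_grad(); loss.backward(); opt.step()
```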
Intern at Iran Telecommunication Research Center (ITRC)
Computer vision researcher at AI labs, October 2022 - February 2023
Supervisor: Prof. Mohammad-Shahram Moin
Description: Worked on Liveness Detection, Biometrics, and Face Detection using Optical Flow.
Bachelor Thesis
October 2021 - February 2023
Supervisor: Prof. Shohreh Kasaei
Description: In this project, I studied the effects of attention layers and deformable convolutions on optical flow estimation. I explored different loss functions and investigated the robustness of scene flow estimation networks against adversarial attacks.
Summer Intern at the Chinese University of Hong Kong
July 2022 - October 2022
Supervisor: Prof. Farzan Farnia
Description: In this work, we focused on finding a universal adversarial perturbation for the interpretation (UPI) of neural networks. To design such a UPI, we proposed a gradient-based optimization method as well as a principal component analysis (PCA)-based approach to compute a UPI that can effectively alter a neural network's gradient-based interpretation across different samples.
Undergraduate Research Assistant at Sharif University of Technology
March 2021 - December 2021
Supervisor: Hamid R. Rabiee
Description: In this project, we focused on cancer detection by employing representation learning and semantic segmentation techniques on CennaLab's dataset. We explored different models, including ResNet-50 for supervised learning and SwAV for unsupervised learning, to improve the accuracy of the detection system. For this purpose, we leveraged the strengths of SwAV by integrating it as the encoder component of the U-Net architecture, aiming to enhance the efficiency and performance of the segmentation task. By combining representation learning and semantic segmentation approaches, we aimed to develop an effective and efficient cancer detection system that can contribute to advancements in medical imaging and diagnosis.
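A rough sketch of that architectural idea, assuming PyTorch/torchvision; the decoder layout and channel widths are illustrative, and loading actual SwAV weights is omitted:
```python
# Sketch: a ResNet-50 encoder (which could be initialized from SwAV
# self-supervised weights) inside a small U-Net-style decoder for segmentation.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

class ResNetUNet(nn.Module):
    def __init__(self, n_classes=1):
        super().__init__()
        r = resnet50(weights=None)            # load a SwAV checkpoint here instead
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu)   # 1/2 scale, 64 ch
        self.pool = r.maxpool
        self.enc1, self.enc2 = r.layer1, r.layer2            # 256 ch, 512 ch
        self.enc3, self.enc4 = r.layer3, r.layer4             # 1024 ch, 2048 ch
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec3 = nn.Conv2d(2048 + 1024, 512, 3, padding=1)
        self.dec2 = nn.Conv2d(512 + 512, 256, 3, padding=1)
        self.dec1 = nn.Conv2d(256 + 256, 64, 3, padding=1)
        self.head = nn.Conv2d(64, n_classes, 1)

    def forward(self, x):
        s = self.stem(x)                      # 1/2
        e1 = self.enc1(self.pool(s))          # 1/4
        e2 = self.enc2(e1)                    # 1/8
        e3 = self.enc3(e2)                    # 1/16
        e4 = self.enc4(e3)                    # 1/32
        d3 = torch.relu(self.dec3(torch.cat([self.up(e4), e3], dim=1)))
        d2 = torch.relu(self.dec2(torch.cat([self.up(d3), e2], dim=1)))
        d1 = torch.relu(self.dec1(torch.cat([self.up(d2), e1], dim=1)))
        logits = self.head(d1)                # 1/4-resolution mask logits
        return F.interpolate(logits, size=x.shape[-2:], mode="bilinear", align_corners=False)

# e.g. ResNetUNet()(torch.rand(1, 3, 224, 224)) -> mask logits of shape (1, 1, 224, 224)
```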