LLM in a flash: Efficient Large Language Model Inference with Limited Memory [arXiv]
K Alizadeh, I Mirzadeh, D Belenko, K Khatamifard, M Cho, CC Del Mundo, M Rastegari, M Farajtabar
arXiv preprint arXiv:2312.11514, 2023
Weight Subcloning: Directly Initializing Transformers using Larger Pretrained Models [arXiv]
M Samragh, M Farajtabar, F Faghri, R Vemulapalli, S Mehta, O Tuzel, D Naik, M Rastegari
arXiv preprint arXiv:2312.09299, 2023
ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models [arXiv]
I Mirzadeh, K Alizadeh, S Mehta, CC Del Mundo, O Tuzel, G Samei, M Rastegari, M Farajtabar
NeurIPS Workshop on Efficient Natural Language and Speech Processing, 2023
TiC-CLIP: Continual Training of CLIP Models [arXiv]
S Garg, M Farajtabar, H Pouransari, R Vemulapalli, S Mehta, O Tuzel, F Faghri
NeurIPS Workshop on Distribution Shifts: New Frontiers with Foundation Models, 2023
CLIP meets Model Zoo Experts: Pseudo-Supervision for Visual Enhancement [arXiv]
M Salehi, M Farajtabar, M Horton, F Faghri, H Pouransari, R Vemulapalli, A Farhadi, O Tuzel, M Rastegari, S Mehta
NeurIPS Workshop on Unifying Representations in Neural Models, 2023
Label-efficient Training of Small Task-specific Models by Leveraging Vision Foundation Models [arXiv]
R Vemulapalli, H Pouransari, F Faghri, S Mehta, M Farajtabar, M Rastegari, O Tuzel
arXiv preprint arXiv:2311.18237, 2023
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding [arXiv]
H Wang, PKA Vasu, F Faghri, R Vemulapalli, M Farajtabar, S Mehta, M Rastegari, O Tuzel, H Pouransari
NeurIPS Workshop on Unifying Representations in Neural Models, 2023
On the Efficacy of Multi-scale Data Samplers for Vision Applications [arXiv]
E Nunez, T Merth, A Prabhu, M Farajtabar, M Rastegari, S Mehta, M Horton
arXiv preprint arXiv:2309.04502, 2023
Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement [arXiv]
F Faghri, H Pouransari, S Mehta, M Farajtabar, A Farhadi, M Rastegari, O Tuzel
The IEEE International Conference on Computer Vision (ICCV), 2023
An Empirical Study of Implicit Regularization in Deep Offline RL [arXiv]
C Gulcehre, S Srinivasan, J Sygnowski, G Ostrovski, M Farajtabar, M Hoffman, R Pascanu, A Doucet
Transactions on Machine Learning Research (TMLR), 2022
Architecture Matters in Continual Learning [arXiv]
I Mirzadeh, A Chaudhry, D Yin, T Nguyen, R Pascanu, D Gorur, M Farajtabar
arXiv preprint arXiv:2202.00275, 2022
Efficient Continual Learning in Neural Network Subspaces [arXiv]
T Doan, I Mirzadeh, J Pineau, M Farajtabar
Conference on Lifelong Learning Agents (COLLAs), 2023
Wide Neural Networks Forget Less Catastrophically [arXiv]
I Mirzadeh, A Chaudhry, D Yin, H Hu, R Pascanu, D Gorur, M Farajtabar
International Conference on Machine Learning (ICML), 2022
Linear Mode Connectivity in Multitask and Continual Learning [paper] [code] [video]
I Mirzadeh*, M Farajtabar*, D Gorur, R Pascanu, H Ghasemzadeh
International Conference on Learning Representations (ICLR), 2021
Balance Regularized Neural Network Models for Causal Effect Estimation [arXiv] [slides] [video]
M Farajtabar, A Lee, Y Feng, V Gupta, P Dolan, H Chandran, M Szummer
Causal Discovery & Causality-Inspired Machine Learning Workshop (NeurIPS), 2020
Understanding the Role of Training Regimes in Continual Learning [paper] [arXiv] [slides] [code] [video]
I Mirzadeh, M Farajtabar, R Pascanu, H Ghasemzadeh
Neural Information Processing Systems (NeurIPS), 2020
Self-distillation Amplifies Regularization in Hilbert Space [paper] [arXiv] [video]
H Mobahi, M Farajtabar, PL Bartlett
Neural Information Processing Systems (NeurIPS), 2020
A Maximum-entropy Approach to Off-policy Evaluation in Average-reward MDPs [paper] [arXiv]
N Lazic, D Yin, M Farajtabar, N Levine, D Gorur, C Harris, D Schuurmans
Neural Information Processing Systems (NeurIPS), 2020
Learning to Incentivize Other Learning Agents [paper] [arXiv] [code]
J Yang, A Li, M Farajtabar, P Sunehag, E Hughes, H Zha
Neural Information Processing Systems (NeurIPS), 2020
Orthogonal Gradient Descent for Continual Learning [paper] [arXiv] [slides]
M Farajtabar, N Azizan, A Mott, A Li
The International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Task-agnostic Continual Learning with Hybrid Probabilistic Models
P Kirichenko, M Farajtabar, D Rao, B Lakshminarayanan, N Levine, A Li, H Hu, A Wilson, R Pascanu
INNF+: Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models (ICML), 2021
The Effectiveness of Memory Replay in Large Scale Continual Learning [arXiv]
Y Balaji, M Farajtabar, D Yin, A Mott, A Li
Workshop on Continual Learning in Computer Vision (CVPR), 2021
Optimization and Generalization of Regularization-Based Continual Learning: a Loss Approximation Viewpoint [arXiv]
D Yin, M Farajtabar, A Li, N Levine, A Mott
Workshop on Continual Learning (ICML), 2021
Adapting Auxiliary Losses using Gradient Similarity [arXiv]
Y Du, WM Czarnecki, SM Jayakumar, M Farajtabar, R Pascanu, B Lakshminarayanan
arXiv preprint arXiv:1812.02224, 2018
Dropout as an Implicit Gating Mechanism for Continual Learning [paper] [code] [video]
I Mirzadeh, M Farajtabar, H Ghasemzadeh
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2020
Improved Knowledge Distillation via Teacher Assistant [paper] [arXiv] [code] [poster] [video]
I Mirzadeh*, M Farajtabar*, A Li, N Levine, A Matsukawa, H Ghasemzadeh
The AAAI Conference on Artificial Intelligence (AAAI), 2020
DyRep: Learning Representations over Dynamic Graphs [paper]
R Trivedi, M Farajtabar, P Biswal, H Zha
The International Conference on Learning Representations (ICLR), 2019
Cross-View Policy Learning for Street Navigation [paper]
A Li, H Hu, P Mirowski, M Farajtabar
The IEEE International Conference on Computer Vision (ICCV), 2019
Learning Time Series Associated Event Sequences with Recurrent Point Process Networks [paper]
S Xiao, J Yan, M Farajtabar, L Song, X Yang, H Zha
IEEE Transactions on Neural Networks and Learning Systems, 2019
Modeling Behaviors and Lifestyle with Online and Social Data for Predicting and Analyzing Sleep and Exercise Quality [paper]
M Farajtabar, E Kıcıman, G Nathan, RW White
International Journal of Data Science and Analytics, 2019
More Robust Doubly Robust Off-policy Evaluation [paper]
M Farajtabar*, Y Chow*, M Ghavamzadeh
International Conference on Machine Learning (ICML), 2018