State-of-the-art AI models are powerful, but they are also storage- and/or computation-intensive. Improving AI efficiency, with respect to different metrics (e.g., speed, energy, and accuracy) and from different perspectives (e.g., algorithm, compiler, hardware, and system), is therefore important and in high demand.
As an integrated part of the "Introduction to Deep Learning", "VLSI Design", and "Computer Architecture" courses in the Rutgers ECE department, this series of online seminar talks aims to introduce cutting-edge research progress on efficient AI to Rutgers students and faculty, as well as to members of the public with a general interest in AI efficiency.
Upcoming REFAI Talks in 2025 Fall
10/30/25 11AM EST, "Opportunities and Challenges for In- and Near-Memory Computing for Machine Intelligence", Prof. Siddharth Joshi, University of Notre Dame, Zoom Link
11/06/25 11AM EST, TBD, Prof. Dong Li, University of California Merced
11/13/25 11AM EST, TBD, Prof. Jeff Zhang, Arizona State University
11/20/25 11AM EST, TBD, Prof. Haitong Li, Purdue University
12/11/25 11AM EST, TBD, Prof. Bahar Asgari, University of Maryland
Past REFAI Talks (with Recorded Youtube Videos)
Talks in 2025 Fall
10/16/25 11AM EST, "AI-driven Analog Circuit Design Automation: from Efficient Inverse Design to Novel Topology Discovery", Prof. Weidong Cao, George Washington University
10/10/25 11AM EST, "High Performance and Secure Edge Computing Using Ferroelectric FET", Prof. Kai Ni, University of Notre Dame
Talks in 2025 Spring
05/02/25 10AM EST, "A Case for the KV Cache Layer: Enabling the Next Phase of Fast Distributed LLM Serving", Yuhan Liu, University of Chicago, Youtube Link
04/28/25 10AM EST, "Hardware/Software Co-Design for Efficient Acceleration on CGRAs", Dr. Cheng Tan, ASU/Google, Youtube Link
04/22/25 11AM EST, "Fast Video Generation with Sliding Tile Attention", Prof. Hao Zhang, UCSD/Snowflake, Youtube Link
02/24/25 11AM EST, "Intelligent Software in the Era of Deep Learning", Prof. Yuke Wang, Rice University/Amazon, Youtube Link
02/17/25 10AM EST, "Forging the Pathways towards Truly Efficient AI — From Extending to Beyond Moore’s Law", Prof. Tong Geng, University of Rochester
02/11/25 11AM EST, "Partition Is All You Need", Dr. Cheng Luo, California Institute of Technology, Youtube Link
Talks in 2024 Fall
12/10/24 11AM EST, "Brains on Light: Silicon Photonics for Deep Learning", Prof. Mahdi Nikdast, Colorado State University
11/26/24 10AM EST, "Efficient Programming on Heterogeneous Accelerators", Prof. Peipei Zhou, Brown University, Youtube Link
11/07/24 11AM EST, "A Systematic and Rapid Approach to Design Space Exploration for Tensor Accelerators", Dr. Jenny Huang, Nvidia Research, Youtube Link
10/31/24 11AM EST, "ML Workloads in AR/VR and Their Implication to the ML System Design", Prof. Hyoukjun Kwon, University of California, Irvine, Youtube Link
10/24/24 11AM EST, "The Role of AI in Physical Design", Prof. Vidya Chhabria, Arizona State University, Youtube Link
10/03/24 11AM EST, "Sparse Linear Algebra Acceleration on High Bandwidth Memory FPGAs", Prof. Linghao Song, Yale University, Youtube Link
Talks in 2024 Spring
04/30/24 10AM EST, "Embracing Machine Learning for System Optimization", Dr. Amir Yazdanbakhsh, Google DeepMind
04/16/24 1PM EST, "ML for ML Compilers at Google", Dr. Phitchaya Mangpo Phothilimthana, Google DeepMind, Youtube Link
04/09/24 10AM EST, "Pre‐training Language Models with Less Compute", Prof. Danqi Chen, Princeton University
03/26/24 10AM EST, "Enable Edge Intelligence via Scalable and Efficient Federated Learning", Prof. Ang Li, University of Maryland, College Park, Youtube Link
03/19/24 1PM EST, "Parameter Efficient LLM Inference with LoRAX", Travis Addair, Predibase, Youtube Link
02/20/24 10AM EST, "A Computer Engineering Journey to Optical Neural Networks: Infrastructure, Algorithms, and Co-design", Prof. Cunxi Yu, University of Maryland, College Park, Youtube Link
02/13/24 10AM EST, "Demystify Efficient and Accountable Large Language Models from a SMoE Perspective", Prof. Tianlong Chen, University of North Carolina at Chapel Hill, Youtube Link
Talks in 2023 Fall
12/05/23 10AM EST, "Light-AI Interaction: Bridging Photonics and Artificial Intelligence via Cross-Layer Hardware / Software Co-Design", Prof. Jiaqi Gu, Arizona State University, Youtube Link
11/28/23 10AM EST, "Probabilistic Computing with p-bits: Optimization, Machine Learning and Quantum Simulation", Prof. Kerem Çamsarı, University of California, Santa Barbara, Youtube Link
11/21/23 10AM EST, "Hyperdimensional Computing for Efficient and Robust Cognitive Learning", Prof. Mohsen Imani, University of California, Irvine
11/14/23 10AM EST, "Designing Physics-Inspired Computational Platforms to Solve Hard Combinatorial Optimization", Prof. Nikhil Shukla, University of Virginia, Youtube Link
11/07/23 10AM EST, "Hardware-aware Algorithms for Language Modeling", Prof. Tri Dao, Princeton University, Youtube Link
10/31/23 10AM EST, "Boosting Performance and Scalability of Distributed Deep Learning Systems via Efficient Data Management", Prof. Dingwen Tao, Indiana University Bloomington, Youtube Link
10/24/23 10AM EST, "Ultra Low-Power Machine Learning via Hardware and Software Co-design", Prof. Mingu Kang, University of California, San Diego, Youtube Link
10/10/23 10AM EST, "Analytics and Machine Learning Systems on Graphs", Prof. Xuehai Qian, Purdue University, Youtube Link
10/03/23 10AM EST, "A New Paradigm of Efficiency, Security and Privacy by Design for Intelligent Data Processing", Prof. Wujie Wen, North Carolina State University, Youtube Link
Talks in 2023 Spring
04/20/23 11AM EST, "Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time", Prof. Beidi Chen, Carnegie Mellon University Youtube Link
03/30/23 1PM EST, "Efficient Trillion Parameter Scale Training and Inference with DeepSpeed", Dr. Samyam Rajbhandari and Dr. Jeff Rasley, Microsoft Youtube Link
03/23/23 11AM EST, "High-Performance Spectral Methods for Scalable Graph Learning, Adversarial Robustness Evaluation, and Electronic Design Automation", Prof. Zhuo Feng, Stevens Institute of Technology Youtube Link
03/02/23 11AM EST, "Efficient Neural Networks and the Practical Mobile Applications", Dr. Jian Ren, SNAP Research Youtube Link
02/23/23 11AM EST, "Neural Architecture Search for Scientific Machine Learning with Quantified Uncertainty", Prof. Romit Maulik, Argonne National Laboratory/Pennsylvania State University Youtube Link
02/16/23 11AM EST, "Testing Accuracy is Not All You Need: Less Training Cost & More Testing Reliability", Prof. Dongkuan Xu, North Carolina State University Youtube Link
02/02/23 11AM EST, "Fully-Homomorphic-Encryption-based Privacy-Preserving Machine Learning", Prof. Lei Jiang, Indiana University Bloomington Youtube Link Slides
Talks in 2022 Fall
12/15/22 11AM EST, "Systems for End-to-End Private Inference", Prof. Brandon Reagen, New York University
12/01/22 11AM EST, "Fast and Hardware-Aware Neural Architecture Search", Prof. Mohamed Abdelfattah, Cornell University Youtube Link
11/17/22 11AM EST, "Hardware Generator for Efficient Deep Learning Inference", Dr. Rangharajan Venkatesan, Nvidia Research Slides Link
11/10/22 11AM EST, "Towards a Benchmark for Automotive Computing", Dr. Tom St. John, Cruise Youtube Link
10/24/22 11AM EST, "Towards Functional Safety of Deep Learning Hardware Accelerators", Prof. Kanad Basu, University of Texas at Dallas Youtube Link
10/20/22 11AM EST, "Low latency, Efficient Speech Recognition for the Edge", Yuan Shangguan, META Research Youtube Link
10/06/22 11AM EST, "Visual Computing: A Horizontal Approach", Prof. Yuhao Zhu, University of Rochester Youtube Link
09/26/22 11AM EST, "Boosting Deep Learning Accelerators with General Purpose Computing and Cognitive Reasoning", Prof. Jie Gu, Northwestern University Youtube Link
09/22/22 11AM EST, "Towards Efficient AI Systems", Dr. Xiaofan Zhang, Google Youtube Link
Talks in 2022 Spring
05/10/22 11AM EST, "Democratizing TinyML", Prof. Vijay Janapa Reddi, Harvard University Youtube Link
05/03/22 2PM EST, "Federated and Distributed Machine Learning at Scale: From Systems to Algorithms to Applications", Dr. Chaoyang He, FedML Inc. Youtube Link
04/19/22 11AM EST, "From Neural Architecture Search to Data-Centric AutoML", Prof. Mi Zhang, Michigan State University Youtube Link
04/05/22 11AM EST, "Reducing Longform Errors in End2End Speech Recognition", Dr. Liangliang Cao, Google AI Youtube Link
03/29/22 11AM EST, "Towards Effective and Efficient Interpretation of Deep Neural Networks: Algorithms and Applications", Prof. Xia Hu, Rice University Youtube Link
03/22/22 11AM EST, "Toward Efficiently Solving Long-Horizon Robotic Manipulation Tasks", Prof. Jingjin Yu, Rutgers University Youtube Link
03/15/22 11AM EST "Self-Supervised Learning for Robotics: A Tale of Baymax and Wall-E", Prof. Chen Feng, New York University Youtube Link
03/08/22 11AM EST "Pushing NLP to the Edge", Prof. Alexander Rush, Cornell University Youtube Link
03/01/22 11AM EST "Resource-Efficient Execution of Deep Learning Computations", Dr. Deepak Narayanan, Microsoft Research Youtube Link
02/23/22 11AM EST "Real-Time DNN Execution on Mobile Devices with Compiler Optimizations", Prof. Bin Ren, The College of William & Mary Youtube Link
02/15/22 11AM EST "Building Computing Systems for Autonomous Machines", Dr. Shaoshan Liu, PerceptIn Youtube Link
Talks in 2021 Fall
12/09/21 11AM EST "Teaching AI the game of AI Accelerator Design", Prof. Tushar Krishna, Georgia Institute of Technology Youtube Link
12/02/21 11AM EST "Exploring Autonomous Edge Intelligence in the Analog Domain", Prof. Xuan Zhang, Washington University in St. Louis Youtube Link
11/18/21 11AM EST "Pruning on the Fly: Efficient Deep Learning with Adaptive Fine-Grained Optimizations", Prof. Zhiru Zhang, Cornell University Youtube Link
11/11/21 11AM EST "Energy-Efficient AI ASIC Designs: CNN Accelerator with Conditional Computing and LSTM Accelerator with Hierarchical Structured Sparsity", Prof. Jae-Sun Seo, Arizona State University Youtube Link
11/04/21 11AM EST "Neuromorphic Computing: Bridging the Gap between Nanoelectronics, Neuroscience and Machine Learning", Prof. Abhronil Sengupta, Pennsylvania State University Youtube Link
10/21/21 11AM EST "CHIMERA: Efficient DNN Training and Inference at the Edge with On-Chip Resistive RAM", Prof. Priyanka Raina, Stanford University Youtube Link
10/14/21 11AM EST "Towards Understanding the Complex Interplay between ML and Systems", Dr. Murali Krishna Emani, Argonne National Laboratory Youtube Link
10/07/21 11AM EST "Circuit and System Innovations Towards Efficient Processing-In-Memory Accelerators for AI and Security", Prof. Kaiyuan Yang, Rice University Youtube Link
09/30/21 11AM EST "Circuit Design and Silicon Prototypes for Compute-in-Memory for Deep Learning Inference Engine", Prof. Shimeng Yu, Georgia Institute of Technology Youtube Link
09/23/21 11AM EST "How Powerful are Graph Neural Networks and Reinforcement Learning in EDA: a case study in High Level Synthesis", Prof. Cong Hao, Georgia Institute of Technology Youtube Link
09/16/21 11AM EST "Hardware/Software Co-Design of Deep Learning Accelerators" from Prof. Yiyu Shi, University of Notre Dame Youtube Link
Talks in 2021 Summer
08/31/21 11AM EST "Efficient AI via Extreme Network Quantization and Binarization" from Dr. Adrian Bulat, Samsung AI Research Youtube Link
08/24/21 11AM EST "Memory- and Energy-efficient On-device Training via Tensor Computation: From Algorithms to Hardware" from Prof. Zheng Zhang, University of California, Santa Barbara Youtube Link
08/17/21 11AM EST "Dynamic Neural Networks for Efficient Inference" from Prof. Gao Huang, Tsinghua University Youtube Link
08/10/21 11AM EST "Label-Efficient Learning of Vision Transformers" from Dr. Boqing Gong, Google Research Slides Link
07/20/21 11AM EST "Efficient Always-on Computer Vision" from Dr. Aravind Natarajan, Qualcomm AI Research Youtube Link
07/13/21 11AM EST "Efficient Deep Learning Training and Inference: Reduced-Precision and Model Compression" from Dr. Chia-yu Chen, IBM Research Youtube Link
07/06/21 11AM EST "The lottery ticket hypothesis for gigantic pre-trained models" from Prof. Zhangyang Wang, University of Texas, Austin Youtube Link
06/29/21 11AM EST "Efficient Deep Learning - on Automated Design, Distributed Training and Edge Inference" from Dr. Wei Wen, Facebook AI Research Youtube Link
06/22/21 11AM EST "The Lottery Ticket Hypothesis: On Sparse, Trainable Neural Networks" from Jonathan Frankle, MIT
06/15/21 11AM EST "Convolutional Tensor-Train LSTM for Spatio-Temporal Learning" from Dr. Wonmin Byeon, Nvidia Research Youtube Link
06/08/21 11AM EST "Transformer efficiency: from model compression to training acceleration" from Dr. Yu Cheng, Microsoft Research Youtube Link
06/01/21 11AM EST "Pushing the Limits of Compression using Kronecker Products and Doping" from Urmish Thakker, SambaNova Systems Youtube Link
Talks in 2021 Spring
04/27/21 2PM EST "Efficient Audio-Visual Understanding on AR Devices" from Dr. Meng Li, Facebook AR/VR Research Youtube Link
04/20/21 2PM EST "Tiny Model Design" from Dr. Igor Fedorov, ARM ML Research Youtube Link
04/13/21 2PM EST "Programming Neuromorphic Hardware for Fast, Efficient and Online learning" from Prof. Emre Neftci, University of California, Irvine Youtube Link
04/06/21 2PM EST "Systematic Quantization and Pruning for Efficient Neural Networks" from Dr. Amir Gholami, University of California, Berkeley Youtube Link
03/30/21 2PM EST "Energy-Efficient, Robust and Interpretable Neuromorphic Computing through Algorithm-Hardware Co-Design" from Prof. Priyadarshini Panda, Yale University Youtube Link
03/23/21 2PM EST "Efficient DNN Algorithms, Accelerators, and Automated Tools Towards Ubiquitous on-Device Intelligence and Green AI" from Prof. Yingyan Lin, Rice University Youtube Link
03/11/21 2PM EST "Secure and Efficient Deep Learning Computing System - A Software and Hardware Co-Design Perspective" from Prof. Deliang Fan, Arizona State University Youtube Link
03/02/21 2PM EST "Towards Best Possible Deep Learning Acceleration on the Edge – A Compression-Compilation Co-Design" from Prof. Yanzhi Wang, Northeastern University Youtube Link