ML Efficiency
Channel: #ml-efficiency
Co-leads:
Viraat - @viraat on Discord, @viraataryabumi on Twitter
Harsha - @majormelancholy on Discord, @Sree_Harsha_N on Twitter
Previous leads:
Bhavnick - @bhavnicksm on Discord, @BhavnickMinhas on Twitter
Recent Presentations
May 24, 2024
May 10, 2024
April 12, 2024
March 1, 2024
February 19, 2024
Join us to explore the theme of Transformer Inference Optimization.
After last session's paper on LLM.int8(), this week we will be going through "SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression" by Dettmers et al.
The wonderful @Srishti Gureja will be presenting again!
"On the Efficacy of Knowledge Distillation" by Cho and Hariharan (2019)
Slides
Knowledge Distillation and the paper "Model Compression" by Caruana et al.
Materials from all past sessions