BIRDS of Cohere Labs
(Beginners In Research-Driven Studies)
Channel: #beginners-in-research
Hello, fellow learners! 👋
Welcome to BIRDS (Beginners In Research-Driven Studies) of Cohere Labs, where we're all about learning to soar in the world of research.
This community is dedicated to sparking a passion for research among beginners.
Whether you’re just starting or looking to hone your skills, we’re here to support your journey. Together, we can grow and learn more effectively. Welcome to the channel!
Co-leads:
Akanksha – @akankshanc on Discord, Twitter: @akankshanc
Caroline (Past Co-Lead, Jan 2024 – Dec 2024) – @c.s1693 on Discord, LinkedIn: Caroline Shamiso
Herumb (Past Co-Lead, Feb 2023 – Aug 2024) – @krypticmouse on Discord, Twitter: @krypticmouse
Reza (Past Co-Lead, Feb 2023 – Feb 2024) – @rzsgrt#9099 on Discord, Twitter: @rzsgrt
Goals:
Introduce beginners to the exciting world of research.
Cultivate a supportive community where questions, ideas, and discussions are always welcome.
Share resources and opportunities to help members grow as researchers.
Organize events like workshops and seminars to inspire and educate.
Logistics:
Primary communication is through this Discord channel, #beginners-in-research.
Weekly Meetings: Fridays at 1:00 PM ET via Google Meet
Looking forward to exploring research together!
🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦
LLM Cohort
All the material and session video recordings can be found on this PAGE
🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦
AI Alignment Cohort, based on the ARENA 3.0 curriculum designed by Callum McDougall
This cohort was presented in collaboration with the Safety and Alignment group at C4AI
All Video Recordings can be found on this PAGE
All Cohort Material (Videos, Assignments, Notebooks) can be accessed through this GitHub Repo
🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦
CUDA Programming Mini-Cohort
CUDA Cohort Kick-Off Call - April 5, 2024
Introduction to Concurrency - April 13, 2024
Introduction to Concurrent Programming - April 19, 2024
Understanding your GPU & CUDA Programming Fundamentals - April 26, 2024
Advanced CUDA Programming - May 3, 2024
Writing Kernels for PyTorch - May 10, 2024
🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦
Paper Implementation Sprints
Four-week sprints based on relevant ML papers, with the following structure:
Week 1 - Paper Reading led by a volunteer
Week 2 - Paper Implementation led by a volunteer
Week 3 - Guest Speaker Session
Week 4 - Community Spotlight led by a volunteer, where sprint attendees share their own implementations
PAPER #8: Toy Models of Superposition led by Akanksha in collaboration with ML Theory group lead Martina
Toy Models of Superposition Paper Sprint Week 1 led by Akanksha
Toy Models of Superposition Paper Sprint Week 2 led by Martina
Toy Models of Superposition Paper Sprint Week 3 led by Akanksha
PAPER #7: Mixture of Experts led by Reza
Mixture of Experts Blog Sprint Week 1 led by Reza
Mixture of Experts Blog Sprint Week 2 led by Reza
PAPER #6: MISTRAL 7B led by Anier and Herumb
MISTRAL 7B Paper Sprint Week 1 led by Anier
MISTRAL 7B Paper Sprint Week 2 led by Herumb
PAPER #5: LoRA (Low-Rank Adaptation of Large Language Models) led by Shivalika
LoRA (Low-Rank Adaptation of Large Language Models) Paper Sprint Week 1 led by Shivalika
LoRA (Low-Rank Adaptation of Large Language Models) Paper Sprint Week 2 led by Shivalika
LoRA (Low-Rank Adaptation of Large Language Models) Paper Sprint Week 4 led by Shivalika
PAPER #4: ZeRO: Memory Optimizations Toward Training Trillion Parameter Models led by Herumb
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models Paper Sprint Week 1 led by Herumb
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models Paper Sprint Week 2 led by Herumb
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models Paper Sprint Week 4 led by Herumb
PAPER #3: Zoom In - An Introduction to Circuits led by Akanksha
Zoom in: Introduction to Circuits Paper Sprint Week 1 led by Akanksha
Zoom in: Introduction to Circuits Paper Sprint Week 2 led by Akanksha
Zoom in: Introduction to Circuits Paper Sprint Week 3 led by Akanksha
Zoom in: Introduction to Circuits Paper Sprint Week 4 Community Spotlight led by Akanksha
PAPER #2: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT) led by Reza
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT) Paper Sprint Week 1 led by Reza
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT) Paper Sprint Week 2 led by Reza
PAPER #1: Parameter Efficient Transfer Learning for NLP led by Herumb
Parameter Efficient Transfer Learning for NLP Paper Sprint Week 1 led by Herumb
Parameter Efficient Transfer Learning for NLP Paper Sprint Week 2 led by Herumb
Parameter Efficient Transfer Learning for NLP Paper Sprint Week 4 led by Herumb
🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦
Guest Speaker Sessions
Tips and Tricks for Doing Good Research with Sara Hooker - February 23, 2023
'Flying lessons with ex-Birds' with Hailey Schoelkopf - March 30, 2023
'Flying lessons with ex-Birds' with Harsha Nelaturu - May 11, 2023
Katie Matton - "Unveiling her amazing research journey at the crossroads of machine learning, affective computing, and behavioral health" - July 27, 2023
Unveiling the Journey with Edward Hu - Insights into LoRA, μTransfer, and the Art of Reasoning - October 5, 2023
The uncertain journey in uncertainty quantification with Ghifari Adam Faza - November 10, 2023
Journey into AI with Lewis Tunstall - February 2, 2024
🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦
Nidification
This series takes a deep dive into ML topics.
Nidification Session #1: Tokenization in NLP
Nidification Session #2: Herumb presents LangChain
Nidification Session #3: Introduction to Federated Learning
Nidification Session #4: Tutorial on DeepSpeed
Nidification Session #5: Introduction to the einops Library
🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦
BIRDS Coffee Chats
January 5, 2024
🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦🐦
Summary Banks
Summary Bank #1: Conformal Prediction
Hi BIRDS! This is the first summary bank. It took a bit longer than we expected, mainly because we can't post long articles on Discord, and the short ones were, well… not too good. Frankly, the point of these banks is to help you get a rough idea of a topic or domain in one go, rather than to hoard resources. So what is this bank about? Conformal Prediction. To be honest, it's not my favorite topic to talk about — nothing against it, it's just not something I use in daily life — but it's a really cool topic to understand!
What is Conformal Prediction?
Conformal Prediction comes under the topic of Uncertainty Quantification, which, as the name suggests, is the quantification of uncertainty. That doesn't seem very helpful, does it? Let me elaborate. In the simplest Deep Learning setting, we are trying to find an answer. This answer can be the breed of a dog, the type of an object, or maybe the next word. Given some input, the model produces this answer, and that's usually all we care about. This setting is called Point Prediction, where the point is the single answer we get from the model. Uncertainty Quantification gives us a way to measure how certain or confident the model is about these predictions.
Conformal Prediction also provides a way to generate a prediction set for any model. To put it simply:
Conformal prediction uses an already-trained model to estimate "all" the predictions that the model could plausibly make for a given input, with high confidence.
Suppose your model predicts the breed of a dog. With conformal prediction, instead of a single breed you get the set of breeds that are plausible for that image. The catch is that Conformal Prediction gives us a guarantee: the prediction set contains the true answer with a high probability that you set. This guarantee is called Coverage.
How do I do it?
So, we know a bit about conformal prediction, but how do we actually do it, say… in code? Let's talk about the easier way first: it's as easy as cooking instant noodles. You have the tools (libraries); just follow the instructions and you should be good. Essentially, there are two steps:
Training Step: Training a model on the point prediction task on a dataset.
Calibration Step: Estimating the confidence interval / prediction set for the model's predictions over a held-out IID dataset for this task.
During the calibration step, we do the following:
Decide a scoring function and compute the score of the correct class. This score could be an absolute difference or almost anything, to be fair. For a classifier, you feed the input to the network, take the softmax vector, and read off the softmax value of the correct label; that is its score. Note that this softmax value of the correct class may or may not be the highest one in the vector.
Repeat the above for all the points in the calibration set to get a "score list". Take the ~10% quantile of these scores. This gives you a threshold score (let's call it q) such that 90% of the points in the score list have a true-class score greater than q.
Now, during prediction, compute the softmax scores for the outputs; all classes whose score is greater than q form your prediction set.
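The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the "classifier" is faked with random probability vectors, and the class count and 90% target coverage are assumptions for the demo.

```python
import random

random.seed(0)

# Hypothetical setup: pretend we already have a trained classifier.
# Here we fake its softmax outputs for a 5-class problem.
n_cal, n_classes = 1000, 5

def fake_softmax():
    """Random probability vector standing in for a model's softmax output."""
    raw = [random.random() for _ in range(n_classes)]
    total = sum(raw)
    return [r / total for r in raw]

cal_softmax = [fake_softmax() for _ in range(n_cal)]
cal_labels = [random.randrange(n_classes) for _ in range(n_cal)]

# Calibration step: the score of each example is the softmax value
# the model assigned to the *true* class.
cal_scores = sorted(sm[y] for sm, y in zip(cal_softmax, cal_labels))

# Take the ~10% quantile: roughly 90% of calibration examples have a
# true-class score above this threshold q.
q = cal_scores[int(0.10 * n_cal)]

def prediction_set(softmax_vec, threshold):
    """All classes whose softmax score exceeds the threshold."""
    return [c for c, p in enumerate(softmax_vec) if p > threshold]

# Prediction: every class scoring above q goes into the set.
print(prediction_set(fake_softmax(), q))
```

Note that many write-ups define the score the other way around (1 minus the true-class softmax, then take a high quantile); this sketch follows the formulation described above, which is equivalent.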
I'm in, where do I go?
Cool, really glad you liked conformal prediction! Darn, now I really wish this were a blog post, but here are some really cool resources you can use to learn more about it:
A Tutorial on Conformal Prediction (playlist, basis of the example above): https://www.youtube.com/watch?v=nql000Lu_iE&list=PLBa0oe-LYIHa68NOJbMxDTMMjT8Is4WkI&ab_channel=AnastasiosNikolasAngelopoulos
A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification: https://arxiv.org/abs/2107.07511
Introduction To Conformal Prediction With Python: https://christophmolnar.com/books/conformal-prediction/
Conformal Prediction in 2020: https://www.youtube.com/watch?v=61tpigfLHso
Resource Stack: https://github.com/valeman/awesome-conformal-prediction