AI Safety and Alignment
Channel: #safety-and-alignment
Co-leads:
Alif - @biggmon on Discord
Sunitha - @prisca6117 on Discord
Goal:
Each session covers different topics in safety and alignment. The goal is for you to come out of each session with at least a high-level understanding of what we discussed.
Topics include (but are not limited to) RLHF, inner and outer misalignment, goal misgeneralization, inverse RL, scalable oversight, and interpretability.
Logistics:
Everyone is welcome to join! A basic ML background is sufficient.
We meet every other Thursday at 10 am PST.
AI Alignment Cohort
Together with the BIRDS group, Safety & Alignment is organizing the ARENA Cohort.
Recent Recordings
May 30, 2024
May 2, 2024
April 18, 2024
May 30, 2024
March 7, 2024