Channel: #safety-and-alignment
Alif - @biggmon on Discord
Abrar - @abr_r on Discord
Each session covers different topics in safety and alignment. The goal is for you to come out of each session with at least a high-level understanding of what we discussed.
Topics include (but are not limited to) RLHF, inner and outer misalignment, goal misgeneralization, inverse RL, scalable oversight, and interpretability.
Everyone is welcome to join! A basic ML background is sufficient.
We meet every other Thursday at 10 am PST.
AI Alignment Cohort
Together with the BIRDS group, Safety & Alignment is organizing the ARENA Cohort.
Recent Recordings
Nov 7, 2024
May 30, 2024
May 2, 2024
April 18, 2024
March 7, 2024