AI Safety & Alignment

Lecture Slides 

Slides by instructor / visiting lecturers

TBD

Slides by students: 

**Disclaimer: these slides were created by students and may contain errors, omissions, or inaccuracies of varying severity.**

1.(Public Release) Does Alignment Get Better with Scaling, Generally_.pdf
COS 597Q_Criticism on AI risk and safety-public.pdf
3.Rewards and Goals_ Constitutional AI.pdf
4.Rewards and Goals 2 - RLHF.pdf
5.Understanding and Aligning Ethics - Fixed.pdf
6.adversarial_attack.pdf
7.Game Theoretic Approaches.pdf
8.Interpretability.pdf
9.Economic Impacts of AI.pdf