Geo-Culturally Inclusive Annotations for AI

Turin, Italy

May 25th, 2024

Training and evaluation of language models increasingly rely on semi-structured data annotated by humans, and techniques such as RLHF are growing in usage across the board. As a result, both the data and the human perspectives involved in this process play a key role in what our models take as ground truth. As annotation tasks become more subjective and culturally complex, it is unclear how much of their socio-cultural identity annotators draw on when responding to tasks. We also currently lack ways to integrate rich and diverse community perspectives into our language technologies. In this tutorial, we will build towards this goal through a series of interactive exercises centered on the following questions:

How do different methods of annotation shape data on social representation?

How do socio-cultural identities of annotators shape models and evaluations?

How do we envision culturally inclusive annotation guidelines?
