Geo-Culturally Inclusive Annotations for AI
Turin, Italy
May 25th, 2024
The training and evaluation of language models increasingly rely on semi-structured data annotated by humans, and techniques such as RLHF are seeing growing use across the board. As a result, both the data and the human perspectives involved in this process play a key role in what our models take as ground truth. As annotation tasks become more subjective and culturally complex, it is unclear how much annotators draw on their socio-cultural identity when responding to tasks. We also currently lack ways to integrate rich and diverse community perspectives into our language technologies. In this tutorial, we will build towards this goal through a series of interactive exercises centered on: