Probing Machine Learning Models in Angluin's Style

KR 2024 Tutorial

 

Abstract

A major concern when dealing with complex machine learning models, such as language models, is determining what influences their outcomes. This tutorial sheds light on Angluin’s exact learning framework and Valiant’s probably approximately correct (PAC) framework, and on whether and how they can be employed to systematically probe machine learning models, extracting high-level abstractions that can inform us about their knowledge, general behaviour, and potentially harmful biases.

 


Potential Target Audience: 


This tutorial is of potential interest to members of the AI community working on systematic ways to probe machine learning models in order to investigate their behaviour, potentially harmful biases, and potential for knowledge extraction.


Prerequisites: 


The tutorial is mostly self-contained. As a prerequisite, we expect a master's-level background in computer science.


Outline:



Material:


Extended Abstract

Hands-on DFA Implementation by Mikel Alesha and Montserrat Hermo

Slides

Notes (peer-reviewed, to appear in RW 2023 proceedings)


Speaker: Ana Ozaki

Associate Professor at the University of Oslo and the University of Bergen