Probing Machine Learning Models in Angluin's Style

KR 2024 Tutorial

Abstract

A major concern when dealing with complex machine learning models, such as language models, is determining what influences their outputs. This tutorial sheds light on Angluin’s exact learning framework and Valiant’s probably approximately correct (PAC) framework, and on whether and how they can be employed to systematically probe machine learning models, extracting high-level abstractions that can inform us about their knowledge, general behaviour, and potentially harmful biases.

Potential Target Audience:

This tutorial is of potential interest to members of the AI community working on systematic ways to probe machine learning models in order to investigate their behaviour, potentially harmful biases, and potential for knowledge extraction.

Prerequisites:

The tutorial is mostly self-contained. As a prerequisite, we expect a master's-level background in computer science.

Outline:

Introduction and Motivation (5 min)
Background: Exact Learning, PAC Learning (10 min)
Probing NNs: Extracting Automata (30 min)
Hands-on Activity (30 min)
Pause
Probing LMs: Extracting Horn Expressions (30 min)
Probing LMs: Extracting Decision Trees (30 min)
Conclusion and Discussion (15 min)

Material:

Extended Abstract

Hands-on DFA Implementation by Mikel Alesha and Montserrat Hermo

Slides

Notes (peer-reviewed, to appear in RW 2023 proceedings)

Speaker: Ana Ozaki

Associate Professor at the University of Oslo & University of Bergen
