PGT projects

Data-centric AI explainability #1

Research on Explainable AI (XAI) is interpreted differently depending on the type of model, data, and application. In general, however, it has focused primarily on explaining model inference (see e.g. LIME, or occlusion testing for images), with relatively little attention given to linking an inference back to the data used for training.

This is changing with recent advances in the area of so-called "Data-Centric AI". For example, concepts underpinning data valuation, such as Influence Functions [WFW+20] and, more recently, the AME (Average Marginal Effect) [LLZ+22], help pinpoint the specific data points in the training set that are most responsible for a given inference. Other, older concepts such as Explanation Tables [EF+18] have also been "resurrected" with the aim of providing data-centric explanations.
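
To make the data-valuation idea concrete, the following is a minimal sketch of an AME-style estimate for a single prediction, assuming a small synthetic tabular dataset and scikit-learn; the single 50% sampling fraction and the plain LassoCV regression step are simplifications of the procedure described in [LLZ+22].

```python
# A minimal sketch of AME-style data valuation for one prediction.
# Assumptions: a synthetic binary-classification dataset, 50% random subsets,
# and LassoCV as the regression step (simplified relative to [LLZ+22]).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LassoCV, LogisticRegression

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
x_test = X[:1]                      # the single prediction we want to explain
n_train, n_subsets = len(X), 300

rng = np.random.default_rng(0)
memberships = rng.random((n_subsets, n_train)) < 0.5   # random 50% subsets
utilities = np.empty(n_subsets)

for i, mask in enumerate(memberships):
    model = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
    # Utility = probability the sub-model assigns to the point's known class.
    utilities[i] = model.predict_proba(x_test)[0][y[0]]

# Regress utility on the membership indicators: each coefficient approximates
# a training point's average marginal effect on this particular prediction.
ame = LassoCV(cv=5).fit(memberships.astype(float), utilities).coef_
most_influential = np.argsort(-np.abs(ame))[:5]
print("Most influential training indices:", most_influential)
```

Training points with large positive coefficients are those whose inclusion tends to raise the model's confidence in the prediction being explained, which is exactly the kind of data-centric attribution the project is concerned with.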

This project involves:


[WFW+20] Wu, Weiyuan, Lampros Flokas, Eugene Wu, and Jiannan Wang. ‘Complaint-Driven Training Data Debugging for Query 2.0’. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, 1317–34, 2020. https://dl.acm.org/doi/abs/10.1145/3318464.3389696.

[LLZ+22] Lin, Jinkun, Anqi Zhang, Mathias Lécuyer, Jinyang Li, Aurojit Panda, and Siddhartha Sen. ‘Measuring the Effect of Training Data on Deep Learning Predictions via Randomized Experiments’. In Proceedings of the 39th International Conference on Machine Learning, 13468–504. PMLR, 2022. https://proceedings.mlr.press/v162/lin22h.html.

[EF+18] El Gebaly, Kareem, Guoyao Feng, Lukasz Golab, Flip Korn, and Divesh Srivastava. ‘Explanation Tables’. Sat 5 (2018): 14.

Data-centric AI explainability #2

One of the core ideas behind "data-centric AI" is that producing successful and practically viable AI models requires a balanced effort between optimising the model itself (the combination of its architecture and hyper-parameters) and improving the training sets used to train it. This places a new focus on interventions on the data that are eventually shaped into a training set, including for instance data cleaning, pruning, bias reduction, alterations designed to promote fairness, and more. As these interventions are designed to have an effect on the model and its inference behaviour, it is important that they are accounted for in end-to-end explanations.

This project addresses the question of how to automatically generate explanations that (1) justify specific data interventions (e.g. "imputation was performed on these variables based on a missingness analysis that revealed [...]"), (2) describe their nature ("missing data on Y was imputed using a regression model based on variables X1...Xn, because [...]"), and (3) describe their effect ("z% of the values in X have been imputed, and the imputation modified the distribution of X, reducing its variance [...]").
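
As a starting point, the sketch below illustrates the kind of record-keeping such explanations would rely on: a single mean-imputation step is applied, and the quantities needed for a templated effect description (of type (3) above) are captured alongside it. The record structure, function names, and wording are illustrative assumptions, not a proposed design.

```python
# A minimal sketch of recording a data intervention (mean imputation on one
# column) together with the quantities a templated explanation would need.
# The record structure and wording are illustrative assumptions.
import numpy as np
import pandas as pd

def impute_and_record(df: pd.DataFrame, column: str) -> tuple[pd.DataFrame, dict]:
    """Mean-impute one column and return the data plus an intervention record."""
    before = df[column]
    n_missing = int(before.isna().sum())
    imputed = before.fillna(before.mean())
    record = {
        "intervention": "mean imputation",
        "column": column,
        "justification": f"{n_missing} of {len(before)} values were missing",
        "pct_imputed": 100 * n_missing / len(before),
        "variance_before": float(before.var()),
        "variance_after": float(imputed.var()),
    }
    out = df.copy()
    out[column] = imputed
    return out, record

def describe_effect(r: dict) -> str:
    """A type (3) explanation: what the intervention did to the data."""
    return (f"{r['pct_imputed']:.1f}% of the values in {r['column']} were imputed "
            f"({r['intervention']}); the variance changed from "
            f"{r['variance_before']:.3f} to {r['variance_after']:.3f}.")

df = pd.DataFrame({"X": [1.0, 2.0, np.nan, 4.0, np.nan, 6.0]})
df_clean, rec = impute_and_record(df, "X")
print(describe_effect(rec))
```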

As this relatively simple example shows, automatically "filling the gaps" requires a combination of elements, from recording the actual data transformations to consciously targeting the explanation at recipients with diverse roles, competences, and perspectives (data scientists vs. users vs. regulators, etc.). For example, should these explanations be provided in natural language? Would they be generated from some encoding of the observed data transformations, and how would they differ across roles? Are generative LLMs appropriate for addressing these questions?
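
One possible direction, sketched below, is to treat a recorded intervention (such as the `rec` dictionary from the previous sketch) as structured input to a generative LLM and vary only the role-specific guidance in the prompt; the roles and prompt wording are illustrative assumptions, and no particular model or API is implied.

```python
# A minimal sketch of role-conditioned explanation generation: the recorded
# intervention is serialised into a prompt for a generative LLM, and only the
# role-specific guidance changes. Roles and wording are illustrative assumptions.
ROLE_GUIDANCE = {
    "data scientist": "Use precise statistical terminology and report all quantities.",
    "end user": "Avoid jargon; explain why the change was made and why it matters.",
    "regulator": "Focus on justification, documentation, and potential fairness effects.",
}

def build_prompt(record: dict, role: str) -> str:
    """Turn an intervention record into a role-targeted prompt for an LLM."""
    facts = "; ".join(f"{k}: {v}" for k, v in record.items())
    return (
        f"You are writing for a {role}. {ROLE_GUIDANCE[role]}\n"
        "Explain the following data intervention and its likely effect on a "
        f"model trained on this data.\nRecorded facts: {facts}"
    )

# e.g. build_prompt(rec, "regulator"), with `rec` from the previous sketch,
# would then be sent to whichever LLM the project settles on.
```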

This is a big, unsolved problem, despite the many explainability features already available in the XAI space. The project will take initial steps: first identifying exemplar case studies through which we can better understand the problem, then using them to propose and experiment with technical approaches that might work in simple cases.

AI to support Social Work and Social Care staff in reflective supervision

Background. Social work organisations have a duty of care to their employees and the citizens they support. Providing high-quality supervision can help organisations meet this duty, by ensuring that their staff are skilled, knowledgeable, clear about their job roles, and offered practical assistance from a supervisor in the form of job-related advice and emotional support. Reflective supervision is widely accepted, by professionals and academics alike, as a necessary component of safe social work practice, and efforts that seek to support organisations in providing reflective supervision to the workforce are of central importance to the profession. In reality, however, it can be difficult for supervision to provide a sufficiently reflective space in which social workers can develop their practice.

Project. The project aims to develop a proof-of-concept AI tool to support reflective supervision in social work and social care practice. This will most likely be based on LLM technology, so the project will have to identify (1) a set of target tasks that are appropriate for the context, (2) a model architecture that can be trained or tuned for those tasks, and (3) suitable input datasets for pre-training and/or fine-tuning, drawing primarily from large corpora of Open Access publications in the social work scholarly space. The project hopes to reach a stage where meaningful chatbot-style interactions can be generated, from which we can derive a qualitative assessment of the model's suitability for the target tasks.
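
As an indication of the kind of experimentation involved, the sketch below generates a chatbot-style reflective prompt with an off-the-shelf generative model via the Hugging Face transformers pipeline; the model name is only a small placeholder and the supervision prompt is an illustrative assumption, since the project would first adapt or fine-tune a model on social-work corpora.

```python
# A minimal sketch of a chatbot-style interaction using the Hugging Face
# transformers text-generation pipeline. "gpt2" is only a small placeholder;
# the project would substitute a model adapted to social-work corpora.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = (
    "You are supporting a social worker's reflective supervision session. "
    "Ask one open question that helps them reflect on a recent home visit:\n"
)
response = generator(prompt, max_new_tokens=60, do_sample=True, temperature=0.8)
print(response[0]["generated_text"])
```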