PGT projects 2024
Please see two project themes below:
Theme 1: Data-centric AI explainability
Theme 2: AI to support Social Work and Social Care staff for reflective supervision
Theme 1: Data-centric AI explainability
Context. Research around Explainable AI (XAI) is interpreted differently depending on the type of model, data, and application. In general, however, it has focused primarily on explaining model inference (see e.g. LIME, or occlusion testing for images), with relatively little attention given to linking the inference to the data used for training.
This is changing, with recent advances in the area of so-called "Data-Centric AI". For example, concepts underpinning data valuation, such as Influence Functions [WFW+20] and, more recently, AME (Average Marginal Effect) [LLZ+22], help pinpoint the specific data points in the training set that are most responsible for a given inference. Other, older concepts like Explanation Tables [EF+18] have also been "resurrected" with the aim of providing data-centric explanations.
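To make the data-valuation idea concrete, here is a minimal, self-contained sketch of leave-one-out (LOO) ablation on a toy 1-nearest-neighbour model. Influence Functions and AME estimate related quantities far more efficiently; this brute-force version only illustrates the underlying question ("which training point is most responsible for this inference?"). All data and function names are illustrative, not taken from the cited papers.

```python
# Toy data valuation by leave-one-out ablation (illustrative only).

def predict_1nn(train, x):
    """Label of the training point closest to x; train is a list of (feature, label)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def loo_values(train, x):
    """For each training point: 1 if removing it flips the prediction at x, else 0."""
    base = predict_1nn(train, x)
    return [int(predict_1nn(train[:i] + train[i + 1:], x) != base)
            for i in range(len(train))]

train = [(0.0, "a"), (1.0, "a"), (4.0, "b"), (5.0, "b")]
x = 2.4                        # nearest neighbour is (1.0, "a")
print(predict_1nn(train, x))   # -> a
print(loo_values(train, x))    # -> [0, 1, 0, 0]: only (1.0, "a") is pivotal
```

Here only the removal of (1.0, "a") changes the prediction, so LOO attributes the inference entirely to that point; Influence Functions and AME approximate this kind of attribution without retraining once per point.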
References
[WFW+20] Wu, Weiyuan, Lampros Flokas, Eugene Wu, and Jiannan Wang. ‘Complaint-Driven Training Data Debugging for Query 2.0’. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, 1317–34, 2020. https://dl.acm.org/doi/abs/10.1145/3318464.3389696.
[LLZ+22] Lin, Jinkun, Anqi Zhang, Mathias Lécuyer, Jinyang Li, Aurojit Panda, and Siddhartha Sen. ‘Measuring the Effect of Training Data on Deep Learning Predictions via Randomized Experiments’. In Proceedings of the 39th International Conference on Machine Learning, 13468–504. PMLR, 2022. https://proceedings.mlr.press/v162/lin22h.html.
[EF+18] El Gebaly, Kareem, Guoyao Feng, Lukasz Golab, Flip Korn, and Divesh Srivastava. ‘Explanation Tables’. Sat 5 (2018): 14.
Starting point:
We have been developing a tool for capturing the derivations of data through Python/pandas scripts: https://github.com/Lucass97/data_provenance_for_data_science.
The tool is described in this recent presentation: https://www.slideshare.net/slideshow/design-and-development-of-a-provenance-capture-platform-for-data-science/268330702
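The following hand-rolled sketch illustrates the idea behind the tool: wrap a tabular operation so that, alongside its result, it records which input rows each output row was derived from. The names (`traced_filter`, `Derivation`) and the plain-dict row representation are illustrative assumptions; the actual tool instruments pandas scripts and stores the derivation graph in Neo4j.

```python
# Illustrative provenance capture for a single tabular operation.
from dataclasses import dataclass, field

@dataclass
class Derivation:
    op: str
    edges: list = field(default_factory=list)  # (input_row_id, output_row_id) pairs

def traced_filter(rows, predicate, op_name="filter"):
    """Filter a list of row dicts, recording input -> output row derivations."""
    d = Derivation(op=op_name)
    out = []
    for i, row in enumerate(rows):
        if predicate(row):
            d.edges.append((i, len(out)))
            out.append(row)
    return out, d

rows = [{"age": 17}, {"age": 42}, {"age": 30}]
adults, deriv = traced_filter(rows, lambda r: r["age"] >= 18, "select adults")
print(adults)       # -> [{'age': 42}, {'age': 30}]
print(deriv.edges)  # -> [(1, 0), (2, 1)]: output row 0 came from input row 1, etc.
```

Chaining such traced operations over a whole script yields the derivation graph that the projects below query and analyse.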
Concrete projects:
Using the above as a starting point, we can develop ideas in a number of interesting directions. Please see this recent talk for more background: https://www.slideshare.net/slideshow/explainable-data-centric-ai-what-are-you-explaininhg-and-to-whom/268463441
1. LLMs exploiting the derivation graphs to suggest pipeline and data repairs. This project will use RAG (Retrieval-Augmented Generation), specifically to incorporate the results of Neo4j (Cypher) queries into narratives that suggest interventions on the data and on the pipelines.
2. Graph analysis of provenance graphs. The derivation graph is natively stored in a Neo4j graph database. The project will experiment with graph analysis algorithms on these derivations, using the Neo4j Graph Data Science library.
"Why+" explanations: augmenting data derivations to describe the behaviour of complex data processing algorithms, for example training set optimisation, incremental data cleaning, etc. please see this presentation paper: https://www.dropbox.com/scl/fi/yfpzxtsbrtj9oppc52ymk/DCAI_position_SEBD_24_CR.pdf
I am open to discussing other related ideas for alternative projects.
Theme 2: AI to support Social Work and Social Care staff for reflective supervision.
Background. Social work organisations have a duty of care to their employees and the citizens they support. Providing high-quality supervision can help organisations to meet this duty, by ensuring that their staff are skilled, knowledgeable, clear about their job roles, and offered practical assistance from a supervisor in the form of job-related advice and emotional support. Reflective supervision is widely accepted, by professionals and academics alike, as a necessary component of safe social work practice, and efforts that seek to help organisations provide reflective supervision to the workforce are of central importance to the profession. In reality, however, it can be difficult for supervision to provide a sufficiently reflective space in which social workers can develop their practice.
Project. The project aims to develop a proof-of-concept AI tool to support reflective supervision in social work and social care practice. This will most likely be based on LLM technology, so the project will have to identify (1) a set of target tasks that are appropriate for the context, (2) a model architecture that can be trained or tuned for those tasks, and (3) suitable input datasets for pre-training and/or fine-tuning, drawing primarily from large corpora of Open Access publications in social work scholarship. The project hopes to reach a stage where meaningful chatbot-style interactions can be generated, from which we can derive a qualitative assessment of the model's suitability for the target tasks.