Carmelo Fabio Longo
Researcher
CNR-ISTC Catania (Italy)
Carmelo Fabio Longo is a researcher at the CNR-ISTC (Italy), though he prefers to be called simply "Fabio". After graduating in Computer Science in 2000, he worked on IT engineering in health care in Catania (Italy). Since 2017 he has carried out applied research on robotic and multi-agent systems, focused on natural language processing. In 2022 he obtained his Ph.D. in Computer Science at the University of Catania. His main research interests include Artificial Intelligence, Cognitive Architectures, Natural Language Processing and the Semantic Web.
The paper presents a novel fine-tuning approach for LLMs named EXAR (EXclusive AutoRegressive fine-tuning), aimed at injecting metaknowledge about a particular task (in this case Question-Answering) in order to improve outcomes and self-correct previous errors.
Title: Eliciting Metaknowledge in Large Language Models
Abstract:
The introduction of Large Language Models (LLMs) able to exhibit a number of linguistic and extra-linguistic capabilities has represented, in recent years, one of the main frontiers of Artificial Intelligence (AI) research. Researchers from various disciplines debate whether the capabilities of LLMs include the ability to use knowledge about knowledge – usually considered one of the antechambers of meta-cognition in cognitive agents – about a particular task in order to improve outcomes or self-correct previous errors. In this work we propose a novel fine-tuning approach for LLMs, named EXAR, based on a multi-stage process leveraging past predictions from an early version of the same model, and aimed at injecting metacognitive features for the task of Question-Answering. The experiments conducted on Llama-2-7B-chat showed promising improvements in the quality of the outcomes, because the LLM acquired the ability to detect its own wrong predictions and to force itself to repeat submissions, through a prompt designed to fix inadmissible predictions, whenever they are detected. Such detection is achieved by querying the same LLM, acting as a meta-validator, through another prompt specifically designed for that purpose.
Alongside LLM meta-cognitive techniques based on chain-of-thought, here we investigated whether metaknowledge can be injected by fine-tuning in an auto-regressive fashion based on past inferences.
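Below is a minimal sketch of the inference-time self-correction loop described in the abstract, assuming a hypothetical generate() helper wrapping the fine-tuned Llama-2-7B-chat model; the prompt wordings and retry limit are illustrative placeholders, not the paper's actual prompts.

```python
MAX_RETRIES = 3

def generate(prompt: str) -> str:
    """Hypothetical wrapper around the fine-tuned Llama-2-7B-chat model."""
    raise NotImplementedError

def answer_with_self_correction(question: str) -> str:
    # First prediction for the Question-Answering task.
    answer = generate(f"Question: {question}\nAnswer:")
    for _ in range(MAX_RETRIES):
        # The same LLM acts as meta-validator on its own candidate answer.
        verdict = generate(
            f"Question: {question}\nCandidate answer: {answer}\n"
            "Is this answer admissible? Reply ADMISSIBLE or INADMISSIBLE."
        )
        if "INADMISSIBLE" not in verdict.upper():
            break
        # Corrective prompt: the model is asked to fix its inadmissible prediction.
        answer = generate(
            f"The previous answer to '{question}' was judged inadmissible: {answer}\n"
            "Provide a corrected answer:"
        )
    return answer
```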
On June 17-19, 2024, the Advances in Cognitive Systems conference was held in Palermo, where, together with Misael Mongiovì, Luana Bulla and Antonio Lieto, I submitted a paper titled "Eliciting Metaknowledge in Large Language Models". The contribution investigated how to improve the predictions of Large Language Models through a self-corrective inference that takes past inferences into account, which can also be considered a kind of metacognition. The work was accepted as a poster and presented by Antonio Lieto.
The Eleventh Annual Conference, held in Palermo (Italy), June 17-19, 2024
Eliciting Metaknowledge in Large Language Models
At the 13th International Conference on Data Science, Technology and Applications, I was impressed by the organization's thorough communication about every aspect of the conference; the keynotes were interesting and I met several scholars with whom future collaborations may arise. For the occasion I presented an article on hierarchical text classification in the absence of training datasets, using the Large Language Model Llama2 to generate synthetic texts.
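A toy sketch of that general idea follows, assuming a hypothetical llama2_generate() helper and an illustrative two-level label hierarchy (this is not the actual HTC-GEN pipeline):

```python
# Illustration: use an LLM to synthesize labelled texts for hierarchical
# text classification when no training data is available.
label_hierarchy = {
    "Science": ["Physics", "Biology"],
    "Sports": ["Football", "Tennis"],
}

def llama2_generate(prompt: str) -> str:
    """Hypothetical wrapper around a Llama2 text-generation endpoint."""
    raise NotImplementedError

def build_synthetic_dataset(samples_per_label: int = 5) -> list[dict]:
    dataset = []
    for parent, children in label_hierarchy.items():
        for child in children:
            for _ in range(samples_per_label):
                text = llama2_generate(
                    f"Write a short paragraph about {child}, a subtopic of {parent}."
                )
                # Each synthetic text keeps its full label path in the hierarchy,
                # so it can be used to train a hierarchical classifier.
                dataset.append({"text": text, "labels": [parent, child]})
    return dataset
```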
The conference featured a keynote by Jürgen Schmidhuber, who can be considered the father of the pre-transformer state-of-the-art architecture known as Long Short-Term Memory (LSTM).
Proud to have received the Best Paper Award at DATA 2024, held in Dijon.
HTC-GEN: A Generative LLM-Based Approach to Handle Data Scarcity in Hierarchical Text Classification
14th International Conference on Formal Ontology in Information Systems
Last conference day
Together with Mario Monti and John Beverley