PhD Thesis

On May 5th, 2023, I successfully defended my PhD thesis at Polytechnique Montreal, with the jury recommending it for the best thesis award.

[SLIDES]

Jury Members

Thesis Title

"Natural Language Reasoning with Transformer Language Models"

Abstract

Due to the growing popularity of Transformer Language Models (TLMs), there is an increasing need to better understand their strengths and limitations if they are to be widely used to help humans solve complex tasks with real-world implications. This thesis centers on their multi-step reasoning capabilities, which are both a weakness of current language models and a potentially impactful research direction.

First, the compositional generalization of TLMs is evaluated on a logical reasoning task in natural language. Transformer decoder models are trained to answer link-prediction questions by reasoning over relationships between entities. In particular, to better understand how TLMs reason, models are trained to generate various types of natural language explanations (proofs) before generating their final answer. Both the models’ answer accuracy and proof accuracy are evaluated on problems requiring specific numbers of reasoning steps that are not seen during training. This first contribution confirms that TLMs suffer from length generalization issues when tested on longer-than-trained problems. Additionally, it reveals that TLMs generalize better when trained on longer, exhaustive proofs than on shorter ones. Results also show that TLMs generalize better when trained to generate backward-chaining rather than forward-chaining proofs. However, it is also observed that models trained to predict the answer directly, without generating a logical explanation, generalize better to more complex problems. This suggests that TLMs have internal reasoning strategies that are hard to interpret, and that benefiting from naturally stated logical proof statements requires more complex internal representations. Additional experiments show, for instance, that pre-trained models have better reasoning capacities even though they were not explicitly trained to solve such tasks. This first contribution is published as a conference paper in Advances in Neural Information Processing Systems (NeurIPS) 2020.
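To make the two proof formats concrete, here is a minimal sketch of what proof-then-answer training targets could look like for a decoder model on a toy kinship-style link-prediction question; the facts, entity names, and proof templates below are illustrative assumptions, not the dataset used in the thesis.

# Hypothetical proof-then-answer training targets for a decoder-only TLM
# on a toy kinship-style link-prediction question. Entities, relations,
# and templates are made up for illustration.
facts = [
    "Alice is the mother of Bob.",
    "Bob is the father of Carol.",
]
question = "Who is Alice the grandmother of?"

# Forward chaining: start from the known facts and derive new relations
# until the queried one is reached.
forward_proof = (
    "Alice is the mother of Bob. Bob is the father of Carol. "
    "Therefore Alice is the grandmother of Carol."
)

# Backward chaining: start from the queried relation and decompose it
# into sub-goals that are resolved against the known facts.
backward_proof = (
    "Alice is the grandmother of X if Alice is the mother of Y "
    "and Y is the father of X. Alice is the mother of Bob. "
    "Bob is the father of Carol. Therefore X is Carol."
)

answer = "Carol"

# Training sequences take the form: facts + question -> proof + answer,
# while the no-proof baseline maps facts + question -> answer directly.
training_target = backward_proof + " Answer: " + answer
print(training_target)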

The next contribution introduces an abstraction inductive bias into pre-trained TLMs and demonstrates its benefits on symbolic reasoning tasks. As manipulating generic concepts simplifies reasoning processes and allows humans to generalize knowledge across domains, this contribution makes use of named entity recognition to label entity types in input sequences. Five strategies are proposed to incorporate this additional knowledge into an encoder-decoder TLM: two embedding-based methods, two encoding-based methods, and one auxiliary-loss-based method. Models are evaluated on reasoning datasets covering compositional reasoning, abductive reasoning, multi-hop question answering, and conversational question answering. Experimental results indicate that the best entity-type abstraction-aware models improve the performance of TLMs by up to 20% on tasks explicitly requiring symbolic reasoning, thus confirming the advantages of this inductive bias. However, the proposed abstraction method is not as effective on more natural language tasks. Further analysis suggests that entity-type abstraction is only beneficial in tasks with (1) good-quality abstraction labels and (2) train/test data split according to the reasoning complexity of each example. This second contribution is published as a journal paper in the Transactions on Machine Learning Research (TMLR).
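As an illustration of one embedding-based strategy, the sketch below sums a learned entity-type embedding, derived from NER labels, with the usual token embedding before the encoder; the dimensions, type ids, and variable names are assumptions for illustration rather than the exact implementation from the paper.

# Sketch of an embedding-based entity-type abstraction (assumed setup).
# Type ids (e.g. 0 = no entity, 3 = PERSON) would come from a named
# entity recognizer run over the input sequence.
import torch
import torch.nn as nn

vocab_size, num_entity_types, d_model = 32_000, 10, 512

token_embedding = nn.Embedding(vocab_size, d_model)
type_embedding = nn.Embedding(num_entity_types, d_model)

token_ids = torch.tensor([[101, 2040, 2003, 5545, 102]])  # toy token ids
type_ids = torch.tensor([[0, 0, 0, 3, 0]])                # toy NER type ids

# Abstraction-aware input representation fed to the Transformer encoder.
encoder_inputs = token_embedding(token_ids) + type_embedding(type_ids)
print(encoder_inputs.shape)  # torch.Size([1, 5, 512])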

Finally, with the increasing prevalence of chat interfaces, the third contribution moves away from single-turn question-answering tasks and towards interactive text environments. These environments require multi-step reasoning by design: the model must reach a final objective by generating text commands that interact with the environment and advance toward that goal step by step. In an effort to better control the behavior of TLMs in those environments, the last contribution proposes an offline reinforcement learning method that leverages pre-trained TLMs and conditions them on a desired outcome. Experimental results on some of the most challenging Jericho text games show that TLMs can learn a mapping from goal condition to action, and confirm the significant advantage of using an exponential tilt when the model generates its own outcome condition. Furthermore, multiple conditioning methods are proposed and compared against each other. Results show that the proposed methods can improve average performance by up to 10% over previous baselines. Lastly, taking advantage of the use of TLMs in text environments, additional experiments demonstrate that models trained to predict the consequences of their actions also improve average normalized performance by 10%.
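The sketch below gives one plausible reading of outcome conditioning with an exponential tilt: returns observed in the offline dataset are reweighted by exp(return / temperature) so that the model proposes more ambitious outcome conditions for itself at play time; the temperature, prompt format, and function names are hypothetical, not the thesis implementation.

# Sketch of exponentially tilted outcome sampling (assumed setup).
import numpy as np

rng = np.random.default_rng(0)

# Returns observed in the offline dataset of game trajectories.
observed_returns = np.array([0.0, 5.0, 10.0, 20.0, 40.0])

def tilted_sampling_probs(returns, temperature=10.0):
    """Exponentially tilt the empirical return distribution so that
    higher returns are proposed more often as the outcome condition."""
    weights = np.exp(returns / temperature)
    return weights / weights.sum()

probs = tilted_sampling_probs(observed_returns)
goal_return = rng.choice(observed_returns, p=probs)

# The sampled return would then be serialized into the TLM prompt,
# e.g. "[GOAL RETURN: 40] You are standing in an open field ...",
# and the model generates the next text command conditioned on it.
print(probs.round(3), goal_return)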

In summary, this thesis attempts to shed light on the multi-step reasoning abilities of Transformer language models and introduces novel mechanisms to build more logical and controllable language models.

Link to the thesis

The thesis will be available here once published by the university.

Papers Used in the Thesis