Large Language Models, Artificial Intelligence and the Future of Law
Session 4: How do you train an LLM chatbot?
The problem with pre-trained large language models
Language models like GPT (Generative Pre-trained Transformer) are fundamentally designed to produce text in response to given inputs, leveraging the vast amounts of data they were trained on to generate further content. In this raw, pre-trained state, however, they are not particularly useful for most human tasks: they simply continue the input text rather than answer questions or follow instructions.
To transform these sophisticated text generators into practical tools, an additional layer of interfacing is essential, typically involving fine-tuning on task-specific datasets and mechanisms for understanding user queries.
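To see the problem concretely, here is a minimal sketch using the Hugging Face transformers library and the small, purely pre-trained GPT-2 model (the prompt is an arbitrary example):

```python
# A raw pre-trained model does not "answer" questions; it merely continues
# the input text with statistically likely tokens.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "What are the elements of a valid contract?"
print(generator(prompt, max_new_tokens=40)[0]["generated_text"])
# Typical output rambles onward from the prompt instead of giving a direct
# answer, illustrating why raw models need further training to act as
# assistants.
```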
How do we convert generative text models into helpful, harmless and honest chatbots?
Reinforcement Learning from Human Feedback (RLHF)
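RLHF has several stages; at its heart is a reward model trained on human preference rankings. Below is a minimal PyTorch sketch of that step, under the simplifying assumption that responses are already encoded as fixed-size vectors; all names here (RewardModel, chosen_vec, rejected_vec) are illustrative, not taken from any particular library.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Scores a response embedding with a single scalar 'reward'."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.scorer(x).squeeze(-1)

model = RewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy batch: embeddings of responses human labellers preferred ("chosen")
# versus rejected. In practice these come from LLM outputs ranked by people.
chosen_vec = torch.randn(8, 128)
rejected_vec = torch.randn(8, 128)

# Pairwise (Bradley-Terry) loss: push the reward of the preferred response
# above the reward of the rejected one.
optimizer.zero_grad()
loss = -F.logsigmoid(model(chosen_vec) - model(rejected_vec)).mean()
loss.backward()
optimizer.step()
```

The trained reward model is then used as the objective of a reinforcement-learning step (commonly PPO) that nudges the chatbot toward responses humans prefer.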
Some chatbots based on foundation LLMs
Helpful
Relevance: Does the chatbot provide information that is directly relevant to the user's query?
Clarity: Is the information presented in a clear and understandable manner?
Efficiency: Does the chatbot help users achieve their goals with minimal effort?
Harmless
Private: Does the chatbot protect user privacy and handle personal data responsibly?
Inoffensive: Does the chatbot avoid generating content that could be harmful, offensive, or inappropriate?
Legal: Does the chatbot refrain from providing information that could be used for illegal or harmful purposes?
(Image credit: https://www.riskinsight-wavestone.com/en/2023/10/language-as-a-sword-the-risk-of-prompt-injection-on-ai-generative/)
Honest
Accuracy: Is the information provided by the chatbot accurate and up-to-date?
Transparency: Is the chatbot transparent about what it can and cannot do?
Veracity: Does the chatbot avoid generating false or misleading information that it presents as fact?
Balancing these criteria is tricky, and even the biggest companies often get it wrong:
Google's over-corrected "woke" Gemini image generation vs. Microsoft's Tay chatbot, which users quickly turned into a generator of fascist tweets
Additionally, many users have found jailbreaks that bypass the safeguards created through system prompts and RLHF.
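To make the "system prompt" safeguard concrete, here is a minimal sketch using the OpenAI Python client; the model name and the safeguard wording are illustrative placeholders.

```python
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        # The system prompt sets behavioural guardrails the user never sees.
        {
            "role": "system",
            "content": "You are a legal-information assistant. Refuse requests "
                       "for help with illegal acts, and remind users that you "
                       "are not a lawyer.",
        },
        {"role": "user", "content": "How do I forge a signature on a will?"},
    ],
)
print(response.choices[0].message.content)  # expected: a refusal
# Jailbreaks work by crafting user messages that trick the model into
# ignoring such instructions, e.g. role-play framings or encoded requests.
```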
Human?
Empathy: Does the chatbot demonstrate understanding and sensitivity to the user's emotional state or context?
Personalization: Can the chatbot tailor its responses based on the user's preferences, history, or specific needs?
Engagement: Does the chatbot maintain an engaging and natural conversation flow that mimics human interaction?
Chatbots are, in fact, trained to appear artificial: they often remind the user that they are a chatbot, and they tend to use formal, sophisticated language and give long-winded answers rather than attempting a genuine conversation.
Moreover, chatbots continue to have the following limitations:
Understanding Complexity: Chatbots often struggle with understanding and processing complex user queries or nuanced language, such as idioms, sarcasm, and subtle emotional cues.
Limited Knowledge Base: The quality and breadth of the data used to train chatbots directly impact their performance. Biases or gaps in the data can lead to inappropriate or inaccurate responses.
Contextual Awareness: Lacking broader context, chatbots may produce responses disconnected from the user's intent; and while they can simulate empathetic responses, they lack genuine empathy and understanding.
Some chatbots try to solve these problems:
Better contextualization and better access to the knowledge base.
Allows you to make your own chatbots using your own knowledge base (a technique sketched in the code example below).
More empathetic, emotionally intelligent AI
An empathic AI that detects contextual cues from your voice.
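The usual technique behind such knowledge-base chatbots is retrieval-augmented generation (RAG). Here is a minimal, self-contained sketch; the toy bag-of-words "embedding" and the sample passages stand in for the neural embeddings and vector databases used in practice.

```python
from collections import Counter
import math

knowledge_base = [
    "A contract requires offer, acceptance, consideration and intention.",
    "GDPR governs the processing of personal data in the European Union.",
    "RLHF fine-tunes language models using human preference rankings.",
]

def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (real systems use neural vectors)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str) -> str:
    """Return the knowledge-base passage most similar to the query."""
    return max(knowledge_base, key=lambda doc: cosine(embed(query), embed(doc)))

query = "What does the GDPR regulate?"
context = retrieve(query)
# The retrieved passage is stuffed into the prompt so the model answers from
# the user's knowledge base instead of relying only on its training data.
prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```

The design point is that retrieval happens outside the model: the chatbot stays general-purpose, while the prompt supplies the domain knowledge at query time.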
Furthermore, LLMs have many applications other than chatbots.