Large Language Models, Artificial Intelligence and the Future of Law
Session 4: How do you train an LLM chatbot?
The problem with pre-trained large language models
Language models like GPT (Generative Pre-trained Transformer) are fundamentally designed to produce text in response to given inputs, leveraging the vast amounts of data they were trained on to generate further content. In this raw, pre-trained state, however, they are not particularly useful for most human tasks: they simply continue the input text rather than answer questions or follow instructions.
To transform these sophisticated text generators into practical tools, an additional layer of interfacing is essential, typically involving fine-tuning on task-specific datasets and mechanisms for understanding user queries.
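To see the problem concretely, here is a minimal sketch using the Hugging Face transformers library and the small, purely pre-trained GPT-2 model (the prompt is an arbitrary example):

```python
# A raw pre-trained model does not "answer" questions; it merely continues
# the input text with statistically likely tokens.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "What are the elements of a valid contract?"
print(generator(prompt, max_new_tokens=40)[0]["generated_text"])
# Typical output rambles onward from the prompt instead of giving a direct
# answer, illustrating why raw models need further training to act as
# assistants.
```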
How do we convert generative text models into helpful, harmless and honest chatbots?
Reinforcement Learning from Human Feedback (RLHF)
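RLHF has several stages; at its heart is a reward model trained on human preference rankings. Below is a minimal PyTorch sketch of that step, under the simplifying assumption that responses are already encoded as fixed-size vectors; all names here (RewardModel, chosen_vec, rejected_vec) are illustrative, not taken from any particular library.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Scores a response embedding with a single scalar 'reward'."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.scorer(x).squeeze(-1)

model = RewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy batch: embeddings of responses human labellers preferred ("chosen")
# versus rejected. In practice these come from LLM outputs ranked by people.
chosen_vec = torch.randn(8, 128)
rejected_vec = torch.randn(8, 128)

# Pairwise (Bradley-Terry) loss: push the reward of the preferred response
# above the reward of the rejected one.
optimizer.zero_grad()
loss = -F.logsigmoid(model(chosen_vec) - model(rejected_vec)).mean()
loss.backward()
optimizer.step()
```

The trained reward model is then used as the objective of a reinforcement-learning step (commonly PPO) that nudges the chatbot toward responses humans prefer.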
Some chatbots based on foundation LLMs
Helpful
Relevance: Does the chatbot provide information that is directly relevant to the user's query?
Clarity: Is the information presented in a clear and understandable manner?
Efficiency: Does the chatbot help users achieve their goals with minimal effort?
Harmless
Private: Does the chatbot protect user privacy and handle personal data responsibly?
Inoffensive: Does the chatbot avoid generating content that could be harmful, offensive, or inappropriate?
Legal: Does the chatbot refrain from providing information that could be used for illegal or harmful purposes?
(Image credit: https://www.riskinsight-wavestone.com/en/2023/10/language-as-a-sword-the-risk-of-prompt-injection-on-ai-generative/)
Honest
Accuracy: Is the information provided by the chatbot accurate and up-to-date?
Transparency: Is the chatbot transparent about what it can and cannot do?
Veracity: Does the chatbot avoid generating false or misleading information that it presents as fact?
Balancing these criteria is tricky, and even the biggest companies often get it wrong:
Google's over-corrected "woke" Gemini image generation vs. Microsoft's Tay chatbot, which users quickly turned into a generator of fascist tweets
Additionally, many users have found jailbreaks that bypass the safeguards created through system prompts and RLHF.
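To make the "system prompt" safeguard concrete, here is a minimal sketch using the OpenAI Python client; the model name and the safeguard wording are illustrative placeholders.

```python
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        # The system prompt sets behavioural guardrails the user never sees.
        {
            "role": "system",
            "content": "You are a legal-information assistant. Refuse requests "
                       "for help with illegal acts, and remind users that you "
                       "are not a lawyer.",
        },
        {"role": "user", "content": "How do I forge a signature on a will?"},
    ],
)
print(response.choices[0].message.content)  # expected: a refusal
# Jailbreaks work by crafting user messages that trick the model into
# ignoring such instructions, e.g. role-play framings or encoded requests.
```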
Human?
Empathy: Does the chatbot demonstrate understanding and sensitivity to the user's emotional state or context?
Personalization: Can the chatbot tailor its responses based on the user's preferences, history, or specific needs?
Engagement: Does the chatbot maintain an engaging and natural conversation flow that mimics human interaction?
Chatbots are, in fact, trained to appear artificial: they often remind the user that they are a chatbot, and they tend to use formal, sophisticated language and give long-winded answers rather than attempting a genuine conversation.
Moreover, chatbots continue to have the following limitations:
Understanding Complexity: Chatbots often struggle with understanding and processing complex user queries or nuanced language, such as idioms, sarcasm, and subtle emotional cues.
Limited Knowledge Base: The quality and breadth of the data used to train chatbots directly impact their performance. Biases or gaps in the data can lead to inappropriate or inaccurate responses.
Contextual Awareness: Lacking broader context, chatbots may produce responses disconnected from the user's intent; and while they can simulate empathetic responses, they lack genuine empathy and understanding.
Some chatbots try to solve these problems:
Better contextualization and better access to the knowledge base.
Allows you to make your own chatbots using your own knowledge base (a technique sketched in the code example below).
More empathetic, emotionally intelligent AI
An empathic AI that detects contextual cues from your voice.
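The usual technique behind such knowledge-base chatbots is retrieval-augmented generation (RAG). Here is a minimal, self-contained sketch; the toy bag-of-words "embedding" and the sample passages stand in for the neural embeddings and vector databases used in practice.

```python
from collections import Counter
import math

knowledge_base = [
    "A contract requires offer, acceptance, consideration and intention.",
    "GDPR governs the processing of personal data in the European Union.",
    "RLHF fine-tunes language models using human preference rankings.",
]

def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (real systems use neural vectors)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str) -> str:
    """Return the knowledge-base passage most similar to the query."""
    return max(knowledge_base, key=lambda doc: cosine(embed(query), embed(doc)))

query = "What does the GDPR regulate?"
context = retrieve(query)
# The retrieved passage is stuffed into the prompt so the model answers from
# the user's knowledge base instead of relying only on its training data.
prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```

The design point is that retrieval happens outside the model: the chatbot stays general-purpose, while the prompt supplies the domain knowledge at query time.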
Furthermore, LLMs have many applications other than chatbots.