To address the competency of "human-centred mindset", the lesson plan "Evaluating LLMs" has been developed, drawing inspiration from an activity featured on Edutopia. In this lesson, students begin by learning how AI chatbots, or large language models (LLMs), function. They then interact with an LLM and evaluate its responses using a provided rubric. The goal is to help students understand both the strengths and limitations of these tools, encouraging critical thinking about when and how LLMs can be used effectively.
You can download the lesson plan PDF here.
Grade Level: 10-12
Time Required: 45-60 minutes
By the end of this lesson, students will be able to:
Craft effective prompts to interact with an AI chatbot.
Analyze AI-generated responses for accuracy, bias, and relevance.
Critically evaluate AI as an information source.
Reflect on ethical considerations when using AI tools.
Materials:
Access to a chatbot (such as ChatGPT, SchoolAI, or another AI platform)
A video from code.org: How Chatbots and Large Language Models Work
A rubric for evaluating chatbot responses (provided below)
Student devices with internet access
Whiteboard or shared document for group discussion
Activity: Watch the video on How Chatbots and Large Language Models Work
Discussion Questions:
How do chatbots like ChatGPT, Google Bard, or SchoolAI generate responses?
Why is it important to critically evaluate AI-generated responses?
Mini-Lecture:
Introduce the concept of prompt engineering – the practice of wording and refining a prompt to elicit better answers.
Highlight potential biases in AI models based on their training data.
Step 1: Give Students a Prompt
Provide students with a general topic or let them choose one.
Example topics:
“Explain climate change to a 10-year-old.”
“Summarize the causes of World War II.”
“Find the best way to solve a Rubik’s Cube.”
Step 2: Initial Interaction with the Chatbot
Have students enter their prompt into the chatbot and record the response.
Ask students to note:
Was the response clear and helpful?
Did it contain any factual errors?
Was there any noticeable bias?
Step 3: Iterating on the Prompt
Give students time to adjust their prompt to try to get a better response.
Encourage experimentation with:
More specific wording
Different perspectives (e.g., "Explain as if I were a beginner.")
Requesting sources or citations
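For students comfortable with code, the iteration strategies above can be sketched as simple prompt transformations. This is an illustrative sketch only: the helper function names are hypothetical and not part of any chatbot API; students would paste the resulting string into their chatbot.

```python
# Illustrative sketch of the prompt-iteration strategies above.
# The helper names are hypothetical, not part of any chatbot API.

BASE_PROMPT = "Explain climate change to a 10-year-old."

def add_specific_wording(prompt: str, detail: str) -> str:
    """Tighten the prompt with more specific wording."""
    return f"{prompt} Focus specifically on {detail}."

def add_perspective(prompt: str, audience: str) -> str:
    """Reframe the prompt for a different audience or perspective."""
    return f"{prompt} Explain it as if I were {audience}."

def request_sources(prompt: str) -> str:
    """Ask the chatbot to back up its answer with sources."""
    return f"{prompt} Cite the sources you are drawing on."

# Chain two refinements onto the base prompt, then hand the
# result to the chatbot and compare responses.
refined = request_sources(add_perspective(BASE_PROMPT, "a complete beginner"))
print(refined)
```

Comparing the chatbot's answer to the base prompt against its answer to the refined one makes the effect of each change concrete.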
Step 4: Evaluating the AI Responses with a Rubric
Provide students with the rubric (provided below) to assess the chatbot's response.
Students score their chatbot’s response and reflect on their findings.
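Scoring and tallying the rubric can be sketched in a few lines. Note the assumptions: the criteria mirror the lesson's questions (accuracy, bias, relevance, clarity), but the 1–4 scale and field names are placeholders, not the provided rubric itself.

```python
# Minimal sketch of tallying rubric scores for one chatbot response.
# Criteria follow the lesson's questions; the 1-4 scale is an assumption.

scores = {
    "accuracy": 3,   # Did it contain factual errors? (1 = many, 4 = none found)
    "bias": 4,       # Was there noticeable bias? (1 = strong, 4 = none noticed)
    "relevance": 3,  # Did it actually answer the prompt?
    "clarity": 2,    # Was the response clear and helpful?
}

total = sum(scores.values())
maximum = 4 * len(scores)
print(f"Chatbot response scored {total}/{maximum}")
```

Students could re-score the response to their refined prompt and compare totals, which turns the "what changed?" discussion into a concrete before-and-after comparison.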
Students share their results:
What changes to their prompt made a difference?
Were there any patterns in how the AI responded?
Did anyone find clear bias or misinformation?
Discussion Questions:
What does this tell us about using AI for research?
How can AI be used responsibly in school and work?
What ethical concerns arise from AI-generated content?
Exit Ticket: Students write a short response to:
“How can you ensure AI-generated responses are trustworthy?”