This lesson simulates a language model. The training data is a set of sentence strips and the predefined relationships between the words in each sentence. Just as a large language model (LLM) generates text based on the relationships between tokens (words and parts of words) and the probability of the next token, our language model will use the relationships between our tokens (the words in the containers) and probability to “generate” sentences based on the data we add to the model.
Prior to Lesson: If students are not familiar with the book, read or watch Green Eggs and Ham
Warm Up Activity: Complete the sentence
Whole Class Activity: Train an unplugged AI language model using sentences from Green Eggs and Ham
Green Eggs and Ham is used because of the repetitive nature of its text. Much of what the character who is not Sam says begins with the word “I,” and those are the sentences we will use to build our language model.
Each student will:
Read the sentence to be added to the model
Cut the sentence strip into 5 parts
Add the word(s) of the sentence strip to containers 1 through 4
If part 5 of the sentence strip is not blank
Add the word(s) of the sentence strip to container 5
Each student will:
Use the fill-in-the-blank sentence strips to write their own sentence in the style of Green Eggs and Ham
Read their sentence
Cut the sentence strip into 5 parts
Add the word(s) of the sentence strip to containers 1 through 4
If part 5 of the sentence strip is not blank
Add the word(s) of the sentence strip to container 5
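For teachers who want to see the unplugged activity as a program, the training steps above can be sketched in code: each container is a list, and “training” simply adds each part of a cut sentence strip to the container for its position. This is an illustrative sketch, not part of the lesson; the example strips are drawn from the book.

```python
# Five containers, one per sentence part: 1-circle, 2-heart,
# 3-triangle, 4-square, 5-star.
containers = {1: [], 2: [], 3: [], 4: [], 5: []}

def train(parts):
    """Add each part of a cut sentence strip to the container for
    its position. A blank part 5 is skipped, as in the lesson."""
    for position, part in enumerate(parts, start=1):
        if part:
            containers[position].append(part)

# Two example strips, already cut into parts (a part may hold more
# than one word; the second strip has a blank fifth part):
train(["I", "do not", "like", "them,", "Sam-I-Am"])
train(["I", "would not", "eat", "them", ""])
```

After these two strips are “trained,” container 1 holds two copies of “I” and container 5 holds only “Sam-I-Am,” because blank fifth parts are never added.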
Small Group Activity: Generate & illustrate sentences using model
Divide students into pairs or small groups of 4-6. Students could also work individually.
Each student/pair/group will:
Generate a sentence using the language model the class built.
After sentences are generated, each student will document the sentence and “illustrate” it in some way:
Write or record their sentence and draw a picture of their sentence (on paper or on a platform like Seesaw).
Code their sentence in ScratchJr.
Record themselves reading the sentence and/or type the sentence into the label for a scene.
Use ScratchJr characters to act out the sentence.
Act out their sentence and, optionally, have other students guess what their sentence is.
Return the words of the sentence to the proper containers.
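The generation step can be sketched the same way: draw one part at random from each of containers 1 through 4, then join the parts into a sentence. Because frequent words appear in a container more often, they are drawn more often, which is the probability at work. The container data below is illustrative, and including container 5 about half the time is an assumption, not the lesson's exact rule (see the Algorithms handout).

```python
import random

# Containers already "trained" with a few parts from the book
# (illustrative data, not the full class model):
containers = {
    1: ["I"],
    2: ["do not", "would not"],
    3: ["like", "eat"],
    4: ["them,", "them"],
    5: ["Sam-I-Am"],
}

def generate():
    """Draw one part at random from containers 1-4 and join them.
    The optional fifth part is included about half the time
    (an assumed rule for this sketch)."""
    parts = [random.choice(containers[i]) for i in range(1, 5)]
    if containers[5] and random.random() < 0.5:
        parts.append(random.choice(containers[5]))
    return " ".join(parts)

sentence = generate()
```

Every generated sentence starts with “I,” just as in the book, because that is the only word ever added to container 1.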
Teacher Materials
5 containers (buckets, bowls, boxes, etc.) labeled with the shape and number for each of the five parts of the sentence: 1-circle, 2-heart, 3-triangle, 4-square, 5-star.
Scissors
Cardstock (for printing)
Teacher device and printer
Optional:
Student Materials
Sentence strips from Green Eggs and Ham, preferably printed on cardstock, cut into individual sentence strips (1 per student)
There are 35 sentences in total. If you have fewer than 35 students, do one of the following:
Give some students more than one sentence
Use fewer sentences
Fill-in-the-blank sentence strips, preferably printed on cardstock, cut into individual sentence strips (1 per student)
Pages 1 & 2 of the fill-in-the-blank document contain 10 fill-in-the-blank sentences appropriate for all grade levels, K-2. Print as many copies as needed to provide each student with one sentence strip to customize.
Pages 3 & 4 of the fill-in-the-blank document are most appropriate for grade 2 or above. Students must fill in the missing word with a plural pronoun or plural noun for proper verb agreement.
If you are using this activity with upper elementary students or beyond, page 5 of the fill-in-the-blank document may be appropriate. It is important that students follow the structure of the sentences from Green Eggs and Ham, where:
Print out of Algorithms To Train Our Language Models (1 per table group for 2nd grade & above)
Paper, blank sentence strips, or personal whiteboards for recording generated sentences
Pencils (1 per student)
Once this lesson has been completed, the trained language model can make a good station for students to visit independently to generate one or more sentences and illustrate them.
Background Information
Large language models (LLMs) are trained on huge amounts of text using a neural network architecture called a transformer (the T in GPT: Generative Pre-trained Transformer) to make sense of the data. LLMs are very good at finding patterns in how words and phrases relate to each other and at predicting which words should come next. Unlike autocorrect, which looks at only one word or short phrase at a time, an LLM chooses each new word in the context of everything it has generated so far, so it can seem as if the AI is having a conversation with you. Because of this, people often describe the output of an LLM as thoughtful and creative. BUT … LLMs do not really “know” anything and they are not creative; they are just very good, and very fast, at figuring out which word is likely to follow another.
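The “predict the next word” idea can be shown with a tiny next-word model that only counts which word follows which (a bigram model). This is a deliberate simplification for teacher background, not how the lesson or a real LLM works: real LLMs use transformers over long contexts, not single-word lookups.

```python
import random
from collections import defaultdict

def build_model(sentences):
    """Record, for each word, every word that ever followed it."""
    following = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            following[current].append(nxt)
    return following

def next_word(model, word):
    """Pick a likely next word: each recorded follower is equally
    likely, so frequent followers are chosen more often."""
    return random.choice(model[word])

model = build_model([
    "I do not like them",
    "I do not like green eggs and ham",
    "I would not eat them",
])
```

With this tiny training set, “do” is always followed by “not,” while “I” can be followed by either “do” or “would,” with “do” twice as likely because it appeared twice.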
Additional Resources:
1A-DA-07 Identify and describe patterns in data visualizations, such as charts or graphs, to make predictions.
1B-DA-07 Use data to highlight or propose cause-and-effect relationships, predict outcomes, or communicate an idea.
Students will be looking for patterns in the data added to the model to predict words that are likely to occur.
1A-AP-08 Model daily processes by creating and following algorithms (sets of step-by-step instructions) to complete tasks.
Students will be following an algorithm to build and use the model.
K-2.4-A-i Demonstrate knowledge of the structure of language through tasks such as (a) generating plausible and implausible novel words, or (b) reordering the words in a scrambled sentence so that it makes sense.