Large Language Models (LLMs) can seem complex and daunting at first glance. To cut through the jargon and industry-specific language that surrounds them, we begin with two fundamental questions:
What exactly are Large Language Models, and how do they connect to the concept of Natural Language Processing (NLP)?
How does mathematics help us understand language?
At the core of Large Language Models (LLMs), such as the Generative Pre-Trained Transformer GPT-3 (Floridi & Chiriatti, 2020), are the fundamentals of Natural Language Processing (NLP). NLP sits at the intersection of linguistics and computer science, where some of the most groundbreaking advances in AI have come from transformer architectures (Devlin et al., 2018) and the attention mechanism (Vaswani et al., 2017). These innovations have played a pivotal role in recent AI breakthroughs.
A diagram showing the workflow of Natural Language Processing. Source: https://www.turing.com/kb/natural-language-processing-function-in-ai
In general terms, NLP focuses on developing methods to process human languages, such as English, French, Chinese, and Arabic, by converting them into digital and mathematical representations. This transformation enables automated machine processes to break down, tokenize, align, analyze, refine, and generate text outputs that mimic human-like responses.
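The conversion described above can be sketched in a few lines of code. This is a deliberately simplified illustration, assuming a plain whitespace tokenizer and made-up example text; production LLMs use subword tokenizers (such as byte-pair encoding) and much larger vocabularies.

```python
# A minimal sketch of the first NLP step described above: breaking text
# into tokens and mapping each token to an integer ID, so that language
# becomes a sequence of numbers a machine can process.

def tokenize(text):
    """Lowercase the text and split it into word tokens on whitespace."""
    return text.lower().split()

def build_vocab(tokens):
    """Assign each unique token a stable integer ID, in order of first appearance."""
    return {tok: i for i, tok in enumerate(dict.fromkeys(tokens))}

sentence = "Language models process language as numbers"
tokens = tokenize(sentence)
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]

print(tokens)  # ['language', 'models', 'process', 'language', 'as', 'numbers']
print(ids)     # [0, 1, 2, 0, 3, 4]
```

Note how the repeated word "language" maps to the same ID both times; the text is now a numerical sequence that downstream mathematical steps can analyze.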
Large Language Models leverage vast datasets, with Generative AI models often utilizing attention mechanisms, a deep learning technique, to prioritize the most relevant parts of the data within a dataset. The underlying mathematical computations involve converting sentences into word vectors, analyzing word frequency and probability to generate predictions, and transforming inputs and outputs through multiple layers to ensure coherence and contextual accuracy.
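The attention idea mentioned above can be illustrated with a toy example. This is a hedged sketch, not how any particular model is implemented: the two-dimensional word vectors below are invented for illustration, and real transformers compute attention over learned, high-dimensional vectors across many layers.

```python
import math

# A toy version of scaled dot-product attention: score a query vector
# against each key vector, normalize the scores with softmax so they
# sum to 1, then return the weighted average of the value vectors.

def softmax(xs):
    """Turn raw scores into positive weights that sum to 1."""
    exps = [math.exp(x - max(xs)) for x in xs]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    """Weight each value by how well its key matches the query."""
    scale = math.sqrt(len(query))
    scores = [dot(query, k) / scale for k in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# Three made-up 2-D "word vectors" acting as both keys and values.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
output = attend([1.0, 0.0], vectors, vectors)
print(output)  # a blend that leans toward vectors similar to the query
```

The key intuition is the weighting step: vectors similar to the query receive larger weights, which is how attention "prioritizes the most relevant parts" of the input.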
How can we use Large Language Models to empower High School students?
Listen to the podcast linked on the right and explore lesson ideas that integrate LLMs ethically in educational settings.
Browse the lesson resources and choose the ones that align with your teaching goals and interests.
If you have read this far, it means that at some point, you had to learn language from the ground up!
How do you continue to use mental 'algorithms' to process new language - whether it's learning new vocabulary, jargon, or even adapting to different forms of communication in your daily life?
For further reading, check out Tim B. Lee's article explaining Large Language Models for readers without a technical background in math or computer science.