posted 1/17/25 by Kason Yoo
You want to look something up or you have a question you can’t answer, so your first instinct is to log onto ChatGPT. You type in your request and press the arrow button, and almost immediately, the AI gives you paragraphs of information. Although its use is controversial when it comes to academic integrity, it is undoubtedly fast and efficient. So how does it come up with so much so fast? How does ChatGPT work?
The first place to start is the AI’s name. GPT stands for Generative Pre-Trained Transformer, a type of large language model (LLM) used for generative AI. Starting with the P, ChatGPT is trained before anyone ever uses it, through both supervised and unsupervised training. In supervised training, the AI is shown example prompts paired with the responses it should give. In unsupervised training, the AI is given large amounts of text with no specific output attached. It picks up the underlying structure and patterns in that text without a particular task in mind, and later draws on those patterns to form its output.
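To make the idea of learning patterns from unlabeled text more concrete, here is a minimal sketch in Python. It is not how ChatGPT is actually trained: it swaps in a simple word-pair counter for the real neural network, and the function names and the tiny corpus are made up for illustration. The point is only that nobody labels the data; the program finds the pattern "which word tends to follow which" on its own.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the raw text.

    Illustrative stand-in for unsupervised training: no labels are given,
    only the text itself.
    """
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus; a real model reads vastly more text.
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept"
model = train_bigram_model(corpus)

print(predict_next(model, "the"))    # most common word after "the" in the corpus
print(predict_next(model, "zebra"))  # a word the model has never seen
```

In this toy corpus, "cat" follows "the" more often than any other word, so that is the prediction. A real language model works with far richer patterns than word pairs, but the spirit is similar: patterns come from the text itself rather than from hand-written answers.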
Next is the T. A transformer is a neural network that transforms an input sequence into an output sequence, processing information through layers of interconnected nodes. Before any of that happens, sentences are broken up into tokens, which are basic units of text that can be encoded as numbers. Once the sentence is encoded, you’re left with a block of data that represents the input. Each transformer layer then has two main sublayers: the self-attention layer and the feedforward layer. The self-attention layer looks at all the tokens in the prompt at once, weighing what the words mean, how they relate to one another, and which ones are most relevant. The feedforward layer then passes that data through a small neural network, which lets the model learn patterns and make predictions about the input. Together, these layers help ChatGPT produce more accurate and natural-sounding responses.
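The steps described above (tokenize, encode, self-attention, feedforward) can be sketched in a few lines of plain Python. This is a heavily simplified illustration under stated assumptions, not ChatGPT's implementation: the tokenizer just splits on spaces, the embedding table and weights are invented numbers, and the attention step skips the learned query, key, and value projections that real transformers use.

```python
import math

# Hypothetical 2-number encodings for a few tokens (real models learn these).
EMBEDDINGS = {
    "the": [0.1, 0.0],
    "river": [0.9, 0.6],
    "bank": [1.0, 0.5],
}


def tokenize(sentence):
    """Split a sentence into tokens. Real systems use subword tokenizers."""
    return sentence.lower().split()


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Simplified scaled dot-product attention.

    Assumption: queries, keys, and values are all the input vectors
    themselves (no learned projections).
    """
    size = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score how relevant every token is to this one.
        scores = [dot(query, key) / math.sqrt(size) for key in vectors]
        weights = softmax(scores)
        # Blend all token vectors according to those relevance weights.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(size)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


def feed_forward(vector, w1, w2):
    """Two-layer network with a ReLU in between; each row holds one unit's weights."""
    hidden = [max(0.0, dot(vector, row)) for row in w1]
    return [dot(hidden, row) for row in w2]


tokens = tokenize("The river bank")
encoded = [EMBEDDINGS[t] for t in tokens]
attended, weights = self_attention(encoded)

# Made-up weights: 2 inputs -> 3 hidden units -> 2 outputs.
W1 = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]
W2 = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
transformed = [feed_forward(v, W1, W2) for v in attended]

for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how much attention one token pays to every token in the sentence. The weights in a row always add up to 1, which is what lets the model treat them as "how relevant is each word to this one."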
To be as effective as possible, ChatGPT relies on Natural Language Processing (NLP), the set of techniques that allows an AI to understand, interpret, and generate human language. NLP is what powers applications such as chatbots, speech recognition, and translation. NLP models learn the syntax of human language and represent it in a form that algorithms can work with. They do this by breaking text down and analyzing the meanings and relationships of its parts in order to generate a response.
That is a quick overview of how ChatGPT works, but it only scratches the surface of this groundbreaking AI. A lot of technology goes into it, and although that technology may be complicated, it makes the application much simpler and faster to use.
Sources:
https://www.zdnet.com/article/how-does-chatgpt-work/