OpenAI’s brainchild, ChatGPT, has taken the world by storm in a short span. It’s a name that comes up in almost every workplace.
But many people are unaware that three language models and an update, InstructGPT, were launched before OpenAI arrived at the powerful tool ChatGPT. Moreover, the next version in the series, GPT-4, has now been released; it was widely anticipated as an even more powerful model, and the internet is brimming with verdicts on it.
This blog highlights the evolution of the GPT models that have transformed the AI space over the last five years, from the first GPT model released in 2018 to GPT-4 in 2023.
GPT Models:
The world was introduced to some very powerful language models through Generative Pre-trained Transformers (GPTs) before OpenAI finally introduced ChatGPT.
These models can perform a wide range of Natural Language Processing (NLP) tasks, such as question answering, textual entailment, and text summarisation, without any special or supervised training, outperforming earlier NLP models.
GPT-1
GPT-1 has 117 million parameters and was launched back in 2018. Parameters are simply the values a language model learns during training in order to comprehend the various components of language.
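To make the idea of "parameters" concrete, the sketch below counts the learnable weights in one simplified transformer block. The formula is a standard back-of-the-envelope calculation, not GPT-1's exact architecture, which also includes embeddings and layer-norm weights:

```python
# Rough parameter count for one simplified transformer block
# (attention projections + feed-forward network, with biases).

def transformer_block_params(d_model: int, d_ff: int) -> int:
    """Count learnable weights in one attention + feed-forward block."""
    # Self-attention: query, key, value and output projections,
    # each a d_model x d_model weight matrix plus a bias vector.
    attention = 4 * (d_model * d_model + d_model)
    # Feed-forward: d_model -> d_ff -> d_model, with biases.
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    return attention + feed_forward

# GPT-1 used d_model = 768 and d_ff = 3072 across 12 such blocks;
# together with the token and position embeddings, the total lands
# in the neighborhood of 117 million parameters.
print(transformer_block_params(768, 3072))
```

Multiplying the per-block count by GPT-1's 12 layers and adding the embedding tables shows why even this "small" model already needed over a hundred million weights.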
This language model was superior to those that existed in and before 2018, which required supervised learning to perform each task. It handled problems like reading comprehension, common-sense questions, and logical reasoning better, generating answers that were more sensible.
GPT-1 performed tasks better than previous models and demonstrated the robustness of generative pre-training. It paved the way for models like GPT-2, which built on this potential with larger datasets and more parameters.
GPT-2
GPT-2, introduced in 2019 with 1.5 billion parameters, was a bigger and better version of its predecessor. It was trained on more than 10 times the data, with many more model parameters, to produce a robust language model.
The real power and utility of the GPT series were realized with the introduction of GPT-2 as it could produce more natural and ordinary-looking text.
The language model makes use of task conditioning, zero-shot learning, and zero-shot task transfer to enhance model performance.
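Task conditioning and zero-shot transfer boil down to describing the task inside the prompt itself, rather than fine-tuning a separate model per task. A minimal sketch of the idea, with illustrative prompt templates (not the exact formats OpenAI used):

```python
# Zero-shot: the prompt states the task, with no worked examples.
def zero_shot_prompt(task: str, text: str) -> str:
    """Condition the model on a task description alone."""
    return f"{task}\n\n{text}\n\nAnswer:"

# Few-shot: worked examples are prepended before the query,
# so the model infers the task from the demonstrations.
def few_shot_prompt(task, examples, text):
    demos = "\n".join(f"{q} -> {a}" for q, a in examples)
    return f"{task}\n\n{demos}\n{text} ->"

print(zero_shot_prompt("Translate English to French:", "cheese"))
print(few_shot_prompt("Unscramble the word:", [("tca", "cat")], "dgo"))
```

The same pretrained model handles both prompts; only the text it is conditioned on changes, which is what lets one model cover many NLP tasks.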
This model made OpenAI realize that the potential of the GPT architecture was yet to be fully explored, and that building even larger language models would be the answer to ever-evolving human needs. Hence, GPT-3 was introduced with a larger dataset and more parameters, two factors that greatly enhanced the model's ability to comprehend tasks and surpass the state of the art on many tasks in zero-shot settings.
Moreover, GPT-2 lacked contextual understanding and had limited memory and creativity, which paved the way for an updated model, GPT-3.
GPT-3
GPT-3 has 175 billion parameters and was released in June 2020. It was an even more sophisticated and interactive model in the series.
The model was trained on an even larger dataset of 499 billion tokens. It excels at tasks that involve on-the-fly reasoning or domain adaptation, and it performs so well that one wonders whether its outputs were produced by a human. These tasks include story writing, unscrambling words, writing SQL queries and Python scripts, language translation, and 3-digit arithmetic.
However, GPT-3 had some basic issues that led OpenAI to release an updated version named InstructGPT.
GPT-3 could not follow instructions properly and hence did not fully serve its purpose as a chatbot. It would produce grammatically correct sentences, but the answer might not be entirely relevant to the question asked.
With InstructGPT, the generated text was more relevant and meaningful, which made the language model more useful to its users.
In the same year, after introducing InstructGPT in January 2022, OpenAI surprised the world in November with another powerful version of the GPT series: ChatGPT, the real OG of language models so far.
This AI model took human-like task performance to another level. It could write blog posts, film scripts, and code, and assist with YouTube video suggestions and interior-design ideas. This updated version is more conversational and far better suited to everyday human use.
It entertains follow-up questions, admits its errors, and rejects inappropriate requests, making conversations more coherent and reliable. This was viewed as a significant update over all previous GPT models.
However, experts have recently come across some limitations of ChatGPT. The model is prone to manipulation by its users. For instance, users have employed back-door tricks to get the model to answer questions it had previously refused, which can have serious consequences.
Furthermore, the model's training data only extends to September 2021, so it may contain biases and lacks knowledge of recent events and affairs. Moreover, the information it generates is not always source-checked.
Read through ChatGPT v/s Bard to learn more about OpenAI's most widely used AI model and how Bard is challenging its supremacy in the marketplace.
While both InstructGPT and ChatGPT run on the internally updated GPT-3.5, OpenAI announced GPT-4 to overcome the shortcomings of previous models and updates.
Also read blog: How AI Chatbots are transforming technology
GPT-4
OpenAI’s much-anticipated language model, GPT-4, was launched on 14th March 2023. The company's blog claims that the new version is more creative and collaborative than its predecessors.
While ChatGPT, powered by GPT-3.5, only accepted and responded to text inputs, GPT-4 is a large multimodal model that can also take images as input to generate captions and analyses.
OpenAI further claims that ‘GPT-4 also exhibits human-level performance on various professional and academic benchmarks.’ For instance, it can pass a simulated bar exam with a score around the top of the distribution of test takers. The model has a broader knowledge base and better problem-solving abilities.
Furthermore, this model can handle over 25,000 words of text, enabling a greater number of use cases, including long-form content creation, searching and analyzing documents, and extended conversations.
According to the OpenAI blog, this model is more factually accurate and has the best steerability of all the previous models. It supports the largest context so far, 32,768 tokens, or roughly 25,000 words, and can hold long conversations, as opposed to ChatGPT, whose much smaller context of about 4,096 tokens (roughly 3,000 words) would cause it to lose the thread in longer exchanges.
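The word counts quoted for these context windows follow from a common rule of thumb of roughly 0.75 English words per token. The ratio is an approximation that varies with the text, not an exact property of the tokenizer, but the arithmetic is easy to check:

```python
# Rough rule of thumb for English text; actual ratios vary by content.
WORDS_PER_TOKEN = 0.75

def approx_words(tokens: int) -> int:
    """Estimate how many English words fit in a given token budget."""
    return round(tokens * WORDS_PER_TOKEN)

print(approx_words(32_768))  # GPT-4's large context: about 24,600 words
print(approx_words(4_096))   # a 4,096-token context: about 3,000 words
```

The estimate for 32,768 tokens lands near the "over 25,000 words" figure OpenAI quotes for GPT-4's long-context variant.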
GPT-4 is equipped to understand more languages than just English, though it is not yet available free of charge. However, one way to access it for free right now is through Microsoft's updated Bing search engine and chat, which, as the tech giant has announced, now runs on GPT-4.
While we think GPT-4 has largely lived up to expectations, it has only just reached the public and is yet to be scrutinized by the masses. It has certainly brought powerful updates; however, some of the speculation was simply too over the top to be matched.
2018: GPT-1
First transformer-based model in the series
Trained on BooksCorpus
117 million parameters; facilitated transfer learning and capable of performing various NLP tasks with very little fine-tuning
2019: GPT-2
Bigger and optimized version of GPT-1
Trained on WebText (scraped from Reddit)
1.5 billion parameters; used task conditioning, zero-shot learning, and zero-shot task transfer to enhance model performance
2020: GPT-3
Robust version of GPT-2
Trained on five corpora with respective weights: Books1, Books2, Common Crawl, WebText2, Wikipedia
175 billion parameters; capable of in-context learning in few-shot, one-shot, and zero-shot settings
Updates: GPT 3.5
InstructGPT (January 2022)
ChatGPT (November 2022)
2023: GPT-4
Successor to the GPT-3.5 models
Parameter count not disclosed by OpenAI; pre-launch speculation of 100 trillion parameters was never confirmed
Multimodal model that accepts both text and image inputs
Conclusion
OpenAI, through its GPT models, has revolutionized industries and the way they create content. By producing human-like, natural responses, GPT models have certainly reduced human workload and effort. With the GPT-4 model, things are expected to be taken up a notch, overcoming the shortcomings of the earlier GPT models.
Hence, while all the GPT language models so far have served human purposes and paved the way for bigger and better inventions in the AI space, it is imperative to use them with caution due to their susceptibility to being misused or exploited.