Large Language Models and Beyond
“Today a reader, tomorrow a leader.”
― Margaret Fuller
By Stephen Wolfram (February 14, 2023)
A highly recommended article for a basic understanding of how ChatGPT works from the perspective of complex systems.
Seminal Papers about Large Language Models
"Improving Language Understanding by Generative Pre-Training" by Radford et al. (2018): This is the paper that introduced the first version of the GPT model. It laid the foundation for the use of transformer-based models in natural language processing.
"Language Models are Unsupervised Multitask Learners" by Radford et al. (2019): This paper presents GPT-2, a scaled-up successor to the original GPT model with up to 1.5 billion parameters, trained on the larger WebText dataset.
"Language Models are Few-Shot Learners" by Brown et al. (2020): This paper introduces GPT-3, the third iteration in the GPT series, with 175 billion parameters. It highlights the model's few-shot learning capabilities: given only a handful of examples in the prompt, it performs many tasks without task-specific fine-tuning.
"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" by Devlin et al. (2018): While not a GPT paper, this work by researchers at Google is seminal for LLMs. BERT pre-trains bidirectional language representations with a masked language modeling objective, so each token is predicted from both its left and right context.
"Attention Is All You Need" by Vaswani et al. (2017): Although not a GPT paper, this work is crucial because it introduced the transformer architecture, which is the backbone of models like GPT-2 and GPT-3.
"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" by Raffel et al. (2019): This paper from Google researchers presents the T5 model, which treats every language problem as a text-to-text problem, providing a unified framework for various NLP tasks.
"XLNet: Generalized Autoregressive Pretraining for Language Understanding" by Yang et al. (2019): XLNet is another important model in the LLM domain, which outperformed BERT on several benchmarks by using a generalized autoregressive pretraining method.
"ERNIE: Enhanced Representation through Knowledge Integration" by Sun et al. (2019): Developed by Baidu, ERNIE extends BERT-style pre-training with entity-level and phrase-level masking, and reports improvements over BERT on several Chinese NLP tasks.
Other Seminal Papers on Language Models and Generative AI
The Utility of Large Language Models and Generative AI for Education Research: This paper explores the integration of NLP feature extraction techniques with machine learning models like SVMs and Decision Trees for educational applications like automated grading.
Science in the Age of Large Language Models: Published in Nature Reviews Physics, this article discusses the critical stage of generative AI (GenAI) in scientific research and the importance of integrating GenAI responsibly into scientific practice.
An editorial from MIT Press, "What Have Large-Language Models and Generative AI Got to Do With It?", delves into the implications of generative algorithms and the ethical use of AI-generated text in various contexts.
What ChatGPT and Generative AI Mean for Science: This Nature article provides insights into the role of ChatGPT and generative AI in the scientific community, highlighting potential impacts and considerations.
Fundamentals of Generative Large Language Models and Perspectives in Cyber-Defense: This paper discusses the text generation capabilities of LLMs, including various sampling strategies like maximum likelihood and top-K, crucial for understanding the functioning of these models.
Large Language Models for Generative Information Extraction: This survey looks at the application of LLMs in information extraction, showcasing various models and techniques in this domain.
Autonomous Chemical Research with Large Language Models: Published in Nature, this paper discusses the application of LLMs in automating chemical research, highlighting the integration of models like GPT-4 with robotic systems for laboratory tasks.
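The sampling strategies named in the cyber-defense entry above (maximum likelihood and top-K) can be illustrated with a minimal sketch. It is not taken from any of the listed papers: the function names and the five-token `logits` list are hypothetical, and a real model would produce scores over a vocabulary of tens of thousands of tokens.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """Maximum-likelihood (greedy) decoding: always take the top-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def top_k_sample(logits, k, rng=random):
    """Top-K sampling: keep the k best tokens, renormalise, then sample one."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the vocabulary size")
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top])
    return rng.choices(top, weights=probs, k=1)[0]


# Hypothetical next-token scores for a five-token vocabulary.
logits = [2.0, 0.5, 1.0, -1.0, 3.0]
greedy_token = greedy_pick(logits)
sampled_token = top_k_sample(logits, k=2, rng=random.Random(0))
```

Greedy decoding is deterministic, while top-K trades some likelihood for variety; with `k=1` the two coincide.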
Large Language Models and Robotics [2024]
Research Papers
AuRo special issue on large language models in robotics: This special issue of Autonomous Robots focuses on the use of LLMs in robotics. It contains 8 papers covering a range of topics, including robotic applications such as chemistry, robotic control, task planning, anomaly detection, and more.
Creative Robot Tool Use with Large Language Models: This paper discusses the use of LLMs in teaching robots complex skills.
Incremental Learning of Humanoid Robot Behavior from Large Language Models: This paper presents a system that deploys LLMs for high-level orchestration of the robot's behavior.
Blogs
Large Language Models-powered Human-Robotic Interactions: This blog post discusses the potential of LLMs in enhancing human-robotic interactions.
Making robots more helpful with language: This blog post discusses how the use of LLMs can improve the performance of robots and enable them to execute more complex and abstract tasks.
Exploring the Synergy of Large Language Models and Social Robots with Cognitive Models: This blog explores the synergies between LLMs and social robots equipped with cognitive models.
Eureka! NVIDIA Research Breakthrough Puts New Spin on Robot Learning: This blog post discusses a new AI agent developed by NVIDIA Research that uses LLMs to teach robots complex skills.
Using large language models to code new tasks for robots: This blog post discusses the use of LLMs in coding new tasks for robots.
Large Language Models and Cognitive Science [2024]
Research Papers
Turning large language models into cognitive models: This paper discusses whether large language models can be turned into cognitive models. It finds that after fine-tuning them on data from psychological experiments, these models offer accurate representations of human behavior.
Cognitive Effects in Large Language Models: This work tested GPT-3 on a range of cognitive effects, which are systematic patterns usually found in human cognitive tasks. It found that LLMs are indeed prone to several human cognitive effects.
Blogs
How Large Language Models Will Transform Science, Society, and AI: This blog post discusses the impact of large language models on various fields, including cognitive science.
Large Language Models: A Cognitive and Neuroscience Perspective: This blog provides insights into the relationship between large language models and cognitive neuroscience.
10 Exciting Projects on Large Language Models (LLM): This blog post explores 10 exciting projects that harness the power of LLMs.
property of CIPAR TEAMS © - 2024