Introduction To LLMs
E.T. Focus Morning
You are probably familiar with chatting with AI. Most of what we engage with are LLMs (Large Language Models) such as the following. LLMs are trained on massive amounts of data. As a result they can simulate:
Using natural human language to communicate i.e. the ability to converse.
Being able to communicate about a huge range of topics i.e. the content of that communication.
You can play with the LLMs below. Most have a free option. You might need to set up an account, but you can usually sign in with Google, Facebook, or an email address and password.
Claude (Anthropic)
Copilot (Microsoft)
Deepseek (the Chinese model that caused a major stir)
It is very impressive, but there are security concerns. To avoid using the Chinese website, you can access Deepseek:
By installing it on your own computer.
In Perplexity
Gemini (Google)
GPT (Generative Pre-trained Transformers) (OpenAI)
Grok (xAI) - Grok 2 available online. Grok 3 only available to X users.
Llama (Meta)
Go to https://www.meta.ai/ to use the models.
Can be downloaded and run on your own computer.
Most models have a website but also their own smartphone app.
An LLM by any other name
Some sites or tools use one of these models under the hood but provide their own way for people to interact with them. They use APIs (Application Programming Interfaces) to access the LLM behind their website or app.
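To make this concrete, here is a minimal sketch of what a site sends to an LLM through an API. The model name and the question are illustrative placeholders, not any particular provider's real details, but most providers follow a similar "send messages, get a reply back" pattern.

```python
import json

def build_chat_request(model, user_message):
    """Build the JSON payload a website or app would send to an LLM API.

    The field names follow the widely used "chat" style; exact details
    vary from provider to provider.
    """
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": user_message},
        ],
    }

# A site would POST this payload (plus a secret API key) to the
# provider's web address, then show the reply text to its user.
payload = build_chat_request(
    "example-model",  # placeholder, not a real model name
    "Is it true that octopuses have 9 hearts?",
)
print(json.dumps(payload, indent=2))
```

The user of the site never sees any of this; they just type a question and read an answer.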
Perplexity is a little different in that it has its own model but combines that with other models for best performance.
Multiple LLMs
Other sites and tools simply make a number of the above models available, giving the user a choice.
Poe.com offers a large range of LLMs as well as allowing users to create their own bots. (See "Add In System Prompts" below.)
You can ask just about anything of any of the above LLMs, but here are some suggestions.
Retell the story of Queen Esther from the Bible in a way suitable for a 10-year-old boy. Make it dramatic and weave in a subtle lesson we can learn from that story. Tell the story in the first person.
Please give me ideas for developing a relationship with my neighbour. He is roughly 50 and seems to want to keep to himself but I wonder if he is lonely.
Is it true that octopuses have 9 hearts? If so, why?
Help me write a letter to a friend who I feel is ignoring me and I don't know why.
There are more sample prompts on the Prompting page.
Increasingly each company has multiple models that the user can choose from. For example, Gemini currently offers Gemini 2.0 Flash, Gemini 2.0 Flash Thinking and Gemini 2.0 Flash Thinking With Apps on its free tier and even more models for paying users. Each model has its own strengths and weaknesses.
And then there are options within each model.
Perplexity currently offers Auto, Pro, Pro Search, and Deep Research as well as Reasoning with Deepseek, and Reasoning with o3-mini. In Deepseek the user can turn on and off the Deepthink and Search options. OpenAI's GPT has search and reasoning options.
All of this is about finding the tool that best suits the task you are asking of it. It doesn't matter if you don't know the differences, although, obviously, if you want it to be able to search the internet for up-to-date information (as opposed to being limited to the data it was trained on at some point in the past), you will have to choose a model that can do that.
Go to some of the LLMs above and see if there are different models available.
Put the same prompt into two different models and see how the responses differ.
"Prompting" is the act of telling the LLM what you want it to do. There is a skill in knowing how to write good prompts (see below).
However, some prompts are built into the system. They do not change the underlying nature of the LLM. They do not increase its knowledge; they do not further train it in any way, but they are additional prompts telling the LLM how to respond. They might, for example, say, "You are an expert in nuclear physics. You are able to explain difficult concepts in easy-to-understand ways; however, you can also get a little sarcastic if you think you are being asked similar questions repeatedly." A system prompt for a customer service chatbot might be "You are a friendly and helpful customer service representative for Acme Corporation. Always be polite and professional, and strive to provide accurate and helpful information." System prompts describe aspects such as the LLM's level of expertise, style of response and personality.
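In API terms, a system prompt is simply an extra message sent ahead of the user's message on every request. A minimal sketch (the role names follow the common "chat" convention; providers differ in details, and the prompts here are just examples):

```python
def build_messages(system_prompt, user_prompt):
    """Combine a fixed system prompt with a user's prompt.

    The system message steers tone and persona. It travels with every
    request but is invisible to the person chatting with the bot.
    """
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages(
    "You are a friendly and helpful customer service representative "
    "for Acme Corporation. Always be polite and professional.",
    "What are your opening hours?",
)
for m in messages:
    print(m["role"], "->", m["content"][:50])
```

This is why a bot built this way stays "in character": the system message is resent with every question.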
Users can build their own bots on:
On those sites you can also use other people's bots. There might be a cost.
Go to https://chatgpt.com/ or poe.com and try some of the bots there.
I have several bots on Poe.com you can play with (once you have an account):
Several characters with different personalities on whom you can practise your conversation and evangelism skills.
Experts
AI
AI-Julia - an expert on AI.
AI In Church - the bot will ask you questions about your knowledge of using AI in a church, then add comments.
The Bible
Bible Language Prof - an expert on the Bible languages, able to parse words or passages
Church
Revitalisation Guru - a church consultant specialising in church revitalisation.
Church Consultant - asks questions about your church, then gives an assessment
Small Group Mentor - an expert on healthy small groups
Grandads who tell Bible stories
Grandad's Bible Time - tells Bible stories for children
Bible Time For Teens - tells Bible stories for teenagers
Fine-tuned LLMs are trained on additional specialist information. They retain the understanding of language and the ability to converse. They might retain all of the general knowledge that comes with an LLM, or they might not; they might be able to answer only from the material used to fine-tune them.
Examples include:
Expert chatbots in particular fields e.g. trained on medical literature.
Customer service chatbots on a company's (or church's) website that are trained on the information relevant to that company (internal documents, product or service information, FAQs, etc.). Users can ask for detailed information about that company because the bot is trained on it. A more general LLM would not have that knowledge.
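Fine-tuning data is usually just a file of example exchanges: the kinds of questions people ask, paired with the answers the bot should give. Here is a hedged sketch of what one training example might look like, in a JSON Lines layout similar to (but not exactly) the formats major providers accept. The question and answer are invented for illustration.

```python
import json

# Each line of a fine-tuning file is one worked example showing the
# model the kind of answer it should give.
examples = [
    {
        "messages": [
            {"role": "user", "content": "What time is the Sunday service?"},
            {"role": "assistant", "content": "Our Sunday service starts at 10am."},
        ]
    },
]

# One JSON object per line ("JSON Lines"); a real file would hold
# hundreds or thousands of such examples.
jsonl = "\n".join(json.dumps(example) for example in examples)
print(jsonl)
```

The more (and better) examples in the file, the more reliably the fine-tuned bot answers in the desired way about that specific material.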
When comparing LLMs, you might consider...
Internet Access: Some models are able to access the internet. Those that cannot will have a cut-off date for the information they were trained on and will not be able to provide more recent information.
Proprietary vs. Open Source: Some LLMs, like GPT-4 and Gemini, are proprietary, meaning their code and training data are not publicly available. Others, like Llama 2 and Falcon, are open source, allowing researchers and developers to access and modify them.
Multimodal Capabilities: Some LLMs, like GPT-4 and Gemini, are multimodal, meaning they can process and understand different types of data, such as images and text.
Performance and Specialization: Different LLMs excel at different tasks. Some are better at text generation, while others are better at reasoning or code generation.
Accessibility and Cost: Some LLMs are available through APIs with usage-based pricing, while others are available for free.
Gemini 2.0 says...
Comparing LLMs is a complex task, as they vary significantly in their strengths and weaknesses. Here's a breakdown of key criteria you should use to evaluate and compare them:
1. Performance on Benchmarks:
Standardized Tests: LLMs are often evaluated on standardized benchmarks like GLUE, SuperGLUE, and MMLU (Massive Multitask Language Understanding). These tests assess various language tasks, including natural language inference, question answering, and common sense reasoning. Higher scores generally indicate better performance.
Domain-Specific Benchmarks: If you have a specific application in mind (e.g., medical text processing, legal document analysis), look for benchmarks relevant to that domain.
2. Capabilities:
Language Understanding: How well does the LLM understand nuanced language, including sarcasm, humor, and complex sentence structures?
Text Generation: How fluent, coherent, and creative is the text generated by the LLM? Does it avoid grammatical errors and generate contextually appropriate responses?
Reasoning and Problem-Solving: Can the LLM perform logical reasoning, solve math problems, and answer complex questions that require inference?
Knowledge and Information Retrieval: How much factual knowledge does the LLM possess? Can it accurately retrieve and synthesize information from various sources?
Code Generation: Can the LLM generate code in various programming languages? How well does it handle different coding styles and complex programming tasks?
Multilingualism: How well does the LLM perform in different languages? Can it translate text accurately and generate text in multiple languages?
3. Practical Considerations:
Computational Resources: How much computing power (GPU, memory) is required to run the LLM? Some LLMs are very resource-intensive.
Inference Speed: How quickly can the LLM generate responses? This is crucial for real-time applications like chatbots.
Cost: Some LLMs are available through APIs with usage-based pricing. Consider the cost of using the LLM for your specific needs.
Accessibility: Is the LLM readily available through an API or open-source? Are there any restrictions on its use?
Fine-tuning Options: Can the LLM be fine-tuned on a specific dataset to improve its performance on a particular task? How easy is the fine-tuning process?
Documentation and Support: How comprehensive is the documentation for the LLM? Is there good community support or dedicated support from the provider?
4. Ethical Considerations:
Bias: Does the LLM exhibit any biases in its responses? LLMs can inherit biases from the data they were trained on.
Safety: Can the LLM generate harmful or offensive content? How robust is it against adversarial attacks?
Explainability: Can you understand why the LLM generated a particular response? Some LLMs are more "black boxes" than others.
5. Emerging Capabilities:
Keep an eye on emerging capabilities like:
Multimodal Learning: LLMs that can process and generate not just text, but also images, audio, and video.
Interactive Learning: LLMs that can learn from interactions with users and adapt their behaviour over time.
How to Compare:
Direct Comparison: When possible, test different LLMs on the same task or dataset to directly compare their performance.
Read Research Papers: Look for research papers that evaluate and compare different LLMs.
Use Benchmarking Tools: Several tools are available that can help you evaluate and compare LLMs on various metrics.
It's important to note that no single LLM is best for all tasks. The ideal LLM for you will depend on your specific needs and priorities. Carefully consider the criteria above to make an informed decision.
There are many tests of the proficiency of LLMs, and leaderboards that compare them.