What is an LLM? Large language models (LLMs) are a category of foundation models trained on immense amounts of data, making them capable of understanding and generating natural language and other types of content to perform a wide range of tasks.
Ollama is an open-source project that serves as a powerful and user-friendly platform for running LLMs on your local machine. Since you're hosting the LLM on the Pi, there is NO LIMIT to how many questions you can ask. Whoo hoo!
For these tests, it is recommended that you SSH in from your laptop's terminal (rather than using VNC) for a faster experience.
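A typical SSH session from your laptop looks like this (the hostname and username below are assumptions; substitute your own Pi's username and address):

```
$ ssh pi@raspberrypi.local
```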
Off-site tutorial to install Ollama and Ollama models
There are many models to try. Search for some here!
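If you'd rather install from the terminal, Ollama provides an official Linux install script. A typical session on the Pi looks like this (the second command just confirms the install worked):

```
$ curl -fsSL https://ollama.com/install.sh | sh
$ ollama --version
```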
Recommended Models to try:
Phi3 is an LLM developed by Microsoft to be super lightweight while retaining some of the quality results you expect from significantly larger models.
TinyLlama is super lightweight and one of the fastest LLMs you can run on the Raspberry Pi's processor. While it might not produce results as high-quality as larger models like Phi3 or the heavyweight Llama 3, it is still more than capable of answering most basic questions. What TinyLlama lacks in quality, it definitely makes up for in speed.
Llama3 is a heavyweight language model, especially compared to TinyLlama and Phi3. It is capable of some super high-quality results, but you'll need to be patient when running it on your Pi.
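Whichever model you pick, downloading and chatting with it follows the same pattern. A sample session with TinyLlama looks like this (`>>>` is Ollama's interactive chat prompt; the question is just an example):

```
$ ollama pull tinyllama
$ ollama run tinyllama
>>> Why is the sky blue?
```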
Helpful commands:
Ctrl + C to stop a chat midway.
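A few more commands you'll likely use while experimenting (run the first two from the shell, and `/bye` from inside a chat):

```
$ ollama list        # show the models you've downloaded
$ ollama rm phi3     # delete a model to free up SD-card space
>>> /bye             # exit a chat session (or press Ctrl + C)
```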
Here's an example that uses Pygame to create a custom chat interface with TinyLlama: ollama_pygame_chat.py
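Since ollama_pygame_chat.py isn't reproduced here, below is a minimal console-side sketch of the piece any such interface builds on: sending a prompt to the local Ollama server over its REST API (`/api/generate` on port 11434 is Ollama's default endpoint). The model name and example question are placeholders; a Pygame script would call `ask()` from its event loop and draw the returned text to the screen.

```python
import json
import urllib.request

# Ollama's default local REST endpoint
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model, prompt):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # wait for the full reply instead of streaming tokens
    }

def ask(model, prompt, url=OLLAMA_URL):
    """Send one prompt to the local Ollama server and return the reply text."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires `ollama serve` running with tinyllama pulled):
#   print(ask("tinyllama", "Why is the sky blue?"))
```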