The ability to run Large Language Models (LLMs) directly on Android smartphones offers a new way to interact with AI technology. This approach brings practical benefits: enhanced privacy, since your prompts never leave the device, and offline availability, since no internet connection is required once the model is downloaded.
In the following guide, we'll set up Termux, Ollama, and a compact LLM like Llama 3.2 1B on an Android device.
Termux is a terminal emulator and Linux environment application for Android devices. It provides a command-line interface similar to what you'd find on a Linux computer, but directly on your Android smartphone or tablet. Termux enables users to run programming languages, execute scripts, manage files, and use network tools without requiring root access.
For this guide, install Termux directly from the Google Play Store.
Termux interface on Android.
After installation, update Termux and install the latest packages by running:
$ pkg update && pkg upgrade
Enter y when prompted to confirm the upgrade.
Now, in Termux, install Ollama's build dependencies by running:
$ pkg install git cmake golang
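Before cloning anything, you can sanity-check that the tools installed above are actually on your PATH. A small sketch; it prints `ok` or `missing` for each tool:

```shell
# Check that each build dependency resolves to an executable
for tool in git cmake go; do
  if command -v "$tool" >/dev/null; then
    echo "$tool: ok"
  else
    echo "$tool: missing"
  fi
done
```

If anything reports `missing`, rerun the `pkg install` command before continuing.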
Ollama is an open-source tool that enables users to run LLMs locally on their own devices. It provides a streamlined way to deploy, manage, and interact with various AI models without relying on cloud services.
First, clone Ollama's repository:
$ git clone --depth 1 https://github.com/ollama/ollama.git
And build it:
$ cd ollama
$ go generate ./...
$ go build .
After building it, copy the resulting ollama binary to Termux's global binary directory ($PREFIX/bin), so it can be invoked from anywhere:
$ cp ollama /data/data/com.termux/files/usr/bin/
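The destination above works because it is on Termux's `$PATH`. The same mechanism can be sketched with a throwaway directory and a hypothetical `hello` script (names here are for illustration only):

```shell
# Any directory on $PATH behaves like Termux's usr/bin:
# drop an executable in it and it becomes a global command.
mkdir -p /tmp/mybin
printf '#!/bin/sh\necho hello\n' > /tmp/mybin/hello
chmod +x /tmp/mybin/hello
PATH="/tmp/mybin:$PATH" hello    # prints: hello
```

This is exactly what the `cp` into `/data/data/com.termux/files/usr/bin/` achieves for the ollama binary.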
And remove unnecessary files:
$ chmod -R 700 ~/go
$ rm -rf ~/go
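Why the `chmod` first? Go creates its module cache without write permission, and `rm -rf` cannot unlink entries from a directory it lacks write access to. A sketch of the same situation with a throwaway directory (note that running as root bypasses the permission check):

```shell
# Simulate Go's read-only module cache layout
mkdir -p /tmp/fakecache/pkg
touch /tmp/fakecache/pkg/mod.file
chmod 500 /tmp/fakecache/pkg       # read+execute only, like Go's cache
rm -rf /tmp/fakecache 2>/dev/null  # typically fails for non-root users
chmod -R 700 /tmp/fakecache 2>/dev/null  # restore write permission
rm -rf /tmp/fakecache              # now the delete succeeds
```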
Llama 3.2 1B, released on September 25, 2024, is a smaller version of Meta's language model. It is designed to be more resource-efficient, making it suitable for deployment on devices with limited resources.
Let's create a script to automate the download and execution of this LLM. First, create a file named start_ollama.sh:
$ nano ~/start_ollama.sh
Next, copy these exact lines and paste them into the script you've just created:
#!/data/data/com.termux/files/usr/bin/bash
# Kill any existing Ollama server processes
echo "Killing Ollama..."
pkill -x ollama
# Start Ollama server in the background, detached from this shell
echo "Starting Ollama..."
nohup ollama serve > ~/ollama_server.log 2>&1 &
# Wait for the server to initialize
sleep 2
# Run the model
ollama run llama3.2:1b
Once you've finished editing the script, save it and close the file. To make it executable, you'll need to change its permissions:
$ chmod +x ~/start_ollama.sh
To download and start using the LLM, execute the script (the first run also downloads the model, so expect it to take a while):
$ ./start_ollama.sh
Every time you run it, the script will restart the Ollama process and load the Llama 3.2 1B model, providing a fresh instance for each use.
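The stop-then-restart pattern the script relies on can be sketched in isolation, with `sleep` standing in for the Ollama server (a stand-in chosen purely for demonstration):

```shell
# Stop any existing instance by exact process name, then relaunch it
# detached from the shell with its output captured in a log file.
sleep 60 &                          # a stand-in "server" process
pkill -x sleep                      # stop it by exact name, like pkill -x ollama
nohup sh -c 'echo server restarted' > /tmp/demo.log 2>&1 &
wait                                # (demo only) let the new process finish
cat /tmp/demo.log
```

Because `nohup` detaches the process and redirects its output, the server keeps running and logging even after the launching shell exits.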
Llama 3.2 1B is one of the smaller LLMs available (but not necessarily the best). If, like me, you're interested in exploring other models, I recommend visiting these two websites, which offer a variety of models for download:
- https://ollama.com/search
- https://huggingface.co/models
However, keep in mind that larger models require more resources from your device, which can lead to slower performance or may prevent them from running at all.
The IBM Granite-3.1-2B-Instruct model has been my top choice for running on my Samsung Galaxy S21. It offers a great balance between performance and size, making it ideal for this device.
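To try Granite (or any other model), point the script's last line at a different tag. The sketch below edits a copy of the launch script; `granite3.1-dense:2b` is the tag as I recall it, so verify the exact name on ollama.com/search before relying on it:

```shell
# Edit a copy of the launch script to run a different model.
# The fallback printf just makes this snippet self-contained if the
# original script is missing.
cp ~/start_ollama.sh /tmp/start_granite.sh 2>/dev/null ||
  printf 'ollama run llama3.2:1b\n' > /tmp/start_granite.sh
sed -i 's|llama3.2:1b|granite3.1-dense:2b|' /tmp/start_granite.sh
tail -n 1 /tmp/start_granite.sh
```

The first run of the edited script will download the new model, just as it did for Llama 3.2 1B.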
Congratulations! You've successfully set up a powerful language model on your Android device. With Termux and Ollama, you now have a versatile tool for experimentation and development. Enjoy exploring the capabilities of the Llama 3.2 1B model.