Welcome to iRun! Here are quick instructions to get you going:
For LLMs (coding, chatting, debating, thinking, analyzing, vision, etc.), download models through the "HuggingFace" interface provided in the app. If you must do it manually, go to https://huggingface.co/, download a text-based model, then find the downloaded file on your device and place it in Finder > "iRun Docs" > LLM-models.
For Image models, currently users must download them manually from HuggingFace.
For quick links (Image Models), use these:
Generally you should tailor your experience (through settings), although most models will function properly regardless of the settings.
Step 1: Identify RAM amount (go to iRun > Settings (gear icon) > scroll to see the total RAM amount).
Step 2: Downloading & Loading. A model about 50% the size of your RAM amount is a good fit for LLMs. Try sticking with Q4_K_M or Q8_0 quantizations.
Q8_0 quants are good for accuracy and preserving context.
Q4_0 quants are a good balance between speed and accuracy: they retain about 80-90% of the accuracy while being much smaller.
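As a rough way to apply the 50% guideline, the sketch below estimates a model's size from its parameter count and the quantization's bits per weight. The bits-per-weight figures follow from the block layout of Q4_0 and Q8_0 (32 weights plus one 16-bit scale per block). The function names are hypothetical and this is not iRun's own logic; real files can differ somewhat because of metadata and tensors kept at other precisions.

```python
# Rough size estimate for a quantized model; an illustration, not iRun's own logic.

# Bits per weight: each block holds 32 weights plus one 16-bit scale.
BITS_PER_WEIGHT = {
    "Q4_0": (32 * 4 + 16) / 32,  # 4.5 bits per weight
    "Q8_0": (32 * 8 + 16) / 32,  # 8.5 bits per weight
}


def estimated_size_gb(params_billions: float, quant: str) -> float:
    """Approximate model size in GB (1 GB = 10**9 bytes)."""
    bits = params_billions * 1e9 * BITS_PER_WEIGHT[quant]
    return bits / 8 / 1e9


def fits_budget(params_billions: float, quant: str, total_ram_gb: float) -> bool:
    """Rule of thumb from this guide: the model should be at most 50% of total RAM."""
    return estimated_size_gb(params_billions, quant) <= 0.5 * total_ram_gb


if __name__ == "__main__":
    for params, quant in [(4, "Q4_0"), (4, "Q8_0"), (14, "Q4_0")]:
        size = estimated_size_gb(params, quant)
        print(f"{params}B at {quant}: about {size:.2f} GB, "
              f"fits on 16 GB: {fits_budget(params, quant, 16)}")
```

Treat the result as a starting point only: context length and other apps running on the device also use memory.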
Choosing the right amount of Parameters:
350M? 1B? 35 Billion? How many parameters?
Generally, you should go for a higher parameter count with a lower quant (Q4_0, for example), because such a model will have much more depth and knowledge than smaller, more "precise" AI models. Pick one that fits and runs comfortably within your device's memory.
General recommendations for parameters:
8 GB total RAM - stick to a 4B-9B model at Q4_0
16 GB total RAM - stick to a 9B-14B model at Q4_0
32 GB total RAM - stick to a 9B-14B model at Q8_0 or a 14B-35B model at Q4_0
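The recommendations above can be read as a small lookup table. The sketch below encodes them as written. How to treat RAM sizes between the listed tiers is an assumption (here, the largest listed tier that does not exceed the device's RAM is used), and the names are hypothetical.

```python
# The guide's RAM tiers as a lookup table. Handling of in-between sizes is an
# assumption: we use the largest listed tier that does not exceed total RAM.
RECOMMENDATIONS = [
    (8, ["4B-9B at Q4_0"]),
    (16, ["9B-14B at Q4_0"]),
    (32, ["9B-14B at Q8_0", "14B-35B at Q4_0"]),
]


def recommend(total_ram_gb: float) -> list[str]:
    """Return the suggestions for the largest listed tier not exceeding total_ram_gb."""
    best: list[str] = []
    for tier_gb, options in RECOMMENDATIONS:
        if total_ram_gb >= tier_gb:
            best = options
    return best


if __name__ == "__main__":
    for ram in (8, 16, 24, 32):
        print(f"{ram} GB: {recommend(ram) or 'no recommendation listed'}")
```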
Newer models (latest ones) usually beat older models (even those with higher quants and parameters), so use the latest models for the best experience.
Chat/General purposes (GGUF format only! MLX is experimental):
Google's Gemma models work really well for this (Gemma2, Gemma3, Gemma4)
Qwen works well in most cases (Qwen3, Qwen3.5, Qwen3.6)
Llama is arguably the "simplest": it does as asked but is not very creative or chatty (Llama2, Llama3, Llama3.1)
Bonsai 8B by PrismML is also supported (even on iOS), though this model tends to think a lot and generally takes longer to respond. It also struggles with intensive tasks, since it is not trained at all for tool usage or tool calling.
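Since chat models need to be in GGUF format, it can help to check a manually downloaded file before placing it in the models folder. GGUF files begin with the four ASCII bytes "GGUF", so a minimal check looks like the sketch below. The function name is hypothetical, and passing this check only means the header looks right, not that the model will load in iRun.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # GGUF files begin with these four bytes


def looks_like_gguf(path: str) -> bool:
    """Return True if the file exists and starts with the GGUF magic bytes."""
    file = Path(path)
    if not file.is_file():
        return False
    with file.open("rb") as handle:
        return handle.read(4) == GGUF_MAGIC


if __name__ == "__main__":
    import sys

    # Usage: python check_gguf.py model1.gguf model2.gguf
    for name in sys.argv[1:]:
        verdict = "looks like GGUF" if looks_like_gguf(name) else "not a GGUF file"
        print(f"{name}: {verdict}")
```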
Image Gen (GGUF for iOS/iPadOS; the safetensors format is also supported on macOS):
Stable Diffusion is great for this (SD1.5) but not all models are compatible!!! Many models are ComfyUI-specific conversions, so read carefully before installing one!
SDXL is also compatible (SDXL 1.0)
Turbo-SD/Turbo-SDXL are supported (For iOS/iPadOS this is the best option for quality and speed)
Coding/Research/Heavy Tasks:
Google's latest Gemma models have proven to be the most capable so far (Gemma3, Gemma4), even at low parameter counts and low quants, which makes them great for devices with 8 GB of RAM. They offer great overall skills (like a Swiss Army knife), though they struggle with specific, detailed tasks.
Qwen offers more "Agentic" work, specifically the Qwen3.6 models have proven to be great at analysis, handling MOE, tailored responses, and overall code completion/bugfixing. Good for specific, detailed tasks.
DeepSeek offers great accuracy, but at the cost of speed.
You can tailor how the AI responds in three distinct ways:
Fast/Thinking/Pro - this is a free feature, allowing more thinking/tool usage, so you can choose the mode depending on the complexity of the task.
Response sliders - This is a rough budget that the AI allocates to each section (thinking, analysis, chat, code, vision, etc.). It just tells the AI "Do this more, this less".
Manager (Pro Feature) - Custom personalities, with custom instructions and per-model profiles (you can even have unlimited personalities for the same AI, and only load the one you currently need).
Themes:
Currently the app has 4 themes (including default). Some have animated backgrounds/objects (which can be paused for optimization).
Manager (Pro) - Create and manage custom personalities and agents
Response Tailoring
We tried to make the app as cross-platform compatible as possible, but the "Heavy" features need to be reserved for macOS. Here is a list of the current macOS-only features:
Tool Usage - Agent receives access to tools
Agent Mode (Beta) - Very comprehensive RAG tools, light-weight system prompts, Agentic Harness, Revolver Mode (AI alternates between plan, build and bughunt, automatically).
DeepSearch (Testing/Early Access) - A research tab, finds sources and compiles reports (pdf format).
Workflows (Coming Soon) - Ability to create custom workflows using agents/personalities. More information will be available when the feature is nearing completion.
Your Privacy
iRun is designed to run AI models entirely on your device. When using local models (GGUF, MLX), your conversations, prompts, uploaded files, and generated content never leave your device. No analytics, telemetry, or usage data is collected or transmitted.
Cloud AI
If you optionally enable cloud AI services (e.g., Google Gemini via API key), your prompts and attached content are sent to that third-party provider's servers. Each provider has its own data handling policies. iRun has no control over how third parties process your data.
API Keys
API keys you provide are stored securely in your device's Keychain and are sent only to the corresponding provider.
Remote & Network Features
iRun's remote control feature uses local network (Multipeer Connectivity) to connect your devices. No data is sent to external servers.
Your Data is Yours
Chat history, generated images, and all app data are stored locally on your device. You can delete them at any time through Settings or by removing the app.
Contact
For questions about this policy, contact us at: irunapp@proton.me
Please read the following terms carefully before using our software:
1. Acceptance
By using iRun, you agree to these terms. If you do not agree, do not use the app. By using the app, you also certify that you are over the legal age limit for this app in your region.
2. Local AI Processing
iRun runs AI models locally on your device. Core features (Chat, IdeaGen, Agent, Workflows) work offline with downloaded models. You are responsible for any models you download and import.
3. Third-Party Cloud Services
Optional cloud AI features require you to provide your own API key. iRun is not responsible for the availability, accuracy, security, or data practices of third-party services.
4. Acceptable Use
You agree not to use iRun to generate illegal content, violate laws, infringe on others' rights, or distribute malware. The user is solely responsible for their usage of all tools and functionalities within the app. Going against these terms may result in legal consequences, termination of this contract, and restrictions from iRun.
5. AI Limitations
Large language models (LLMs) can hallucinate or make errors. Always verify critical outputs. iRun provides AI-powered assistance, but the user should exercise judgment and not rely solely on AI-generated content for important decisions.
6. User Responsibility
The user is solely responsible for any settings, models, and prompts, and for all outputs that the AI models produce on their behalf. This includes ensuring compliance with applicable laws and respecting the rights of others.
7. Disclaimer
iRun is provided "as is" without warranty of any kind. The developers are not liable for any damages arising from the use of this software.
8. Changes
These terms may be updated. Continued use after changes constitutes acceptance of the new terms.
Thank You
iRun does not handle the subscription process; Apple does. Here is what you need to do:
On your Apple Device, ensure you are signed into your Apple ID with the account holding the membership.
Go to your device's settings and click on your username/email card.
In the account menu, look for "Media & Purchases" (App-Store icon). Click "Manage Subscriptions".
Follow Apple's steps to cancel your subscription.
You're welcome back anytime!
Due to how LLMs behave and operate, we can't be certain they will always remain safe or child-friendly. This app is made for hobbyists and should not be used by children (at least not without the supervision of a guardian).
App Store Age Rating -> 16+