The local AI space in 2026 is overwhelming. There are hundreds of models on HuggingFace, dozens of quantization formats, and no single place that just tells you what will actually run on your machine — until now. This guide + runyard.dev will get you from zero to running the right LLM in under 15 minutes.
Three numbers determine everything: your GPU VRAM, your system RAM, and whether you have a CUDA-capable NVIDIA GPU, ROCm AMD GPU, or Apple Silicon. VRAM is the most critical — it determines the maximum model size you can run at speed.
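On Linux you can read two of these three numbers straight from the command line. A minimal sketch, assuming `nvidia-smi` is installed for NVIDIA cards (AMD users would check `rocm-smi` instead, and Apple Silicon users can look under About This Mac):

```shell
# Report GPU VRAM (NVIDIA only) and system RAM on Linux
if command -v nvidia-smi >/dev/null 2>&1; then
  # prints e.g. "NVIDIA GeForce RTX 3060, 12288 MiB"
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
else
  echo "No NVIDIA GPU detected (try rocm-smi, or Apple Silicon shares system RAM)"
fi

# Total system RAM in GiB, from /proc/meminfo (reported in kB)
awk '/^MemTotal/ {printf "RAM: %.0f GiB\n", $2 / 1048576}' /proc/meminfo
```

Note that Apple Silicon has unified memory, so its "VRAM" is effectively a large share of system RAM.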
You can also go to runyard.dev and select your GPU from the dropdown — Runyard auto-fills the VRAM for every GPU in the catalog, no manual lookup needed.

Go to runyard.dev. Select your CPU, GPU, and VRAM. The Model Radar scores every open-source LLM across three factors: VRAM fit (40%), memory bandwidth/speed (35%), and benchmark quality (25%). Models drop into S/A/B/C/D tiers. S-tier models are your best bets — they run fast, fit with headroom, and score well on benchmarks.
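The weighting is a plain linear combination. Assuming normalized 0-to-1 sub-scores (the numbers below are illustrative, not Runyard's actual internals), the composite works out like this:

```shell
# Composite score = 0.40*fit + 0.35*speed + 0.25*quality, sub-scores in 0..1
fit=0.75; speed=0.60; quality=0.80
awk -v f="$fit" -v s="$speed" -v q="$quality" \
  'BEGIN { printf "score: %.2f\n", 0.40*f + 0.35*s + 0.25*q }'
# → score: 0.71
```

The 40% weight on VRAM fit reflects the hard constraint: a model that doesn't fit is useless no matter how well it benchmarks.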
Quantization compresses model weights so they fit in less VRAM. The most useful formats in 2026:
Model: Llama 3.1 8B
Q4_K_M → ~4.7 GB VRAM (recommended for 8GB cards)
Q5_K_M → ~5.7 GB VRAM
Q8_0 → ~8.5 GB VRAM (needs 12GB card)
FP16 → ~16 GB VRAM (needs 24GB card)

```shell
# 1. Install Ollama (macOS / Linux)
curl -fsSL https://ollama.ai/install.sh | sh
# Windows: download from https://ollama.ai/download/windows

# 2. Pull the model Runyard recommended
ollama pull llama3.1:8b       # 8GB VRAM
ollama pull mistral:7b        # 8GB VRAM
ollama pull qwen2.5-coder:7b  # coding, 8GB VRAM
ollama pull gemma2:9b         # 8GB VRAM, best reasoning at this size

# 3. Chat immediately
ollama run llama3.1:8b
```

Runyard shows the exact Ollama model tag to use for each model: click a card in the Tier List at runyard.dev and copy the run command directly, no manual research needed.
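The quantization sizes listed earlier follow a simple rule of thumb: weight memory in GB ≈ parameter count in billions × effective bits per weight ÷ 8. A quick sanity check for the 8B model (the bits-per-weight figures are approximate effective rates for llama.cpp-style quants, my assumption rather than official numbers):

```shell
# Estimate weight memory for an 8B-parameter model at several quant levels
# GB ≈ params_billions * bits_per_weight / 8
for entry in "Q4_K_M 4.7" "Q5_K_M 5.7" "Q8_0 8.5" "FP16 16"; do
  set -- $entry
  awk -v name="$1" -v bpw="$2" \
    'BEGIN { printf "%-7s ~%.1f GB weights\n", name, 8 * bpw / 8 }'
done
```

Actual VRAM use runs somewhat higher than the weight size because of the KV cache and runtime overhead, which is why Q8_0 at ~8.5 GB of weights wants a 12 GB card.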
The fastest path: go to runyard.dev, enter your specs, look at your S-tier models, filter by your use case, and copy the Ollama command. That's it. What used to take 2 hours of forum-reading now takes 2 minutes.