Awesome AI Models Matrix 🧠

Comprehensive curated list of AI models, tools, and resources for developers and researchers: from frontier proprietary models to self-hostable open-source alternatives, and from AI-powered IDEs to automation frameworks.

Table of Contents
- Models 🧠
- Development Tools 🛠️
- Automation 🤖
- Guides 📚
- Reference 📖
- License
Comprehensive documentation of Large Language Models (LLMs), Small Language Models (SLMs), and specialized AI models available today.
State-of-the-art proprietary AI models with cutting-edge capabilities from leading AI labs.
| Model | Company | Context | Key Features | Pricing |
|---|---|---|---|---|
| Claude Opus 4.6 | Anthropic | 1M | Agent teams, enhanced coding/reasoning | $5 / $25 |
| Claude Sonnet 4.6 | Anthropic | 1M | Near-Opus performance, Sonnet price | $3 / $15 |
| GPT-5.3-Codex | OpenAI | 400K | Agentic coding, 128K output | TBD |
| Gemini 3.1 Pro | Google | 1M | 77.1% ARC-AGI-2, 2x reasoning boost | $2 / $12 |
| Gemini 3 Deep Think | Google | 1M+ | 84.6% ARC-AGI-2, science/research | Ultra subscription |
| GLM-5 | Zhipu AI | 200K | Agentic engineering, long-horizon tasks | $1.00 / $3.20 |
| MiniMax-M2.5 | MiniMax | 200K | Coding/refactoring, tool calling, long context | $0.30 / $1.20 |
| Kimi K2.5 | Moonshot AI | 256K | Native multimodal, thinking & agent tasks | $0.60 / $3.00 |
| DeepSeek-V4 | DeepSeek | 1M+ | Engram memory, coding focus | Pay-per-token |
| Qwen3.5-Max | Alibaba | 128K | Hybrid attention, native VLM | Pay-per-token |
| Gemini 3 Pro | Google | 1M+ | PhD-level reasoning, agentic tool-use | Tiered pricing |
| Gemini 3 Flash | Google | 10M | Pro-grade reasoning, Flash speed | $0.30 / $2.50 |
| GPT-5 | OpenAI | 400K | Thinking & Instant variants | $1.25 / $10.00 |
| GPT-5 mini | OpenAI | 128K | Cheap reasoning | $0.25/1M |
| Mistral Large 3 | Mistral AI | 128K | 675B params, MoE, Open-weight | Varies |
| Claude Sonnet 4.5 | Anthropic | 200K | SWE-bench leader, best coding | $3 / $15 |
| Llama 4 Scout | Meta | 10M | Open-weight context king | Free (self-host) |
| Llama 4 Maverick | Meta | 128K | 400B params, multimodal | Free (self-host) |
| Grok 4 | xAI | 128K | First-principles reasoning | $3 / $15 |
| Grok 4 Fast | xAI | 128K | Cost-efficient variant | $0.20 / $1.50 |
| Category | #1 | #2 | #3 |
|---|---|---|---|
| Coding | Claude Opus 4.6 | GPT-5.3-Codex | Claude Sonnet 4.5 |
| Reasoning | Gemini 3 Deep Think | Qwen3-Max-Thinking | o3 |
| Open Source | DeepSeek-V4 | Qwen3.5-Max | Llama 4 |
| Cost Efficiency | DeepSeek-V3.1 | Grok 4 Fast | GLM-4.7-FlashX |
| Context Window | Gemini 3 Flash (10M) | Llama 4 Scout (10M) | Claude Opus 4.6 (1M) |
| Model | Company | Release Date | Key Features | Category |
|---|---|---|---|---|
| DeepSeek R1 | DeepSeek | January 2026 | State-of-the-art reasoning, math, coding; 671B params | Reasoning |
| NVIDIA Alpamayo | NVIDIA | January 5, 2026 | Open AI models for autonomous vehicles; human-like reasoning for self-driving cars | Specialized |
| TranslateGemma | Google | January 16, 2026 | Multilingual translation models for mobile, laptops, cloud; supports 55 languages | Specialized |
| Kimi K2.5 | Moonshot AI | January 29, 2026 | Native multimodal; thinking/non-thinking; dialogue + agent tasks | Frontier |
| Model | Company | Release Date | Key Features | Category |
|---|---|---|---|---|
| Gemini 3.1 Pro | Google | February 19, 2026 | 77.1% ARC-AGI-2, 2x reasoning boost, 1M context | Frontier |
| Claude Sonnet 4.6 | Anthropic | February 17, 2026 | Near-Opus performance at Sonnet price, 1M context | Frontier |
| Claude Opus 4.6 | Anthropic | February 5, 2026 | Agent teams, enhanced coding/reasoning, 1M context | Frontier |
| GPT-5.3-Codex | OpenAI | February 5, 2026 | Most capable agentic coding model, 400K context, 128K output | Coding |
| Gemini 3 Deep Think | Google | February 12, 2026 | 84.6% ARC-AGI-2, science/research/engineering focus | Reasoning |
| GLM-5 | Zhipu AI | February 12, 2026 | Agentic LLM, long-range agent tasks, 200K context | Frontier |
| DeepSeek-V4 | DeepSeek | February 2026 | Engram memory, 1M+ context, coding focus | Open Source |
| Qwen3.5-Max | Alibaba | February 2026 | Hybrid attention, native VLM, multimodal | Open Source |
| Model | Company | Last Updated | Notes | Official Site |
|---|---|---|---|---|
| Gemini 3.1 Pro | Google | 2026-02-19 00:00 UTC ⭐ | 77.1% ARC-AGI-2, 2x reasoning boost | 🔗 |
| Claude Sonnet 4.6 | Anthropic | 2026-02-17 00:00 UTC ⭐ | Near-Opus performance at Sonnet price | 🔗 |
| Gemini 3 Deep Think | Google | 2026-02-12 00:00 UTC ⭐ | ARC-AGI-2 result highlighted | 🔗 |
| GLM-5 | Zhipu AI | 2026-02-12 00:00 UTC ⭐ | Agentic engineering, long-horizon tasks | 🔗 |
| Claude Opus 4.6 | Anthropic | 2026-02-05 00:00 UTC ⭐ | Agent teams, enhanced coding/reasoning | 🔗 |
| GPT-5.3-Codex | OpenAI | 2026-02-05 00:00 UTC ⭐ | Agentic coding focus | 🔗 |
| Kimi K2.5 | Moonshot AI | 2026-02-02 00:00 UTC ⭐ | Models & pricing published | 🔗 |
| DeepSeek-V4 | DeepSeek | 2026-02-17 00:00 UTC | Release window announced | 🔗 |
| Qwen3.5-Max | Alibaba | 2026-02 | February 2026 release window | 🔗 |
Self-hostable models with permissive licenses or open weights for privacy, cost control, and customization.
| Model | Company | Params | Context | License |
|---|---|---|---|---|
| DeepSeek-V4 | DeepSeek | 671B | 1M+ | MIT |
| Qwen3.5-Max | Alibaba | 1T+ | 128K | Apache 2.0 |
| Qwen3-Max-Thinking | Alibaba | 1T+ | 128K | Apache 2.0 |
| Mistral Large 3 | Mistral AI | 675B (MoE) | 128K | Apache 2.0 |
| Llama 4 Scout | Meta | 109B | 10M | Community |
| Llama 4 Maverick | Meta | 400B | 128K | Community |
| GPT-OSS-120B | OpenAI | 117B | 128K | Apache 2.0 |
| GPT-OSS-20B | OpenAI | 21B | 128K | Apache 2.0 |
| Qwen3-Coder | Alibaba | 480B | 128K | Apache 2.0 |
| GLM-4.7 | Zhipu AI | 400B+ MoE | 128K | Open Weight |
| Phi-4 | Microsoft | 14B | 128K | MIT |
| Granite 4.0 | IBM | 3B-8B | 128K | Apache 2.0 |
| DeepSeek-Coder-V2 | DeepSeek | 236B | 128K | MIT |
| Yi-Coder | 01.AI | 9B/1.5B | 128K | Apache 2.0 |
Local Inference Tools:
- Ollama - Easy local deployment
- LM Studio - User-friendly GUI
- llama.cpp - Efficient CPU inference
- vLLM - High-throughput serving
- SGLang - Structured generation
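Once Ollama is installed, it exposes a simple local HTTP API. A minimal sketch using only the Python standard library; it assumes Ollama is running on its default port (11434) and that a model such as "llama3" has already been pulled (the model name is a placeholder, adjust to whatever you have locally):

```python
# Minimal sketch of calling a local Ollama server over HTTP.
# Assumptions: Ollama runs on its default port (11434) and a model such as
# "llama3" has already been pulled; adjust names as needed.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> dict:
    # Payload shape for Ollama's /api/generate endpoint; stream=False
    # returns a single JSON object instead of a stream of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    data = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires a running Ollama):
#   print(generate("llama3", "Explain quantization in one sentence."))
```

The same pattern works for any of the local runtimes that expose an HTTP endpoint; only the URL and payload shape change.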
Cloud Deployment:
- Hugging Face Inference - Managed deployment
- AWS SageMaker - Full control
- Google Cloud Vertex - Integrated
- RunPod - GPU rental
Specialized AI models optimized for software development tasks.
| Rank | Model | Company | Score |
|---|---|---|---|
| 🥇 #1 | Claude Opus 4.6 | Anthropic | SOTA |
| 🥈 #2 | GPT-5.3-Codex | OpenAI | Agentic leader |
| 🥉 #3 | Claude Sonnet 4.5 | Anthropic | ~92% |
| #4 | GPT-OSS-120B | OpenAI | 91.4% AIME |
| #5 | Kimi K2.5 | Moonshot AI | Excellent |
| Model | Developer | Pricing | Best For |
|---|---|---|---|
| Claude Opus 4.6 | Anthropic | $5 / $25 per 1M | Agentic coding, complex tasks |
| GPT-5.3-Codex | OpenAI | TBD | Agentic coding, 7+ hour autonomy |
| GLM-5-Code | Zhipu AI | $1.20 / $5.00 per 1M | Code generation, refactoring |
| MiniMax-M2.5 | MiniMax | $0.30 / $1.20 per 1M | Code generation, refactoring |
| Claude Sonnet 4.5 | Anthropic | $3 / $15 per 1M | Code review, refactoring |
| Codestral | Mistral AI | $0.30 / $0.90 | Real-time completion |
| Grok Code Fast | xAI | $0.20 / $1.50 | Most used (50% share) |
| Model | Developer | License | Hardware |
|---|---|---|---|
| GPT-OSS-120B | OpenAI | Apache 2.0 | 80-160 GB VRAM |
| Qwen3-Coder | Alibaba | Apache 2.0 | 160-320 GB VRAM |
| DeepSeek-Coder-V2 | DeepSeek | MIT | 48-80 GB VRAM |
| GLM-4.6 | Zhipu AI | Open Weight | 80-160 GB VRAM |
| Phi-4 | Microsoft | MIT | 24-48 GB VRAM |
Models optimized for step-by-step reasoning, mathematical problem-solving, and complex logical inference.
| Rank | Model | Score | Notes |
|---|---|---|---|
| 🥇 #1 | Gemini 3 Deep Think | 84.6% ARC-AGI-2 | Science/research focus |
| 🥈 #2 | Qwen3-Max-Thinking | 100% | Perfect AIME score |
| 🥉 #3 | GPT-5 Pro (with tools) | 100% | With Python tools |
| #4 | GPT-OSS-120B | 91.4% | Open-source leader |
| #5 | o3 | ~96.5% | OpenAI reasoning |
| #6 | DeepSeek-R1 | 81% | Pure RL-based |
| Model | Type | Context | Pricing |
|---|---|---|---|
| Gemini 3 Deep Think | Reasoning | 1M+ | Ultra subscription |
| Qwen3-Max-Thinking | Reasoning/Coding | 128K | $1.20 / $6.00 |
| o3 / o1-Pro | Reasoning | 128K | $2-150 / $8-600 |
| Gemini 3 Pro | General/Multimodal | 1M+ | $2 / $12 |
| DeepSeek-R1 | Reasoning | 128K | $0.50 / $2.15 |
| Claude Sonnet 4.5 | Hybrid | 200K | $3 / $15 |
- Mathematical Problem Solving: Qwen3-Max-Thinking, GPT-5 Pro, Gemini 3 Pro
- Scientific Analysis: Claude Opus 4.6, GPT-5, Gemini 3 Pro
- Strategic Planning: o3/o1-Pro, Claude Sonnet 4.5, DeepSeek-R1
- Code Debugging: Claude Sonnet 4.5, GPT-5.3-Codex, DeepSeek-V3.1
Models capable of processing and generating multiple types of content: text, images, audio, and video.
| Model | Developer | Context | Key Features |
|---|---|---|---|
| GPT-5 | OpenAI | 400K | Unified multimodal, audio |
| Gemini 3 Pro | Google | 1M+ | Native multimodal, video |
| Claude Sonnet 4.5 | Anthropic | 200K | Document understanding |
| Llama 4 Maverick | Meta | 128K | Open multimodal |
| Model | MMMU | MathVista | DocVQA |
|---|---|---|---|
| Gemini 3 Pro | SOTA | SOTA | SOTA |
| GPT-5 | Excellent | Excellent | Excellent |
| Claude Sonnet 4.5 | Strong | Strong | Excellent |
| Llama 4 Maverick | Good | Good | Good |
| Model | Speech-to-Text | Text-to-Speech | Video Input |
|---|---|---|---|
| Gemini 3 Pro | ✅ | ✅ | ✅ |
| GPT-5 | ✅ | ✅ | |
| Whisper v3 | ✅ | ❌ | ❌ |
| Model | Developer | License | Best For |
|---|---|---|---|
| Flux.1 | Black Forest Labs | Apache 2.0 | High-fidelity art |
| Stable Diffusion 3.5 | Stability AI | Community License | Fine-tuning |
| GLM-Image | Zhipu AI (Z.ai) | API | Fast image generation |
| CogView-4 | Zhipu AI (Z.ai) | API | Creative image generation |
Comprehensive hardware specifications for self-hosting AI models.
| Model | Params | Q4 Size | Min VRAM | Rec VRAM | Min RAM |
|---|---|---|---|---|---|
| Phi-4 | 14B | 8 GB | 24 GB | 48 GB | 32 GB |
| GPT-OSS-20B | 21B | 12 GB | 24 GB | 48 GB | 32 GB |
| Llama 4 Scout | 109B | 66 GB | 48 GB | 80 GB | 96 GB |
| GPT-OSS-120B | 117B | 70 GB | 80 GB | 160 GB | 128 GB |
| DeepSeek-Coder-V2 | 236B | 143 GB | 48 GB | 80 GB | 192 GB |
| Llama 4 Maverick | 400B | 242 GB | 160 GB | 320 GB | 320 GB |
| DeepSeek-V4 | 671B | 404 GB | 80 GB | 320 GB | 512 GB |
| Qwen3-Max-Thinking | 1T+ | 600+ GB | 160 GB | 640 GB | 768 GB |
Consumer/Entry Level (24-48 GB VRAM):
- Phi-4, GPT-OSS-20B, Yi-Coder, Qwen2.5-Coder
- Recommended GPUs: RTX 3090 (24GB), RTX 4090 (24GB)
Professional (80-160 GB VRAM):
- Llama 4 Scout, GPT-OSS-120B, DeepSeek-Coder-V2
- Recommended GPUs: A100 80GB, 2x A100 40GB
Enterprise (320+ GB VRAM):
- Llama 4 Maverick, GLM-4.7, DeepSeek-V4, Qwen3-Max-Thinking
- Recommended GPUs: 4x A100 80GB, 8x A100 80GB
| Level | Bits | Size vs FP16 | Quality | Use Case |
|---|---|---|---|---|
| FP16/BF16 | 16 | 100% | Best | Training |
| Q8_0 | 8 | ~50% | Excellent | High-quality inference |
| Q4_K_M | 4 | ~25% | Good | Recommended for deployment |
| Q3_K_M | 3 | ~19% | Fair | Limited resources |
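The size ratios in the table follow directly from bits per weight. A rough sketch of the arithmetic; real quantized files (e.g. GGUF) add metadata and keep some tensors at higher precision, so treat the results as approximate lower bounds:

```python
# Back-of-the-envelope model size: bytes = params x bits / 8.
# Real quantized files (e.g. GGUF) add metadata and keep some tensors at
# higher precision, so these are approximate lower bounds.
def approx_size_gb(params_billions: float, bits_per_weight: float) -> float:
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# A 14B model (Phi-4 class) at ~4.5 effective bits (typical for Q4_K_M):
print(round(approx_size_gb(14, 4.5), 1))  # 7.9, close to the 8 GB listed above
```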
AI-powered tools for software development, from IDEs and CLI tools to API providers and IDE extensions.
Integrated Development Environments with built-in AI capabilities.
| IDE | Platform | Version | Release Date | Pricing | Key Features | GitHub |
|---|---|---|---|---|---|---|
| Firebase Studio | Web | - | - | Free (3 workspaces, up to 30 with Google Developer Program) | Cloud-based, Gemini, MCP | ❌ |
| Lingma IDE (通义灵码) | Windows, macOS | - | - | Free (download) | Built-in agent, MCP tool use, terminal command execution | ❌ |
| Tonkotsu | Windows, macOS | - | - | Free (during early access) | Team of agents, workflow | ❌ |
| OpenCode | Windows, macOS, Linux | - | - | Free (OSS) | Terminal, desktop, IDE extension, multi-provider | 🔗 |
| Visual Studio | Windows, macOS | 17.14.12+, 18.1.0+ | January 6, 2026 | Free / $250/yr | Gemini 3 Flash integration, faster performance, zero-migration upgrades, real-time profiler agent | ❌ |
| IntelliJ IDEA | Windows, macOS, Linux | 2025.3.2 | January 2026 | Free / $149/yr | Java 24 support, Kotlin K2 mode, performance/memory improvements | ❌ |
| Editor | Platform | Version | Release Date | Pricing | Key Features | GitHub |
|---|---|---|---|---|---|---|
| Zed | macOS, Windows, Linux | 0.225.0 | February 18, 2026 | Free (OSS) + Copilot $10/mo | Fast, collaboration, Gemini/Claude, Zeta AI, agent thread history, edit prediction providers | 🔗 |
| Dyad | Windows, macOS, Linux | - | - | Free (OSS) | Local generation, BYO keys | 🔗 |
| Memex | macOS, Windows | - | - | Freemium (Free + $10/mo) | Agentic, browser↔desktop | ❌ |
| IDE | Platform | Version | Release Date | Pricing | Autonomous | MCP | GitHub |
|---|---|---|---|---|---|---|---|
| Cursor | Windows, macOS, Linux | 0.46+ | February 12, 2026 | Freemium (Free + Pro $19/mo or $39/mo) | ✅ | ❌ | ❌ |
| Windsurf | Windows, macOS, Linux | 1.9552+ | February 12, 2026 | Freemium (Free + Pro) | ✅ | ✅ | ❌ |
| Trae | macOS, Windows | - | - | Free | ❌ | ❌ | ❌ |
| PearAI | Windows, macOS, Linux | - | - | Free (OSS) | ✅ | ❌ | ❌ |
| Void | Windows, macOS, Linux | - | - | Free (OSS) | ✅ | ✅ | ❌ |
| Google Antigravity | Windows, macOS, Linux | - | - | Free | ✅ | ❌ | ❌ |
| Kiro | Windows, macOS, Linux | - | - | Free (Preview) | ✅ | ✅ | ❌ |
| Platform | Environment | Version | Release Date | Pricing | Self-Hostable | Best For | GitHub |
|---|---|---|---|---|---|---|---|
| Replit 3 | Web | - | - | Free Starter, Core $20-25/mo, Pro $100/mo | ❌ | Learning/Prototyping | ❌ |
| Bolt.new | Web | - | - | Free, Pro $20-25/mo, Teams $200/mo | ❌ | Quick apps | ❌ |
| Bolt.diy | Self-hosted | - | - | Free (MIT), bring your own API | ✅ | Self-hosted | 🔗 |
| Lovable | Web | - | - | Free (5 credits/day), Pro $25/mo, Business $50/mo | ❌ | UI/Full-stack | ❌ |
| v0 | Web | - | - | Free ($5 credits/mo), Premium $20/mo, Teams $30/user | ❌ | React components | ❌ |
| Gitpod | Web | - | - | Free + Paid | ❌ | Cloud dev environments | ❌ |
| Rork | Web | - | - | Free & Paid (credits) | ❌ | Mobile apps (iOS/Android) | ❌ |
Command-line AI tools for autonomous coding and terminal enhancement.
| Tool | Platform | Pricing | Key Features | GitHub |
|---|---|---|---|---|
| Aider | Windows, macOS, Linux | Free | Gold standard, Architect mode, thinking tokens | 🔗 |
| Claude Code 2.1+ | macOS, Linux, Windows | Free + API | Fast mode for Opus 4.6, simple mode file editing, Unicode fix | 🔗 |
| Codex CLI | Windows, macOS, Linux | Included | Sandbox, approval modes | 🔗 |
| Goose | Windows, macOS, Linux | Free (Apache-2.0) | MCP, extensible, desktop app, 25+ providers | 🔗 |
| GPT-Pilot | Windows, macOS, Linux | Free | Full dev team simulation | 🔗 |
| OpenHands | Windows, macOS, Linux | Free | Cloud agents, MCP | 🔗 |
| Mentat | Windows, macOS, Linux | Free | Multi-file coordination | 🔗 |
| Tool | Developer | Pricing | Best For |
|---|---|---|---|
| Gemini CLI | Google | Free | Google ecosystem |
| Cursor CLI | Cursor | Free tier | Terminal + IDE bridge |
| Qwen Code | Alibaba | Free | Qwen optimization |
| Qodo CLI | Qodo | Free tier | Testing and review |
| Tool | Platform | Pricing | Key Features |
|---|---|---|---|
| Warp Terminal | macOS, Linux, Windows | Free | AI Agents, workflow sharing |
| Fig | macOS, Linux | Free | Autocomplete, AI suggestions |
Extensions and plugins that add AI capabilities to existing IDEs.
| Add-on | Platform | Pricing | Context | Best For | GitHub |
|---|---|---|---|---|---|
| GitHub Copilot | VS Code, JetBrains, Vim | Free / $10/mo / $39/mo | Large | General coding | ❌ |
| Supermaven | VS Code, JetBrains, Neovim | Free / $10/mo | 1M | Large codebases | ❌ |
| Codeium | VS Code, JetBrains, Vim | Free / $15/mo / $60/mo | Medium | Free alternative | ❌ |
| Continue | VS Code, JetBrains | Free (OSS) | Custom | Self-hosted | 🔗 |
| Cody | VS Code, JetBrains, Web | Free (discontinued) / Enterprise Starter $19/mo / Enterprise $59/mo | Enterprise | Code search | 🔗 |
| Tabnine | VS Code, JetBrains, VS, Eclipse | Free / $39/mo | Local | Privacy | ❌ |
| Add-on | Platform | Release Date | Key Features |
|---|---|---|---|
| Gemini 3 Flash Integration | VS Code, JetBrains, Xcode, Eclipse | January 6, 2026 | Access to Google's latest Gemini 3 Flash model directly from IDE; fast response times |
| JetBrains AI Assistant | All JetBrains IDEs | January 2026 | Enhanced AI capabilities, Claude Agent integration, better context understanding |
| Add-on | Pricing | Autonomous | MCP | Best For | GitHub |
|---|---|---|---|---|---|
| Codex | Free (with ChatGPT Plus $20/mo or Pro $200/mo) | ✅ | ✅ | OpenAI's official coding agent | 🔗 |
| Cline | Free | ✅ | ✅ | Full agent | 🔗 |
| GitHub Copilot (Agent Mode) | $0 / $10 / $39/mo | ❌ | ✅ | Guided agent workflows | ❌ |
| RooCode | Free/Pro | ✅ | ✅ | Complex tasks | ❌ |
| Keploy | OSS/Enterprise | ❌ | ❌ | Testing | ❌ |
| Add-on | Pricing | Claude Agent | Best For |
|---|---|---|---|
| JetBrains AI Assistant | $10/mo (Pro), $249/yr (Ultimate) | ✅ | Deep IDE integration |
| JetBrains Claude Agent | Included in subscription | ✅ | Native agent |
Services for accessing AI models via API.
| Provider | Models | Pricing |
|---|---|---|
| OpenAI | GPT-5, o3, Codex | Pay-per-token |
| Anthropic | Claude 4.6 | Pay-per-token |
| Google AI Studio | Gemini 3 | Free / Pay |
| Z.ai (Zhipu AI) | GLM-5, GLM-5-Code, GLM-4.7 | Pay-per-token |
| MiniMax | MiniMax-M2.5/M2.1/M2 | Pay-per-token |
| Cohere | Command, Embed, Rerank | Pay-per-token |
| AI21 Labs | Jamba | Pay-per-token |
| Perplexity | Sonar / Sonar Pro / Sonar Reasoning Pro | Pay-per-token + request fees |
| Moonshot AI | Kimi (kimi-k2.5, kimi-k2-thinking) | Pay-per-token |
| ByteDance (Volcengine) | Doubao | Pay-per-token |
| Tencent (Hunyuan) | Hunyuan | Pay-per-token |
| Baidu (ERNIE) | ERNIE | Pay-per-token |
| DeepSeek | DeepSeek-V4/R1 | Pay-per-token |
| Mistral AI | Mistral Large 3 | Pay-per-token |
| xAI | Grok-4 | Pay-per-token |
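Many of the providers above (DeepSeek, Mistral, xAI, Moonshot, among others) expose OpenAI-compatible chat-completions endpoints, so a single client shape covers most of them. A hedged sketch using only the Python standard library; the base URL, model name, and API key in the usage comment are placeholders, substitute your provider's values:

```python
# Sketch of an OpenAI-compatible chat completion call. Many providers in the
# table above accept this payload shape at {base_url}/chat/completions; the
# base URL, model name, and key used here are placeholders.
import json
import urllib.request

def build_chat_request(model: str, user_message: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

def chat(base_url: str, api_key: str, model: str, user_message: str) -> str:
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(model, user_message)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# Usage (placeholder endpoint/model; substitute your provider's values):
#   chat("https://api.example.com/v1", "YOUR_API_KEY", "some-model", "Say hello.")
```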
| Provider | Models | Key Features |
|---|---|---|
| OpenRouter | 200+ | Crypto/fiat, rankings |
| Hugging Face | Thousands | Serverless inference |
| Provider | Specialization | Speed |
|---|---|---|
| Together AI | Llama/Qwen/Mistral | Fast |
| Fireworks AI | FireAttention | Low-latency |
| Groq | LPU | >500 T/s |
| Cerebras | Wafer-Scale | >2000 T/s |
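Throughput translates directly into wall-clock latency. For example, a 1,000-token answer at the advertised speeds (illustrative arithmetic, ignoring time-to-first-token):

```python
# Converting advertised throughput (tokens/second) into response latency.
# Ignores time-to-first-token, which adds a fixed overhead in practice.
def response_seconds(output_tokens: int, tokens_per_second: float) -> float:
    return output_tokens / tokens_per_second

print(response_seconds(1000, 500))   # 2.0 s at Groq-class throughput
print(response_seconds(1000, 2000))  # 0.5 s at Cerebras-class throughput
```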
| Provider | Type | Best For |
|---|---|---|
| RunPod | GPU Rental | Flexibility |
| Replicate | Model-as-a-Service | Quick deployment |
| Vultr | Global Cloud | Hourly |
| Hyperbolic | Decentralized | Crypto/Fiat |
AI-powered tools for automating browser and desktop tasks.
Tools and frameworks for AI-powered browser automation.
| Browser | Pricing | Open Source | Local AI | Best For | GitHub |
|---|---|---|---|---|---|
| BrowserOS | Free | ✅ | ✅ | Privacy-focused | ❌ |
| Brave Leo | Freemium (Free + Premium) | ❌ | ✅ | Privacy-focused AI | ❌ |
| Fellou | Freemium (Free for 4 tasks, $20/mo Plus) | ❌ | ❌ | True agentic browser | ❌ |
| Perplexity Comet | Free (with Pro $20/mo) or $5/mo | ❌ | ❌ | Research | ❌ |
| Dia | Freemium (Free limited, $20/mo Pro) | ❌ | ❌ | Arc replacement | ❌ |
| Opera Neon | $19.90/mo | ❌ | ❌ | Agentic browsing | ❌ |
| Opera One (Aria) | Free | ❌ | ❌ | Built-in AI assistant | ❌ |
| Edge Copilot | Free (Copilot Pro $20/mo) | ❌ | ❌ | Enterprise AI browser | ❌ |
| Extension | Pricing | Free | Multi-Agent | Best For | GitHub |
|---|---|---|---|---|---|
| Harpa AI | Free | ✅ | ❌ | Automation recipes | ❌ |
| MultiOn | Free/Paid | ✅ | ❌ | Complex tasks | ❌ |
| NanoBrowser | Free | ✅ | ✅ | Local control | ❌ |
| Library | Language | Best For | GitHub |
|---|---|---|---|
| Browser-use | Python | Agentic automation | 🔗 |
| Stagehand | TypeScript | Web apps | 🔗 |
| LaVague | Python | NL to code | 🔗 |
| Skyvern | Python | CV-based automation | 🔗 |
| Service | Platform | Pricing | Best For | GitHub |
|---|---|---|---|---|
| Skyvern Cloud | Cloud API | Paid | Resilient automation | 🔗 |
| Browserbase | Cloud API | Paid | Stealth mode, session recording | ❌ |
Platforms and runtimes for running or connecting AI agents.
| Project | Type | Self-Hostable | Best For | Official |
|---|---|---|---|---|
| OpenClaw | Personal AI assistant | ✅ | Always-on assistant across chat channels | 🔗 |
| Moltbook | Agent social network | ❌ | Discovering and pairing with AI agents | 🔗 |
AI agents and tools for automating desktop tasks and OS-level interactions.
| Agent | Platform | Vision-Based | Cross-Platform | Best For | GitHub |
|---|---|---|---|---|---|
| Agent S | Cross-platform | ✅ | ✅ | Research/SOTA | 🔗 |
| Bytebot | Linux (Docker) | ✅ | ✅ | Self-hosted | ❌ |
| UFO | Windows | ✅ | ❌ | Windows automation | 🔗 |
| Open-Interface | Cross-platform | ✅ | ✅ | General use | 🔗 |
| Anthropic Computer Use | API | ✅ | ✅ | Beta capability | ❌ |
| Tool | Platform | Best For |
|---|---|---|
| Ui.Vision RPA | Windows, macOS, Linux | Visual automation |
| OmniParser V2 | Cross-platform | Screen parsing |
| Tool | Platform | Key Features | GitHub |
|---|---|---|---|
| PyAutoGUI | Cross-platform | Simple API, fail-safe | ❌ |
| Nut.js | Cross-platform | Visual search, image matching | ❌ |
| OpenAdapt | Windows, macOS | Learning from demonstration | 🔗 |
Tutorials, how-tos, and in-depth guides for getting the most out of AI models and tools.
A beginner-friendly introduction to AI models and how to start using them effectively.
| Concept | Description |
|---|---|
| Parameters | Model size in weights (B = billions); more generally means more capable, but costlier to run |
| Context Window | How much text the model can process at once (128K is standard) |
| Tokens | Basic units of text (1 token ≈ 0.75 words) |
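The ~0.75 words-per-token rule of thumb gives a quick estimate for planning context usage. A tiny sketch; real tokenizers (tiktoken, SentencePiece) vary by model, so use those when counts matter for billing or context limits:

```python
# Quick token estimate from the ~0.75 words-per-token rule of thumb.
# Real tokenizers vary by model; use them for billing-accurate counts.
def estimate_tokens(text: str) -> int:
    words = len(text.split())
    return max(1, round(words / 0.75))

print(estimate_tokens("The quick brown fox jumps over the lazy dog"))  # 12
```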
| Method | Best For | Setup Difficulty |
|---|---|---|
| Web Interfaces | Quick experiments | Easiest |
| API Access | Building applications | Easy |
| Self-Hosting | Privacy, no API costs | Medium-Hard |
| IDE Integration | Daily coding | Easy |
| Task | Free Option | Premium Option |
|---|---|---|
| Chat | Llama 4 (self-hosted) | GPT-5, Claude |
| Coding | DeepSeek-Coder-V2 | Claude Opus 4.6 |
| Reasoning | DeepSeek-R1 | Gemini 3 Deep Think, o3 |
| Long docs | Llama 4 Scout | Gemini 3 Flash |
| Vision | Llama 4 Maverick | GPT-5, Gemini 3 |
A comprehensive guide to choosing the right AI model for your specific needs.
| Need | 🆓 Free / Self-Host | 💎 Best Quality | ⚡ Fast / Autonomous |
|---|---|---|---|
| 💻 Coding | DeepSeek-Coder-V2 | Claude Opus 4.6 | GPT-5.3-Codex |
| 🧠 Reasoning / Math | DeepSeek-R1 | Gemini 3 Deep Think | o3 |
| 💬 General Chat | Llama 4 (self-hosted) | GPT-5, Claude Opus 4.6 | Gemini 3 Flash |
| 🎨 Vision | Llama 4 Maverick | GPT-5, Gemini 3 Pro | Gemini 3 Flash |
| 🖥️ Self-Hosting | Phi-4 | DeepSeek-V4 | vLLM / SGLang (serving) |
| Budget | Options |
|---|---|
| Free | Self-hosted (Llama 4, Qwen3, Mistral) |
| $0-10/mo | API entry tiers, Gemini Flash |
| $10-50/mo | Copilot, Claude API, GPT-5 API |
| $50+/mo | Heavy usage, multiple models |
A comprehensive guide to running AI models on your own hardware.
| Benefit | Description |
|---|---|
| Privacy | Data never leaves your infrastructure |
| Cost Control | No per-token API costs for unlimited usage |
| Customization | Fine-tune models for specific needs |
| No Rate Limits | Process as much as hardware allows |
| Offline Access | Work without internet |
For installation and usage instructions, refer to the official Ollama documentation.
Recommended apps (local-first):
- Ollama - Simple local runtime with a local HTTP API
- LM Studio - Desktop UI for downloading and running models locally
- llama.cpp - Fast local inference (CPU/GPU), great for quantized models
- Open WebUI - Optional local web UI (pairs well with local runtimes)
If you want “server-style” hosting (advanced), use a dedicated serving engine such as vLLM or SGLang (see the Local Inference Tools list above).
Practical setup (works for both desktop and laptop):
- Install the latest NVIDIA drivers (enable GPU acceleration in your chosen app)
- Start with smaller quantized models (Q4 is a common “best default”)
- Keep context windows realistic for local hardware (lower context = faster, less memory)
- Watch VRAM first, then system RAM; reduce model size or quantization if either saturates
- Prefer running locally on localhost and only expose to LAN if you understand firewall rules
What fits on your hardware (quick rules):
| Hardware | Good starting point | Notes |
|---|---|---|
| RTX 5090 desktop GPU | 14B–70B quantized | Best experience for coding agents and longer contexts |
| Laptop, 64 GB RAM | 7B–14B quantized | Great for offline chat/coding; keep context moderate |
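The rules in the table can be sketched as a quick fit check: quantized weight size plus headroom for KV cache and activations must fit in VRAM. The 4.5 effective bits (Q4-class) and 1.3x headroom below are assumptions, not measurements:

```python
# Rule-of-thumb VRAM fit check: quantized weights plus ~30% headroom for
# KV cache and activations. The 4.5 effective bits (Q4_K_M-class) and 1.3x
# headroom factor are assumptions, not measurements.
def fits_in_vram(params_billions: float, vram_gb: float,
                 bits: float = 4.5, headroom: float = 1.3) -> bool:
    weights_gb = params_billions * 1e9 * bits / 8 / 1e9
    return weights_gb * headroom <= vram_gb

print(fits_in_vram(14, 24))   # True: a 14B Q4 model fits a 24 GB card
print(fits_in_vram(109, 24))  # False: Llama 4 Scout class needs far more
```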
| Option | Best For | Pros | Cons |
|---|---|---|---|
| Local Machine | Personal use | Simple, no latency | Limited hardware |
| Dedicated Server | Team use | Full control | Maintenance |
| Cloud GPU Rental | Experimentation | On-demand | Hourly costs |
| Kubernetes | Enterprise | Scalable | Complex |
Comprehensive pricing comparisons and cost calculations.
| Tier | Price Range | Models |
|---|---|---|
| 🆓 Free | $0 | Self-hosted, free tiers |
| 💵 Budget | $0.07 - $0.50/1M | GLM-4.7-FlashX, GLM-4-32B-0414-128K, Yi-Lightning, DeepSeek-V3.1, MiniMax-M2.5 |
| 💰 Mid-range | $0.60 - $15.00/1M | Kimi K2.5, Sonar, GLM-5, GPT-5, Claude Sonnet |
| 💎 Premium | $15.00 - $600.00/1M | Claude Opus, o1-Pro |
AI chat apps
| Product | Plans (USD) | Notes | Official Source |
|---|---|---|---|
| ChatGPT | Go $8, Plus $20, Pro $200, Business $25/seat (annual) or $30/seat (monthly), Enterprise (contact sales) | Consumer prices are US-listed; Go is localized in some markets | 🔗 |
| Claude | Pro $20, Max $100 (5×) or $200 (20×), Team/Enterprise (see pricing) | Prices shown exclude applicable taxes; availability varies by region | 🔗 |
| Google AI (Gemini) | Plus $7.99, Pro $19.99, Ultra $249.99 | US pricing; some regions/local pricing differ | 🔗 |
Coding assistants
| Tool | Plans (USD) | Notes | Official Source |
|---|---|---|---|
| GitHub Copilot | Free $0, Pro $10, Pro+ $39, Business $19/user, Enterprise $39/user | Annual options available for Pro/Pro+ | 🔗 |
| Model | Input | Output | Best For |
|---|---|---|---|
| GLM-4.7-FlashX | $0.07 | $0.40 | Fast budget tasks |
| GLM-4-32B-0414-128K | $0.10 | $0.10 | Budget chat/coding |
| DeepSeek-V3.1 | $0.27 | $0.41 | Everything |
| Gemini 3 Flash | $0.30 | $2.50 | Long context |
| MiniMax-M2.5 | $0.30 | $1.20 | Coding, long context |
| GLM-4.6 | $0.60 | $2.20 | General purpose |
| Kimi K2.5 | $0.60 | $3.00 | Multimodal + agent tasks |
| GLM-5 | $1.00 | $3.20 | Agentic engineering |
| Perplexity Sonar | $1.00 | $1.00 | Web-grounded chat (request fees apply) |
| GPT-5 | $1.25 | $10.00 | General purpose |
| Perplexity Sonar Reasoning Pro | $2.00 | $8.00 | Reasoning + search (request fees apply) |
| Claude Sonnet 4.5 | $3.00 | $15.00 | Best coding |
| Perplexity Sonar Pro | $3.00 | $15.00 | Higher quality + search (request fees apply) |
| Claude Opus 4.6 | $5.00 | $25.00 | Agentic coding |
Note: Some search-grounded models charge both token rates and per-request search/context fees. See Perplexity’s official pricing for details: https://docs.perplexity.ai/docs/getting-started/pricing
| Usage Level | Self-Host (A100) | API (GPT-5) | Winner |
|---|---|---|---|
| Light (1M tokens) | $300 (rental) | $10 | API |
| Medium (100M tokens) | $300 | $1,000 | Self-host |
| Heavy (1B tokens) | $300 | $10,000 | Self-host |
| Enterprise (10B+ tokens) | $2,000 (owned) | $100,000+ | Self-host |
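The break-even logic behind this table is a flat monthly self-host cost versus a linear per-token API cost. The $300/month rental and $10 per 1M tokens below mirror the table's illustrative figures; plug in your own numbers:

```python
# Break-even sketch: flat monthly self-host cost vs linear per-token API
# cost. The $300/mo rental and $10 per 1M token defaults mirror the table's
# illustrative figures; substitute your actual prices.
def cheaper_option(tokens_millions: float,
                   selfhost_monthly_usd: float = 300.0,
                   api_usd_per_million: float = 10.0) -> str:
    api_cost = tokens_millions * api_usd_per_million
    return "API" if api_cost < selfhost_monthly_usd else "Self-host"

print(cheaper_option(1))    # API ($10 vs $300)
print(cheaper_option(100))  # Self-host ($1,000 vs $300)
```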
Reference materials including glossary, comparison tables, and data sources.
Definitions of common terms used throughout the documentation.
| Term | Definition |
|---|---|
| Agent | AI system that autonomously performs tasks and interacts with environments |
| API | Interface for programmatically accessing AI models |
| Attention Mechanism | Neural network component focusing on relevant input parts |
| Benchmark | Standardized test measuring model performance |
| Chain-of-Thought (CoT) | Prompting technique showing step-by-step reasoning |
| Term | Definition |
|---|---|
| Fine-Tuning | Adapting pre-trained model to specific tasks |
| Frontier Model | State-of-the-art proprietary model |
| GPU | Hardware accelerator essential for ML |
| LLM | Large Language Model |
| LoRA | Efficient fine-tuning method |
| Term | Definition |
|---|---|
| MCP | Model Context Protocol for tool interaction |
| MMLU | Massive Multitask Language Understanding benchmark |
| MoE | Mixture of Experts architecture |
| Multimodal | Processing multiple input types |
| RAG | Retrieval-Augmented Generation |
| Term | Definition |
|---|---|
| Self-Hosting | Running models on own infrastructure |
| SLM | Small Language Model |
| SWE-bench | Benchmark for real GitHub issue resolution |
| Token | Basic unit of text processing |
| VRAM | GPU memory for model storage |
Side-by-side comparisons of AI models sorted by various criteria.
| 🏢 Company | 🤖 Model | 📦 Version | 📅 Release Date | 🔄 Last Updated | 💻 Coding | 📊 Benchmarks | 💰 Price | 🖥️ Self-Host | 🔗 Official Site |
|---|---|---|---|---|---|---|---|---|---|
| 🔬 DeepSeek | DeepSeek | V4 | 2026-02-17 00:00 UTC | 2026-02-17 00:00 UTC | ✅ | N/A | Pay-per-token | ✅ | 🔗 |
| 🌐 Google DeepMind | Gemini 3 | Deep Think | 2026-02-12 00:00 UTC | 2026-02-12 00:00 UTC ⭐ | ✅ | 84.6% ARC-AGI-2 | Ultra subscription | ❌ | 🔗 |
| 🇨🇳 Zhipu AI | GLM | 5 | 2026-02-12 00:00 UTC | 2026-02-12 00:00 UTC ⭐ | ✅ | SWE-bench 77.8 | $1.00 / $3.20 | ✅ | 🔗 |
| 🤖 Anthropic | Claude | Opus 4.6 | 2026-02-05 00:00 UTC | 2026-02-05 00:00 UTC ⭐ | ✅ | SWE-bench SOTA | $5 / $25 | ❌ | 🔗 |
| 🤖 OpenAI | GPT-5 | 5.3-Codex | 2026-02-05 00:00 UTC | 2026-02-05 00:00 UTC ⭐ | ✅ | Agentic leader | TBD | ❌ | 🔗 |
| 🌙 Moonshot AI | Kimi | K2.5 | 2026-01-29 00:00 UTC | 2026-02-02 00:00 UTC ⭐ | ✅ | N/A | $0.60 / $3.00 | ❌ | 🔗 |
| 🏢 Company | 🤖 Model | 📅 Release Window | Notes | 🔗 Official Site |
|---|---|---|---|---|
| 🧠 MiniMax | MiniMax M2.5 | 2026-02 | $0.30 / $1.20 | 🔗 |
| 🇨🇳 Alibaba/Qwen | Qwen 3.5-Max | 2026-02 | Open-source release window | 🔗 |
| 🌐 Google DeepMind | Gemini 3 Pro | 2026-01 | Tiered pricing | 🔗 |
| 🤖 OpenAI | GPT-5 5.3 | 2026-01 | $1.25 / $10.00 | 🔗 |
| 💻 Mistral AI | Mistral Large 3 | 2026-01 | Open-weight | 🔗 |
| Rank | Model | Input | Output | License |
|---|---|---|---|---|
| 1 | Self-hosted | $0 | $0 | Various |
| 2 | GLM-4.7-Flash | $0 | $0 | Free |
| 3 | GLM-4.7-FlashX | $0.07 | $0.40 | API |
| 4 | GLM-4-32B-0414-128K | $0.10 | $0.10 | API |
| 5 | Yi-Lightning | $0.14 | $0.42 | Apache 2.0 |
| 6 | DeepSeek-V3.1 | $0.27 | $0.41 | MIT |
| 7 | Gemini 3 Flash | $0.30 | $2.50 | Proprietary |
| 8 | MiniMax-M2.5 | $0.30 | $1.20 | Proprietary |
| Rank | Model | HumanEval | Self-Host |
|---|---|---|---|
| 1 | Claude Sonnet 4.5 | ~92% | ❌ |
| 2 | GPT-OSS-120B | ~89% | ✅ |
| 3 | DeepSeek-Coder-V2 | ~92% | ✅ |
| 4 | Qwen3-Coder | ~92% | ✅ |
| 5 | DeepSeek-V3.1 | 82%+ | ✅ |
| Rank | Model | Context | Best For |
|---|---|---|---|
| 1 | Gemini 3 Flash | 10M | Entire libraries |
| 2 | Llama 4 Scout | 10M | Long-document RAG |
| 3 | Gemini 3 Pro | 1M+ | Research papers |
| 4 | Kimi K2.5 | 256K | Large codebases |
Attribution, verification sources, and methodology.
| Company | Source | URL |
|---|---|---|
| OpenAI | Official Documentation | openai.com |
| OpenAI | ChatGPT subscriptions (Go/Plus/Pro) | openai.com |
| OpenAI | ChatGPT Business pricing | help.openai.com |
| Anthropic | Claude Documentation | anthropic.com |
| Anthropic | Claude Pro pricing | anthropic.com |
| Anthropic | Max plan pricing | anthropic.com |
| Google | Gemini Documentation | deepmind.google |
| Google | Google AI Plus pricing | blog.google |
| Google | Google AI Pro pricing | one.google.com |
| Google | Google AI Ultra pricing | blog.google |
| GitHub | Copilot plans & pricing | github.com |
| Zhipu AI (Z.ai) | Developer Documentation | docs.z.ai |
| MiniMax | Developer Documentation | platform.minimax.io |
| MiniMax | Pricing (Pay‑as‑you‑go) | platform.minimax.io |
| Moonshot AI | Developer Documentation | platform.moonshot.ai |
| Moonshot AI | Models & Pricing | platform.moonshot.ai |
| Cohere | Developer Documentation | docs.cohere.com |
| AI21 Labs | Developer Documentation | docs.ai21.com |
| Perplexity | Developer Documentation | docs.perplexity.ai |
| ByteDance (Volcengine) | Developer Documentation | volcengine.com |
| Tencent (Hunyuan) | Cloud Documentation | cloud.tencent.com |
| Baidu (ERNIE) | AI Studio Documentation | ai.baidu.com |
| DeepSeek | Official Website | deepseek.com |
| Meta | Llama Documentation | llama.meta.com |
| Benchmark | Source | Description |
|---|---|---|
| HumanEval | OpenAI | 164 Python programming problems |
| SWE-bench | Princeton | Real GitHub issue resolution |
| MMLU | UC Berkeley | 57 subjects, multi-task |
| AIME | MAA | American Invitational Math Exam |
| ARC-AGI | ARC Prize | Abstract reasoning challenge |
- Primary Source Review - Check official documentation
- Cross-Validation - Compare multiple sources
- Timestamp Verification - All data includes verification date
- Update Tracking - Monitor official channels
This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License - see the LICENSE file for details.
Last Updated: 2026-02-24 00:36 UTC
Maintained by: ReadyPixels LLC & AI Models Matrix Contributors
Made with ❤️ by ReadyPixels LLC