Gemma 4 26B A4B maker logoGemma 4 26B A4B - NVIDIA RTX 4090

ollama/gemma4:26b
Deploy
gemma4:26bReleased April 2, 202623K context
Best forReasoningAgentic Coding
Best fit competitor·Claude Sonnet 4.5

Why teams pick Gemma 4 26B A4B over Claude Sonnet 4.5

Run it on your own GPU with predictable flat pricing — no per-token API meter running in the background.

Apache 2.0 weights you can fine-tune, audit, and keep inside your network instead of routing prompts through a hosted API.

MoE architecture activates only 3.8B parameters per token, so you get strong reasoning quality without paying for a full dense 27B+ API bill.

About

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at a fraction of the compute cost. Supports multimodal input including text, images, and video (up to 60s at 1fps). Features a 256K token context window, native function calling, configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0.

Compare

Model Cost Across Durations

Live pack pricing vs typical API estimates from 11 hours through 1 month.

API estimates for GPT-4.1 and Claude Sonnet 4.5 vs Gemma 4 26B A4B on RTX 4090.

Time pack

Gemma 4 26B A4B on RTX 4090

24 hours cost

$22

Lowest

GPT-4.1

24 hours cost

$33.87

Save $11.87 vs our model

Claude Sonnet 4.5

24 hours cost

$58.06

Save $36.06 vs our model

Models in chart

  • Gemma 4 26B A4B on RTX 4090
  • GPT-4.1
  • Claude Sonnet 4.5

At a glance

Release
April 2, 2026
Parameters
25.8B (reported)
Quantization
Q4_K_M
Size
18GB
Context
23K

Benchmarks

Performance metrics for Gemma 4 26B A4B (Reasoning). Source: Artificial Analysis.

Performance indexes

31.2
Artificial Analysis
Intelligence Index
Better than 65% of models compared
22.4
Artificial Analysis
Coding Index
Better than 57% of models compared
32.1
Artificial Analysis
Agentic Index
Better than 58% of models compared

Benchmark scores

GPQA Diamond
i
Graduate-level scientific reasoning
79.2%
HLE
i
Humanity's Last Exam
18.3%
IFBench
i
Instruction-following benchmark
72.4%
τ²-Bench Telecom
i
Conversational AI agents in dual-control scenarios
43.6%
AA-LCR
i
Long context reasoning evaluation
55.7%
GDPval-AA
i
Economically valuable tasks
25.7%
CritPt
i
Research-level physics reasoning
0.0%

Apps & integrations

Choose an app below. Each guide shows how to point the app at your OpenAI-compatible endpoint.

FAQ

Frequently asked questions

Common questions about Gemma 4 26B A4B, deployment, and using it on OpenLLM Buddy.

6 questions

Ready to try it? Deploy Gemma 4 26B A4B · Browse models