Deployed on dedicated GPU
NVIDIA RTX 4090
Model + GPU instance
Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B q…
- Release: April 2, 2026
- Parameters: 25.8B (reported)
- Quantization: Q4_K_M
- Size: 18 GB
- Context: 23K tokens
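
As a rough sanity check on these figures, the short sketch below works out the share of parameters active per token, the VRAM left after loading the weights, and the implied average bits per weight. The 24 GB VRAM figure for the RTX 4090 is an assumption that is not stated in the listing, and the helper names are hypothetical. The listing gives two parameter counts (25.2B in the description, 25.8B in the specs); the sketch uses 25.2B and treats 1 GB as 10^9 bytes, so read the results as estimates only.

```python
# Figures copied from the listing.
TOTAL_PARAMS_B = 25.2    # total parameters in billions (from the description)
ACTIVE_PARAMS_B = 3.8    # parameters active per token, in billions
MODEL_SIZE_GB = 18.0     # size of the Q4_K_M build

# Assumption, not stated in the listing: an RTX 4090 has 24 GB of VRAM.
GPU_VRAM_GB = 24.0


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters used for each token."""
    return active_b / total_b


def vram_headroom_gb(vram_gb: float, model_gb: float) -> float:
    """VRAM left for the KV cache and runtime overhead after loading weights."""
    return vram_gb - model_gb


def bits_per_weight(size_gb: float, params_b: float) -> float:
    """Average storage per parameter, treating 1 GB as 1e9 bytes."""
    return size_gb * 8 / params_b


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Active per token: {frac:.1%} of total parameters")
    print(f"VRAM headroom: {vram_headroom_gb(GPU_VRAM_GB, MODEL_SIZE_GB):.1f} GB")
    print(f"Average bits per weight: {bits_per_weight(MODEL_SIZE_GB, TOTAL_PARAMS_B):.2f}")
```

The headroom figure matters because the KV cache grows with context length, so the memory left after loading the weights is one plausible reason the context is capped well below what the model might otherwise support.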