Gemma 4 Master Prompts (June 2026)
Optimized prompts for Google Gemma 4 12B — the encoder-free any-to-any multimodal model. Handles text, image, audio, and video with 256K context. Apache 2.0 open weights. Laptop-class deployment.
📋 Prompt
/*
GEMMA-4 MASTER PROMPT
VERSION: 1.0.0
CAPABILITIES: Any-to-Any Multimodal, 256K Context, 140+ Languages
ARCHITECTURE: 12B Encoder-Free, Apache 2.0
*/

**Task Type:** [analysis | translation | generation | coding | creative]
**Input Modality:** [text | image | audio | video | mixed]
**Output Format:** [markdown | JSON | code | plain text]
**Language:** [TARGET_LANGUAGE] (Gemma 4 supports 140+ natively)

**Context (use the full 256K):**
[DETAILED_CONTEXT: include reference docs, examples, specifications]

**Instructions:**
1. [STEP_1: modality-specific analysis]
2. [STEP_2: processing]
3. [STEP_3: output formatting]

**Constraints:**
- [CONSTRAINT_1]
- [CONSTRAINT_2]

Key Gemma 4 capabilities:
- Encoder-free design: no separate vision/audio encoders, unified processing
- 256K context window: provide extensive reference material
- 140+ languages: specify target language explicitly
- Apache 2.0: fully open for commercial use
💡 Tips
- Gemma 4's any-to-any design means you can mix image, audio, and text in a single prompt
- Always specify the output language: 140+ are supported, but the model defaults to the input language
- 256K context is generous — include reference docs and examples directly in the prompt
- For coding tasks, provide full file context rather than snippets — the model excels at large-context understanding
- The Apache 2.0 license permits commercial use, modification, and redistribution, provided you keep the license text and attribution notices
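To act on the tip about including reference docs directly in the prompt, it helps to check that they fit the 256K window before sending. The sketch below is one hedged way to do that. The 4-characters-per-token ratio is a rough assumption, not Gemma 4's tokenizer, and `estimate_tokens`, `pack_documents`, and the 8,000-token output reserve are hypothetical names and values chosen for illustration.

```python
CONTEXT_WINDOW_TOKENS = 256_000  # context size stated in this guide
CHARS_PER_TOKEN = 4              # assumption: crude heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate via ceiling division of the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def pack_documents(docs, reserve_for_output=8_000, budget=CONTEXT_WINDOW_TOKENS):
    """Greedily keep (name, text) documents, in order, while they still fit.

    Returns the kept names, the skipped names, and the tokens left over.
    """
    remaining = budget - reserve_for_output
    kept, skipped = [], []
    for name, text in docs:
        cost = estimate_tokens(text)
        if cost <= remaining:
            kept.append(name)
            remaining -= cost
        else:
            skipped.append(name)
    return kept, skipped, remaining


if __name__ == "__main__":
    docs = [
        ("spec.md", "x" * 400_000),       # ~100K estimated tokens
        ("transcript.txt", "y" * 800_000),  # ~200K estimated tokens, will not fit
        ("notes.txt", "z" * 40),
    ]
    kept, skipped, remaining = pack_documents(docs)
    print("kept:", kept)
    print("skipped:", skipped)
    print("tokens left:", remaining)
```

For real budgeting, swap `estimate_tokens` for a count from the model's actual tokenizer; character heuristics can be far off for code and non-Latin scripts.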
Gemma 4 Prompt Guide
Gemma 4 12B (released June 2026) is Google’s encoder-free any-to-any multimodal model: a single unified architecture that processes text, images, audio, and video without separate modality-specific encoders. It ships with open weights under Apache 2.0, and at 12B parameters it is small enough for laptop-class deployment.
Key Capabilities
| Feature | Specification |
|---|---|
| Architecture | 12B encoder-free any-to-any |
| Context Window | 256,000 tokens |
| Languages | 140+ natively supported |
| Modalities | Text, image, audio, video |
| License | Apache 2.0 (fully open) |
| Deployment | Laptop-class (ONNX + MLX ready) |
Prompting Strategy
- Declare modalities upfront — Tell Gemma 4 what types of input you’re providing
- Use the full context — 256K tokens lets you include entire documents, codebases, or transcripts
- Specify output format — Gemma 4 responds well to structured output format directives
- Explicit language selection — For multilingual tasks, name the target language explicitly
- Sequential analysis for mixed content — Break complex multi-modal tasks into ordered steps
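The strategy above can be sketched as a small prompt builder that follows the master prompt's field layout. This is a minimal illustration, not an official Gemma 4 API: `build_prompt` and `ALLOWED_MODALITIES` are hypothetical names, and the function only assembles text, leaving the actual model call to whatever runtime you use.

```python
ALLOWED_MODALITIES = {"text", "image", "audio", "video"}


def build_prompt(task_type, modalities, output_format, language,
                 context, steps, constraints=()):
    """Assemble a prompt that declares modalities, format, and language upfront."""
    unknown = set(modalities) - ALLOWED_MODALITIES
    if not modalities or unknown:
        raise ValueError(f"invalid modalities: {sorted(unknown) or 'none given'}")
    if not steps:
        raise ValueError("at least one instruction step is required")

    lines = [
        f"**Task Type:** {task_type}",
        f"**Input Modality:** {', '.join(modalities)}",
        f"**Output Format:** {output_format}",
        f"**Language:** {language}",
        "",
        "**Context:**",
        context.strip(),
        "",
        "**Instructions:**",
    ]
    # Ordered steps keep mixed-modality tasks sequential.
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    if constraints:
        lines += ["", "**Constraints:**"]
        lines += [f"- {c}" for c in constraints]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(
        task_type="analysis",
        modalities=["image", "text"],
        output_format="JSON",
        language="German",
        context="Product photo plus the attached spec sheet.",
        steps=["Describe the image", "Compare it with the spec", "Report mismatches"],
        constraints=["Return valid JSON only"],
    ))
```

Validating the modality list early catches typos such as `"vidoe"` before they reach the model, where they would silently weaken the modality declaration.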
Deployment
Weights are available via Hugging Face. Quantization-aware training (QAT) enables INT4/FP8 deployment on consumer hardware. ONNX and MLX ports are also available; the MLX port targets Apple Silicon.
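To see why INT4 and FP8 matter on consumer hardware, a back-of-the-envelope calculation of weight memory is useful. The sketch below covers the weights only; it ignores the KV cache, activations, and quantization metadata, all of which add to the real footprint. The BF16 row is included only as an assumed unquantized baseline for comparison, and `weight_memory_gb` is a hypothetical helper name.

```python
PARAMS = 12_000_000_000  # 12B parameters, per the capabilities table

# Bits stored per weight for each format; BF16 is an assumed baseline.
BITS_PER_WEIGHT = {"BF16": 16, "FP8": 8, "INT4": 4}


def weight_memory_gb(params: int, bits: int) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits / 8 / 1e9


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name}: ~{weight_memory_gb(PARAMS, bits):.0f} GB")
    # BF16: ~24 GB
    # FP8: ~12 GB
    # INT4: ~6 GB
```

By this estimate, INT4 weights come to roughly 6 GB, which is what puts a 12B model within reach of a laptop with 16 GB of memory, provided the context in use is kept well below the full 256K.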