Gemma 4 Master Prompts (June 2026)

📁 LlmMultimodal 🤖 Gemma-4 📊 Intermediate 📅 Jun 12, 2026

Optimized prompts for Google Gemma 4 12B — the encoder-free any-to-any multimodal model. Handles text, image, audio, and video with 256K context. Apache 2.0 open weights. Laptop-class deployment.

📋 Prompt

/* GEMMA-4 MASTER PROMPT
   VERSION: 1.0.0
   CAPABILITIES: Any-to-Any Multimodal, 256K Context, 140+ Languages
   ARCHITECTURE: 12B Encoder-Free, Apache 2.0 */

**Task Type:** [analysis | translation | generation | coding | creative]
**Input Modality:** [text | image | audio | video | mixed]
**Output Format:** [markdown | JSON | code | plain text]
**Language:** [TARGET_LANGUAGE] (Gemma 4 supports 140+ natively)

**Context (use the full 256K):**
  [DETAILED_CONTEXT — include reference docs, examples, specifications]

**Instructions:**
  1. [STEP_1 — modality-specific analysis]
  2. [STEP_2 — processing]
  3. [STEP_3 — output formatting]

**Constraints:**
  - [CONSTRAINT_1]
  - [CONSTRAINT_2]

Key Gemma 4 capabilities:
- Encoder-free design: no separate vision/audio encoders — unified processing
- 256K context window: provide extensive reference material
- 140+ languages: specify target language explicitly
- Apache 2.0: fully open for commercial use

💡 Tips

  • Gemma 4's any-to-any design means you can mix image, audio, and text in a single prompt
  • Always specify output language — 140+ supported, but defaults to input language
  • 256K context is generous — include reference docs and examples directly in the prompt
  • For coding tasks, provide full file context rather than snippets — the model excels at large-context understanding
  • Apache 2.0 license means no usage restrictions for commercial deployment

Gemma 4 Prompt Guide

Gemma 4 12B (released June 2026) is Google’s encoder-free any-to-any multimodal model — a single unified architecture that processes text, images, audio, and video without separate modality-specific encoders. It ships with Apache 2.0 open weights, making it the most deployable multimodal open model available.

Key Capabilities

FeatureSpecification
Architecture12B encoder-free any-to-any
Context Window256,000 tokens
Languages140+ natively supported
ModalitiesText, image, audio, video
LicenseApache 2.0 (fully open)
DeploymentLaptop-class (ONNX + MLX ready)

Prompting Strategy

  1. Declare modalities upfront — Tell Gemma 4 what types of input you’re providing
  2. Use the full context — 256K tokens lets you include entire documents, codebases, or transcripts
  3. Specify output format — Gemma 4 responds well to structured output format directives
  4. Explicit language selection — For multilingual tasks, name the target language explicitly
  5. Sequential analysis for mixed content — Break complex multi-modal tasks into ordered steps

Deployment

Weights available via Hugging Face. QAT (Quantization-Aware Training) enables INT4/FP8 deployment on consumer hardware. ONNX and MLX ports available for Apple Silicon.

Related Prompts

llm nemotron-3 nvidia Nemotron-3-Ultra

Optimized prompts for NVIDIA Nemotron 3 Ultra — the first open-weight 550B hybrid Mamba-MoE model. 55B active parameters, 1M context window, 89.1 MMLU. Datacenter-scale agentic reasoning.

View