Google
Gemini 3 Flash Preview

Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance. It delivers near-Pro-level reasoning and tool-use performance with substantially lower latency than larger Gemini variants, making it well suited for interactive development, long-running agent loops, and collaborative coding tasks. Compared to Gemini 2.5 Flash, it provides broad quality improvements across reasoning, multimodal understanding, and reliability.

The model supports a 1M-token context window and multimodal inputs including text, images, audio, video, and PDFs, with text output. It includes configurable reasoning via thinking levels (minimal, low, medium, high), structured output, tool use, and automatic context caching. Gemini 3 Flash Preview is optimized for users who want strong reasoning and agentic behavior without the cost or latency of full-scale frontier models.
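The thinking levels and context window described above can be checked on the client side before a request is sent. The following is a minimal sketch, not the official API: the helper `build_request`, the dictionary keys, and the model identifier string are hypothetical names chosen for illustration. Only the four level names and the 1,048,576-token limit come from this listing.

```python
from typing import Optional

# Values taken from the listing above.
THINKING_LEVELS = ("minimal", "low", "medium", "high")
CONTEXT_WINDOW_TOKENS = 1_048_576

# Assumed identifier, for illustration only; check the provider's docs.
MODEL_ID = "gemini-3-flash-preview"


def build_request(
    prompt: str,
    thinking_level: str = "low",
    prompt_tokens: Optional[int] = None,
) -> dict:
    """Build a plain request dict after validating it against the listed limits.

    The dict keys are hypothetical and do not mirror any real SDK schema.
    """
    if thinking_level not in THINKING_LEVELS:
        raise ValueError(
            f"thinking_level must be one of {THINKING_LEVELS}, got {thinking_level!r}"
        )
    # Only check the context window when the caller supplies a token count.
    if prompt_tokens is not None and prompt_tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError(
            f"prompt uses {prompt_tokens} tokens, above the "
            f"{CONTEXT_WINDOW_TOKENS}-token context window"
        )
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "thinking_level": thinking_level,
    }
```

A lower level such as `minimal` would typically be chosen for latency-sensitive chat turns, and `high` for harder reasoning steps inside an agent loop.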

Model details

Context window	1,048,576 tokens
Max completion size	59 tokens
Prompt cost / 1K tokens	$0.0000005
Completion cost / 1K tokens	$0.000003

Usage pricing
Prompt	$0.0000005
Completion	$0.000003
Request	FREE
Image	FREE
Web Search	FREE
Internal Reasoning	FREE
Input Cache Read	FREE
Input Cache Write	FREE
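To see how the prompt and completion rates combine, here is a small cost estimate in Python. It is a sketch under one stated assumption: the listed figures are treated as USD per single token, since the usage pricing table gives no unit. The `tokens_per_unit` parameter is a hypothetical knob for readers who interpret the rates as applying per 1K tokens instead. The function name `estimate_cost` is illustrative.

```python
from decimal import Decimal

# Rates copied from the usage pricing table.
# Assumption: USD per single token (the table itself states no unit).
PROMPT_RATE = Decimal("0.0000005")
COMPLETION_RATE = Decimal("0.000003")


def estimate_cost(
    prompt_tokens: int,
    completion_tokens: int,
    tokens_per_unit: int = 1,
) -> Decimal:
    """Estimate the USD cost of one request.

    tokens_per_unit is how many tokens each listed rate covers:
    1 for per-token rates, 1000 if the rates are read as per 1K tokens.
    """
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if tokens_per_unit <= 0:
        raise ValueError("tokens_per_unit must be positive")
    total = prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE
    return total / tokens_per_unit


if __name__ == "__main__":
    # 10,000 prompt tokens and 2,000 completion tokens at per-token rates.
    print(estimate_cost(10_000, 2_000))
```

`Decimal` is used rather than `float` so that very small per-token rates do not pick up binary rounding error. The other line items in the table (request, image, web search, internal reasoning, cache read and write) are listed as FREE and are therefore left out of the sum.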



Benchmark performance

All scores have a maximum of 100 points.

Benchmark categories: Overall, Cost, Logic, Speed, Scoring, Tool Use, Hallucination, Classification, Structured Output

Pricing

Grok 4 Fast

Qwen3 VL 235B A22B Instruct

Grok 4.1 Fast

GPT-5.1 Chat

GPT-5.1-Codex

Claude Haiku 4.5
