Inception
Mercury

Mercury is the first diffusion large language model (dLLM). Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed-optimized models such as GPT-4.1 Nano and Claude 3.5 Haiku while matching their performance. Mercury's speed lets developers build responsive user experiences, including voice agents, search interfaces, and chatbots. Read more in the [blog post](https://www.inceptionlabs.ai/blog/introducing-mercury).

Model details

Context window: 128,000 tokens

Max completion size: 83 tokens

Prompt cost / 1K tokens: $0.00000025

Completion cost / 1K tokens: $0.000001

Benchmark performance

Overall: 12th place

Cost: 2nd place

Logic: 13th place

Speed: 100 (score)

Usage pricing
Prompt	$0.00000025
Completion	$0.000001
Request	FREE
Image	FREE
Web Search	FREE
Internal Reasoning	FREE
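
The rates above can be turned into a per-request estimate. The following is a minimal sketch, not an official calculator: it assumes the listed prompt and completion rates are charged per 1K tokens, as the Model details labels state (the Usage pricing table itself gives no unit), and `estimate_cost` is a hypothetical helper name introduced only for illustration.

```python
from decimal import Decimal

# Rates copied from the listing. ASSUMPTION: the unit is "per 1K tokens",
# taken from the Model details labels; the Usage pricing table states no unit.
PROMPT_RATE_PER_1K = Decimal("0.00000025")
COMPLETION_RATE_PER_1K = Decimal("0.000001")


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    prompt_cost = PROMPT_RATE_PER_1K * prompt_tokens / 1000
    completion_cost = COMPLETION_RATE_PER_1K * completion_tokens / 1000
    # Request, image, web search, and internal reasoning are listed as FREE,
    # so only token charges contribute to the total.
    return prompt_cost + completion_cost


if __name__ == "__main__":
    # Example: 2,000 prompt tokens and 500 completion tokens.
    cost = estimate_cost(2000, 500)
    print(f"Estimated cost: ${cost:.10f}")
```

`Decimal` is used instead of `float` so that very small per-token amounts are added without binary rounding error. If the rates turn out to be per token rather than per 1K tokens, drop the division by 1000.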


Benchmark performance: all scores have a maximum of 100 points.