Titan-v2-Max

Optimized for complex reasoning and long inputs, with a context window of up to 200k tokens.

LATENCY: 140ms
PRICE: $0.02 / 1k

Flash-v4-Nano

Low-latency inference for simple tasks and conversational UIs.

LATENCY: 12ms
PRICE: $0.002 / 1k
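
To compare the two listings on a concrete workload, the figures above can be plugged into a small cost estimate. This is only a sketch: it assumes the "/ 1k" price is per 1,000 tokens and applies equally to input and output, which the listing does not state. The names ModelCard and estimate_cost are hypothetical helpers, not part of any published API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCard:
    """Figures copied from the listing; the pricing unit is an assumption."""
    name: str
    latency_ms: int
    usd_per_1k: float  # assumed to mean USD per 1,000 tokens


CARDS = [
    ModelCard("Titan-v2-Max", latency_ms=140, usd_per_1k=0.02),
    ModelCard("Flash-v4-Nano", latency_ms=12, usd_per_1k=0.002),
]


def estimate_cost(card: ModelCard, tokens: int) -> float:
    """Return the estimated cost in USD for processing `tokens` tokens."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return card.usd_per_1k * tokens / 1000


if __name__ == "__main__":
    # Hypothetical workload: 50,000 tokens.
    for card in CARDS:
        cost = estimate_cost(card, 50_000)
        print(f"{card.name}: ${cost:.2f} at {card.latency_ms}ms listed latency")
```

Under these assumptions the listed prices differ by a factor of ten, so the choice mostly comes down to whether a task needs the larger model's reasoning and context window.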