Anthropic LLM Benchmarks – Performance & Latency

Provider Snapshot

Models Tracked

5

Avg Tokens / Second

25.74

Avg Time to First Token (ms)

1284.00

Last Updated

Feb 8, 2026
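
The two fleet-wide averages in the snapshot are consistent with a plain unweighted mean of the per-model averages listed in the table in this report. The source does not say how they were computed, so the Python sketch below is only a reconstruction that assumes every model counts equally (no weighting by run count). The name `MODEL_AVERAGES` is hypothetical.

```python
from statistics import fmean

# Per-model averages copied from the table in this report:
# model -> (avg tokens/second, avg time to first token in ms)
MODEL_AVERAGES = {
    "claude-haiku-4.5": (51.50, 620.00),
    "claude-4-sonnet": (20.40, 1570.00),
    "claude-opus-4.5": (19.70, 1720.00),
    "Claude Opus 4.1": (19.00, 1320.00),
    "claude-4-opus": (18.10, 1190.00),
}

# Assumption: each model is weighted equally, regardless of how many
# benchmark runs it contributed.
fleet_tokens_per_sec = fmean(tps for tps, _ in MODEL_AVERAGES.values())
fleet_ttft_ms = fmean(ttft for _, ttft in MODEL_AVERAGES.values())

print(f"Avg tokens/second: {fleet_tokens_per_sec:.2f}")
print(f"Avg time to first token (ms): {fleet_ttft_ms:.2f}")
```

Under that equal-weight assumption the script prints 25.74 and 1284.00, which agree with the snapshot values.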

Five Anthropic models are actively benchmarked, with 1179 measurements in total, one per benchmark run.
claude-haiku-4.5 leads the fleet at 51.50 tokens/second; claude-4-opus is the slowest at 18.10 tokens/second.
Average throughput is 184.5% higher for the fastest model than for the slowest. Almost all of that gap comes from claude-haiku-4.5, since the other four models cluster between 18.10 and 20.40 tokens/second.
Average time to first token across the fleet is 1284.00 ms, ranging from 620.00 ms (claude-haiku-4.5) to 1720.00 ms (claude-opus-4.5).
The coefficient of variation of per-model average throughput is 50.1%.
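
The spread and variation figures above can be reproduced from the five per-model averages. The sketch below assumes the spread is measured relative to the slowest model and that the coefficient of variation uses the population standard deviation. The source states neither, but both choices reproduce the reported numbers; the sample standard deviation would give roughly 56.0% instead of 50.1%.

```python
from statistics import fmean, pstdev

# Average tokens/second per model, copied from the table in this report.
avg_tokens_per_sec = [51.50, 20.40, 19.70, 19.00, 18.10]

fastest = max(avg_tokens_per_sec)
slowest = min(avg_tokens_per_sec)

# Assumption: spread is expressed relative to the slowest model.
spread_pct = (fastest - slowest) / slowest * 100

# Assumption: coefficient of variation = population std dev / mean.
cv_pct = pstdev(avg_tokens_per_sec) / fmean(avg_tokens_per_sec) * 100

print(f"Spread: {spread_pct:.1f}%")
print(f"Coefficient of variation: {cv_pct:.1f}%")
```

With only five data points and one clear outlier, both statistics mostly describe the gap between claude-haiku-4.5 and the other four models.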

Provider	Model	Avg Tokens/Sec	Min Tokens/Sec	Max Tokens/Sec	Avg TTFT (ms)
anthropic	claude-haiku-4.5	51.50	15.50	79.80	620.00
anthropic	claude-4-sonnet	20.40	8.52	31.10	1570.00
anthropic	claude-opus-4.5	19.70	11.80	29.10	1720.00
anthropic	Claude Opus 4.1	19.00	5.42	24.50	1320.00
anthropic	claude-4-opus	18.10	4.37	23.80	1190.00
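
To compare the rows on a single scale, throughput and time to first token can be combined into a rough end-to-end estimate: time to first token plus output length divided by throughput. The sketch below parses the tab-separated rows above and applies that approximation. It assumes the tokens/second figure covers generation after the first token, which the source does not confirm. The field names, the `estimate_seconds` helper, and the 500-token output length are hypothetical choices for illustration.

```python
import csv
import io

# Table rows copied from this report, tab-separated.
ROWS = "\n".join([
    "anthropic\tclaude-haiku-4.5\t51.50\t15.50\t79.80\t620.00",
    "anthropic\tclaude-4-sonnet\t20.40\t8.52\t31.10\t1570.00",
    "anthropic\tclaude-opus-4.5\t19.70\t11.80\t29.10\t1720.00",
    "anthropic\tClaude Opus 4.1\t19.00\t5.42\t24.50\t1320.00",
    "anthropic\tclaude-4-opus\t18.10\t4.37\t23.80\t1190.00",
])

# Hypothetical field names for the six columns.
FIELDS = ["provider", "model", "avg_tps", "min_tps", "max_tps", "avg_ttft_ms"]

OUTPUT_TOKENS = 500  # hypothetical response length


def estimate_seconds(avg_ttft_ms: float, avg_tps: float, output_tokens: int) -> float:
    """Approximate response time: wait for the first token, then stream the rest."""
    return avg_ttft_ms / 1000 + output_tokens / avg_tps


reader = csv.DictReader(io.StringIO(ROWS), fieldnames=FIELDS, delimiter="\t")
estimates = {
    row["model"]: estimate_seconds(
        float(row["avg_ttft_ms"]), float(row["avg_tps"]), OUTPUT_TOKENS
    )
    for row in reader
}

# Print models from fastest to slowest estimated response time.
for model, seconds in sorted(estimates.items(), key=lambda item: item[1]):
    print(f"{model}: ~{seconds:.1f} s for {OUTPUT_TOKENS} tokens")
```

For responses of this length the throughput term dominates the estimate, so differences in time to first token matter most for short replies.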
