
☁️ Cloud Benchmarks ☁️

I run cron jobs to periodically test the token generation speed of different cloud LLM providers. The chart shows the distribution of speeds for each model, since throughput varies with provider load. For readability, not all models are shown in the chart; the full results are in the table below.
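As a rough illustration of what one such measurement might look like, here is a minimal Python sketch. It is not the benchmark's actual code: the function names `tokens_per_second` and `summarize`, and the idea of passing the provider's streamed tokens in as a plain iterable, are assumptions made for the example.

```python
import statistics
import time
from typing import Callable, Iterable


def tokens_per_second(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Consume a token stream and return tokens generated per second.

    `stream` stands in for a provider's streaming response (hypothetical);
    `clock` is injectable so the timing can be tested deterministically.
    """
    start = clock()
    count = 0
    for _ in stream:
        count += 1
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return count / elapsed


def summarize(speeds: list[float]) -> dict[str, float]:
    """Summarize repeated runs, since speeds vary with provider load."""
    return {
        "min": min(speeds),
        "median": statistics.median(speeds),
        "max": max(speeds),
    }
```

Calling `tokens_per_second` once per scheduled run and feeding the collected values to `summarize` gives the kind of per-model spread that a distribution chart would visualize.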

Every provider and model now has a dedicated landing page with narrative insights, SEO-friendly metadata, and structured data for search engines. Click any provider or model in the table to explore performance in depth.

I am working daily to add more providers and models, looking anywhere that does not require purchasing dedicated endpoints for hosting (which is why some models may appear to be missing). If you have any suggestions, let me know on GitHub! 😊

Pick A Path In 10 Seconds

Quick recommendations from the latest 7-day benchmark slice. Use one path, jump into full results, then drill into provider/model pages.


Fastest Models Right Now (updated <24h)

# | Model | Provider | Speed
1 | llama-3.1-8b | groq | 288 tok/s
2 | qwen-3-32b | groq | 201 tok/s
3 | llama-3.1-8b | cerebras | 190 tok/s
4 | llama-4-scout | groq | 185 tok/s
5 | llama-3.3-70b | groq | 175 tok/s
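The "updated <24h" filter in the heading matters: in the full results, groq's llama-4-maverick is listed at 226 tok/s but was last updated 24 days ago, so it does not appear in this ranking. A minimal sketch of that selection, assuming a hypothetical `Result` record and `fastest_recent` helper (neither is taken from the site's code), could look like this:

```python
from typing import NamedTuple


class Result(NamedTuple):
    provider: str
    model: str
    hours_since_update: float  # hypothetical field: age of the latest run
    tokens_per_second: float


def fastest_recent(results, max_age_hours=24.0, limit=5):
    """Return the fastest results whose latest run is newer than max_age_hours."""
    fresh = [r for r in results if r.hours_since_update < max_age_hours]
    return sorted(fresh, key=lambda r: r.tokens_per_second, reverse=True)[:limit]


# Speeds and ages copied from the first rows of the full results table.
results = [
    Result("groq", "llama-3.1-8b", 1, 288.0),
    Result("groq", "llama-4-maverick", 24 * 24, 226.0),  # 24d ago
    Result("groq", "qwen-3-32b", 1, 201.0),
    Result("cerebras", "llama-3.1-8b", 1, 190.0),
    Result("groq", "llama-4-scout", 1, 185.0),
    Result("cerebras", "gpt-oss-120b", 20 * 24, 177.0),  # 20d ago
    Result("groq", "llama-3.3-70b", 1, 175.0),
]

for rank, r in enumerate(fastest_recent(results), start=1):
    print(f"{rank} {r.model} {r.provider} {r.tokens_per_second:.0f} tok/s")
```

With these inputs the two stale entries are dropped and the remaining five come out in the same order as the list above.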

📊 Speed Distribution 📊

📚 Full Results 📚

Showing 94 of 94 models
Flagged statuses: likely_deprecated, deprecated, failing, stale, never_succeeded, disabled
Provider | Model | Status | Updated | Speed (tok/s) | Remaining columns (unsplit)
groq | llama-3.1-8b | Active | 1h ago | 288.00 | 130424100.00
groq | llama-4-maverick | Active | 24d ago | 226.00 | 13021470.00
groq | qwen-3-32b | Active | 1h ago | 201.00 | 11284210.00
cerebras | llama-3.1-8b | Active | 1h ago | 190.00 | 13531060.00
groq | llama-4-scout | Active | 1h ago | 185.00 | 7333290.00
cerebras | gpt-oss-120b | Active | 20d ago | 177.00 | 13481670.00
groq | llama-3.3-70b | Active | 1h ago | 175.00 | 40340180.00
together | llama-3.1-8b | Active | 19d ago | 147.00 | 41228220.00
groq | kimi-k2 | Active | 1h ago | 129.00 | 12211350.00
bedrock | nova-micro | Active | 24m ago | 124.00 | 64152260.00
openai | o3 Mini | Never Succeeded (Medium) | 1h ago | 110.00 | 81690.00
bedrock | llama-4-maverick | Active | 24m ago | 105.00 | 1145530.00
openai | o3-mini-2025-01-31 | Active | 1h ago | 105.00 | 151600.00
bedrock | nova-lite | Active | 24m ago | 98.90 | 20132300.00
bedrock | llama-4-scout | Active | 24m ago | 98.20 | 3130310.00
bedrock | llama-3.3-70b | Active | 24m ago | 94.20 | 2128310.00
openai | GPT-5.4-nano | Active | 1h ago | 91.20 | 42134400.00
openai | GPT-5.1-codex-max | Active | 1h ago | 90.70 | 141171130.00
deepinfra | mistral-7b | Stale (Medium) | 1h ago | 89.10 | 10148490.00
together | qwen-2.5-7b | Active | 1h ago | 88.00 | 1139530.00
openai | GPT-5.4-nano-2026-03-17 | Active | 1h ago | 86.70 | 36125450.00
openai | o1 | Active | 1h ago | 82.10 | 2114740.00
deepinfra | devstral-small | Never Succeeded (Medium) | 1h ago | 77.30 | 9140540.00
bedrock | nova-pro | Active | 24m ago | 76.80 | 19118390.00
openai | GPT-5.4-mini | Active | 1h ago | 76.80 | 16111480.00
google | gemini-2.5-flash-lite | Active | 1h ago | 74.70 | 10117530.00
fireworks | mixtral-8x22b | Active | 1h ago | 74.70 | 28111330.00
openai | gpt-4.1-nano | Active | 1h ago | 74.50 | 18139420.00
openai | GPT-5.4-mini-2026-03-17 | Active | 1h ago | 73.80 | 9119530.00
openai | gpt-3.5-turbo | Active | 1h ago | 73.00 | 4125520.00
google | gemini-2.5-flash | Never Succeeded (Medium) | 1h ago | 64.60 | 61051020.00
together | deepseek-r1 | Active | 1h ago | 64.40 | 5113570.00
openai | gpt-4o | Active | 1h ago | 59.40 | 51421580.00
together | mixtral-8x7b | Active | 1h ago | 58.20 | 8114190.00
fireworks | llama-3.3-70b | Active | 1h ago | 57.00 | 11081270.00
openai | gpt-4.1-mini | Active | 1h ago | 53.40 | 18109390.00
openai | GPT-5-chat-latest | Active | 1h ago | 52.30 | 1383550.00
openai | o4-mini-2025-04-16 | Active | 1h ago | 52.30 | 28770.00
together | llama-3.3-70b | Active | 1h ago | 51.90 | 2121930.00
openai | o4 Mini | Never Succeeded (Medium) | 1h ago | 50.50 | 4770.00
together | llama-3.2-3b | Active | 27d ago | 48.80 | 101091460.00
anthropic | claude-haiku-4.5 | Active | 1h ago | 48.10 | 373680.00
bedrock | llama-3.2-90b | Active | 24m ago | 46.60 | 250380.00
deepinfra | Qwen 2.5 Coder 32B | Never Succeeded (Medium) | 1h ago | 45.90 | 1842120.00
deepinfra | llama-3-8b | Stale (Medium) | 1h ago | 45.30 | 1869320.00
openai | gpt-4.1 | Active | 1h ago | 43.40 | 1585510.00
bedrock | mistral-large | Active | 24m ago | 40.90 | 247530.00
deepinfra | llama-3.2-1b | Stale (Medium) | 1h ago | 40.80 | 3100750.00
deepinfra | llama-3.2-3b | Stale (Medium) | 1h ago | 40.10 | 399750.00
google | gemini-2.5-pro | Never Succeeded (Medium) | 1h ago | 40.10 | 2651620.00
openai | o3-2025-04-16 | Active | 1h ago | 40.00 | 9710.00
openai | gpt-4o-mini | Active | 1h ago | 39.00 | 764430.00
openai | o3 | Active | 1h ago | 38.80 | 9690.00
bedrock | claude-haiku-4.5 | Active | 24m ago | 38.70 | 3651200.00
deepinfra | llama-3.1-8b | Stale (Medium) | 1h ago | 35.60 | 278760.00
openai | GPT-5.1-2025-11-13 | Active | 1h ago | 33.90 | 1062840.00
openai | GPT-5.1 | Active | 1h ago | 33.80 | 264940.00
openai | gpt-4-turbo | Active | 7d ago | 33.00 | 149520.00
bedrock | claude-3-5-haiku | Active | 24m ago | 32.50 | 538660.00
bedrock | claude-3-5-sonnet | Active | 2d ago | 31.90 | 146800.00
bedrock | claude-3-7-sonnet | Active | 24m ago | 31.80 | 242800.00
deepinfra | llama-3.2-90b | Stale (Medium) | 1h ago | 31.60 | 482810.00
openai | GPT-5.4 | Active | 1h ago | 30.10 | 945760.00
openai | GPT-5.4-2026-03-05 | Active | 1h ago | 30.10 | 842700.00
openai | GPT-5.2 | Active | 1h ago | 29.70 | 447820.00
openai | GPT-5.2-2025-12-11 | Active | 1h ago | 29.50 | 1643760.00
deepinfra | llama-3-70b | Stale (Medium) | 1h ago | 29.10 | 451570.00
deepinfra | llama-2-70b | Stale (Medium) | 1h ago | 29.00 | 452630.00
deepinfra | qwen-2.5-72b | Stale (Medium) | 1h ago | 28.60 | 1462470.00
openai | GPT-5.1-codex | Active | 1h ago | 28.60 | 1521170.00
openai | GPT-5.1-chat-latest | Active | 1h ago | 28.60 | 352950.00
openai | GPT-5.1-codex-mini | Active | 1h ago | 26.00 | 1511140.00
openai | gpt-4 | Active | 1h ago | 25.80 | 446640.00
openai | GPT-5.3-codex | Active | 1h ago | 25.60 | 740840.00
deepinfra | llama-3.2-11b | Stale (Medium) | 1h ago | 24.60 | 1811470.00
deepinfra | llama-3.1-405b | Stale (Medium) | 1h ago | 21.80 | 1391120.00
anthropic | claude-opus-4.5 | Active | 1h ago | 21.30 | 4311570.00
bedrock | claude-sonnet-4.5 | Active | 24m ago | 20.90 | 1291800.00
deepinfra | llama-3.1-70b | Stale (Medium) | 1h ago | 20.80 | 1421080.00
anthropic | claude-4-sonnet | Active | 1h ago | 19.00 | 7321920.00
deepinfra | llama-3.3-70b | Never Succeeded (Medium) | 1h ago | 18.70 | 1432210.00
bedrock | claude-opus-4.5 | Active | 24m ago | 18.20 | 1272460.00
anthropic | Claude Opus 4.1 | Active | 1h ago | 17.80 | 7251460.00
anthropic | claude-4-opus | Active | 1h ago | 17.10 | 8221330.00
openai | gpt-5.2-codex | Active | 1h ago | 17.10 | 1371430.00
openai | GPT-5.2-chat-latest | Active | 1h ago | 10.70 | 1271530.00
openai | o1-pro | Likely Deprecated (Medium) | 1h ago | 9.88 | 118700.00
openai | GPT-5.2-pro | Active | 1h ago | 9.11 | 4144550.00
openai | GPT-5-codex | Active | 7h ago | 8.13 | 1231940.00
openai | o3-pro | Active | 1h ago | 7.58 | 115370.00
openai | o3-pro-2025-06-10 | Active | 1h ago | 7.42 | 214480.00
deepinfra | qwen-3-235b | Never Succeeded (Medium) | 1h ago | 6.91 | 1536080.00
openai | GPT-5-pro | Active | 1h ago | 4.06 | 180.00
openai | GPT-5.2-pro-2025-12-11 | Active | 7h ago | 1.91 | 148010.00

📈 Time Series 📈