v2.4.1 — data refreshed hourly
Models Tracked: 247 (+12 this week)
Total Evaluations: 4.8M (+3.2%)
Avg MMLU-Pro: 72.4 (+1.8 pts MoM)
Active Users: 38.2K (+7.1%)
LLM Leaderboard — May 2026
weighted: MMLU-Pro 25% · HumanEval 20% · MATH 15% · GPQA 15% · IFEval 10% · Arena ELO 15%
# | Model | Provider | MMLU-Pro | HumanEval | MATH | GPQA | Arena ELO | Overall
1 | Claude Opus 4.7 | Anthropic | 89.2 | 95.1 | 93.7 | 71.4 | 1328 | 90.6 (NEW)
2 | GPT-4o (2026-04) | OpenAI | 88.7 | 94.3 | 92.1 | 69.8 | 1314 | 89.3 (+1.2)
3 | Gemini 2.5 Pro | Google DeepMind | 87.9 | 93.0 | 90.4 | 68.3 | 1302 | 87.8
4 | DeepSeek V3.1 | DeepSeek | 85.2 | 91.7 | 89.3 | 64.9 | 1287 | 85.6 (+0.8)
5 | Qwen 3-Max | Alibaba | 83.8 | 89.5 | 87.9 | 63.1 | 1268 | 83.5
6 | Llama 4 Maverick | Meta AI | 81.9 | 88.2 | 84.8 | 60.7 | 1244 | 81.3
7 | Grok 3 | xAI | 80.6 | 87.3 | 83.5 | 59.4 | 1228 | 79.8
8 | Mistral Large 3 | Mistral AI | 78.4 | 85.9 | 81.2 | 57.8 | 1206 | 77.6
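The weighting listed under the leaderboard heading can be sketched as a weighted mean. This is only an illustration under stated assumptions: the page does not show an IFEval column or say how Arena ELO is mapped onto a 0-100 scale, so the `ELO_MIN`/`ELO_MAX` bounds and the renormalisation over missing metrics are hypothetical choices, and the result is not expected to reproduce the published Overall column.

```python
# Weights as listed in the leaderboard subtitle (they sum to 1.0).
WEIGHTS = {
    "MMLU-Pro": 0.25,
    "HumanEval": 0.20,
    "MATH": 0.15,
    "GPQA": 0.15,
    "IFEval": 0.10,
    "Arena ELO": 0.15,
}

# ASSUMPTION: hypothetical bounds for rescaling Arena ELO to 0-100.
# The dashboard does not document its actual normalisation.
ELO_MIN, ELO_MAX = 1000.0, 1400.0


def normalize_elo(elo: float) -> float:
    """Clip an ELO rating to the assumed bounds and rescale it to 0-100."""
    clipped = min(max(elo, ELO_MIN), ELO_MAX)
    return 100.0 * (clipped - ELO_MIN) / (ELO_MAX - ELO_MIN)


def composite(scores: dict[str, float]) -> float:
    """Weighted mean over the metrics present in `scores`.

    ASSUMPTION: metrics missing from `scores` (for example IFEval, which
    the table does not show) are skipped and the remaining weights are
    renormalised so they still sum to 1.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for name, weight in WEIGHTS.items():
        if name not in scores:
            continue
        value = scores[name]
        if name == "Arena ELO":
            value = normalize_elo(value)
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no known metrics in scores")
    return weighted_sum / weight_total


# Row 1 of the leaderboard, without the unpublished IFEval score.
claude = {
    "MMLU-Pro": 89.2,
    "HumanEval": 95.1,
    "MATH": 93.7,
    "GPQA": 71.4,
    "Arena ELO": 1328,
}
print(round(composite(claude), 1))
```

Renormalising over the available metrics keeps the score on a 0-100 scale when a benchmark is missing, at the cost of making rows with different metric coverage less directly comparable.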
AI Company Rankings
by research output + product influence
# | Company | Papers (12mo) | Flagship Models | Score
1 | Google DeepMind | 184 | Gemini, Gemma, AlphaFold | 94.2
2 | OpenAI | 37 | GPT-4o, o3, Sora | 92.8
3 | Anthropic | 56 | Claude Opus, Sonnet, Haiku | 89.5
4 | Meta AI | 122 | Llama 4, SAM, Code Llama | 85.1
5 | DeepSeek | 28 | DeepSeek V3, R1, Coder | 82.7
6 | Alibaba | 91 | Qwen 3, Qwen-VL, Tongyi | 79.3
7 | Mistral AI | 31 | Mistral Large, Codestral | 74.6
Trending AI Repos — This Week
GitHub ★
# | Repository | Description | Stars | Δ
1 | deepseek-ai/DeepSeek-V3 | Official impl + weights | 78.4K | +12.3K HOT
2 | anthropics/claude-code | CLI agentic coding tool | 64.1K | +9.8K HOT
3 | langchain-ai/langgraph | Agent orchestration framework | 38.2K | +4.1K
4 | QuivrHQ/quivr | OSS RAG second brain | 41.7K | +3.6K
5 | microsoft/autogen | Multi-agent conversation framework | 45.0K | +3.2K
6 | THUDM/ChatGLM-5 | Bilingual open LLM | 52.3K | +2.9K
7 | openai/whisper | Robust speech recognition | 77.9K | +2.4K TOP
AI Dev Tools — Top Rated
community votes
# | Tool | Developer | Category | Rating
1 | Claude Code | Anthropic | Agentic IDE | 9.6
2 | Cursor | Anysphere | AI Editor | 9.4
3 | GitHub Copilot | Microsoft | Code Completion | 9.1
4 | v0 by Vercel | Vercel | UI Generation | 8.9
5 | Aider | Paul Gauthier | CLI Pair Programmer | 8.7
6 | Continue | Continue Dev | IDE Extension | 8.5
7 | Windsurf | Codeium | AI Editor | 8.4
Benchmark Progress — Last 18 Months
MMLU-Pro top score trend
Date | Top Model | MMLU-Pro | HumanEval | GPQA
May 2026 | Claude Opus 4.7 | 89.2 | 95.1 | 71.4
Mar 2026 | GPT-4o (2026-01) | 87.3 | 93.5 | 68.9
Jan 2026 | Claude Opus 4.5 | 85.8 | 92.7 | 66.3
Nov 2025 | Gemini 2.0 Pro | 83.1 | 90.8 | 63.7
Sep 2025 | GPT-4o (2025-08) | 81.2 | 89.4 | 61.0
Jul 2025 | Claude Sonnet 3.5 | 78.6 | 87.9 | 58.2
May 2025 | Llama 3.1 405B | 75.3 | 84.6 | 54.8