LMArena
ai_benchmark
Overview
Developed by: LMSYS Org
Open source: Yes
Use case: crowdsourced LLM evaluation through blind pairwise human preference voting
Knowledge graph stats
Claims: 8
Avg confidence: 97%
Avg freshness: 99%
Last updated: 18h ago
Trust distribution
100% unverified
Governance
Not assessed
LMArena
product — also known as: LM Arena
Platform, formerly known as Chatbot Arena, for crowdsourced LLM evaluation through blind pairwise comparisons
used by
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| Anthropic | ○Unverified | High | Fresh | 1 |
alternative to
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| LMSYS Chatbot Arena | ○Unverified | High | Fresh | 1 |
evaluates
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| overall LLM quality via Elo ratings from human preference | ○Unverified | High | Fresh | 1 |
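The claim above mentions Elo ratings derived from human preference votes. As a rough illustration only, the sketch below applies the textbook online Elo update to a handful of pairwise votes. The K-factor of 32, the starting rating of 1000, the model names, and the vote data are all assumptions made for this example; the page does not describe the platform's actual rating procedure, which may differ.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(ratings: dict, model_a: str, model_b: str, outcome: float,
               k: float = 32.0, initial: float = 1000.0) -> dict:
    """Apply one vote in place.

    outcome is 1.0 if model_a was preferred, 0.0 if model_b was preferred,
    and 0.5 for a tie. k and initial are assumed values, not platform settings.
    """
    r_a = ratings.get(model_a, initial)
    r_b = ratings.get(model_b, initial)
    e_a = expected_score(r_a, r_b)
    # The two adjustments are equal and opposite, so total rating is conserved.
    ratings[model_a] = r_a + k * (outcome - e_a)
    ratings[model_b] = r_b - k * (outcome - e_a)
    return ratings


# Hypothetical votes: (model shown as A, model shown as B, outcome for A).
votes = [
    ("model-x", "model-y", 1.0),
    ("model-x", "model-y", 0.5),
    ("model-y", "model-x", 1.0),
]

ratings = {}
for a, b, outcome in votes:
    update_elo(ratings, a, b, outcome)

# Print models from highest to lowest rating.
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.1f}")
```

Online Elo depends on the order in which votes arrive, so a leaderboard built this way can shift if the same votes are replayed in a different sequence.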
open source
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| true | ○Unverified | High | Fresh | 1 |
primary use case
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| crowdsourced LLM evaluation through blind pairwise human preference voting | ○Unverified | High | Fresh | 1 |
first released
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| 2023 | ○Unverified | High | Fresh | 1 |
developed by
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| LMSYS Org | ○Unverified | High | Fresh | 1 |
created by
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| UC Berkeley | ○Unverified | High | Fresh | 1 |