GSM8K
ai_benchmark
Overview
Open source: Yes
Use case: evaluating multi-step mathematical reasoning with grade-school-level word problems
Knowledge graph stats
Claims: 7
Avg confidence: 97%
Avg freshness: 99%
Last updated: yesterday
Trust distribution
100% unverified
Governance
Not assessed
GSM8K
concept
Grade School Math 8K: a benchmark of about 8,500 linguistically diverse grade-school math word problems
used by
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| OpenAI | ○Unverified | High | Fresh | 1 |
alternative to
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| MATH | ○Unverified | High | Fresh | 1 |
evaluates
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| basic arithmetic and multi-step reasoning | ○Unverified | High | Fresh | 1 |
open source
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| true | ○Unverified | High | Fresh | 1 |
primary use case
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| evaluating multi-step mathematical reasoning with grade-school level word problems | ○Unverified | High | Fresh | 1 |
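
To make the use case above concrete, here is a minimal sketch of how a model might be scored on GSM8K-style problems. It relies on the dataset's convention that each reference solution ends with a line of the form `#### <number>`. It also assumes the model is prompted to finish its output with the same marker, which is a choice made for this example, not something the benchmark requires. The function names and the sample problem text are hypothetical and are not taken from the dataset.

```python
import re
from typing import Optional, Sequence

# Matches the final-answer marker, e.g. "#### 18" or "#### 1,234".
ANSWER_RE = re.compile(r"####\s*(-?[\d,]*\.?\d+)")


def extract_final_answer(solution: str) -> Optional[str]:
    """Return the number after the '####' marker with commas removed, or None."""
    match = ANSWER_RE.search(solution)
    if match is None:
        return None
    return match.group(1).replace(",", "")


def exact_match_accuracy(
    model_outputs: Sequence[str], reference_solutions: Sequence[str]
) -> float:
    """Fraction of problems where the model's final number equals the reference."""
    if len(model_outputs) != len(reference_solutions):
        raise ValueError("outputs and references must have the same length")
    if not reference_solutions:
        return 0.0
    correct = 0
    for output, reference in zip(model_outputs, reference_solutions):
        predicted = extract_final_answer(output)
        gold = extract_final_answer(reference)
        # Compare numerically so that "12" and "12.0" count as the same answer.
        if predicted is not None and gold is not None and float(predicted) == float(gold):
            correct += 1
    return correct / len(reference_solutions)


# Illustrative (made-up) solutions in the GSM8K answer format.
references = [
    "3 boxes with 4 pencils each is 3 * 4 = 12 pencils.\n#### 12",
    "She pays 2 * 3 = 6 dollars.\n#### 6",
]
outputs = [
    "There are 3 * 4 = 12 pencils.\n#### 12",
    "She pays 2 + 3 = 5 dollars.\n#### 5",
]
print(exact_match_accuracy(outputs, references))  # prints 0.5
```

Only the final number is compared, so a model that reaches the right answer through flawed intermediate steps still scores as correct. Checking the reasoning itself would need a different metric.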
first released
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| 2021 | ○Unverified | High | Fresh | 1 |
created by
| Value | Trust | Confidence | Freshness | Sources |
|---|---|---|---|---|
| OpenAI | ○Unverified | High | Fresh | 1 |