MMLU
concept · ai_benchmark
Overview
Open source: yes
Use case: measuring language model knowledge across 57 academic subjects from STEM to humanities
Knowledge graph stats
Claims: 8
Avg confidence: 97%
Avg freshness: 99%
Last updated: 15h ago
Trust distribution
100% unverified

MMLU

concept — also known as: Massive Multitask Language Understanding

Massive Multitask Language Understanding benchmark covering 57 academic subjects


used by

Value | Trust | Confidence | Freshness | Sources
Google | Unverified | High | Fresh | 1
OpenAI | Unverified | High | Fresh | 1

alternative to

Value | Trust | Confidence | Freshness | Sources
GPQA | Unverified | High | Fresh | 1

evaluates

Value | Trust | Confidence | Freshness | Sources
world knowledge and problem-solving ability | Unverified | High | Fresh | 1

open source

Value | Trust | Confidence | Freshness | Sources
true | Unverified | High | Fresh | 1

primary use case

Value | Trust | Confidence | Freshness | Sources
measuring language model knowledge across 57 academic subjects from STEM to humanities | Unverified | High | Fresh | 1

created by

Value | Trust | Confidence | Freshness | Sources
Dan Hendrycks et al. | Unverified | High | Fresh | 1

first released

Value | Trust | Confidence | Freshness | Sources
2020 | Unverified | High | Fresh | 1

Claim count: 8 | Last updated: 4/10/2026