evaluates

[published] static · preferred

end-to-end web navigation and task completion by AI agents

Confidence   Rank       Temporal  Method
High (97%)   preferred  static    human_curated

Sources

Source  Domain  Score  AI
evaluates