claw-bench

Measure AI agents’ performance with standardized tests across 314 tasks, 33 domains, and 4 difficulty levels for clear, reproducible comparison.

agent, agentic-evaluation, ai, ai-scientist, ai4science, auto-research, benchmark, benchmarks, claude, clawdbot
By wild-balthazar224 · Added 3/30/2026