claw-bench

Name: claw-bench
Author: wild-balthazar224

Measure AI agents’ performance with standardized tests across 314 tasks, 33 domains, and 4 difficulty levels for clear, reproducible comparison.

Tags: agent, agentic-evaluation, ai, ai-scientist, ai4science, auto-research, benchmark, benchmarks, claude, clawdbot

By wild-balthazar224, added 3/30/2026