Berkeley's AI Agent Benchmark Tests 13 Rivals. Every One Failed.
UC Berkeley's RDI lab just built the largest AI agent benchmark to date, spanning 55 industries. And it exposed something nobody on the pitch-deck circuit wants to admit: the benchmarks we've been using to measure AI agents are broken. Not a little broken. Berkeley audited