cae leaderboard

Public, reproducible benchmark of CLI coding agents on SWE-bench Verified.

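| Agent | Model | Pass rate | # tasks | Skipped | Median cost | Median time | Median tokens (in+out) | Last run |
|---|---|---|---|---|---|---|---|---|
| claude-code | glm-5.1 | 75% | 40 | | $0.59 | 186s | 38287 | 2026-06-08T05:23:02Z |
| codex | MiniMax-M3 | 25% | 40 | | $? | 325s | 752198 | 2026-06-08T05:47:13Z |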
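
For readers who want to reproduce the summary columns, the following is a minimal sketch of how a pass rate and per-task medians could be aggregated from individual task results. The record fields (`resolved`, `cost_usd`, `seconds`, `tokens`), the sample values, and the `summarize` helper are all hypothetical and are not taken from the cae codebase. Rendering an unknown cost as `$?` is likewise an assumption about what that cell means.

```python
from statistics import median

# Hypothetical per-task records; the field names and values are illustrative
# assumptions, not the benchmark's actual output schema or data.
results = [
    {"resolved": True, "cost_usd": 0.40, "seconds": 120, "tokens": 30000},
    {"resolved": True, "cost_usd": 0.60, "seconds": 200, "tokens": 40000},
    {"resolved": False, "cost_usd": 0.80, "seconds": 300, "tokens": 50000},
    {"resolved": True, "cost_usd": None, "seconds": 150, "tokens": 35000},
]


def summarize(results):
    """Collapse per-task records into one leaderboard-style row."""
    n = len(results)
    passed = sum(1 for r in results if r["resolved"])
    # Assumption: tasks without a reported cost are left out of the median.
    costs = [r["cost_usd"] for r in results if r["cost_usd"] is not None]
    return {
        "pass_rate": f"{round(100 * passed / n)}%",
        "tasks": n,
        "median_cost": f"${median(costs):.2f}" if costs else "$?",
        "median_time": f"{round(median(r['seconds'] for r in results))}s",
        "median_tokens": round(median(r["tokens"] for r in results)),
    }


summary = summarize(results)
print(summary)
```

Medians are less sensitive than means to a single very long or very expensive task, which is one plausible reason a leaderboard would report them.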