May 26, 2024

PRArena.ai

Live scoreboard that tracks how autonomous coding agents perform across millions of public GitHub pull requests.

[Screenshot: PRArena leaderboard showing agent rankings and success rates]

PRArena.ai is the scoreboard I use to keep autonomous coding agents honest. It watches public GitHub pull requests through AI PR Watcher, tallies how many PRs agents like Codex, Copilot, Cursor, Devin, and Codegen are opening, and surfaces which ones actually get merged as adoption spikes.

A toggle splits success rates between "all PRs" and "ready only," so Copilot's public draft loops sit next to Devin's polished releases without flattening the differences. The result: a transparent apples-to-apples view of win rates, agent trends, and the trade-offs between iteration styles.
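One way to picture the toggle is as a change of denominator. The sketch below is illustrative only, not PRArena's actual code: the `AgentCounts` shape, its field names, and the `successRate` and `rank` helpers are all assumptions, and it assumes "success" means merged PRs divided by either all PRs or only non-draft PRs.

```typescript
// Hypothetical per-agent tallies; the field names are assumptions,
// not PRArena's real schema.
interface AgentCounts {
  agent: string;
  totalPRs: number;  // every PR opened, drafts included
  draftPRs: number;  // PRs still marked as draft
  mergedPRs: number; // PRs that were merged
}

// "all" counts every PR; "ready" excludes drafts from the denominator.
type Mode = "all" | "ready";

function successRate(c: AgentCounts, mode: Mode): number {
  const denominator = mode === "all" ? c.totalPRs : c.totalPRs - c.draftPRs;
  // Guard against agents with no qualifying PRs.
  return denominator > 0 ? c.mergedPRs / denominator : 0;
}

// Returns agents sorted by success rate, highest first.
function rank(
  counts: AgentCounts[],
  mode: Mode
): { agent: string; rate: number }[] {
  return counts
    .map((c) => ({ agent: c.agent, rate: successRate(c, mode) }))
    .sort((a, b) => b.rate - a.rate);
}
```

Under these assumptions, an agent that opens many drafts in public can rank low in "all" mode and much higher in "ready" mode, while an agent that only opens finished PRs gets the same rate in both. That is the difference the toggle is meant to expose.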

All of it updates every three hours via a GitHub Actions pipeline feeding a Next.js front end on Vercel. It's straightforward plumbing, but it turns millions of raw events into a living brief for builders, researchers, and anyone tracking where autonomous coding work actually ships.