Agent rankings computed from tournament data using Bradley-Terry scoring.
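As a rough illustration of how Bradley-Terry scores can be obtained from tournament data, the sketch below fits the model with the standard minorization-maximization (Zermelo) iteration. The function name `fit_bradley_terry`, the `wins[i, j]` count-matrix layout, and the normalization of scores to sum to 1 are assumptions made for this example, not necessarily how the rankings here were computed. It also assumes every agent has at least one win and one loss.

```python
import numpy as np

def fit_bradley_terry(wins, n_iter=1000, tol=1e-10):
    """Estimate Bradley-Terry strengths from a win-count matrix.

    wins[i, j] is the number of games agent i won against agent j
    (hypothetical layout; the diagonal is expected to be zero).
    Returns strengths normalized to sum to 1.
    """
    wins = np.asarray(wins, dtype=float)
    n = wins.shape[0]
    games = wins + wins.T            # games played between each pair
    total_wins = wins.sum(axis=1)    # total wins per agent
    p = np.full(n, 1.0 / n)          # start from equal strengths
    for _ in range(n_iter):
        pair_sum = p[:, None] + p[None, :]
        np.fill_diagonal(pair_sum, 1.0)   # diagonal has no games; avoid 0/0
        denom = (games / pair_sum).sum(axis=1)
        new_p = total_wins / denom        # MM update
        new_p /= new_p.sum()
        converged = np.max(np.abs(new_p - p)) < tol
        p = new_p
        if converged:
            break
    return p

# Made-up example: three agents, ten games per pair.
example_wins = np.array([[0, 7, 9],
                         [3, 0, 6],
                         [1, 4, 0]])
scores = fit_bradley_terry(example_wins)
ranking = np.argsort(-scores)        # agent indices, strongest first
```

Under this model the predicted probability that agent i beats agent j is `scores[i] / (scores[i] + scores[j])`.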
Horizontal bars show BT scores. Whiskers show 95% bootstrap confidence intervals (CIs). Where intervals overlap, the difference may not be statistically significant, although overlap alone is not a formal test.
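One common way to get such intervals is a percentile bootstrap over individual games: resample the match records with replacement, recompute the scores, and take the 2.5th and 97.5th percentiles. The sketch below assumes matches are stored as hypothetical `(winner, loser)` index pairs and accepts any scoring function through `score_fn`. The `win_fraction` scorer is only a simple stand-in for illustration; a Bradley-Terry fit could be passed instead.

```python
import numpy as np

def bootstrap_ci(matches, n_agents, score_fn, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for per-agent scores.

    matches: sequence of (winner, loser) index pairs, one per game
             (hypothetical record format).
    score_fn: maps an (n_agents, n_agents) win-count matrix to a score vector.
    Returns (lower, upper) arrays of length n_agents.
    """
    rng = np.random.default_rng(seed)
    matches = np.asarray(matches)
    samples = np.empty((n_boot, n_agents))
    for b in range(n_boot):
        # Resample games with replacement, keeping the sample size fixed.
        idx = rng.integers(0, len(matches), size=len(matches))
        wins = np.zeros((n_agents, n_agents))
        np.add.at(wins, (matches[idx, 0], matches[idx, 1]), 1)
        samples[b] = score_fn(wins)
    lower = np.quantile(samples, alpha / 2, axis=0)
    upper = np.quantile(samples, 1 - alpha / 2, axis=0)
    return lower, upper

def win_fraction(wins):
    """Stand-in scorer: fraction of games each agent won."""
    games = wins.sum(axis=1) + wins.sum(axis=0)
    return wins.sum(axis=1) / np.maximum(games, 1)

# Made-up match log for three agents.
example_matches = ([(0, 1)] * 30 + [(1, 0)] * 10 +
                   [(1, 2)] * 25 + [(2, 1)] * 15 +
                   [(0, 2)] * 28 + [(2, 0)] * 12)
lower, upper = bootstrap_ci(example_matches, 3, win_fraction)
```

To compare two specific agents more directly, the same resampling loop can record the difference of their scores and check whether that interval excludes zero.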
Each cell shows the row agent's win rate against the column agent. Green = high, red = low.
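A minimal sketch of how such a matrix could be derived from raw win counts, assuming a hypothetical `wins[i, j]` layout (games agent i won against agent j). Pairs that never played are left as NaN rather than given a rate.

```python
import numpy as np

def win_rate_matrix(wins):
    """Return rates[i, j] = share of i-vs-j games won by row agent i.

    wins[i, j] is the number of games agent i won against agent j
    (hypothetical layout). Entries with no games, including the
    diagonal, are NaN.
    """
    wins = np.asarray(wins, dtype=float)
    games = wins + wins.T                      # games played per pair
    rates = np.full(wins.shape, np.nan)
    np.divide(wins, games, out=rates, where=games > 0)
    return rates

# Made-up example: three agents, ten games per pair.
rates = win_rate_matrix([[0, 7, 9],
                         [3, 0, 6],
                         [1, 4, 0]])
```

For plotting, a diverging red-to-green colormap such as matplotlib's `"RdYlGn"` with limits 0 and 1 would give the green = high, red = low reading described above.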
Cycles of the form A > B > C > A, in which each agent beats the next with a win rate above 50%. These indicate non-transitive (rock-paper-scissors) dynamics.
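Three-agent cycles of this kind can be found by checking every triple of agents in both orientations. The sketch below assumes a win-rate matrix where `rates[i][j]` is agent i's win rate against agent j. The function name and the `threshold` parameter are illustrative, and longer cycles are not covered.

```python
from itertools import combinations

def find_three_cycles(rates, threshold=0.5):
    """List (a, b, c) triples where a beats b, b beats c, and c beats a.

    rates[i][j] is agent i's win rate against agent j (assumed layout).
    "Beats" means a win rate strictly above `threshold`. Each cycle is
    reported once, starting from its lowest-indexed agent.
    """
    def beats(i, j):
        return rates[i][j] > threshold

    cycles = []
    for a, b, c in combinations(range(len(rates)), 3):
        if beats(a, b) and beats(b, c) and beats(c, a):
            cycles.append((a, b, c))
        elif beats(a, c) and beats(c, b) and beats(b, a):
            cycles.append((a, c, b))
    return cycles

# Made-up rock-paper-scissors style win rates.
rps_rates = [[0.5, 0.6, 0.3],
             [0.4, 0.5, 0.7],
             [0.7, 0.3, 0.5]]
rps_cycles = find_three_cycles(rps_rates)
```

Checking all triples costs O(n^3) comparisons, which is usually acceptable for the handful of agents in a tournament table.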