A statistical framework is proposed for ranking LLM-based chatbots. The framework improves the handling of ties in pairwise comparisons. It models the covariance between competitors, giving deeper insight into their relative performance. The framework demonstrates substantial improvements in modeling pairwise comparison data.
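
To make the role of ties concrete, the following minimal sketch computes win, loss, and tie probabilities for one pairwise comparison. The text above does not say which tie model the framework uses, so the sketch assumes Davidson's extension of the Bradley-Terry model purely as an illustration. The function name `davidson_probs`, the log-scale scores, and the parameter `tie_param` are hypothetical and are not taken from the framework itself.

```python
import math


def davidson_probs(score_i, score_j, tie_param):
    """Return (P(i wins), P(j wins), P(tie)) for one comparison.

    Assumption: Davidson's tie model, used here only as an illustration.
    score_i, score_j are log-strengths; tie_param >= 0 controls how often
    ties occur (tie_param = 0 reduces to plain Bradley-Terry).
    """
    pi_i = math.exp(score_i)
    pi_j = math.exp(score_j)
    # Tie mass is proportional to the geometric mean of the two strengths.
    tie_mass = tie_param * math.sqrt(pi_i * pi_j)
    total = pi_i + pi_j + tie_mass
    return pi_i / total, pi_j / total, tie_mass / total


# Hypothetical scores for two chatbots; the values are made up for the example.
p_win, p_loss, p_tie = davidson_probs(0.5, 0.0, tie_param=0.4)
print(f"win={p_win:.3f} loss={p_loss:.3f} tie={p_tie:.3f}")
```

In this sketch a tie is most likely when the two scores are close, which is the behavior a tie-aware ranking model generally needs. Modeling covariance between competitors is a separate component and is not covered here.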