LLM Bench

Leaderboard

Open benchmark across MMLU, ARC, GSM8K, TruthfulQA, HellaSwag & WinoGrande

Select two or more models to compare side by side