What will be true of the first model to cross 1400 on lmarena.ai?
16
Ṁ3824
Apr 1
Gemini Exp: 1.9%
ChatGPT 4o: 2%
o1: 1.3%
Gemini 2.0: 1.9%
Claude 3.5 Opus: 1.5%
Claude 4: 1.7%
Grok: 99%
OpenAI model code named Orion: 1.8%
GPT 5: 1%

Will resolve if a model stays at or above 1400 for a week and has a 95% CI with a lower bound of at least 1395 at the end of that week (somewhat arbitrary criteria to ensure the score is based on a sufficient number of votes).

Will resolve N/A if the scoring changes significantly enough that a current model passes 1400.
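
The resolution rule above can be sketched as a small check over a series of leaderboard snapshots. This is only an illustration of one reading of the criteria, not the creator's actual procedure: the `Snapshot` type, the `first_qualifying_day` function, and the assumption of dated snapshots with a score and a CI lower bound are all hypothetical, and the sketch assumes that any snapshot below 1400 restarts the one-week clock.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional


@dataclass
class Snapshot:
    """One hypothetical leaderboard reading for a single model."""
    day: date
    score: float      # arena score on that day
    ci_lower: float   # lower bound of the 95% CI on that day


def first_qualifying_day(
    snapshots: Iterable[Snapshot],
    threshold: float = 1400.0,
    ci_floor: float = 1395.0,
    hold: timedelta = timedelta(days=7),
) -> Optional[date]:
    """Return the first day the model meets the rule, or None.

    Assumed reading: the score must stay at or above `threshold` for at
    least `hold`, and the CI lower bound must be at least `ci_floor` on a
    day that is at least `hold` after the streak began.
    """
    streak_start: Optional[date] = None
    for snap in sorted(snapshots, key=lambda s: s.day):
        if snap.score < threshold:
            # Any dip below the threshold restarts the clock (assumption).
            streak_start = None
            continue
        if streak_start is None:
            streak_start = snap.day
        if snap.day - streak_start >= hold and snap.ci_lower >= ci_floor:
            return snap.day
    return None
```

With made-up data, a model that first reads 1402 on some day and holds that score with a CI lower bound of 1396 would qualify seven days later; a single reading of 1399 in between would restart the count.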

Current rankings (11/22/24):

  1. Gemini Exp 1121: 1365

  2. ChatGPT 4o Latest (2024-11-20): 1360

  3. Gemini Exp 1114: 1343

  4. o1 preview: 1334

  5. o1 mini: 1308

  6. Gemini 1.5 Pro-002: 1301

  7. Grok 2 0813: 1289

  8. Yi Lightning: 1287

  9. GPT 4o 2024-05-13: 1285

  10. Claude 3.5 Sonnet (20241022): 1282

  • Update 2025-01-24 (PST): If a Deepseek model is first to cross 1400, all answers will resolve NO (AI summary of creator comment)


Resolution criteria

"stays at or above 1400 for a week and has a 95% CI with a lower bound of at least 1395 at the end of that week"

early Grok 3 is over 1400 as of 02-16, so will need to maintain that rating until 02-22 to resolve to yes

bought Ṁ50 NO

If a Deepseek model is first to cross 1400, all of these will resolve NO
