What will be true of the first model to cross 1400 on lmarena.ai?

Ṁ3824

Apr 1

1.9%

Gemini Exp

ChatGPT 4o

1.3%

1.9%

Gemini 2.0

1.5%

Claude 3.5 Opus

1.7%

Claude 4

99%

Grok

1.8%

OpenAI model code named Orion

GPT 5

Will resolve if a model stays at or above 1400 for a week and has a 95% CI with a lower bound of at least 1395 at the end of that week (somewhat arbitrary criteria to ensure the score is based on a sufficient number of votes).

Will N/A if they change the scoring significantly so that a current model passes 1400.
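
The resolution rule above can be sketched as a small check. This is only an illustration under stated assumptions, not how the market is actually resolved: it assumes one leaderboard snapshot per day, reads "a week" as 7 consecutive daily snapshots, and accepts any 7-day window whose last day has a CI lower bound of at least 1395. The `Snapshot` type, the function name, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional


@dataclass
class Snapshot:
    """One hypothetical daily leaderboard reading for a single model."""
    day: date
    score: float      # arena score on that day
    ci_lower: float   # lower bound of the 95% CI on that day


def first_qualifying_day(
    snapshots: Iterable[Snapshot],
    threshold: float = 1400.0,
    ci_floor: float = 1395.0,
    days: int = 7,
) -> Optional[date]:
    """Return the first day that ends a qualifying week, or None.

    Assumption: a missing day breaks the streak, since the score
    cannot be confirmed for that day.
    """
    streak = 0
    prev_day = None
    for snap in sorted(snapshots, key=lambda s: s.day):
        consecutive = prev_day is not None and snap.day - prev_day == timedelta(days=1)
        if snap.score >= threshold:
            # Extend the streak only if yesterday was also at or above the threshold.
            streak = streak + 1 if (consecutive and streak > 0) else 1
        else:
            streak = 0
        prev_day = snap.day
        # The CI condition is checked on the last day of the window.
        if streak >= days and snap.ci_lower >= ci_floor:
            return snap.day
    return None


# Made-up readings: a model sitting at 1402 with CI lower bound 1396 for 7 days.
sample = [
    Snapshot(date(2025, 2, 16) + timedelta(days=i), 1402.0, 1396.0)
    for i in range(7)
]
print(first_qualifying_day(sample))  # 2025-02-22
```

Under this reading, a model that first reaches 1400 on day 1 can qualify on day 7 at the earliest, and a single day below 1400 restarts the count.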

Current rankings (11/22/24):

Gemini Exp 1121: 1365
ChatGPT 4o Latest (2024-11-20): 1360
Gemini Exp 1114: 1343
o1 preview: 1334
o1 mini: 1308
Gemini 1.5 Pro-002: 1301
Grok 2 0813: 1289
Yi Lightning: 1287
GPT 4o 2024-05-13: 1285
Claude 3.5 Sonnet (20241022): 1282

Update 2025-01-24 (PST): If a DeepSeek model is first to cross 1400, all will resolve to NO (AI summary of creator comment)

This question is managed and resolved by Manifold.

#Technology

#AI

#OpenAI

#Technical AI Timelines

#LLMs

3 Comments

15 Holders

111 Trades

Resolution criteria

"stays at or above 1400 for a week and has a 95% CI with a lower bound of at least 1395 at the end of that week"

early Grok 3 is over 1400 as of 02-16, so will need to maintain that rating until 02-22 to resolve to yes

bought Ṁ50 NO

If a DeepSeek model is first to cross 1400, all of these will resolve NO

@ChinmayTheMathGuy also o3
