Will o1 (not preview) achieve a better score on LiveBench coding than Claude 3.5 Sonnet 10/22?
Ṁ75 Jan 1
75% chance
Per LiveBench.ai, Claude 3.5 Sonnet 10/22 scores 67.13 on coding, while o1-preview scores only 50.85.
Resolves when o1 is added to the LiveBench leaderboard.
This question is managed and resolved by Manifold.
Related questions
What will Claude 3.5 Opus's reported 0-shot performance on GPQA Diamond be upon release?
Before February 2025, will a Gemini model exceed Claude 3.5 Sonnet 10/22's Global Average score on LiveBench?
55% chance
What SimpleBench percentile range will full o1 achieve?
Will Claude 3.5 Opus beat OpenAI's best released model on the arena.lmsys.org leaderboard?
32% chance
Will Claude 3.5 Opus be able to draw me in tic-tac-toe while playing as O at least 1/3 of the time?
68% chance
Before February 2025, will a Gemini model exceed Claude 3.5 Sonnet 10/22's Global Average score on Simple Bench?
55% chance
Will I judge GPT-5 to be smarter than o1 (not preview) after both are released?
77% chance
Will GPT-5 perform better than o1 (not preview) at AIME 2024, Codeforces elo, GPQA, or the 2024 ioi?
66% chance
Is Claude 3.5 Sonnet a distilled or quantized version of a larger model?
43% chance
Will there be a reasoning model more powerful than o1-preview, and cheaper and >10x faster than o1-mini, by Nov 12 2025?
74% chance