Will o1 (not preview) achieve a better score on LiveBench coding than Claude 3.5 Sonnet 10/22? | Manifold

Jan 1

75% chance

Per LiveBench.ai, Claude 3.5 Sonnet scores 67.13 in the coding category, while o1-preview scores only 50.85.

Resolves when o1 is added to the LiveBench leaderboard.

This question is managed and resolved by Manifold.

#Chatbot Arena Leaderboard

Related questions

What will Claude 3.5 Opus's reported 0-shot performance on GPQA Diamond be upon release?

What SimpleBench percentile range will full o1 achieve?

Will Claude 3.5 Opus be able to draw me in tic-tac-toe while playing as O at least 1/3 of the time?

Will I judge GPT-5 to be smarter than o1 (not preview) after both are released?

Is Claude 3.5 Sonnet a distilled or quantized version of a larger model?

Before February 2025, will a Gemini model exceed Claude 3.5 Sonnet 10/22's Global Average score on LiveBench?

Will Claude 3.5 Opus beat OpenAI's best released model on the arena.lmsys.org leaderboard?

Before February 2025, will a Gemini model exceed Claude 3.5 Sonnet 10/22's Global Average score on Simple Bench?

Will GPT-5 perform better than o1 (not preview) at AIME 2024, Codeforces elo, GPQA, or the 2024 ioi?

Will there be a reasoning model more powerful than o1-preview, and cheaper and >10x faster than o1-mini, by Nov 12 2025?
