
Will o1 score ≥60% on the REBUS benchmark?
Ṁ1905 · Feb 28
89% chance
Update 2024-12-22 (PST): This market refers to the REBUS benchmark as described in the paper "REBUS: A Benchmark to Evaluate the Rationality of Language Models" (AI summary of creator comment)
This question is managed and resolved by Manifold.
@derikk After looking at the examples and not getting any correct, and then seeing 83% as the human baseline, I felt really bad until I read that humans were allowed to Google and use reverse image search.
Related questions
Will any AI score 30% or more on Humanity's Last Exam benchmark before Ramadan 2025?
10% chance
Will an AI score over 80% on FrontierMath Benchmark in 2025?
38% chance
Before what year will AI achieve 95% or higher score on the Humanity’s Last Exam benchmark?
What will be the best normalized score achieved on the original 7 RE-Bench tasks by December 31st 2025?
Will any AI get a score of at least 45% on Humanity’s Last Exam benchmark before March 11, 2025?
12% chance
Will o3's score on the Last Exam be above 30%?
22% chance
Will there be a score of 80% or higher on Humanity's Last Exam before April 1, 2025?
4% chance
What will be o3's score on Humanity's Last Exam?
Will any model get above human level on the Simple Bench benchmark before September 1st, 2025?
68% chance
Will AI achieve 85% or higher on the Humanity's Last Exam benchmark before 2030?
77% chance