Human-machine intelligence parity achieved before 2028
47% chance

(Metaculites created this question with resolution date 2040.)

"A team of three expert interviewers will interact with a candidate machine system (MS) and three humans (3H). The humans will be graduate students in each of physics, mathematics and computer science from one of the top 25 research universities (per some recognized list), chosen independently of the interviewers. The interviewers will electronically communicate (via text, image, spoken word, or other means) an identical series of exam questions of their choosing over a period of two hours to the MS and 3H, designed to advantage the 3H. Both MS and 3H have full access to the internet, but no party is allowed to consult additional humans, and we assume the MS is not an internet-accessible resource. The exam will be scored blindly by a disinterested third party." The experts may come up with new questions to ask while administering the test.

If such a test is passed before 2028, then this resolves Yes. If such a test is conducted on a state-of-the-art AI in 2027, and the AI fails, then this resolves No. If neither criterion is met, then this will resolve to my credence that such a test could be passed by an existing AI system. I will not participate in this market.
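For illustration only, the resolution rules above can be read as a simple decision procedure. This is a minimal sketch; the function and parameter names are hypothetical and not part of any Manifold API:

```python
def resolve_market(passed_before_2028: bool,
                   sota_ai_failed_in_2027: bool,
                   creator_credence: float):
    """Illustrative sketch of the stated resolution rules.

    passed_before_2028     -- a qualifying test was passed before 2028
    sota_ai_failed_in_2027 -- a state-of-the-art AI took the test in 2027 and failed
    creator_credence       -- creator's probability that an existing AI could pass
    """
    if passed_before_2028:
        return "YES"
    if sota_ai_failed_in_2027:
        return "NO"
    # Neither criterion met: resolve to the creator's stated credence (a PROB resolution).
    return creator_credence
```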

EDIT: No other AI systems should be consulted. Systems which use an AI instrumentally (e.g. in Google search results) are OK, but the test administrator should do their best to redact direct AI content, e.g. the AI-generated Q&A panels at the top of certain Google queries.


As stated, it's left open whether AIs are allowed to be consulted by both sides. If they were, this would end up being a question about the gap between the best and second-best available AI systems at the time of testing.

I propose adding a clause that no other AI systems should be consulted. Systems which use an AI instrumentally (e.g. in Google search results) are OK, but the test administrator should do their best to redact direct AI content, e.g. the AI-generated Q&A panels at the top of certain Google queries.

If no one objects within a week, I will add this to the question text. I'm very open to debate here, since I think this is a significant ambiguity in the resolution criterion as stated.

@JacobPfau I made this change.

I've added the text "The experts may come up with new questions to ask while administering the test" to clarify.

This tests only a very limited area of intelligence, one which favours AI. Add an additional test in which the humans and the AI have to navigate autonomously from the other side of the city to the examination site, for a fair fight.

predicted YES

@Toby96 This is simply a clone of a pre-existing Metaculus-bot question/market. The market was resolved N/A on Manifold because it was deleted or otherwise unfindable on Metaculus.

You're right that it's a fairly narrow area of intelligence, but even if it were expanded, I don't think your suggestion is a good one, since it is more a test of robotics/sensory technology advancements than of AI advancements. Other expansions (ideally in a different market) could include performing economic tasks, broadening the relevant knowledge domain to a larger slice of STEM or to fields outside STEM, or other things that don't require robotics or other non-AI technical advances.

This is beyond parity. I'm not a graduate student from a top 25 research university, and most people aren't. Plus the questions will be designed to advantage the humans.
