Benchmark Gap #6: Once we have a transfer model that achieves human-level sample efficiency on many major RL environments, how many months will it be before we have a non-transfer model that achieves the same? | Manifold

2050

12 expected

Transfer model criteria:

The model can include pretrained non-RL components, e.g. a language or image model. Effort should have been made to avoid including states from the RL environments in the training set for any pretrained components, but this doesn't have to be perfect.
The model can train for any amount of time on the training set of RL environments.
Once transferred, it must achieve mean human performance with human-level sample efficiency on >=75% of the test environments.

Non-transfer model:

Can include pretrained components in the same way.
Must achieve mean human performance with human-level sample efficiency on >=75% of all the environments (there is no split into training and test environments).
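
The 75% threshold in both sets of criteria can be sketched as a small check. This is only an illustration, not how Manifold resolves the question: the `EnvResult` type, the `meets_criterion` function, and the choice to measure "human-level sample efficiency" as the agent's score after using no more environment samples than a human gets are all assumptions made here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EnvResult:
    """Hypothetical per-environment record (names are illustrative)."""
    name: str
    # Assumed measure: agent score after at most the human sample budget.
    agent_score: float
    # Mean human score on the same environment.
    mean_human_score: float


def passes_env(result: EnvResult) -> bool:
    # An environment counts as passed if the agent matches or beats
    # mean human performance within the human sample budget.
    return result.agent_score >= result.mean_human_score


def meets_criterion(results: List[EnvResult], required_fraction: float = 0.75) -> bool:
    # True if the agent passes at least `required_fraction` of the
    # environments. An empty list is treated as not meeting the criterion.
    if not results:
        return False
    passed = sum(passes_env(r) for r in results)
    return passed / len(results) >= required_fraction
```

Under these assumptions, a transfer model would be checked with `results` drawn from the test environments only, while a non-transfer model would be checked against every environment in the benchmark.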

This question is managed and resolved by Manifold.

#Technical AI Timelines

Related questions

Will there be an AI language model that strongly surpasses ChatGPT and other OpenAI models at the end of 2025?

Will any transfer learning model, trained for any amount of time on one Atari environment, outperform the median human learning curve on most other Atari environments when transferred by 2026?

By 2026 will any RL agent with learned causal models of its environment achieve superhuman performance on >=10 Atari environments?

Will any AI model score above 95% on GRAB by the end of 2025?

Benchmark Gap #5: Once a single AI model solves >= 95% of miniF2F, MATH, and MMLU STEM, will it be less than two years before AI models are used as entry-level data science / data analysis / statistics workers?

Will OpenAI models achieve ≥90% on SimpleBench by the end of 2025?

Benchmark Gap #2: Once we have an algorithm with human level sample efficiency for major RL benchmarks, how many years will it be before there is an algorithm with human level sample efficiency on essentially all AAA video game tasks?

Benchmark Gap #1: Once we have a language model that achieves expert human performance on all *current* major NLP benchmarks, how many years will it be before we have an AI with human-level language skills?

Will a single model achieve superhuman performance on all OpenAI gym environments by 2025?

Benchmark Gap #3: Once a model achieves superhuman performance on a competitive programming benchmark, will it be less than 2 years before there are "entry level" AI programmers in industry use?
