Will any open-source model achieve GPT-4 level performance on MMLU through 2024?
83% chance · 23 traders · Ṁ1795 · closes Jan 1

GPT-4 currently leads the MMLU (Massive Multitask Language Understanding) benchmark [1] at 86.4% [2]. Will any open-source language model achieve at least 86.4% on the MMLU average?

A leaderboard of open-source models can be found here.
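
For reference, the headline MMLU score is typically an average of accuracy over the benchmark's subject-level test sets (57 subjects). Below is a minimal sketch of that averaging step, assuming per-subject accuracies have already been produced by some evaluation harness; the subject names, example numbers, and function name are illustrative, not part of any official scoring script.

```python
# Minimal sketch: compute an MMLU-style average from per-subject accuracies.
# The per-subject numbers below are illustrative placeholders, not real results.

subject_accuracies = {
    "abstract_algebra": 0.41,
    "college_physics": 0.55,
    "us_foreign_policy": 0.88,
    # ... one entry per MMLU subject (57 in total)
}

TARGET = 0.864  # GPT-4's reported MMLU score, the threshold in this market

def mmlu_macro_average(per_subject: dict[str, float]) -> float:
    """Unweighted mean over subjects (one common way the headline score is reported)."""
    return sum(per_subject.values()) / len(per_subject)

avg = mmlu_macro_average(subject_accuracies)
print(f"MMLU average: {avg:.1%} (target: {TARGET:.1%})")
print("At or above target" if avg >= TARGET else "Below target")
```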


Technically this has already been done (through clear data contamination). Should I assume this only resolves YES if there is no evidence of data contamination? See: Catch me if you can! How to beat GPT-4 with a 13B model | LMSYS Org
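
The LMSYS post linked above argues that rephrased test samples can slip past simple overlap filters, but a contamination screen usually starts with exactly that kind of check. Here is a rough sketch of a naive n-gram overlap test, assuming you supply the training documents and benchmark questions yourself; the 13-gram window and all inputs are illustrative placeholders, not anything taken from the post.

```python
# Rough sketch of a naive n-gram contamination check between training text and
# benchmark questions. All inputs are illustrative placeholders.

def ngrams(text: str, n: int = 13) -> set[tuple[str, ...]]:
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def is_contaminated(test_question: str, train_docs: list[str], n: int = 13) -> bool:
    """Flag a test question if any of its n-grams appears verbatim in a training doc."""
    q_grams = ngrams(test_question, n)
    return any(q_grams & ngrams(doc, n) for doc in train_docs)

# Illustrative usage with placeholder strings:
train_docs = ["... training corpus documents go here ..."]
question = "Which of the following statements about supply and demand is correct ...?"
print(is_contaminated(question, train_docs))
```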

What counts as open-source? If hackers steal a model and put it on torrent, is it now open-source? What if a corporation releases the weights but only for research purposes not commercial purposes?

predicted YES

@DanielKokotajlo I’ll count a model as open source if the model weights are accessible to people outside the organization.

Llama was originally released for researchers, and I would count this as open source for the purposes of this question.

If hackers put it on torrent, that’s open source too.

I realize this deviates from the definition of open source used in OSS communities. The spirit of the question is focused on malicious use and proliferation potential.

@mattt OK, thanks for the clarification. In that case, this question is pretty much equivalent to mine, I think: GPT4 or better model available for download by EOY 2024? | Manifold

@DanielKokotajlo yep pretty much :)
