Will an AI image generation model successfully generate a proper chess board by July 31, 2025
32
Ṁ2027
Aug 1
25%
chance

This market assesses whether a general-purpose AI image-generation model can consistently produce a chess board with pieces in their correct starting positions by July 31, 2025.

Resolution criteria:

  • Success is defined as generating a proper chess board with all pieces in their correct starting positions 9 out of 10 times using a single prompt

  • Prompt engineering is allowed, but must be a single prompt fed directly into the model with the first output being considered one attempt

  • No other layers or post-processing allowed

  • The model must be a general-purpose AI image generation or multimodal model (not specifically designed for chess)

References:

  • Update 2025-06-23 (PST) (AI summary of creator comment): The creator has clarified that models like Claude 3.7 will not be considered valid for this market, as they do not qualify as an image generation model in this context.

  • Update 2025-06-25 (PST) (AI summary of creator comment): The creator is considering whether a prompt that includes an image is a valid method for satisfying the market criteria. This would be for use with a multimodal model that accepts both text and image inputs.

  • Update 2025-06-29 (PST) (AI summary of creator comment): The creator has clarified that for the market to resolve to Yes, the prompt used must be a text prompt only. Prompts that include an image are not considered valid, even if used with a multimodal model.

Get
Ṁ1,000
and
S3.00
Sort by:

Okay, I have a question….

I have a prompt that gives 10/10 perfect chessboards. It satisfies all the rules of this market. BUT… some may argue that it’s against the spirit of the market, so I want everyone’s take.

The prompt is fed into ChatGPT image gen, however the prompt is more complex than a standard text prompt. The prompt includes an image (since 4o supports images in the prompt).

So, does anyone argue against this satisfying the requirements of the market?

(Of course I would post the prompt here before resolving the market, so everyone can confirm the results are reproducible)

@Hazel That seems against the spirit of the market to me because generating an image from an image is pretty easy, I could pretty easily reconfigure a reverse diffusion model that I made for a class to generate an accurate image of a chess board from accurate images of chess board by not modifying the images much. But I am also biased as a no holder.

sold Ṁ32 NO

@Hazel hmm, starting with an image does seem to cut away most of the spirit of "generation".

But IMHO you could quite fairly consider a percentage resolution if it's still relevant in a month.

@Hazel I think it depends on how the model works. If the model is just "modifying" the image, then this must be disqualifying in order for this market to have any meaning. Otherwise I could make an "AI" that does arbitrarily intelligent or unintelligent processing and outputs the results as a map of pixels changed by 1 shade of color, apply it to an input chessboard, and the output is guaranteed to be a chessboard.

If the model is acting fully transformatively; accepting the board as input, transforming it into a distinct and highly compressed internal representation, and then turning that representation into a new chessboard, then arguably that should count. This is what's happening when you get stuff like ChatGPT generating a picture of Mario when you ask for an Italian plumber.

But those seem to always include som variation on the original image, so I think more modern image-editing AIs are including extra tooling. Like phone apps that let you supply an English description of what changes to make to an input image, and the output image is identical except for the changes, that must be handled by some sort of external rule-based program that allows the AI to only modify a portion of it in some representation-space.

To test this, I took this picture from Google and asked ChatGPT and Gemini to add a duck to it.

They returned these results:

Which are not pixel-perfect copies in the non-duck areas, but are much closer to the input image than I would expect to get if I precisely described in words the position of each leaf and branch. So maybe they really are compressing the image and recreating it? Or maybe they've learned to "memorize" images in a way that's much more faithful to the original than they can manage when constructing a de-novo representation from text? Or maybe there's something weird going on with file formats and non-AI compression across the internet? I don't really know how these work.

@spiderduckpig Well the requirement of this market is a general purpose image generation model, so I don’t think a purely reverse diffusion model would count?

However, I agree with all the points above. The prompt must be a text prompt for this to resolve yes.

bought Ṁ100 YES

Claude 3.7 sonnet, with the prompt: Generate an image of a chess board, with all the pieces in their starting positions

@VedangManerikar Tbf generating a SVG seems pretty different from generating an image as a pixel raster

@spiderduckpig Especially because SVGs can just define a shape like a pawn in the text-representation format and repeatedly display it, solving the issue of consistency, and it can also just calculate the locations of all the pieces mathematically instead of implementing that in the image generation layer

In essence, generating an SVG version is about the same as asking a LLM to output a formatted text description of the locations of lines and icons that would be required for a chess board, which is ofc already possible

@VedangManerikar the title of this clearly says “image generation model.” Claude 3.7 is not.

I’m very close, but best I can get is 3/4, not 9/10

https://sora.com/g/gen_01jq7fmm26eg2v1gfx29yt4rh1
prompt:
chess board initial position, view from above

no numeration

queen and king pieces clearly distinguishable

chess.com style illustration

very close apart from inconsistent queen style

@AndreiVlasenko yes, very close but imo the queen still isn’t clearly a queen given that it looks closer to the king than the white queen

@AndreiVlasenko

Got it! Now I just need to get it 9/10 attempts.

bought Ṁ20 YES

gpt-4o native image generation (https://sora.com/explore/images) is extremely close
prompt: chess board initial position, view from above
https://sora.com/g/gen_01jq7e3g8vew6baekjxd4zt0gg

@AndreiVlasenko this one's not great, tbh, the king and queen are indistinguishable

The row numbers are also incorrect

© Manifold Markets, Inc.Terms + Mana-only TermsPrivacyRules